BCS Higher Education Qualifications BCS Level 5 Diploma in IT
The marks given in brackets are indicative of the weight given to each part of the question.
(10 marks)
i) Consistency;
ii) Availability;
iii) Partition tolerance.
(9 marks)
c) Explain why only TWO of the THREE properties of the CAP theorem can be
simultaneously supported in a distributed database cluster.
(6 marks)
B6.
a) Describe the principal goals of machine learning.
(5 marks)
c) Explain, with an example, how a supervised machine learning algorithm can be used
in a data classification task.
(15 marks)
End of Examination
Section A
Answer Section A questions in Answer Book A
A1.
a) Explain the defining characteristics of the following TWO data types:
i) Structured data;
ii) Unstructured data.
(10 marks)
b) Explain the main issues to be considered in the processing of large volumes of fast
real-time streamed data.
(5 marks)
c) Describe the basic components of the Kafka event data streaming platform.
(10 marks)
A2.
a) Describe TWO advantages and TWO disadvantages to outsourcing a Big Data
project to an external supplier.
(10 marks)
b) Big Data network infrastructure requirements have identified properties that are
crucial to effective handling of Big Data. Explain the following THREE network
properties and state how they can be optimised for handling Big Data processing:
i) Network resilience;
ii) Network partitioning;
iii) Network application awareness.
(15 marks)
A3.
a) Describe the Comprehensive R Archive Network (CRAN).
(5 marks)
b) It is a common view that the R platform is unsuited to dealing directly with Big Data.
Briefly explain why this view might be taken.
(5 marks)
c) A vector of nine numbers is created in R by the following script:
xvar <- c(-20, 7, 3.5, -7, 16, 31, -1, 11, 30)
For this vector xvar, write R scripts using base R functions to compute the
following statistics:
i) The Median of xvar;
ii) The Mean of xvar using the Trim option to remove the leading and
trailing two numbers in the vector.
(5 marks)
d) Write an R script that implements your own user function to compute the statistical
Mode of a data vector. Make use of any other base R functions within your script.
(10 marks)