454U8-Big Data Analytics
4. Sun also has the Hadoop Live CD ________ project, which allows running a fully functional Hadoop
cluster using a live CD
A. OpenOffice.org
B. GNU
C. OpenSolaris
D. Linux
ANSWER: C
8. Hadoop achieves reliability by replicating the data across multiple hosts, and hence does not require
________ storage on hosts.
A. RAID
B. ZFS
C. Operating System
D. DFS
ANSWER: A
9. Above the file systems comes the ________ engine, which consists of one Job Tracker, to which client
applications submit MapReduce jobs.
A. MapReduce
B. Google
C. Functional Programming
D. Facebook
ANSWER: A
10. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix
operations.
A. Machine learning
B. Pattern recognition
C. Statistical classification
D. Artificial intelligence
ANSWER: A
11. ________ is a platform for constructing data flows for extract, transform, and load (ETL) processing
and analysis of large datasets.
A. Pig Latin
B. Oozie
C. Pig
D. Hive
ANSWER: C
13. _________ hides the limitations of Java behind a powerful and concise Clojure API for Cascading.
A. Scalding
B. HCatalog
C. Cascalog
D. All of the mentioned
ANSWER: C
16. ________ is the most popular high-level Java API in the Hadoop ecosystem.
A. Scalding
B. HCatalog
C. Cascalog
D. Cascading
ANSWER: D
17. ___________ is a general-purpose computing model and runtime system for distributed data analytics.
A. Mapreduce
B. Drill
C. Oozie
D. None of the mentioned
ANSWER: A
18. The Pig Latin scripting language is not only a higher-level data flow language; it also has operators
similar to:
A. JSON
B. XML
C. XSL
D. SQL
ANSWER: D
19. _______ jobs are optimized for scalability but not latency
A. Mapreduce
B. Drill
C. Hive
D. Chukwa
ANSWER: C
20. ______ is a framework for performing remote procedure calls and data serialization.
A. Mapreduce
B. Drill
C. Avro
D. Chukwa
ANSWER: C
21. As companies move past the experimental phase with Hadoop, many cite the need for additional
capabilities, including
A. Improved data storage and information retrieval
B. Improved extract, transform and load features for data integration
C. Improved data warehousing functionality
D. Improved security, workload management and SQL support
ANSWER: D
23. According to analysts, for what can traditional IT systems provide a foundation when they are
integrated with big data technologies like Hadoop ?
A. Big data management and data mining
B. Data warehousing and business intelligence
C. Management of Hadoop clusters
D. Collecting and storing unstructured data
ANSWER: A
24. Hadoop is a framework that works with a variety of related tools. Common cohorts include
A. MapReduce, MySQL and Google Apps
B. MapReduce, Hive and HBase
C. MapReduce, Hummer and Iguana
D. MapReduce, Heron and Trumpet
ANSWER: B
28. __________ can best be described as a programming model used to develop Hadoop-based
applications that can process massive amounts of data.
A. MapReduce
B. Mahout
C. Oozie
D. All of the mentioned
ANSWER: A
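The answer to question 28 can be illustrated with a toy sketch of the MapReduce programming model in plain Python (no Hadoop involved; the function and variable names here are illustrative, not part of any Hadoop API): the map phase emits intermediate key/value pairs, the framework shuffles and sorts them by key, and the reduce phase consolidates each group.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(records):
    # The Mapper's job: emit an intermediate (word, 1) pair for every word.
    for line in records:
        for word in line.split():
            yield (word, 1)

def shuffle_and_sort(pairs):
    # The framework groups intermediate pairs by key; sorting makes equal
    # keys adjacent so each group can be handed to a single reduce call.
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

def reduce_phase(grouped):
    # The Reducer consolidates all values emitted for each key.
    return {key: sum(v for _, v in vals) for key, vals in grouped}

counts = reduce_phase(shuffle_and_sort(map_phase(["big data", "big deal"])))
print(counts)  # {'big': 2, 'data': 1, 'deal': 1}
```

This also shows why questions 34, 38 and 47 fit together: mapping and reducing are user code, while shuffle and sort happen between them inside the framework.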
31. A ________ node acts as the Slave and is responsible for executing a Task assigned to it by the
JobTracker.
A. MapReduce
B. Mapper
C. TaskTracker
D. JobTracker
ANSWER: C
33. ___________ part of the MapReduce is responsible for processing one or more chunks of data and
producing the output results.
A. Maptask
B. Mapper
C. Task execution
D. All of the mentioned
ANSWER: A
34. _________ function is responsible for consolidating the results produced by each of the Map()
functions/tasks.
A. Map
B. Reduce
C. Reducer
D. Reduced
ANSWER: B
36. Although the Hadoop framework is implemented in Java, MapReduce applications need not be written
in
A. C
B. C++
C. Java
D. VB
ANSWER: C
37. ________ is a utility which allows users to create and run jobs with any executables as the mapper
and/or the reducer.
A. HadoopStrdata
B. Hadoop Streaming
C. Hadoop Stream
D. None of the mentioned
ANSWER: B
38. __________ maps input key/value pairs to a set of intermediate key/value pairs.
A. Mapper
B. Reducer
C. Both Mapper and Reducer
D. None of the mentioned
ANSWER: A
41. Mapper implementations are passed the JobConf for the job via the ________ method
A. JobConfigure.configure
B. JobConfigurable.configure
C. JobConfigurable.configureable
D. None of the mentioned
ANSWER: B
46. The output of the _______ is not sorted in the Mapreduce framework for Hadoop.
A. Mapper
B. Cascader
C. Scalding
D. None of the mentioned
ANSWER: D
47. Which of the following phases occur simultaneously ?
A. Reduce and Sort
B. Shuffle and Sort
C. Shuffle and Map
D. All of the mentioned
ANSWER: B
48. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they
are alive.
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: C
49. __________ is a generalization of the facility provided by the MapReduce framework to collect data
output by the Mapper or the Reducer
A. Partitioner
B. OutputCollector
C. Reporter
D. All of the mentioned
ANSWER: B
50. _________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework
for execution.
A. Map Parameters
B. JobConf
C. MemoryConf
D. All of the mentioned
ANSWER: B
51. A ________ serves as the master and there is only one NameNode per cluster
A. Data Node
B. NameNode
C. Data block
D. Replication
ANSWER: B
54. ________ NameNode is used when the Primary NameNode goes down.
A. Rack
B. Data
C. Secondary
D. None
ANSWER: C
56. Which of the following scenario may not be a good fit for HDFS?
A. HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
B. HDFS is suitable for storing data related to applications requiring low latency data access
C. HDFS is suitable for storing data related to applications requiring high latency data access
D. None of the mentioned
ANSWER: A
57. The need for data replication can arise in various scenarios like :
A. Replication Factor is changed
B. DataNode goes down
C. Data Blocks get corrupted
D. All of the mentioned
ANSWER: D
58. ________ is the slave/worker node and holds the user data in the form of Data Blocks
A. DataNode
B. NameNode
C. Data block
D. Replication
ANSWER: A
59. HDFS provides a command line interface called __________ used to interact with HDFS.
A. HDFS Shell
B. FS Shell
C. DFSA Shell
D. None
ANSWER: B
63. Cloudera ___________ includes CDH and an annual subscription license (per node) to Cloudera
Manager and technical support.
A. Enterprise
B. Express
C. Standard
D. All the above
ANSWER: A
64. Cloudera Express includes CDH and a version of Cloudera ___________ lacking enterprise features
such as rolling upgrades and backup/disaster recovery
A. Enterprise
B. Express
C. Standard
D. Manager
ANSWER: D
68. _______ is an open source set of libraries, tools, examples, and documentation engineered to simplify building systems on top of the Hadoop ecosystem.
A. Kite
B. Kize
C. Ookie
D. All of the mentioned
ANSWER: A
69. To configure short-circuit local reads, you will need to enable ____________ on local Hadoop.
A. librayhadoop
B. libhadoop
C. libhad
D. hadoop
ANSWER: B
71. _______ can change the maximum number of cells of a column family
A. set
B. reset
C. alter
D. connect
ANSWER: C
74. You can delete a column family from a table using the method _________ of the HBaseAdmin class.
A. delColumn()
B. removeColumn()
C. deleteColumn()
D. All of the mentioned
ANSWER: C
75. Point out the wrong statement
A. To read data from an HBase table, use the get() method of the HTable class
B. You can retrieve data from the HBase table using the get() method of the HTable class
C. While retrieving data, you can get a single row by id, or get a set of rows by a set of row ids, or scan
an entire table or a subset of rows
D. None of the mentioned
ANSWER: D
77. The ________ class provides the getValue() method to read the values from its instance
A. Get
B. Result
C. Put
D. Value
ANSWER: B
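Questions 74 through 77 concern the HBase read path. As a conceptual aid only (a toy Python model of HBase's nested-map data layout, not the real Java HTable/Result/Get API), a get fetches a single row by its id while a scan walks rows in sorted key order:

```python
# Toy model: table -> {row_key: {"family:qualifier": value}}.
# This mimics HBase's sorted key/value layout, not its actual client API.
table = {
    "row1": {"cf:name": "alpha", "cf:count": "3"},
    "row2": {"cf:name": "beta",  "cf:count": "7"},
}

def get(table, row_key):
    # Like HTable.get(): retrieve one row by its id (empty dict if absent).
    return table.get(row_key, {})

def scan(table, start=None, stop=None):
    # Like a Scan: iterate rows in sorted key order within [start, stop).
    for key in sorted(table):
        if (start is None or key >= start) and (stop is None or key < stop):
            yield key, table[key]

print(get(table, "row1")["cf:name"])        # alpha
print([k for k, _ in scan(table, "row2")])  # ['row2']
```

The sorted-row iteration is why HBase can serve both single-row lookups and range scans efficiently, as option C of question 75 states.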
78. ________ communicate with the client and handle data-related operations.
A. Master Server
B. Region Server
C. Htable
D. All of the mentioned
ANSWER: B
80. HBase uses the _______ File System to store its data
A. Hive
B. Impala
C. Hadoop
D. Scala
ANSWER: C
83. Which of the following is true about the base plotting system ?
A. Margins and spacings are adjusted automatically depending on the type of plot and the data
B. Plots are typically created with a single function call
C. Plots are created and annotated with separate functions
D. The system is most useful for conditioning plots
ANSWER: C
87. Which of the following functions is typically used to add elements to a plot in the base graphics system
A. lines()
B. hist()
C. plot()
D. boxplot()
ANSWER: A
88. Which function opens the screen graphics device for the Mac ?
A. bitmap()
B. quartz()
C. pdf()
D. png()
ANSWER: B
93. In 2004, ________ purchased the S language from Lucent for $2 million
A. Insightful
B. Amazon
C. IBM
D. All the above
ANSWER: A
94. In 1991, R was created by Ross Ihaka and Robert Gentleman in the Department of Statistics at the
University of _________.
A. John Hopkins
B. California
C. Harvard
D. Auckland
ANSWER: D
97. R is technically much closer to the Scheme language than it is to the original _____ language.
A. B
B. C
C. R
D. S
ANSWER: D
98. The R-help and _____ mailing lists have been highly active for over a decade now
A. R-mail
B. R-devel
C. R-dev
D. R-d
ANSWER: B
100. The copyright for the primary source code for R is held by the ______ Foundation.
A. A
B. C
C. C++
D. R
ANSWER: D
104. The _________ R system contains, among other things, the base package which is required to run R
A. root
B. child
C. base
D. none of the above
ANSWER: C
109. Advanced users can write ___ code to manipulate R objects directly.
A. C
B. C++
C. Java
D. None of the mentioned
ANSWER: A
117. If a command is not complete at the end of a line, R will give a different prompt, by default it is :
A. *
B. -
C. +
D. All the above
ANSWER: C
118. Command lines entered at the console are limited to about ________ bytes
A. 3000
B. 4095
C. 5000
D. None
ANSWER: B
119. The ________ text editor provides more general support mechanisms via ESS for working interactively with
R.
A. EAC
B. Emacs
C. Shell
D. None
ANSWER: B
120. What would be the result of the following R code? > x <- 1 > print(x)
A. 1
B. 2
C. 3
D. 4
ANSWER: A
126. _______ will divert all subsequent output from the console to an external file.
A. sink
B. div
C. dip
D. exp
ANSWER: A
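Question 126 refers to R's sink() function, which diverts all subsequent console output to an external file until sink() is called again. A rough analogy in Python (an illustration of the idea only, not R's actual mechanism) uses contextlib.redirect_stdout:

```python
import io
from contextlib import redirect_stdout

# Like R's sink("out.txt") ... sink(): everything printed inside the
# with-block is diverted away from the console into the buffer instead.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print("this goes to the buffer, not the console")

print(buffer.getvalue().strip())  # this goes to the buffer, not the console
```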
127. The entities that R creates and manipulates are known as ________
A. task
B. objects
C. function
D. expression
ANSWER: B
128. Which of the following can be used to display the names of (most of) the objects which are currently
stored within R ?
A. object()
B. objects()
C. list()
D. none of the above
ANSWER: B
130. What will be the output of the following code snippet? > paste("a", "b", se = ":")
A. a+b
B. a-b
C. ab
D. none
ANSWER: D
134. You can check to see whether an R object is NULL with the _________ function.
A. is.nullobj()
B. null()
C. is.null()
D. obj.null()
ANSWER: C
138. For YARN, the ___________ Manager UI provides host and port information.
A. Data Node
B. NameNode
C. Resource
D. Replication
ANSWER: C
139. Point out the correct statement
A. The Hadoop framework publishes the job flow status to an internally running web server on the
master nodes of the Hadoop cluster
B. Each incoming file is broken into 32 MB by default
C. Data blocks are replicated across different nodes in the cluster to ensure a low degree of fault
tolerance
D. None of the mentioned
ANSWER: A
140. For ________, the HBase Master UI provides information about the HBase Master uptime.
A. Oozie
B. HBase
C. Kafka
D. Afka
ANSWER: B
141. __________ is a standard Java API for monitoring and managing applications.
A. JVM
B. JVN
C. JMX
D. JMY
ANSWER: C
142. __________ Manager's Service feature monitors dozens of service health and performance metrics
about the services and role instances running on your cluster.
A. Microsoft
B. Cloudera
C. Amazon
D. None of the above
ANSWER: B
143. The IBM _____________ Platform provides all the foundational building blocks of trusted
information, including data integration, data warehousing, master data management, big data and
information governance.
A. InfoStream
B. InfoSphere
C. InfoSurface
D. InfoData
ANSWER: B
146. InfoSphere DataStage uses a client/server design where jobs are created and administered via a
________ client against central repository on a server
A. Ubuntu
B. Windows
C. Debian
D. Solaris
ANSWER: B
148. DataStage originated at __________, a company that developed two notable products: UniVerse
database and the DataStage ETL tool.
A. VMark
B. Vzen
C. Hatez
D. SMark
ANSWER: A
Staff Name
Suguna M.