Big Data Analytics Quiz For AKTU Part 2

Big Data Solutions MCQ With Answer

1. Which one of the following is false about Hadoop?
a. It is a distributed framework
b. The main algorithm used in it is MapReduce
c. It runs with commodity hardware
d. All are true
Answer: (d)

2. What license is Apache Hadoop distributed under?
a. Apache License 2.0
b. Shareware
c. Mozilla Public License
d. Commercial
Answer: (a)

3. Which of the following platforms does Apache Hadoop run on?
a. Bare metal
b. Unix-like
c. Cross-platform
d. Debian
Answer: (c)

4. Apache Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ________ storage on hosts.
a. Standard RAID levels
b. RAID
c. ZFS
d. Operating system
Answer: (b)

5. Hadoop works in
a. master-worker fashion
b. master – slave fashion
c. worker/slave fashion
d. All of the mentioned
Answer: (b)

6. Which type of data can Hadoop deal with?
a. Structured
b. Semi-structured
c. Unstructured
d. All of the above
Answer: (d)

7. Which statement is false about Hadoop?
a. It runs with commodity hardware
b. It is a part of the Apache project sponsored by the ASF
c. It is best for live streaming of data
d. None of the above
Answer: (c)

8. As compared to RDBMS, Apache Hadoop:
a. Has higher data integrity
b. Does ACID transactions
c. Is suitable for read and write many times
d. Works better on unstructured and semi-structured data
Answer: (d)

9. Hadoop can be used to create distributed clusters, based on commodity servers, that provide low-cost processing and storage for unstructured data.
a. True
b. False
Answer: (a)

10. ______ is a framework for performing remote procedure calls and data serialization.
a. Drill
b. BigTop
c. Avro
d. Chukwa
Answer: (c)

11. IBM and ________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.
a. Google Latitude
b. Android (operating system)
c. Google Variations
d. Google
Answer: (d)

12. What was Hadoop written in?
a. Java (software platform)
b. Perl
c. Java (programming language)
d. Lua (programming language)
Answer: (c)

13. Apache _______ is a serialization framework that produces data in a compact binary format.
a. Oozie
b. Impala
c. Kafka
d. Avro
Answer: (d)

14. Avro schemas describe the format of the message and are defined using ______________
a. JSON
b. XML
c. JS
d. All of the mentioned
Answer: (a)
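
Note on question 14: an Avro schema is an ordinary JSON document. The sketch below only parses a hypothetical `User` record schema with Python's standard `json` module; the record name and its fields are invented for illustration, and no Avro library is involved.

```python
import json

# Hypothetical Avro record schema (name and fields are illustrative only).
USER_SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
"""

# Because the schema is plain JSON, any JSON parser can read it.
schema = json.loads(USER_SCHEMA_JSON)
field_names = [field["name"] for field in schema["fields"]]
print(schema["type"], schema["name"], field_names)
```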

15. In which languages can you write code for Hadoop?
a. Java
b. Python
c. C++
d. All of the above
Answer: (d)

16. All of the following accurately describe Hadoop, EXCEPT
a. Open source
b. Real-time
c. Java-based
d. Distributed computing approach
Answer: (b)

17. __________ has the world’s largest Hadoop cluster.
a. Apple
b. Datamatics
c. Facebook
d. None of the mentioned
Answer: (c)

18. Which among the following is the default OutputFormat?
a. SequenceFileOutputFormat
b. LazyOutputFormat
c. DBOutputFormat
d. TextOutputFormat
Answer: (d)

19. Which of the following is not an input format in Hadoop?
a. ByteInputFormat
b. TextInputFormat
c. SequenceFileInputFormat
d. KeyValueInputFormat
Answer: (a)

20. What is the correct sequence of data flow in MapReduce?
a. InputFormat
b. Mapper
c. Combiner
d. Reducer
e. Partitioner
Answer: a, b, c, e, d (InputFormat, Mapper, Combiner, Partitioner, Reducer)
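
Note on question 20: the stages can be pictured with a toy, single-process word count. This is only a sketch of the order InputFormat, Mapper, Combiner, Partitioner, Reducer; every function name here is hypothetical, each input line is treated as one mapper's output, and the partitioner uses a simple byte sum instead of Hadoop's own hashing.

```python
from collections import defaultdict

def input_format(text):
    """Yield (byte offset, line) records, loosely mimicking a text input format."""
    offset = 0
    for line in text.splitlines(keepends=True):
        yield offset, line.rstrip("\n")
        offset += len(line.encode("utf-8"))

def mapper(_offset, line):
    """Emit (word, 1) for every word in the line."""
    for word in line.split():
        yield word, 1

def combiner(pairs):
    """Locally pre-aggregate one mapper's output before it is partitioned."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def partitioner(key, num_reducers):
    """Pick a reducer for a key (toy deterministic hash, an assumption)."""
    return sum(key.encode("utf-8")) % num_reducers

def reducer(key, values):
    """Consolidate all values that arrived for one key."""
    return key, sum(values)

def run_job(text, num_reducers=2):
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for offset, line in input_format(text):
        for key, value in combiner(mapper(offset, line)):
            partitions[partitioner(key, num_reducers)][key].append(value)
    result = {}
    for partition in partitions:
        for key in sorted(partition):  # keys reach a reducer in sorted order
            out_key, out_value = reducer(key, partition[key])
            result[out_key] = out_value
    return result

counts = run_job("big data\nbig cluster\n")
print(counts)
```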

21. Which InputFormat uses the tab character ('\t') to separate key and value?
a. KeyValueTextInputFormat
b. TextInputFormat
c. FileInputFormat
d. SequenceFileInputFormat
Answer: (a)
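
Note on question 21: the splitting rule can be sketched in a few lines. The helper name `key_value_record` is hypothetical; it only imitates the idea of cutting each line at the first tab, and is not Hadoop code.

```python
def key_value_record(line, separator="\t"):
    """Split a line at the first separator; the rest of the line is the value."""
    key, _found, value = line.partition(separator)
    # With no separator in the line, the whole line is the key and the value is empty.
    return key, value

print(key_value_record("user42\tlogin\tok"))
```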

22. What are the key and value in TextInputFormat?
a. Key: byte offset of the line; Value: the contents of the line
b. Key: everything up to the tab character; Value: the remaining part of the line after the tab character
c. Key and value: both are user-defined
d. None of the above
Answer: (a)
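
Note on question 22: a minimal sketch of what "key is the byte offset, value is the line" means. The function name `text_input_records` is hypothetical and the code works on an in-memory byte string rather than an HDFS file.

```python
def text_input_records(data: bytes):
    """Yield (byte offset of the line start, line text without its line ending)."""
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)  # advance by the full line, including the newline

records = list(text_input_records(b"alpha\nbeta\ngamma\n"))
print(records)
```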

23. Which of the following are built-in counters in Hadoop?
a. FileSystem Counters
b. FileInputFormat Counters
c. FileOutputFormat Counters
d. All of the above
Answer: (d)

24. Which of the following is not an output format in Hadoop?
a. TextOutputFormat
b. ByteOutputFormat
c. SequenceFileOutputFormat
d. DBOutputFormat
Answer: (b)

25. Is it mandatory to set input and output type/format in Hadoop MapReduce?
a. Yes
b. No
Answer: (b)

26. The parameters for Mappers are:
a. text (input)
b. LongWritable(input)
c. text (intermediate output)
d. All of the above
Answer: (d)

27. For a 514 MB file, how many InputSplits will be created (assuming a 128 MB split size)?
a. 4
b. 5
c. 6
d. 10
Answer: (b)
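
Note on question 27: the expected answer follows from simple ceiling arithmetic. The sketch assumes a 128 MB split size (the Hadoop 2.0 default block size) and one split per block; the function name is hypothetical. Real FileInputFormat implementations may fold a small trailing remainder into the last split, so treat this as the textbook calculation rather than a prediction for a live job.

```python
import math

def textbook_split_count(file_size_mb, split_size_mb=128):
    """Number of splits if every started block of the file becomes one split."""
    return math.ceil(file_size_mb / split_size_mb)

# 514 MB = 4 full 128 MB blocks + 2 MB left over, so 5 splits.
print(textbook_split_count(514))
```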

28. Which among the following is used to provide multiple inputs to Hadoop?
a. MultipleInputs class
b. MultipleInputFormat
c. FileInputFormat
d. DBInputFormat
Answer: (a)

29. The Mapper implementation processes one line at a time via the _________ method.
a. map
b. reduce
c. mapper
d. reducer
Answer: (a)

30. The Hadoop MapReduce framework spawns one map task for each __________ generated by the InputFormat for the job.
a. OutputSplit
b. InputSplit
c. InputSplitStream
d. All of the mentioned
Answer: (b)

31. __________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.
a. MapReduce
b. Mahout
c. Oozie
d. All of the mentioned
Answer: (a)

32. The ___________ part of MapReduce is responsible for processing one or more chunks of data and producing the output results.
a. Maptask
b. Mapper
c. Task execution
d. All of the mentioned
Answer: (a)

33. The ________ function is responsible for consolidating the results produced by each of the Map() functions/tasks.
a. Map
b. Reduce
c. Reducer
d. Reduced
Answer: (b)

34. The number of maps is usually driven by the total size of the:
a. task
b. output
c. input
d. none
Answer: (c)

35. The right number of reduces seems to be ______ multiplied by (number of nodes × maximum containers per node):
a. 0.65
b. 0.55
c. 0.95
d. 0.68
Answer: (c)
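
Note on question 35: the 0.95 figure is a multiplier applied to the cluster's container capacity (the Hadoop MapReduce tutorial also mentions 1.75 as an alternative). The sketch below just does that multiplication; the function name and the cluster size of 10 nodes with 8 containers each are made-up example values.

```python
def suggested_reduces(nodes, max_containers_per_node, factor=0.95):
    """Rule-of-thumb reducer count: factor * (nodes * max containers per node)."""
    return round(factor * nodes * max_containers_per_node)

print(suggested_reduces(10, 8))        # factor 0.95
print(suggested_reduces(10, 8, 1.75))  # factor 1.75
```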

36. Mapper and Reducer implementations can use the ________ to report progress or just indicate that they are alive.
a. Partitioner
b. OutputCollector
c. Reporter
d. All of the mentioned
Answer: (c)

37. How many major components does Hadoop 2.0 have?
a. 2
b. 3
c. 4
d. 5
Answer: (b)

38. Which of the following statements is true about Pig?
a. Pig is also a data warehouse system used for analysing the Big Data stored in HDFS
b. It uses a data flow language for analysing the data
c. a and b
d. Relational Database Management System
Answer: (c)

39. Which of the following platforms does Hadoop run on?
a. Bare metal
b. Debian
c. Cross-platform
d. Unix-like
Answer: (c)

40. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations.
a. Machine learning
b. Pattern recognition
c. Statistical classification
d. Artificial intelligence
Answer: (a)

41. Which node serves as the master, with only one per cluster?
a. Data Node
b. NameNode
c. Data block
d. Replication
Answer: (b)

42. The HDFS architecture consists of:
a. master-worker
b. master node and slave node
c. worker/slave
d. all of the mentioned
Answer: (b)

43. The NameNode used when the primary NameNode fails is the:
a. Rack
b. Data node
c. Secondary node
d. None of the mentioned
Answer: (c)

44. Which of the following scenarios may not be a good fit for HDFS?
a. HDFS is not suitable for scenarios requiring multiple/simultaneous writes to the same file
b. HDFS is suitable for storing data related to applications requiring low latency data access
c. HDFS is suitable for storing data related to applications requiring low latency data access
d. None of the mentioned
Answer: (a)

45. The need for data replication can arise when:
a. The replication factor is changed
b. A DataNode goes down
c. Data blocks get corrupted
d. All of the mentioned
Answer: (d)

46. HDFS uses only one language for implementation:
a. C++
b. Java
c. Scala
d. None of the Above
Answer: (d)

47. In YARN, which node is responsible for managing the resources?
a. Data Node
b. NameNode
c. Resource Manager
d. Replication
Answer: (c)

48. As the Hadoop framework is implemented in Java, MapReduce applications are required to be written in the Java language.
a. True
b. False
Answer: (b)
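
Note on question 48: one common route for non-Java jobs is Hadoop Streaming, where the mapper is any program that reads lines and writes tab-separated key/value lines. The sketch below is a word-count mapper in that style; the function name is hypothetical, and it writes to an in-memory buffer here, whereas a real streaming mapper would read `sys.stdin` and write `sys.stdout`.

```python
import io

def streaming_mapper(lines, out):
    """Write one 'word<TAB>1' line per word, the key/value text convention of streaming jobs."""
    for line in lines:
        for word in line.split():
            out.write(f"{word}\t1\n")

# Stand-in for stdin/stdout so the sketch runs without a cluster.
buffer = io.StringIO()
streaming_mapper(["big data", "big cluster"], buffer)
mapper_output = buffer.getvalue()
print(mapper_output, end="")
```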

49. _________ maps input key/value pairs to a set of intermediate key/value pairs.
a. Mapper
b. Reducer
c. Both Mapper and Reducer
d. None of the mentioned
Answer: (a)

50. The number of maps is usually driven by the total size of ___________
a. Inputs
b. Outputs
c. Tasks
d. None of the mentioned
Answer: (a)

51. Which file system is used by HBase?
a. Hive
b. Impala
c. Hadoop
d. Scala
Answer: (c)

52. The information mapping data blocks with their corresponding files is stored in the:
a. Namenode
b. Datanode
c. Job Tracker
d. Task Tracker
Answer: (a)

53. In HDFS, files cannot be:
a. read
b. deleted
c. executed
d. archived
Answer: (c)

54. The DataNode and NameNode are, respectively, which of the following?
a. Slave and master nodes
b. Master and worker nodes
c. Both worker nodes
d. Both master nodes
Answer: (a)

55. Hadoop is a framework that works with a variety of related tools. Common cohorts include:
a. MapReduce, Hive and HBase
b. MapReduce, MySQL and Google Apps
c. MapReduce, Hummer and Iguana
d. MapReduce, Heron and Trumpet
Answer: (a)

56. What was Hadoop named after?
a. Creator Doug Cutting's favorite circus act
b. The toy elephant of Cutting's son
c. Cutting's high school rock band
d. A sound Cutting's laptop made during Hadoop's development
Answer: (b)

57. All of the following accurately describe Hadoop, EXCEPT:
a. Open source
b. Java-based
c. Distributed computing approach
d. Real-time
Answer: (d)

58. Hive also supports custom extensions written in:
a. C
b. C#
c. C++
d. Java
Answer: (d)

59. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to:
a. JSON
b. XML
c. SQL
d. jQuery
Answer: (c)

60. In comparison to a relational DBMS, Hadoop:
a. Has higher data integrity
b. Does ACID transactions
c. Is suitable for read and write many times
d. Works better on unstructured and semi-structured data
Answer: (d)

61. The files in HDFS are meant for:
a. Low latency data access
b. Multiple writers and modifications at arbitrary offsets
c. Only append at the end of file
d. Writing into a file only once
Answer: (c)

62. The main role of the secondary NameNode is to:
a. Copy the filesystem metadata from the primary NameNode
b. Copy the filesystem metadata from NFS stored by the primary NameNode
c. Monitor if the primary NameNode is up and running
d. Periodically merge the namespace image with the edit log
Answer: (d)
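
Note on question 62: merging a namespace image with an edit log (a checkpoint) can be pictured as replaying logged operations over a snapshot. The sketch is a toy model only: the dictionary layout, the two operation names, and the paths are invented for illustration and do not reflect HDFS's actual on-disk formats.

```python
def apply_edit_log(namespace_image, edit_log):
    """Return a new namespace image with every logged edit applied in order."""
    merged = dict(namespace_image)  # leave the old image untouched
    for operation, path in edit_log:
        if operation == "create":
            merged[path] = {"blocks": []}
        elif operation == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return merged

# Hypothetical snapshot and the edits recorded since it was taken.
fsimage = {"/logs/a.txt": {"blocks": []}}
edits = [("create", "/logs/b.txt"), ("delete", "/logs/a.txt")]

checkpoint = apply_edit_log(fsimage, edits)
print(sorted(checkpoint))
```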

63. The MapReduce algorithm contains three important tasks, namely __________.
a. Splitting, mapping, reducing
b. Scanning, mapping, reduction
c. Map, Reduction, decluttering
d. Cleaning, Map, Reduce
Answer: (a)

64. In how many stages does a MapReduce program execute?
a. 2
b. 3
c. 4
d. 5
Answer: (b)

65. What is the function of the Mapper in MapReduce?
a. Splitting the Data File
b. Job
c. Scanning the subblock of files
d. PayLoad
Answer: (c)

66. Although the Hadoop framework is implemented in Java, MapReduce applications need to be written in _______
a. C
b. C#
c. Java
d. None of the above
Answer: (d)

67. What is the meaning of commodity hardware in Hadoop?
a. Very cheap hardware
b. Industry standard hardware
c. Discarded hardware
d. Low specifications industry grade hardware
Answer: (d)

68. Which of the following are true for Hadoop?
a. It’s a tool for Big Data analysis
b. It supports structured and unstructured data analysis
c. It aims for vertical scaling out/in scenarios
d. Both (a) and (b)
Answer: (d)

69. Which of the following are the core components of Hadoop 2.0?
a. HDFS
b. Map Reduce
c. YARN
d. All of the above
Answer: (d)

70. Programming Language is used for real-time queries.
a. TRUE
b. FALSE
Answer: (b)

71. What is the default HDFS block size for Hadoop 2.0?
a. 32 MB
b. 128 MB
c. 128 KB
d. 64 MB
Answer: (b)

72. Which of the following phases occur simultaneously?
a. Shuffle and Sort
b. Reduce and Sort
c. Shuffle and Map
d. All of the mentioned
Answer: (a)

73. Major Components of Hadoop 1.0 are:
a. HDFS and MapReduce
b. Map Reduce, HDFS and YARN
c. YARN and HDFS
d. None of the above
Answer: (a)
