A Learning Portal from Recruitment India
You need a distributed, scalable data store that gives you random, real-time read/write access to hundreds of terabytes of data. Which of the following would you use?
Hue
Pig
HBase
Flume
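Answer: Option C
Explanation: HBase is a distributed, scalable data store that runs on top of HDFS and supports random, real-time read/write access to very large tables.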
What is HBase used as?
Fast MapReduce layer in Hadoop
MapReduce alternative in Hadoop
Tool for Random and Fast Read/Write operations in Hadoop
Faster Read only query engine in Hadoop
Answer: Option C
Explanation: HBase is a tool for random, fast read/write operations in Hadoop.
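To make "random read/write" concrete, here is a minimal in-memory sketch of a row-keyed table that keeps its keys sorted. It is a toy model only: `ToyTable`, its methods, and the sample row keys are hypothetical names for illustration and are not the HBase client API.

```python
from bisect import bisect_left, insort


class ToyTable:
    """In-memory stand-in for a sorted, row-keyed table (not the HBase API)."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, column, value):
        # Random write: touch exactly one row, addressed by its key.
        if row_key not in self._rows:
            insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        # Random read: look up one row by key without scanning the table.
        return dict(self._rows.get(row_key, {}))

    def scan(self, start, stop):
        # Range scan over the sorted keys in [start, stop).
        i = bisect_left(self._keys, start)
        while i < len(self._keys) and self._keys[i] < stop:
            key = self._keys[i]
            yield key, dict(self._rows[key])
            i += 1


table = ToyTable()
table.put("user#002", "info:name", "Bo")
table.put("user#001", "info:name", "Al")
table.put("user#001", "info:name", "Alice")  # overwrite an existing cell
```

Contrast this with a plain HDFS file, which is written once and read sequentially: the point of the sketch is that single rows can be read or updated by key at any time.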
Which property is used to enable or disable speculative execution?
mapred.map.tasks.speculative.execution
mapred.reduce.tasks.speculative.execution
Both the above
None of the above
Answer: Option C
Explanation: Both properties apply. mapred.map.tasks.speculative.execution controls map tasks and mapred.reduce.tasks.speculative.execution controls reduce tasks.
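As a rough illustration, the sketch below generates a Hadoop-style `<configuration>` fragment that sets both speculative-execution properties. The helper `build_config` and the choice of `false` as the value are assumptions made for the example; only the two property names come from the question.

```python
import xml.etree.ElementTree as ET


def build_config(properties):
    """Return a Hadoop-style <configuration> XML string for the given properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed values: switch speculative execution off for both task types.
speculation_xml = build_config({
    "mapred.map.tasks.speculative.execution": "false",
    "mapred.reduce.tasks.speculative.execution": "false",
})
print(speculation_xml)
```

Setting only one of the two properties would leave speculation enabled for the other task type, which is why both are listed.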
Which of the following best describes the workings of TextInputFormat?
The input file is split exactly at the line breaks, so each RecordReader will read a series of complete lines.
Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReaders of both splits containing the broken line.
Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the beginning of the broken line.
Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the end of the broken line.
Answer: Option C
Explanation: Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the beginning of the broken line; the reader of the following split skips that partial first line.
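The boundary rule can be simulated in a few lines. This is a simplified sketch of how a line-oriented record reader treats byte-range splits, not Hadoop's actual code: `read_split` and the sample data are hypothetical, and the split size of 10 bytes is chosen only to force a line to straddle a boundary.

```python
def read_split(data: bytes, start: int, length: int) -> list:
    """Return the lines a reader assigned to [start, start + length) would emit."""
    end = start + length
    pos = start
    if start != 0:
        # Skip the first (possibly partial) line: the previous split's reader owns it.
        newline = data.find(b"\n", start)
        pos = len(data) if newline == -1 else newline + 1
    records = []
    # Keep reading while a line *begins* at or before the split end,
    # even if that line finishes inside the next split.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        line_end = len(data) if newline == -1 else newline
        records.append(data[pos:line_end])
        pos = line_end + 1
    return records


data = b"alpha\nbravo\ncharlie\ndelta\n"  # 26 bytes
first = read_split(data, 0, 10)    # "bravo" begins at byte 6 and crosses byte 10
second = read_split(data, 10, 10)
third = read_split(data, 20, 6)
```

Here `first` contains `bravo` even though the line ends in the second split, and `second` starts with `charlie`, so every line is emitted exactly once.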
The input split used in MapReduce indicates
The average size of the data blocks used as input for the program
The location details of where the first whole record in a block begins and the last whole record in the block ends.
Splitting the input data to a MapReduce program into a size already configured in the mapred-site.xml
None of these
Answer: Option B
Explanation: The location details of where the first whole record in a block begins and the last whole record in the block ends.
Which of the following are example(s) of Real Time Big Data Processing?
Complex Event Processing (CEP) platforms
Stock market data analysis
Bank fraud transaction detection
Both (A) and (C)
Answer: Option D
Explanation: Complex Event Processing (CEP) platforms and bank fraud transaction detection.
Which of the following properties is configured in mapred-site.xml?
Replication factor
Directory names to store hdfs files
Host and port where MapReduce task runs
Java Environment variables.
Answer: Option C
Explanation: Host and port where the MapReduce task runs.
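For illustration, the sketch below reads a host and port out of a mapred-site.xml fragment. It assumes the setting in question is the Hadoop 1 property `mapred.job.tracker`; the host name, the port 8021, and the helper `job_tracker_address` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapred-site.xml content; host and port are placeholders.
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>
"""


def job_tracker_address(xml_text):
    """Return (host, port) from mapred.job.tracker, or None if it is not set."""
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        if prop.findtext("name") == "mapred.job.tracker":
            host, _, port = prop.findtext("value").rpartition(":")
            return host, int(port)
    return None


print(job_tracker_address(MAPRED_SITE))
```

The other options in the question belong elsewhere: the replication factor and HDFS storage directories are HDFS settings, and Java environment variables are set in the environment script rather than in a site XML file.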
Is there a map input format?
Yes, but only in Hadoop 0.22+.
Yes, there is a special format for map files.
No, but the sequence file input format can read map files.
Both A and B
Answer: Option C
Explanation: There is no dedicated map input format, but the sequence file input format can read map files.
What was Hadoop named after?
Creator Doug Cutting's favorite circus act
The toy elephant of Cutting's son
Cutting's high school rock band
A sound Cutting's laptop made during Hadoop's development
Answer: Option B
Explanation: Hadoop was named after the toy elephant of Cutting's son.
What mechanism does Hadoop use to make the name node resilient to failure?
Take a backup of filesystem metadata to a local disk and a remote NFS mount.
Store the filesystem metadata in the cloud.
Use a machine with at least 12 CPUs
Using expensive and reliable hardware.
Answer: Option A
Explanation: Hadoop can be configured so that the name node writes its filesystem metadata to a local disk and to a remote NFS mount.
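The idea of keeping identical metadata copies in several directories can be sketched as follows. In Hadoop 1 this is driven by a comma-separated list of directories in `dfs.name.dir`; the code below is only a toy model of that behaviour, with temporary directories standing in for the local disk and the NFS mount, and `write_metadata` as a hypothetical helper.

```python
import os
import tempfile


def write_metadata(image: bytes, name_dirs):
    """Write the same metadata image into every configured directory."""
    written = []
    for directory in name_dirs:
        os.makedirs(directory, exist_ok=True)
        path = os.path.join(directory, "fsimage")
        with open(path, "wb") as handle:
            handle.write(image)
        written.append(path)
    return written


# Two temporary directories stand in for a local disk and a remote NFS mount.
base = tempfile.mkdtemp()
name_dirs = [os.path.join(base, "local_disk"), os.path.join(base, "nfs_mount")]
copies = write_metadata(b"toy-namespace-image", name_dirs)
```

If the machine holding the first directory is lost, the copy on the second location still describes the whole namespace.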
Under HDFS federation
Each name node manages metadata of the entire filesystem.
Each name node manages metadata of a portion of the filesystem.
Failure of one name node causes loss of some metadata availability from the entire filesystem.
Each data node registers with each name node.
Answer: Option B
Explanation: Each name node manages the metadata of a portion of the filesystem.
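A small sketch can show what "a portion of the filesystem" means in practice: each name node owns one part of the namespace, and a path is routed to whichever name node owns its prefix. The mount table, host names, and `owning_namenode` function below are hypothetical and only model the idea.

```python
# Hypothetical mount table: namespace prefix -> name node that owns it.
MOUNT_TABLE = {
    "/user": "namenode1.example.com",
    "/data": "namenode2.example.com",
}


def owning_namenode(path, mount_table=MOUNT_TABLE):
    """Return the name node whose prefix covers the path (longest prefix wins)."""
    best = None
    for prefix in mount_table:
        # Match whole path components so "/user" does not claim "/username".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no name node manages {path}")
    return mount_table[best]


print(owning_namenode("/user/alice/report.txt"))
```

Because the namespaces are independent, losing one name node in this model affects only the paths under its own prefixes.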
What is a SequenceFile?
A SequenceFile contains a binary encoding of an arbitrary number of homogeneous writable objects.
A SequenceFile contains a binary encoding of an arbitrary number of WritableComparable objects, in sorted order.
A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.
A SequenceFile contains a binary encoding of an arbitrary number of heterogeneous writable objects.
Answer: Option C
Explanation: A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.
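The type constraint is easy to model. The class below is a toy stand-in, not Hadoop's SequenceFile writer: `ToySequenceFile` and its fields are hypothetical, and it skips the binary encoding entirely to focus on the rule that all keys share one type and all values share one type.

```python
class ToySequenceFile:
    """Toy model: stores key-value pairs and enforces one key type and one value type."""

    def __init__(self, key_type, value_type):
        self.key_type = key_type
        self.value_type = value_type
        self.records = []

    def append(self, key, value):
        # Reject any pair whose key or value type differs from the declared types.
        if type(key) is not self.key_type or type(value) is not self.value_type:
            raise TypeError(
                f"expected ({self.key_type.__name__}, {self.value_type.__name__}), "
                f"got ({type(key).__name__}, {type(value).__name__})"
            )
        self.records.append((key, value))


seq = ToySequenceFile(str, int)
seq.append("apple", 3)
seq.append("pear", 7)
```

Note that the keys and values may be of different types from each other (here `str` and `int`); what is fixed is the type within each role.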
Hadoop is a framework that works with a variety of related tools. Which of the following are common cohorts?
MapReduce, Hive, and HBase
MapReduce, MySQL, and Google Apps
MapReduce, Hummer, and Iguana
MapReduce, Heron, and Trumpet
Answer: Option A
Explanation: Common cohorts include MapReduce, Hive, and HBase.
Data locality feature in Hadoop means
Store the same data across multiple nodes.
Relocate the data from one node to another.
Co-locate the data with the computing nodes.
Distribute the data across multiple nodes.
Answer: Option C
Explanation: Data locality means co-locating the data with the computing nodes, so that computation runs where the data is stored.
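A scheduler that honours data locality can be sketched as a simple preference rule: run the task on a free node that already holds the block, and fall back to any free node otherwise. The function `pick_node` and the node names are hypothetical; real schedulers also consider rack-level locality, which this sketch leaves out.

```python
def pick_node(replica_hosts, free_nodes):
    """Prefer a free node that already stores the block; otherwise take any free node."""
    for node in free_nodes:
        if node in replica_hosts:
            return node, "data-local"
    if free_nodes:
        return free_nodes[0], "remote"
    return None, "none"


# The block is stored on node2 and node5; node1, node5 and node7 have free slots.
choice = pick_node({"node2", "node5"}, ["node1", "node5", "node7"])
print(choice)
```

Picking the data-local node means the block is read from local disk instead of being shipped across the network.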
What does commodity hardware mean in the Hadoop world?
Very cheap hardware
Industry-standard hardware
Discarded hardware
Low-specification, industry-grade hardware
Answer: Option D
Explanation: Low-specification, industry-grade hardware.
In a Hadoop cluster, what is true of an HDFS block that is no longer available due to disk corruption or machine failure?
It is lost forever
It can be replicated from its alternative locations to other live machines.
The name node allows new client requests to keep trying to read it.
The MapReduce job process runs ignoring the block and the data stored in it.
Answer: Option B
Explanation: The block can be replicated from its alternative locations to other live machines.
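The recovery step can be modelled roughly as follows: after a node fails, any block that still has at least one surviving replica is copied to other live nodes until it is back at its target replica count. The function `re_replicate`, the block and node names, and the deterministic choice of target nodes are assumptions for the example; the target of 3 matches the usual default replication factor.

```python
def re_replicate(block_locations, live_nodes, target=3):
    """Return a new block -> nodes map after restoring under-replicated blocks."""
    result = {}
    for block, holders in block_locations.items():
        alive = [node for node in holders if node in live_nodes]
        if alive:
            # At least one replica survives, so it can be copied to other live nodes.
            candidates = [node for node in sorted(live_nodes) if node not in alive]
            alive += candidates[: max(0, target - len(alive))]
        result[block] = alive
    return result


# node1 has failed; node2..node5 are still alive.
before = {"blk_1": ["node1", "node2", "node3"], "blk_2": ["node1"]}
after = re_replicate(before, {"node2", "node3", "node4", "node5"})
print(after)
```

In this run `blk_1` regains a third replica on `node4`, while `blk_2` had its only copy on the failed node and cannot be recovered, which is why a replication factor above 1 matters.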