A Learning Portal from Recruitment India
Which of the following statements about raw data is correct?
Raw data is the data obtained after processing steps
Raw data is the original source of data
Preprocessed data is the original source of data
None of the Mentioned
Answer with explanation
Answer: Option B
Explanation:
Raw data is the original source of data; it has not yet gone through any processing steps.
In NLP, stemming is a technique to:
split words, phrases, idioms
determine where one word ends and other begins
discover topics in a collection of documents
map valid word root
Answer with explanation
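Answer: Option D
Explanation:
Stemming maps a word to its root form (stem), usually by stripping affixes. The resulting stem is not guaranteed to be a dictionary word.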
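As a rough illustration of stemming, the sketch below strips a few common English suffixes. The suffix list and the `toy_stem` helper are assumptions made up for this example; it is a toy, not the Porter algorithm or any library's stemmer.

```python
# Toy suffix-stripping stemmer: an illustrative assumption, not the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")  # hypothetical suffix list, checked in this order


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters of stem."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


words = ["jumping", "jumped", "jumps", "jump"]
stems = [toy_stem(w) for w in words]
print(stems)                # all four words reduce to the same stem
print(toy_stem("studies"))  # the stem "studi" is not a dictionary word
```

Real stemmers apply many more rules, but the idea is the same: related word forms collapse to one root.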
Which of the following forms the basis for the frequency interpretation of probabilities?
Asymptotics
Symptotics
Asymmetry
All of the Mentioned
Answer with explanation
Answer: Option A
Explanation:
Asymptotics is the term for the behavior of statistics as the sample size tends to infinity.
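The frequency interpretation can be illustrated with a small simulation: as the number of coin flips grows, the observed proportion of heads settles near the true probability. This is a minimal sketch; the `relative_frequency` helper, the seed, and the sample sizes are assumptions chosen for the example.

```python
import random


def relative_frequency(n_flips: int, p_heads: float = 0.5, seed: int = 0) -> float:
    """Simulate n_flips coin flips and return the observed proportion of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_flips))
    return heads / n_flips


# The proportion of heads should drift toward 0.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```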
Who is a data scientist?
Software programmer
Statistician
Mathematician
All of the above
Answer with explanation
Answer: Option D
Explanation:
Data scientists work with large volumes of structured and unstructured data and use their math, statistics, and programming skills to clean and organize it. In effect, they combine all three roles.
Point out the correct statement:
A Shiny project is a directory containing at least three parts
A Shiny project is a file containing at least three parts
A Shiny project is a directory containing only one part
None of the Mentioned
Answer with explanation
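Answer: Option D
Explanation:
A Shiny project is a directory containing at least two parts, typically a user-interface definition (ui.R) and a server script (server.R), so none of the listed statements is correct.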
Which function will you use from the NumPy library to convert the angles from degrees to radians?
radians(x[, out])
degree(x[, out])
convert(x[, out])
rad2deg(x[, out])
Answer with explanation
Answer: Option A
Explanation:
To convert angles from degrees to radians, use numpy.radians (numpy.deg2rad is equivalent). rad2deg does the opposite: it converts radians to degrees.
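A short check of the NumPy conversion functions, with sample angles chosen only for illustration: `np.radians` and `np.deg2rad` convert degrees to radians, while `np.rad2deg` goes the other way.

```python
import numpy as np

angles_deg = np.array([0.0, 90.0, 180.0])  # sample angles in degrees

# Degrees -> radians; np.deg2rad gives the same result.
angles_rad = np.radians(angles_deg)
print(angles_rad)

# np.rad2deg is the inverse: radians -> degrees.
round_trip = np.rad2deg(angles_rad)
print(round_trip)
```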
What will be the result in vector addition if labels are not found in a series?
Will be marked as Zeros
Will be skipped
Will be marked as NaN
Will throw an exception, index not found
Answer with explanation
Answer: Option C
Explanation:
The missing labels will be marked as NaN (Not a Number).
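The label alignment behind this answer can be seen in a small pandas sketch. The two Series and their labels are made up for the example. The `fill_value` argument of `Series.add` is one way to avoid the NaN when that is not wanted.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20], index=["y", "z"])  # no "x" label here

# Addition aligns on labels; "x" exists only in `a`, so the result there is NaN.
total = a + b
print(total)

# Treat a missing label as 0 instead of producing NaN.
filled = a.add(b, fill_value=0)
print(filled)
```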
Which of the following makes use of pandas and returns data in a Series or DataFrame?
pandaSDMX
OutPy
fredapi
None of the Mentioned
Answer with explanation
Answer: Option C
Explanation:
The fredapi module is a Python interface to the Federal Reserve Economic Data (FRED) service. It requires a FRED API key that you can obtain for free on the FRED website.
Which of the following is another name for raw data?
eggy data
destination data
secondary
Machine Learning
Answer with explanation
Answer: Option A
Explanation:
Although raw data has the potential to become “information,” extraction, organization, and sometimes analysis and formatting for presentation are required for that to occur.
In statistics, a Type II error occurs when:
a hypothesis is chosen incorrectly
a test statistic is incorrect
a null hypothesis is rejected but should not be rejected
a null hypothesis is not rejected but should be rejected
Answer with explanation
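Answer: Option D
Explanation:
A Type II error (false negative) occurs when a null hypothesis is not rejected but should be rejected. Rejecting a null hypothesis that should not be rejected (option C) is a Type I error.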
What type of chart should we use if we have estimated a set of data and want to plot the uncertainty of the estimation?
Heat map
3D surface
Contour plot
Error bar plot
Answer with explanation
Answer: Option D
Explanation:
An error bar plot is used to visualize the uncertainty of data, whereas a 3D surface plot is used to visualize functions of two variables.
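A minimal Matplotlib sketch of an error bar plot follows. The estimates and uncertainties are invented values for illustration, and the non-interactive "Agg" backend is selected only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
estimates = [2.0, 2.9, 4.2, 4.8]    # made-up point estimates
uncertainty = [0.3, 0.2, 0.5, 0.4]  # made-up symmetric errors for each estimate

fig, ax = plt.subplots()
# yerr draws a vertical bar of +/- uncertainty around each estimate.
container = ax.errorbar(x, estimates, yerr=uncertainty, fmt="o", capsize=4)
ax.set_xlabel("measurement")
ax.set_ylabel("estimate")
```

Calling `fig.savefig(...)` with a file name of your choice would write the figure to disk.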
Which of the following takes a dict of dicts or a dict of array-like sequences and returns a DataFrame?
DataFrame.from_items
DataFrame.from_records
DataFrame.from_dict
All of the Mentioned
Answer with explanation
Answer: Option C
Explanation:
DataFrame.from_dict takes a dict of dicts or a dict of array-like sequences and returns a DataFrame. It operates like the DataFrame constructor except for the orient parameter, which is 'columns' by default.
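The effect of the orient parameter can be sketched as follows. The dictionary contents are made up for the example.

```python
import pandas as pd

data = {"a": [1, 2, 3], "b": [4, 5, 6]}  # dict of array-like sequences

# Default orient="columns": each dict key becomes a column.
by_columns = pd.DataFrame.from_dict(data)
print(by_columns)

# orient="index": each dict key becomes a row label instead.
by_rows = pd.DataFrame.from_dict(data, orient="index")
print(by_rows)
```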