Free E20-007 Sample Questions — Data Science and Big Data Analytics

Free E20-007 sample questions for the Data Science and Big Data Analytics exam. No account required: study at your own pace.

Question 1

Which word or phrase completes the statement? Business Intelligence is to monitoring trends as Data Science is to ________ trends.

  • A. Predicting
  • B. Discarding
  • C. Driving
  • D. Optimizing
Correct Answer:
A. Predicting
Question 2

Data visualization is used in the final presentation of an analytics project. For what else is this technique commonly used?

  • A. Data exploration
  • B. Descriptive statistics
  • C. ETLT
  • D. Model selection
Correct Answer:
A. Data exploration
Question 3

Your colleague, who is new to Hadoop, approaches you with a question. They want to know how best to access their data. This colleague has a strong background in data flow languages and programming. Which query interface would you recommend?

  • A. Pig
  • B. Hive
  • C. Howl
  • D. HBase
Correct Answer:
A. Pig
Question 4

Which type of numeric value does a logistic regression model estimate?

  • A. Probability
  • B. p-value
  • C. Any integer
  • D. Any real number
Correct Answer:
A. Probability
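To see why the output of a logistic regression model is read as a probability, here is a minimal sketch. The coefficients and feature values are hypothetical and chosen only for illustration. The linear score can be any real number, but the logistic (sigmoid) function maps it into the interval between 0 and 1.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    # The linear score (log-odds) is unbounded: it can be any real number.
    z = bias + sum(w * x for w, x in zip(weights, features))
    # The model's estimate is the sigmoid of that score, i.e. a probability.
    return sigmoid(z)


# Hypothetical coefficients and inputs, for illustration only.
weights = [0.8, -1.2]
bias = 0.1
p = predict_probability([2.0, 0.5], weights, bias)
print(round(p, 3))
```

A score of zero corresponds to a probability of 0.5; larger positive scores push the estimate toward 1 and larger negative scores push it toward 0.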
Question 5

The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in their massively parallel database. Which tool should they use to export the structured data from Hadoop?

  • A. Sqoop
  • B. Pig
  • C. Chukwa
  • D. Scribe
Correct Answer:
A. Sqoop
Question 6

You are using the Apriori algorithm to determine the likelihood that a person who owns a home has a good credit score. You have determined that the confidence for the rules used in the algorithm is > 75%. You calculate lift = 1.011 for the rule, "People with good credit are homeowners". What can you determine from the lift calculation?

  • A. Support for the association is low
  • B. Leverage of the rules is low
  • C. The rule is coincidental
  • D. The rule is true
Correct Answer:
B. Leverage of the rules is low
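The relationship between lift and leverage can be checked with a short calculation. The counts below are hypothetical (they are not from the exam question) and were picked so that confidence is above 75% and lift is close to 1. Lift is P(A and B) / (P(A) * P(B)), while leverage is P(A and B) - P(A) * P(B), so a lift near 1 implies a leverage near 0.

```python
def rule_metrics(n_total, n_a, n_b, n_both):
    """Compute support, confidence, lift, and leverage for the rule A -> B."""
    p_a = n_a / n_total
    p_b = n_b / n_total
    p_ab = n_both / n_total
    return {
        "support": p_ab,
        "confidence": p_ab / p_a,
        "lift": p_ab / (p_a * p_b),
        "leverage": p_ab - p_a * p_b,
    }


# Hypothetical counts: A = good credit, B = homeowner.
metrics = rule_metrics(n_total=1000, n_a=600, n_b=800, n_both=485)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these assumed counts the rule has high confidence, yet A and B co-occur only slightly more often than they would if they were independent.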
Question 7

You are testing two new weight-gain formulas for puppies. The test gives the following results:

  • Control group: 1% weight gain
  • Formula A: 3% weight gain
  • Formula B: 4% weight gain

A one-way ANOVA returns a p-value of 0.027. What can you conclude?

  • A. Either Formula A or Formula B is effective at promoting weight gain
  • B. Formula B is more effective at promoting weight gain than Formula A
  • C. Formula A and Formula B are both effective at promoting weight gain
  • D. Formula A and Formula B are about equally effective at promoting weight gain
Correct Answer:
A. Either Formula A or Formula B is effective at promoting weight gain
Question 8

A data scientist plans to classify the sentiment polarity of 10,000 product reviews collected from the Internet. What is the most appropriate model to use? Suppose labeled training data is available.

  • A. Naïve Bayesian classifier
  • B. Linear regression
  • C. Logistic regression
  • D. K-means clustering
Correct Answer:
A. Naïve Bayesian classifier
Question 9

Review the following code: SELECT pn, vn, sum(prc*qty) FROM sale GROUP BY CUBE(pn, vn) ORDER BY 1, 2, 3; Which combination of subtotals do you expect to be returned by the query?

  • A. (pn, vn)
  • B. ( (pn, vn), (pn) )
  • C. ( (pn, vn) , (pn), (vn) )
  • D. ( (pn, vn) , (pn), (vn) , ( ) )
Correct Answer:
D. ( (pn, vn) , (pn), (vn) , ( ) )
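CUBE groups by every subset of the listed columns, including the empty set (the grand total). The sketch below enumerates those grouping sets in Python; the helper name cube_grouping_sets is made up for illustration and is not part of any SQL engine.

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return every subset of columns, from the full set down to the empty set."""
    grouping_sets = []
    for size in range(len(columns), -1, -1):
        grouping_sets.extend(combinations(columns, size))
    return grouping_sets


# Prints [('pn', 'vn'), ('pn',), ('vn',), ()]
print(cube_grouping_sets(["pn", "vn"]))
```

For n columns this yields 2^n grouping sets, which is why CUBE on two columns returns four levels of aggregation.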
Question 10

When would you use GROUP BY ROLLUP clause in your OLAP query?

  • A. where all subtotals and grand totals are to be included in the output
  • B. where only the subtotals are to be included in the output
  • C. where only the grand totals are to be included in the output
  • D. where only specific subtotals and grand totals for a combination of variables are to be included in the output
Correct Answer:
A. where all subtotals and grand totals are to be included in the output
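ROLLUP treats its columns as a hierarchy: it produces a subtotal at each level of that hierarchy plus the grand total. The sketch below lists the grouping sets this implies; the helper name rollup_grouping_sets is hypothetical and only illustrates the idea.

```python
def rollup_grouping_sets(columns):
    """Return the prefixes of columns, from the full list down to the empty set."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


# Prints [('pn', 'vn'), ('pn',), ()]
print(rollup_grouping_sets(["pn", "vn"]))
```

For n columns this yields n + 1 grouping sets, with the empty set standing for the grand total.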
