Question 1
What describes the variance of a set of values?
A. Variance is a measure of how far a single observed value is from a set of values
B. Variance is a measure of how far an observed value is from the variable’s maximum or minimum value
C. Variance is a measure of central tendency of a set of values
D. Variance is a measure of how far a set of values is spread out from the set's central value
Correct Answer:
D. Variance is a measure of how far a set of values is spread out from the set's central value
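Variance is the average squared deviation of the values from their mean, so it describes spread rather than central tendency. A minimal sketch, assuming two made-up datasets (`tight` and `wide` are hypothetical names) that share the same mean but differ in spread:

```python
from statistics import mean, pvariance

# Hypothetical sample data: same mean, different spread.
tight = [9, 10, 11]
wide = [0, 10, 20]


def population_variance(values):
    """Average squared distance of each value from the mean."""
    center = mean(values)
    return sum((v - center) ** 2 for v in values) / len(values)


# Both datasets have the same central value...
print(mean(tight), mean(wide))

# ...but the wider dataset has a much larger variance.
print(population_variance(tight), population_variance(wide))

# The hand-written formula agrees with the standard library's pvariance.
print(pvariance(tight), pvariance(wide))
```

The means match while the variances differ, which is why variance cannot be a measure of central tendency.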
Question 2
Which of the following statements about adding visual appeal to visualizations in the Visualization Editor is incorrect?
A. Visualization scale can be changed
B. Data Labels can be formatted
C. Colors can be changed
D. Borders can be added
E. Tooltips can be formatted
Correct Answer:
D. Borders can be added
Question 3
A data team has been given a series of projects by a consultant that need to be implemented in the Databricks Lakehouse Platform. Which of the following projects should be completed in Databricks SQL?
A. Testing the quality of data as it is imported from a source
B. Tracking usage of feature variables for machine learning projects
C. Combining two data sources into a single, comprehensive dataset
D. Segmenting customers into like groups using a clustering algorithm
E. Automating complex notebook-based workflows with multiple tasks
Correct Answer:
C. Combining two data sources into a single, comprehensive dataset
Question 4
A data analyst has been asked to produce a visualization that shows the flow of users through a website. Which of the following is used for visualizing this type of flow?
A. Heatmap
B. Choropleth
C. Word Cloud
D. Pivot Table
E. Sankey
Correct Answer:
E. Sankey
Question 5
A data analyst is attempting to drop a table my_table. The analyst wants to delete all table metadata and data. They run the following command: DROP TABLE IF EXISTS my_table; While the object no longer appears when they run SHOW TABLES, the data files still exist. Which of the following describes why the data files still exist and the metadata files were deleted?
A. The table's data was larger than 10 GB
B. The table did not have a location
C. The table was external
D. The table's data was smaller than 10 GB
E. The table was managed
Correct Answer:
C. The table was external
Question 6
Which of the following is a benefit of Databricks SQL using ANSI SQL as its standard SQL dialect?
A. It has increased customization capabilities
B. It is easy to migrate existing SQL queries to Databricks SQL
C. It allows for the use of Photon's computation optimizations
D. It is more performant than other SQL dialects
E. It is more compatible with Spark's interpreters
Correct Answer:
B. It is easy to migrate existing SQL queries to Databricks SQL
Question 7
Which of the following describes how Databricks SQL should be used in relation to other business intelligence (BI) tools like Tableau, Power BI, and Looker?
A. As an exact substitute with the same level of functionality
B. As a substitute with less functionality
C. As a complete replacement with additional functionality
D. As a complementary tool for professional-grade presentations
E. As a complementary tool for quick in-platform BI work
Correct Answer:
E. As a complementary tool for quick in-platform BI work
Question 8
A data analyst wants to create a Databricks SQL dashboard with multiple data visualizations and multiple counters. What must be completed before adding the data visualizations and counters to the dashboard?
A. All data visualizations and counters must be created using Queries
B. A SQL warehouse (formerly known as SQL endpoint) must be turned on and selected
C. A markdown-based tile must be added to the top of the dashboard displaying the dashboard’s name
D. The dashboard owner must also be the owner of the queries, data visualizations, and counters
Correct Answer:
A. All data visualizations and counters must be created using Queries
Question 9
A data engineering team has created a Structured Streaming pipeline that processes data in micro-batches and populates gold-level tables. The micro-batches are triggered every minute. A data analyst has created a dashboard based on this gold-level data. The project stakeholders want to see the results in the dashboard updated within one minute or less of new data becoming available within the gold-level tables. Which of the following cautions should the data analyst share prior to setting up the dashboard to complete this task?
A. The required compute resources could be costly
B. The gold-level tables are not appropriately clean for business reporting
C. The streaming data is not an appropriate data source for a dashboard
D. The streaming cluster is not fault tolerant
E. The dashboard cannot be refreshed that quickly
Correct Answer:
A. The required compute resources could be costly
Question 10
A data analyst needs to share a Databricks SQL dashboard with stakeholders that are not permitted to have accounts in the Databricks deployment. The stakeholders need to be notified every time the dashboard is refreshed. Which approach can the data analyst use to accomplish this task with minimal effort?
A. By granting the stakeholders’ email addresses permissions to the dashboard
B. By adding the stakeholders’ email addresses to the refresh schedule subscribers list
C. By adding the stakeholders’ email addresses to the SQL Warehouse (formerly known as endpoint) subscribers list
D. By downloading the dashboard as a PDF and emailing it to the stakeholders each time it is refreshed
Correct Answer:
B. By adding the stakeholders’ email addresses to the refresh schedule subscribers list
Question 11
A data analyst is working with gold-layer tables to complete an ad-hoc project. A stakeholder has provided the analyst with an additional dataset that can be used to augment the gold-layer tables already in use. Which of the following terms is used to describe this data augmentation?
A. Data testing
B. Ad-hoc improvements
C. Last-mile dashboarding
D. Last-mile ETL
E. Data enhancement
Correct Answer:
E. Data enhancement
Question 12
Which of the following statements about a refresh schedule is incorrect?
A. A query can be refreshed anywhere from 1 minute to 2 weeks
B. Refresh schedules can be configured in the Query Editor
C. A query being refreshed on a schedule does not use a SQL Warehouse (formerly known as SQL Endpoint)
D. A refresh schedule is not the same as an alert
E. You must have workspace administrator privileges to configure a refresh schedule
Correct Answer:
E. You must have workspace administrator privileges to configure a refresh schedule
Question 13
A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard. Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?
A. Delta Lake
B. Databricks Notebooks
C. Tableau
D. Databricks Machine Learning
E. Databricks SQL
Correct Answer:
E. Databricks SQL
Question 14
A data analyst has been asked to configure an alert for a query that returns the income in the accounts_receivable table for a date range. The date range is configurable using a Date query parameter. The Alert does not work. Which of the following describes why the Alert does not work?
A. Alerts don't work with queries that access tables
B. Queries that return results based on dates cannot be used with Alerts
C. The wrong query parameter is being used. Alerts only work with Date and Time query parameters
D. Queries that use query parameters cannot be used with Alerts
E. The wrong query parameter is being used. Alerts only work with dropdown list query parameters, not dates
Correct Answer:
D. Queries that use query parameters cannot be used with Alerts
Question 15
A data analyst runs the following command: INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers; What is the result of running this command?
A. The suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, and any duplicate data is deleted
B. The command fails because it is written incorrectly
C. The suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, including any duplicate data
D. The suppliers table now contains the data from the new_suppliers table, and the new_suppliers table now contains the data from the suppliers table
E. The suppliers table now contains only the data from the new_suppliers table
Correct Answer:
C. The suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, including any duplicate data
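The append behavior can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made for portability: SQLite does not accept the `INSERT INTO ... TABLE ...` form, so the sketch uses `INSERT INTO ... SELECT *`, which should behave the same way for this purpose. The table contents are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a Databricks SQL schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE suppliers (id INTEGER, name TEXT);
    CREATE TABLE new_suppliers (id INTEGER, name TEXT);
    -- Hypothetical rows; (2, 'Globex') appears in both tables.
    INSERT INTO suppliers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO new_suppliers VALUES (2, 'Globex'), (3, 'Initech');
    """
)

# SQLite equivalent of:
#   INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers;
conn.execute("INSERT INTO suppliers SELECT * FROM new_suppliers")

rows = conn.execute("SELECT id, name FROM suppliers ORDER BY id").fetchall()
source_rows = conn.execute("SELECT COUNT(*) FROM new_suppliers").fetchone()[0]

print(rows)         # existing rows plus appended rows, duplicates kept
print(source_rows)  # the source table is left unchanged
```

The existing rows stay in place, the new rows are appended, and the duplicated row appears twice because a plain insert performs no deduplication.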