Free Certified Machine Learning Professional Sample Questions

Free sample questions for the Certified Machine Learning Professional exam. No account required: study at your own pace.

Question 1

A data scientist has written a function to track the runs of their random forest model. The data scientist is changing the number of trees in the forest across each run. Which of the following MLflow operations is designed to log single values like the number of trees in a random forest?

A. mlflow.log_artifact
B. mlflow.log_model
C. mlflow.log_metric
D. mlflow.log_param
E. There is no way to store values like this

Correct Answer:

D. mlflow.log_param

Question 2

A machine learning engineering team has written predictions computed in a batch job to a Delta table for querying. However, the team has noticed that the querying is running slowly. The team has already tuned the size of the data files. Upon investigating, the team has concluded that the rows meeting the query condition are sparsely located throughout each of the data files. Based on the scenario, which of the following optimization techniques could speed up the query by colocating similar records while considering values in multiple columns?

A. Z-Ordering
B. Bin-packing
C. Write as a Parquet file
D. Data skipping
E. Tuning the file size

Correct Answer:

A. Z-Ordering
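
Z-Ordering maps values from several columns onto a single sort key, so rows that are close in all of those columns land near each other in storage. The sketch below illustrates the idea with a plain-Python Morton (bit-interleaving) key. It is not Delta Lake's implementation: the function name `z_order_key`, the 8-bit width, and the sample rows are assumptions made for illustration.

```python
def z_order_key(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into one Morton key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key


# Hypothetical rows identified by two integer column values.
rows = [(0, 0), (7, 7), (1, 0), (6, 7), (0, 1), (7, 6)]

# Sorting by the interleaved key places rows that are close in BOTH columns
# next to each other, so a filter on either column touches fewer files.
rows_sorted = sorted(rows, key=lambda r: z_order_key(r[0], r[1]))
print(rows_sorted)
```

On Databricks the equivalent operation is requested declaratively, with a statement of the form `OPTIMIZE table_name ZORDER BY (col1, col2)`.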

Question 3

A machine learning engineering team wants to build a continuous pipeline for data preparation of a machine learning application. The team would like the data to be fully processed and made ready for inference in a series of equal-sized batches. Which of the following tools can be used to provide this type of continuous processing?

A. Spark UDFs
B. Structured Streaming
C. MLflow
D. Delta Lake
E. AutoML

Correct Answer:

B. Structured Streaming
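
Structured Streaming treats a continuous source as a sequence of small micro-batches. The following is only a plain-Python simulation of that processing model, not the Spark API: `micro_batches` and `prepare` are hypothetical helpers, and the batch size of 3 is arbitrary.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(source: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive batches of up to `batch_size` records from a stream."""
    it = iter(source)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def prepare(batch: List[int]) -> List[float]:
    """Hypothetical data-preparation step: divide each record by 10."""
    return [value / 10 for value in batch]


# A finite range stands in for an unbounded stream of incoming records.
incoming = range(1, 10)
prepared = [prepare(batch) for batch in micro_batches(incoming, batch_size=3)]
print(prepared)
```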

Question 4

Which of the following describes the concept of MLflow Model flavors?

A. convention that deployment tools can use to wrap preprocessing logic into a Model
B. convention that MLflow Model Registry can use to version models
C. convention that MLflow Experiments can use to organize their Runs by project
D. convention that deployment tools can use to understand the model
E. convention that MLflow Model Registry can use to organize its Models by project

Correct Answer:

D. convention that deployment tools can use to understand the model

Question 5

A machine learning engineer wants to deploy a model for real-time serving using MLflow Model Serving. For the model, the machine learning engineer currently has one model version in each of the stages in the MLflow Model Registry. The engineer wants to know which model versions can be queried once Model Serving is enabled for the model. Which of the following lists all of the MLflow Model Registry stages whose model versions are automatically deployed with Model Serving?

A. Staging, Production, Archived
B. Production
C. None, Staging, Production, Archived
D. Staging, Production
E. None, Staging, Production

Correct Answer:

D. Staging, Production

Question 6

Which of the following Databricks-managed MLflow capabilities is a centralized model store?

A. Models
B. Model Registry
C. Model Serving
D. Feature Store
E. Experiments

Correct Answer:

B. Model Registry

Question 7

A machine learning engineer is monitoring label values for a production machine learning classification model. The engineer believes that the relative prevalence of the classes is changing in more recent data. Which tool can the machine learning engineer use to assess their theory?

A. One-way Chi-squared Test
B. Jensen-Shannon distance
C. Two-way Chi-squared Test
D. Kolmogorov-Smirnov (KS) test

Correct Answer:

A. One-way Chi-squared Test
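
A one-way (goodness-of-fit) chi-squared test compares observed category counts against the counts expected under a reference distribution. The sketch below computes the statistic by hand for a three-class label. The proportions and counts are made up, and `chi_squared_statistic` is a hypothetical helper. In practice a library routine such as `scipy.stats.chisquare` would also return a p-value.

```python
def chi_squared_statistic(observed, expected_proportions):
    """One-way (goodness-of-fit) chi-squared statistic for category counts."""
    total = sum(observed)
    statistic = 0.0
    for count, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        statistic += (count - expected) ** 2 / expected
    return statistic


# Hypothetical data: class shares at training time vs. recent label counts.
training_proportions = [0.5, 0.3, 0.2]
recent_label_counts = [40, 30, 30]

stat = chi_squared_statistic(recent_label_counts, training_proportions)

# 5.991 is the chi-squared critical value for 2 degrees of freedom at the
# 0.05 significance level (degrees of freedom = number of classes - 1).
CRITICAL_VALUE_DF2 = 5.991
drift_suspected = stat > CRITICAL_VALUE_DF2
print(f"statistic={stat:.3f}, drift_suspected={drift_suspected}")
```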

Question 8

A data scientist has developed a model to predict ice cream sales using the expected temperature and expected number of hours of sun in the day. However, the expected temperature is dropping beneath the range of the input variable on which the model was trained. Which of the following types of drift is present in the above scenario?

A. Label drift
B. None of these
C. Concept drift
D. Prediction drift
E. Feature drift

Correct Answer:

E. Feature drift
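
One simple way to surface this kind of feature drift is to compare incoming feature values with the range observed during training. The sketch below does that for a single numeric feature. The helper `out_of_range_fraction` and all temperature values are hypothetical.

```python
def out_of_range_fraction(values, train_min, train_max):
    """Fraction of incoming values outside the range seen during training."""
    outside = [v for v in values if v < train_min or v > train_max]
    return len(outside) / len(values)


# Hypothetical numbers: temperatures (°C) seen in training vs. recent inputs.
training_temperatures = [18.0, 22.5, 25.0, 30.0, 35.5]
recent_temperatures = [12.0, 15.5, 19.0, 17.0]

fraction = out_of_range_fraction(
    recent_temperatures,
    train_min=min(training_temperatures),
    train_max=max(training_temperatures),
)
print(f"{fraction:.0%} of recent temperatures fall outside the training range")
```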

Question 9

A machine learning engineer is monitoring categorical input variables for a production machine learning application. The engineer believes that missing values are becoming more prevalent in more recent data for a particular value in one of the categorical input variables. Which of the following tools can the machine learning engineer use to assess their theory?

A. Kolmogorov-Smirnov (KS) test
B. One-way Chi-squared Test
C. Two-way Chi-squared Test
D. Jensen-Shannon distance
E. None of these

Correct Answer:

B. One-way Chi-squared Test

Question 10

A machine learning engineer has detected that concept drift is occurring in a production machine learning application. Which of the following is an impact of concept drift?

A. The model's latency will decrease
B. The model's efficacy will increase
C. The model's latency will increase
D. The model's efficacy will decrease

Correct Answer:

D. The model's efficacy will decrease

Question 11

Which of the following describes batch deployment for machine learning projects?

A. Predictions are computed and delivered as soon as feature values are available
B. None of these describe batch deployment for machine learning projects
C. Predictions are computed prior to delivery and stored for later querying
D. Predictions are computed immediately as data arrives and stored for later querying

Correct Answer:

C. Predictions are computed prior to delivery and stored for later querying
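
The batch pattern can be sketched in a few lines: a scheduled job scores every known record ahead of time, and delivery is a lookup against the stored results. Everything here is hypothetical, including the `predict` stand-in, the record keys, and the in-memory dictionary that plays the role of a prediction table.

```python
def predict(features):
    """Hypothetical stand-in for a trained model's predict function."""
    return 2.0 * features["temperature"] + 1.0


# Batch job: score every known record ahead of time and store the results.
records = {
    "store_1": {"temperature": 20.0},
    "store_2": {"temperature": 30.0},
}
prediction_table = {key: predict(feats) for key, feats in records.items()}

# Later, at delivery time: a lookup, with no model inference involved.
print(prediction_table["store_2"])
```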

Question 12

A machine learning engineer has found drift in a production machine learning application. The engineer has determined that retraining and deploying a new model is necessary. Which statement must be true prior to deploying the new model?

A. All of these statements must be true prior to deploying the new model
B. The new model must perform better than the original model on a random subset of the available data
C. The new model must perform better than the original model on all of the available data
D. The new model must perform better than the original model on the most recently available data

Correct Answer:

D. The new model must perform better than the original model on the most recently available data
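
A minimal sketch of that gate, assuming a simple regression setting: both models are scored on only the newest rows, and the new model is promoted if its error is lower there. The two model functions, the data, and the three-row evaluation window are all hypothetical.

```python
def mean_absolute_error(model, rows):
    """Average absolute error of `model` over (feature, label) rows."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


# Hypothetical models: the true relationship shifted from y = 2x to y = 3x.
def original_model(x):
    return 2.0 * x


def retrained_model(x):
    return 3.0 * x


# Hypothetical (feature, label) rows ordered oldest to newest.
history = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 12.0), (5.0, 15.0), (6.0, 18.0)]
most_recent = history[-3:]  # evaluation window: the newest rows only

old_error = mean_absolute_error(original_model, most_recent)
new_error = mean_absolute_error(retrained_model, most_recent)
deploy = new_error < old_error
print(f"old={old_error:.2f}, new={new_error:.2f}, deploy={deploy}")
```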

Question 13

Which of the following statements describes streaming with Spark as a model deployment strategy?

A. The inference of batch processed records as soon as a trigger is hit
B. The inference of all types of records in real-time
C. The inference of batch processed records as soon as a Spark job is run
D. The inference of incrementally processed records as soon as a trigger is hit
E. The inference of incrementally processed records as soon as a Spark job is run

Correct Answer:

D. The inference of incrementally processed records as soon as a trigger is hit

Question 14

Which of the following MLflow operations can be used to automatically calculate and log a Shapley feature importance plot?

A. mlflow.shap.log_explanation
B. None of these operations can accomplish the task
C. mlflow.shap
D. mlflow.log_figure
E. client.log_artifact

Correct Answer:

A. mlflow.shap.log_explanation

Question 15

Which of the following statements about built-in library-specific MLflow Model flavors is true?

A. Built-in library-specific flavors are required for model signature use
B. Built-in library-specific flavors allow models to be exported as library objects
C. Built-in library-specific flavors allow models to be used with any library
D. Built-in library-specific flavors can only be used for logging models

Correct Answer:

B. Built-in library-specific flavors allow models to be exported as library objects

Question 16

A machine learning engineer is using joblibspark and MLflowCallback to perform a distributed hyperparameter tuning experiment via Optuna. Assuming they are optimizing for a single objective, which sampler will Optuna use by default?

A. GridSampler
B. RandomSampler
C. NSGAIISampler
D. TPESampler

Correct Answer:

D. TPESampler

Question 17

A data scientist wants to examine the data in the Feature Store table named table in the database dev as a Spark DataFrame. They have access to the Feature Store Client fs. Which line of code can be used to get the data from table as a Spark DataFrame?

A. fs.get_table("table")
B. fs.read_table("dev.table")
C. fs.create_table("dev.table")
D. fs.get_table("dev.table")

Correct Answer:

B. fs.read_table("dev.table")

Question 18

A machine learning engineer needs to select a deployment strategy for a new machine learning application. The feature values are not available until the time of delivery, and results are needed exceedingly fast for one record at a time. Which of the following deployment strategies can be used to meet these requirements?

A. Edge/on-device
B. Streaming
C. None of these strategies will meet the requirements
D. Batch
E. Real-time

Correct Answer:

E. Real-time

Question 19

A data scientist has developed and logged a scikit-learn random forest model named model, and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object. Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?

A. mlflow.load_model(model_uri)
B. client.list_artifacts(run_id)["feature-importances.csv"]
C. mlflow.sklearn.load_model(model_uri)
D. This can only be viewed in the MLflow Experiments UI
E. client.pyfunc.load_model(model_uri)

Correct Answer:

C. mlflow.sklearn.load_model(model_uri)

Question 20

Which of the following tools can assist in real-time deployments by packaging software with its own application, tools, and libraries?

A. Cloud-based compute
B. None of these tools
C. REST APIs
D. Containers
E. Autoscaling clusters

Correct Answer:

D. Containers
