Free DP-100 Sample Questions — Designing and Implementing a Data Science Solution on Azure

Free DP-100 sample questions for the Designing and Implementing a Data Science Solution on Azure exam. No account required: study at your own pace.

Question 1

You manage an Azure Machine Learning workspace. You experiment with an MLflow model that trains interactively by using a notebook in the workspace. You need to log dictionary type artifacts of the experiments in Azure Machine Learning by using MLflow. Which syntax should you use?

A. mlflow.log_input(my_dict)
B. mlflow.log_metric("my_metric", my_dict)
C. mlflow.log_metrics(my_dict)
D. mlflow.log_text("my_metric", my_dict)

Correct Answer:

C. mlflow.log_metrics(my_dict)
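
A minimal sketch of answer C, assuming the `mlflow` package is installed and tracking is already configured (as it normally is inside an Azure Machine Learning notebook). `mlflow.log_metrics` takes a dictionary that maps metric names to numeric values. The helper names and the sample values below are hypothetical.

```python
def to_metric_dict(raw):
    """Keep only numeric entries, since log_metrics expects name -> number."""
    return {
        str(name): float(value)
        for name, value in raw.items()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def log_run_metrics(raw):
    """Log every numeric entry of `raw` to MLflow in a single call."""
    import mlflow  # imported lazily so to_metric_dict works without mlflow

    metrics = to_metric_dict(raw)
    # MLflow starts a run automatically if none is active.
    mlflow.log_metrics(metrics)
    return metrics


# Hypothetical values, for illustration only.
my_dict = {"accuracy": 0.91, "loss": 0.27, "note": "baseline"}
```

Calling `log_run_metrics(my_dict)` would log `accuracy` and `loss` and skip the non-numeric `note` entry.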

Question 2

You use Azure Machine Learning Studio to build a machine learning experiment. You need to divide data into two distinct datasets. Which module should you use?

A. Split Data
B. Load Trained Model
C. Assign Data to Clusters
D. Group Data into Bins

Correct Answer:

A. Split Data

Question 3

You are creating a new Azure Machine Learning pipeline using the designer. The pipeline must train a model using data in a comma-separated values (CSV) file that is published on a website. You have not created a dataset for this file. You need to ingest the data from the CSV file into the designer pipeline using the minimal administrative effort. Which module should you add to the pipeline in Designer?

A. Convert to CSV
B. Enter Data Manually
C. Import Data
D. Dataset

Correct Answer:

C. Import Data

Question 4

You are a data scientist creating a linear regression model. You need to determine how closely the data fits the regression line. Which metric should you review?

A. Root Mean Square Error
B. Coefficient of determination
C. Recall
D. Precision
E. Mean absolute error

Correct Answer:

B. Coefficient of determination
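
As a rough illustration of why the coefficient of determination (R²) measures goodness of fit, the sketch below computes it in plain Python as 1 minus the ratio of the residual sum of squares to the total sum of squares. The function name and sample data are made up for this example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the regression.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y, y)             # predictions lie on the data
baseline = r_squared(y, [2.5] * 4)    # predictions equal the mean
```

A value near 1 means the regression line explains most of the variance. A value near 0 means it does no better than predicting the mean.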

Question 5

You run a script as an experiment in Azure Machine Learning. You have a Run object named run that references the experiment run. You must review the log files that were generated during the experiment run. You need to download the log files to a local folder for review. Which two code segments can you run to achieve this goal? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.

A. run.get_details()
B. run.get_file_names()
C. run.get_metrics()
D. run.download_files(output_directory='./runfiles')
E. run.get_all_logs(destination='./runlogs')

Correct Answer:

D. run.download_files(output_directory='./runfiles')
E. run.get_all_logs(destination='./runlogs')

Question 6

You manage an Azure Machine Learning workspace. You must create and configure a compute cluster for a training job by using Python SDK v2. You need to create a persistent Azure Machine Learning compute resource, specifying the fewest possible properties. Which two properties should you define? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. size
B. min_instances
C. type
D. name
E. max_instances

Correct Answer:

A. size
E. max_instances

Question 7

You plan to use a Deep Learning Virtual Machine (DLVM) to train deep learning models using Compute Unified Device Architecture (CUDA) computations. You need to configure the DLVM to support CUDA. What should you implement?

A. Solid State Drives (SSD)
B. Central Processing Unit (CPU) speed increase by using overclocking
C. Graphics Processing Unit (GPU)
D. High Random Access Memory (RAM) configuration
E. Intel Software Guard Extensions (Intel SGX) technology

Correct Answer:

C. Graphics Processing Unit (GPU)

Question 8

This question is part of a series of questions that present the same scenario. Each question in the series has a different outcome. Determine whether the recommendation meets the requirements. You are creating a machine learning model. Your dataset includes rows with null and missing values. You plan to use the Clean Missing Data module in Azure Machine Learning Studio to detect and fix the null and missing values in the dataset. Recommendation: You use the Custom substitution value option. Will the requirements be met?

A. Yes
B. No

Correct Answer:

A. Yes

Question 9

You create a multi-class image classification deep learning model. You train the model by using PyTorch version 1.2. You need to ensure that the correct version of PyTorch can be identified for the inferencing environment when the model is deployed. What should you do?

A. Save the model locally as a .pt file, and deploy the model as a local web service
B. Deploy the model on a computer that is configured to use the default Azure Machine Learning conda environment
C. Register the model with a .pt file extension and the default version property
D. Register the model, specifying the model_framework and model_framework_version properties

Correct Answer:

D. Register the model, specifying the model_framework and model_framework_version properties

Question 10

You create an Azure Machine Learning workspace. You must use the Azure Machine Learning Python SDK v2 to define the search space for discrete hyperparameters. The hyperparameters must consist of a list of predetermined, comma-separated integer values. You need to import the class from the azure.ai.ml.sweep package used to create the list of values. Which class should you import?

A. Choice
B. Randint
C. Uniform
D. Normal

Correct Answer:

A. Choice

Question 11

You are working with a time series dataset in Azure Machine Learning Studio. You need to split your dataset into training and testing subsets by using the Split Data module. Which splitting mode should you use?

A. Recommender Split
B. Regular Expression Split
C. Relative Expression Split
D. Split Rows with the Randomized split parameter set to true

Correct Answer:

C. Relative Expression Split
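
The reasoning behind this answer is that time series data should be split by a condition on the time column rather than at random, so that the test set contains only later observations. The sketch below imitates that idea in plain Python. It is not the Split Data module itself, and the function name, column name, and sample rows are hypothetical.

```python
from datetime import date


def split_by_date(rows, cutoff, key="date"):
    """Put rows before `cutoff` in train and the rest in test, keeping order."""
    train = [row for row in rows if row[key] < cutoff]
    test = [row for row in rows if row[key] >= cutoff]
    return train, test


# Hypothetical daily observations, for illustration only.
rows = [
    {"date": date(2023, 1, 1), "value": 10},
    {"date": date(2023, 1, 2), "value": 12},
    {"date": date(2023, 1, 3), "value": 11},
    {"date": date(2023, 1, 4), "value": 15},
]
train, test = split_by_date(rows, cutoff=date(2023, 1, 3))
```

A randomized split would mix future rows into the training set, which leaks information the model would not have at prediction time.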

Question 12

You manage an Azure Machine Learning workspace. You use Azure Machine Learning Python SDK v2 to configure a trigger to schedule a pipeline job. You need to create a time-based schedule with a recurrence pattern. Which two properties must you use to successfully configure the trigger? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

A. interval
B. start_time
C. schedule
D. time_zone
E. frequency

Correct Answer:

A. interval
E. frequency

Question 13

You plan to run a script as an experiment. The script uses modules from the SciPy library and several Python packages that are not typically installed in a default conda environment. You plan to run the experiment on your local workstation for small datasets and scale out the experiment by running it on more powerful remote compute clusters for larger datasets. You need to ensure that the experiment runs successfully on local and remote compute with the least administrative effort. What should you do?

A. Leave the environment unspecified for the experiment. Run the experiment by using the default environment
B. Create a config.yaml file that defines the required conda packages and save the file in the experiment folder
C. Create and register an environment that includes the required packages. Use this environment for all experiment jobs
D. Create a virtual machine (VM) by using the required Python configuration and attach the VM as a compute target. Use this compute target for all experiment runs

Correct Answer:

C. Create and register an environment that includes the required packages. Use this environment for all experiment jobs

Question 14

You are performing a filter-based feature selection for a dataset to build a multi-class classifier by using Azure Machine Learning Studio. The dataset contains categorical features that are highly correlated to the output label column. You need to select the appropriate feature scoring statistical method to identify the key predictors. Which method should you use?

A. Kendall correlation
B. Spearman correlation
C. Chi-squared
D. Pearson correlation

Correct Answer:

C. Chi-squared
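
Chi-squared suits categorical features because it compares observed category counts against the counts expected if the feature and the label were independent. The following plain-Python sketch computes the statistic for a contingency table. The function name and the tables are invented for illustration and are not part of Azure Machine Learning.

```python
def chi_squared(table):
    """Chi-squared statistic for a contingency table (rows: feature, cols: label)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if feature and label were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


independent = chi_squared([[10, 10], [10, 10]])  # feature tells nothing
dependent = chi_squared([[20, 0], [0, 20]])      # feature determines label
```

A larger statistic indicates a stronger association between the feature and the label, which is what filter-based selection ranks on.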

Question 15

You are creating a compute target to train a machine learning experiment. The compute target must support automated machine learning, machine learning pipelines, and Azure Machine Learning designer training. You need to configure the compute target. Which option should you use?

A. Azure HDInsight
B. Azure Machine Learning compute cluster
C. Azure Batch
D. Remote VM

Correct Answer:

B. Azure Machine Learning compute cluster

Question 16

You need to implement a Data Science Virtual Machine (DSVM) that supports the Caffe2 deep learning framework. Which DSVM should you create?

A. Windows Server 2012 DSVM
B. Windows Server 2016 DSVM
C. Ubuntu 16.04 DSVM
D. CentOS 7.4 DSVM

Correct Answer:

C. Ubuntu 16.04 DSVM

Question 17

This question is part of a series of questions that present the same scenario. Each question in the series has a different outcome. Determine whether the recommendation meets the requirements. You have been asked to deploy a machine learning model that uses a PostgreSQL database and requires GPU processing to forecast prices. You are preparing to create a virtual machine that has the necessary tools built in. You need to use the correct virtual machine type. Recommendation: You use a Data Science Virtual Machine (DSVM) Windows edition. Will the requirements be met?

A. Yes
B. No

Correct Answer:

B. No

Question 18

You are a data scientist working for a hotel booking website company. You use the Azure Machine Learning service to train a model that identifies fraudulent transactions. You must deploy the model as an Azure Machine Learning online endpoint by using the Azure Machine Learning Python SDK v2. The deployed model must return real-time predictions of fraud based on transaction data input. You need to create the script that is specified as the scoring_script parameter for the CodeConfiguration class used to deploy the model. What should the entry script do?

A. Register the model with appropriate tags and properties
B. Create a Conda environment for the online endpoint compute and install the necessary Python packages
C. Load the model and use it to predict labels from input data
D. Start a node on the inference cluster where the model is deployed
E. Specify the number of cores and the amount of memory required for the online endpoint compute

Correct Answer:

C. Load the model and use it to predict labels from input data
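
A minimal sketch of the shape such a scoring script usually takes: an `init()` function that loads the model once, and a `run()` function that scores each request. The `MeanThresholdModel` class is a hypothetical stand-in for a trained fraud model, and the `"data"` key in the payload is an assumed convention rather than a requirement.

```python
import json

model = None  # set once by init()


class MeanThresholdModel:
    """Hypothetical stand-in for a trained fraud model."""

    def predict(self, rows):
        # Label a row 1 (fraud) when the mean of its features exceeds 0.5.
        return [1 if sum(row) / len(row) > 0.5 else 0 for row in rows]


def init():
    """Called once when the deployment starts: load the model into memory."""
    global model
    # A real script would deserialize the registered model file here, typically
    # from the folder named by the AZUREML_MODEL_DIR environment variable.
    model = MeanThresholdModel()


def run(raw_data):
    """Called per request: parse the JSON payload and return predicted labels."""
    rows = json.loads(raw_data)["data"]
    return model.predict(rows)
```

Registration, environment definition, and compute sizing (the other options) are handled outside this script, by the deployment configuration.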

Question 19

You are preparing to train a regression model by using automated machine learning. The available data has features with missing values, as well as categorical features with few distinct values. You want automated machine learning to be configured as follows: missing values must be automatically imputed, and categorical features must be encoded as part of the training task. Which of the following actions should you take?

A. You should make use of the featurization parameter with the 'auto' value pair
B. You should make use of the featurization parameter with the 'off' value pair
C. You should make use of the featurization parameter with the 'on' value pair
D. You should make use of the featurization parameter with the 'FeaturizationConfig' value pair

Correct Answer:

A. You should make use of the featurization parameter with the 'auto' value pair

Question 20

You are constructing a deep convolutional neural network (CNN) for image classification. You notice that the CNN shows signs of overfitting. You want to minimize overfitting and have the model converge to an optimal fit. Which of the following is TRUE with regard to achieving your goal?

A. You have to add an additional dense layer with 512 input units, and reduce the amount of training data
B. You have to add L1/L2 regularization, and reduce the amount of training data
C. You have to reduce the amount of training data and make use of training data augmentation
D. You have to add L1/L2 regularization, and make use of training data augmentation
E. You have to add an additional dense layer with 512 input units, and add L1/L2 regularization

Correct Answer:

D. You have to add L1/L2 regularization, and make use of training data augmentation
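
Both techniques in answer D can be sketched without a deep learning framework. L2 regularization adds a penalty proportional to the squared weights to the loss, and data augmentation enlarges the training set with transformed copies of the images. The function names, the `lam` coefficient, and the tiny nested-list "image" below are illustrative assumptions.

```python
def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam=0.01):
    """Training loss plus the L2 penalty, which discourages large weights."""
    return data_loss + l2_penalty(weights, lam)


def horizontal_flip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def augment(images):
    """Return the original images followed by their flipped copies."""
    return images + [horizontal_flip(img) for img in images]


image = [[1, 2], [3, 4]]
augmented = augment([image])
```

Reducing the amount of training data or adding a large dense layer, as in the other options, tends to make overfitting worse rather than better.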
