Free Certified Generative AI Engineer Associate Sample Questions
Free Certified Generative AI Engineer Associate sample questions for the Certified Generative AI Engineer Associate exam. No account required: study at your own pace.
Looking for more? Get the full PDF with 80+ practice questions for $10, for offline study and deeper preparation.
Question 1
A Generative AI Engineer received the following business requirements for an internal chatbot. The chatbot needs to identify what type of question a user asks and route it to the appropriate model to answer it. For example, one user might ask about the historical failure rates of a specific electrical part, while another might ask how to troubleshoot a piece of electrical equipment. Available data sources include a library of electrical equipment PDF manuals and a table recording when electrical parts have failed. Which workflow supports such a chatbot?
A. Parse the electrical equipment PDF manuals into a table of question and response pairs. That way, the same chatbot can query tables easily to answer questions about both historical failure rates and equipment troubleshooting
B. The chatbot should be implemented as a multi-step LLM workflow. First, identify the type of question asked, then route the question to the appropriate model. If it’s a historical failure rate question, send the query to a text-to-SQL model. If it’s a troubleshooting question, then send the query to another model that summarizes the equipment-specific document and generates the response
C. There should be two different chatbots handling different types of user queries
D. The table with electrical part failures should be converted into a text document first. That way, the same chatbot can use the same document retrieval process to generate answers regardless of question types
Correct Answer:
B. The chatbot should be implemented as a multi-step LLM workflow. First, identify the type of question asked, then route the question to the appropriate model. If it’s a historical failure rate question, send the query to a text-to-SQL model. If it’s a troubleshooting question, then send the query to another model that summarizes the equipment-specific document and generates the response
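The routing in option B can be made concrete with a small sketch. Here a keyword check stands in for the LLM intent classifier, and the two handler functions stand in for the text-to-SQL model and the document-summarization model (all names are illustrative, not a specific library's API):

```python
def classify_question(question: str) -> str:
    """Stand-in for an LLM intent classifier: pick a route for the query."""
    failure_terms = ("failure rate", "how often", "historical")
    if any(term in question.lower() for term in failure_terms):
        return "failure_rate"
    return "troubleshooting"

def text_to_sql(question: str) -> str:
    """Stand-in for a text-to-SQL model querying the part-failure table."""
    return f"SQL route: {question}"

def summarize_docs(question: str) -> str:
    """Stand-in for a summarization model over the equipment manuals."""
    return f"Docs route: {question}"

def route(question: str) -> str:
    """Multi-step workflow: classify first, then dispatch to the right model."""
    handlers = {"failure_rate": text_to_sql, "troubleshooting": summarize_docs}
    return handlers[classify_question(question)](question)
```

In a production system the classifier would itself be an LLM call, but the two-step structure (classify, then dispatch) is the same.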
Question 2
A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style. Which approach will NOT improve the LLM’s response to achieve the desired response?
A. Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style
B. Use a neutralizer to normalize the tone and style of the underlying documents
C. Include few-shot examples in the prompt to the LLM
D. Fine-tune the LLM on a dataset of desired tone and style
Correct Answer:
B. Use a neutralizer to normalize the tone and style of the underlying documents
Question 3
A Generative AI Engineer is using an LLM to classify species of edible mushrooms based on text descriptions of certain features. The model is returning accurate responses in testing, and the Generative AI Engineer is confident they have the correct list of possible labels, but the output frequently includes additional reasoning when the Generative AI Engineer only wants the label itself, with no additional text. Which action should they take to elicit the desired behavior from this LLM?
A. Use few-shot prompting to instruct the model on expected output format
B. Use zero-shot prompting to instruct the model on expected output format
C. Use zero-shot chain-of-thought prompting to prevent a verbose output format
D. Use a system prompt to instruct the model to be succinct in its answer
Correct Answer:
A. Use few-shot prompting to instruct the model on expected output format
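Few-shot prompting works here because the examples demonstrate the exact output format, label only, more reliably than an instruction alone. A minimal prompt-builder sketch (the species and descriptions are made up for illustration):

```python
# Hypothetical example pairs: (description, label). In practice these would
# come from the engineer's verified label list.
FEW_SHOT_EXAMPLES = [
    ("Cap convex, pores instead of gills, thick brown stem.", "Boletus edulis"),
    ("White cap and stem, free gills, grows in grassy rings.", "Agaricus campestris"),
]

def build_prompt(description: str) -> str:
    """Show input -> label pairs so the model learns to emit only the label."""
    lines = ["Classify the mushroom species. Respond with the label only."]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Description: {text}\nLabel: {label}")
    # End with the new description and a dangling "Label:" for the model to fill.
    lines.append(f"Description: {description}\nLabel:")
    return "\n\n".join(lines)
```

The prompt ends at "Label:", so the model's most natural continuation is the bare label.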
Question 4
A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation. Which set of high-level tasks should the Generative AI Engineer's system perform?
A. Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee
B. Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user
C. Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved
D. Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved
Correct Answer:
D. Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved
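Option D is the standard RAG pattern: chunk, embed, retrieve, generate. The sketch below makes the retrieval step concrete using a toy bag-of-words "embedding" and cosine similarity in place of a real embedding model and vector store:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 200) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system uses an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

The retrieved chunks would then be inserted into the LLM prompt alongside the employee's question to generate the final answer.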
Question 5
A Generative AI Engineer is helping a cinema extend its website's chatbot to respond to questions about specific showtimes for movies currently playing at the local theater. The agent already receives the user's location from location services, and a Delta table is continually updated with the latest showtime information by location. They want to implement this new capability in their RAG application. Which option will do this with the least effort and in the most performant way?
A. Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation
B. Query the Delta table directly via a SQL query constructed from the user’s input using a text-to-SQL LLM in the agent logic / tool implementation
C. Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation
D. Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation
Correct Answer:
A. Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation
Question 6
A Generative AI Engineer is tasked with deploying an application that takes advantage of a custom MLflow Pyfunc model to return some interim results. How should they configure the endpoint to pass the secrets and credentials?
A. Use spark.conf.set()
B. Pass variables using the Databricks Feature Store API
C. Add credentials using environment variables
D. Pass the secrets in plain text
Correct Answer:
C. Add credentials using environment variables
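Databricks model serving endpoints can set environment variables on the endpoint configuration (which can reference secret scopes), and the model code reads them at runtime. A minimal sketch of the read side; the variable name EXTERNAL_API_TOKEN and the class are hypothetical, mimicking the MLflow Pyfunc load_context/predict shape:

```python
import os

class InterimResultsModel:
    """Sketch of a custom Pyfunc-style model that reads a credential
    from the environment instead of hard-coding it."""

    def load_context(self, context=None):
        # EXTERNAL_API_TOKEN is a hypothetical variable; on Databricks it
        # would be set on the serving endpoint, e.g. from a secret scope.
        self.token = os.environ["EXTERNAL_API_TOKEN"]

    def predict(self, context=None, model_input=None):
        # A real model would call the external service with self.token here.
        return f"authorized: {bool(self.token)}"
```

This keeps secrets out of source code and logged model artifacts.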
Question 7
A Generative AI Engineer who was prototyping an LLM system accidentally ran thousands of inference queries against a Foundation Model endpoint over the weekend. They want to take action to prevent this from unintentionally happening again in the future. What action should they take?
A. Use prompt engineering to instruct the LLM endpoints to refuse too many subsequent queries
B. Require that all development code which interfaces with a Foundation Model endpoint must be reviewed by a Staff level engineer before execution
C. Build a pyfunc model which proxies to the Foundation Model endpoint and add throttling within the pyfunc model
D. Configure rate limiting on the Foundation Model endpoints
Correct Answer:
D. Configure rate limiting on the Foundation Model endpoints
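Databricks supports rate limits on serving endpoints (e.g., queries per minute per user or per endpoint) through the endpoint/AI Gateway configuration. For intuition, the throttling behavior can be sketched as a simple token bucket:

```python
import time

class TokenBucket:
    """Allow at most `rate` requests per `per` seconds; excess is rejected."""

    def __init__(self, rate: int, per: float):
        self.rate = rate
        self.per = per
        self.tokens = float(rate)      # start with a full bucket
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the burst size.
        self.tokens = min(self.rate,
                          self.tokens + (now - self.last) * self.rate / self.per)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With a limit of, say, 2 queries per minute, a runaway loop is cut off after two calls instead of running thousands of queries unattended.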
Question 8
A Generative AI Engineer wants the fine-tuned LLMs in their prod Databricks workspace to also be available for testing in their dev workspace. All of their workspaces are Unity Catalog enabled and they are currently logging their models into the Model Registry in MLflow. What is the most cost-effective and secure option for the Generative AI Engineer to accomplish their goal?
A. Use an external model registry which can be accessed from all workspaces
B. Use MLflow to log the model directly into Unity Catalog, and enable READ access in the dev workspace to the model
C. Set up a duplicate training pipeline in dev, so that an identical model is available in dev
D. Set up a script to export the model from prod and import it into dev
Correct Answer:
B. Use MLflow to log the model directly into Unity Catalog, and enable READ access in the dev workspace to the model
Question 9
A Generative AI Engineer is developing an agent system using a popular agent-authoring library. The agent comprises multiple parallel and sequential chains. The engineer encounters challenges as the agent fails at one of the steps, making it difficult to debug the root cause. They need to find an appropriate approach to research this issue and discover the cause of failure. Which approach do they choose?
A. Enable MLflow Tracing to gain visibility into the agent's behavior at each execution step
B. Run mlflow.evaluate() to determine the root cause of the failed step
C. Implement structured logging within the agent's code to capture detailed execution information
D. Deconstruct the agent into independent steps to simplify debugging
Correct Answer:
A. Enable MLflow Tracing to gain visibility into the agent's behavior at each execution step
Question 10
A Generative AI Engineer is creating an agent-based LLM system for their favorite monster truck team. The system can answer text-based questions about the monster truck team, look up event dates via an API call, or query tables on the team's latest standings. How could the Generative AI Engineer best design these capabilities into their system?
A. Ingest PDF documents about the monster truck team into a vector store and query it in a RAG architecture
B. Write a system prompt for the agent listing available tools and bundle it into an agent system that runs a number of calls to solve a query
C. Instruct the LLM to respond with “RAG”, “API”, or “TABLE” depending on the query, then use text parsing and conditional statements to resolve the query
D. Build a system prompt with all possible event dates and table information in the system prompt. Use a RAG architecture to lookup generic text questions and otherwise leverage the information in the system prompt
Correct Answer:
B. Write a system prompt for the agent listing available tools and bundle it into an agent system that runs a number of calls to solve a query
Question 11
A Generative AI Engineer has written scalable PySpark code to ingest unstructured PDF documents and chunk them in preparation for storing in a Databricks Vector Search index. Currently, the two columns of their dataframe include the original filename as a string and an array of text chunks from that document. What set of steps should the Generative AI Engineer perform to store the chunks in a ready-to-ingest manner for Databricks Vector Search?
A. Use PySpark’s autoloader to apply a UDF across all chunks, formatting them in a JSON structure for Vector Search ingestion
B. Flatten the dataframe to one chunk per row, create a unique identifier for each row, and enable change feed on the output Delta table
C. Utilize the original filename as the unique identifier and save the dataframe as is
D. Create a unique identifier for each document, flatten the dataframe to one chunk per row and save to an output Delta table
Correct Answer:
B. Flatten the dataframe to one chunk per row, create a unique identifier for each row, and enable change feed on the output Delta table
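In PySpark this typically means exploding the chunk array to one row per chunk, adding an id column, and writing a Delta table with delta.enableChangeDataFeed = true, which Databricks Vector Search requires for a synced index along with a primary key. The flattening logic itself, sketched without Spark:

```python
import uuid

def flatten(rows: list[tuple[str, list[str]]]) -> list[dict]:
    """Turn (filename, [chunk, ...]) rows into one row per chunk,
    each with a unique primary key, as a Vector Search index requires."""
    out = []
    for filename, chunks in rows:
        for text in chunks:
            out.append({"id": str(uuid.uuid4()),   # unique id per chunk row
                        "filename": filename,
                        "text": text})
    return out
```

In Spark the equivalent would be F.explode on the chunks column plus a generated id; the sketch just shows the target shape of the data.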
Question 12
A Generative AI Engineer at an automotive company would like to build a question-answering chatbot to help customers answer specific questions about their vehicles. They have:
• A catalog with hundreds of thousands of cars manufactured since the 1960s
• Historical searches, with user queries and successful matches
• Descriptions of their own cars in multiple languages
They have already selected an open source LLM and created a test set of user queries. They need to discard techniques that will not help them build the chatbot. Which do they discard?
A. Setting chunk size to match the model's context window to maximize coverage
B. Implementing metadata filtering based on car models and years
C. Fine-tuning an embedding model on automotive terminology
D. Adding few-shot examples for response generation
Correct Answer:
A. Setting chunk size to match the model's context window to maximize coverage
Question 13
A Generative AI Engineer has been asked to build an LLM-based question-answering application. The application should take into account new documents that are frequently published. The engineer wants to build this application with the least development effort and have it operate at the lowest cost possible. Which combination of chaining components and configuration meets these requirements?
A. For the application a prompt, a retriever, and an LLM are required. The retriever output is inserted into the prompt which is given to the LLM to generate answers
B. The LLM needs to be frequently retrained with the new documents in order to provide the most up-to-date answers
C. For the question-answering application, prompt engineering and an LLM are required to generate answers
D. For the application a prompt, an agent and a fine-tuned LLM are required. The agent is used by the LLM to retrieve relevant content that is inserted into the prompt which is given to the LLM to generate answers
Correct Answer:
A. For the application a prompt, a retriever, and an LLM are required. The retriever output is inserted into the prompt which is given to the LLM to generate answers
Question 14
A Generative AI Engineer needs to build an LLM application that can understand medical documents, including recently published ones. They want to select an open model available on Hugging Face's model hub. Which step is most appropriate for selecting an LLM?
A. Pick any model in the Mistral family, as Mistral models are good with all types of use cases
B. Select a model based on the highest number of downloads, as this indicates popularity, reliability, and general suitability
C. Select a model that is most recently uploaded, as this indicates the model is the newest and highly likely to be the most performant
D. Check for the model and training data description to identify if the model is trained on any medical data
Correct Answer:
D. Check for the model and training data description to identify if the model is trained on any medical data
Question 15
A Generative AI Engineer interfaces with an LLM with prompt/response behavior that has been trained on customer calls inquiring about product availability. The LLM is designed to output only the term “In Stock” if the product is available, or only the term “Out of Stock” if not. Which prompt will allow the engineer to obtain the correct call classification labels?
A. Respond with “In Stock” if the customer asks for a product
B. You will be given a customer call transcript where the customer asks about product availability. The outputs are either “In Stock” or “Out of Stock”. Format the output in JSON, for example: {“call_id”: “123”, “label”: “In Stock”}
C. Respond with “Out of Stock” if the customer asks for a product
D. You will be given a customer call transcript where the customer inquires about product availability. Respond with “In Stock” if the product is available or “Out of Stock” if not
Correct Answer:
D. You will be given a customer call transcript where the customer inquires about product availability. Respond with “In Stock” if the product is available or “Out of Stock” if not
Question 16
An AI developer team wants to fine-tune an open-weight model to have exceptional performance on a code generation use case. They are trying to choose the best model to start with. They want to minimize model hosting costs, and are using Hugging Face model cards and spaces to explore models. Which TWO model attributes and metrics should the team focus on to make their selection? (Choose two.)
A. Big Code Models Leaderboard
B. Number of model parameters
C. MTEB Leaderboard
D. Chatbot Arena Leaderboard
E. Number of model downloads last month
Correct Answer:
A. Big Code Models Leaderboard
B. Number of model parameters
Question 17
All of the following are Python APIs used to query Databricks foundation models. When running in an interactive notebook, which of these libraries does not automatically use the current session's credentials?
A. OpenAI client
B. REST API via requests library
C. MLflow Deployments SDK
D. Databricks Python SDK
Correct Answer:
B. REST API via requests library
Question 18
When developing an LLM application, it’s crucial to ensure that the data used for training the model complies with licensing requirements to avoid legal risks. Which action is most appropriate to avoid legal risks?
A. Only use data explicitly labeled with an open license and ensure the license terms are followed
B. Any LLM outputs are reasonable to use because they do not reveal the original sources of data directly
C. Reach out to the data curators directly to gain written consent for using their data
D. Use any publicly available data as public data does not have legal restrictions
Correct Answer:
A. Only use data explicitly labeled with an open license and ensure the license terms are followed
Question 19
A Generative AI Engineer is working with a retail company that wants to enhance its customer experience by automatically handling common customer inquiries. They are working on an LLM-powered AI solution that should improve response times while maintaining a personalized interaction. They want to define the appropriate input and LLM task to do this. Which input/output pair will do this?
A. Input: Customer service chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions, then respond
B. Input: Customer service chat logs; Output: Find the answers to similar questions and respond with a summary
C. Input: Customer reviews; Output: Classify review sentiment
D. Input: Customer reviews; Output: Group the reviews by users and aggregate per-user average rating, then respond
Correct Answer:
A. Input: Customer service chat logs; Output: Group the chat logs by users, followed by summarizing each user’s interactions, then respond
Question 20
A Generative AI Engineer needs to design an LLM pipeline to conduct multi-stage reasoning that leverages external tools. To be effective at this, the LLM will need to plan and adapt actions while performing complex reasoning tasks. Which approach will do this?
A. Train the LLM to generate a single, comprehensive response without interacting with any external tools, relying solely on its pre-trained knowledge
B. Use a Chain-of-Thought (CoT) prompting technique to guide the LLM through a series of reasoning steps, then manually input the results from external tools for the final answer
C. Implement a framework like ReAct, which allows the LLM to generate reasoning traces and perform task-specific actions that leverage external tools if necessary
D. Encourage the LLM to make multiple API calls in sequence without planning or structuring the calls, allowing the LLM to decide when and how to use external tools spontaneously
Correct Answer:
C. Implement a framework like ReAct, which allows the LLM to generate reasoning traces and perform task-specific actions that leverage external tools if necessary
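The ReAct pattern interleaves reasoning traces with tool calls in a loop: the model emits a Thought/Action, the system executes the tool and feeds back an Observation, and this repeats until the model emits a final answer. A stub-driven sketch of that loop; fake_llm and the calculator tool are scripted stand-ins, not a real model or framework API:

```python
def react_loop(question: str, llm, tools: dict, max_steps: int = 5) -> str:
    """Run Action -> Observation cycles until the model gives a final answer."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(transcript)                    # model emits its next step
        transcript += "\n" + step
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        if step.startswith("Action:"):
            tool_name, arg = step.removeprefix("Action:").strip().split("|")
            observation = tools[tool_name](arg)   # execute the external tool
            transcript += f"\nObservation: {observation}"
    return "no answer"

def fake_llm(transcript: str) -> str:
    """Scripted stand-in for the LLM: call the calculator once, then answer."""
    if "Observation:" in transcript:
        last = transcript.rsplit("Observation: ", 1)[1]
        return f"Final Answer: {last}"
    return "Action: calculator|2+3"

# Toy tool: a safe adder standing in for any external tool (API, SQL, etc.).
tools = {"calculator": lambda expr: str(sum(int(x) for x in expr.split("+")))}
```

The key feature option C describes is visible in the loop: tool results flow back into the transcript, so the model can plan its next step based on what it observed.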
Aced these? Get the Full Exam
Download the complete Certified Generative AI Engineer Associate study bundle with 80+ questions in a single printable PDF.