MLA-C01 AWS Certified Machine Learning Engineer - Associate Questions and Answers

Questions 4

A company uses an Amazon EMR cluster to run a data ingestion process for an ML model. An ML engineer notices that the processing time is increasing.

Which solution will reduce the processing time MOST cost-effectively?

Options:

A.

Use Spot Instances to increase the number of primary nodes.

B.

Use Spot Instances to increase the number of core nodes.

C.

Use Spot Instances to increase the number of task nodes.

D.

Use On-Demand Instances to increase the number of core nodes.

Questions 5

An ML engineer receives datasets that contain missing values, duplicates, and extreme outliers. The ML engineer must consolidate these datasets into a single data frame and must prepare the data for ML.

Which solution will meet these requirements?

Options:

A.

Use Amazon SageMaker Data Wrangler to import the datasets and to consolidate them into a single data frame. Use the cleansing and enrichment functionalities to prepare the data.

B.

Use Amazon SageMaker Ground Truth to import the datasets and to consolidate them into a single data frame. Use the human-in-the-loop capability to prepare the data.

C.

Manually import and merge the datasets. Consolidate the datasets into a single data frame. Use Amazon Q Developer to generate code snippets that will prepare the data.

D.

Manually import and merge the datasets. Consolidate the datasets into a single data frame. Use Amazon SageMaker data labeling to prepare the data.

Questions 6

A company uses a hybrid cloud environment. A model that is deployed on premises uses data in Amazon S3 to provide customers with a live conversational engine.

The model is using sensitive data. An ML engineer needs to implement a solution to identify and remove the sensitive data.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Deploy the model on Amazon SageMaker. Create a set of AWS Lambda functions to identify and remove the sensitive data.

B.

Deploy the model on an Amazon Elastic Container Service (Amazon ECS) cluster that uses AWS Fargate. Create an AWS Batch job to identify and remove the sensitive data.

C.

Use Amazon Macie to identify the sensitive data. Create a set of AWS Lambda functions to remove the sensitive data.

D.

Use Amazon Comprehend to identify the sensitive data. Launch Amazon EC2 instances to remove the sensitive data.

Questions 7

A company is using Amazon SageMaker AI to build an ML model to predict customer behavior. The company needs to explain the bias in the model to an auditor. The explanation must focus on demographic data of the customers.

Which solution will meet these requirements?

Options:

A.

Use SageMaker Clarify to generate a bias report. Send the report to the auditor.

B.

Use AWS Glue DataBrew to create a job to detect drift in the model's data quality. Send the job output to the auditor.

C.

Use Amazon QuickSight integration with SageMaker AI to generate a bias report. Send the report to the auditor.

D.

Use Amazon CloudWatch metrics from the SageMaker AI namespace to create a bias dashboard. Share the dashboard with the auditor.

Questions 8

A company has an ML model that is deployed to an Amazon SageMaker AI endpoint for real-time inference. The company needs to deploy a new model. The company must compare the new model’s performance to the currently deployed model's performance before shifting all traffic to the new model.

Which solution will meet these requirements with the LEAST operational effort?

Options:

A.

Deploy the new model to a separate endpoint. Manually split traffic between the two endpoints.

B.

Deploy the new model to a separate endpoint. Use Amazon CloudFront to distribute traffic between the two endpoints.

C.

Deploy the new model as a shadow variant on the same endpoint as the current model. Route a portion of live traffic to the shadow model for evaluation.

D.

Use AWS Lambda functions with custom logic to route traffic between the current model and the new model.

Questions 9

An ML engineer develops a neural network model to predict whether customers will continue to subscribe to a service. The model performs well on training data. However, the accuracy of the model decreases significantly on evaluation data.

The ML engineer must resolve the model performance issue.

Which solution will meet this requirement?

Options:

A.

Penalize large weights by using L1 or L2 regularization.

B.

Remove dropout layers from the neural network.

C.

Train the model for longer by increasing the number of epochs.

D.

Capture complex patterns by increasing the number of layers.
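
For reference, the weight penalty named in option A can be sketched as an extra term added to the training loss. This is a minimal illustration under stated assumptions: the function name `penalized_loss` and the strength parameter `lam` are hypothetical and do not come from any specific framework.

```python
def penalized_loss(data_loss, weights, lam=0.01, kind="l2"):
    """Add an L1 or L2 weight penalty to a base loss value.

    data_loss: loss computed on the training batch (a float).
    weights:   flat iterable of model weights.
    lam:       penalty strength (hypothetical default).
    """
    if kind == "l1":
        # L1 penalizes the sum of absolute weight values.
        penalty = sum(abs(w) for w in weights)
    elif kind == "l2":
        # L2 penalizes the sum of squared weight values.
        penalty = sum(w * w for w in weights)
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return data_loss + lam * penalty
```

Larger weights raise the total loss, so the optimizer is pushed toward smaller weights.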

Questions 10

A company needs to ingest data from data sources into Amazon SageMaker Data Wrangler. The data sources are Amazon S3, Amazon Redshift, and Snowflake. The ingested data must always be up to date with the latest changes in the source systems.

Which solution will meet these requirements?

Options:

A.

Use direct connections to import data from the data sources into Data Wrangler.

B.

Use cataloged connections to import data from the data sources into Data Wrangler.

C.

Use AWS Glue to extract data from the data sources. Use AWS Glue also to import the data directly into Data Wrangler.

D.

Use AWS Lambda to extract data from the data sources. Use Lambda also to import the data directly into Data Wrangler.

Questions 11

A company has a team of data scientists who use Amazon SageMaker notebook instances to test ML models. When the data scientists need new permissions, the company attaches the permissions to each individual role that was created during the creation of the SageMaker notebook instance.

The company needs to centralize management of the team's permissions.

Which solution will meet this requirement?

Options:

A.

Create a single IAM role that has the necessary permissions. Attach the role to each notebook instance that the team uses.

B.

Create a single IAM group. Add the data scientists to the group. Associate the group with each notebook instance that the team uses.

C.

Create a single IAM user. Attach the AdministratorAccess AWS managed IAM policy to the user. Configure each notebook instance to use the IAM user.

D.

Create a single IAM group. Add the data scientists to the group. Create an IAM role. Attach the AdministratorAccess AWS managed IAM policy to the role. Associate the role with the group. Associate the group with each notebook instance that the team uses.

Questions 12

A government agency is conducting a national census to assess program needs by area and city. The census form collects approximately 500 responses from each citizen. The agency needs to analyze the data to extract meaningful insights. The agency wants to reduce the dimensions of the high-dimensional data to uncover hidden patterns.

Which solution will meet these requirements?

Options:

A.

Use the principal component analysis (PCA) algorithm in Amazon SageMaker AI.

B.

Use the t-Distributed Stochastic Neighbor Embedding (t-SNE) algorithm in Amazon SageMaker AI.

C.

Use the k-means algorithm in Amazon SageMaker AI.

D.

Use the Random Cut Forest (RCF) algorithm in Amazon SageMaker AI.

Questions 13

A credit card company has a fraud detection model in production on an Amazon SageMaker endpoint. The company develops a new version of the model. The company needs to assess the new model's performance by using live data and without affecting production end users.

Which solution will meet these requirements?

Options:

A.

Set up SageMaker Debugger and create a custom rule.

B.

Set up blue/green deployments with all-at-once traffic shifting.

C.

Set up blue/green deployments with canary traffic shifting.

D.

Set up shadow testing with a shadow variant of the new model.

Questions 14

A company is using an Amazon S3 bucket to collect data that will be used for ML workflows. The company needs to use AWS Glue DataBrew to clean and normalize the data.

Which solution will meet these requirements?

Options:

A.

Create a DataBrew dataset by using the S3 path. Clean and normalize the data by using a DataBrew profile job.

B.

Create a DataBrew dataset by using the S3 path. Clean and normalize the data by using a DataBrew recipe job.

C.

Create a DataBrew dataset by using a JDBC driver to connect to the S3 bucket. Use a profile job.

D.

Create a DataBrew dataset by using a JDBC driver to connect to the S3 bucket. Use a recipe job.

Questions 15

A company has used Amazon SageMaker to deploy a predictive ML model in production. The company is using SageMaker Model Monitor on the model. After a model update, an ML engineer notices data quality issues in the Model Monitor checks.

What should the ML engineer do to mitigate the data quality issues that Model Monitor has identified?

Options:

A.

Adjust the model's parameters and hyperparameters.

B.

Initiate a manual Model Monitor job that uses the most recent production data.

C.

Create a new baseline from the latest dataset. Update Model Monitor to use the new baseline for evaluations.

D.

Include additional data in the existing training set for the model. Retrain and redeploy the model.

Questions 16

An ML engineer is using an Amazon SageMaker AI shadow test to evaluate a new model that is hosted on a SageMaker AI endpoint. The shadow test requires significant GPU resources for high performance. The production variant currently runs on a less powerful instance type.

The ML engineer needs to configure the shadow test to use a higher performance instance type for a shadow variant. The solution must not affect the instance type of the production variant.

Which solution will meet these requirements?

Options:

A.

Modify the existing ProductionVariant configuration in the endpoint to include a ShadowProductionVariants list. Specify the larger instance type for the shadow variant.

B.

Create a new endpoint configuration with two ProductionVariant definitions. Configure one definition for the existing production variant and one definition for the shadow variant with the larger instance type. Use the UpdateEndpoint action to apply the new configuration.

C.

Create a separate SageMaker AI endpoint for the shadow variant that uses the larger instance type. Create an AWS Lambda function that routes a portion of the traffic to the shadow endpoint. Assign the Lambda function to the original endpoint.

D.

Use the CreateEndpointConfig action to define a new configuration. Specify the existing production variant in the configuration and add a separate ShadowProductionVariants list. Specify the larger instance type for the shadow variant. Use the CreateEndpoint action and pass the new configuration to the endpoint.
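
To make the field names in the options concrete, the sketch below builds a request body shaped for the CreateEndpointConfig action, with a `ProductionVariants` list and a separate `ShadowProductionVariants` list. The model names, variant names, instance types, counts, and weights are placeholders chosen for illustration. The helper `build_shadow_endpoint_config` is hypothetical, and nothing is sent to AWS.

```python
def build_shadow_endpoint_config(config_name, prod_model, shadow_model,
                                 prod_instance, shadow_instance):
    """Return keyword arguments shaped for SageMaker's CreateEndpointConfig."""
    return {
        "EndpointConfigName": config_name,
        # The production variant keeps its own instance type.
        "ProductionVariants": [{
            "VariantName": "production",
            "ModelName": prod_model,
            "InstanceType": prod_instance,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,  # placeholder weight
        }],
        # The shadow variant is declared separately and can use a different type.
        "ShadowProductionVariants": [{
            "VariantName": "shadow",
            "ModelName": shadow_model,
            "InstanceType": shadow_instance,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 1.0,  # placeholder weight
        }],
    }


# Placeholder names; substitute real model names and instance types.
request = build_shadow_endpoint_config(
    "demo-config", "current-model", "candidate-model",
    "ml.m5.large", "ml.g5.xlarge")
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `create_endpoint_config(**request)` on a SageMaker client. That call is left out here so the sketch runs offline.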

Questions 17

An ML engineer wants to run a training job on Amazon SageMaker AI by using multiple GPUs. The training dataset is stored in Apache Parquet format.

The Parquet files are too large to fit into the memory of the SageMaker AI training instances.

Which solution will fix the memory problem?

Options:

A.

Attach an Amazon EBS Provisioned IOPS SSD volume and store the files on the EBS volume.

B.

Repartition the Parquet files by using Apache Spark on Amazon EMR and use the repartitioned files for training.

C.

Change to memory-optimized instance types with sufficient memory.

D.

Use SageMaker distributed data parallelism (SMDDP) to split memory usage.

Questions 18

A company runs its ML workflows on an on-premises Kubernetes cluster. The ML workflows include ML services that perform training and inferences for ML models. Each ML service runs from its own standalone Docker image.

The company needs to perform a lift and shift from the on-premises Kubernetes cluster to an Amazon Elastic Kubernetes Service (Amazon EKS) cluster.

Which solution will meet this requirement with the LEAST operational overhead?

Options:

A.

Redesign the ML services to be configured in Kubeflow. Deploy the new Kubeflow managed ML services to the EKS cluster.

B.

Upload the Docker images to an Amazon Elastic Container Registry (Amazon ECR) repository. Configure a deployment pipeline to deploy the images to the EKS cluster.

C.

Migrate the training data to an Amazon Redshift cluster. Retrain the models from the migrated training data by using Amazon Redshift ML. Deploy the retrained models to the EKS cluster.

D.

Configure an Amazon SageMaker AI notebook. Retrain the models with the same code. Deploy the retrained models to the EKS cluster.

Questions 19

A company has significantly increased the amount of data stored as .csv files in an Amazon S3 bucket. Data transformation scripts and queries are now taking much longer than before.

An ML engineer must implement a solution to optimize the data for query performance with the LEAST operational overhead.

Which solution will meet this requirement?

Options:

A.

Configure an AWS Lambda function to split the .csv files into smaller objects.

B.

Configure an AWS Glue job to drop string-type columns and save the results to S3.

C.

Configure an AWS Glue ETL job to convert the .csv files to Apache Parquet format.

D.

Configure an Amazon EMR cluster to process the data in S3.

Questions 20

A travel company has trained hundreds of geographic data models to answer customer questions by using Amazon SageMaker AI. Each model uses its own inferencing endpoint, which has become an operational challenge for the company.

The company wants to consolidate the models' inferencing endpoints to reduce operational overhead.

Which solution will meet these requirements?

Options:

A.

Use SageMaker AI multi-model endpoints. Deploy a single endpoint.

B.

Use SageMaker AI multi-container endpoints. Deploy a single endpoint.

C.

Use Amazon SageMaker Studio. Deploy a single-model endpoint.

D.

Use inference pipelines in SageMaker AI to combine tasks from hundreds of models to 15 models.

Questions 21

An ML engineer needs to use an ML model to predict the price of apartments in a specific location.

Which metric should the ML engineer use to evaluate the model's performance?

Options:

A.

Accuracy

B.

Area Under the ROC Curve (AUC)

C.

F1 score

D.

Mean absolute error (MAE)
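
As a reference for the regression metric among the options, mean absolute error is the average of the absolute differences between actual and predicted values. The sketch below uses made-up apartment prices; the function name and sample numbers are illustrative assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical apartment prices and model predictions.
actual = [200_000, 350_000, 500_000]
predicted = [210_000, 340_000, 530_000]
mae = mean_absolute_error(actual, predicted)
```

The result is expressed in the same unit as the target (here, currency), which is what makes it readable for price predictions.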

Questions 22

An ML engineer has an Amazon Comprehend custom model in Account A in the us-east-1 Region. The ML engineer needs to copy the model to Account B in the same Region.

Which solution will meet this requirement with the LEAST development effort?

Options:

A.

Use Amazon S3 to make a copy of the model. Transfer the copy to Account B.

B.

Create a resource-based IAM policy. Use the Amazon Comprehend ImportModel API operation to copy the model to Account B.

C.

Use AWS DataSync to replicate the model from Account A to Account B.

D.

Create an AWS Site-to-Site VPN connection between Account A and Account B to transfer the model.

Questions 23

An ML engineer wants to run a training job on Amazon SageMaker AI. The training job will train a neural network by using multiple GPUs. The training dataset is stored in Parquet format.

The ML engineer discovered that the Parquet dataset contains files too large to fit into the memory of the SageMaker AI training instances.

Which solution will fix the memory problem?

Options:

A.

Attach an Amazon Elastic Block Store (Amazon EBS) Provisioned IOPS SSD volume to the instance. Store the files in the EBS volume.

B.

Repartition the Parquet files by using Apache Spark on Amazon EMR. Use the repartitioned files for the training job.

C.

Change the instance type to Memory Optimized instances with sufficient memory for the training job.

D.

Use the SageMaker AI distributed data parallelism (SMDDP) library with multiple instances to split the memory usage.

Questions 24

An ML engineer is training a simple neural network model. The model’s performance improves initially and then degrades after a certain number of epochs.

Which solutions will mitigate this problem? (Select TWO.)

Options:

A.

Enable early stopping on the model.

B.

Increase dropout in the layers.

C.

Increase the number of layers.

D.

Increase the number of neurons.

E.

Investigate and reduce the sources of model bias.
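
The early stopping mentioned in option A can be sketched as a loop that tracks the best validation loss and halts once it has failed to improve for a set number of epochs. This is a simplified illustration: it takes a precomputed list of validation losses instead of training a model, and the `patience` parameter name is an assumption borrowed from common framework conventions.

```python
def train_with_early_stopping(val_losses, patience=2):
    """Return (best_epoch, best_loss), stopping after `patience`
    consecutive epochs without improvement in validation loss."""
    best_loss = float("inf")
    best_epoch = -1
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break  # stop before later epochs are evaluated
    return best_epoch, best_loss


# Validation loss improves, then degrades, as in the scenario above.
best_epoch, best_loss = train_with_early_stopping([0.9, 0.7, 0.6, 0.65, 0.7, 0.5])
```

In this run the loop stops after the second non-improving epoch, so the final value in the list is never evaluated.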

Questions 25

An ML engineer is tuning an image classification model that performs poorly on one of two classes. The poorly performing class represents an extremely small fraction of the training dataset.

Which solution will improve the model’s performance?

Options:

A.

Optimize for accuracy. Use image augmentation on the less common images.

B.

Optimize for F1 score. Use image augmentation on the less common images.

C.

Optimize for accuracy. Use SMOTE to generate synthetic images.

D.

Optimize for F1 score. Use SMOTE to generate synthetic images.
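
The two optimization targets in the options behave very differently on imbalanced data. The sketch below, using a made-up dataset with 2 positives out of 100 samples, scores a classifier that always predicts the majority class. The function name and the data are illustrative assumptions.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # define F1 as 0 when undefined
    return accuracy, f1


# Hypothetical data: 2 rare-class samples, 98 common-class samples.
y_true = [1] * 2 + [0] * 98
y_pred = [0] * 100  # a model that never predicts the rare class
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
```

Accuracy stays high even though the rare class is never detected, while F1 for that class drops to zero.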

Questions 26

An ML engineer is using Amazon SageMaker AI to train an ML model. The ML engineer needs to use SageMaker AI automatic model tuning (AMT) features to tune the model hyperparameters over a large parameter space.

The model has 20 categorical hyperparameters and 7 continuous hyperparameters that can be tuned. The ML engineer needs to run the tuning job a maximum of 1,000 times. The ML engineer must ensure that each parameter trial is built based on the performance of the previous trial.

Which solution will meet these requirements?

Options:

A.

Define the search space as categorical parameters of 1,000 possible combinations. Use grid search.

B.

Define the search space as continuous parameters. Use random search. Set the maximum number of tuning jobs to 1,000.

C.

Define the search space as categorical parameters and continuous parameters. Use Bayesian optimization. Set the maximum number of training jobs to 1,000.

D.

Define the search space as categorical parameters and continuous parameters. Use grid search. Set the maximum number of tuning jobs to 1,000.

Questions 27

A company is using an AWS Lambda function to monitor the metrics from an ML model. An ML engineer needs to implement a solution to send an email message when the metrics breach a threshold.

Which solution will meet this requirement?

Options:

A.

Log the metrics from the Lambda function to AWS CloudTrail. Configure a CloudTrail trail to send the email message.

B.

Log the metrics from the Lambda function to Amazon CloudFront. Configure an Amazon CloudWatch alarm to send the email message.

C.

Log the metrics from the Lambda function to Amazon CloudWatch. Configure a CloudWatch alarm to send the email message.

D.

Log the metrics from the Lambda function to Amazon CloudWatch. Configure an Amazon CloudFront rule to send the email message.

Questions 28

A company has a binary classification model in production. An ML engineer needs to develop a new version of the model.

The new model version must maximize correct predictions of positive labels and negative labels. The ML engineer must use a metric to recalibrate the model to meet these requirements.

Which metric should the ML engineer use for the model recalibration?

Options:

A.

Accuracy

B.

Precision

C.

Recall

D.

Specificity
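
For reference, all four metrics in the options can be derived from the counts in a binary confusion matrix. The sketch below uses made-up counts; the function name is an illustrative assumption.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common binary classification metrics from confusion counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return {
        # Share of all predictions (positive and negative) that are correct.
        "accuracy": (tp + tn) / total,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of actual positives that are found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        # Share of actual negatives that are correctly rejected.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical counts for illustration.
metrics = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
```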

Questions 29

Case Study

A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.

The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.

The company needs to run an on-demand workflow to monitor bias drift for models that are deployed to real-time endpoints from the application.

Which action will meet this requirement?

Options:

A.

Configure the application to invoke an AWS Lambda function that runs a SageMaker Clarify job.

B.

Invoke an AWS Lambda function to pull the sagemaker-model-monitor-analyzer built-in SageMaker image.

C.

Use AWS Glue Data Quality to monitor bias.

D.

Use SageMaker notebooks to compare the bias.

Questions 30

An ML engineer is setting up an Amazon SageMaker AI pipeline for an ML model. The pipeline must automatically initiate a retraining job if any data drift is detected.

How should the ML engineer set up the pipeline to meet this requirement?

Options:

A.

Use an AWS Glue crawler and an AWS Glue ETL job to detect data drift. Use AWS Glue triggers to automate the retraining job.

B.

Use Amazon Managed Service for Apache Flink to detect data drift. Use an AWS Lambda function to automate the retraining job.

C.

Use SageMaker Model Monitor to detect data drift. Use an AWS Lambda function to automate the retraining job.

D.

Use Amazon QuickSight anomaly detection to detect data drift. Use an AWS Step Functions workflow to automate the retraining job.

Questions 31

A company uses a batching solution to process daily analytics. The company wants to provide near real-time updates, use open-source technology, and avoid managing or scaling infrastructure.

Which solution will meet these requirements?

Options:

A.

Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters.

B.

Create Amazon MSK Provisioned clusters.

C.

Create Amazon Kinesis Data Streams with Application Auto Scaling.

D.

Create self-hosted Apache Flink applications on Amazon EC2.

Questions 32

An ML engineer is using Amazon SageMaker to train a deep learning model that requires distributed training. After some training attempts, the ML engineer observes that the instances are not performing as expected. The ML engineer identifies communication overhead between the training instances.

What should the ML engineer do to MINIMIZE the communication overhead between the instances?

Options:

A.

Place the instances in the same VPC subnet. Store the data in a different AWS Region from where the instances are deployed.

B.

Place the instances in the same VPC subnet but in different Availability Zones. Store the data in a different AWS Region from where the instances are deployed.

C.

Place the instances in the same VPC subnet. Store the data in the same AWS Region and Availability Zone where the instances are deployed.

D.

Place the instances in the same VPC subnet. Store the data in the same AWS Region but in a different Availability Zone from where the instances are deployed.

Questions 33

A company has an ML model that generates text descriptions based on images that customers upload to the company's website. The images can be up to 50 MB in total size.

An ML engineer decides to store the images in an Amazon S3 bucket. The ML engineer must implement a processing solution that can scale to accommodate changes in demand.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Create an Amazon SageMaker batch transform job to process all the images in the S3 bucket.

B.

Create an Amazon SageMaker Asynchronous Inference endpoint and a scaling policy. Run a script to make an inference request for each image.

C.

Create an Amazon Elastic Kubernetes Service (Amazon EKS) cluster that uses Karpenter for auto scaling. Host the model on the EKS cluster. Run a script to make an inference request for each image.

D.

Create an AWS Batch job that uses an Amazon Elastic Container Service (Amazon ECS) cluster. Specify a list of images to process for each AWS Batch job.

Questions 34

A company is developing an ML model by using Amazon SageMaker AI. The company must monitor bias in the model and display the results on a dashboard. An ML engineer creates a bias monitoring job.

How should the ML engineer capture bias metrics to display on the dashboard?

Options:

A.

Capture AWS CloudTrail metrics from SageMaker Clarify.

B.

Capture Amazon CloudWatch metrics from SageMaker Clarify.

C.

Capture SageMaker Model Monitor metrics from Amazon EventBridge.

D.

Capture SageMaker Model Monitor metrics from Amazon SNS.

Questions 35

A company is using ML to predict the presence of a specific weed in a farmer's field. The company is using the Amazon SageMaker linear learner built-in algorithm with a value of multiclass_classifier for the predictor_type hyperparameter.

What should the company do to MINIMIZE false positives?

Options:

A.

Set the value of the weight decay hyperparameter to zero.

B.

Increase the number of training epochs.

C.

Increase the value of the target_precision hyperparameter.

D.

Change the value of the predictor_type hyperparameter to regressor.

Questions 36

A company wants to host an ML model on Amazon SageMaker. An ML engineer is configuring a continuous integration and continuous delivery (CI/CD) pipeline in AWS CodePipeline to deploy the model. The pipeline must run automatically when new training data for the model is uploaded to an Amazon S3 bucket.

Select and order the pipeline's correct steps from the following list. Each step should be selected one time or not at all. (Select and order three.)

• An S3 event notification invokes the pipeline when new data is uploaded.

• An S3 Lifecycle rule invokes the pipeline when new data is uploaded.

• SageMaker retrains the model by using the data in the S3 bucket.

• The pipeline deploys the model to a SageMaker endpoint.

• The pipeline deploys the model to SageMaker Model Registry.

Questions 37

An ML engineer is building a generative AI application on Amazon Bedrock by using large language models (LLMs).

Select the correct generative AI term from the following list for each description. Each term should be selected one time or not at all. (Select three.)

• Embedding

• Retrieval Augmented Generation (RAG)

• Temperature

• Token
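
Of the terms listed, temperature is the easiest to show numerically. A common formulation divides the model's raw scores (logits) by the temperature before applying softmax. The sketch below assumes that formulation; the function name and the sample logits are illustrative, and it does not call Amazon Bedrock.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after scaling by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
```

A lower temperature concentrates probability on the top-scoring token, and a higher temperature spreads it more evenly across candidates.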

Questions 38

Case study

An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.

The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.

Which AWS service or feature can aggregate the data from the various data sources?

Options:

A.

Amazon EMR Spark jobs

B.

Amazon Kinesis Data Streams

C.

Amazon DynamoDB

D.

AWS Lake Formation

Questions 39

A company is creating an ML model to identify defects in a product. The company has gathered a dataset and has stored the dataset in TIFF format in Amazon S3. The dataset contains 200 images in which the most common defects are visible. The dataset also contains 1,800 images in which there is no defect visible.

An ML engineer trains the model and notices poor performance in some classes. The ML engineer identifies a class imbalance problem in the dataset.

What should the ML engineer do to solve this problem?

Options:

A.

Use a few hundred images and Amazon Rekognition Custom Labels to train a new model.

B.

Undersample the 200 images in which the most common defects are visible.

C.

Oversample the 200 images in which the most common defects are visible.

D.

Use all 2,000 images and Amazon Rekognition Custom Labels to train a new model.
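
The resampling options can be illustrated with a minimal random oversampling sketch, which duplicates minority-class items until both classes are the same size. The file names are made up to match the 200 and 1,800 image counts in the scenario, and the function name and fixed seed are assumptions for reproducibility.

```python
import random


def oversample_minority(majority, minority, seed=0):
    """Randomly duplicate minority items until the class sizes match."""
    if not minority:
        raise ValueError("minority class must not be empty")
    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    extra = len(majority) - len(minority)
    if extra <= 0:
        return list(minority)
    # Sample with replacement from the minority class to fill the gap.
    return list(minority) + rng.choices(minority, k=extra)


# Hypothetical file names standing in for the images in Amazon S3.
no_defect = [f"ok_{i}.tiff" for i in range(1800)]
defect = [f"defect_{i}.tiff" for i in range(200)]
balanced_defect = oversample_minority(no_defect, defect)
```

Plain duplication adds no new information, so in practice it is often combined with augmentation of the duplicated images.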

Questions 40

An ML engineer is analyzing a classification dataset before training a model in Amazon SageMaker AI. The ML engineer suspects that the dataset has a significant imbalance between class labels that could lead to biased model predictions. To confirm class imbalance, the ML engineer needs to select an appropriate pre-training bias metric.

Which metric will meet this requirement?

Options:

A.

Mean squared error (MSE)

B.

Difference in proportions of labels (DPL)

C.

Silhouette score

D.

Structural similarity index measure (SSIM)
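
As a reference for option B, difference in proportions of labels (DPL) compares the share of positive labels between two groups (facets) in a dataset. The sketch below follows that definition with made-up labels and facet values; the function and parameter names are illustrative assumptions and are not SageMaker Clarify's API.

```python
def difference_in_proportions_of_labels(labels, facets, advantaged, positive=1):
    """Return q_a - q_d: the positive-label rate in the advantaged facet
    minus the positive-label rate in all other samples."""
    group_a = [l for l, f in zip(labels, facets) if f == advantaged]
    group_d = [l for l, f in zip(labels, facets) if f != advantaged]
    if not group_a or not group_d:
        raise ValueError("both facets need at least one sample")
    q_a = sum(1 for l in group_a if l == positive) / len(group_a)
    q_d = sum(1 for l in group_d if l == positive) / len(group_d)
    return q_a - q_d


# Hypothetical dataset: facet "a" has 3 of 4 positive labels, facet "d" has 1 of 4.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
facets = ["a", "a", "a", "a", "d", "d", "d", "d"]
dpl = difference_in_proportions_of_labels(labels, facets, advantaged="a")
```

A value near zero indicates similar positive-label rates in both groups, and values toward 1 or -1 indicate a larger gap.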

Questions 41

A company is building a conversational AI assistant on Amazon Bedrock. The company is using Retrieval Augmented Generation (RAG) to reference the company's internal knowledge base. The AI assistant uses the Anthropic Claude 4 foundation model (FM).

The company needs a solution that uses a vector embedding model, a vector store, and a vector search algorithm.

Which solution will develop the AI assistant with the LEAST development effort?

Options:

A.

Use Amazon Kendra Experience Builder.

B.

Use Amazon Aurora PostgreSQL with the pgvector extension.

C.

Use Amazon RDS for PostgreSQL with the pgvector extension.

D.

Use the AWS Glue Data Catalog metadata repository.

Questions 42

A company stores training data as a .csv file in an Amazon S3 bucket. The company must encrypt the data and must control which applications have access to the encryption key.

Which solution will meet these requirements?

Options:

A.

Create a new SSH access key and use the AWS Encryption CLI to encrypt the file.

B.

Create a new API key by using Amazon API Gateway and use it to encrypt the file.

C.

Create a new IAM role with permissions for kms:GenerateDataKey and use the role to encrypt the file.

D.

Create a new AWS Key Management Service (AWS KMS) key and use the AWS Encryption CLI with the KMS key to encrypt the file.

Questions 43

An ML engineer needs to use AWS CloudFormation to create an ML model that an Amazon SageMaker endpoint will host.

Which resource should the ML engineer declare in the CloudFormation template to meet this requirement?

Options:

A.

AWS::SageMaker::Model

B.

AWS::SageMaker::Endpoint

C.

AWS::SageMaker::NotebookInstance

D.

AWS::SageMaker::Pipeline
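
To show what declaring one of these resource types looks like, the sketch below generates a minimal CloudFormation template in JSON that contains a single `AWS::SageMaker::Model` resource. The logical ID `DemoModel`, the role ARN, the image URI, and the S3 path are placeholders, and the helper name is an assumption. A real deployment would typically add endpoint configuration and endpoint resources as well.

```python
import json


def model_template(role_arn, image_uri, model_data_url):
    """Build a minimal CloudFormation template with one SageMaker model."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "DemoModel": {  # hypothetical logical ID
                "Type": "AWS::SageMaker::Model",
                "Properties": {
                    "ExecutionRoleArn": role_arn,
                    "PrimaryContainer": {
                        "Image": image_uri,
                        "ModelDataUrl": model_data_url,
                    },
                },
            }
        },
    }


# Placeholder values; substitute a real role, container image, and artifact.
template_json = json.dumps(
    model_template(
        "arn:aws:iam::111122223333:role/DemoRole",
        "111122223333.dkr.ecr.us-east-1.amazonaws.com/demo:latest",
        "s3://demo-bucket/model.tar.gz",
    ),
    indent=2,
)
```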

Questions 44

A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations in the F1 score.

During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months of no change, the model's F1 score decreases significantly.

What could be the reason for the reduced F1 score?

Options:

A.

Concept drift occurred in the underlying customer data that was used for predictions.

B.

The model was not sufficiently complex to capture all the patterns in the original baseline data.

C.

The original baseline data had a data quality issue of missing values.

D.

Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.

Questions 45

A company has AWS Glue data processing jobs that are orchestrated by an AWS Glue workflow. The AWS Glue jobs can run on a schedule or can be launched manually.

The company is developing pipelines in Amazon SageMaker Pipelines for ML model development. The pipelines will use the output of the AWS Glue jobs during the data processing phase of model development. An ML engineer needs to implement a solution that integrates the AWS Glue jobs with the pipelines.

Which solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Use AWS Step Functions for orchestration of the pipelines and the AWS Glue jobs.

B.

Use processing steps in SageMaker Pipelines. Configure inputs that point to the Amazon Resource Names (ARNs) of the AWS Glue jobs.

C.

Use Callback steps in SageMaker Pipelines to start the AWS Glue workflow and to stop the pipelines until the AWS Glue jobs finish running.

D.

Use Amazon EventBridge to invoke the pipelines and the AWS Glue jobs in the desired order.

Questions 46

An ML engineer needs to deploy a trained model based on a genetic algorithm. Predictions can take several minutes, and requests can include up to 100 MB of data.

Which deployment solution will meet these requirements with the LEAST operational overhead?

Options:

A.

Deploy on EC2 Auto Scaling behind an ALB.

B.

Deploy to a SageMaker AI real-time endpoint.

C.

Deploy to a SageMaker AI Asynchronous Inference endpoint.

D.

Deploy to Amazon ECS on EC2.

Questions 47

An ML engineer wants to re-train an XGBoost model at the end of each month. A data team prepares the training data. The training dataset is a few hundred megabytes in size. When the data is ready, the data team stores the data as a new file in an Amazon S3 bucket.

The ML engineer needs a solution to automate this pipeline. The solution must register the new model version in Amazon SageMaker Model Registry within 24 hours.

Which solution will meet these requirements?

Options:

A.

Create an AWS Lambda function that runs one time each week to poll the S3 bucket for new files. Invoke the Lambda function asynchronously. Configure the Lambda function to start the pipeline if the function detects new data.

B.

Create an Amazon CloudWatch rule that runs on a schedule to start the pipeline every 30 days.

C.

Create an S3 Lifecycle rule to start the pipeline every time a new object is uploaded to the S3 bucket.

D.

Create an Amazon EventBridge rule to start an AWS Step Functions TrainingStep every time a new object is uploaded to the S3 bucket.
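
As background for the event-driven option, here is a sketch of the event pattern that an EventBridge rule could use to match new objects in a bucket. It assumes that the bucket has EventBridge notifications turned on. The bucket name and the helper function are hypothetical.

```python
import json


def build_s3_object_created_pattern(bucket_name: str) -> dict:
    """Build an EventBridge event pattern for S3 'Object Created' events."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


# Hypothetical bucket name, for illustration only.
pattern = build_s3_object_created_pattern("example-training-data-bucket")
pattern_json = json.dumps(pattern, indent=2)
print(pattern_json)
```

The JSON string is what would be supplied as the rule's event pattern. The rule's target, which starts the retraining workflow, is configured separately.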

Questions 48

A company is training a deep learning model to detect abnormalities in images. The company has limited GPU resources and a large hyperparameter space to explore. The company needs to test different configurations and avoid wasting computation time on poorly performing models that show weak validation accuracy in early epochs.

Which hyperparameter optimization strategy should the company use?

Options:

A.

Grid search across all possible combinations

B.

Bayesian optimization with early stopping

C.

Manual tuning of each parameter individually

D.

Exhaustive search without early stopping
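
To illustrate what early stopping means in this scenario, the sketch below applies a simple median stopping rule: a trial is stopped when its latest validation accuracy is worse than the median of the running averages of earlier trials at the same epoch. This is a generic illustration under assumed names and data, not the exact rule that any particular tuning service implements.

```python
from statistics import median


def should_stop_early(current_curve, completed_curves):
    """Median stopping rule for a metric where higher is better.

    current_curve: validation accuracy per epoch for the running trial.
    completed_curves: per-epoch accuracy lists from earlier trials.
    """
    epoch = len(current_curve)
    if epoch == 0:
        return False
    # Running average of each earlier trial up to the same epoch.
    running_averages = [
        sum(curve[:epoch]) / epoch
        for curve in completed_curves
        if len(curve) >= epoch
    ]
    if not running_averages:
        return False
    return current_curve[-1] < median(running_averages)


# Hypothetical accuracy curves, for illustration only.
earlier_trials = [
    [0.60, 0.70, 0.80],
    [0.55, 0.65, 0.75],
    [0.50, 0.60, 0.70],
]
weak_trial = [0.40, 0.45]
strong_trial = [0.62, 0.72]

print(should_stop_early(weak_trial, earlier_trials))
print(should_stop_early(strong_trial, earlier_trials))
```

Stopping the weak trial after two epochs frees the GPU for the next configuration instead of spending it on a run that is unlikely to catch up.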

Questions 49

A company has a Retrieval Augmented Generation (RAG) application that uses a vector database to store embeddings of documents. The company must migrate the application to AWS and must implement a solution that provides semantic search of text files. The company has already migrated the text repository to an Amazon S3 bucket.

Which solution will meet these requirements?

Options:

A.

Use an AWS Batch job to process the files and generate embeddings. Use AWS Glue to store the embeddings. Use SQL queries to perform the semantic searches.

B.

Use a custom Amazon SageMaker AI notebook to run a custom script to generate embeddings. Use SageMaker Feature Store to store the embeddings. Use SQL queries to perform the semantic searches.

C.

Use the Amazon Kendra S3 connector to ingest the documents from the S3 bucket into Amazon Kendra. Query Amazon Kendra to perform the semantic searches.

D.

Use an Amazon Textract asynchronous job to ingest the documents from the S3 bucket. Query Amazon Textract to perform the semantic searches.

Questions 50

A company uses Amazon SageMaker AI to create ML models. The data scientists need fine-grained control of ML workflows, DAG visualization, experiment history, and model governance for auditing and compliance.

Which solution will meet these requirements?

Options:

A.

Use AWS CodePipeline with SageMaker Studio and SageMaker ML Lineage Tracking.

B.

Use AWS CodePipeline with SageMaker Experiments.

C.

Use SageMaker Pipelines with SageMaker Studio and SageMaker ML Lineage Tracking.

D.

Use SageMaker Pipelines with SageMaker Experiments.

Questions 51

A company uses an ML model to recommend videos to users. The model is deployed on Amazon SageMaker AI. The model performed well initially after deployment, but the model's performance has degraded over time.

Which solution can the company use to identify model drift in the future?

Options:

A.

Create a monitoring job in SageMaker Model Monitor. Then create a baseline from the training dataset.

B.

Create a baseline from the training dataset. Then create a monitoring job in SageMaker Model Monitor.

C.

Create a baseline by using a built-in rule in SageMaker Clarify. Monitor the drift in Amazon CloudWatch.

D.

Retrain the model on new data. Compare the retrained model's performance to the original model's performance.

Questions 52

A company is using Amazon SageMaker to create ML models. The company's data scientists need fine-grained control of the ML workflows that they orchestrate. The data scientists also need the ability to visualize SageMaker jobs and workflows as a directed acyclic graph (DAG). The data scientists must keep a running history of model discovery experiments and must establish model governance for auditing and compliance verifications.

Which solution will meet these requirements?

Options:

A.

Use AWS CodePipeline and its integration with SageMaker Studio to manage the entire ML workflows. Use SageMaker ML Lineage Tracking for the running history of experiments and for auditing and compliance verifications.

B.

Use AWS CodePipeline and its integration with SageMaker Experiments to manage the entire ML workflows. Use SageMaker Experiments for the running history of experiments and for auditing and compliance verifications.

C.

Use SageMaker Pipelines and its integration with SageMaker Studio to manage the entire ML workflows. Use SageMaker ML Lineage Tracking for the running history of experiments and for auditing and compliance verifications.

D.

Use SageMaker Pipelines and its integration with SageMaker Experiments to manage the entire ML workflows. Use SageMaker Experiments for the running history of experiments and for auditing and compliance verifications.

Questions 53

A company has developed a new ML model. The company requires online model validation on 10% of the traffic before the company fully releases the model in production. The company uses an Amazon SageMaker endpoint behind an Application Load Balancer (ALB) to serve the model.

Which solution will set up the required online validation with the LEAST operational overhead?

Options:

A.

Use production variants to add the new model to the existing SageMaker endpoint. Set the variant weight to 0.1 for the new model. Monitor the number of invocations by using Amazon CloudWatch.

B.

Use production variants to add the new model to the existing SageMaker endpoint. Set the variant weight to 1 for the new model. Monitor the number of invocations by using Amazon CloudWatch.

C.

Create a new SageMaker endpoint. Use production variants to add the new model to the new endpoint. Monitor the number of invocations by using Amazon CloudWatch.

D.

Configure the ALB to route 10% of the traffic to the new model at the existing SageMaker endpoint. Monitor the number of invocations by using AWS CloudTrail.
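
When weighing the variant-weight options, it helps to recall that a SageMaker endpoint splits traffic among production variants in proportion to each variant's weight divided by the sum of all weights. The sketch below shows that arithmetic. The variant names and weights are hypothetical.

```python
def traffic_shares(variant_weights: dict) -> dict:
    """Return each variant's share of traffic: weight / sum of all weights."""
    total = sum(variant_weights.values())
    return {name: weight / total for name, weight in variant_weights.items()}


# Hypothetical variants: weights 0.9 and 0.1 give the new model 10%.
split_a = traffic_shares({"current-model": 0.9, "new-model": 0.1})

# If the existing variant keeps weight 1.0, a weight of 0.1 for the
# new variant yields 0.1 / 1.1, which is roughly 9.1% of the traffic.
split_b = traffic_shares({"current-model": 1.0, "new-model": 0.1})

print(split_a["new-model"], split_b["new-model"])
```

The second case shows why the weight of the existing variant matters when aiming for an exact percentage.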

Questions 54

An ML engineer is setting up an Amazon SageMaker AI pipeline for an ML model. The pipeline must automatically initiate a re-training job if any data drift is detected.

How should the ML engineer set up the pipeline to meet this requirement?

Options:

A.

Use an AWS Glue crawler and an AWS Glue extract, transform, and load (ETL) job to detect data drift. Use AWS Glue triggers to automate the retraining job.

B.

Use Amazon Managed Service for Apache Flink to detect data drift. Use an AWS Lambda function to automate the re-training job.

C.

Use SageMaker Model Monitor to detect data drift. Use an AWS Lambda function to automate the re-training job.

D.

Use Amazon Quick Suite (previously known as Amazon QuickSight) anomaly detection to detect data drift. Use an AWS Step Functions workflow to automate the re-training job.

Questions 55

An ML engineer needs to use data with Amazon SageMaker Canvas to train an ML model. The data is stored in Amazon S3 and is complex in structure. The ML engineer must use a file format that minimizes processing time for the data.

Which file format will meet these requirements?

Options:

A.

CSV files compressed with Snappy

B.

JSON objects in JSONL format

C.

JSON files compressed with gzip

D.

Apache Parquet files

Questions 56

A healthcare company wants to detect irregularities in patient vital signs that could indicate early signs of a medical condition. The company has an unlabeled dataset that includes patient health records, medication history, and lifestyle changes.

Which algorithm and hyperparameter should the company use to meet this requirement?

Options:

A.

Use the Amazon SageMaker AI XGBoost algorithm. Set max_depth to greater than 100 to regulate tree complexity.

B.

Use the Amazon SageMaker AI k-means clustering algorithm. Set k to determine the number of clusters.

C.

Use the Amazon SageMaker AI DeepAR algorithm. Set epochs to the number of training iterations.

D.

Use the Amazon SageMaker AI Random Cut Forest (RCF) algorithm. Set num_trees to greater than 100.

Questions 57

Case Study

An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.

The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies. The algorithm is not capturing all the desired underlying patterns in the data.

The training dataset includes categorical data and numerical data. The ML engineer must prepare the training dataset to maximize the accuracy of the model.

Which action will meet this requirement with the LEAST operational overhead?

Options:

A.

Use AWS Glue to transform the categorical data into numerical data.

B.

Use AWS Glue to transform the numerical data into categorical data.

C.

Use Amazon SageMaker Data Wrangler to transform the categorical data into numerical data.

D.

Use Amazon SageMaker Data Wrangler to transform the numerical data into categorical data.
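
As a concrete picture of what transforming categorical data into numerical data can mean, the sketch below one-hot encodes a column in plain Python. It is only an illustration of the technique with hypothetical values. It does not reproduce how AWS Glue or SageMaker Data Wrangler implement their transforms.

```python
def one_hot_encode(values):
    """Return (sorted categories, one-hot rows) for a list of labels."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column for this category
        encoded.append(row)
    return categories, encoded


# Hypothetical payment-method column, for illustration only.
payment_methods = ["card", "wire", "card", "cash"]
categories, encoded = one_hot_encode(payment_methods)
print(categories)
print(encoded)
```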

Questions 58

A company uses the Amazon SageMaker AI Object2Vec algorithm to train an ML model. The model performs well on training data but underperforms after deployment. The company wants to avoid overfitting the model and maintain the model's ability to generalize.

Which solution will meet these requirements?

Options:

A.

Decrease the early_stopping_patience hyperparameter.

B.

Increase the mini_batch_size hyperparameter.

C.

Decrease the dropout rate.

D.

Increase the number of epochs.

Questions 59

A company's ML engineer has deployed an ML model for sentiment analysis to an Amazon SageMaker endpoint. The ML engineer needs to explain to company stakeholders how the model makes predictions.

Which solution will provide an explanation for the model's predictions?

Options:

A.

Use SageMaker Model Monitor on the deployed model.

B.

Use SageMaker Clarify on the deployed model.

C.

Show the distribution of inferences from A/B testing in Amazon CloudWatch.

D.

Add a shadow endpoint. Analyze prediction differences on samples.

Questions 60

Case Study

A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.

The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.

The company is experimenting with consecutive training jobs.

How can the company MINIMIZE infrastructure startup times for these jobs?

Options:

A.

Use Managed Spot Training.

B.

Use SageMaker managed warm pools.

C.

Use SageMaker Training Compiler.

D.

Use the SageMaker distributed data parallelism (SMDDP) library.

Questions 61

Case Study

A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.

The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.

The company needs to use the central model registry to manage different versions of models in the application.

Which action will meet this requirement with the LEAST operational overhead?

Options:

A.

Create a separate Amazon Elastic Container Registry (Amazon ECR) repository for each model.

B.

Use Amazon Elastic Container Registry (Amazon ECR) and unique tags for each model version.

C.

Use the SageMaker Model Registry and model groups to catalog the models.

D.

Use the SageMaker Model Registry and unique tags for each model version.

Questions 62

A company stores historical data in .csv files in Amazon S3. Only some of the rows and columns in the .csv files are populated. The columns are not labeled. An ML engineer needs to prepare and store the data so that the company can use the data to train ML models.

Select and order the correct steps from the following list to perform this task. Each step should be selected one time or not at all. (Select and order three.)

• Create an Amazon SageMaker batch transform job for data cleaning and feature engineering.

• Store the resulting data back in Amazon S3.

• Use Amazon Athena to infer the schemas and available columns.

• Use AWS Glue crawlers to infer the schemas and available columns.

• Use AWS Glue DataBrew for data cleaning and feature engineering.

Exam Code: MLA-C01
Exam Name: AWS Certified Machine Learning Engineer - Associate
Last Update: Feb 19, 2026
Questions: 207
