DP-203 Data Engineering on Microsoft Azure Questions and Answers

Questions 4

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 4

Options:

Questions 5

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A.

time-based retention

B.

change feed

C.

soft delete

D.

lifecycle management

Questions 6

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

Options:

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Questions 7

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

DP-203 Question 7

Options:

Questions 8

You are designing an enterprise data warehouse in Azure Synapse Analytics that will store website traffic analytics in a star schema.

You plan to have a fact table for website visits. The table will be approximately 5 GB.

You need to recommend which distribution type and index type to use for the table. The solution must provide the fastest query performance.

What should you recommend? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 8

Options:

Questions 9

You are designing an Azure Data Lake Storage Gen2 container to store data for the human resources (HR) department and the operations department at your company. You have the following data access requirements:

• After initial processing, the HR department data will be retained for seven years.

• The operations department data will be accessed frequently for the first six months, and then accessed once per month.

You need to design a data retention solution to meet the access requirements. The solution must minimize storage costs.

Options:

Questions 10

You have an Azure subscription linked to an Azure Active Directory (Azure AD) tenant that contains a service principal named ServicePrincipal1. The subscription contains an Azure Data Lake Storage account named adls1. Adls1 contains a folder named Folder2 that has a URI of https://adls1.dfs.core.windows.net/container1/Folder1/Folder2/.

ServicePrincipal1 has the access control list (ACL) permissions shown in the following table.

DP-203 Question 10

You need to ensure that ServicePrincipal1 can perform the following actions:

  • Traverse child items that are created in Folder2.
  • Read files that are created in Folder2.

The solution must use the principle of least privilege.

Which two permissions should you grant to ServicePrincipal1 for Folder2? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Access - Read

B.

Access - Write

C.

Access - Execute

D.

Default - Read

E.

Default - Write

F.

Default - Execute

Questions 11

You are performing exploratory analysis of the bus fare data in an Azure Data Lake Storage Gen2 account by using an Azure Synapse Analytics serverless SQL pool.

You execute the Transact-SQL query shown in the following exhibit.

DP-203 Question 11

What do the query results include?

Options:

A.

Only CSV files in the tripdata_2020 subfolder.

B.

All files that have file names that begin with "tripdata_2020".

C.

All CSV files that have file names that contain "tripdata_2020".

D.

Only CSV files that have file names that begin with "tripdata_2020".

Questions 12

You have an Azure Stream Analytics job that reads data from an Azure event hub.

You need to evaluate whether the job processes data as quickly as the data arrives or cannot keep up.

Which metric should you review?

Options:

A.

InputEventLastPunctuationTime

B.

Input Sources Receive

C.

Late Input Events

D.

Backlogged Input Events

Questions 13

You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:

• Contain sales data for 20,000 products.

• Use hash distribution on a column named ProductID.

• Contain 2.4 billion records for the years 2019 and 2020.

Which number of partition ranges provides optimal compression and performance of the clustered columnstore index?

Options:

A.

40

B.

240

C.

400

D.

2,400
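
A back-of-the-envelope check can help with this kind of sizing question. The sketch below rests on two assumptions that the question does not state: that a dedicated SQL pool spreads every table across 60 distributions, and that a clustered columnstore index compresses best when each distribution holds roughly 1 million rows per partition. It also assumes that the hash on ProductID spreads rows evenly. The function and constant names are illustrative.

```python
# Assumptions (not stated in the question): 60 distributions per dedicated
# SQL pool, and a target of about 1 million rows per distribution per
# partition for good columnstore compression.
DISTRIBUTIONS = 60
TARGET_ROWS_PER_ROWGROUP = 1_000_000


def max_partitions(total_rows: int) -> int:
    """Largest partition count that still leaves about 1 million rows
    in every distribution of every partition (even spread assumed)."""
    return total_rows // (DISTRIBUTIONS * TARGET_ROWS_PER_ROWGROUP)


print(max_partitions(2_400_000_000))  # prints 40
```

Under these assumptions, any larger partition count would leave fewer than 1 million rows per distribution in each partition.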

Questions 14

You configure version control for an Azure Data Factory instance as shown in the following exhibit.

DP-203 Question 14

Use the drop-down menus to select the answer choice that completes each statement based on the information presented in the graphic.

NOTE: Each correct selection is worth one point.

DP-203 Question 14

Options:

Questions 15

You have a SQL pool in Azure Synapse.

You discover that some queries fail or take a long time to complete.

You need to monitor for transactions that have rolled back.

Which dynamic management view should you query?

Options:

A.

sys.dm_pdw_request_steps

B.

sys.dm_pdw_nodes_tran_database_transactions

C.

sys.dm_pdw_waits

D.

sys.dm_pdw_exec_sessions

Questions 16

A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.

The company must be able to monitor the devices in real-time.

You need to design the solution.

What should you recommend?

Options:

A.

Azure Stream Analytics cloud job using Azure PowerShell

B.

Azure Analysis Services using Azure Portal

C.

Azure Data Factory instance using Azure Portal

D.

Azure Analysis Services using Azure PowerShell

Questions 17

You have an Azure subscription that contains a Microsoft Purview account.

You need to search the Microsoft Purview Data Catalog to identify assets that have an assetType property of Table or View.

Which query should you run?

Options:

A.

assetType IN ('Table', 'View')

B.

assetType:Table OR assetType:View

C.

assetType - (Table or view)

D.

assetType:(Table OR View)

Questions 18

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.

You plan to copy the data from the storage account to an Azure SQL data warehouse.

You need to prepare the files to ensure that the data copies quickly.

Solution: You modify the files to ensure that each row is less than 1 MB.

Does this meet the goal?

Options:

A.

Yes

B.

No

Questions 19

You have two Azure Data Factory instances named ADFdev and ADFprod. ADFdev connects to an Azure DevOps Git repository.

You publish changes from the main branch of the Git repository to ADFdev.

You need to deploy the artifacts from ADFdev to ADFprod.

What should you do first?

Options:

A.

From ADFdev, modify the Git configuration.

B.

From ADFdev, create a linked service.

C.

From Azure DevOps, create a release pipeline.

D.

From Azure DevOps, update the main branch.

Questions 20

You have an Azure subscription that contains an Azure Synapse Analytics workspace and a user named User1.

You need to ensure that User1 can review the Azure Synapse Analytics database templates from the gallery. The solution must follow the principle of least privilege.

Which role should you assign to User1?

Options:

A.

Synapse User

B.

Synapse Contributor

C.

Storage Blob Data Contributor

D.

Synapse Administrator

Questions 21

You have an Azure Synapse Analytics dedicated SQL pool named Pool1 and an Azure Data Lake Storage Gen2 account named Account1.

You plan to access the files in Account1 by using an external table.

You need to create a data source in Pool1 that you can reference when you create the external table.

How should you complete the Transact-SQL statement? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 21

Options:

Questions 22

You have an Azure Stream Analytics job named Job1.

The metrics of Job1 from the last hour are shown in the following table.

DP-203 Question 22

The late arrival tolerance for Job1 is set to five seconds.

You need to optimize Job1.

Which two actions achieve the goal? Each correct answer presents a complete solution.

NOTE: Each correct answer is worth one point.

Options:

A.

Resolve errors in input processing.

B.

Parallelize the query

C.

Resolve errors in output processing

D.

Increase the number of SUs.

Questions 23

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1 and an Azure Data Lake Storage account named storage1. Storage1 requires secure transfers.

You need to create an external data source in Pool1 that will be used to read .orc files in storage1.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 23

Options:

Questions 24

You have an Azure Data Lake Storage Gen2 container.

Data is ingested into the container, and then transformed by a data integration application. The data is NOT modified after that. Users can read files in the container but cannot modify the files.

You need to design a data archiving solution that meets the following requirements:

  • New data is accessed frequently and must be available as quickly as possible.
  • Data that is older than five years is accessed infrequently but must be available within one second when requested.
  • Data that is older than seven years is NOT accessed. After seven years, the data must be persisted at the lowest cost possible.
  • Costs must be minimized while maintaining the required availability.
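
As a rough sketch of the age thresholds in the list above, the function below maps the age of a file to an access tier. It assumes the standard Hot, Cool, and Archive tiers of Azure Blob Storage, where Cool stays online (reads return in milliseconds) and Archive is offline until the blob is rehydrated. The function name and the use of years as the unit are illustrative only, and this is one reading of the requirements rather than a confirmed answer.

```python
def access_tier(age_years: float) -> str:
    """Pick a tier from the age of the data (hypothetical helper).

    Assumes Cool is online with millisecond reads and Archive is the
    cheapest tier but offline until rehydrated.
    """
    if age_years > 7:
        return "Archive"  # never accessed, lowest storage cost
    if age_years > 5:
        return "Cool"     # infrequent access, still readable within a second
    return "Hot"          # new data, accessed frequently


for age in (1, 6, 8):
    print(age, access_tier(age))
```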

How should you manage the data? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 24

Options:

Questions 25

You are designing a database for an Azure Synapse Analytics dedicated SQL pool to support workloads for detecting ecommerce transaction fraud.

Data will be combined from multiple ecommerce sites and can include sensitive financial information such as credit card numbers.

You need to recommend a solution that meets the following requirements:

  • Users must be able to identify potentially fraudulent transactions.
  • Users must be able to use credit cards as a potential feature in models.
  • Users must NOT be able to access the actual credit card numbers.

What should you include in the recommendation?

Options:

A.

Transparent Data Encryption (TDE)

B.

row-level security (RLS)

C.

column-level encryption

D.

Azure Active Directory (Azure AD) pass-through authentication

Questions 26

You have an Azure subscription that contains the resources shown in the following table.

DP-203 Question 26

You need to ingest the Parquet files from storage1 to SQL1 by using pipeline1. The solution must meet the following requirements:

• Minimize complexity.

• Ensure that additional columns in the files are processed as strings.

• Ensure that files containing additional columns are processed successfully.

How should you configure pipeline1? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 26

Options:

Questions 27

You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv.

You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs.
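
Because the file names encode the hour, an hourly load only has to look at one folder. The sketch below shows how the folder prefix for a given hour could be derived from the window start time. It assumes that the date parts in the path are zero-padded; `hourly_prefix` and the sample timestamp are hypothetical and are not part of any Azure Data Factory API.

```python
from datetime import datetime


def hourly_prefix(window_start: datetime) -> str:
    """Build the {YYYY}/{MM}/{DD}/{HH}/ folder prefix for one hour.

    Assumes zero-padded month, day, and hour in the storage paths.
    """
    return window_start.strftime("%Y/%m/%d/%H/")


print(hourly_prefix(datetime(2024, 6, 2, 13)))  # prints 2024/06/02/13/
```

Listing only that prefix avoids scanning the full set of files on every run.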

How should you configure the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 27

Options:

Questions 28

You are designing a security model for an Azure Synapse Analytics dedicated SQL pool that will support multiple companies.

You need to ensure that users from each company can view only the data of their respective company.

Which two objects should you include in the solution? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

a custom role-based access control (RBAC) role.

B.

asymmetric keys

C.

a predicate function

D.

a column encryption key

E.

a security policy

Questions 29

You are monitoring an Azure Stream Analytics job by using metrics in Azure.

You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?

Options:

A.

Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs.

B.

There are errors in the input data.

C.

The late arrival policy causes events to be dropped.

D.

The job lacks the resources to process the volume of incoming data.

Questions 30

You have an Azure subscription that contains an Azure Data Lake Storage Gen2 account named storage1. Storage1 contains a container named container1. Container1 contains a directory named directory1. Directory1 contains a file named file1.

You have an Azure Active Directory (Azure AD) user named User1 that is assigned the Storage Blob Data Reader role for storage1.

You need to ensure that User1 can append data to file1. The solution must use the principle of least privilege.

Which permissions should you grant? To answer, drag the appropriate permissions to the correct resources. Each permission may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

DP-203 Question 30

Options:

Questions 31

You have a partitioned table in an Azure Synapse Analytics dedicated SQL pool.

You need to design queries to maximize the benefits of partition elimination.

What should you include in the Transact-SQL queries?

Options:

A.

JOIN

B.

WHERE

C.

DISTINCT

D.

GROUP BY

Questions 32

You are implementing a star schema in an Azure Synapse Analytics dedicated SQL pool.

You plan to create a table named DimProduct.

DimProduct must be a Type 3 slowly changing dimension (SCD) table that meets the following requirements:

• The values in two columns named ProductKey and ProductSourceID will remain the same.

• The values in three columns named ProductName, ProductDescription, and Color can change.

You need to add additional columns to complete the following table definition.

DP-203 Question 32

A)

DP-203 Question 32

B)

DP-203 Question 32

C)

DP-203 Question 32

D)

DP-203 Question 32

E)

DP-203 Question 32

F)

DP-203 Question 32

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

F.

Option F

Questions 33

You have an enterprise data warehouse in Azure Synapse Analytics that contains a table named FactOnlineSales. The table contains data from the start of 2009 to the end of 2012.

You need to improve the performance of queries against FactOnlineSales by using table partitions. The solution must meet the following requirements:

  • Create four partitions based on the order date.
  • Ensure that each partition contains all the orders placed during a given calendar year.
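
The two requirements above can be explored with a small simulation of how boundary values split rows into partitions. The sketch below assumes RANGE RIGHT semantics, where each boundary value belongs to the partition on its right, and uses three hypothetical boundary dates at the start of 2010, 2011, and 2012. It is plain Python that mimics the assignment rule, not T-SQL.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical boundaries: one per year change inside the 2009-2012 range.
BOUNDARIES = [date(2010, 1, 1), date(2011, 1, 1), date(2012, 1, 1)]


def partition_number(order_date: date) -> int:
    """Return the 1-based partition under RANGE RIGHT semantics, where a
    row equal to a boundary value lands in the partition to its right."""
    return bisect_right(BOUNDARIES, order_date) + 1


print(partition_number(date(2009, 12, 31)))  # prints 1
print(partition_number(date(2010, 1, 1)))    # prints 2
print(partition_number(date(2012, 12, 31)))  # prints 4
```

Three boundaries give four partitions. With RANGE LEFT and the same boundary values, an order dated January 1, 2010 would instead share a partition with the 2009 orders.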

How should you complete the T-SQL command? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 33

Options:

Questions 34

You are designing a solution that will use tables in Delta Lake on Azure Databricks.

You need to minimize how long it takes to perform the following:

• Queries against non-partitioned tables

• Joins on non-partitioned columns

Which two options should you include in the solution? Each correct answer presents part of the solution.

Options:

A.

Z-Ordering

B.

Apache Spark caching

C.

dynamic file pruning (DFP)

D.

the clone command

Questions 35

You have an Azure Data Factory pipeline that performs an incremental load of source data to an Azure Data Lake Storage Gen2 account.

Data to be loaded is identified by a column named LastUpdatedDate in the source table.

You plan to execute the pipeline every four hours.

You need to ensure that the pipeline execution meets the following requirements:

  • Automatically retries the execution when the pipeline run fails due to concurrency or throttling limits.
  • Supports backfilling existing data in the table.

Which type of trigger should you use?

Options:

A.

Storage event

B.

on-demand

C.

schedule

D.

tumbling window
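
To make the backfill requirement concrete, the sketch below shows how fixed four-hour windows divide a time range into contiguous, non-overlapping slices, which is the model that a tumbling window trigger in Azure Data Factory uses. Each slice could then filter the source on LastUpdatedDate. The function name and the sample dates are hypothetical, and the code illustrates the windowing idea only, not the trigger configuration.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, size=timedelta(hours=4)):
    """Yield contiguous, non-overlapping (window_start, window_end) pairs
    that fit entirely inside [start, end)."""
    current = start
    while current + size <= end:
        yield current, current + size
        current += size


# Backfill one past day: six four-hour slices, each loadable on its own.
windows = list(tumbling_windows(datetime(2024, 1, 1), datetime(2024, 1, 2)))
for window_start, window_end in windows:
    print(window_start, "<= LastUpdatedDate <", window_end)
```

Because every window is independent, a failed slice can be retried without reloading the others.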

Questions 36

You are designing the folder structure for an Azure Data Lake Storage Gen2 account.

You identify the following usage patterns:

• Users will query data by using Azure Synapse Analytics serverless SQL pools and Azure Synapse Analytics serverless Apache Spark pools.

• Most queries will include a filter on the current year or week.

• Data will be secured by data source.

You need to recommend a folder structure that meets the following requirements:

• Supports the usage patterns

• Simplifies folder security

• Minimizes query times

Which folder structure should you recommend?

A)

DP-203 Question 36

B)

DP-203 Question 36

C)

DP-203 Question 36

D)

DP-203 Question 36

E)

DP-203 Question 36

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

Questions 37

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Options:

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Questions 38

What should you recommend using to secure sensitive customer contact information?

Options:

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Questions 39

What should you do to improve high availability of the real-time data processing solution?

Options:

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Questions 40

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 40

Options:

Questions 41

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction dataset requirements.

What should you create?

Options:

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Questions 42

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

DP-203 Question 42

Options:

Questions 43

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

DP-203 Question 43

Options:

Questions 44

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

Options:

A.

change feed

B.

soft delete

C.

time-based retention

D.

lifecycle management

Exam Code: DP-203
Exam Name: Data Engineering on Microsoft Azure
Last Update: Jun 2, 2024
Questions: 331