Latest Databricks Certified Professional Data Engineer (Databricks-Certified-Professional-Data-Engineer) Free Exam Questions

A data engineer is building a simple data pipeline using Lakeflow Declarative Pipelines (LDP) in Databricks to ingest customer data. The raw customer data is stored in a cloud storage location in JSON format. The task is to create a Lakeflow Declarative Pipeline that reads the raw JSON data and writes it into a Delta table for further processing.
Which code snippet will correctly ingest the raw JSON data and create a Delta table using LDP?

Correct answer: A

The Databricks CLI is used to trigger a run of an existing job by passing the job_id parameter. The response indicating the job run request was submitted successfully includes a field run_id. Which statement describes what the number alongside this field represents?

Correct answer: A

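In the Databricks Jobs API, `run_id` identifies the run that was just triggered and is unique across all runs of all jobs, so it is a different number from the `job_id` passed in. The sketch below only shows how the field might be read from the response; the response text, the `run_id` value, and the `job_id` value are made up for illustration and are not real CLI output.

```python
import json

# Hypothetical response text; the run_id value is made up for illustration.
response_text = '{"run_id": 123456}'
job_id = 42  # hypothetical id of the existing job that was triggered

# run_id names this particular run, not the job definition.
run_id = json.loads(response_text)["run_id"]
print(f"job {job_id} started run {run_id}")
```

Triggering the same job again would return a new `run_id`, while the `job_id` stays the same.
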
A Delta Lake table representing metadata about content posted by users has the following schema:
user_id LONG, post_text STRING, post_id STRING, longitude FLOAT, latitude FLOAT, post_time TIMESTAMP, date DATE
Based on the above schema, which column is a good candidate for partitioning the Delta table?

Correct answer: D

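One rough way to reason about this question is to compare how many distinct values each column produces, since every distinct value of a partition column becomes its own directory. Low-cardinality columns such as `date` usually make better partition columns than identifiers, free text, or timestamps. The sketch below uses plain Python rather than Spark, a handful of made-up rows that are not from the exam, and a hypothetical helper named `distinct_counts`.

```python
from datetime import date, datetime

# Made-up sample rows that follow the schema in the question.
rows = [
    {"user_id": 1, "post_text": "hello", "post_id": "p1", "longitude": 10.5,
     "latitude": 50.1, "post_time": datetime(2024, 1, 1, 9, 0), "date": date(2024, 1, 1)},
    {"user_id": 2, "post_text": "lunch", "post_id": "p2", "longitude": 11.5,
     "latitude": 51.1, "post_time": datetime(2024, 1, 1, 12, 30), "date": date(2024, 1, 1)},
    {"user_id": 3, "post_text": "news", "post_id": "p3", "longitude": 12.5,
     "latitude": 52.1, "post_time": datetime(2024, 1, 2, 8, 15), "date": date(2024, 1, 2)},
    {"user_id": 1, "post_text": "bye", "post_id": "p4", "longitude": 13.5,
     "latitude": 53.1, "post_time": datetime(2024, 1, 2, 18, 45), "date": date(2024, 1, 2)},
]


def distinct_counts(records):
    """Return {column: number of distinct values} for a list of dict rows."""
    return {
        column: len({record[column] for record in records})
        for column in records[0]
    }


counts = distinct_counts(rows)
# The column with the fewest distinct values gives the fewest, largest partitions.
lowest = min(counts, key=counts.get)
print(counts)
print("lowest cardinality:", lowest)
```

On a real table the same comparison would be run over the full data, for example with approximate distinct counts, before settling on a partition column.
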
A transactions table has been liquid clustered on the columns product_id, user_id, and event_date.
Which operation lacks support for cluster on write?

Correct answer: B

A data architect has designed a system in which two Structured Streaming jobs will concurrently write to a single bronze Delta table. Each job is subscribing to a different topic from an Apache Kafka source, but they will write data with the same schema. To keep the directory structure simple, a data engineer has decided to nest a checkpoint directory to be shared by both streams.
The proposed directory structure is displayed below:

Which statement describes whether this checkpoint directory structure is valid for the given scenario and why?

Correct answer: E

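A Structured Streaming checkpoint records the offsets and state of a single query, so each of the two streams needs its own checkpoint location even though both write to the same table. The following is a minimal sketch of assigning one directory per stream; the root path, the stream names, and the helper `checkpoint_path` are assumptions, since the directory structure from the question is not reproduced here.

```python
import posixpath


def checkpoint_path(checkpoint_root, stream_name):
    """Build a checkpoint directory dedicated to one streaming query."""
    return posixpath.join(checkpoint_root, stream_name)


checkpoint_root = "/tmp/checkpoints/bronze"  # hypothetical location
streams = ["topic_a", "topic_b"]             # hypothetical stream names

checkpoints = {name: checkpoint_path(checkpoint_root, name) for name in streams}

# Every query must end up with a distinct checkpoint directory.
assert len(set(checkpoints.values())) == len(streams)

for name, path in checkpoints.items():
    # Each path would be passed to that stream's writer, e.g.
    # .option("checkpointLocation", path) on its writeStream.
    print(name, "->", path)
```
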
A Spark job is taking longer than expected. Using the Spark UI, a data engineer notes that the Min, Median, and Max Durations for tasks in a particular stage show the minimum and median time to complete a task as roughly the same, but the maximum duration for a task is roughly 100 times as long as the minimum.
Which situation is causing increased duration of the overall job?

Correct answer: E

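A minimum and median that are close together, combined with a maximum that is far larger, usually points to skew: a few tasks are handling much more data than the rest, and the stage waits for them. The sketch below flags that pattern from a list of task durations. The sample durations and the `skew_ratio` threshold are assumptions chosen for illustration, not values taken from Spark.

```python
from statistics import median


def summarize_durations(durations_s, skew_ratio=10.0):
    """Summarize task durations and flag a likely skewed stage.

    skew_ratio is an arbitrary illustrative threshold, not a Spark setting.
    """
    lo, mid, hi = min(durations_s), median(durations_s), max(durations_s)
    return {
        "min": lo,
        "median": mid,
        "max": hi,
        "likely_skew": hi >= skew_ratio * mid,
    }


# Hypothetical stage: 199 tasks take about 2 s, one straggler takes 200 s.
durations = [2.0] * 199 + [200.0]
summary = summarize_durations(durations)
print(summary)
```

Because a stage finishes only when its slowest task does, the single straggler in this made-up example sets the duration of the whole stage.
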
Which statement is true regarding the retention of job run history?

Correct answer: C

A junior data engineer has manually configured a series of jobs using the Databricks Jobs UI. Upon reviewing their work, the engineer realizes that they are listed as the "Owner" for each job. They attempt to transfer "Owner" privileges to the "DevOps" group, but cannot successfully accomplish this task.
Which statement explains what is preventing this privilege transfer?

Correct answer: E

Which is a key benefit of an end-to-end test?

Correct answer: A

A data organization has adopted Delta Sharing to securely distribute curated datasets from a Unity Catalog-enabled workspace. The data engineering team shares large Delta tables internally via Databricks-to-Databricks and externally via Open Sharing for aggregated reports. While testing, they encounter challenges related to access control, data update visibility, and shareable object types.
What is a limitation of the Delta Sharing protocol or implementation when used with Databricks-to-Databricks or Open Sharing?

Correct answer: C
