Cloudera CDP Data Engineer - Certification - CDP-3002: Try Free Dump Questions
You notice degraded read performance on an Iceberg table after many updates and deletes. What maintenance task should you perform to improve this?
Answer: D
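The answer options are not shown on this page, so the following is only a hedged illustration of one common maintenance task for this situation: compacting data files with Iceberg's `rewrite_data_files` Spark procedure. The catalog name `spark_catalog` and the table `db.events` are placeholder assumptions, and `spark` is assumed to be a SparkSession already configured with Iceberg.

```python
def compaction_sql(catalog: str, table: str) -> str:
    """Build the CALL statement for Iceberg's rewrite_data_files procedure."""
    return f"CALL {catalog}.system.rewrite_data_files(table => '{table}')"


def compact_table(spark, catalog: str, table: str):
    """Run compaction through an existing, Iceberg-enabled SparkSession."""
    return spark.sql(compaction_sql(catalog, table))


if __name__ == "__main__":
    # Placeholder names; replace with your own catalog and table.
    print(compaction_sql("spark_catalog", "db.events"))
```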
You're working with a Spark application that processes sensitive data. How can you ensure that persisted data remains secure even if accessed from unauthorized sources?
Answer: D
Why are partitioned tables beneficial in Hive for large datasets?
Answer: A
You want to debug an issue within your Spark application that interacts with Hive tables. What tools and techniques can you employ for effective debugging?
Answer: A, B
In Apache Spark, which storage level is recommended for caching data that is accessed frequently but is too large to fit in memory?
Answer: B
Which feature of Apache Avro facilitates dynamic schema inference during data serialization and deserialization?
Answer: D
Your Airflow DAG encounters an error during the data transformation stage. What information can you access in the Airflow UI to troubleshoot the issue?
Answer: D
If you want to set a minimum and maximum number of Executor pods for a Spark application in Kubernetes, which pair of PySpark configuration settings would you use?
Answer: A
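For orientation only, since the options are not reproduced here: Spark's dynamic allocation exposes `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` to bound the executor count. The sketch below builds such a configuration; enabling shuffle tracking is an assumption that typically applies on Kubernetes, where no external shuffle service is available. The helper names are hypothetical.

```python
def dynamic_allocation_conf(min_executors: int, max_executors: int) -> dict:
    """Return Spark settings that bound the number of executors."""
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Assumption: shuffle tracking stands in for an external shuffle service.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def apply_conf(builder, conf: dict):
    """Apply each setting to a SparkSession.builder-like object."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder
```

With PySpark installed, `apply_conf(SparkSession.builder, dynamic_allocation_conf(2, 10)).getOrCreate()` would create a session with these bounds.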
You want to schedule your Airflow DAG to run every hour, starting at midnight (00:00). How can you achieve this scheduling configuration?
Answer: A
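One way to express an hourly schedule anchored at midnight, sketched without knowing which option the page marks as correct, is the cron expression `0 * * * *` (equivalent to the `@hourly` preset) combined with a `start_date` at 00:00. The helper names and the example date are hypothetical; the `schedule` keyword assumes Airflow 2.4 or later, while older releases use `schedule_interval`.

```python
from datetime import datetime, timedelta

HOURLY_CRON = "0 * * * *"  # minute 0 of every hour


def hourly_dag_kwargs(start_day: datetime) -> dict:
    """Keyword arguments for an Airflow DAG that runs hourly from midnight."""
    midnight = start_day.replace(hour=0, minute=0, second=0, microsecond=0)
    return {"schedule": HOURLY_CRON, "start_date": midnight, "catchup": False}


def first_hourly_slots(start: datetime, count: int) -> list:
    """List the first `count` hourly slots beginning at `start`."""
    return [start + timedelta(hours=i) for i in range(count)]


if __name__ == "__main__":
    kwargs = hourly_dag_kwargs(datetime(2024, 1, 1, 15, 30))
    for slot in first_hourly_slots(kwargs["start_date"], 3):
        print(slot.isoformat())
```

These keyword arguments could then be passed to `DAG(dag_id="example_hourly", **kwargs)`, where the `dag_id` is a placeholder.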
You encounter an error during the execution of your Airflow DAG. How can you identify the root cause of the issue and debug it effectively?
Answer: D
Your Airflow DAG includes tasks that can potentially fail due to various reasons. How can you handle such failures and ensure the overall workflow continues as intended?
Answer: C, D
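Two mechanisms Airflow offers for tolerating task failures are per-task retries and trigger rules. The sketch below is an illustration under assumptions, not the page's marked answer: it builds a `default_args` dictionary using the real `retries`, `retry_delay`, and `retry_exponential_backoff` task arguments, and names the `all_done` trigger rule, which lets a downstream task run whether its upstream tasks succeeded or failed. The default values are arbitrary.

```python
from datetime import timedelta

# Trigger rule that runs a task once all upstream tasks have finished,
# regardless of whether they succeeded or failed.
ALWAYS_RUN_RULE = "all_done"


def retry_default_args(retries: int = 3, delay_minutes: int = 5) -> dict:
    """Build default_args that make every task in a DAG retry on failure."""
    return {
        "retries": retries,
        "retry_delay": timedelta(minutes=delay_minutes),
        "retry_exponential_backoff": True,  # wait longer between later attempts
    }
```

A DAG would receive these as `default_args=retry_default_args()`, and a cleanup task could set `trigger_rule=ALWAYS_RUN_RULE`.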
You're tasked with optimizing an existing Airflow DAG that processes large datasets daily. The DAG has multiple tasks, some of which frequently fail due to memory constraints on the worker nodes. Which approach would best mitigate this issue without upgrading hardware?
Answer: C
As a data engineer, you are working with PySpark to analyze data stored in an HDFS cluster. You need to read a CSV file into a Spark DataFrame. Which of the following code snippets correctly reads the data from HDFS into a DataFrame?
Answer: A
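The candidate snippets are not included on this page. As a minimal sketch of the usual pattern, a CSV file on HDFS can be loaded through the `DataFrameReader` exposed as `spark.read`. The path `hdfs:///data/example.csv` and the choice of `header` and `inferSchema` options are assumptions for illustration; `spark` is an existing SparkSession.

```python
def read_csv_from_hdfs(spark, path: str):
    """Read a CSV file into a DataFrame, treating the first row as a header."""
    return (
        spark.read
        .options(header=True, inferSchema=True)  # assumed options, adjust as needed
        .csv(path)
    )


# Example call (hypothetical path), given a live SparkSession named `spark`:
# df = read_csv_from_hdfs(spark, "hdfs:///data/example.csv")
```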