In PySpark, which method returns a new DataFrame with duplicate rows removed, optionally considering only a subset of columns when identifying duplicates?
A. pyspark.sql.DataFrame.drop
B. pyspark.sql.DataFrame.distinct
C. pyspark.sql.DataFrame.dropDuplicates
D. pyspark.sql.DataFrame.na.drop
E. pyspark.sql.DataFrame.dropna