
You are developing a Spark ML pipeline for a text classification task. The pipeline includes several stages such as tokenization, removing stop words, TF-IDF transformation, and a logistic regression model. Describe how you would implement this pipeline using Spark ML, including the use of Estimators and Transformers. Additionally, discuss any potential pitfalls in developing such a pipeline and how you would mitigate them.
A. Use only Transformers and skip Estimators.
B. Combine all stages into a single Transformer.
C. Use a Pipeline object with each stage as an individual step.
D. Ignore the order of stages and apply them randomly.