
You are working on a project that requires collaborative filtering for a recommender system. The dataset has a large number of users and items, and you need to serve recommendations in real time. Explain how you would use Apache Spark to build a real-time recommender system using collaborative filtering.
A
To build a real-time recommender system using collaborative filtering with Apache Spark, you would first represent the user-item interactions as an RDD (Resilient Distributed Dataset) or a DataFrame in Spark. You would then use Spark's machine learning library, MLlib, to train a collaborative filtering model, such as ALS (alternating least squares) matrix factorization, on the historical interaction data. Finally, you would use Spark's in-memory computing capabilities to serve recommendations in real time, scoring users against the trained model and their latest interactions.
B
To build a real-time recommender system using collaborative filtering with Apache Spark, you would first represent the user-item interactions as an RDD (Resilient Distributed Dataset) or a DataFrame in Spark. Then, you would use Spark's machine learning libraries, such as MLlib, to train a collaborative filtering model, such as matrix factorization, on the historical interaction data. However, you would not make recommendations for users in real time based on the trained model and the latest user interactions.
C
To build a real-time recommender system using collaborative filtering with Apache Spark, you would first represent the user-item interactions as an RDD (Resilient Distributed Dataset) or a DataFrame in Spark. Then, you would manually inspect each user's preferences and make recommendations without using any machine learning models or algorithms.
D
To build a real-time recommender system using collaborative filtering with Apache Spark, you would first represent the user-item interactions as an RDD (Resilient Distributed Dataset) or a DataFrame in Spark. Then, you would use a single machine to train a collaborative filtering model and make recommendations for users in real time, without leveraging the distributed computing power of Spark.
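The matrix-factorization approach described in option A can be sketched in miniature. The toy example below uses plain Python and entirely hypothetical ratings (the user/item ids, factor count, and learning-rate settings are invented for illustration). In a real Spark pipeline you would instead call `pyspark.ml.recommendation.ALS` on a DataFrame of (user, item, rating) rows, and ALS would fit the factors with distributed alternating least squares rather than the simple single-machine gradient descent used here.

```python
import random

random.seed(0)

# Hypothetical (user_id, item_id, rating) interactions.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

n_users, n_items, k = 3, 3, 2  # k = number of latent factors

# Random initial factor vectors for each user and each item.
U = [[random.random() for _ in range(k)] for _ in range(n_users)]
V = [[random.random() for _ in range(k)] for _ in range(n_items)]

def predict(u, i):
    """Predicted rating = dot product of user and item factor vectors."""
    return sum(U[u][f] * V[i][f] for f in range(k))

# Stochastic gradient descent on squared error with L2 regularization.
# (Spark's ALS instead alternates closed-form least-squares solves for
# U and V, which is what makes it easy to distribute across a cluster.)
lr, reg = 0.05, 0.01
for _ in range(500):
    for u, i, r in ratings:
        err = r - predict(u, i)
        for f in range(k):
            uf, vf = U[u][f], V[i][f]
            U[u][f] += lr * (err * vf - reg * uf)
            V[i][f] += lr * (err * uf - reg * vf)

# After training, recommendations come from ranking a user's predicted
# ratings for items they have not yet interacted with.
print(round(predict(0, 0), 2))  # close to the observed rating of 5.0
```

At Spark scale, the trained user and item factor matrices can be kept in memory (or a fast store) so that scoring a user against candidate items at request time is just a set of dot products, which is what makes the real-time serving step in option A feasible.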