The data science team has deployed a production model in MLflow that takes a list of column names as input and outputs a new column of type DOUBLE.
Given that the following code correctly imports the production model, loads the customers table (containing the customer_id key column) into a DataFrame, and defines the required feature columns:
model = mlflow.pyfunc.spark_udf(spark, model_uri="models:/churn/prod")
df = spark.table("customers")
columns = ["account_age", "time_since_last_seen", "app_rating"]
Which code block will produce a DataFrame with the schema customer_id LONG, predictions DOUBLE?