You are using Spark MLlib on a large dataset and need to preprocess the data to improve the quality of your machine learning model. Which of the following data preprocessing techniques can be applied in Spark MLlib, and how does each one work?