
While developing a machine learning model for a financial services company, you are tasked with preprocessing a large dataset of customer transaction records. The dataset contains missing values and outliers, and it requires extensive manipulation to extract meaningful features. Given the need for efficient manipulation and exploration of tabular data, which Python library would you primarily use to address these challenges? Choose the best option.
A. Matplotlib, for its superior data visualization capabilities that can help in identifying outliers and trends.
B. Pandas, for its comprehensive data structures and functions designed for data manipulation and analysis.
C. TensorFlow, for its advanced capabilities in building and training deep learning models on the processed data.
D. NumPy, for its efficient numerical computations and array manipulations that can speed up data preprocessing.
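
For reference, the three preprocessing tasks named in the scenario (missing values, outliers, feature extraction) can be sketched on a small table. This is a minimal illustration using pandas, one of the listed libraries. The column names, the sample values, the median fill, and the 1.5 * IQR outlier rule are all assumptions made for the example and are not part of the question.

```python
import pandas as pd

# Hypothetical transaction records; names and values are illustrative only.
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3, 3],
    "amount": [20.0, None, 35.0, 5000.0, 15.0, 25.0],
})

# Missing values: fill gaps with the column median (one common choice).
df["amount"] = df["amount"].fillna(df["amount"].median())

# Outliers: flag rows outside 1.5 * IQR of the quartiles (assumed rule).
q1, q3 = df["amount"].quantile([0.25, 0.75])
iqr = q3 - q1
df["is_outlier"] = (df["amount"] < q1 - 1.5 * iqr) | (df["amount"] > q3 + 1.5 * iqr)

# Feature extraction: per-customer aggregates over the non-outlier rows.
features = (
    df[~df["is_outlier"]]
    .groupby("customer_id")["amount"]
    .agg(total="sum", mean="mean", count="count")
    .reset_index()
)

print(features)
```

In a real pipeline the fill strategy and the outlier threshold would depend on the data and on the model being trained.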