6.3. Preprocessing Data — Scikit-learn 1.1.2 Documentation
7.3.1.1. Scaling features to a range
An alternative standardization is scaling features to lie between a given minimum and maximum value, often between zero and one, or so that the maximum absolute value of each feature is scaled to unit size. This can be achieved using MinMaxScaler or MaxAbsScaler, respectively.
The motivation to use this scaling includes robustness to very small standard deviations of features and preserving zero entries in sparse data.
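The sparsity point can be illustrated with a minimal sketch, assuming SciPy is installed alongside scikit-learn. The matrix values below are made up for illustration. Because MaxAbsScaler only divides each column by its maximum absolute value and never shifts the data, a sparse input stays sparse and its zero entries remain zero:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

# Toy sparse matrix; the values are illustrative and not from the text above.
X_sparse = sparse.csr_matrix(np.array([[0.,  4., 0.],
                                       [2.,  0., 0.],
                                       [0., -8., 5.]]))

scaler = MaxAbsScaler()
X_scaled = scaler.fit_transform(X_sparse)

# The output is still a sparse matrix with the same number of stored entries,
# so no zero entry has been turned into a non-zero one.
print(sparse.issparse(X_scaled))
print(X_scaled.nnz == X_sparse.nnz)

# Dense view of the result, for inspection only.
print(X_scaled.toarray())
```

A scaler that shifts the data (for example, to map the minimum to zero) would generally turn zero entries into non-zero ones, which is why pure rescaling suits sparse input.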
Here is an example to scale a toy data matrix to the [0, 1] range:
>>> import numpy as np
>>> from sklearn import preprocessing
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
>>> min_max_scaler = preprocessing.MinMaxScaler()
>>> X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[0.5       , 0.        , 1.        ],
       [1.        , 0.5       , 0.33333333],
       [0.        , 1.        , 0.        ]])

The same instance of the transformer can then be applied to new test data unseen during the fit call. The same scaling and shifting operations are applied, so the result is consistent with the transformation performed on the training data:
>>> X_test = np.array([[-3., -1.,  4.]])
>>> X_test_minmax = min_max_scaler.transform(X_test)
>>> X_test_minmax
array([[-1.5       ,  0.        ,  1.66666667]])

It is possible to inspect the scaler attributes to find out the exact nature of the transformation learned on the training data:
>>> min_max_scaler.scale_
array([0.5       , 0.5       , 0.33333333])
>>> min_max_scaler.min_
array([0.        , 0.5       , 0.33333333])

If MinMaxScaler is given an explicit feature_range=(min, max), the full formula is:
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

MaxAbsScaler works in a very similar fashion, but scales the training data to lie within the range [-1, 1] by dividing each feature by its maximum absolute value. It is meant for data that is already centered at zero or for sparse data.
Here is how to use the toy data from the previous example with this scaler:
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
>>> max_abs_scaler = preprocessing.MaxAbsScaler()
>>> X_train_maxabs = max_abs_scaler.fit_transform(X_train)
>>> X_train_maxabs
array([[ 0.5, -1. ,  1. ],
       [ 1. ,  0. ,  0. ],
       [ 0. ,  1. , -0.5]])
>>> X_test = np.array([[-3., -1.,  4.]])
>>> X_test_maxabs = max_abs_scaler.transform(X_test)
>>> X_test_maxabs
array([[-1.5, -1. ,  2. ]])
>>> max_abs_scaler.scale_
array([2., 1., 2.])
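The feature_range formula given earlier for MinMaxScaler can be checked by hand. The following is a minimal sketch, not part of the original examples: the target range (-1, 1) and the names `lo`, `hi`, `X_manual`, and `X_sklearn` are chosen here for illustration. It applies the formula with plain NumPy, compares the result with MinMaxScaler, and confirms that the fitted scale_ and min_ attributes describe the same affine transformation.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same toy training data as in the examples above.
X_train = np.array([[1., -1.,  2.],
                    [2.,  0.,  0.],
                    [0.,  1., -1.]])

# Illustrative target range (an assumption for this sketch).
lo, hi = -1.0, 1.0

# Manual application of the documented formula, column by column.
col_min = X_train.min(axis=0)
col_max = X_train.max(axis=0)
X_std = (X_train - col_min) / (col_max - col_min)
X_manual = X_std * (hi - lo) + lo

# The same transformation through the estimator.
scaler = MinMaxScaler(feature_range=(lo, hi))
X_sklearn = scaler.fit_transform(X_train)
print(np.allclose(X_manual, X_sklearn))

# scale_ and min_ encode the transformation as X * scale_ + min_.
print(np.allclose(X_train * scaler.scale_ + scaler.min_, X_sklearn))
```

Note that the manual version divides by zero for a constant feature (maximum equal to minimum), so it is only meant as a check on data where every column varies.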