6.3. Preprocessing Data — Scikit-learn 1.1.2 Documentation


7.3.1.1. Scaling features to a range

An alternative standardization is scaling features to lie between a given minimum and maximum value, often between zero and one, or so that the maximum absolute value of each feature is scaled to unit size. This can be achieved using MinMaxScaler or MaxAbsScaler, respectively.

The motivation to use this scaling includes robustness to very small standard deviations of features and preserving zero entries in sparse data.

Here is an example that scales a toy data matrix to the [0, 1] range:

>>> import numpy as np
>>> from sklearn import preprocessing
>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
>>> min_max_scaler = preprocessing.MinMaxScaler()
>>> X_train_minmax = min_max_scaler.fit_transform(X_train)
>>> X_train_minmax
array([[0.5       , 0.        , 1.        ],
       [1.        , 0.5       , 0.33333333],
       [0.        , 1.        , 0.        ]])

The same transformer instance can then be applied to new test data that was not seen during the fit call. The scaling and shifting learned from the training data are reused, so the test data is transformed consistently with the training data:

>>> X_test = np.array([[-3., -1.,  4.]])
>>> X_test_minmax = min_max_scaler.transform(X_test)
>>> X_test_minmax
array([[-1.5       ,  0.        ,  1.66666667]])
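Because the test values fall outside the range seen during fit, the transformed values (-1.5 and about 1.67) fall outside [0, 1]. If that is undesirable, one option is the clip parameter of MinMaxScaler (available from scikit-learn 0.24 onward), which truncates transformed values to the feature range. The following is a minimal sketch reusing the toy data; the variable names are illustrative only:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
X_test = np.array([[-3., -1., 4.]])

# clip=True truncates transformed values to feature_range (default (0, 1))
clipping_scaler = MinMaxScaler(clip=True).fit(X_train)
X_test_clipped = clipping_scaler.transform(X_test)
print(X_test_clipped)
```

Clipping discards how far a test value lies outside the training range, so whether it is appropriate depends on the downstream model.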

It is possible to inspect the scaler attributes to find out the exact nature of the transformation learned on the training data:

>>> min_max_scaler.scale_
array([0.5       , 0.5       , 0.33333333])
>>> min_max_scaler.min_
array([0.        , 0.5       , 0.33333333])
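These two attributes describe the learned transformation as a per-feature multiply and shift. As a rough check, assuming the same toy data as above, applying X * scale_ + min_ by hand should reproduce what transform returns:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1., -1., 2.],
                    [2., 0., 0.],
                    [0., 1., -1.]])
X_test = np.array([[-3., -1., 4.]])

scaler = MinMaxScaler().fit(X_train)

# Apply the learned per-feature scale and shift by hand
manual = X_test * scaler.scale_ + scaler.min_

# Compare against the scaler's own transform
matches = np.allclose(manual, scaler.transform(X_test))
print(matches)
```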

If MinMaxScaler is given an explicit feature_range=(min, max), the full formula is:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min
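The formula can be checked directly with NumPy. The sketch below uses a hypothetical target range of (-1, 1), named lo and hi here to avoid shadowing Python's built-in min and max, and compares the hand-computed result with MinMaxScaler:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

lo, hi = -1.0, 1.0  # hypothetical feature_range, chosen for illustration

# Step 1: map each column to [0, 1]
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
# Step 2: stretch and shift to [lo, hi]
X_scaled = X_std * (hi - lo) + lo

# The scaler should produce the same values
X_sklearn = MinMaxScaler(feature_range=(lo, hi)).fit_transform(X)
agree = np.allclose(X_scaled, X_sklearn)
print(agree)
```

Note that the formula divides by the per-feature range, so a constant column needs special handling if it is applied by hand.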

MaxAbsScaler works in a very similar fashion, but scales the data so that the training data lies within the range [-1, 1] by dividing each feature by its maximum absolute value. It is meant for data that is already centered at zero or for sparse data.

Here is how to use the toy data from the previous example with this scaler:

>>> X_train = np.array([[ 1., -1.,  2.],
...                     [ 2.,  0.,  0.],
...                     [ 0.,  1., -1.]])
>>> max_abs_scaler = preprocessing.MaxAbsScaler()
>>> X_train_maxabs = max_abs_scaler.fit_transform(X_train)
>>> X_train_maxabs
array([[ 0.5, -1. ,  1. ],
       [ 1. ,  0. ,  0. ],
       [ 0. ,  1. , -0.5]])
>>> X_test = np.array([[ -3., -1.,  4.]])
>>> X_test_maxabs = max_abs_scaler.transform(X_test)
>>> X_test_maxabs
array([[-1.5, -1. ,  2. ]])
>>> max_abs_scaler.scale_
array([2., 1., 2.])
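Since MaxAbsScaler only divides and never shifts, zero entries stay zero, which is why it suits sparse input. The following sketch, assuming SciPy is installed and reusing the toy matrix in CSR form, illustrates that the result remains sparse with the same number of stored nonzeros:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler

# Same toy data as above, stored as a sparse CSR matrix
X_sparse = sparse.csr_matrix(np.array([[1., -1., 2.],
                                       [2., 0., 0.],
                                       [0., 1., -1.]]))

X_sparse_scaled = MaxAbsScaler().fit_transform(X_sparse)

# The output is still sparse and no zero entry has been filled in
still_sparse = sparse.issparse(X_sparse_scaled)
same_nnz = X_sparse_scaled.nnz == X_sparse.nnz
print(still_sparse, same_nnz)
```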
