Updated: 2023-02-13 13:06:29
Check out the excellent scikit-learn docs.
As you can see there, MinMaxScaler supports partial_fit(). This allows online (minibatch) scaling, and you control the minibatches yourself.
Example:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
a = np.array([[1,2,3]])
b = np.array([[10,20,30]])
c = np.array([[5, 10, 15]])
# Scale on all datasets together in one batch
offline_scaler = MinMaxScaler()
offline_scaler.fit(np.vstack((a, b, c))) # fit on whole data at once
a_offline_scaled = offline_scaler.transform(a)
b_offline_scaled = offline_scaler.transform(b)
c_offline_scaled = offline_scaler.transform(c)
print('Offline scaled')
print(a_offline_scaled)
print(b_offline_scaled)
print(c_offline_scaled)
# Scale on all datasets together in minibatches
online_scaler = MinMaxScaler()
online_scaler.partial_fit(a) # partial fit 1
online_scaler.partial_fit(b) # partial fit 2
online_scaler.partial_fit(c) # partial fit 3
a_online_scaled = online_scaler.transform(a)
b_online_scaled = online_scaler.transform(b)
c_online_scaled = online_scaler.transform(c)
print('Online scaled')
print(a_online_scaled)
print(b_online_scaled)
print(c_online_scaled)
Output:
Offline scaled
[[ 0. 0. 0.]]
[[ 1. 1. 1.]]
[[ 0.44444444 0.44444444 0.44444444]]
Online scaled
[[ 0. 0. 0.]]
[[ 1. 1. 1.]]
[[ 0.44444444 0.44444444 0.44444444]]
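The two outputs match because MinMaxScaler only needs the per-feature minimum and maximum, and partial_fit() updates both as each batch arrives. The same pattern should carry over to data that comes in chunks, for example when it does not fit in memory. Here is a minimal sketch of that idea. The random `data` array and the choice of 10 chunks are assumptions made up for illustration and are not part of the example above:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical dataset standing in for data that arrives in chunks.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 3))

# Online: update the running per-feature min/max one minibatch at a time.
online_scaler = MinMaxScaler()
for batch in np.array_split(data, 10):
    online_scaler.partial_fit(batch)

# Offline: fit on the whole array at once, for comparison.
offline_scaler = MinMaxScaler().fit(data)

same_min = np.allclose(online_scaler.data_min_, offline_scaler.data_min_)
same_max = np.allclose(online_scaler.data_max_, offline_scaler.data_max_)
same_output = np.allclose(online_scaler.transform(data),
                          offline_scaler.transform(data))
print(same_min, same_max, same_output)
```

All three checks should come out true. Keep in mind that transform() uses whatever min/max the scaler has seen so far, so anything transformed before the last partial_fit() call may be scaled differently from anything transformed after it.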