How to perform K-means clustering on time series data?

Updated: 2023-12-02 21:52:34

Time series are usually high-dimensional, and you need specialized distance functions to compare them for similarity. There may also be outliers.

k-means is designed for low-dimensional spaces with a (meaningful) Euclidean distance. It is not very robust to outliers, because it puts squared weight on them.
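To make the "squared weight" point concrete, here is a small sketch with made-up numbers (the values are purely illustrative, not from the question). A k-means centroid is the mean of its cluster, which minimizes the sum of squared deviations, so a single outlier drags it much further than it drags the median:

```python
from statistics import mean, median

# Hypothetical 1-D values assigned to one cluster; 100.0 is the outlier.
values = [1.0, 2.0, 3.0, 4.0, 100.0]

def sum_squared(center, points):
    """Cost that k-means minimizes within a cluster."""
    return sum((p - center) ** 2 for p in points)

def sum_absolute(center, points):
    """A more outlier-tolerant cost, minimized by the median."""
    return sum(abs(p - center) for p in points)

centroid = mean(values)   # what k-means would use as the cluster center
middle = median(values)   # stays with the bulk of the data

# Share of the squared cost that comes from the single outlier.
outlier_share = (100.0 - centroid) ** 2 / sum_squared(centroid, values)

print(centroid, middle)  # → 22.0 3.0
```

Here the outlier alone accounts for roughly 80% of the squared cost, and the centroid ends up at a value that none of the regular points are close to.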

Using k-means on time series data does not sound like a good idea to me. Try looking into more modern, more robust clustering algorithms. Many of them allow you to use arbitrary distance functions, including time series distances such as DTW (dynamic time warping).
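As one possible illustration of "a clustering algorithm with an arbitrary distance function", the sketch below pairs a textbook DTW with a simple k-medoids loop. This is my own choice of algorithm, not something the answer prescribes, and the function names (`dtw`, `farthest_first`, `k_medoids`), the farthest-first initialization, and the toy series are all assumptions made for the example. It is pure Python and quadratic in both series length and number of series, so it only suits small data.

```python
def dtw(a, b):
    """Textbook dynamic time warping with absolute-difference step cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)          # row 0 of the cost table
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of: insertion, match, deletion.
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]

def farthest_first(dist, k):
    """Deterministic init: start at index 0, then add the farthest point."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(len(dist)),
                           key=lambda i: min(dist[i][m] for m in medoids)))
    return medoids

def k_medoids(series, k, distance, max_iter=100):
    """Cluster with any pairwise distance; centers are actual series."""
    n = len(series)
    dist = [[distance(series[i], series[j]) for j in range(n)] for i in range(n)]
    medoids = farthest_first(dist, k)

    def assign(meds):
        # Label each series with the position of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        updated = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:                # keep the old medoid for an empty cluster
                updated.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(members, key=lambda m: sum(dist[m][i] for i in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids), medoids

# Toy data (hypothetical): three time-shifted bumps and three ramps.
series = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 1, 0, 0],
    [0, 0, 0, 0, 1, 2, 1, 0],
    [0, 1, 2, 3, 4, 5, 6, 7],
    [0, 0, 1, 2, 3, 4, 5, 6],
    [1, 2, 3, 4, 5, 6, 7, 7],
]
labels, medoids = k_medoids(series, k=2, distance=dtw)
print(labels, medoids)
```

Because the cluster centers are medoids (real series) rather than means, nothing here requires averaging time series, and swapping in another distance only means passing a different `distance` callable. The shifted bumps have a DTW distance of 0 to each other even though their point-by-point Euclidean distance is not 0, which is why they end up in the same cluster.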
