Updated: 2023-11-23 21:07:22
Profiling will help, but the place to rework your code is the loop over the number of data points (for point = 1:size(data,1)). Vectorize that.
Inside your for iteration loop, here is a quick partial example:
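% data is NxD (one point per row); mu_k is KxD (one center per row); c is the number of clusters
[nPoints,nDims] = size(data);
% Calculate all high-dimensional distances at once
kdiffs = bsxfun(@minus,data,permute(mu_k,[3 2 1])); % NxDx1 - 1xDxK => NxDxK
distances = sum(kdiffs.^2,2); % Nx1xK, squared distances (no need to do sqrt)
distances = squeeze(distances); % Nx1xK => NxK
% Find closest cluster center for each point
[~,ik] = min(distances,[],2); % Nx1
% Calculate the new cluster centers (mean the data)
mu_k_new = zeros(c,nDims);
clustersizes = zeros(c,1);
for i=1:c
    indk = ik==i;
    clustersizes(i) = nnz(indk);
    % Average over rows explicitly, so a cluster with a single point still gives a 1xD result
    mu_k_new(i,:) = mean(data(indk,:),1); % NaN if the cluster is empty
end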
This isn't the only (or the best) way to do it, but it should be a decent example.
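To check that the vectorized distance calculation matches the point-by-point loop, a minimal sketch on toy data may help. The data values and the two centers here are made up for illustration, and reshape is used in place of squeeze only so the result stays NxK even for a single point.

```matlab
% Toy data (hypothetical): 6 points in 2-D and 2 cluster centers
data = [0 0; 0 1; 1 0; 9 9; 9 10; 10 9];
mu_k = [0 0; 10 10];
[nPoints, nDims] = size(data);
c = size(mu_k, 1);

% Loop version: one point and one center at a time
distLoop = zeros(nPoints, c);
for point = 1:nPoints
    for k = 1:c
        d = data(point,:) - mu_k(k,:);
        distLoop(point,k) = sum(d.^2);
    end
end

% Vectorized version: NxDx1 minus 1xDxK expands to NxDxK
kdiffs = bsxfun(@minus, data, permute(mu_k, [3 2 1]));
distVec = reshape(sum(kdiffs.^2, 2), nPoints, c); % Nx1xK => NxK

% Closest center for each point
[~, ik] = min(distVec, [], 2);

disp(isequal(distLoop, distVec)) % both versions should agree
disp(ik')                        % cluster index per point
```

With these values the first three points are assigned to center 1 and the last three to center 2, and the two distance matrices are identical.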
Some other comments:
Rather than calling input, make this script into a function to efficiently handle input arguments.
If the user needs to choose a file, uigetfile opens a file-selection dialog.
With many functions, such as max, min, sum, mean, etc., you can specify a dimension over which the function should operate. This way you can run it on a matrix and compute values for multiple conditions/dimensions at the same time.
The closest cluster for each point, ik, will be the same with squared Euclidean distance, so the sqrt can be skipped.
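As a small illustration of the dimension argument and of why the square root can be dropped, here is a sketch with made-up matrices (X and sq are hypothetical values, not taken from the question):

```matlab
% Dimension argument: operate along rows or columns of a matrix in one call
X = [1 2 3; 4 5 6];
colSums  = sum(X, 1);           % sum down each column -> 1x3
rowSums  = sum(X, 2);           % sum across each row  -> 2x1
rowMeans = mean(X, 2);          % mean across each row -> 2x1
[rowMax, idx] = max(X, [], 2);  % max of each row and its column index

% sqrt is monotonic, so the index of the smallest squared distance
% is also the index of the smallest true distance
sq = [4 1 9; 16 25 0.25];       % hypothetical squared distances, NxK
[~, ikSq] = min(sq, [], 2);
[~, ikRt] = min(sqrt(sq), [], 2);
disp(isequal(ikSq, ikRt))
```

Note that min and max take the dimension as the third argument, with an empty second argument, while sum and mean take it as the second.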