且构网

Random numbers that add to 100: Matlab

Updated: 2023-02-10 12:44:49

I see this mistake often: the suggestion that, to generate random numbers with a given sum, you just draw a uniform random set and scale it. But is the result truly uniformly random if you do it that way?

Try this simple test in two dimensions. Generate a huge random sample, then scale them to sum to 1. I'll use bsxfun to do the scaling.

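xy = rand(10000000,2);                 % 10 million points, uniform in the unit square
xy = bsxfun(@times,xy,1./sum(xy,2));   % divide each row by its sum, so x + y = 1
hist(xy(:,1),100)                      % histogram of the x coordinate, 100 bins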

If they were truly uniformly random, then the x coordinate would be uniform, as would the y coordinate. Any value would be equally likely to happen. In effect, for the two coordinates to sum to 1, the point must lie along the line that connects (0,1) and (1,0) in the (x,y) plane. For the points to be uniform, any point along that line must be equally likely.

Clearly, uniformity fails when I use the scaling solution: the histogram of x is peaked near 0.5 rather than flat, so points on that line are NOT equally likely. We can see the same thing happening in 3 dimensions. In the 3-d figure here, the points in the center of the triangular region are more densely packed. This is a reflection of non-uniformity.

xyz = rand(10000,3);                      % 10000 points, uniform in the unit cube
xyz = bsxfun(@times,xyz,1./sum(xyz,2));   % scale each row so x + y + z = 1
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on

Again, the simple scaling solution fails. It simply does NOT produce truly uniform results over the domain of interest.
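
To put a number on the 2-d case: for independent uniform u and v, the ratio x = u/(u+v) has density 1/(2(1-x)^2) for x <= 1/2 and 1/(2x^2) for x >= 1/2, so the middle of the line is four times as dense as the ends. The sketch below is not part of the original test. It assumes a MATLAB release that has rng and histcounts, and the seed, bin count and variable names are my own choices. It compares the empirical density of the scaled sample with that formula.

```matlab
% Hypothetical check of the scaling approach in 2-d (names and seed are illustrative).
rng(0);                                   % fixed seed so the run is repeatable
n = 1e6;
u = rand(n,2);
x = u(:,1) ./ sum(u,2);                   % x coordinate after scaling each row to sum 1

edges     = linspace(0,1,21);             % 20 equal-width bins on [0,1]
counts    = histcounts(x, edges);
width     = edges(2) - edges(1);
empirical = counts / (n*width);           % empirical density in each bin

% Density of u/(u+v) for independent uniform u, v, evaluated at bin centers:
%   1/(2*(1-x)^2) for x <= 1/2,  1/(2*x^2) for x >= 1/2
centers = (edges(1:end-1) + edges(2:end)) / 2;
theory  = 1 ./ (2*max(centers, 1-centers).^2);

maxErr = max(abs(empirical - theory));    % small if the formula matches the sample
disp(maxErr)
```

A uniform result would give an empirical density close to 1 in every bin; here it ranges from roughly 0.5 at the ends to roughly 2 in the middle.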

Can we do better? Well, yes. A simple solution in 2-d is to generate a single random number that designates the distance along the line connecting the points (0,1) and (1,0).

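t = rand(10000000,1);          % position along the line, uniform on (0,1)
xy = t*[0 1] + (1-t)*[1 0];    % blend the endpoints (0,1) and (1,0); each row sums to 1
hist(xy(:,1),100)              % histogram of the x coordinate, 100 bins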

It can be shown that ANY point along the line defined by the equation x+y = 1, in the unit square, is now equally likely to have been chosen. This is reflected by the nice, flat histogram.
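
The flatness can also be checked numerically rather than by eye. The following is a minimal sketch, assuming a MATLAB release with rng and histcounts; the seed, sample size and bin count are arbitrary choices of mine. Since x = 1 - t with t uniform, every bin should hold about the same number of points, up to sampling noise.

```matlab
% Hypothetical flatness check for the line-parameter approach (illustrative names).
rng(0);                                        % fixed seed so the run is repeatable
n  = 1e6;
t  = rand(n,1);
xy = t*[0 1] + (1-t)*[1 0];                    % same construction as above

counts   = histcounts(xy(:,1), linspace(0,1,21));   % 20 equal-width bins
expected = n/20;                               % count per bin if x is uniform
relDev   = max(abs(counts - expected)) / expected;  % worst relative deviation
disp(relDev)
```

With a million points the worst bin should deviate from the expected count by only a percent or so, which is what sampling noise alone would produce.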

Does the sort trick suggested by David Schwartz work in n-dimensions? Clearly it does so in 2-d, and the figure below suggests that it does so in 3-dimensions. Without deep thought on the matter, I believe that it will work for this basic case in question, in n-dimensions.

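n = 10000;
% two sorted uniform cut points per row, padded with 0 and 1
uv = [zeros(n,1),sort(rand(n,2),2),ones(n,1)];
% the three gaps between consecutive cut points are nonnegative and sum to 1
xyz = diff(uv,[],2);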

plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
box on
grid on
view(70,35)
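
The same construction carries over to d dimensions: the gaps between d-1 sorted uniform cut points on [0,1] are known to follow a flat Dirichlet distribution, which is the uniform distribution on the simplex. A minimal sketch under that reading follows. The values of d and S and the variable names are illustrative choices (S = 100 matches the title), not values from the original answer.

```matlab
% Hypothetical d-dimensional version of the sort trick (d, S and names are illustrative).
rng(0);                                   % fixed seed so the run is repeatable
n = 10000;                                % number of random points
d = 5;                                    % number of values per point
S = 100;                                  % required sum of each point

cuts = sort(rand(n,d-1), 2);              % d-1 sorted cut points per row
x    = S * diff([zeros(n,1), cuts, ones(n,1)], [], 2);   % d gaps, scaled to sum S

disp(max(abs(sum(x,2) - S)))              % row sums equal S up to rounding
disp(mean(x,1))                           % each column should average about S/d
```

Each row of x is one point with nonnegative entries summing to S. Note that the entries are bounded by S, not by 1, so this covers the simplex case only and not the unit-hypercube constraint.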

One can also download the function randfixedsum from the File Exchange, Roger Stafford's contribution. This is a more general solution that generates truly uniform random sets in the unit hypercube with any given fixed sum. Thus, to generate random points that lie in the unit 3-cube, subject to the constraint that they sum to 1.25...

xyz = randfixedsum(3,10000,1.25,0,1)';   % transpose so that each row is one point
plot3(xyz(:,1),xyz(:,2),xyz(:,3),'.')
view(70,35)
box on
grid on
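
If randfixedsum is not at hand, the same target set can be sampled in low dimensions by rejection. This is not Stafford's algorithm. It is a minimal sketch of a different technique: scale sort-trick points so they sum to s, then discard any row with a coordinate above 1. Discarding points from a uniform sample leaves a uniform sample on the region that remains. The seed and variable names are my own, and rng is assumed to be available.

```matlab
% Hypothetical rejection-sampling alternative (not the randfixedsum algorithm).
rng(0);                                   % fixed seed so the run is repeatable
n = 10000;                                % number of candidate points
s = 1.25;                                 % required sum

cuts = sort(rand(n,2), 2);                % two sorted cut points per row
xyz  = s * diff([zeros(n,1), cuts, ones(n,1)], [], 2);   % uniform on x+y+z = s, all >= 0

keep = all(xyz <= 1, 2);                  % rows that stay inside the unit cube
xyz  = xyz(keep,:);                       % accepted points only

disp(mean(keep))                          % fraction of candidates accepted
```

For s = 1.25 in 3-d, each of the three corners cut off by the cube is 4% of the triangle, so about 88% of the candidates survive. The acceptance rate may drop for other sums or higher dimensions, which is where a direct method such as randfixedsum is the better tool. The surviving rows can be passed to the same plot3 call as above.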