且构网


Weighted sampling with replacement in Java

Updated: 2023-12-05 12:54:34

I'm fairly sure no built-in function for this exists, but it is easy to write one that produces such samples. First off, Java does come with a random number generator: the java.util.Random class, whose nextDouble() method returns uniformly distributed random doubles from 0.0 (inclusive) to 1.0 (exclusive).

import java.util.Random;

Random random = new Random(); // nextDouble() is an instance method, so create a Random first.
double someRandomDouble = random.nextDouble();
     // This will be a uniformly distributed
     // random variable in the range [0.0, 1.0).

For sampling with replacement, convert the pdf you have as input (the probability of each value) into a cdf (the running sum of those probabilities). You can then use the random doubles Java provides to build a random data set by checking which segment of the cdf each one falls into. So first you need to convert the pdf into a cdf.

int[] randsample(int[] values, int numsamples,
        boolean withReplacement, double[] pdf) {

    if(withReplacement) {
        double[] cdf = new double[pdf.length];
        cdf[0] = pdf[0];
        for(int i=1; i<pdf.length; i++) {
            cdf[i] = cdf[i-1] + pdf[i];
        }

Then you create a properly sized array of ints to store the result and start drawing the random results:

        Random rng = new Random();
        int[] results = new int[numsamples];
        for(int i=0; i<numsamples; i++) {
            double randomValue = rng.nextDouble(); // Uniform in [0.0, 1.0).
            int currentPosition = 0;

            // Check the bound first so cdf[currentPosition] is never read out of range.
            while(currentPosition < cdf.length && randomValue > cdf[currentPosition]) {
                currentPosition++; //Check the next one.
            }

            if(currentPosition < cdf.length) { //It worked!
                results[i] = values[currentPosition];
            } else { //The cdf ended below randomValue (e.g. rounding), so fail gracefully.
                results[i] = values[cdf.length-1];
                     // And assign it the last value.
            }
        }

        //Now we're done and can return the results!
        return results;
    } else { //Without replacement.
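        // Unchecked, so the method needs no "throws" clause.
        throw new UnsupportedOperationException("Sampling without replacement is not implemented.");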
    }
}
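
Pieced together, the with-replacement path might look like the sketch below. This is only one way to arrange it, not a definitive version: the class name WeightedSampler, the helper toCdf, and passing the Random in as a parameter (so a seed can make runs repeatable) are assumptions made for illustration. It also treats each cdf segment as half-open, which matches nextDouble() returning values in [0.0, 1.0) and keeps entries with probability zero from ever being picked.

```java
import java.util.Arrays;
import java.util.Random;

class WeightedSampler {

    /** Running sum of the weights: cdf[i] = pdf[0] + ... + pdf[i]. */
    static double[] toCdf(double[] pdf) {
        double[] cdf = new double[pdf.length];
        double sum = 0.0;
        for (int i = 0; i < pdf.length; i++) {
            sum += pdf[i];
            cdf[i] = sum;
        }
        return cdf;
    }

    /** Draws numSamples values with replacement; pdf[i] is the probability of values[i]. */
    static int[] sampleWithReplacement(int[] values, double[] pdf,
                                       int numSamples, Random rng) {
        double[] cdf = toCdf(pdf);
        int[] results = new int[numSamples];
        for (int i = 0; i < numSamples; i++) {
            double r = rng.nextDouble(); // uniform in [0.0, 1.0)
            int pos = 0;
            // The bound is tested first, and pos stops at the last index,
            // so rounding in the cdf can never push the lookup out of range.
            while (pos < cdf.length - 1 && r >= cdf[pos]) {
                pos++;
            }
            results[i] = values[pos];
        }
        return results;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        double[] pdf = {0.2, 0.5, 0.3};
        // A fixed seed (an arbitrary choice here) makes the run repeatable.
        int[] sample = sampleWithReplacement(values, pdf, 10, new Random(42));
        System.out.println(Arrays.toString(sample));
    }
}
```

The linear scan costs time proportional to the number of values per draw, which is fine for short arrays; for long ones a binary search over the cdf would be the usual replacement.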

There is still some error checking to add (for example, making sure the values array and the pdf array are the same size), and further features can be provided by overloading this method, but hopefully this is enough to get you started. Cheers!
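
As a rough illustration of that error checking, here is a minimal sketch. The class name SamplingChecks, the method names, and the 1e-9 tolerance are all hypothetical choices, not part of the answer above. The uniformPdf helper is one possible building block for an overload that takes no pdf and weights every value equally.

```java
import java.util.Arrays;

class SamplingChecks {

    // Assumed tolerance for floating-point rounding when summing the weights.
    static final double TOLERANCE = 1e-9;

    /** Throws IllegalArgumentException if values and pdf cannot be used together. */
    static void validate(int[] values, double[] pdf) {
        if (values == null || pdf == null) {
            throw new IllegalArgumentException("values and pdf must not be null");
        }
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        if (values.length != pdf.length) {
            throw new IllegalArgumentException("values and pdf must be the same size");
        }
        double sum = 0.0;
        for (double p : pdf) {
            if (Double.isNaN(p) || p < 0.0) {
                throw new IllegalArgumentException("pdf entries must be non-negative numbers");
            }
            sum += p;
        }
        if (Math.abs(sum - 1.0) > TOLERANCE) {
            throw new IllegalArgumentException("pdf must sum to 1.0 but sums to " + sum);
        }
    }

    /** Equal weights for n values, e.g. for an overload without a pdf argument. */
    static double[] uniformPdf(int n) {
        double[] pdf = new double[n];
        Arrays.fill(pdf, 1.0 / n);
        return pdf;
    }
}
```

Calling validate(values, pdf) at the top of the sampling method would turn a mismatched or malformed input into a clear exception instead of a confusing result later on.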