且构网

Sharing the things programmers deal with in development...

How do I correctly get the cumulative distribution function for my data in Python?

Updated: 2022-04-01 00:39:44

Older NumPy releases spelled this option normed=True; that keyword was later deprecated and removed. With density=True, the counts can be interpreted as pdf values:

counts, bin_edges = np.histogram(a, bins=num_bins, density=True)

To find the cdf:

dx = bin_edges[1]-bin_edges[0]
cdf = np.cumsum(counts*dx)

Assuming num_bins is an integer bin count, the bin edges are evenly spaced, so dx is constant. counts*dx gives the probability mass in each bin, and np.cumsum of those masses gives the cumulative distribution function evaluated at the right edge of each bin, bin_edges[1:]. The last value should therefore be 1:

assert np.allclose(cdf[-1], 1)
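
For reference, the steps above can be put together into one runnable sketch. The sample data, the seed, and the bin count of 20 are assumptions made only for illustration; they are not from the original answer. The sketch also uses np.diff(bin_edges) in place of a single dx, which gives the same result for evenly spaced bins and should still hold if the bins are not uniform.

```python
import numpy as np

# Illustrative stand-in for "my data" (assumption, not from the original answer).
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
num_bins = 20  # assumed bin count

# density=True scales the counts so that they integrate to 1 over the bins.
counts, bin_edges = np.histogram(a, bins=num_bins, density=True)

# Width of every bin; all equal when `bins` is an integer.
widths = np.diff(bin_edges)

# Probability mass per bin, accumulated from left to right.
cdf = np.cumsum(counts * widths)

# cdf[i] is the fraction of samples that fall in bins 0..i.
assert np.isclose(cdf[-1], 1.0)
```

To plot the result, cdf can be paired with bin_edges[1:], the right edge of each bin.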