且构网

How to randomly select exactly one row in each group

Updated: 2023-01-31 13:56:49

In base R you can use sample() within tapply():

df$Chosen <- 0
# Sample one negated row index per Region, then negate back to get row numbers
df[-tapply(-seq_along(df$Region), df$Region, sample, size=1),]$Chosen <- 1
df
   Region Combo Chosen
1       A     1      0
2       A     2      1
3       A     3      0
4       B     1      1
5       B     2      0
6       C     1      1
7       D     1      0
8       D     2      0
9       D     3      1
10      D     4      0

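Note the -(-selected_row_number) trick: the row numbers are negated before sampling and the result is negated back. When a group has only one row, sample() would receive a single positive number n and sample from 1 to n instead of returning n itself. A single negative number is returned unchanged, so the correct row is selected.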
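The single-row pitfall can also be avoided without negation by sampling a position within each group's vector of row numbers. The following is only a minimal sketch under stated assumptions: the data frame is rebuilt here to match the table shown above, pick_one is a hypothetical helper (not part of base R), and set.seed() is used only to make the illustration repeatable.

```r
# Example data, assumed to match the shape of the table above
df <- data.frame(
  Region = c("A", "A", "A", "B", "B", "C", "D", "D", "D", "D"),
  Combo  = c(1, 2, 3, 1, 2, 1, 1, 2, 3, 4)
)

# Hypothetical helper: pick one element of x, even when x has length 1.
# Sampling a position avoids sample()'s special case for a single number.
pick_one <- function(x) x[sample.int(length(x), 1)]

set.seed(1)  # only so the illustration is repeatable

# Row numbers grouped by Region, then one row number drawn per group
rows   <- split(seq_len(nrow(df)), df$Region)
chosen <- vapply(rows, pick_one, integer(1))

df$Chosen <- 0
df$Chosen[chosen] <- 1

# The behavior the negation trick relies on:
# a single negative number is returned as-is, not expanded to a range
sample(-6, size = 1)
```

Region C has a single row (row 6), so that row is always the one chosen for C, whichever rows are drawn for the other regions.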