且构网

R: aggregating the data in one column based on two other columns

Updated: 2022-12-10 08:40:54

library(plyr) 

# Bin v1 and v2 into 50 intervals each with cut(), then use ddply() from the plyr package to compute the mean of v3 in each pair of bins

newdata<-ddply(df,.(cut(v1,50),cut(v2,50)),summarise,mean.v3=mean(v3))
    > head(newdata)
        cut(v1, 50)   cut(v2, 50) mean.v3
    1 (-49.4,-47.5] (-34.7,-32.7]  18.123
    2 (-49.4,-47.5] (-0.576,1.43]  20.887
    3 (-49.4,-47.5]   (15.5,17.5]  20.887
    4 (-47.5,-45.5] (-52.7,-50.7]   9.918
    5 (-47.5,-45.5] (-44.7,-42.7]  14.477
    6 (-47.5,-45.5] (-34.7,-32.7]  16.314
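
The question's data is not shown here, so the following is only a minimal sketch on simulated values (the data frame contents and the names `bin1`, `bin2` and `binned` are assumptions made for illustration). It swaps in base R's `aggregate()` for `ddply()` to show the same bin-then-average idea without the plyr dependency:

```r
# Simulated stand-in for the poster's data frame; the values are made up.
set.seed(1)
df <- data.frame(v1 = runif(200, -50, 50),
                 v2 = runif(200, -55, 45),
                 v3 = rnorm(200, mean = 15, sd = 5))

# Bin both coordinates once, so the grouping columns get readable names.
df$bin1 <- cut(df$v1, 50)
df$bin2 <- cut(df$v2, 50)

# Base R counterpart of the ddply() call: mean of v3 per (bin1, bin2) pair.
# Bin combinations that contain no rows do not appear in the result.
binned <- aggregate(v3 ~ bin1 + bin2, data = df, FUN = mean)
names(binned)[3] <- "mean.v3"
head(binned)
```

Storing the bins as columns first also avoids result columns named `cut(v1, 50)`, which are awkward to refer to later.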

Updated as per the comments: if you want the lower bound, upper bound and midpoint of each interval, you can extract them from the interval labels as follows (the sub function is needed to deal with the ( and ] in the labels):

    df$newv1<-with(df,cut(v1,50))
    df$newv2<-with(df,cut(v2,50))
    # backslashes must be doubled inside R string literals
    df$lowerv1<-with(df,as.numeric(sub("\\((.+),.*", "\\1", newv1))) # lower bound
    df$upperv1<-with(df,as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", newv1))) # upper bound
    df$midv1<-with(df,(lowerv1+upperv1)/2) # midpoint
    df$lowerv2<-with(df,as.numeric(sub("\\((.+),.*", "\\1", newv2))) # lower bound
    df$upperv2<-with(df,as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", newv2))) # upper bound
    df$midv2<-with(df,(lowerv2+upperv2)/2) # midpoint
    # transform keeps every original row and adds the per-bin mean of v3
    newdata<-ddply(df,.(newv1,newv2),transform,mean.v3=mean(v3))

    > head(newdata)
           v1      v2     v3         newv1         newv2 lowerv1 upperv1  midv1 lowerv2 upperv2   midv2 mean.v3
    1 -47.456 -32.714 18.123 (-49.4,-47.5] (-34.7,-32.7]   -49.4   -47.5 -48.45 -34.700  -32.70 -33.700  18.123
    2 -49.329  -0.465 20.887 (-49.4,-47.5] (-0.576,1.43]   -49.4   -47.5 -48.45  -0.576    1.43   0.427  20.887
    3 -48.652  16.558 20.800 (-49.4,-47.5]   (15.5,17.5]   -49.4   -47.5 -48.45  15.500   17.50  16.500  20.887
    4 -48.323  17.153 20.974 (-49.4,-47.5]   (15.5,17.5]   -49.4   -47.5 -48.45  15.500   17.50  16.500  20.887
    5 -45.713 -52.599  9.918 (-47.5,-45.5] (-52.7,-50.7]   -47.5   -45.5 -46.50 -52.700  -50.70 -51.700   9.918
    6 -45.805 -43.071 14.477 (-47.5,-45.5] (-44.7,-42.7]   -47.5   -45.5 -46.50 -44.700  -42.70 -43.700  14.477
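
Because `cut()` formats its labels with three significant digits by default (`dig.lab = 3`), bounds parsed back out of the labels are rounded. If exact bounds matter, one possible alternative is to build the breaks explicitly and look the bounds up by bin index. This is a sketch on simulated data, not the method used above; `edges` and `idx` are hypothetical names, and the explicit breaks differ slightly from those of `cut(v1, 50)`, which pads the outermost edges:

```r
# Simulated stand-in data; the values are made up for illustration.
set.seed(1)
df <- data.frame(v1 = runif(200, -50, 50), v3 = rnorm(200, 15, 5))

# Explicit, equally spaced breaks: 51 edges give 50 bins.
edges <- seq(min(df$v1), max(df$v1), length.out = 51)

# labels = FALSE returns the bin index instead of a "(a,b]" label;
# include.lowest = TRUE keeps the minimum value inside the first bin.
idx <- cut(df$v1, breaks = edges, labels = FALSE, include.lowest = TRUE)

# Look the bounds up directly, with no string parsing and no rounding.
df$lowerv1 <- edges[idx]
df$upperv1 <- edges[idx + 1]
df$midv1   <- (df$lowerv1 + df$upperv1) / 2
head(df)
```

The same steps apply to `v2`, and the resulting index or midpoint columns can serve as grouping variables for the aggregation.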