Summing columns based on several other factor columns

Updated: 2023-02-04 18:36:18

First, I'd convert the data frame to a long format with three columns: Value, Loc, and Case. Case indicates which case (i.e. which row of the original data frame) each value came from. Row order doesn't matter. Your data frame will then look something like this:

Value    Loc    Case
20       East   1
20       South  2
...
10       East   1

And so forth. One way to do that is to stack your values and your locations, and then add the case numbers by hand, which is easy. Suppose your original data frame is called df, with values in columns 2 and 4 and locations in columns 3 and 5:

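# values from columns 2 and 4, stacked into one vector
v.col = stack(df[, c(2, 4)])[, 1]
# locations from columns 3 and 5; as.character() because stack() ignores factor columns
v.loc = stack(lapply(df[, c(3, 5)], as.character))[, 1]
v.case = rep(1:nrow(df), 2)    # row of df that each stacked value came from
long.data = data.frame(v.col, v.loc, v.case)    # this is not actually needed, but just so you can view it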

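Now use tapply to create the columns you need. It sums the values for every combination of case and location, giving one row per case and one column per location: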

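# rows = cases, columns = locations, cells = summed values
s = tapply(v.col, list(v.case, v.loc), sum, na.rm = TRUE)
new.df = cbind(df, s)    # append the per-location sums to the original data frame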
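
To make the steps above concrete, here is a minimal end-to-end sketch on a made-up data frame. The column names (id, value1, loc1, value2, loc2) and all the numbers are assumptions chosen for illustration, not taken from the original question; the only thing carried over is the layout, with values in columns 2 and 4 and locations in columns 3 and 5.

```r
# Hypothetical sample data: names and values are illustrative assumptions.
df <- data.frame(
  id     = c("a", "b", "c"),
  value1 = c(20, 20, 5),
  loc1   = c("East", "South", "East"),
  value2 = c(10, 30, 15),
  loc2   = c("East", "West", "South"),
  stringsAsFactors = FALSE   # keep locations as character so stack() accepts them
)

# Long format: stack values and locations, then tag each with its source row.
v.col  <- stack(df[, c(2, 4)])[, 1]
v.loc  <- stack(df[, c(3, 5)])[, 1]
v.case <- rep(1:nrow(df), 2)

# Sum per case (rows) and location (columns), then append to the original.
s <- tapply(v.col, list(v.case, v.loc), sum, na.rm = TRUE)
new.df <- cbind(df, s)
print(new.df)
```

In this toy data, row 1 has both of its values in East, so its East total is 30, while case/location combinations that never occur (such as West for row 1) come out as NA.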

Case/location combinations with no data come out as NA. You'll probably need to change those to 0 or something similar, but this should be easy.
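
One possible way to do that replacement, sketched on a few made-up long-format vectors that stand in for v.col, v.loc and v.case (the numbers are assumptions for illustration):

```r
# Hypothetical long-format vectors standing in for v.col, v.loc, v.case.
v.col  <- c(20, 20, 10)
v.loc  <- c("East", "South", "East")
v.case <- c(1, 2, 1)

s <- tapply(v.col, list(v.case, v.loc), sum, na.rm = TRUE)

# Case/location combinations with no data are NA; overwrite them with 0.
s[is.na(s)] <- 0
print(s)
```

Whether 0 is the right fill value depends on whether "no data for this location" and "a total of zero" need to stay distinguishable in your analysis.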

There are probably also easier ways to do this with the plyr or reshape packages, but I'm no expert on those.

Hope this helps.