且构网

How to read multiple xlsx files in R with a loop, using specific rows and columns

Updated: 2023-11-14 07:54:40

I have to read multiple xlsx files with arbitrary names into a single data frame. Every file has the same structure, and I only need to import specific columns.

I tried this:

dat <- read.xlsx("FILE.xlsx", sheetIndex=1, 
                  sheetName=NULL, startRow=5, 
                  endRow=NULL, as.data.frame=TRUE, 
                  header=TRUE)

But this reads only one file at a time, and I couldn't find a way to select particular columns. I also tried:

site=list.files(pattern='[.]xls')

but the loop I wrote after that isn't working. How can I do this? Thanks in advance.

I would read the first sheet of each file into a list:

Get file names:

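# Keep only the .xlsx files; a bare list.files("./") would also return any other file in the folder
f = list.files("./", pattern="[.]xlsx$")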
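
To check which names a pattern will pick up before reading anything, a small sketch like the following can help. The scratch folder and the file names in it are made up for the demo; only `list.files()`, `file.create()`, and `tempdir()` from base R are used.

```r
# Scratch folder with a mix of files (all names are hypothetical).
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir,
                      c("site_a.xlsx", "site_b.xlsx", "notes.txt", "old.xls")))

# "[.]xlsx$" matches a literal dot followed by "xlsx" at the end of the name,
# so "notes.txt" and "old.xls" are left out.
xlsx_files <- list.files(demo_dir, pattern = "[.]xlsx$")
print(xlsx_files)

# full.names = TRUE returns paths that can be handed to a reader function
# even when the folder is not the working directory.
xlsx_paths <- list.files(demo_dir, pattern = "[.]xlsx$", full.names = TRUE)
```

The pattern from the question, `'[.]xls'`, has no `$` anchor, so it would match both `.xls` and `.xlsx` names.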

Read files:

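library(xlsx)  # read.xlsx() with sheetIndex/startRow comes from the xlsx package

dat = lapply(f, function(i){
    # startRow=5 skips the first four rows of the sheet
    x = read.xlsx(i, sheetIndex=1, sheetName=NULL, startRow=5,
        endRow=NULL, as.data.frame=TRUE, header=TRUE)
    # Get the columns you want, e.g. 1, 3, 5
    # (read.xlsx also accepts colIndex=c(1, 3, 5) to read only those columns)
    x = x[, c(1, 3, 5)]
    # You may want to add a column to say which file they're from
    x$file = i
    # Return your data
    x
})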
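
The same read-select-tag pattern can be tried without Excel files or the xlsx package. The sketch below swaps in CSV files and base R's `read.csv()` purely so it runs anywhere; `skip = 4` stands in for `startRow=5`, and the folder, file names, and column names are invented for the demo.

```r
# Stand-in demo: CSV files replace the xlsx files so the sketch runs in base R.
demo_dir <- file.path(tempdir(), "read_loop_demo")  # hypothetical scratch folder
dir.create(demo_dir, showWarnings = FALSE)

# Two files with the same layout: 4 lines to skip, then a header row and data.
demo_lines <- c(rep("junk", 4), "v1,v2,v3,v4,v5", "1,3,5,7,9", "2,4,6,8,10")
for (nm in c("a.csv", "b.csv")) {
  writeLines(demo_lines, file.path(demo_dir, nm))
}

demo_files <- list.files(demo_dir, pattern = "[.]csv$", full.names = TRUE)

demo_dat <- lapply(demo_files, function(i) {
  x <- read.csv(i, skip = 4, header = TRUE)  # skip = 4 plays the role of startRow=5
  x <- x[, c(1, 3, 5)]                       # keep columns 1, 3 and 5
  x$file <- basename(i)                      # record the source file
  x
})

str(demo_dat[[1]])
```

Each list element is a data frame with the three selected columns plus `file`, which is the shape the later steps expect.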

You can then access the items in your list with:

dat[[1]]

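Or apply the same operation to each of them:

# colMeans() (capital M) needs numeric input, so drop the character "file" column first;
# this assumes the three selected columns are numeric
lapply(dat, function(x) colMeans(x[, names(x) != "file"]))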

Turn them into a data frame (where your file column now becomes useful):

dat = do.call("rbind.data.frame", dat)
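
As a rough illustration of why the file column helps after combining, here is a sketch on toy data. The data frames, column names, and file names are invented; `do.call(rbind, ...)` and `tapply()` are base R.

```r
# Toy stand-ins for the per-file data frames (values are made up).
dat_demo <- list(
  data.frame(a = c(1, 2), b = c(10, 20), file = "x.xlsx"),
  data.frame(a = c(3, 4), b = c(30, 40), file = "y.xlsx")
)

# Stack the list into one data frame; all elements must share column names.
combined <- do.call(rbind, dat_demo)

# The file column lets you summarise per source file after combining.
means_by_file <- tapply(combined$a, combined$file, mean)
print(means_by_file)
```

Without the `file` column, rows from different workbooks would be indistinguishable once stacked.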