且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

通过列名和工作表名获取列数据

更新时间:2023-02-26 10:32:40

是的,您正在寻找 col_values() 工作表方法.而不是

arrayofvalues = sheet['columnname']

你需要做的

arrayofvalues = sheet.col_values(columnindex)

其中 columnindex 是列的编号(从零开始计数,因此列 A 是索引 0,列 B 是索引 1,等等).如果您在第一行(或前几行)中有一个描述性标题,您可以提供第二个参数来告诉从哪一行开始(同样,从零开始计数).例如,如果您有一个标题行,因此希望值从第二行开始,您可以这样做

arrayofvalues = sheet.col_values(columnindex, 1)

请查看教程以获得合理的xlrd 包的可读讨论.(官方 xlrd 文档 更难阅读.)>

另请注意,(1) 虽然您可以***使用名称 arrayofvalues,但您真正获得的是 Python 列表,从技术上讲,它不是数组,以及 (2) arrayofvaluescode>on_demand 工作簿参数在处理 .xlsx 文件时无效,这意味着 xlrd 将尝试将整个工作簿加载到内存中.(on_demand 功能适用于 .xls 文件.)

Is there a way to access all rows in a column in a specific sheet by using python xlrd.

e.g:

workbook = xlrd.open_workbook('ESC data.xlsx', on_demand=True)
sheet = workbook.sheet['sheetname']
arrayofvalues = sheet['columnname']

Or do i have to create a dictionary by myself?

The excel is pretty big so i would love to avoid iterating over all the colnames/sheets

Yes, you are looking for the col_values() worksheet method. Instead of

arrayofvalues = sheet['columnname']

you need to do

arrayofvalues = sheet.col_values(columnindex)

where columnindex is the number of the column (counting from zero, so column A is index 0, column B is index 1, etc.). If you have a descriptive heading in the first row (or first few rows) you can give a second parameter that tells which row to start from (again, counting from zero). For example, if you have one header row, and thus want values starting in the second row, you could do

arrayofvalues = sheet.col_values(columnindex, 1)

Please check out the tutorial for a reasonably readable discussion of the xlrd package. (The official xlrd documentation is harder to read.)

Also note that (1) while you are free to use the name arrayofvalues, what you are really getting is a Python list, which technically isn't an array, and (2) the on_demand workbook parameter has no effect when working with .xlsx files, which means xlrd will attempt to load the entire workbook into memory regardless. (The on_demand feature works for .xls files.)