
Is it possible to use Python to iterate over an Amazon S3 bucket and count the number of lines in its files/keys?

Updated: 2023-11-11 12:44:10

Using boto3, you can do the following:

import boto3

# create the s3 resource
s3 = boto3.resource('s3')

# get the file object
obj = s3.Object('bucket_name', 'key')

# read the file contents into memory (read() returns bytes)
file_contents = obj.get()["Body"].read()

# count the newline bytes to get the number of lines
print(file_contents.count(b'\n'))
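
For large objects, reading the whole body into memory may be undesirable. The following is a minimal sketch of a streaming variant, assuming the body returned by `obj.get()["Body"]` is botocore's `StreamingBody` and using its `iter_chunks` method. The helper names `count_newlines` and `count_lines_in_object`, and the 1 MiB chunk size, are illustrative choices and not part of the original answer.

```python
def count_newlines(chunks):
    """Count newline bytes across an iterable of bytes chunks."""
    total = 0
    for chunk in chunks:
        total += chunk.count(b"\n")
    return total


def count_lines_in_object(bucket_name, key, chunk_size=1024 * 1024):
    """Stream one S3 object and count its newline bytes (hypothetical helper)."""
    # Imported lazily so count_newlines can be used without boto3 installed.
    import boto3

    body = boto3.resource("s3").Object(bucket_name, key).get()["Body"]
    # iter_chunks yields the body piece by piece instead of loading it whole.
    return count_newlines(body.iter_chunks(chunk_size=chunk_size))
```

As with the snippet above, this counts newline characters, so a file whose last line has no trailing newline is reported as one line short.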

If you want to do this for all objects in a bucket, you can use the following code snippet:

bucket = s3.Bucket('bucket_name')
for obj in bucket.objects.all():
    file_contents = obj.get()["Body"].read()
    print(file_contents.count(b'\n'))
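
If the counts are needed per key rather than just printed, one possible approach is to collect them in a dictionary. This is a sketch under assumptions: `count_lines_per_key` and `count_lines_in_bucket` are hypothetical helper names, and the optional `prefix` argument is an addition that relies on `bucket.objects.filter(Prefix=...)`.

```python
def count_lines_per_key(object_summaries):
    """Map each object's key to the number of newline bytes in its body."""
    counts = {}
    for summary in object_summaries:
        body = summary.get()["Body"]
        # Stream each body in chunks so large objects are not held in memory.
        counts[summary.key] = sum(chunk.count(b"\n") for chunk in body.iter_chunks())
    return counts


def count_lines_in_bucket(bucket_name, prefix=""):
    """Count lines for every object in a bucket, optionally under a prefix."""
    # Imported lazily so count_lines_per_key can be used without boto3 installed.
    import boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    objects = bucket.objects.filter(Prefix=prefix) if prefix else bucket.objects.all()
    return count_lines_per_key(objects)
```

Calling `sum(count_lines_in_bucket('bucket_name').values())` would then give a total for the whole bucket.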

For more functionality, see the boto3 documentation: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#object

Update (using boto 2):

import boto
s3 = boto.connect_s3()  # establish connection
bucket = s3.get_bucket('bucket_name')  # get bucket

for key in bucket.list(prefix='key'):  # list objects at a given prefix
    file_contents = key.get_contents_as_string()  # get file contents
    print(file_contents.count(b'\n'))  # count the newline bytes to get the number of lines