且构网

Python: Parsing paired log files

Updated: 2022-02-05 20:04:02

Reading the entire file twice is absolutely excessive. Instead, keep track of what you have done previously while traversing the file.

seen_test = False   # state variable for keeping track of what you have done
init_person = None  # note: snake_case is the Python naming convention, rather than headlessCamelCase

with open('data.log', 'r') as f:
    for lineno, line in enumerate(f, start=1):
        if 'event:type=test,' in line:
            if seen_test:
                raise ValueError(
                    'line %i: type=test without test2: %s' % (
                        lineno, line))
            init_person = line.split('initiator=')[1].split(',')[0]
            seen_test = True
        elif 'event:type=test2' in line:
            if seen_test:
                # ... do whatever you want with init_person
                # maybe something like
                result = line.rstrip('\n').split(',')
                print('Test by %s got results %s' % (init_person, result[1:]))
            else:
                raise ValueError(
                    'line %i: type=test2 without test: %s' % (
                        lineno, line))
            seen_test = False

The enumerate is just there to get a useful line number into the error message when something fails; if you are sure that the file is always well-formed, you could take it out.
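To make the single-pass idea easier to try out, here is a minimal sketch of the same state tracking wrapped in a generator. The function name pair_events, the sample log lines, and the extra end-of-input check are my own assumptions for illustration; the real log format may differ.

```python
import io


def pair_events(lines):
    """Yield (initiator, result_fields) for each type=test / type=test2 pair.

    Hypothetical helper: same single-pass logic as the loop above,
    plus an end-of-input check that the original loop does not have.
    """
    seen_test = False
    init_person = None
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip('\n')
        if 'event:type=test,' in line:
            if seen_test:
                raise ValueError(
                    'line %i: type=test without test2: %s' % (lineno, line))
            init_person = line.split('initiator=')[1].split(',')[0]
            seen_test = True
        elif 'event:type=test2' in line:
            if not seen_test:
                raise ValueError(
                    'line %i: type=test2 without test: %s' % (lineno, line))
            yield init_person, line.split(',')[1:]
            seen_test = False
    if seen_test:
        # Added assumption: a trailing unpaired type=test is also an error.
        raise ValueError('end of input: type=test without test2')


# Made-up sample data; any iterable of lines (including an open file) works.
sample = io.StringIO(
    'event:type=test,initiator=alice,id=1\n'
    'event:type=test2,status=ok,score=7\n'
)
for person, result in pair_events(sample):
    print('Test by %s got results %s' % (person, result))
# prints: Test by alice got results ['status=ok', 'score=7']
```

Because it accepts any iterable of lines, you can pass the open file object from the with block directly, and the file is still read only once.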

This will still fail (with an IndexError from the split) if the type=test line doesn't contain initiator=, but we have no idea what would be useful to do in that scenario, so I'm not trying to tackle that.
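If you do want to detect the missing field rather than crash, one possible approach is sketched below. The helper name extract_initiator and the choice to return None are assumptions on my part; what the caller should then do with None (skip the pair, raise, use a placeholder) depends on your data.

```python
def extract_initiator(line):
    """Return the value after 'initiator=', or None if the field is absent.

    Hypothetical helper, not part of the code above.
    """
    # str.partition returns an empty separator when the marker is not found.
    _, sep, rest = line.partition('initiator=')
    if not sep:
        return None
    return rest.split(',')[0].rstrip('\n')
```

In the loop above, this would replace the line.split('initiator=')[1].split(',')[0] expression, followed by an explicit check for None.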

Demo: https://repl.it/repls/OverdueFruitfulComputergames#main.py