

Python bytearray vs. list of bytes

Updated: 2023-10-27 09:09:04


The first function creates a list of references to the same object (NOT a list of bytes); join then does one memory allocation and COUNT calls to memcpy.
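As a minimal sketch of that point, assuming the list is built by appending MSG COUNT times (the values of MSG and COUNT below are hypothetical stand-ins for the ones in the question):

```python
MSG = b"hello world"  # hypothetical stand-in for the question's message
COUNT = 5             # hypothetical stand-in for the question's count

chunks = []
for _ in range(COUNT):
    chunks.append(MSG)  # stores a reference to MSG, not a copy of its bytes

# Every element is the very same bytes object.
all_same_object = all(chunk is MSG for chunk in chunks)

# join() builds the final bytes object from those references.
joined = b"".join(chunks)

print(all_same_object, len(joined) == len(MSG) * COUNT)
```

The list itself stays small because it only holds references; the actual copying happens once, inside join.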


You can make the first function several times faster (5x in my test) by dropping the temporary list and using itertools.repeat:

import itertools
import time

def bytes_list_test_opt():
    tStart = time.perf_counter()  # replaces time.clock(), removed in Python 3.8
    bs = b''.join(itertools.repeat(MSG, COUNT))
    print('byte list opt time:', time.perf_counter() - tStart)


or, in this particular case, simply use the * operator of bytes objects, which does exactly that:

    bs = MSG*COUNT
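A quick check that the two forms produce the same result; this is only a sketch, and the MSG and COUNT values are hypothetical:

```python
import itertools

MSG = b"abc"  # hypothetical message
COUNT = 4     # hypothetical repeat count

joined = b"".join(itertools.repeat(MSG, COUNT))  # join over a lazy repeat iterator
repeated = MSG * COUNT                           # bytes repetition operator

print(joined == repeated)  # → True
```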


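The second function repeatedly iterates over MSG, stores data byte-by-byte and has to repeatedly reallocate memory as the bytearray grows. Appending the whole message in one operation (for example with +=) avoids the byte-by-byte loop:


def bytearray_test_opt():
    tStart = time.perf_counter()
    ba = bytearray()
    for i in range(COUNT):
        ba += MSG  # append all of MSG at once, not one byte at a time
    print('array opt time:', time.perf_counter() - tStart)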


After this modification, the second function will be slower than the first one only because of additional reallocations (~15% in my test).


The third function uses bytearray's slice assignment, which accepts an iterable and seems to do the same byte-by-byte iteration without recognizing that it could simply memcpy the bytes into place. This looks like a defect in the standard library that can be fixed.


As you can see from the previous optimization, allocations take a very small amount of time compared to byte-by-byte copying, so preallocating has no visible impact here. You could save some time by doing fewer calculations, but it won't help much either:

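def initialized_bytearray_test_opt():
    tStart = time.perf_counter()  # replaces time.clock(), removed in Python 3.8
    L = len(MSG)
    ba = bytearray(L*COUNT)  # preallocate the full buffer
    ofs = 0
    for i in range(COUNT):
        ba[ofs : ofs+L] = MSG  # same-length slice assignment, no resizing
        ofs += L
    print('initialized array opt time:', time.perf_counter() - tStart)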


byte list time: 0.004823000000000001
byte list opt time: 0.0008649999999999977
array time: 0.043324
array opt time: 0.005505999999999997
initialized array time: 0.05936899999999999
initialized array opt time: 0.040164000000000005


P.S. Use the timeit module for measurements like this; it provides much higher accuracy.
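A rough sketch of what such a timeit measurement could look like. The function names, the MSG and COUNT values, and the repeat counts are all assumptions chosen for illustration, not the code from the question:

```python
import itertools
import timeit

MSG = b"x" * 64  # hypothetical message
COUNT = 10_000   # hypothetical repeat count
NUMBER = 20      # calls per timing run (arbitrary choice)


def join_repeat():
    # join over a lazy iterator, no temporary list
    return b"".join(itertools.repeat(MSG, COUNT))


def multiply():
    # bytes repetition operator
    return MSG * COUNT


def bytearray_concat():
    # grow a bytearray by appending the whole message each time
    ba = bytearray()
    for _ in range(COUNT):
        ba += MSG
    return ba


for fn in (join_repeat, multiply, bytearray_concat):
    # take the best of several runs to reduce noise from other processes
    best = min(timeit.repeat(fn, number=NUMBER, repeat=3))
    print(f"{fn.__name__}: {best / NUMBER:.6f} s per call")
```

Taking the minimum over several repeats is the usual way to read timeit.repeat results, since slower runs mostly reflect interference from the rest of the system.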