Base64-encoding and decoding a file in chunks with Python 2.7

Updated: 2023-02-11 11:58:28

You are running into padding issues:

>>> open('pianoavatar.jpg').read(8192).encode('base64')[-5:]
'IIE=\n'

Base64 decoding stops when it encounters the = padding marker. 8192 bytes encode to 10924 characters, the last of which is =, so your second read finds such a marker at the 10924th character.

Instead, adjust your chunk size to be divisible by 3 so that no padding ends up in the middle of your output file. Use a chunk size of 8190, for example.
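To make the effect of the chunk size concrete, here is a minimal sketch. It uses the standard `base64` module on in-memory sample data; `encode_in_chunks` is a hypothetical helper written for this illustration, not one of the functions from the question.

```python
import base64

def encode_in_chunks(data, chunk_size):
    """Encode data chunk by chunk and concatenate the base64 output."""
    return b"".join(
        base64.b64encode(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    )

# 18432 bytes of sample data (a multiple of 3, so the whole-file
# encoding contains no padding at all).
data = bytes(bytearray(range(256))) * 72

whole = base64.b64encode(data)
aligned = encode_in_chunks(data, 8190)    # 8190 is divisible by 3
unaligned = encode_in_chunks(data, 8192)  # 8192 is not

print(aligned == whole)    # True: chunk boundaries leave no trace
print(unaligned == whole)  # False: each chunk ends in '=' padding
```

With the aligned chunk size, the concatenated output is identical to encoding the data in one go, which is why the decoder no longer trips over padding in the middle of the file.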

When reading, you need to use a buffer size that's a multiple of 4 to avoid alignment issues there as well. 8192 would do fine, but you must make sure this restriction is met in your functions. You'd be better off defaulting the input chunk size to the base64-expanded chunk size: 10920 for an encoding chunk size of 8190 (4 base64 characters for every 3 bytes encoded).
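The size arithmetic behind these numbers can be sketched as follows. Both function names are hypothetical helpers introduced only for this illustration.

```python
def encoded_len(n_bytes):
    """Base64 characters produced for n_bytes of input, padding included."""
    return 4 * ((n_bytes + 2) // 3)

def max_decoded_len(n_chars):
    """Upper bound on bytes recovered from n_chars of base64.

    Assumes n_chars is a multiple of 4; trailing '=' padding
    reduces the actual byte count by one or two.
    """
    return n_chars // 4 * 3

print(encoded_len(8190))      # 10920, no padding needed
print(encoded_len(8192))      # 10924, the last character is '='
print(max_decoded_len(8192))  # 6144
```

The same formula accounts for the final chunk in the demo: 1976 bytes encode to 2636 characters.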

Demo:

>>> write_base64_file_from_file('pianoavatar.jpg', 'test.b64', 8190)
bin <type 'str'> data len: 8190
b64 <type 'str'> data len: 10920
bin <type 'str'> data len: 8190
b64 <type 'str'> data len: 10920
bin <type 'str'> data len: 1976
b64 <type 'str'> data len: 2636

Reading now works just fine, even at your original chunk size of 8192:

>>> write_file_from_base64_file('test.b64', 'test.jpg', 8192)
b64 <type 'str'> data len: 8192
bin <type 'str'> data len: 6144
b64 <type 'str'> data len: 8192
bin <type 'str'> data len: 6144
b64 <type 'str'> data len: 8092
bin <type 'str'> data len: 6068

You can force the buffer size to be aligned inside your functions with a simple modulus:

def write_base64_file_from_file(src_fname, b64_fname, chunk_size=8190):
    chunk_size -= chunk_size % 3  # align to multiples of 3
    # ...

def write_file_from_base64_file(b64_fname, dst_fname, chunk_size=10920):
    chunk_size -= chunk_size % 4  # align to multiples of 4
    # ...
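The function bodies are elided above. One way they could be filled in is sketched below. This is not the code from the question: it assumes the standard `base64` module (`b64encode` / `b64decode`), binary file modes, and a base64 file without line breaks. The `ValueError` guard and the round-trip check on temporary files are additions for illustration.

```python
import base64
import os
import tempfile

def write_base64_file_from_file(src_fname, b64_fname, chunk_size=8190):
    chunk_size -= chunk_size % 3  # align to multiples of 3
    if chunk_size <= 0:
        raise ValueError("chunk_size must be at least 3")
    with open(src_fname, "rb") as src, open(b64_fname, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Only the final chunk can be shorter than chunk_size,
            # so padding can only appear at the very end of the file.
            dst.write(base64.b64encode(chunk))

def write_file_from_base64_file(b64_fname, dst_fname, chunk_size=10920):
    chunk_size -= chunk_size % 4  # align to multiples of 4
    if chunk_size <= 0:
        raise ValueError("chunk_size must be at least 4")
    with open(b64_fname, "rb") as src, open(dst_fname, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(base64.b64decode(chunk))

# Round-trip check on temporary files (hypothetical sample payload).
payload = bytes(bytearray(range(256))) * 72 + b"ab"  # 18434 bytes
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "sample.bin")
b64_path = os.path.join(tmp_dir, "sample.b64")
out_path = os.path.join(tmp_dir, "restored.bin")

with open(src_path, "wb") as f:
    f.write(payload)

# 8192 is deliberately misaligned; it is rounded down to 8190.
write_base64_file_from_file(src_path, b64_path, 8192)
write_file_from_base64_file(b64_path, out_path, 8192)

with open(out_path, "rb") as f:
    restored = f.read()
print(restored == payload)  # True
```

Rounding the chunk size down inside the functions means callers can pass any convenient buffer size, such as 8192, without reintroducing padding in the middle of the file.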