Updated: 2023-02-11 16:50:59
I am trying to use struct.unpack() to take apart a data record that ends with an ASCII string.
The record (it happens to be a TomTom ov2 record) has this format (stored little-endian):
unpack() requires that the string's length be included in the format you pass it. I can use the second field and the known size of the rest of the record (13 bytes) to get the string length:
str_len = struct.unpack("<xi", record[:5])[0] - 13
fmt = "<biii{0}s".format(str_len)
then proceed with the full unpacking, but since the string is null-terminated, I really wish unpack() would do it for me. It'd also be nice to have this should I run across a struct that doesn't include its own size.
How can I make that happen?
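For reference, here is a minimal, runnable sketch of the manual approach described above. The sample record is hypothetical and is built with struct.pack(); it assumes the length field counts every byte of the record, including the terminating NUL.

```python
import struct

# Hypothetical sample record: one byte, the total record length, two 32-bit
# integers, then a NUL-terminated name.
# Assumption: the length field includes the terminating NUL.
name = b"Cafe"
record = struct.pack("<biii", 2, 13 + len(name) + 1, 4886138, 229297) + name + b"\x00"

# Skip the first byte, read the length field, and subtract the 13 fixed bytes.
str_len = struct.unpack("<xi", record[:5])[0] - 13
fmt = "<biii{0}s".format(str_len)

fields = struct.unpack(fmt, record)

# The string still carries its terminator, so it has to be stripped by hand.
poi_name = fields[-1].rstrip(b"\x00")
```

The last step is the part I would like unpack() to take care of.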
The size-less record is fairly easy to handle, actually, since struct.calcsize() will tell you the length it expects. You can use that and the actual length of the data to construct a new format string for unpack() that includes the correct string length.
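As a rough sketch of that idea on its own, without any wrapper: compute the size of the fixed fields with struct.calcsize(), and treat whatever remains of the data as the string. The variable names here are only illustrative.

```python
import struct

dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'

fixed_len = struct.calcsize("<biii")   # size of the fields before the string
str_len = len(dat) - fixed_len         # the rest must be the string, NUL included
fmt = "<biii{0}s".format(str_len)

fields = struct.unpack(fmt, dat)
```

Note that the string unpacked this way still ends with the NUL byte.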
The function below is just a wrapper for unpack() that allows a new format character, 'z', in the last position; it drops the terminal NUL:
import struct

def unpack_with_final_asciiz(fmt, dat):
    """
    Unpack binary data, handling a null-terminated string at the end
    (and only at the end) automatically.

    The first argument, fmt, is a struct.unpack() format string with the
    following modifications:
    If fmt's last character is 'z', the returned string will drop the NUL.
    If it is 's' with no length, the string including NUL will be returned.
    If it is 's' with a length, behavior is identical to normal unpack().
    """
    # Just pass on if no special behavior is required.
    # fmt[-2:-1] (rather than fmt[-2]) avoids an IndexError for a one-character format.
    if fmt[-1] not in ('z', 's') or (fmt[-1] == 's' and fmt[-2:-1].isdigit()):
        return struct.unpack(fmt, dat)

    # Use format string to get size of contained string and rest of record
    non_str_len = struct.calcsize(fmt[:-1])
    str_len = len(dat) - non_str_len

    # Set up new format string
    # If passed 'z', treat terminating NUL as a "pad byte"
    if fmt[-1] == 'z':
        str_fmt = "{0}sx".format(str_len - 1)
    else:
        str_fmt = "{0}s".format(str_len)
    new_fmt = fmt[:-1] + str_fmt

    return struct.unpack(new_fmt, dat)
>>> dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'
>>> unpack_with_final_asciiz("<biiiz", dat)
(2, 30, 4886138, 229297, b'Down by the river')
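The wrapper above needs dat to contain exactly one record. If several size-less records sit back to back in one buffer, a different technique may be more convenient: unpack the fixed header with Struct.unpack_from() and then search for the terminating NUL. The following is only a sketch of that alternative; iter_asciiz_records is a hypothetical helper name, and it assumes every record is a fixed header followed by a single NUL-terminated string.

```python
import struct

def iter_asciiz_records(header_fmt, buf):
    """Yield (header fields..., string) tuples from back-to-back records in buf."""
    header = struct.Struct(header_fmt)
    pos = 0
    while pos < len(buf):
        fields = header.unpack_from(buf, pos)
        start = pos + header.size
        end = buf.index(b"\x00", start)   # raises ValueError if the terminator is missing
        yield fields + (buf[start:end],)
        pos = end + 1                     # skip past the NUL to the next record

dat = b'\x02\x1e\x00\x00\x00z\x8eJ\x00\xb1\x7f\x03\x00Down by the river\x00'
records = list(iter_asciiz_records("<biii", dat + dat))
```

Here records holds two identical tuples, each ending in the string without its NUL. Searching for the terminator only starts after the header, so zero bytes inside the integer fields are not mistaken for the end of the string.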