且构网

How do I compute the hash of a filesystem directory using Python?

Updated: 2023-11-29 14:42:58

This Recipe provides a nice function that does what you are asking. I've modified it to use MD5 instead of SHA1, as your original question asks.

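import hashlib
import os
import traceback

def GetHashofDirs(directory, verbose=0):
  # Returns the MD5 hex digest of the file contents under directory,
  # -1 if the directory does not exist, or -2 on an unexpected error.
  SHAhash = hashlib.md5()  # name kept from the SHA1 recipe; this is an MD5 object
  if not os.path.exists(directory):
    return -1

  try:
    for root, dirs, files in os.walk(directory):
      for names in files:
        if verbose == 1:
          print('Hashing', names)
        filepath = os.path.join(root, names)
        try:
          f1 = open(filepath, 'rb')
        except OSError:
          # The file could not be opened, so there is nothing to close; skip it
          continue

        with f1:
          while True:
            # Read the file in small chunks
            buf = f1.read(4096)
            if not buf:
              break
            # As in the original recipe, feed the hex digest of each chunk
            # (encoded to bytes for Python 3) into the running hash
            SHAhash.update(hashlib.md5(buf).hexdigest().encode('ascii'))

  except Exception:
    # Print the stack traceback
    traceback.print_exc()
    return -2

  return SHAhash.hexdigest()

You can use it like this:

print(GetHashofDirs('folder_to_hash', 1))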

The output looks like this, with one line printed for each file as it is hashed:

...
Hashing file1.cache
Hashing text.txt
Hashing library.dll
Hashing vsfile.pdb
Hashing prog.cs
5be45c5a67810b53146eaddcae08a809

The function returns the hash as its value, which is the last line of the output above. In this case, that is 5be45c5a67810b53146eaddcae08a809.
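
One thing to keep in mind is that the function above hashes only file contents, in whatever order os.walk happens to return them, so the result may differ between machines and does not change when a file is renamed. If that matters for your use case, the following is a minimal sketch of a variant, not part of the original recipe. The name hash_directory and its algorithm and chunk_size parameters are hypothetical, chosen for illustration. It assumes you want a stable traversal order and want relative paths mixed into the hash, and it will produce different values from GetHashofDirs.

```python
import hashlib
import os


def hash_directory(directory, algorithm="md5", chunk_size=4096):
    """Hypothetical helper: hash relative paths and file contents in sorted order."""
    digest = hashlib.new(algorithm)
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # sorting in place makes os.walk descend in a stable order
        for name in sorted(files):
            filepath = os.path.join(root, name)
            # Mix in the path relative to the top directory, so renames change the hash
            relpath = os.path.relpath(filepath, directory).replace(os.sep, "/")
            digest.update(os.fsencode(relpath))
            digest.update(b"\0")  # separator between the path and the file contents
            with open(filepath, "rb") as f:
                # Read in small chunks so large files are never loaded whole
                for buf in iter(lambda: f.read(chunk_size), b""):
                    digest.update(buf)
    return digest.hexdigest()
```

Calling hash_directory('folder_to_hash') returns a hex string just like the function above, and passing algorithm="sha1" switches back to the recipe's original hash. Unlike GetHashofDirs, this sketch lets exceptions such as unreadable files propagate instead of returning -1 or -2.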