且构网

How do I compare two different files line by line and write the differences to a third file?

Updated: 2023-01-31 11:37:36

If you're on a *nix system, the best general-purpose option is simply the following, which prints every line that occurs exactly once across the two files:

sort filea fileb | uniq -u

But if you need to use Python:

Your code reopens the inner file on every iteration over the outer file. Open it once, outside the loop.
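The question's original code is not reproduced here, so the following is only a hypothetical sketch of that first fix: the inner file is read once, before the loop over the outer file starts. The function name `lines_missing_from` and its parameters are assumptions made for illustration.

```python
def lines_missing_from(outer_path, inner_path):
    """Return the lines of outer_path that do not appear in inner_path."""
    # Read the inner file a single time, before the outer loop begins,
    # instead of reopening it for every line of the outer file.
    with open(inner_path) as inner:
        inner_lines = [line.rstrip("\n") for line in inner]

    missing = []
    with open(outer_path) as outer:
        for line in outer:
            line = line.rstrip("\n")
            # "in" on a list is still a linear scan, so this remains a
            # nested loop, but it no longer repeats any file I/O.
            if line not in inner_lines:
                missing.append(line)
    return missing
```

This removes the repeated file opens but keeps the quadratic comparison.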

A nested loop is less efficient than looping over the first file once, storing the values found, and then comparing the second file's values against them:

def build_set(filename):
    # A set stores a collection of unique items.  Both adding items and searching for them
    # are quick, so it's perfect for this application.
    found = set()

    with open(filename) as f:
        for line in f:
            # [:2] gives us the first two elements of the list.
            # Tuples, unlike lists, cannot be changed, which is a requirement for anything
            # being stored in a set.
            found.add(tuple(sorted(line.split()[:2])))

    return found

set_more = build_set('100rwsnMore.txt')
set_del = build_set('100rwsnDeleted.txt')

# Using with to open files ensures that they are properly closed, even if the code
# raises an exception.
with open('results.txt', 'w') as out_file:
    # The - computes the elements in set_more not in set_del.
    for res in (set_more - set_del):
        out_file.write(" ".join(res) + "\n")
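
Note that the shell one-liner and the Python code answer slightly different questions: `uniq -u` reports lines unique to either file, while `set_more - set_del` only reports entries missing from the second file. If the two-sided result is wanted, a minimal sketch might look like the following. The names `read_pairs` and `write_symmetric_difference` are hypothetical, and unlike the shell pipeline this version still compares only the first two fields of each line, ignoring their order.

```python
def read_pairs(filename):
    """Collect the first two fields of each line as an order-insensitive tuple."""
    found = set()
    with open(filename) as f:
        for line in f:
            fields = line.split()[:2]
            if fields:  # skip blank lines (an assumption about the input)
                found.add(tuple(sorted(fields)))
    return found


def write_symmetric_difference(path_a, path_b, out_path):
    """Write the entries that occur in exactly one of the two files."""
    # ^ is the set symmetric-difference operator.
    only_one = read_pairs(path_a) ^ read_pairs(path_b)
    with open(out_path, "w") as out_file:
        # Sets are unordered, so sort to make the output reproducible.
        for res in sorted(only_one):
            out_file.write(" ".join(res) + "\n")
    return only_one
```

With the file names used above, the call would be `write_symmetric_difference('100rwsnMore.txt', '100rwsnDeleted.txt', 'results.txt')`.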