Updated: 2023-08-28 16:29:04
Here is a solution based on regular expressions and bookmarks. It works for a sorted file (i.e. each line is immediately followed by its duplicates):
((.*)\R(\2\R?)+)*\K.*
. matches newline: should be unchecked, since the pattern relies on .* stopping at the end of a line
Explanation
The regular expression is made up of three parts:
((.*)\R(\2\R?)+)*: this is an optional sequence of duplicate blocks, each consisting of a line followed by one or more copies of it.
( ... )* matches zero or more such blocks of duplicated lines (if, in your example, the three 4s were followed by two 5s, we would need this notion of a sequence of duplicate blocks).
(.*)\R(\2\R?)+: \2 refers back to the content of (.*), so these are all duplicates of one line.
\R? is an optional (due to the ?) linebreak, so a duplicate in the last line of the file can be matched even if that line does not end with a linebreak.
If there is a block of duplicated lines directly after the cursor position from which you start, this part will match it.
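The duplicate-block part can be tried outside the editor. The following is only a rough sketch in Python, under a few assumptions: Python's re module supports neither \R nor \K, so the sketch assumes "\n" line endings, anchors each block at a line start, and requires every copy to end at a linebreak or at the end of the text. The function name find_duplicate_blocks is made up for this illustration.

```python
import re

# Approximation of (.*)\R(\2\R?)+ for "\n"-terminated text.
# Group 1 plays the role of (.*); the backreference \1 replaces \2
# because the outer group of the original pattern is dropped here.
DUPLICATE_BLOCK = re.compile(r"^(.*)\n(?:\1(?:\n|\Z))+", re.MULTILINE)


def find_duplicate_blocks(text: str) -> list[str]:
    """Return every run of identical consecutive lines found in text."""
    return [match.group(0) for match in DUPLICATE_BLOCK.finditer(text)]


if __name__ == "__main__":
    sample = "1\n2\n2\n3\n4\n4\n4\n5\n"
    for block in find_duplicate_blocks(sample):
        print(repr(block))
```

Each returned string is one block of duplicates; lines that occur only once (1, 3 and 5 in the sample) are never part of a block.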
\K: discards what has been matched so far (the duplicates) and "puts the cursor" before the first unique line.
.*: matches that unique line, so it is the only text left in the reported match.
Using Mark All we bookmark all such unique lines, so that we can remove them using the corresponding entry in the Search -> Bookmark menu.
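To check the expected end result independently of the editor, the same outcome can be computed in a few lines of Python. This is a sketch of the result only, not of how Notepad++ works internally. It assumes the input is already sorted (or at least that duplicates are adjacent), and drop_unique_lines is a hypothetical helper name.

```python
from itertools import groupby


def drop_unique_lines(text: str) -> str:
    """Keep only lines that belong to a run of two or more identical lines."""
    kept = []
    # groupby yields one group per run of equal consecutive lines
    for _, run in groupby(text.splitlines()):
        run = list(run)
        if len(run) > 1:  # a run of length 1 is a unique line: drop it
            kept.extend(run)
    return "\n".join(kept)


if __name__ == "__main__":
    print(drop_unique_lines("1\n2\n2\n3\n4\n4\n4\n5"))
```

Comparing this output with the file after removing the bookmarked lines is a quick way to confirm that the regex marked exactly the unique lines.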