且构网

Negated AWK regular expression with variables

Updated: 2023-02-22 14:19:08

Using awk only:

$ awk 'NR==FNR{a[$1,$2,$3]; next} !(($1,$2,$3) in a)' file2 file1
chr1    9997    10330   HumanGM18558_peak_1     150     .       10.78887        18.86368        15.08777        100
chr1    15966215        15966638        HumanGM18558_peak_3    81      .       7.61567 11.78841        8.17169 200

  • NR==FNR is true only while the first file is being read, which is file2 in this example
  • a[$1,$2,$3] creates an array key from the first three fields; if the matching lines are identical in both files, spacing included, you can simply use $0 instead of $1,$2,$3
  • next skips the remaining commands and moves on to the next line of input
  • ($1,$2,$3) in a checks whether the first three fields of the current file1 line exist as a key in array a; the leading ! inverts the test, so only lines without a match are printed
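To see the two passes at work, here is a minimal sketch on made-up data. The file contents and the `tmpdir` and `result` names are hypothetical and chosen only for illustration; the awk program itself is the one shown above.

```shell
# Hypothetical sample data, invented for this illustration only.
tmpdir=$(mktemp -d)
printf 'chr1\t100\t200\tpeak_a\nchr1\t300\t400\tpeak_b\nchr2\t100\t200\tpeak_c\n' > "$tmpdir/file1"
printf 'chr1\t300\t400\n' > "$tmpdir/file2"

# Pass 1 (file2, where NR==FNR): store each (field1, field2, field3) key in a.
# Pass 2 (file1): print only the lines whose key was never stored.
result=$(awk 'NR==FNR{a[$1,$2,$3]; next} !(($1,$2,$3) in a)' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Only the peak_a and peak_c lines should be printed, because the chr1/300/400 key was recorded while reading file2.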
Here's another way to write it (thanks to Ed Morton):


awk '{key=$1 FS $2 FS $3} NR==FNR{a[key]; next} !(key in a)' file2 file1
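As a quick sanity check, the sketch below runs both variants on the same made-up input and compares their output. The sample lines and the `v1`, `v2`, and `tmpdir` names are assumptions for illustration. The only difference between the variants is how the key is built: the comma form joins the fields with awk's SUBSEP, while the second form joins them with FS and computes the key once per line.

```shell
# Hypothetical sample data, invented for this illustration only.
tmpdir=$(mktemp -d)
printf 'chr1 100 200 peak_a\nchr1 300 400 peak_b\n' > "$tmpdir/file1"
printf 'chr1 300 400\n' > "$tmpdir/file2"

# Variant 1: key built with the comma syntax (fields joined by SUBSEP).
v1=$(awk 'NR==FNR{a[$1,$2,$3]; next} !(($1,$2,$3) in a)' "$tmpdir/file2" "$tmpdir/file1")

# Variant 2: key built once per line by joining the fields with FS.
v2=$(awk '{key=$1 FS $2 FS $3} NR==FNR{a[key]; next} !(key in a)' "$tmpdir/file2" "$tmpdir/file1")

printf 'SUBSEP version: %s\n' "$v1"
printf 'FS version:     %s\n' "$v2"

rm -rf "$tmpdir"
```

Both variants should print the same unmatched line here. Storing the key in a variable mainly avoids repeating the field list in two places.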