Parsing HTML on the command line: how to capture the text inside <strong></strong>?

Updated: 2023-02-19 16:50:57

Use the look-behind and look-ahead features of Perl regular expressions in grep: -P enables Perl-compatible patterns, and -o prints only the matched text. This should be simpler than using awk.

grep -oP "(?<=<strong>).*?(?=</strong>)" file

Output:

Target1NoSpaces
Target2 With Spaces
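
Why the pattern uses the non-greedy `.*?` can be sketched in Ruby with the same look-behind and look-ahead. The sample line and the names `LAZY`, `GREEDY`, and `strong_texts` are hypothetical and only serve as illustration; they are not part of the original input file.

```ruby
# Hypothetical sample line containing two <strong> elements.
line = "<p><strong>Target1NoSpaces</strong> and <strong>Target2 With Spaces</strong></p>"

# Same pattern as the grep command: non-greedy match between the tags.
LAZY = /(?<=<strong>).*?(?=<\/strong>)/
# Greedy variant, shown only for comparison.
GREEDY = /(?<=<strong>).*(?=<\/strong>)/

# Return every substring of `line` matched by `pattern`.
def strong_texts(line, pattern)
  line.scan(pattern)
end

lazy_matches = strong_texts(line, LAZY)
greedy_matches = strong_texts(line, GREEDY)

puts lazy_matches    # one line per <strong> element
puts greedy_matches  # a single match running from the first tag to the last
```

With the greedy `.*`, the match extends to the last `</strong>` on the line, so two elements on one line collapse into one result. Note that grep processes its input line by line, so neither variant can match an element that spans several lines.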

Addendum:

This Ruby one-liner uses the same Perl-style look-behind and look-ahead. With the /m flag, the dot also matches newlines, so it can capture values that span multiple lines:

ruby -e 'File.read(ARGV.shift).scan(/(?<=<strong>).*?(?=<\/strong>)/m).each{|e| puts "----------"; puts e;}' file

Input:

<strong>Target
A
B
C
</strong><strong>Target D</strong><strong>Target E</strong>

Output:

----------
Target
A
B
C
----------
Target D
----------
Target E
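
The same multi-line extraction can also be written as a small reusable method. This is a minimal sketch, not the original answer's code: the method name `strong_texts` is hypothetical, and it swaps the look-behind and look-ahead for a plain capture group, which yields the same text. It additionally strips surrounding whitespace, which the one-liner does not do.

```ruby
# Hypothetical helper: return the text of every <strong>...</strong> element.
# The /m flag lets "." match newlines, so elements may span several lines.
def strong_texts(html)
  html.scan(/<strong>(.*?)<\/strong>/m) # => [["..."], ["..."], ...]
      .flatten                          # one string per element
      .map(&:strip)                     # drop leading/trailing whitespace
end

# Same content as the input shown above.
html = "<strong>Target\nA\nB\nC\n</strong><strong>Target D</strong><strong>Target E</strong>"

strong_texts(html).each do |text|
  puts "----------"
  puts text
end
```

Like the one-liner, this only recognizes a bare `<strong>` tag; an opening tag with attributes would not match.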