Matching a CSV line that contains semicolons and quotes inside quoted strings

Updated: 2023-01-14 20:35:40

Assuming a well-formed CSV that escapes quotes by doubling them ("") and that is read line by line, you can use

"(?:[^"]+|"")*"|[^;]+|(?<=;|^)(?=;|$)

The pattern has three alternatives, one for each way a column can appear:

  • "(?:[^"]+|"")*" an opening and a closing quote with non-quote characters or doubled quotes in between
  • [^;]+ a run of non-semicolon characters
  • (?<=;|^)(?=;|$) an empty field between two semicolons, or between a semicolon and the start/end of the line
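As a rough illustration of how the three alternatives work together, here is a minimal Python sketch. It is not part of the original answer. Note one adaptation: Python's `re` module only accepts fixed-width lookbehinds, so `(?<=;|^)` is rewritten as the equivalent `(?:(?<=;)|^)`. The names `split_line` and `unquote` are hypothetical helpers introduced only for this example.

```python
import re

# Pattern from the text, with one adaptation: Python's `re` only accepts
# fixed-width lookbehinds, so (?<=;|^) is rewritten as (?:(?<=;)|^).
COLUMN = re.compile(r'"(?:[^"]+|"")*"|[^;]+|(?:(?<=;)|^)(?=;|$)')


def split_line(line):
    """Return the raw columns of one CSV line (quotes still in place)."""
    return [m.group(0) for m in COLUMN.finditer(line)]


def unquote(field):
    """Hypothetical helper: strip enclosing quotes, collapse doubled quotes."""
    if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
        return field[1:-1].replace('""', '"')
    return field


raw = split_line('a;"b;c";;"d ""e""";')
print(raw)                        # raw columns, including the two empty ones
print([unquote(f) for f in raw])  # columns with the quoting removed
```

The sample line exercises all three alternatives: an unquoted field, a quoted field containing a semicolon, an empty field between two semicolons, a quoted field with doubled quotes, and an empty trailing field.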

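Notes:

  • To use it in a multiline context, you have to add \n to the negated character classes.
  • It does not handle leading or trailing whitespace adjacent to quoted fields.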
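The multiline note can be sketched as follows, again in Python and again only as an assumption-laden illustration. `\n` is added to both negated character classes, and the `re.MULTILINE` flag (my assumption, so that `^` and `$` match at line boundaries) is switched on. As before, the lookbehind is rewritten to be fixed-width for Python. `split_text` is a hypothetical helper that groups matches into rows by counting the newlines before each match. With `\n` excluded from the quoted class, a quoted field cannot span several lines in this variant.

```python
import re

# Multiline variant: \n added to both negated classes; MULTILINE lets ^ and $
# match at every line boundary. The lookbehind is fixed-width for Python's `re`.
COLUMN_ML = re.compile(
    r'"(?:[^"\n]+|"")*"|[^;\n]+|(?:(?<=;)|^)(?=;|$)',
    re.MULTILINE,
)


def split_text(text):
    """Hypothetical helper: return one list of raw columns per input line."""
    rows = {}
    for m in COLUMN_ML.finditer(text):
        # The number of newlines before the match start is the row index.
        line_no = text.count("\n", 0, m.start())
        rows.setdefault(line_no, []).append(m.group(0))
    return [rows[k] for k in sorted(rows)]


for row in split_text('a;"b;c";\n;"d ""e""";f'):
    print(row)
```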

See https://regex101.com/r/twKZVN/1

(While regex101 tests a PCRE pattern, all features used are also available in a .NET pattern.)