且构网 - Sharing all things programming and development

Is there a way to simplify this regular expression?

Updated: 2022-12-15 12:59:04

I have an XML document as follows:

<break name="article_1-1">
<h1>
  <page num="1" />Some heading</h1>
<bl>
  Human name Contributing Writer</bl>
<h3>
  OPINION
</h3>
<p>First Paragraph</p>
<p>Second Paragraph</p>
<p>Third Paragraph</p>
<bq>
  Some value
</bq>
<p>
  Fourth Paragraph with italic values
</p>
<fig>
  <img src="images/img_1-1.jpg" width="1553" height="1050" alt="" />
  <fc>
	Image caption
  </fc>
  <cr>PHOTOGRAPHS BY SOME HUMAN</cr>
</fig>
<h3>
  CITY, STATE
</h3>
</break>



I want it to look like this:

<break name="article_1-1">
<h1><page num="1" />Some heading</h1>
<bl>Human name Contributing Writer</bl>
<h3>OPINION</h3>
<p>First Paragraph</p>
<p>Second Paragraph</p>
<p>Third Paragraph</p>
<bq>Some value</bq>
<p>Fourth Paragraph with italic values</p>
<fig><img src="images/img_1-1.jpg" width="1553" height="1050" alt="" /><fc>Image caption</fc><cr>PHOTOGRAPHS BY SOME HUMAN</cr></fig>
<h3>CITY, STATE</h3>
</break>



I remove the indentation at a later stage; my main focus here is getting each element's opening and closing tags onto the same line.

I want a regex for this. I have tried the pattern below, but I suspect there is a simpler way.

Please help.

Regards

What I have tried:

string pattern = @"(?:(?:(<\w.>)|(<\w>)|(<\w..>|(<p>)|(\/>)))(\s+)|((<\/(?!(title)|(head)|(break)|(body))\w+>)(\s+)(<\/(?!(title)|(head)|(break)|(body))\w+>))|((<\/fc>)(\s+)(<cr>)))";

string substitution2 = @"$1$2$3$8$14$20$22";
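
One possible simplification, offered as a sketch rather than a definitive answer: split the single pattern into three small passes. It assumes that `title`, `head`, `break` and `body` (the names excluded in the pattern above) are the only elements whose children stay on separate lines, and that `fig` is the only element whose child elements must be joined to each other. The names `TagJoiner` and `Join` are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class TagJoiner
{
    // Assumption: these are the only "container" elements (same list as the
    // exclusions in the original pattern). Their children keep their line breaks.
    private const string NotContainer = @"(?!(?:title|head|break|body)\b)";

    public static string Join(string xml)
    {
        // Pass 1: drop whitespace that follows an opening or self-closing tag,
        // unless the tag opens a container. Note that a self-closing tag sitting
        // directly inside a container would also be joined to what follows it.
        xml = Regex.Replace(xml, @"(<(?!/)" + NotContainer + @"[^<>]+>)\s+", "$1");

        // Pass 2: drop whitespace that precedes a closing tag,
        // unless the tag closes a container.
        xml = Regex.Replace(xml, @"\s+(</" + NotContainer + @"[^<>]+>)", "$1");

        // Pass 3: inside each <fig>...</fig> block, also join sibling tags
        // (for example </fc> followed by <cr>). Assumes <fig> is never nested.
        xml = Regex.Replace(
            xml,
            @"<fig\b.*?</fig>",
            m => Regex.Replace(m.Value, @">\s+<", "><"),
            RegexOptions.Singleline);

        return xml;
    }
}
```

Calling `TagJoiner.Join(xml)` on the sample input should bring each tag pair onto one line while leaving the line breaks between the direct children of `<break>` untouched. Regex alone cannot tell which parent a pair of sibling tags belongs to, which is why `fig` is handled by name; if the real documents nest more deeply, an XML parser is probably the safer tool.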
