
REGEX for data and end tag only

Updated: 2022-11-26 15:27:04


I am looking for a regex that returns the data together with the trailing tag.

e.g.

input:
-----------------
<p>ABC<p>
-----------------
Output would be
-----------------
ABC<p>

-----------------

It should remove only the first <p> tag, not the second <p> tag, and all the text in between should stay the same.

I should mention that I am looking for

<p>ABC<p>

not for

<p>ABC</p>

It is for a specific text file that has irregular <p> tags.

Example:

I have a big XHTML file like this:

<p>scet</p>
<p>sunny </p>
<p>             <!--this tag is to be removed -->
<p>              <!--this tag is to be removed -->
<p>mark</p>
<p>Thomas </p>

It is a complete XHTML file with head, body, and the other usual tags. The only problem is the extra <p> tags. I am expecting output like this:

<p>scet</p>
<p>sunny </p>

<p>mark</p>
<p>Thomas </p>

This will work. Pass the HTML document in as the string xhtml:

    using System;
    using System.Text;

    public static class XHTMLCleanerUpperThingy
    {
        private const string p = "<p>";
        private const string closingp = "</p>";

        // Blanks out every <p> that is not closed before the next <p>,
        // or that is never closed at all.
        public static string CleanUpXHTML(string xhtml)
        {
            StringBuilder builder = new StringBuilder(xhtml);
            int current;

            // jump from one <p> to the next instead of rescanning from every character
            for (int idx = 0;
                 (current = xhtml.IndexOf(p, idx, StringComparison.Ordinal)) != -1;
                 idx = current + p.Length)
            {
                int idxofnext = xhtml.IndexOf(p, current + p.Length, StringComparison.Ordinal);
                int idxofclose = xhtml.IndexOf(closingp, current, StringComparison.Ordinal);

                // the tag is unmatched if no </p> follows it at all,
                // or if another <p> opens before the next </p>
                bool unmatched = idxofclose < 0
                    || (idxofnext >= 0 && idxofnext < idxofclose);

                if (unmatched)
                {
                    // overwrite the tag with spaces so the string length stays the same
                    for (int j = 0; j < p.Length; j++)
                    {
                        builder[current + j] = ' ';
                    }
                }
            }

            return builder.ToString();
        }
    }
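
Since the question asked for a regex, here is a minimal sketch of one possible pattern. It is not a general HTML parser: it assumes the tags are written exactly as `<p>` and `</p>` (lower case, no attributes), and the names `ParagraphTagRegex` and `RemoveUnclosedOpeners` are made up for illustration. It removes a `<p>` only when another `<p>` appears before any `</p>`, so a final unclosed `<p>` is left alone, as in the `<p>ABC<p>` example from the question. Unlike the loop above, it deletes the tag rather than overwriting it with spaces.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ParagraphTagRegex
{
    // <p>                 the opening tag to remove
    // (?= ... )           lookahead: checks what follows without consuming it
    // (?:(?!</p>).)*?     any run of characters that does not cross a </p>
    // <p>                 ...ending at another opening tag
    // Singleline lets '.' match newlines, so tags on separate lines are handled.
    private static readonly Regex UnclosedOpener = new Regex(
        @"<p>(?=(?:(?!</p>).)*?<p>)",
        RegexOptions.Singleline);

    // Returns the input with every <p> removed that is followed by
    // another <p> before any </p>.
    public static string RemoveUnclosedOpeners(string xhtml)
    {
        return UnclosedOpener.Replace(xhtml, string.Empty);
    }
}
```

For example, `ParagraphTagRegex.RemoveUnclosedOpeners("<p>ABC<p>")` returns `ABC<p>`, while `<p>ABC</p>` comes back unchanged. If the real file contains `<p>` tags with attributes or upper-case tags, the pattern would need adjusting, and at that point an HTML parser is probably the safer choice.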