Converting text to XML

Last updated: 2023-02-06 15:56:08

I have been assigned the task of converting a plain text file into an XML file using C#. The plain text file contains:

##Main
MachineName DomainName Scandate GUID RegID
RMM-LT-417@*@Home.LOCAL@*@03/23/2015 18:48:38@*@31c0841e-f7bf-4de1-9d75-7e9080498e6b-20141216020243430495@*@2853625

##AP_LogicalDrivesInformation
Caption Description DriveType FileSystem FreeSpace UsedSpace Size CFreeSpace CUsedSpace VolumeName VolumeSerialNumber
C@*@Hard Drive@*@Drive Fixed@*@NTFS@*@421002715136@*@79097778176@*@500100493312@*@392.1 GB@*@73.7 GB@*@OS@*@469D6D66
D@*@Hard Drive@*@Drive Fixed@*@NTFS@*@484706508800@*@4765618176@*@489472126976@*@451.4 GB@*@4.4 GB@*@DATAPART@*@7A9EDCCC
E@*@MATSHITA DVD+-RW UJ8E2@*@Cd-Rom@*@@*@0@*@0@*@0@*@0@*@0@*@@*@0


where ##Main and ##AP_LogicalDrivesInformation are the root elements.
MachineName, DomainName, Scandate, GUID and RegID are nodes, and each line below them holds their values, separated by @*@.

My code so far is:

var xml = new StringBuilder();

xml.Append("<Main>\n");

foreach (var line in File.ReadAllLines(@"yourfile.txt"))
{
    Regex.Replace(line, @"(?<=(?:\r?\n){2}|\A)(?:\r?\n)+", "");

    var line1 = line.Replace("@*@", ";");

    var vals = line1.Split(';');

    // TODO add more fields
    xml.AppendFormat("<MachineName>{0}</MachineName>\n <DomainName>{1}</DomainName>\n <Scandate>{2}</Scandate>\n <GUID>{3}</GUID>\n <RegID>{4}</RegID>\n",
                   vals[0].Trim(), vals[1].Trim(), vals[2].Trim(), vals[3].Trim(), vals[4].Trim());
    xml.Append("</Main>\n");
}


But the problem now is that no new root node is created for ##AP_LogicalDrivesInformation.

You just need to check your logic, nothing else. XML is allowed to have only one top-level element, the root (credit to Richard Deeming).

I would not use StringBuilder, though. I would use string.Format (only for a very rigid XML schema) or, even better, System.Xml.XmlWriter:
https://msdn.microsoft.com/en-us/library/system.xml.xmlwriter%28v=vs.110%29.aspx

—SA
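
To make the XmlWriter suggestion concrete, here is a minimal sketch that reads the text line by line and writes one element per section under a single root. It is only an illustration, not the poster's code: the class name SectionXmlWriterSketch, the method ToXml, and the wrapper element names "root" and "child" are assumptions, and it assumes every section consists of a "##Name" line, one line of space-separated column names, and then rows separated by @*@.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class SectionXmlWriterSketch
{
    // Hypothetical helper: converts "##Section / column names / @*@ rows" text to XML.
    public static string ToXml(string text)
    {
        var output = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };

        using (var reader = new StringReader(text))
        using (var writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("root");          // the single top-level element
            string[] columns = null;                   // column names of the current section
            bool sectionOpen = false;
            string line;

            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length == 0)
                    continue;                          // skip blank separator lines

                if (line.StartsWith("##", StringComparison.Ordinal))
                {
                    if (sectionOpen)
                        writer.WriteEndElement();      // close the previous section
                    writer.WriteStartElement(line.Substring(2).Trim());
                    sectionOpen = true;
                    columns = null;                    // the next non-blank line holds the names
                }
                else if (columns == null)
                {
                    columns = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                }
                else
                {
                    string[] values = line.Split(new[] { "@*@" }, StringSplitOptions.None);
                    if (values.Length != columns.Length)
                        throw new FormatException("Column and value counts differ: " + line);

                    writer.WriteStartElement("child");
                    for (int i = 0; i < columns.Length; i++)
                        writer.WriteElementString(columns[i], values[i]);
                    writer.WriteEndElement();          // </child>
                }
            }

            if (sectionOpen)
                writer.WriteEndElement();              // close the last section
            writer.WriteEndElement();                  // </root>
        }
        return output.ToString();
    }
}
```

Calling SectionXmlWriterSketch.ToXml(File.ReadAllText(path)) would return the XML as a string. Because the writer opens a new element whenever it meets a "##" line, the second section gets its own node, which is what the code in the question was missing.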


This is one possible solution; however, with the little information you have provided, it is difficult for me to say whether it is exactly what you want.
The resulting XML structure is just a guess and might not suit your needs.

First read the whole contents of the file into a string.
(If the file is very big, this approach might not be the best.)
string content = File.ReadAllText(<filename>);


Then use this regular expression on the content:

Regex expression = new Regex(@"##(?<parent>\w+)\r\n(?<children>[\w ]+)\r\n((?<values>[\S ]+)(\r\n(?!#)|


))+", RegexOptions.None);
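
The pattern above expects Windows line endings (\r\n). If the input might use plain \n instead, a more tolerant variant can be sketched as below. This is an assumed alternative, not the expression from this answer: the class name SectionRegexSketch and the method Summarize are made up for illustration, and the variant assumes that no value row starts with '#'.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SectionRegexSketch
{
    // Assumed variant of the pattern: accepts \n or \r\n, and ends a section
    // at a blank line, at the next "##" header, or at the end of the input.
    private static readonly Regex Sections = new Regex(
        @"##(?<parent>\w+)\r?\n(?<children>[\w ]+)(?:\r?\n(?<values>[^\r\n#][^\r\n]*))+");

    // Returns one "parent: n row(s)" summary per matched section.
    public static List<string> Summarize(string content)
    {
        var result = new List<string>();
        foreach (Match m in Sections.Matches(content))
        {
            // The 'values' group is inside a repeated group, so it keeps one capture per row.
            int rows = m.Groups["values"].Captures.Count;
            result.Add(m.Groups["parent"].Value + ": " + rows + " row(s)");
        }
        return result;
    }
}
```

It uses the same three group names (parent, children, values), so the loop that builds the XML should work with it unchanged.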


The expression gives you three named groups (parent, children and values), of which the 'values' group has one or more captures, one per data row.
Then loop through all matches and create the XML structure.
I have used XElement, but XmlDocument can also be used.

XElement xeRoot = new XElement("root");   // Change the name 'root' to whatever is suitable
foreach (Match m in expression.Matches(content))
{
    XElement xeParent = new XElement(m.Groups["parent"].Value);

    string[] children = m.Groups["children"].Value.Split(' ');      // Use space as the delimiter for the children
    
    foreach (Capture cap in m.Groups["values"].Captures)
    {
        XElement xeChild = new XElement("child");   // Change the name 'child' to whatever is suitable
        string[] values = cap.Value.Split(new string[] { "@*@" }, StringSplitOptions.None);

        // Check that the children and values counts are equal
        if (children.Length != values.Length)
            throw new Exception("The number of children and values mismatch.");
      
        for (int i = 0; i < children.Length; i++)
        {
            XElement xeChildValue = new XElement(children[i]);
            xeChildValue.Value = values[i];
            xeChild.Add(xeChildValue);
        }

        xeParent.Add(xeChild);
    }
    
    xeRoot.Add(xeParent);
}
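
One thing to watch in a loop like this: new XElement(name) throws an XmlException when the name is not a valid XML name, for example a column name that starts with a digit. If the column names cannot be trusted, one option is to encode them first with XmlConvert.EncodeLocalName. A minimal sketch, where ElementNameSketch and Create are hypothetical names used only for illustration:

```csharp
using System.Xml;
using System.Xml.Linq;

public static class ElementNameSketch
{
    // Builds an element whose name comes from the input text, encoding any
    // characters that are not legal in an XML name as _xHHHH_ sequences.
    public static XElement Create(string rawName, string value)
    {
        string safeName = XmlConvert.EncodeLocalName(rawName);
        return new XElement(safeName, value);
    }
}
```

XmlConvert.DecodeName reverses the encoding if the original column name is needed when reading the file back.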


Finally, save the XML data to a file:

XDocument doc = new XDocument();
doc.Add(xeRoot);
doc.Save(@"C:\Temp\test.xml");



Resulting XML:

<?xml version="1.0" encoding="utf-8"?>
<root>
  <Main>
    <child>
      <MachineName>RMM-LT-417</MachineName>
      <DomainName>Home.LOCAL</DomainName>
      <Scandate>03/23/2015 18:48:38</Scandate>
      <GUID>31c0841e-f7bf-4de1-9d75-7e9080498e6b-20141216020243430495</GUID>
      <RegID>2853625</RegID>
    </child>
  </Main>
  <AP_LogicalDrivesInformation>
    <child>
      <Caption>C</Caption>
      <Description>Hard Drive</Description>
      <DriveType>Drive Fixed</DriveType>
      <FileSystem>NTFS</FileSystem>
      <FreeSpace>421002715136</FreeSpace>
      <UsedSpace>79097778176</UsedSpace>
      <Size>500100493312</Size>
      <CFreeSpace>392.1 GB</CFreeSpace>
      <CUsedSpace>73.7 GB</CUsedSpace>
      <VolumeName>OS</VolumeName>
      <VolumeSerialNumber>469D6D66</VolumeSerialNumber>
    </child>
    <child>
      <Caption>D</Caption>
      <Description>Hard Drive</Description>
      <DriveType>Drive Fixed</DriveType>
      <FileSystem>NTFS</FileSystem>
      <FreeSpace>484706508800</FreeSpace>
      <UsedSpace>4765618176</UsedSpace>
      <Size>489472126976</Size>
      <CFreeSpace>451.4 GB</CFreeSpace>
      <CUsedSpace>4.4 GB</CUsedSpace>
      <VolumeName>DATAPART</VolumeName>
      <VolumeSerialNumber>7A9EDCCC</VolumeSerialNumber>
    </child>
    <child>
      <Caption>E</Caption>
      <Description>MATSHITA DVD+-RW UJ8E2</Description>
      <DriveType>Cd-Rom</DriveType>
      <FileSystem></FileSystem>
      <FreeSpace>0</FreeSpace>
      <UsedSpace>0</UsedSpace>
      <Size>0</Size>
      <CFreeSpace>0</CFreeSpace>
      <CUsedSpace>0</CUsedSpace>
      <VolumeName></VolumeName>
      <VolumeSerialNumber>0</VolumeSerialNumber>
    </child>
  </AP_LogicalDrivesInformation>
</root>