且构网

How do I parse this kind of string in C#?

Updated: 2023-11-06 22:49:16


My program produces output in the following format. I am trying to parse it and load it into a DataTable, but some column values contain spaces, so splitting on whitespace does not give me the same number of columns for every row.

Sample output:

List of Sessions

Cell Manager: prd.cp.sya.com
Creation Date: 10/11/2018 10:53:48 AM

 Session Type                 Specification                       Status                    Mode    Start Time            Queuing Duration       GB Written    # Media # Errors # Warnings # Files Success Session ID          

 Backup                       IDB IDB_Backup                      Completed                 full    10/10/2018 11:50:07      0:00     1:07          1345.11          1        0          0   23794    100% 2018/10/10-38       
 Copy (scheduled)             Daily_Incr_Copy                     Aborted                   -       10/10/2018 1:13:07 P     0:00    10:10             0.00          1        1          1       0      0% 2018/10/10-40       
 Backup                       EDRFT8 quESDapg-scan_DB_P8RDEUA_Da Completed                 full    10/10/2018 3:00:07 P     0:00     0:08             2.64          1        0          0       1    100% 2018/10/10-41       
 Backup                       ESRDE8 ptfrapg-scan_DB_P8WSEROD2_D Completed                 full    10/10/2018 3:00:07 P     0:00     0:10             3.35          1        0          0       1    100% 2018/10/10-42       
 Backup                       Backup_Servers_Weekly_Full_Daily_In Completed/Errors          incr    10/10/2018 5:00:07 P     0:00     0:40            88.63          1        1         50  377910    100% 2018/10/10-43       
 Backup                       Prdesrda01_Daily backup            Completed                 incr    10/10/2018 5:00:07 P     0:00     0:06             0.00          1        0          0    7214    100% 2018/10/10-44       
 Backup                       Win_FS_Daily_Incr_44                Completed                 incr    10/10/2018 5:45:07 P     0:00     2:42           305.12          1        0          0 1369158    100% 2018/10/10-45       
 Backup                       Win_FS_DMZ_Daily_Incr_01            Completed                 incr    10/10/2018 5:45:07 P     0:00     0:39            94.10          1        0         60 1158973    100% 2018/10/10-46       
 Backup                       PRRDFTFS01_Daily_Incr               Completed                 incr    10/10/2018 6:00:07 P     0:00     0:15            14.81          1        0         72  128805    100% 2018/10/10-47       
 Backup                       PREWQAFS02_Daily_Incr               Completed                 incr    10/10/2018 6:00:07 P     0:00     2:21            12.14          1        0          0  658882    100% 2018/10/10-48       
 Backup                       PRDTRFGS03_Daily_Incr               Completed                 incr    10/10/2018 6:00:07 P     0:00     3:23           130.02          1        0          0 1172956    100% 2018/10/10-49       
 Backup                       Win_FS_Daily_Incr_23                Completed                 incr    10/10/2018 6:00:08 P     0:00     3:50           127.36          1        0          0 2343249    100% 2018/10/10-50       
 Backup                       Win_FS_Daily_Incr_21                Completed                 incr    10/10/2018 6:00:08 P     0:00     1:47           389.71          1        0          0 1966853    100% 2018/10/10-51       
 Backup                       Win_FS_Weekly_Full_01               Completed                 full    10/10/2018 6:30:07 P     0:00     2:58          1434.19          1        0          0 7311244    100% 2018/10/10-52       
 Backup                       Win_FS_Weekly_Full_30               Completed/Errors          full    10/10/2018 6:30:08 P     0:00     3:14          1616.41          1        1          0 7037181    100% 2018/10/10-53       
 Backup                       Win_FS_Weekly_Full_32               Completed                 full    10/10/2018 6:30:08 P     0:00     4:07          1229.42          1        0          0 1195721    100% 2018/10/10-54       
 Backup                       Win_FS_Weekly_Full_33               Completed                 full    10/10/2018 6:30:08 P     0:00     2:13           629.20          1        0          0 6740762    100% 2018/10/10-55       
 Backup                       ORDTFC8 DeRDTapg53_test Backup_Arch Completed                 full    10/10/2018 6:51:07 P     0:00     0:07             8.00          1        0          0       1    100% 2018/10/10-56       
 Backup                       PRDRDFSW35_MSSQL_Offline_1AM_Daily_ Completed                 incr    10/10/2018 10:00:07      0:06     1:42           640.34          1        0          1     687    100% 2018/10/10-77



What I have tried:

I tried the following method:

string[] stringSeparators = new string[] {"\r\n"};
string[] lines = OutputFilePath.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
lines = lines.Where(x => !string.IsNullOrEmpty(x)).ToArray();
foreach (var item in lines.Skip(5))
{
    //formatedop = item.Split(' ');
    formatedop = Regex.Split(item, @"\s{1,}");
    formatedop = formatedop.Where(x => !string.IsNullOrEmpty(x)).ToArray();
    try
    {
        if (formatedop.Length > 16)
        {
            if (formatedop[1].Contains("(scheduled)"))
            {
                formatedop[0] = formatedop[0] + " " + formatedop[1];
                formatedop[5] = formatedop[5] + " " + formatedop[6] + " " + formatedop[7];
                table.Rows.Add(
                    formatedop[0].ToString(),
                    formatedop[2].ToString(),
                    formatedop[3].ToString(),
                    formatedop[4].ToString(),
                    formatedop[5].ToString(),
                    formatedop[8].ToString(),
                    formatedop[9].ToString(),
                    formatedop[10].ToString(),
                    formatedop[11].ToString(),
                    formatedop[12].ToString(),
                    formatedop[13].ToString(),
                    formatedop[14].ToString(),
                    formatedop[15].ToString(),
                    DateTime.Now.AddDays(-1).ToString("yyyy-MM-dd"));
            }
            if (formatedop[1].Contains("Oracle8"))
            {
                formatedop[1] = formatedop[1] + " " + formatedop[2];
                formatedop[5] = formatedop[5] + " " + formatedop[6] + " " + formatedop[7];
                table.Rows.Add(
                    formatedop[0].ToString(),
                    formatedop[1].ToString(),
                    formatedop[3].ToString(),
                    formatedop[4].ToString(),
                    formatedop[5].ToString(),
                    formatedop[8].ToString(),
                    formatedop[9].ToString(),
                    formatedop[10].ToString(),
                    formatedop[11].ToString(),
                    formatedop[12].ToString(),
                    formatedop[13].ToString(),
                    formatedop[14].ToString(),
                    formatedop[15].ToString(),
                    DateTime.Now.AddDays(-1).ToString("yyyy-MM-dd"));
            }
        }
        else
        {
            formatedop[5] = formatedop[5] + " " + formatedop[6] + " " + formatedop[7];
            table.Rows.Add(
                formatedop[0].ToString(),
                formatedop[1].ToString(),
                formatedop[2].ToString(),
                formatedop[3].ToString(),
                formatedop[4].ToString(),
                formatedop[5].ToString(),
                formatedop[8].ToString(),
                formatedop[9].ToString(),
                formatedop[10].ToString(),
                formatedop[11].ToString(),
                formatedop[12].ToString(),
                formatedop[13].ToString(),
                formatedop[14].ToString(),
                formatedop[15].ToString(),
                DateTime.Now.AddDays(-1).ToString("yyyy-MM-dd"));
        }
    }
    catch(System.Exception ex)
    {
        formatedop[0].ToString();
        Console.WriteLine(ex.InnerException.ToString());
    }
}


But this does not work reliably; it breaks on one line or another. I need a clean, clear method. Can anyone suggest something? I need output like this:

formatedop[0]=Backup
formatedop[1]=IDB IDB_Backup
formatedop[2]=Completed
formatedop[3]=full
formatedop[4]=10/10/2018 11:50:07
formatedop[5]=0:00
formatedop[6]=1:07
formatedop[7]=1345.11
formatedop[8]=1
formatedop[9]=0
formatedop[10]=0
formatedop[11]=23794
formatedop[12]=100%
formatedop[13]=2018/10/10-38

Well, each column appears to be a fixed width, so simply work out how many characters wide each column is and, for each line in the file, call line.Substring(x, y).Trim(), where x is the index at which the column starts and y is the width of the column.

Remember to skip everything before the first line of data, and handle the possibility that a line is shorter than you expect.
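
A minimal sketch of that approach is below. The class and method names (FixedWidth, SplitLine, ParseReport) are made up for illustration, and the widths array is an assumption you have to fill in by measuring the columns in your real report. The only thing taken from the sample is that the data starts after the "Session Type" header line.

```csharp
using System;
using System.Collections.Generic;

public static class FixedWidth
{
    // Cuts one line into trimmed fields using the given column widths.
    // A line that is shorter than expected yields empty strings for the
    // missing columns instead of throwing ArgumentOutOfRangeException.
    public static string[] SplitLine(string line, int[] widths)
    {
        var fields = new string[widths.Length];
        int start = 0;
        for (int i = 0; i < widths.Length; i++)
        {
            if (start >= line.Length)
            {
                fields[i] = string.Empty;
            }
            else
            {
                // Clamp the length so Substring never reads past the end.
                int length = Math.Min(widths[i], line.Length - start);
                fields[i] = line.Substring(start, length).Trim();
            }
            start += widths[i];
        }
        return fields;
    }

    // Skips everything up to and including the column header line,
    // then splits every non-blank line that follows.
    public static List<string[]> ParseReport(IEnumerable<string> lines, int[] widths)
    {
        var rows = new List<string[]>();
        bool inData = false;
        foreach (string line in lines)
        {
            if (!inData)
            {
                // Assumption: the header row begins with "Session Type".
                inData = line.TrimStart().StartsWith("Session Type", StringComparison.Ordinal);
                continue;
            }
            if (string.IsNullOrWhiteSpace(line))
            {
                continue;
            }
            rows.Add(SplitLine(line, widths));
        }
        return rows;
    }
}
```

You would call it with something like FixedWidth.ParseReport(File.ReadLines(path), widths) and then pass each string[] to table.Rows.Add. Because the split is by position, a value such as "IDB IDB_Backup" or "Copy (scheduled)" stays in one field even though it contains a space.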


Note that this is not really a string-splitting problem.

If you open the file in Excel you will notice that Excel recognizes it as a "Fixed-Width" file, so what you need is a fixed-width file parser.

There are quite a few examples of this out there. This one is over a decade old, but it is still functional:
CodeProject: Handling Fixed-width Flat Files with .NET Custom Attributes

VB also has a TextFieldParser class, in the Microsoft.VisualBasic.FileIO namespace, that you can use from C# if you add a reference to the Microsoft.VisualBasic assembly.
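
A rough sketch of the TextFieldParser route from C# follows. The class name SessionReportReader and the method ReadFixedWidth are hypothetical, and the widths you pass in are an assumption to be measured from your own file. A last width of -1 tells the parser that the final field runs to the end of the line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class SessionReportReader
{
    // Reads fixed-width rows from any TextReader (a StreamReader over the
    // report file, or a StringReader over text already held in memory).
    public static List<string[]> ReadFixedWidth(TextReader reader, int[] widths)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.FixedWidth;
            parser.SetFieldWidths(widths);
            parser.TrimWhiteSpace = true; // strip the column padding

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // The line could not be parsed with the declared widths;
                    // this sketch skips it rather than aborting.
                }
            }
        }
        return rows;
    }
}
```

For example, SessionReportReader.ReadFixedWidth(new StreamReader(path), widths) returns one string[] per parsed line. The title and header lines at the top of the report are not data, so you still have to discard them yourself, for instance by calling parser.ReadLine() a fixed number of times before the loop or by filtering the returned rows.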