且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

如何从powershell中的文本表中删除第n列?

更新时间:2022-10-23 09:11:00

您正在查看的是固定宽度数据.为了解决这个问题,Import-Csv 不会这样做,我前段时间做了几个函数来转换固定宽度数据.

function ConvertFrom-FixedWidth {[CmdletBinding()][输出类型([PSObject[]])]参数([参数(强制 = $true,位置 = 0,ValueFromPipeline = $true)][字符串[]]$数据,[int[]]$ColumnWidths = $null, # 一个整数数组,包含每个字段的字符宽度[string[]]$Header = $null, # 如果给定,则包含列标题的字符串数组[switch]$AllowRaggedContent # 如果设置,则该函数接受要修剪的最后一项.)# 如果数据是通过管道发送的,使用 $input 来收集 as arrayif ($PSCmdlet.MyInvocation.ExpectingInput) { $Data = @($Input) }# 或使用: $Data = $Input |ForEach-Object { $_ }如果 (!$ColumnWidths) {写详细使用第一行计算列宽"# 尝试从顶部(标题)行确定每个字段的宽度.# 这只有在该行中的字段不包含空格时才能正常工作# 字符或如果每个字段与下一个字段之间的间隔超过 1 个空格.# 临时用下划线替换标题行中的单个空格$row1 = $Data[0] -replace '(\S+) (\S+)', '$1_$2'# 获取每个字段的起始索引并加上最后一个字段的总长度$indices = @(([regex] '\S+').Matches($row1) | ForEach-Object {$_.Index}) + $row1.Length# 根据索引计算单个字段的宽度$ColumnWidths = (0..($indices.Count -2)) |ForEach-Object { $indices[$_ + 1] - $indices[$_] }}# 将字段宽度整数数组组合成一个正则表达式字符串,如'^(.{10})(.{50})(.{12})'$values = $ColumnWidths |ForEach-Object { "(.{$_})" }如果($AllowRaggedContent){# 考虑太短的字段(可能通过修剪尾随空格)# 将最后一项设置为 '(.*)$' 以捕获字符串中剩余的任何字符.$values[-1] = '(.*)$'}$regex = '^{0}' -f ($values -join '')写详细拆分字段并生成输出"# 执行一个脚本块来转换数组中的每一行.$csv = &{开关 -Regex ($Data) {$正则表达式{# 用逗号加入匹配的捕获组并包装字段# 在双引号之间.字段值内的双引号必须加倍.($matches[1..($matches.Count - 1)] | ForEach-Object { '"{0}"' -f ($_.Trim() -replace '"', '""') })-加入 ','}}}if ($Header) { $csv |ConvertFrom-Csv -Header $Header }其他 { $csv |ConvertFrom-Csv }}函数 ConvertTo-FixedWidth {[CmdletBinding()][输出类型([字符串[]])]参数 ([参数(强制 = $true,位置 = 0,ValueFromPipeline = $true)][PSObject[]]$Data,[参数(强制= $false)][验证范围(1, 8)][int]$Gap = 2)# 如果数据是通过管道发送的,则使用 $input 收集为数组if ($PSCmdlet.MyInvocation.ExpectingInput) { $Data = @($Input) }# 或使用: $Data = $Input |ForEach-Object { $_ }# 以正确的顺序从第一个对象中获取标题写详细检索列标题"$headers = $Data[0].PSObject.Properties |ForEach-Object {$_.Name }# 计算每列的最大宽度写详细的计算列宽"$columnWidths = @{}foreach($Data 中的 $item){foreach($headers 中的 $column){$length = [Math]::Max($item.$column.Length, $column.Length)if ($column -ne $headers[-1]) { $length += $Gap }if ($columnWidths[$column]) { $length = [Math]::Max($columnWidths[$column], $length) }$columnWidths[$column] = $length}}# 输出标题,全部左对齐$line = foreach($headers 中的 $column){"{0, -$($columnWidths[$column])}" -f $column}# 输出第一行(标题)$line -join ''# 正则表达式来测试数值$reNumeric = '^[+-]?\s*(?:\d{1,3}(?:(,?)\d{3})?(?:\1\d{3})*(\.\d*)?|\.\d+)$'# 接下来遍历所有数据行并输出格式化的行foreach($Data 中的 $item){$line = foreach($headers 中的 $column){$padding = $columnWidths[$column]# 如果值为数字,则右对齐,否则左对齐如果($item.$column -match $reNumeric){$padding -= $Gap"{0, $padding}{1}" -f $item.$column, (' ' * $Gap)}别的 {"{0, -$padding}" -f $item.$column}}# 输出行$line -join ''}}

有了这些功能,使用你的例子,你可以这样做:

#从文件中获取固定宽度的数据并将其转换为PSObjects数组$data = (Get-Content -Path 'D:\Test.txt') |ConvertFrom-FixedWidth -AllowRaggedContent -Verbose# 现在你可以像这样删除任何列$data = $data |选择对象 * -ExcludeProperty 'AGE'# 显示在屏幕上$数据|格式表 -AutoSize# 保存到磁盘作为新的固定宽度文件$数据|ConvertTo-FixedWidth -Gap 3 -Verbose |设置内容 -Path 'D:\Output.txt'# 或者您可以将其作为常规 CSV 保存到磁盘$数据|导出-CSV -Path 'D:\Output.csv' -NoTypeInformation

屏幕上的结果:

NAME READY STATUS RESTARTS IP 节点指定节点就绪门---- ----- ------ -------- -- ---- -------------- ---------------me-pod-name 2/2 运行 0 10.0.0.10 node1 me-pod-name-2 1/1 Running 0 10.0.1.20 node2 me-pod-name-3 1/1 Running 0 10.0.0.30 node3 

保存到固定宽度文件的结果:

NAME READY STATUS RESTARTS IP 节点指定节点就绪门me-pod-name 2/2 运行 0 10.0.0.10 node1 me-pod-name-2 1/1 Running 0 10.0.1.20 node2 me-pod-name-3 1/1 Running 0 10.0.0.30 node3 

Let's say I am working with a nicely formatted table. Take kubectl output for example:

NAME            READY   STATUS    RESTARTS   AGE     IP          NODE   NOMINATED NODE   READINESS GATES
me-pod-name     2/2     Running   0          6s      10.0.0.10   node1  <none>           <none>
me-pod-name-2   1/1     Running   0          6d18h   10.0.1.20   node2  <none>           <none>
me-pod-name-3   1/1     Running   0          11d     10.0.0.30   node3  <none>           <none>

I tend to watch such output and log changes a lot. In this case, I would like to remove one of the middle columns from the table and still get a nice output. E.g. let's try to remove the AGE column as it changes a lot and is useless for watching when the resource is young:

NAME            READY   STATUS    RESTARTS   IP          NODE   NOMINATED NODE   READINESS GATES
me-pod-name     2/2     Running   0          10.0.0.10   node1  <none>           <none>
me-pod-name-2   1/1     Running   0          10.0.1.20   node2  <none>           <none>
me-pod-name-3   1/1     Running   0          10.0.0.30   node3  <none>           <none>

My question is: How do I easily remove such column and still output nicely formatted table, with all other columns intact? The columns are not always the same size (e.g. age is not always 8 characters wide). I'd like to find some reusable, one-liner solution as I work a lot with the CLI tools (not only k8s related) and are in need of filtering them. Also, I'd like to avoid pattern-based solution, where I need to produce regex for each column I want to remove - I am able to do it, but that requires me writing specific solution for each use case.

I tried to work with ConvertFrom-String and format table, but this messes a lot with a data format (e.g. "1/1" is being treated as a date format which is not true for the case).

What you are looking at is Fixed-Width data. To handle that, the Import-Csv won't do so I made a couple of functions some time ago to convert from and to Fixed-Width data.

function ConvertFrom-FixedWidth {
    [CmdletBinding()]
    [OutputType([PSObject[]])]
    Param(
        [Parameter(Mandatory = $true, Position = 0, ValueFromPipeline = $true)]
        [string[]]$Data,

        [int[]]$ColumnWidths = $null,    # an array of integers containing the width in characters for each field
        [string[]]$Header = $null,       # if given, a string array containing the columnheaders
        [switch]$AllowRaggedContent      # if set, the function accepts the last items to be trimmed.
    )

    # If the data is sent through the pipeline, use $input to collect is as array
    if ($PSCmdlet.MyInvocation.ExpectingInput) { $Data = @($Input) }
    # or use : $Data = $Input | ForEach-Object { $_ }

    if (!$ColumnWidths) {
        Write-Verbose "Calculating column widths using first row"
        # Try and determine the width of each field from the top (header) line.
        # This can only work correctly if the fields in that line do not contain space
        # characters OR if each field is separated from the next by more than 1 space.

        # temporarily replace single spaces in the header row with underscore
        $row1 = $Data[0] -replace '(\S+) (\S+)', '$1_$2'  

        # Get the starting index of each field and add the total length for the last field
        $indices = @(([regex] '\S+').Matches($row1) | ForEach-Object {$_.Index}) + $row1.Length
        # Calculate individual field widths from their index
        $ColumnWidths = (0..($indices.Count -2)) | ForEach-Object { $indices[$_ + 1] - $indices[$_] }
    }

    # Combine the field widths integer array into a regex string like '^(.{10})(.{50})(.{12})'
    $values = $ColumnWidths | ForEach-Object { "(.{$_})" }
    if ($AllowRaggedContent) {
        # account for fields that are too short (possibly by trimming trailing spaces)
        # set the last item to be '(.*)$' to capture any characters left in the string.
        $values[-1] = '(.*)$'
    }
    $regex = '^{0}' -f ($values -join '')

    Write-Verbose "Splitting fields and generating output"
    # Execute a scriptblock to convert each line in the array.
    $csv = & { 
        switch -Regex ($Data) {
            $regex {
                # Join what the capture groups matched with a comma and wrap the fields
                # between double-quotes. Double-quotes inside a fields value must be doubled.
                ($matches[1..($matches.Count - 1)] | ForEach-Object { '"{0}"' -f ($_.Trim() -replace '"', '""') }) -join ','
            }
        }
    }
    if ($Header) { $csv | ConvertFrom-Csv -Header $Header }
    else { $csv | ConvertFrom-Csv }
}

function ConvertTo-FixedWidth {
    [CmdletBinding()]
    [OutputType([String[]])]
    Param (
        [Parameter(Mandatory = $true, Position = 0, ValueFromPipeline = $true)]
        [PSObject[]]$Data,

        [Parameter(Mandatory = $false)]
        [ValidateRange(1, 8)]
        [int]$Gap = 2
    )

    # if the data is sent through the pipeline, use $input to collect is as array
    if ($PSCmdlet.MyInvocation.ExpectingInput) { $Data = @($Input) }
    # or use : $Data = $Input | ForEach-Object { $_ }

    # get the headers from the first object in the correct order
    Write-Verbose "Retrieving column headers"
    $headers = $Data[0].PSObject.Properties | ForEach-Object {$_.Name }

    # calculate the maximum width for each column
    Write-Verbose "Calculating column widths"
    $columnWidths  = @{}
    foreach ($item in $Data) {
        foreach ($column in $headers) {
            $length = [Math]::Max($item.$column.Length, $column.Length)
            if ($column -ne $headers[-1]) { $length += $Gap }
            if ($columnWidths[$column]) { $length = [Math]::Max($columnWidths[$column], $length) }
            $columnWidths[$column] = $length
        }
    }

    # output the headers, all left-aligned
    $line = foreach ($column in $headers) {
        "{0, -$($columnWidths[$column])}" -f $column
    }
    # output the first (header) line
    $line -join ''

    # regex to test for numeric values
    $reNumeric = '^[+-]?\s*(?:\d{1,3}(?:(,?)\d{3})?(?:\1\d{3})*(\.\d*)?|\.\d+)$'

    # next go through all data lines and output formatted rows
    foreach ($item in $Data) {
        $line = foreach ($column in $headers) {
            $padding = $columnWidths[$column]
            # if the value is numeric, align right, otherwise align left
            if ($item.$column -match $reNumeric) { 
                $padding -= $Gap
                "{0, $padding}{1}" -f $item.$column, (' ' * $Gap)
            }
            else {
                "{0, -$padding}" -f $item.$column
            }
        }
        # output the line
        $line -join ''
    }
}

Having these functions in place, using your example, you could do this:

# get the fixed-width data from file and convert it to an array of PSObjects 
$data = (Get-Content -Path 'D:\Test.txt') | ConvertFrom-FixedWidth -AllowRaggedContent -Verbose

# now you can remove any column like this
$data = $data | Select-Object * -ExcludeProperty 'AGE'

# show on screen
$data | Format-Table -AutoSize

# save to disk as new fixed-width file
$data | ConvertTo-FixedWidth -Gap 3 -Verbose | Set-Content -Path 'D:\Output.txt'

# or you can save it as regular CSV to disk
$data | Export-Csv -Path 'D:\Output.csv' -NoTypeInformation

Result on screen:

NAME          READY STATUS  RESTARTS IP        NODE  NOMINATED NODE READINESS GATES
----          ----- ------  -------- --        ----  -------------- ---------------
me-pod-name   2/2   Running 0        10.0.0.10 node1 <none>         <none>         
me-pod-name-2 1/1   Running 0        10.0.1.20 node2 <none>         <none>         
me-pod-name-3 1/1   Running 0        10.0.0.30 node3 <none>         <none>

Result as saved to Fixed-Width file:

NAME            READY   STATUS    RESTARTS   IP          NODE    NOMINATED NODE   READINESS GATES
me-pod-name     2/2     Running          0   10.0.0.10   node1   <none>           <none>         
me-pod-name-2   1/1     Running          0   10.0.1.20   node2   <none>           <none>         
me-pod-name-3   1/1     Running          0   10.0.0.30   node3   <none>           <none>