Some code that reads a .csv file crashes

Updated: 2023-11-14 14:32:04

The loop that's crashing should be more like:

enum { MAX_FIELD_WIDTH = 10 };  // Including null terminator

// ch must be an int (not a char) so that EOF can be detected reliably
i = j = 0;
while ((ch = getc(csvFile)) != EOF)
{
    if (ch == ',' || ch == '\n')
    {
        // End of field: terminate it and move on to the next one
        bigArr[i++][j] = '\0';
        j = 0;
    }
    else
    {
        if (j < MAX_FIELD_WIDTH - 1)
            bigArr[i][j++] = ch;
        // else ignore excess characters
    }
}

Warning: untested code!

Your code is simply creating a linear list of l * c field values, which is fine. You can pick the fields for line n by accessing fields bigArr[n * c] through bigArr[n * c + c - 1] (counting from line 0).
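
As a rough illustration of that indexing, here is a minimal sketch. The function name field_at and its parameters are hypothetical (they do not appear in the original code), and it assumes the fields are stored as fixed-width strings of MAX_FIELD_WIDTH bytes, as in the loop above.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_FIELD_WIDTH = 10 };  /* Including null terminator */

/* Hypothetical helper: return field `col` of line `row` from a flat
   array that holds rows * cols fields, stored line after line. */
const char *field_at(char (*fields)[MAX_FIELD_WIDTH],
                     size_t rows, size_t cols, size_t row, size_t col)
{
    /* Refuse to index outside the table */
    assert(row < rows && col < cols);
    return fields[row * cols + col];
}
```

With 2 lines of 3 columns, field_at(bigArr, 2, 3, 1, 0) would return the first field of the second line, which is bigArr[3].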

For important variables like l and c, I use longer names such as rows (or lines) and cols. Still not long, but more meaningful. Single-character names should be reserved for variables with limited scope.

Note that this code ignores subtleties of the CSV format such as fields with commas inside double quotes, let alone newlines within double-quoted fields. It also ignores the possibility of varying numbers of fields in the lines. If the code kept track of line numbers, it would be possible to handle both too many fields (ignoring the extra) and too few fields (creating empty entries for missing fields). If the code that pre-scans the file were cleverer, it could keep a record of the minimum and maximum number of columns per line as well as the number of lines. Problems could then be diagnosed too.
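
A cleverer pre-scan along those lines might look something like the sketch below. The names csv_shape, record_line and scan_shape are hypothetical, and the code makes the same simplifying assumptions as the loop above: no quoted fields, and an empty line counts as one empty field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical summary of a pre-scan */
struct csv_shape { size_t rows, min_cols, max_cols; };

/* Fold one finished line with `cols` fields into the summary */
static void record_line(struct csv_shape *shape, size_t cols)
{
    if (shape->rows == 0 || cols < shape->min_cols)
        shape->min_cols = cols;
    if (cols > shape->max_cols)
        shape->max_cols = cols;
    shape->rows++;
}

/* Count lines and track the fewest and most fields seen on any line */
struct csv_shape scan_shape(FILE *fp)
{
    struct csv_shape shape = { 0, 0, 0 };
    size_t cols = 1;        /* Fields on the current line */
    int at_line_start = 1;  /* Nothing read on the current line yet */
    int ch;

    assert(fp != NULL);
    while ((ch = getc(fp)) != EOF)
    {
        if (ch == '\n')
        {
            record_line(&shape, cols);
            cols = 1;
            at_line_start = 1;
        }
        else
        {
            at_line_start = 0;
            if (ch == ',')
                cols++;
        }
    }
    if (!at_line_start)     /* Last line had no trailing newline */
        record_line(&shape, cols);
    return shape;
}
```

If min_cols and max_cols differ after the scan, the file is ragged, and the caller can decide whether to pad short lines, truncate long ones, or report an error.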

With a more complex memory management scheme, it would also be possible to scan the file just once, which has advantages if the file is actually a terminal or pipe, rather than a disk file. It could also handle arbitrarily long field values instead of restricting them to 10 bytes including the terminating null byte.
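
One possible building block for such a scheme is a function that reads a single field into a buffer that grows on demand. This is only a sketch: the name read_field, the initial capacity of 16 and the doubling strategy are all assumptions, and quoted fields are still not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one field of any length into a malloc'd
   string. The character that ended the field (',', '\n' or EOF) is
   stored in *delim. Returns NULL if memory runs out; otherwise the
   caller must free() the result. */
char *read_field(FILE *fp, int *delim)
{
    size_t cap = 16;
    size_t len = 0;
    char *buf;
    int ch;

    assert(fp != NULL && delim != NULL);
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    while ((ch = getc(fp)) != EOF && ch != ',' && ch != '\n')
    {
        if (len + 1 == cap)  /* Keep one byte free for the terminator */
        {
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL)
            {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';
    *delim = ch;
    return buf;
}
```

Calling this repeatedly and appending the returned pointers to a growing array of char * would allow the file to be read in one pass; a *delim of '\n' or EOF marks the end of a line.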

The code should check that the file could be opened, and close it when it is finished. The current function interface is:

int changeValue(int line, int col, char str[], const char* path)

but the first three values are ignored by the code shown. This is probably because the final code will change one of the values read and then rewrite the file. Presumably, it would report an error if asked to change a non-existent column or line. These relatively minor infelicities are probably due to the minimization to make the code resemble an MCVE (How to create a Minimal, Complete, and Verifiable Example?).
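
For the open-check-close advice, a minimal sketch of the pattern is shown here. The function count_lines is hypothetical and is not the questioner's changeValue; it only illustrates checking the result of fopen and closing the file on the way out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical example: count newline characters in the file at `path`.
   Returns -1 if the file cannot be opened or closed cleanly. */
int count_lines(const char *path)
{
    FILE *csvFile;
    int lines = 0;
    int ch;

    assert(path != NULL);
    csvFile = fopen(path, "r");
    if (csvFile == NULL)
    {
        perror(path);   /* Report why the open failed */
        return -1;
    }

    while ((ch = getc(csvFile)) != EOF)
    {
        if (ch == '\n')
            lines++;
    }

    if (fclose(csvFile) != 0)
        return -1;
    return lines;
}
```

A function like changeValue could compare its line argument against such a count, and its col argument against the number of columns, and return an error code when either is out of range.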