Reading multiple lines with different data types in C

Updated: 2023-01-30 15:57:29

Note, I take your post to be reading values from 3 different lines, e.g.:

%s
%s
%d %d

(This is primarily evidenced by your use of fgets, a line-oriented input function that reads a line of input, up to and including the '\n', each time it is called.) If that is not the case, then the following does not apply (and can be greatly simplified).

Since you are reading multiple values into a single element in an array of struct, you may find it better (and more robust) to read and validate each value using temporary variables before you start copying information into the structure members themselves. This allows you to (1) validate the read of all values, and (2) validate the parse, or conversion, of all required values before storing members in your struct and incrementing your array index.

Additionally, you will need to remove the trailing '\n' from both title and artist to prevent having newlines dangling off the end of your strings (which will cause havoc when searching for either a title or an artist). For instance, putting it all together, you could do something like:

void rmlf (char *s);
...
char title[MAX_TITLE] = "";
char artist[MAX_ARTIST] = "";
char a[10] = "";
int min, sec;
...
while (fgets (title, MAX_TITLE, file) &&     /* validate read of values */
       fgets (artist, MAX_ARTIST, file) &&
       fgets (a, 10, file)) {

    if (sscanf (a, "%d %d", &min, &sec) != 2) {  /* validate conversion */
        fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
        continue;  /* skip line - tailor to your needs */
    }

    rmlf (title);   /* remove trailing newline */
    rmlf (artist);

    s[i].time.min = min;    /* copy to struct members & increment index */
    s[i].time.sec = sec;
    strncpy (s[i].title, title, MAX_TITLE);
    strncpy (s[i++].artist, artist, MAX_ARTIST);
}

/** remove trailing newline from 's'. */
void rmlf (char *s)
{
    if (!s || !*s) return;
    for (; *s && *s != '\n'; s++) {}
    *s = 0;
}

(note: this loop also reads all values until EOF is encountered without using feof; see the related link: Why is "while ( !feof (file) )" always wrong?)
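As an aside, the hand-written loop in rmlf can also be expressed with the standard strcspn function, which returns the number of characters before the first '\n'. The following is only a minimal sketch of that alternative; the name trim_newline and its return value are assumptions made for illustration, not part of the code above.

```c
#include <string.h>

/* Trim the first '\n' (and anything after it) from 's'.
 * Returns 1 if a newline was found and removed, 0 otherwise.
 * (trim_newline is a hypothetical name used only in this sketch) */
int trim_newline (char *s)
{
    size_t n;

    if (!s) return 0;

    n = strcspn (s, "\n");          /* count of chars before first '\n' */
    if (s[n] != '\n') return 0;     /* no newline present, leave 's' alone */

    s[n] = 0;                       /* overwrite '\n' with nul-terminator */
    return 1;
}
```

Returning whether a newline was found gives the caller a cheap hint about whether the whole line fit in the buffer.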

Protecting Against a Short Read with fgets

Following on from Jonathan's comment, when using fgets you should really check to ensure you have actually read the entire line and have not experienced a short read, where the maximum number of characters you supply is not sufficient to read the entire line (so characters in that line remain unread).

If a short read occurs, that will completely destroy your ability to read any further lines from the file, unless you handle the failure correctly. This is because the next attempt to read will NOT start reading on the line you think it is reading and instead attempt to read the remaining characters of the line where the short read occurred.

You can validate a read by fgets by checking that the last character read into your buffer is in fact a '\n' character. (If the line is longer than the maximum you specify, the last character before the nul-terminating character will be an ordinary character instead.) If a short read is encountered, you must then read and discard the remaining characters in the long line before continuing with your next read. (The exception is a dynamically allocated buffer, where you can simply realloc as required to read the remainder of the line, and your data structure.)
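To illustrate the dynamically allocated alternative mentioned in passing above, here is a rough sketch of a reader that keeps calling fgets and growing the buffer with realloc until the '\n' arrives. The names read_line and CHUNK are hypothetical, and the doubling strategy is just one reasonable choice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { CHUNK = 16 };    /* initial buffer size (assumed for this sketch) */

/* Read one complete line from 'fp' into a heap buffer, growing it as needed.
 * Returns an allocated string without the trailing '\n' (caller must free),
 * or NULL on EOF with nothing read or on allocation failure.
 * (read_line is a hypothetical helper, not part of the original code) */
char *read_line (FILE *fp)
{
    size_t cap = CHUNK, len = 0;
    char *buf = malloc (cap), *tmp;

    if (!buf) return NULL;

    while (fgets (buf + len, (int)(cap - len), fp)) {
        len += strlen (buf + len);

        if (len && buf[len - 1] == '\n') {  /* complete line was read */
            buf[--len] = 0;
            return buf;
        }
        if (len + 1 < cap)      /* buffer not full, so EOF ended the line */
            return buf;

        tmp = realloc (buf, cap * 2);   /* short read: grow and continue */
        if (!tmp) {
            free (buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (len) return buf;    /* final line that exactly filled the buffer */

    free (buf);
    return NULL;            /* EOF (or error) before any characters read */
}
```

With a reader like this no line is ever too long, but you would still need to decide what to do when a title or artist does not fit in the fixed-size struct members.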

Your situation complicates the validation by requiring data from 3 lines of the input file for each struct element. You must always keep your 3-line read in sync, reading all 3 lines as a group during each iteration of your read loop (even if a short read occurs). That means you must validate that all 3 lines were read and that no short read occurred in order to handle any one short read without exiting your input loop. (You can validate each read individually if you just want to terminate input on any one short read, but that leads to a very inflexible input routine.)

You can tweak the rmlf function above to a function that validates each read by fgets in addition to removing the trailing newline from the input. I have done that below in a function called, surprisingly, shortread. The tweaks to the original function and read loop could be coded something like this:

int shortread (char *s, FILE *fp);
...
    for (idx = 0; idx < MAX_SONGS;) {

        int t, a, b;
        t = a = b = 0;

        /* validate fgets read of complete line */
        if (!fgets (title, MAX_TITLE, fp)) break;
        t = shortread (title, fp);

        if (!fgets (artist, MAX_ARTIST, fp)) break;
        a = shortread (artist, fp);

        if (!fgets (buf, MAX_MINSEC, fp)) break;
        b = shortread (buf, fp);

        if (t || a || b) continue;  /* if any shortread, skip */

        if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
            fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
            continue;  /* skip line - tailor to your needs */
        }

        s[idx].time.min = min;   /* copy to struct members & increment index */
        s[idx].time.sec = sec;
        strncpy (s[idx].title, title, MAX_TITLE);
        strncpy (s[idx].artist, artist, MAX_ARTIST);
        idx++;
    }
...
/** validate complete line read, remove trailing newline from 's'.
 *  returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
 *  if shortread, read/discard remainder of long line.
 */
int shortread (char *s, FILE *fp)
{
    if (!s || !*s) return -1;
    for (; *s && *s != '\n'; s++) {}
    if (*s != '\n') {
        int c;
        while ((c = fgetc (fp)) != '\n' && c != EOF) {}
        return 1;
    }
    *s = 0;
    return 0;
}

(note: in the example above, the result of the shortread check is saved for each of the lines that make up a title, artist, time group, and the whole group is skipped if any one of them was a short read.)
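One edge case worth knowing about: shortread treats any buffer that does not end in '\n' as a short read, so a final line in the file that has no trailing newline would be rejected even though it fit. If that matters for your data, a variant that is also told the buffer size can tell the two cases apart. This is only a sketch under that assumption; shortread_sz and its size parameter are hypothetical and do not appear in the code above.

```c
#include <stdio.h>
#include <string.h>

/* Variant of shortread that is also given the buffer size passed to fgets.
 * Returns 1 on a short read (remainder of the long line is discarded),
 * 0 on a complete line (trailing '\n' removed if present),
 * -1 on a NULL or empty string.
 * (shortread_sz is a hypothetical name used only in this sketch) */
int shortread_sz (char *s, size_t size, FILE *fp)
{
    size_t len;
    int c;

    if (!s || !*s) return -1;

    len = strlen (s);
    if (s[len - 1] == '\n') {       /* complete line, trim the newline */
        s[len - 1] = 0;
        return 0;
    }
    if (len + 1 < size)             /* buffer not full: EOF ended the line */
        return 0;

    c = fgetc (fp);                 /* buffer full: look at the next char */
    if (c == '\n' || c == EOF)      /* line fit exactly, nothing was lost */
        return 0;

    while ((c = fgetc (fp)) != '\n' && c != EOF) {}  /* discard the rest */
    return 1;
}
```

A call would look like `t = shortread_sz (title, MAX_TITLE, fp);`, with the rest of the read loop unchanged.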

To validate the approach I put together a short example that will help put it all in context. Look over the example and let me know if you have any further questions.

#include <stdio.h>
#include <string.h>

/* constant definitions */
enum { MAX_MINSEC = 10, MAX_ARTIST = 32, MAX_TITLE = 48, MAX_SONGS = 64 };

typedef struct {
    int min;
    int sec;
} stime;

typedef struct {
    char title[MAX_TITLE];
    char artist[MAX_ARTIST];
    stime time;
} songs;

int shortread (char *s, FILE *fp);

int main (int argc, char **argv) {

    char title[MAX_TITLE] = "";
    char artist[MAX_ARTIST] = "";
    char buf[MAX_MINSEC] = "";
    int  i, idx, min, sec;
    songs s[MAX_SONGS] = {{ .title = "", .artist = "" }};
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    for (idx = 0; idx < MAX_SONGS;) {

        int t, a, b;
        t = a = b = 0;

        /* validate fgets read of complete line */
        if (!fgets (title, MAX_TITLE, fp)) break;
        t = shortread (title, fp);

        if (!fgets (artist, MAX_ARTIST, fp)) break;
        a = shortread (artist, fp);

        if (!fgets (buf, MAX_MINSEC, fp)) break;
        b = shortread (buf, fp);

        if (t || a || b) continue;  /* if any shortread, skip */

        if (sscanf (buf, "%d %d", &min, &sec) != 2) { /* validate conversion */
            fprintf (stderr, "error: failed to parse 'min' 'sec'.\n");
            continue;  /* skip line - tailor to your needs */
        }

        s[idx].time.min = min;   /* copy to struct members & increment index */
        s[idx].time.sec = sec;
        strncpy (s[idx].title, title, MAX_TITLE);
        strncpy (s[idx].artist, artist, MAX_ARTIST);
        idx++;
    }
    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (i = 0; i < idx; i++)   /* zero-pad seconds so 3 min 5 sec prints as 3:05 */
        printf (" %2d:%02d  %-32s  %s\n", s[i].time.min, s[i].time.sec,
                s[i].artist, s[i].title);

    return 0;
}

/** validate complete line read, remove trailing newline from 's'.
 *  returns 1 on shortread, 0 - valid read, -1 invalid/empty string.
 *  if shortread, read/discard remainder of long line.
 */
int shortread (char *s, FILE *fp)
{
    if (!s || !*s) return -1;
    for (; *s && *s != '\n'; s++) {}
    if (*s != '\n') {
        int c;
        while ((c = fgetc (fp)) != '\n' && c != EOF) {}
        return 1;
    }
    *s = 0;
    return 0;
}

Example Input

$ cat ../dat/titleartist.txt
First Title I Like
First Artist I Like
3 40
Second Title That Is Way Way Too Long To Fit In MAX_TITLE Characters
Second Artist is Fine
12 43
Third Title is Fine
Third Artist is Way Way Too Long To Fit in MAX_ARTIST
3 23
Fourth Title is Good
Fourth Artist is Good
32274 558212 (too long for MAX_MINSEC)
Fifth Title is Good
Fifth Artist is Good
4 27

Example Use/Output

$ ./bin/titleartist <../dat/titleartist.txt
  3:40  First Artist I Like               First Title I Like
  4:27  Fifth Artist is Good              Fifth Title is Good