Updated: 2023-11-29 20:09:04
I like to use the following code for exactly this task:
fid = fopen('someTextFile.txt', 'rb');
%# Get file size.
fseek(fid, 0, 'eof');
fileSize = ftell(fid);
frewind(fid);
%# Read the whole file.
data = fread(fid, fileSize, 'uint8');
%# Count number of line-feeds and increase by one.
numLines = sum(data == 10) + 1;
fclose(fid);
It is pretty fast if you have enough memory to read the whole file at once. It should work for both Windows- and Linux-style line endings.
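If the file is too large to read into memory at once, the same line-feed-counting idea works in fixed-size blocks. A minimal sketch (the 1 MB block size is an arbitrary choice, not from the original answer):

```matlab
fid = fopen('someTextFile.txt', 'rb');
blockSize = 1024*1024;      %# Read 1 MB at a time; tune to taste.
numLines = 1;               %# The last line has no trailing line feed.
while ~feof(fid)
    data = fread(fid, blockSize, 'uint8');
    numLines = numLines + sum(data == 10);
end
fclose(fid);
```

This keeps at most one block in memory while giving the same count as the whole-file version.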
I measured the performance of the answers provided so far. Here is the result for determining the number of lines of a text file containing 1 million double values (one value per line). Average over 10 tries.
Author Mean time +- standard deviation (s)
------------------------------------------------------
Rody Oldenhuis 0.3189 +- 0.0314
Edric (2) 0.3282 +- 0.0248
Mehrwolf 0.4075 +- 0.0178
Jonas 1.0813 +- 0.0665
Edric (1) 26.8825 +- 0.6790
So the fastest are the approaches using Perl and reading the whole file as binary data. I would not be surprised if Perl internally also reads large blocks of the file at once instead of looping through it line by line (just a guess; I don't know anything about Perl).
Using a simple fgetl() loop is slower than the other approaches by a factor of 25-75.
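For reference, the slow fgetl() baseline being compared is presumably something like this minimal sketch (my reconstruction, not code from the original answers):

```matlab
fid = fopen('someTextFile.txt', 'rt');
numLines = 0;
%# fgetl returns a char array per line and -1 at end of file,
%# so ischar() serves as the loop condition.
while ischar(fgetl(fid))
    numLines = numLines + 1;
end
fclose(fid);
```

Each fgetl call involves per-line overhead in the MATLAB interpreter, which is why it loses so badly to the single bulk fread above.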
Edit 2: Included Edric's 2nd approach, which is much faster and on par with the Perl solution, I'd say.