且构网

分享程序员开发的那些事...
且构网 - 分享程序员编程开发的那些事

使用fputcsv()/fgetcsv()写入csv时,数据出现乱码

更新时间:2023-11-27 12:40:40

罪魁祸首是fputcsv()使用转义字符,这是CSV的非标准扩展. (嗯,就RFC 7111而言,它可以视为标准.)基本上,必须禁用此转义字符,但是将空字符串作为$ escape传递给fputcsv()无效.通常,传递NUL字符应该可以达到预期的效果,但是,请参见 https://3v4l.org/MlluN

There seems to be an encoding issue or bug in PHP with fputcsv() and fgetcsv().

The following PHP code:

$row_before = ['A', json_encode(['a', '\\', 'b']), 'B'];

print "\nBEFORE:\n";
var_export($row_before);
print "\n";

$fh = fopen($file = 'php://temp', 'rb+');

fputcsv($fh, $row_before);

rewind($fh);

$row_after = fgetcsv($fh);

print "\nAFTER:\n";
var_export($row_after);
print "\n\n";

fclose($fh);

Gives me this output:

BEFORE:
array (
  0 => 'A',
  1 => '["a","\\\\","b"]',
  2 => 'B',
)

AFTER:
array (
  0 => 'A',
  1 => '["a","\\\\',
  2 => 'b""]"',
  3 => 'B',
)

So clearly, the data is damaged on the way. Originally there were just 3 cells in the row, afterwards there are 4 cells in the row. The middle cell is split thanks to the backslash that is also used as an escape character.

See also https://3v4l.org/nc1oE Or here, with explicit values for delimiter, enclosure, escape_char: https://3v4l.org/Svt7m

Is there any way I can sanitize / escape my data before writing to CSV, to guarantee that the data read from the file will be exactly the same?

Is CSV a fully reversible format?

EDIT: The goal would be a mechanism to properly write and read ANY data as csv, so that after one round trip the data is still the same.

EDIT: I realize that I do not really understand the $escape_char parameter. See also fgetcsv/fputcsv $escape parameter fundamentally broken Maybe an answer to this would also bring us closer to a solution.

The culprit is that fputcsv() uses an escape character, which is a non-standard extension to CSV. (Well, as far as RFC 7111 can be regarded as standard.) Basically, this escape character would have to be disabled, but passing an empty string as $escape to fputcsv() doesn't work. Usually, passing a NUL character should give the desired results, however, see https://3v4l.org/MlluN.