Reading large XLS and XLSX files

Updated: 2022-12-08 11:18:20

Okay, so I tried replicating your Excel file, and I threw XLSX2CSV out the window completely. I don't think converting the XLSX to CSV is the right approach: depending on how your XLSX is formatted, the converter can read all the empty rows as well (you probably know that already, since you've set a row counter of 60k). On top of that, fields containing special characters may or may not come out wrong, which is the problem you're seeing.
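To illustrate the special-character concern, here is a minimal Python sketch of one way a CSV conversion can corrupt fields. It is not a claim about what XLSX2CSV does internally; the sample row is made up, and the "naive" join simply stands in for any converter that does not quote its output properly.

```python
import csv
import io

# Hypothetical row with a comma, double quotes, and an embedded newline.
row = ["id-1", 'He said "hi", then left', "line one\nline two"]

# Naive conversion: join on commas without any quoting.
naive_line = ",".join(row)
naive_fields = naive_line.split(",")  # the comma inside field 2 splits it

# Quoted conversion: the csv module escapes commas, quotes, and newlines.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
buffer.seek(0)
round_tripped = next(csv.reader(buffer))

print(len(row), len(naive_fields))  # field counts before and after the naive join
print(round_tripped == row)         # whether the quoted round trip preserved the row
```

Even with correct quoting, the CSV step adds a lossy intermediate format, which is why skipping it can be simpler.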

What I did instead was use this library, https://github.com/davidpelfree/sjxlsx, to read the file and write it back out. It's pretty straightforward, and the fields in the newly generated XLSX file come out corrected.

I suggest you try this approach (not necessarily with this library): re-write the file in order to correct it.
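As a rough sketch of the same read-and-rewrite idea with a different library, the following uses Python and openpyxl rather than sjxlsx. This is a swapped-in tool, not the code I actually ran: the function names `is_empty_row`, `non_empty_rows`, and `rewrite_workbook` are made up for illustration, and I'm assuming that dropping completely empty rows is acceptable for your data.

```python
from typing import Iterable, Iterator, Sequence


def is_empty_row(row: Sequence[object]) -> bool:
    """True when every cell is None or a whitespace-only string."""
    return all(
        cell is None or (isinstance(cell, str) and not cell.strip())
        for cell in row
    )


def non_empty_rows(rows: Iterable[Sequence[object]]) -> Iterator[Sequence[object]]:
    """Yield only the rows that contain at least one value."""
    for row in rows:
        if not is_empty_row(row):
            yield row


def rewrite_workbook(src_path: str, dst_path: str) -> None:
    """Stream src_path into a fresh workbook at dst_path, skipping empty rows."""
    # Imported here so the helpers above work without openpyxl installed.
    from openpyxl import Workbook, load_workbook

    # read_only / write_only keep memory use low on large files.
    source = load_workbook(src_path, read_only=True)
    target = Workbook(write_only=True)
    try:
        for sheet in source.worksheets:
            out = target.create_sheet(title=sheet.title)
            for row in non_empty_rows(sheet.iter_rows(values_only=True)):
                out.append(row)
        target.save(dst_path)
    finally:
        source.close()
```

Calling `rewrite_workbook("input.xlsx", "output.xlsx")` (placeholder file names) would produce a new file containing only the populated rows. Note that this copies cell values only, so formatting is lost and row positions shift wherever empty rows are dropped. openpyxl also does not read the legacy .xls format, so those files would need a different reader.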