Updated: 2023-11-27 14:08:34
I'm migrating some data from MS Access 2003 to MySQL 5.0 using Ruby 1.8.6 on Windows XP (writing a Rake task to do this).
It turns out the Windows string data is encoded as windows-1252, while Rails and MySQL both assume utf-8 input, so some of the characters, such as apostrophes, are getting mangled. They wind up as "a"s with an accent over them and stuff like that.
Does anyone know of a tool, library, system, methodology, ritual, spell, or incantation to convert a windows-1252 string to utf-8?
For Ruby 1.8.6, it appears you can use Iconv, which is part of the standard library.
According to this helpful article, you can at least purge byte sequences that are not valid UTF-8 (such as stray win-1252 characters) from your string like so:
ic = Iconv.new('UTF-8//IGNORE', 'UTF-8')
valid_string = ic.iconv(untrusted_string + ' ')[0..-2]
One might then attempt to do a full conversion like so:
ic = Iconv.new('UTF-8', 'WINDOWS-1252')
valid_string = ic.iconv(untrusted_string + ' ')[0..-2]
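For anyone who needs the same conversion on a newer Ruby, here is a minimal sketch that wraps it in one method. It is an assumption on my part rather than anything from the article: the name cp1252_to_utf8 is made up for illustration, and the String#encode branch only applies on Ruby 1.9 and later, where Iconv is deprecated (it was removed from the standard library in 2.0). On 1.8 it falls back to Iconv.conv.

```ruby
# Hypothetical helper (the name cp1252_to_utf8 is ours, not from the thread).
# Converts a string holding windows-1252 bytes into UTF-8.
def cp1252_to_utf8(str)
  if str.respond_to?(:force_encoding)
    # Ruby 1.9+: label the raw bytes as Windows-1252, then transcode.
    # Bytes with no windows-1252 meaning (e.g. 0x81) are replaced with '?'
    # instead of raising an error.
    str.dup.force_encoding('Windows-1252').
      encode('UTF-8', :invalid => :replace, :undef => :replace, :replace => '?')
  else
    # Ruby 1.8: fall back to Iconv from the standard library.
    require 'iconv'
    Iconv.conv('UTF-8', 'WINDOWS-1252', str)
  end
end

# 0x92 is the windows-1252 "curly" apostrophe (U+2019),
# which is E2 80 99 in UTF-8.
converted = cp1252_to_utf8("it\x92s")
puts converted.unpack('C*').inspect
```

In a Rake task you would call this on each string column read from Access before handing the record to Rails. Unlike the Iconv snippets above, this sketch does not need the trailing-space trick, at least on the String#encode path.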