PowerShell Invoke-RestMethod: umlaut problems related to UTF-8 and Windows-1252

Updated: 2023-11-27 13:50:40

As discussed in the comments, it looks like the Confluence API encodes its HTTP responses as UTF-8 but does not say so in the "Content-Type" header.

The HTTP specification for the charset parameter says that if it is missing, the client should assume the ISO-8859-1 character set, so what happens to your request is something like this:

# server (Confluence API) encodes response text using utf8
PS> $text = "ü";
PS> $bytes = [System.Text.Encoding]::UTF8.GetBytes($text);
PS> write-host $bytes;
195 188

# client (Invoke-RestMethod) decodes bytes as ISO-8859-1
PS> $text = [System.Text.Encoding]::GetEncoding("ISO-8859-1").GetString($bytes);
PS> write-host $text;
Ã¼

Given that you can't control what the server sends, you'll need to either capture the raw bytes yourself (e.g. using System.Net.Http.HttpClient) and decode them as UTF-8, or modify the existing response to compensate for the encoding mismatch (e.g. below).

PS> $text = "Ã¼"
PS> $bytes = [System.Text.Encoding]::GetEncoding("ISO-8859-1").GetBytes($text)
PS> $text = [System.Text.Encoding]::UTF8.GetString($bytes)
PS> write-host $text
ü
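
The other option mentioned above, capturing the raw bytes and decoding them yourself, might look roughly like the sketch below. It is only an illustration: Get-Utf8Text and ConvertFrom-Utf8Bytes are made-up helper names, no authentication headers are set (a real Confluence call would most likely need them), and it assumes the response body really is UTF-8.

```powershell
# Sketch only: Get-Utf8Text and ConvertFrom-Utf8Bytes are hypothetical helpers,
# not part of PowerShell or of any Confluence tooling.

# Windows PowerShell 5.1 may not have the HttpClient assembly loaded yet.
if (-not ('System.Net.Http.HttpClient' -as [type])) {
    Add-Type -AssemblyName System.Net.Http
}

function ConvertFrom-Utf8Bytes {
    param([Parameter(Mandatory)][byte[]] $Bytes)
    # Decode explicitly as UTF-8 instead of trusting the response headers.
    [System.Text.Encoding]::UTF8.GetString($Bytes)
}

function Get-Utf8Text {
    param([Parameter(Mandatory)][string] $Uri)
    $client = [System.Net.Http.HttpClient]::new()
    try {
        # GetByteArrayAsync returns the response body as undecoded bytes.
        $bytes = $client.GetByteArrayAsync($Uri).GetAwaiter().GetResult()
        ConvertFrom-Utf8Bytes -Bytes $bytes
    }
    finally {
        $client.Dispose()
    }
}
```

With a real endpoint in $uri, something like `Get-Utf8Text -Uri $uri | ConvertFrom-Json` should then produce the same kind of object Invoke-RestMethod returns, with the umlauts intact.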

Note that if you use the -OutFile parameter of Invoke-RestMethod, it presumably streams the response bytes directly to disk without decoding or re-encoding them, so the resulting file already contains the original UTF-8 bytes, rather than the result of the chain: UTF-8 bytes -> string decoded as ISO-8859-1 -> file bytes encoded as UTF-8.
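
If that assumption about -OutFile holds, reading the saved file back with an explicit encoding should be enough. A minimal sketch, where Read-Utf8JsonFile is a hypothetical helper name and the response is assumed to be JSON:

```powershell
# Sketch only: Read-Utf8JsonFile is a hypothetical helper name.
function Read-Utf8JsonFile {
    param([Parameter(Mandatory)][string] $Path)
    # -Encoding UTF8 makes Get-Content decode the file bytes as UTF-8;
    # -Raw returns the whole file as one string for ConvertFrom-Json.
    Get-Content -LiteralPath $Path -Raw -Encoding UTF8 | ConvertFrom-Json
}

# Intended use against a real endpoint (placeholder variables, not run here):
#   Invoke-RestMethod -Uri $uri -OutFile $tmp
#   $data = Read-Utf8JsonFile -Path $tmp
```

This avoids the round trip through a wrongly decoded string entirely, at the cost of a temporary file.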