SQL Server 2008 - Add an XML declaration to XML output

Updated: 2022-11-28 12:47:06

TL;DR

Concatenate this: <?xml version="1.0" encoding="windows-1252" ?> with your XML, converted to varchar(max).

I agree with j0N45 that the schema will not change anything. As the answer he references points out:

"You have to add it manually."

I provided some example code to do so in another answer. Basically, you CONVERT the XML into varchar or nvarchar and then concatenate it with the XML declaration, such as <?xml version="1.0" encoding="windows-1252" ?>.
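As a minimal sketch of that concatenation (the variable name @x and the sample document are illustrative assumptions, not taken from the original answer), the non-Unicode case might look like this:

```sql
-- Hypothetical sample XML; in practice this would be your own XML value or column.
DECLARE @x xml = N'<test>example</test>';

-- Convert the XML to a non-Unicode string, then prepend the declaration by hand.
SELECT '<?xml version="1.0" encoding="windows-1252" ?>'
       + CONVERT(varchar(max), @x) AS XmlWithDeclaration;
```

The result is a varchar(max) string, not an xml value, so it is suited to output rather than to further XML processing in SQL.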

However, it's important to choose the right encoding. SQL Server produces non-Unicode strings according to its collation settings. By default, that will be governed by the database collation settings, which you can determine using this SQL:

SELECT DATABASEPROPERTYEX('ExampleDatabaseName', 'Collation');

A common default collation is "SQL_Latin1_General_CP1_CI_AS", which has a code page of 1252. You can retrieve the code page with this SQL:

SELECT COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS 'CodePage';
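The two lookups above can probably be combined into one query against the current database. This is a sketch, assuming you are connected to the database in question. DB_NAME() supplies its name, and the explicit CONVERT is there because DATABASEPROPERTYEX returns a sql_variant:

```sql
-- Look up the current database's collation, then that collation's code page.
SELECT COLLATIONPROPERTY(
           CONVERT(nvarchar(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation')),
           'CodePage') AS CodePage;
```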

For code page 1252, you should use an encoding name of "windows-1252". The use of "ISO-8859-1" is inaccurate. You can test that using the "bullet" character: •. It has a Unicode Code Point value of 8226 (Hex 2022). You can generate the character in SQL reliably, regardless of collation, using this code:

SELECT NCHAR(8226);

It also has a code point of 149 in the windows-1252 code page, so if you are using the common default collation "SQL_Latin1_General_CP1_CI_AS", you can also produce it using:

SELECT CHAR(149);

However, CHAR(149) won't be a bullet in all collations. For example, if you try this:

SELECT CONVERT(char(1),char(149)) COLLATE Chinese_Hong_Kong_Stroke_90_BIN;

You don't get a bullet at all.
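One way to check what CHAR(149) resolves to in your own database is to convert it to nvarchar and inspect the Unicode code point. This is a sketch, and the result depends on the database's default collation. On a code page 1252 database it should match the bullet's code point of 8226.

```sql
-- CHAR(149) is interpreted using the database's default code page;
-- converting to nvarchar maps it to Unicode, and UNICODE() returns the code point.
SELECT UNICODE(CONVERT(nvarchar(1), CHAR(149))) AS UnicodeCodePoint;
```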

The "ISO-8859-1" code page is Windows-28591. None of the SQL Server collations (in 2005 anyway) use that code page. You can get a full list of code pages using:

SELECT [Name], [Description], [CodePage] = COLLATIONPROPERTY([Name], 'CodePage')
FROM ::fn_helpcollations()
ORDER BY [CodePage] DESC;

You can further verify that "ISO-8859-1" is the wrong choice by trying to use it in SQL itself. The following SQL:

SELECT CONVERT(xml,'<?xml version="1.0" encoding="ISO-8859-1"?><test>•</test>');

will produce XML that does not contain a bullet. Indeed, it won't produce any character, because ISO-8859-1 has no character defined for code point 149.

SQL Server handles Unicode strings differently. With Unicode strings (nvarchar), "there is no need for different code pages to handle different sets of characters". However, SQL Server does NOT use "UTF-8" encoding. If you try to use it within SQL itself:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UTF-8"?><test>•</test>');

You will get an error:

Msg 9402, Level 16, State 1, Line 1 XML parsing: line 1, character 38, unable to switch the encoding

Rather, SQL uses "UCS-2" encoding, so this will work:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UCS-2"?><test>•</test>');
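For Unicode output, the same concatenation approach can be sketched with nvarchar and a "UCS-2" declaration. The variable name @x and the sample document are illustrative assumptions. NCHAR(8226) is used so the bullet does not depend on how the script file itself is encoded.

```sql
-- Hypothetical sample XML containing a bullet character.
DECLARE @x xml = N'<test>' + NCHAR(8226) + N'</test>';

-- Convert to a Unicode string and prepend a declaration that matches
-- the encoding SQL Server uses for nvarchar data.
SELECT N'<?xml version="1.0" encoding="UCS-2"?>'
       + CONVERT(nvarchar(max), @x) AS XmlWithDeclaration;
```

Whatever consumes this string needs to read it as UCS-2/UTF-16 text. If the output is later re-encoded (for example, written to a UTF-8 file), the declaration would need to change to match.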