SQL Server 2008 - Adding an XML declaration to XML output

Updated: 2022-11-27 20:59:23

TL;DR

Concatenate this: <?xml version="1.0" encoding="windows-1252" ?> with your XML, converted to varchar(max).

I agree with j0N45 that the schema will not change anything. As the answer he references points out:


You have to add it manually.

I provided some example code to do so in another answer. Basically, you CONVERT the XML into varchar or nvarchar and then concatenate it with the XML declaration, such as <?xml version="1.0" encoding="windows-1252" ?>.
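
A minimal sketch of that concatenation follows. The variable `@doc` and its contents are hypothetical placeholders; substitute your own xml value or the result of a FOR XML query.

```sql
-- Hypothetical sample document; replace with your own xml value.
DECLARE @doc xml = '<test>example</test>';

-- An xml value cannot be concatenated directly, so convert it to varchar(max) first.
SELECT '<?xml version="1.0" encoding="windows-1252" ?>'
     + CONVERT(varchar(max), @doc) AS XmlWithDeclaration;
```

The encoding name in the declaration is only correct if it matches the code page SQL Server uses for the varchar result, which is why the collation matters.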

However, it's important to choose the right encoding. SQL Server produces non-Unicode strings according to its collation settings. By default, that will be governed by the database collation settings, which you can determine using this SQL:

SELECT DATABASEPROPERTYEX('ExampleDatabaseName', 'Collation');

A common default collation is "SQL_Latin1_General_CP1_CI_AS", which has a code page of 1252. You can retrieve the code page with this SQL:

SELECT COLLATIONPROPERTY('SQL_Latin1_General_CP1_CI_AS', 'CodePage') AS 'CodePage';
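
The two lookups can also be combined so that no collation name has to be typed by hand. This is a sketch that assumes you want the code page of the current database, identified with `DB_NAME()`:

```sql
-- DATABASEPROPERTYEX returns sql_variant, so convert it explicitly
-- before passing the collation name to COLLATIONPROPERTY.
SELECT COLLATIONPROPERTY(
           CONVERT(nvarchar(128), DATABASEPROPERTYEX(DB_NAME(), 'Collation')),
           'CodePage') AS CodePage;
```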

For code page 1252, you should use an encoding name of "windows-1252". The use of "ISO-8859-1" is inaccurate. You can test that using the "bullet" character: •. It has a Unicode Code Point value of 8226 (Hex 2022). You can generate the character in SQL reliably, regardless of collation, using this code:

SELECT NCHAR(8226);

It also has a code point of 149 in the windows-1252 code page, so if you are using the common default collation of "SQL_Latin1_General_CP1_CI_AS", you can also produce it using:

SELECT CHAR(149);

However, CHAR(149) won't be a bullet in all collations. For example, if you try this:

SELECT CONVERT(char(1),char(149)) COLLATE Chinese_Hong_Kong_Stroke_90_BIN;

You don't get a bullet at all.
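
One way to check this on your own server is to compare the Unicode code points of the two expressions. The sketch below assumes it runs in a database whose default collation uses code page 1252; with other code pages the second column would be expected to differ.

```sql
SELECT UNICODE(NCHAR(8226)) AS FromNchar,  -- 8226 regardless of collation
       UNICODE(CHAR(149))   AS FromChar;   -- should match only when the code page is 1252
```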

The "ISO-8859-1" code page is Windows-28591. None of the SQL Server collations (in 2005 anyway) use that code page. You can list every collation together with its code page using:

SELECT [Name], [Description], [CodePage] = COLLATIONPROPERTY([Name], 'CodePage')
FROM ::fn_helpcollations()
ORDER BY [CodePage] DESC;

You can further verify that "ISO-8859-1" is the wrong choice by trying to use it in SQL itself. The following SQL:

SELECT CONVERT(xml,'<?xml version="1.0" encoding="ISO-8859-1"?><test>•</test>');

Will produce XML which does not contain a bullet. Indeed, it won't produce any character, because ISO-8859-1 has no character defined for code point 149.
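
For comparison, here is the same test with the declaration that matches code page 1252. This is a sketch that assumes the database default collation uses code page 1252, so that `CHAR(149)` is the bullet; under that assumption the bullet should survive the conversion.

```sql
-- Same document as above, but declared as windows-1252.
-- CHAR(149) is used instead of a literal bullet so the byte value is explicit.
SELECT CONVERT(xml,
       '<?xml version="1.0" encoding="windows-1252"?><test>' + CHAR(149) + '</test>');
```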

SQL Server handles Unicode strings differently. With Unicode strings (nvarchar), "there is no need for different code pages to handle different sets of characters". However, SQL Server does NOT use "UTF-8" encoding. If you try to use it within SQL itself:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UTF-8"?><test>•</test>');

You will get an error:

Msg 9402, Level 16, State 1, Line 1 XML parsing: line 1, character 38, unable to switch the encoding

Rather, SQL uses "UCS-2" encoding, so this will work:

SELECT CONVERT(xml,N'<?xml version="1.0" encoding="UCS-2"?><test>•</test>');
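
Applying this to the original problem, the Unicode variant of the concatenation might look like the sketch below. The variable `@doc` and its contents are hypothetical; the point is that an nvarchar result pairs with a "UCS-2" declaration, just as a varchar result pairs with "windows-1252" on a code page 1252 database.

```sql
-- Hypothetical sample document containing the bullet character.
DECLARE @doc xml = CONVERT(xml, N'<test>' + NCHAR(8226) + N'</test>');

-- Convert to nvarchar(max) and prepend a declaration that matches the Unicode string.
SELECT N'<?xml version="1.0" encoding="UCS-2" ?>'
     + CONVERT(nvarchar(max), @doc) AS XmlWithDeclaration;
```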