SQL Server 2008: invalid character error when parsing XML with accented characters

2

I'm parsing XML text that contains characters like 'á é ñ'.

I get an 'An invalid character was found in text content.' error with code like this:

declare @Xml varchar(100)

set @Xml =
'
<?xml version="1.0" encoding="UTF-8"?>
<Root>á</Root>
'

declare @XmlId integer

execute dbo.sp_xml_preparedocument @XmlId output, @Xml

select * from openXml( @XmlId, '/', 2) with (
  Root varchar(10)
)
execute dbo.sp_xml_removedocument @XmlId

And I'm getting the following error:

The XML parse error 0xc00ce508 occurred on line number 3, near the XML text "<Root>".
Msg 6602, Level 16, State 2, Procedure sp_xml_preparedocument, Line 1
The error description is 'An invalid character was found in text content.'.
Msg 8179, Level 16, State 5, Line 13
Could not find prepared statement with handle 0.
Msg 6607, Level 16, State 3, Procedure sp_xml_removedocument, Line 1
sp_xml_removedocument: The value supplied for parameter number 1 is invalid.

Is there some way SQL Server can parse this XML, or is the encoding the problem?

Is the only solution to encode those characters or is there a more elegant way to solve it?

sql
sql-server
xml
sql-server-2008
html-encode
asked on Stack Overflow Jan 10, 2014 by opensas

2 Answers

4

I got the same error today when passing a serialized object as XML to my stored procedure. Eventually I found where the mistake was.

Change your code from:

declare @Xml varchar(100)

To:

declare @Xml nvarchar(100)

Here is a summary I found online; hopefully it helps.

An nvarchar column can store any Unicode data. A varchar column is restricted to an 8-bit code page (non-Unicode character data). Using nvarchar rather than varchar can help you avoid an encoding conversion every time you read from or write to the database.
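
As a rough sketch of that change applied to the code from the question (not tested against every server configuration): the variable becomes nvarchar and the literal gets an N prefix so it is treated as Unicode. Two details here are assumptions that go beyond the one-line fix above. The encoding="UTF-8" declaration is left out, because nvarchar data is UTF-16 and a declaration that claims UTF-8 may trigger a different encoding error. The row pattern is also changed to '/Root' with a '.' column pattern, and the output column is declared nvarchar so the accented character survives.

```sql
-- Sketch only: nvarchar variable, N-prefixed literal, no encoding declaration (assumption).
declare @Xml nvarchar(100)

set @Xml = N'<Root>á</Root>'

declare @XmlId integer

execute dbo.sp_xml_preparedocument @XmlId output, @Xml

-- '/Root' selects the Root element; the '.' column pattern reads its text content.
select * from openXml( @XmlId, '/Root', 2) with (
  Root nvarchar(10) '.'
)

execute dbo.sp_xml_removedocument @XmlId
```

If the XML arrives from a client application, the parameter passed to the stored procedure would need to be a Unicode type as well for this to help.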

answered on Stack Overflow Mar 8, 2016 by Void
-1

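The á character here is not validly encoded as UTF-8, which is what any XML validator should tell you. The declaration says UTF-8, but a varchar typically holds á as a single byte from an 8-bit code page, and that byte on its own is not a valid UTF-8 sequence. The solution is to encode it properly.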
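
To see the bytes involved, a quick check like the following can help. It is only a sketch and assumes a database whose default collation uses code page 1252 (for example a Latin1_General collation); the column aliases are made up for illustration.

```sql
-- Assumption: the default collation uses code page 1252; other code pages give other bytes.
select
  cast('á'  as varbinary(4)) as VarcharBytes,   -- one byte under code page 1252 (0xE1)
  cast(N'á' as varbinary(4)) as NvarcharBytes   -- UTF-16 little-endian (0xE100)
```

Neither result is the UTF-8 encoding of á, which is the two bytes C3 A1, so a parser that trusts the encoding="UTF-8" declaration has good reason to reject the varchar text.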

If you are getting the data from someone else, you should tell them they are doing it wrong. If you generate this data, you should fix it. Assuming you are stuck in the middle, it is possible to write a pre-processor for the file that "fixes" the invalid XML before handing it off to your process that requires valid XML. (Vendors unwilling or unable to provide valid XML should be avoided whenever possible.)

ADDED

You will be unsuccessful in a quest to convince TSQL to parse XML that won't validate.

answered on Stack Overflow Jan 10, 2014 by Gary Walker • edited Jan 10, 2014 by Gary Walker

User contributions licensed under CC BY-SA 3.0