Problem getting latin characters



  • 1. Advanced XSD
    Big hi to all the XML gurus out there. I have to create an XSD which validates the following XML file:

        <document>
          <metadata>
            <field> <name>ID</name> <value>123</value> </field>
            <field> <name>FirstName</name> <value>John</value> </field>
            <field> <name>SurName</name> <value>Smith</value> </field>
            <field> <name>XXX</name> <value>YYY</value> </field>
          </metadata>
        </document>

    Every XML file has to have the first three fields, but one can add one or more fields at the bottom. Now, I want to create a complex type 'field' and reuse it in my schema. I want to add three fields to my schema and restrict those fields to ID, FirstName and SurName. I also want to add a fourth field which can contain any name, and make it repeatable. I don't have a clue how to implement this. I know how to create an XSD with complexTypes and reuse them, but I need to know how to require the first three fields to have ID, FirstName and SurName as names, and how to add another field which can contain anything.
  • 2. [newbie] Losing the root with multiple XPath expressions
    Hi, I use one XPath expression to find a node, and then I do another XPath expression on that node. The problem is that a root expression ("/....") which I expect to work, doesn't work.

    The XML file:

        <bookstore>
          <book>
            <title lang="eng">Harry Potter</title>
            <price>29.99</price>
          </book>
          <book>
            <title lang="eng">Learning XML</title>
            <price>39.95</price>
          </book>
        </bookstore>

    The code:

        XmlDocument doc = new XmlDocument();
        doc.Load(@"..\..\books.xml");
        XPathNavigator nav = doc.CreateNavigator();
        XPathNavigator node1 = nav.SelectSingleNode("descendant::book[title='Harry Potter']");
        XPathNavigator node2;
        node2 = node1.SelectSingleNode("/book/price"); // returns null
        node2 = node1.SelectSingleNode("/price");      // returns null
        node2 = node1.SelectSingleNode("price");       // returns the price element
        Console.WriteLine("The price is " + node2.Value);

    Output:

        The price is 29.99

    I expected that the expression "/book/price" would find the price element, because a "book" element is the root of node1. However, it returns null. What is the root of node2, and what is the XPath expression for finding the price, starting at the root? TIA, Javaman
  • 3. how to get attribute name in xslt?
    Hello, I want to display the names of the attributes and the value of the attributes in an xslt. I know this is how to get the value of each attribute: <xsl:value-of select="@*" /> But, how do I get the name of the corresponding attribute? Maybe something like: <xsl:value-of select="name(@*)" /> ? Thanks for your help! -- Steve

Problem getting latin characters

Post by Tkw » Fri, 01 Apr 2005 00:11:08 GMT

I have a byte array that contains XML encoded as UTF-8. The array
contains this byte sequence:
C3 A0 C3 A8 C3 AC C3 B2 C3 B9
I know that this is equal to àèìòù.
How can I load this byte array into a Document and get the correct characters back?
I'm using VC++ 6.0 and MSXML 4.0.
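
As a quick check of the byte values, here is a minimal Python sketch (not VC++/MSXML code; the element name `root` and all variable names are made up for illustration). It shows that the ten bytes decode as UTF-8 to five characters, and that an XML parser which receives the raw bytes, rather than a string already converted with a single-byte code page, returns them intact.

```python
import xml.etree.ElementTree as ET

# The byte sequence quoted in the post.
raw = bytes.fromhex("C3 A0 C3 A8 C3 AC C3 B2 C3 B9")

# Decoded as UTF-8, every two-byte pair becomes one character.
decoded = raw.decode("utf-8")

# Hypothetical wrapper document; <root> is only an illustrative element name.
xml_bytes = b'<?xml version="1.0" encoding="UTF-8"?><root>' + raw + b"</root>"

# The parser is given bytes, so it applies the declared encoding itself.
parsed_text = ET.fromstring(xml_bytes).text

# Decoding the same bytes with a single-byte code page yields ten
# wrong characters instead of five correct ones.
mojibake = raw.decode("cp1252")

print([f"U+{ord(c):04X}" for c in decoded])
print(parsed_text == decoded, len(decoded), len(mojibake))
```

The analogous distinction in MSXML would be between handing the parser an already-converted string and handing it the bytes themselves (for example through a stream); which `load`/`loadXML` overloads accept which is worth verifying against the MSXML documentation.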


Re: Problem getting latin characters

Post by name » Sat, 02 Apr 2005 17:41:06 GMT

Before you parse it with whatever you use, you need to specify, at that end,

the encoding of the file (the text).


If it matches - good.

If not, then please, bring your computer back.


There are many ways to improve the world.

In your case, that would be it.


Similar Threads:

1. HTML/XML character encoding getting changed

I have a software application I've written called PowerBlog (
that takes the editing capability of the Internet Explorer WebBrowser
control (essentially a DHTMLTextBox), extracts the user-typed HTML, assigns
it as an XML node's InnerText property (using C#: System.Xml.XmlDocument
obj; obj.InnerText = myHTML). Then I later get the InnerText as a string and
write to disk.

When this text is displayed in a web browser, special characters that are
beyond the standard ASCII charset are not rendered correctly. Frequently, I
have copied text from a web site, pasted in the DHTMLTextbox, saved, and
published it, and my published output has corrupt characters. However, prior
to publishing, when previewing my document it looks fine -- it is only when
it is published (extracted, written to disk, uploaded to the server via FTP,
downloaded via HTTP) that the corruption occurs.

There are several places where this problem could be occurring, and I don't
know how to figure it out.

- A "design feature" in the XmlNode's InnerText property that converts the
&###; encoding into an actual character.
- An encoding flaw when written to disk (currently I'm using the default,
UTF-8 I guess).
- A flaw in the FTP client class where the file is being corrupted during
upload (I think I'm using binary upload format but perhaps I should
- A flaw in IIS (no known strange settings exist)

I still need to do some homework on this but I was wondering if anyone has
any bright ideas before I continue searching this out?


2. Now getting errors when parsing XML doc - invalid characters

I am using a VBA application that uses MSXML 4.0 Service Pack 2,
but all of a sudden, as of yesterday, I am getting errors when the 
parser finds an invalid character such as a TM or copyright symbol. 
These characters always existed and were being read fine by the parser. I 
assume it is one of the recent updates to IE or XP, but I am not sure which one, 
maybe kb922760. But I removed this one and a few recent ones and I 
still get the same error:

"an invalid character was found in text content"

Is this a bug? If not, how can I get around it in the VBA code without 
having to redo all the XML files? I cannot redo them.

here is my info:
the xml file has this at the top:
<?xml version="1.0" encoding="iso-8859-1"?>

using this code in ms access:
Set test = New DOMDocument
test.async = False
test.validateOnParse = False

theurl = "the url to the xml file"

   Set oXMLHTTP = New MSXML2.ServerXMLHTTP40

   With oXMLHTTP
       .setTimeouts 30000, 30000, 120000, 300000
       .Open "POST", theurl, , "username", "password"
       .setRequestHeader "CONTENT-TYPE", "application/x- 
       strresult = .responseText
   End With

 strfile = strresult

test.loadXML strfile
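
One possible cause, stated here as an assumption and not confirmed by the post: `responseText` hands back a string that has already been decoded, so the `iso-8859-1` declaration in the file no longer describes what `loadXML` receives. The Python sketch below (not VBA; the sample document is invented) shows the general effect. The same Latin-1 bytes parse cleanly when the parser sees the raw bytes, but cannot be decoded as UTF-8 first.

```python
import xml.etree.ElementTree as ET

# Invented sample: a Latin-1 encoded document containing a copyright
# sign, which is the single byte 0xA9 in ISO-8859-1.
latin1_body = (
    '<?xml version="1.0" encoding="iso-8859-1"?>'
    "<note>\u00a9 2006 Example</note>"
).encode("iso-8859-1")

# Raw bytes: the parser reads the declaration and decodes correctly.
text_from_bytes = ET.fromstring(latin1_body).text

# Decoding as UTF-8 before parsing fails, because a lone 0xA9 byte
# is not a valid UTF-8 sequence.
try:
    latin1_body.decode("utf-8")
    utf8_decode_ok = True
except UnicodeDecodeError:
    utf8_decode_ok = False

print(ascii(text_from_bytes), utf8_decode_ok)
```

In MSXML terms, the equivalent experiment would be to build the document from `responseBody` or `responseXML` instead of `responseText`; whether that resolves this particular error is untested here.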



3. Latin characters within XMLA

Hi, we are using SSAS 2005 SP 1 through msmdpump, but when text values contain 
latin characters (such as: ) the XML contains double question marks "??".

We tried connecting from Excel through OLEDB for Analysis Services, and it 
works fine.


4. VB strings, MSXML, Latin 9, and the Euro


I have an Excel application which is receiving XML via MSXML4, encoded
with charset=ISO-8859-15. The XML text may contain the Euro symbol. I
want to place the text into a VB string, and sometime later, a cell.

If I decode the XML text into a byte array, I can see that the Euro
has been correctly decoded using the Windows charset CP1252 to the
value 128 (0x80).

However if I create an empty string and append the VB Euro symbol -

Dim str as String
Dim b() as Byte

str = Chr(128)
b = str

msgbox "high byte=" & b(0) & ", low byte=" & b(1)

I find that the Euro should be encoded with high and low values of 177
& 32. So when I create a VB string from the XML text, the Euro
character is completely wrong.

I'd really appreciate any insight into what's happening here, and how
to correctly specify the charset when I create the VB string, or
alternatively, the correct character set to use on the XML encoding

Many thanks,
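
For orientation, a small Python sketch (not VB; the variable names are illustrative) of how the Euro sign is numbered in the encodings mentioned above. VB strings are stored as UTF-16, so the two bytes copied into `b` are the UTF-16 code unit for U+20AC, not a CP1252 or ISO-8859-15 byte.

```python
euro = "\u20ac"  # the Euro sign as a Unicode character

# Single-byte encodings place the Euro at different positions.
latin9_byte = euro.encode("iso-8859-15")   # Latin 9: 0xA4
cp1252_byte = euro.encode("cp1252")        # Windows-1252: 0x80 (128)

# UTF-16 little-endian, the in-memory layout of a VB string:
# low byte 0xAC (172) first, then high byte 0x20 (32).
utf16le = list(euro.encode("utf-16-le"))

# Decoding the Latin 9 byte with the right charset gives U+20AC back.
from_latin9 = b"\xa4".decode("iso-8859-15")

print(latin9_byte, cp1252_byte, utf16le, from_latin9 == euro)
```

Under these encodings the first byte of the array is the low byte (172) and the second is the high byte (32), so a string holding U+20AC already contains the correct Euro character.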


5. Parsing XML with numeric entities outside Latin 1


I have an XML file with this text in it:
     <price>10 &#8364;</price>
(just an example)

My input and output encodings are Latin 1 (ISO-8859-1).

When PHP parses it, the characterdata function outputs
     10 ?
I would like it to output
     10 &#8364;
i.e. just leave the entity as it is, as the character can't be 
represented directly in Latin 1.

Is this possible ?

Thank you.
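
The post is about PHP; as a language-neutral illustration, here is a Python sketch of the behaviour asked for: parse to Unicode, then encode to Latin 1 with any unrepresentable character written back as a numeric character reference. The sample document, including the decimal form `&#8364;`, is an assumption based on the example in the post.

```python
import xml.etree.ElementTree as ET

# Assumed input: a Latin-1 document that uses a numeric character
# reference for the Euro sign (U+20AC is decimal 8364).
source = b'<?xml version="1.0" encoding="ISO-8859-1"?><price>10 &#8364;</price>'

# The parser resolves the reference to the real character.
price = ET.fromstring(source).text

# Encoding with "replace" loses the character, as described in the post.
lossy_out = price.encode("iso-8859-1", errors="replace")

# Encoding with "xmlcharrefreplace" writes the reference back instead.
latin1_out = price.encode("iso-8859-1", errors="xmlcharrefreplace")

print(lossy_out, latin1_out)
```

The design choice is to keep the text as Unicode internally and only decide at output time how to represent characters the target charset lacks.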

6. VB strings, MSXML, Latin 9, and the Euro

7. polish font - ISO 8859-2 (Latin 2) - from external xml file

8. Problem getting nested xml nodes with C#, .NET 2.0 (XmlDocument)
