Why does XML display an error on certain special characters while others are OK?

For instance, the following produces an error,

but this is OK:

I convert the special characters with htmlentities('Löic', ENT_QUOTES).

How can I get around this?



I found that it works fine if I use a numeric character reference such as L&#246;ic

Now I need to find out how to use PHP to convert special characters into numeric character references!
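One possible way to do that conversion in PHP, as a minimal sketch: it assumes the mbstring extension is loaded and that the input is UTF-8. The helper name xml_numeric_escape() is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper: escape text for XML, using numeric references for non-ASCII.
function xml_numeric_escape(string $text): string
{
    // Escape the characters that are special in XML itself (&, <, >, quotes).
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // Map entry: [first code point, last code point, offset, mask].
    // Everything from U+0080 upwards becomes a decimal reference.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];

    return mb_encode_numericentity($escaped, $map, 'UTF-8');
}

// "\u{00F6}" is the letter o with diaeresis, written this way so the
// example does not depend on the encoding of the source file.
echo xml_numeric_escape("L\u{00F6}ic & co"), "\n"; // L&#246;ic &amp; co
```

Running htmlspecialchars() first is safe here because mb_encode_numericentity() only touches code points at or above U+0080, so the ampersands it has just produced are not encoded a second time.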

Answer 1: XML Parsing Error: undefined entity - special characters

There are five entities defined in the XML specification: &amp;, &lt;, &gt;, &apos; and &quot;.

There are lots of entities defined in the HTML DTD.

You can't use the ones from HTML in generic XML.

You could use numeric references, but you would probably be better off just getting your character encodings straight, which basically boils down to:

  • Set your editor to save the data in UTF-8
  • If you process the data with a programming language, make sure it is UTF-8 aware
  • If you store the data in a database, make sure it is configured for UTF-8
  • When you serve up your document, make sure the HTTP headers specify that it is UTF-8 (in the case of XML, UTF-8 is the default, so not specifying anything is almost as good)
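The checklist above could look roughly like this in PHP. This is only a sketch: the function name build_customer_xml() and the Customer/Name element names are assumptions made for illustration, and it relies on the DOM extension being available.

```php
<?php
// Hypothetical helper: build a small UTF-8 XML document around a name.
function build_customer_xml(string $name): string
{
    // Declare UTF-8 explicitly so the serialized prolog states the encoding.
    $doc = new DOMDocument('1.0', 'UTF-8');

    $customer = $doc->createElement('Customer');
    $nameNode = $doc->createElement('Name');
    // A text node escapes &, < and > for us; no named entities are needed.
    $nameNode->appendChild($doc->createTextNode($name));
    $customer->appendChild($nameNode);
    $doc->appendChild($customer);

    return $doc->saveXML();
}

$xml = build_customer_xml("L\u{00F6}ic");

// When serving over HTTP, state the encoding in the header as well.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: application/xml; charset=utf-8');
}

echo $xml;
```

The point of the design is that the character travels as plain UTF-8 from end to end, so htmlentities() is never involved.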



Answer 2: XML Parsing Error: undefined entity - special characters

Because it is not a built-in entity; it is instead an external entity that needs to be declared in a DTD.
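To illustrate, here is a sketch of a document that declares the entity itself in an internal DTD subset, parsed with PHP's DOM extension. The Customers/Customer/Name structure is assumed for the example.

```php
<?php
// The DOCTYPE declares &ouml; so the parser knows what it stands for.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Customers [
  <!ENTITY ouml "&#246;">
]>
<Customers><Customer><Name>L&ouml;ic</Name></Customer></Customers>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT asks libxml to substitute entity references while parsing.
// Only enable it for documents you trust.
$doc->loadXML($xml, LIBXML_NOENT);

$name = $doc->getElementsByTagName('Name')->item(0)->textContent;
echo $name, "\n";
```

Without the ENTITY declaration, the same document fails with the undefined entity error from the question.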


Answer 3: XML Parsing Error: undefined entity - special characters

TLDR Solution

You can solve this problem with html_entity_decode() (Source: PHP.net), like so...

$xml_line = '<description>' . html_entity_decode($description) . '</description>';

Full, Working Demo Online

In this demo, I use &rsquo; and a line from the Tao Te Ching to demonstrate the above use of html_entity_decode()...

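$title = 'The name you can say isn&rsquo;t the real name.';
// Decode HTML named entities (such as &rsquo;) into literal UTF-8 characters.
$xml_title = html_entity_decode($title, ENT_QUOTES | ENT_HTML5, 'UTF-8');
// Re-escape the characters that are special in XML; '&' must be replaced first.
$xml_title = str_replace(['&', '<', '>'], ['&amp;', '&lt;', '&gt;'], $xml_title);
$xml_line = '<title>' . $xml_title . '</title>';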

Don't forget to re-escape the <, > and & characters afterwards, though, since html_entity_decode() also turns &lt;, &gt; and &amp; back into literal characters!

Working Demo Sandbox

How Do You Know It Worked?

Want to verify it worked? Then head on over to the W3C RSS Feed Validator and check that it accepts the output of the above code.
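For a quick local check before going to an online validator, something along these lines might help. It is a sketch that only tests well-formedness, not RSS validity, and is_well_formed_xml() is a made-up helper name.

```php
<?php
// Hypothetical helper: true if libxml can parse the string as XML.
function is_well_formed_xml(string $xml): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Restore the previous error handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// The HTML-only entity is rejected by the XML parser...
var_dump(is_well_formed_xml('<title>isn&rsquo;t</title>'));

// ...while the decoded, literal character is accepted.
$decoded = html_entity_decode('isn&rsquo;t', ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump(is_well_formed_xml('<title>' . $decoded . '</title>'));
```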

