Today's awesome discovery came from an article titled "An XML library for PHP you may not hate". In an unexpected twist, I really didn't hate it; in fact, it helped me solve a problem I had. The library is called sabre/xml and it is part of the sabre/dav project.
XML Namespaces and libxml
XML namespaces look like a really simple concept:
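To have something concrete to refer to, here is a minimal sketch of a document that mixes a default namespace with two prefixed ones. Its exact content is an assumption made for illustration, in particular the http://example.com/ns1 URI; it was chosen only to fit the URIs and prefixes discussed in this post.

```php
<?php
// Hypothetical stand-in for example.xml; the ns1 URI is an assumption.
$source = <<<'XML'
<?xml version="1.0"?>
<doc xmlns="http://example.com"
     xmlns:ns1="http://example.com/ns1"
     xmlns:ns2="http://example.com/ns2">
  <ns1:elem>Value 1</ns1:elem>
  <ns2:elem>Value 2</ns2:elem>
</doc>
XML;

$xml = new SimpleXMLElement($source);

// Map of prefix => namespace URI declared on the root element.
// The default namespace is listed under the empty-string key.
$namespaces = $xml->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    echo ($prefix === '' ? '(default)' : $prefix), ' => ', $uri, "\n";
}
```

The prefixes are just local shorthand: another document could bind ns1 to a completely different URI and still be perfectly valid.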
URIs such as http://example.com/ns2 are full XML namespace names, a.k.a. namespace URIs. They should be universally unique among XML documents. Prefixes such as ns2, on the other hand, and also the empty prefix for http://example.com, have meaning only in the context of this particular document. This concept is nevertheless a source of much confusion in the XML world, because people tend to treat these prefixes as universal.
I have never worked with libxml directly, but both PHP wrappers have the same problem. Let's look at it using SimpleXML:
<?php
// open the file
$xml = new SimpleXMLElement(file_get_contents('example.xml'));

// XPath doesn't work with empty namespace prefixes in a namespaced document,
// so register an explicit prefix for the default namespace
$xml->registerXPathNamespace('r', 'http://example.com');

// register a prefix that is assigned to another namespace in the document
$xml->registerXPathNamespace('ns1', 'http://example.com/ns2');

// xpath() returns an array of matches, so take the first one
$matches = $xml->xpath('/r:doc/ns1:elem');
echo strval($matches[0]);
// Value 2? nope, it's Value 1
Some libraries assign random prefixes, so the conflict may not be that obvious. Of course, you can check all the prefixes with $xml->getDocNamespaces(), but what should you do when a conflict is detected? Throw an error? But it's a perfectly valid situation. Assign random prefixes yourself? That's not convenient.
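A conflict check along those lines could look roughly like the sketch below. The helper name findPrefixConflicts and the sample document are hypothetical; only getDocNamespaces() comes from SimpleXML itself.

```php
<?php
/**
 * Hypothetical helper: returns the prefixes from $wanted that the document
 * already binds to a different namespace URI (prefix => URI in the document).
 */
function findPrefixConflicts(SimpleXMLElement $xml, array $wanted): array
{
    // true = also collect declarations made on descendant elements
    $declared = $xml->getDocNamespaces(true);

    $conflicts = [];
    foreach ($wanted as $prefix => $uri) {
        if (isset($declared[$prefix]) && $declared[$prefix] !== $uri) {
            $conflicts[$prefix] = $declared[$prefix];
        }
    }
    return $conflicts;
}

// Assumed sample document; the ns1 URI is made up for illustration.
$source = <<<'XML'
<doc xmlns="http://example.com"
     xmlns:ns1="http://example.com/ns1"
     xmlns:ns2="http://example.com/ns2">
  <ns1:elem>Value 1</ns1:elem>
  <ns2:elem>Value 2</ns2:elem>
</doc>
XML;

$xml = new SimpleXMLElement($source);

$conflicts = findPrefixConflicts($xml, [
    'r'   => 'http://example.com',
    'ns1' => 'http://example.com/ns2',
]);

// ns1 is reported, because the document binds it to another URI.
print_r($conflicts);
```

Detecting the clash is the easy part; the sketch deliberately does not decide what to do about it.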
SimpleXML has a solution for this with explicit namespace calls. Of course we have to drop the convenience of XPath for this:
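A minimal sketch of such an explicit namespace call, using SimpleXMLElement::children() with a full namespace URI. The sample document is an assumption made for illustration.

```php
<?php
// Assumed sample document; the ns1 URI is made up for illustration.
$source = <<<'XML'
<doc xmlns="http://example.com"
     xmlns:ns1="http://example.com/ns1"
     xmlns:ns2="http://example.com/ns2">
  <ns1:elem>Value 1</ns1:elem>
  <ns2:elem>Value 2</ns2:elem>
</doc>
XML;

$xml = new SimpleXMLElement($source);

// Select children by full namespace URI; the document's prefixes play no role.
$ns2Children = $xml->children('http://example.com/ns2');
$value = (string) $ns2Children->elem;

echo $value, "\n"; // Value 2
```

Because the lookup is keyed on the URI, it keeps working even if the document renames or swaps its prefixes.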
But SimpleXML has a bug in another scenario: if you save a document subtree, all namespace declarations are lost.
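The subtree problem can be reproduced with a short sketch like this one, which serializes a single child element with asXML(). The sample document is again an assumption.

```php
<?php
// Assumed sample document; the ns1 URI is made up for illustration.
$source = <<<'XML'
<doc xmlns="http://example.com"
     xmlns:ns1="http://example.com/ns1"
     xmlns:ns2="http://example.com/ns2">
  <ns1:elem>Value 1</ns1:elem>
  <ns2:elem>Value 2</ns2:elem>
</doc>
XML;

$xml = new SimpleXMLElement($source);

// Pick the subtree by namespace URI and serialize only that element.
$elem = $xml->children('http://example.com/ns2')->elem;
$fragment = $elem->asXML();
echo $fragment, "\n";

// Check whether the xmlns:ns2 declaration from the root made it into the
// fragment. Per the behaviour described above it is expected to be missing.
$hasDeclaration = strpos($fragment, 'xmlns:ns2=') !== false;
var_dump($hasDeclaration);
```

A fragment that uses the ns2 prefix without declaring it is not namespace-well-formed on its own, so it cannot be parsed back reliably.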
Clark Notation and sabre/xml
So here comes our new hero. sabre/xml drops prefixes entirely and uses so-called Clark notation, in which every element name is written as {namespace-uri}local-name.
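To make the notation tangible without pulling in the library, here is a small sketch built on PHP's DOM extension. The clarkName helper and the sample document are hypothetical; this is not how sabre/xml is implemented, it only shows what the resulting names look like.

```php
<?php
/**
 * Hypothetical helper, not part of sabre/xml: builds the Clark-notation
 * name "{namespace-uri}local-name" for a DOM element.
 */
function clarkName(DOMElement $element): string
{
    return '{' . (string) $element->namespaceURI . '}' . $element->localName;
}

// Assumed sample document; the ns1 URI is made up for illustration.
$source = <<<'XML'
<doc xmlns="http://example.com"
     xmlns:ns1="http://example.com/ns1"
     xmlns:ns2="http://example.com/ns2">
  <ns1:elem>Value 1</ns1:elem>
  <ns2:elem>Value 2</ns2:elem>
</doc>
XML;

$dom = new DOMDocument();
$dom->loadXML($source);

// Collect Clark names for the root and its element children.
$names = [clarkName($dom->documentElement)];
foreach ($dom->documentElement->childNodes as $child) {
    if ($child instanceof DOMElement) { // skip whitespace text nodes
        $names[] = clarkName($child);
    }
}

print_r($names);
```

Each name carries its own namespace URI, so nothing depends on prefix declarations that live elsewhere in the document.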
$data in JSON:
As you can see, the element names contain full namespace URIs, so saving a subtree should not lose any data:
<?php
$writer = new \Sabre\Xml\Writer();

// you can set default namespace prefixes or the library will generate random ones
$writer->namespaceMap = [
    'http://example.com/ns2' => 'ns2',
];

// boilerplate inherited from PHP's base XMLWriter class;
// it's conveniently wrapped in \Sabre\Xml\Service,
// but I need direct access to the Writer for more control here
$writer->openMemory();
$writer->startDocument();

// $data is the parsed document from above;
// that's not how you retrieve subtrees in actual code :D
$writer->write($data['value']);

echo $writer->outputMemory();

/*
<?xml version="1.0"?>
<ns2:elem xmlns:ns2="http://example.com/ns2">Value 2</ns2:elem>
Finally!
*/
Of course, I left out many more nice features of sabre/xml, such as object mapping, the XmlDeserializable interface, convenience helpers for key-value and collection-like data structures, and so on. My goal was to show how it helps me work with XML namespaces in a strict way.