sabre/xml
#########

:category: PHP
:tags: xml, sabre/dav
:date: 2020-06-10 00:00:00 +0300

.. _sabre/dav: https://sabre.io/
.. _sabre/xml: https://sabre.io/xml/

Today's awesome discovery came from an article titled
`An XML library for PHP you may not hate`__. In an unexpected twist, I really
didn't hate it; in fact, it helped me solve a problem I had. It is called
`sabre/xml`_ and it's a part of the `sabre/dav`_ project.

.. __: https://evertpot.com/an-xml-library-you-may-not-hate/

.. TEASER_END

XML Namespaces and libxml
=========================

XML namespaces look like a really simple concept:

.. code-block:: xml

    <doc xmlns="http://example.com"
         xmlns:ns1="http://example.com/ns1"
         xmlns:ns2="http://example.com/ns2">
        <ns1:elem>Value 1</ns1:elem>
        <ns2:elem>Value 2</ns2:elem>
    </doc>

Here ``http://example.com``, ``http://example.com/ns1`` and
``http://example.com/ns2`` are full XML namespace names, a.k.a. namespace URIs.
They should be universally unique among XML documents. In contrast, ``ns1``,
``ns2`` and the empty prefix for ``http://example.com`` have meaning only in
the context of this particular document.

This concept, however, is a source of much confusion in the XML world because
people tend to treat these prefixes as universal. I never worked with libxml
directly, but both PHP wrappers have the same problem. Let's look at it using
SimpleXML as an example:

.. code-block:: php

    <?php
    $xml = simplexml_load_file('example.xml');
    $xml->registerXPathNamespace('r', 'http://example.com');
    // register a prefix that is assigned to another namespace in the document
    $xml->registerXPathNamespace('ns1', 'http://example.com/ns2');
    echo strval($xml->xpath('/r:doc/ns1:elem')[0]);
    // Value 2? nope, it's Value 1

Some libraries may assign random prefixes, so the conflict may not be that
obvious. Of course you can check all prefixes with
``$xml->getDocNamespaces()``, but what should you do if a conflict is detected?
Throw an error? But it's a perfectly valid situation. Assign random prefixes?
But that's not convenient.

SimpleXML has a solution for this: explicit namespace calls. Of course, we have
to drop the convenience of XPath for this:

.. code-block:: php

    <?php
    echo strval($xml->children('http://example.com/ns2')->elem);
    // Value 2

This works, but it has a bug in another scenario. If you save a document
subtree, all namespace declarations are lost:

.. code-block:: php

    <?php
    echo $xml->children('http://example.com/ns2')->elem->asXML();
    // <ns2:elem>Value 2</ns2:elem>
    // not
    // <ns2:elem xmlns:ns2="http://example.com/ns2">Value 2</ns2:elem>

Clark Notation and sabre/xml
============================

.. _Clark Notation: http://www.jclark.com/xml/xmlns.htm

So here comes our new hero. ``sabre/xml`` drops prefixes entirely and uses the
so-called `Clark Notation`_.

.. code-block:: php

    <?php
    $reader = new \Sabre\Xml\Reader();
    $reader->XML(file_get_contents('example.xml'));
    $data = $reader->parse();

``$data`` as JSON:

.. code-block:: json

    {
        "name": "{http://example.com}doc",
        "value": [
            {
                "name": "{http://example.com/ns1}elem",
                "value": "Value 1",
                "attributes": []
            },
            {
                "name": "{http://example.com/ns2}elem",
                "value": "Value 2",
                "attributes": []
            }
        ],
        "attributes": []
    }

As you can see, the element names contain full namespace URIs. Saving subtrees
should lose no data:

.. code-block:: php

    <?php
    $writer = new \Sabre\Xml\Writer();
    $writer->namespaceMap = [
        'http://example.com/ns2' => 'ns2',
    ];
    // PHP base XMLWriter's boilerplate code
    // it's conveniently wrapped in \Sabre\Xml\Service
    // but I need direct access to the Writer for more control here
    $writer->openMemory();
    $writer->startDocument();
    // that's not how you retrieve subtrees in actual code :D
    $writer->write($data['value'][1]);
    echo $writer->outputMemory();
    // <?xml version="1.0"?>
    // <ns2:elem xmlns:ns2="http://example.com/ns2">Value 2</ns2:elem>

Finally!

Of course, I left out many more nice features of ``sabre/xml``, like object
mapping, the ``XmlSerializable`` and ``XmlDeserializable`` interfaces,
convenience helpers for key-value and collection-like data structures, and so
on. My goal was to show how it helps me work with XML namespaces in a strict
way.
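
As a footnote, here is a minimal sketch of working with Clark-notation names
in plain PHP, without ``sabre/xml`` installed. It assumes a hand-written array
that mirrors the parsed structure dumped as JSON earlier in this post. The
helpers ``splitClark`` and ``childValues`` are hypothetical names made up for
this illustration; they are not part of the library. The point is that a child
can be picked by its full namespace URI rather than by position or prefix.

.. code-block:: php

    <?php
    declare(strict_types=1);

    /**
     * Split "{namespace}local" into [namespace, local].
     * Names without a "{...}" part get an empty namespace.
     * Hypothetical helper, for illustration only.
     */
    function splitClark(string $name): array
    {
        if (preg_match('/^\{([^}]*)\}(.+)$/', $name, $m) === 1) {
            return [$m[1], $m[2]];
        }
        return ['', $name];
    }

    /**
     * Return the values of direct children whose Clark name matches.
     * $node follows the name / value / attributes layout of the dump.
     */
    function childValues(array $node, string $clarkName): array
    {
        $found = [];
        $children = is_array($node['value']) ? $node['value'] : [];
        foreach ($children as $child) {
            if (is_array($child) && ($child['name'] ?? null) === $clarkName) {
                $found[] = $child['value'];
            }
        }
        return $found;
    }

    // Hand-written copy of the parsed example document (an assumption,
    // not produced by a real parser run).
    $data = [
        'name' => '{http://example.com}doc',
        'value' => [
            ['name' => '{http://example.com/ns1}elem', 'value' => 'Value 1', 'attributes' => []],
            ['name' => '{http://example.com/ns2}elem', 'value' => 'Value 2', 'attributes' => []],
        ],
        'attributes' => [],
    ];

    [$ns, $local] = splitClark($data['value'][1]['name']);
    echo $ns, ' ', $local, PHP_EOL;

    print_r(childValues($data, '{http://example.com/ns2}elem'));

Because the lookup key is the full URI, it does not matter which prefix (if
any) the original document happened to use for that namespace.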