sabre/xml
#########

:category: PHP
:tags:  xml, sabre/dav
:date:  2020-06-10 00:00:00 +0300

.. _sabre/dav: https://sabre.io/
.. _sabre/xml: https://sabre.io/xml/

Today's great discovery came from an article titled `An XML library for PHP you may not hate`__.
In an unexpected twist, I really didn't hate it; in fact, it helped me solve a problem I had.
The library is called `sabre/xml`_, and it is part of the `sabre/dav`_ project.

.. __: https://evertpot.com/an-xml-library-you-may-not-hate/

.. TEASER_END

XML Namespaces and libxml
=========================
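XML namespaces look like a really simple concept:

.. code-block:: xml

    <doc xmlns="http://example.com" xmlns:ns1="http://example.com/ns1" xmlns:ns2="http://example.com/ns2">
        <ns1:elem>Value 1</ns1:elem>
        <ns2:elem>Value 2</ns2:elem>
    </doc>
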
Here ``http://example.com``, ``http://example.com/ns1`` and ``http://example.com/ns2`` are full XML namespace names, also known as namespace URIs.
They are meant to be globally unique across XML documents.
In contrast, the prefixes ``ns1`` and ``ns2``, as well as the empty prefix for ``http://example.com``, have meaning only within this particular document.
This concept, however, causes a lot of confusion in the XML world, because people tend to treat these prefixes as universal.
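
To make the point concrete, here is a minimal sketch using PHP's DOM extension.
The two inline documents and the ``expandedName()`` helper are made up for illustration and are not part of any library.
The root elements carry different prefixes, yet they are the same element as far as namespaces are concerned:

.. code-block:: php

    <?php
    // Hypothetical helper: identify an element by namespace URI and local name only.
    function expandedName(DOMElement $element): string
    {
        return '{' . $element->namespaceURI . '}' . $element->localName;
    }

    // Two documents that bind *different* prefixes to the same namespace URI.
    $a = new DOMDocument();
    $a->loadXML('<x:doc xmlns:x="http://example.com"/>');

    $b = new DOMDocument();
    $b->loadXML('<y:doc xmlns:y="http://example.com"/>');

    // The prefixes differ...
    var_dump($a->documentElement->prefix === $b->documentElement->prefix); // bool(false)
    // ...but the namespace-qualified names are identical.
    var_dump(expandedName($a->documentElement) === expandedName($b->documentElement)); // bool(true)

Any code that compares ``x:doc`` with ``y:doc`` as plain strings would get this wrong, which is exactly the kind of confusion described above.
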
I have never worked with libxml directly, but both PHP wrappers around it have the same problem.
Let's look at it using SimpleXML as an example:

.. code-block:: php

    <?php
    // example.xml holds the document shown above
    $xml = simplexml_load_file('example.xml');
    $xml->registerXPathNamespace('r', 'http://example.com');
    // register a prefix that the document already assigns to another namespace
    $xml->registerXPathNamespace('ns1', 'http://example.com/ns2');
    echo strval($xml->xpath('/r:doc/ns1:elem')[0]); // Value 2? Nope, it's Value 1

Some libraries assign arbitrary prefixes, so the conflict may not be that obvious.
Of course, you can check all prefixes with ``$xml->getDocNamespaces()``, but what should you do when a conflict is detected?
Throw an error? But it's a perfectly valid situation.
Assign random prefixes? That's not convenient.
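
For reference, the detection step itself could look roughly like this.
It is only a sketch: ``findPrefixConflicts()`` is a hypothetical helper, and the inline document is a stripped-down copy of the example above.
It only reports prefixes that the document already binds to a different URI and does not resolve them:

.. code-block:: php

    <?php
    /**
     * Hypothetical helper: returns the prefixes from $wanted that the document
     * already binds to a different namespace URI.
     *
     * @param array<string, string> $wanted prefix => namespace URI
     * @return array<string, string> conflicting prefix => URI used by the document
     */
    function findPrefixConflicts(SimpleXMLElement $xml, array $wanted): array
    {
        $conflicts = [];
        // true = collect declarations from the whole document, not just the root
        $declared = $xml->getDocNamespaces(true);
        foreach ($wanted as $prefix => $uri) {
            if (isset($declared[$prefix]) && $declared[$prefix] !== $uri) {
                $conflicts[$prefix] = $declared[$prefix];
            }
        }
        return $conflicts;
    }

    $xml = simplexml_load_string(
        '<doc xmlns="http://example.com"'
        . ' xmlns:ns1="http://example.com/ns1"'
        . ' xmlns:ns2="http://example.com/ns2"/>'
    );

    $conflicts = findPrefixConflicts($xml, [
        'r'   => 'http://example.com',
        'ns1' => 'http://example.com/ns2',
    ]);
    print_r($conflicts); // only 'ns1' is reported, bound to http://example.com/ns1
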
SimpleXML has a solution for this: explicit namespace calls.
Of course, we have to give up the convenience of XPath:

.. code-block:: php

    <?php
    echo strval($xml->children('http://example.com/ns2')->elem);

However, this approach has a bug in another scenario.
If you save a document subtree, all namespace declarations are lost:

.. code-block:: php

    <?php
    echo $xml->children('http://example.com/ns2')->elem->asXML();
    // <ns2:elem>Value 2</ns2:elem>
    // not
    // <ns2:elem xmlns:ns2="http://example.com/ns2">Value 2</ns2:elem>

Clark Notation and sabre/xml
============================
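
.. _Clark Notation: http://www.jclark.com/xml/xmlns.htm

So here comes our new hero.
``sabre/xml`` drops prefixes entirely and uses the so-called `Clark Notation`_ instead, where each name is written as ``{namespace-uri}local-name``.

.. code-block:: php

    <?php
    $reader = new \Sabre\Xml\Reader();
    $reader->XML(file_get_contents('example.xml'));
    $data = $reader->parse();
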
``$data`` serialized as JSON:

.. code-block:: json

    {
        "name": "{http://example.com}doc",
        "value": [
            {
                "name": "{http://example.com/ns1}elem",
                "value": "Value 1",
                "attributes": []
            },
            {
                "name": "{http://example.com/ns2}elem",
                "value": "Value 2",
                "attributes": []
            }
        ],
        "attributes": []
    }
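
Names in this form are easy to take apart in plain PHP.
The following is a minimal sketch of splitting such a name into its namespace URI and local name; ``splitClarkName()`` is a hypothetical helper written for this post, not the library's own API:

.. code-block:: php

    <?php
    /**
     * Hypothetical helper: splits "{namespace-uri}local-name" into its two parts.
     * Names without a namespace are returned with a null namespace.
     *
     * @return array{0: ?string, 1: string}
     */
    function splitClarkName(string $name): array
    {
        if (preg_match('/^\{([^}]*)\}(.+)$/', $name, $m) === 1) {
            return [$m[1] === '' ? null : $m[1], $m[2]];
        }
        return [null, $name];
    }

    [$namespace, $localName] = splitClarkName('{http://example.com/ns2}elem');
    echo $namespace, ' ', $localName, PHP_EOL; // http://example.com/ns2 elem

Because the URI is part of the name itself, two names are equal exactly when both the namespace and the local name match, no matter which prefixes the source document used.
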

As you can see, the element names contain the full namespace URIs.
Saving a subtree no longer loses any data:

.. code-block:: php

    <?php
    $writer = new \Sabre\Xml\Writer();
    $writer->namespaceMap = [
        'http://example.com/ns2' => 'ns2',
    ];
    // boilerplate inherited from PHP's base XMLWriter;
    // it's conveniently wrapped in \Sabre\Xml\Service,
    // but I need direct access to the Writer for more control here
    $writer->openMemory();
    $writer->startDocument();
    // that's not how you retrieve subtrees in actual code :D
    $writer->write($data['value'][1]);
    echo $writer->outputMemory();
    /*
    <?xml version="1.0"?>
    <ns2:elem xmlns:ns2="http://example.com/ns2">Value 2</ns2:elem>
    */
    // Finally!

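
The index-based lookup ``$data['value'][1]`` is only a shortcut for this post.
A slightly more realistic sketch selects children by their Clark-notation name.
``childrenByName()`` is a hypothetical helper, and the ``$data`` array below is typed in by hand to mirror the parsed structure shown earlier, so the snippet runs without the library:

.. code-block:: php

    <?php
    /**
     * Hypothetical helper: returns the child elements of a parsed element whose
     * Clark-notation name matches $name. It assumes the array layout shown
     * earlier ('name', 'value', 'attributes').
     */
    function childrenByName(array $element, string $name): array
    {
        if (!is_array($element['value'])) {
            return []; // text-only element, no children
        }
        return array_values(array_filter(
            $element['value'],
            fn ($child) => is_array($child) && ($child['name'] ?? null) === $name
        ));
    }

    // Same shape as the parsed example document.
    $data = [
        'name' => '{http://example.com}doc',
        'value' => [
            ['name' => '{http://example.com/ns1}elem', 'value' => 'Value 1', 'attributes' => []],
            ['name' => '{http://example.com/ns2}elem', 'value' => 'Value 2', 'attributes' => []],
        ],
        'attributes' => [],
    ];

    $matches = childrenByName($data, '{http://example.com/ns2}elem');
    echo $matches[0]['value'], PHP_EOL; // Value 2
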

Of course, I left out many other nice features of ``sabre/xml``, such as object mapping, the ``XmlSerializable`` and ``XmlDeserializable`` interfaces, convenience helpers for key-value and collection-like data structures, and so on.
My goal was to show how it helps me work with XML namespaces in a strict way.