www.rozumim.cz

Don't do this in your XML

The system I’m working on communicates with many other systems through their APIs. It has to consume a lot of different XML documents in the process.

Here are a few lessons I’ve learned about how to abuse XML and damage its already-pretty-poor reputation. I hope they help you avoid these mistakes when you design your own XML documents.

##Order of your elements matters

Imagine you have an identifier abcdef that can be split into two parts (part1 and part2). If we want to represent it in XML, we can do it like this:

<id>
  <part1>abc</part1><part2>def</part2>
</id>

Now imagine you have two of those identifiers. How would you go about putting them in XML? This is how one of the APIs does it (the first identifier here is abcdef, the second ghijkl):

<id>
  <part1>abc</part1><part2>def</part2> <part1>ghi</part1><part2>jkl</part2>
</id>
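
To see why this hurts, here is a minimal sketch of what a consumer ends up writing, using Python's standard `xml.etree.ElementTree`. The function name `pair_parts` is my own invention, and the code assumes that every `<part1>` is immediately followed by its `<part2>`; nothing in the document itself guarantees that.

```python
import xml.etree.ElementTree as ET

FLAT = """<id>
  <part1>abc</part1><part2>def</part2> <part1>ghi</part1><part2>jkl</part2>
</id>"""

def pair_parts(xml_text):
    """Rebuild identifiers from a flat list, relying purely on element order."""
    children = list(ET.fromstring(xml_text))
    if len(children) % 2 != 0:
        raise ValueError("odd number of parts, cannot pair them")
    identifiers = []
    # Walk the children two at a time: (part1, part2), (part1, part2), ...
    for first, second in zip(children[0::2], children[1::2]):
        if first.tag != "part1" or second.tag != "part2":
            raise ValueError(f"unexpected order: {first.tag}, {second.tag}")
        identifiers.append((first.text or "") + (second.text or ""))
    return identifiers

print(pair_parts(FLAT))  # ['abcdef', 'ghijkl']
```

If the producer ever reorders or drops a part, the only thing the consumer can do is fail loudly.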

The fixed version:

<ids>
  <id type="firstType">
    <part1>abc</part1><part2>def</part2>
  </id>
  <id type="secondType">
    <part1>ghi</part1><part2>jkl</part2>
  </id>
</ids>
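
With each identifier wrapped in its own element, reading it back no longer depends on order. A small sketch with Python's standard `xml.etree.ElementTree`; the variable names are only illustrative:

```python
import xml.etree.ElementTree as ET

FIXED = """<ids>
  <id type="firstType">
    <part1>abc</part1><part2>def</part2>
  </id>
  <id type="secondType">
    <part1>ghi</part1><part2>jkl</part2>
  </id>
</ids>"""

root = ET.fromstring(FIXED)
# Each <id> carries its own parts, so we can look identifiers up by type.
ids_by_type = {
    node.get("type"): node.findtext("part1", "") + node.findtext("part2", "")
    for node in root.findall("id")
}
print(ids_by_type)  # {'firstType': 'abcdef', 'secondType': 'ghijkl'}
```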

##Avoid hierarchy

Let’s say you want to express that the cities Prague and Paris are in the region Europe and the city New York is in the region North America. Here is the wrong way:

<regions>
  <region>Europe</region>
  <city>Prague</city>
  <city>Paris</city>
  <region>North America</region>
  <city>New York</city>
</regions>

The <region> elements are used merely as headings for the cities that follow.
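
A consumer of such a document has to remember which "heading" it saw last. Here is a rough sketch of that in Python with the standard `xml.etree.ElementTree`; the function name `cities_by_region` is hypothetical:

```python
import xml.etree.ElementTree as ET

FLAT = """<regions>
  <region>Europe</region>
  <city>Prague</city>
  <city>Paris</city>
  <region>North America</region>
  <city>New York</city>
</regions>"""

def cities_by_region(xml_text):
    """Group cities under the most recent <region> sibling seen so far."""
    result = {}
    current = None  # name of the region "heading" we are currently under
    for child in ET.fromstring(xml_text):
        if child.tag == "region":
            current = child.text
            result.setdefault(current, [])
        elif child.tag == "city":
            if current is None:
                raise ValueError("city appears before any region")
            result[current].append(child.text)
    return result

print(cities_by_region(FLAT))
# {'Europe': ['Prague', 'Paris'], 'North America': ['New York']}
```

The relationship between a city and its region lives only in the parser's state, not in the document.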

The fixed version:

<regions>
  <region name="Europe">
    <city>Prague</city>
    <city>Paris</city>
  </region>
  <region name="North America">
    <city>New York</city>
  </region>
</regions>
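
Once the cities are nested inside their region, the structure of the document does the grouping. A minimal sketch, again with Python's standard `xml.etree.ElementTree`:

```python
import xml.etree.ElementTree as ET

NESTED = """<regions>
  <region name="Europe">
    <city>Prague</city>
    <city>Paris</city>
  </region>
  <region name="North America">
    <city>New York</city>
  </region>
</regions>"""

root = ET.fromstring(NESTED)
# No state to track: every city is a child of the region it belongs to.
cities = {
    region.get("name"): [city.text for city in region.findall("city")]
    for region in root.findall("region")
}
print(cities)
# {'Europe': ['Prague', 'Paris'], 'North America': ['New York']}
```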

##Change the nodes based on the value of parent’s attribute

Imagine you have to put details about a customer in an XML document. Your customer can be either a person or a company. If you want to make things more challenging for the users of your API, you should use just one element <customer> and change its content completely according to the type of the customer. Example:

<!-- first customer -->
<customer type="person">
 <personFirstName>John</personFirstName>
 <personSurname>Doe</personSurname>
 <personStreet>Sesame St.</personStreet>
 <personCity>Prague</personCity>
</customer>

<!-- second customer -->
<customer type="company">
 <companyName>ABC</companyName>
 <companyStreet>Sesame St.</companyStreet>
 <companyCity>Prague</companyCity>
</customer>
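
Even a question as simple as "which city is this customer in?" now needs a branch per customer type. A sketch in Python with the standard `xml.etree.ElementTree`; the `<customers>` wrapper element and the function name `customer_city` are my assumptions, added only so the snippet is a single parseable document:

```python
import xml.etree.ElementTree as ET

DOC = """<customers>
  <customer type="person">
    <personFirstName>John</personFirstName>
    <personSurname>Doe</personSurname>
    <personStreet>Sesame St.</personStreet>
    <personCity>Prague</personCity>
  </customer>
  <customer type="company">
    <companyName>ABC</companyName>
    <companyStreet>Sesame St.</companyStreet>
    <companyCity>Prague</companyCity>
  </customer>
</customers>"""

def customer_city(customer):
    """Pick the right element name depending on the type attribute."""
    kind = customer.get("type")
    if kind == "person":
        return customer.findtext("personCity")
    if kind == "company":
        return customer.findtext("companyCity")
    raise ValueError(f"unknown customer type: {kind!r}")

cities = [customer_city(c) for c in ET.fromstring(DOC).findall("customer")]
print(cities)  # ['Prague', 'Prague']
```

Every new customer type means another branch in every consumer.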

The fixed version:

<!-- first customer -->
<customer type="person">
 <firstName>John</firstName>
 <surname>Doe</surname>
 <name></name>
 <street>Sesame St.</street>
 <city>Prague</city>
</customer>

<!-- second customer -->
<customer type="company">
 <firstName></firstName>
 <surname></surname>
 <name>ABC</name>
 <street>Sesame St.</street>
 <city>Prague</city>
</customer>
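
With the same element names for every customer type, shared fields can be read without looking at the type at all. A sketch under the same assumptions as before (Python's standard `xml.etree.ElementTree`, a hypothetical `<customers>` wrapper and a hypothetical `display_name` helper):

```python
import xml.etree.ElementTree as ET

DOC = """<customers>
  <customer type="person">
    <firstName>John</firstName>
    <surname>Doe</surname>
    <name></name>
    <street>Sesame St.</street>
    <city>Prague</city>
  </customer>
  <customer type="company">
    <firstName></firstName>
    <surname></surname>
    <name>ABC</name>
    <street>Sesame St.</street>
    <city>Prague</city>
  </customer>
</customers>"""

def display_name(customer):
    """Use first name and surname when present, otherwise the plain name."""
    parts = (customer.findtext("firstName"), customer.findtext("surname"))
    full = " ".join(p for p in parts if p)
    return full or customer.findtext("name")

customers = ET.fromstring(DOC).findall("customer")
print([c.findtext("city") for c in customers])  # ['Prague', 'Prague']
print([display_name(c) for c in customers])     # ['John Doe', 'ABC']
```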

##Don’t offer a definition

XML is a widely popular data exchange format supported in every language you can imagine. Plus, you named the elements very clearly. Users can simply open it and read it like a book.

But still, please give your users an XSD for your XML documents. They don’t like to guess what the content of an element or attribute can be, and you don’t like answering the questions that are inevitable without a proper schema definition.

Just a reminder: the title is Don’t do this in your XML.

26. 2. 2014, category: it