Define an additional data model

To construct (or decide against constructing) an additional [1] data model, you can follow these steps, which have proven helpful:
  1. exactly determine your needs
  2. decide whether using an additional data model is the best approach
  3. write down the technical definition
  4. create some documentation for it to help technical implementation
  5. create example(s)

exactly determine your needs

Successfully defining a useful addition on top of an existing data standard often hinges on how well you have analyzed the needs of the parties currently using it. Wherever data standards are in place, they were agreed upon to meet the needs of all participants as well as possible. Defining additions to them should not be done on a daily basis, because that would undermine the notion of a standard. But if you take care not to rush things, you will normally find a good starting point in the bilateral business relations where the current standard “itches” you the most. Nonetheless, you should consider seeking out others with similar problems and pooling all additional wishes before sorting them into the categories ‘mandatory’, ‘helpful’, and ‘would be nice to have’.

Now that you have gathered your needs and the needs of your business partners,

decide whether using an additional data model is the best approach

Everything in the ‘mandatory’ category has to be in the addition, everything really helpful should be in it, and everything that would merely be nice to have should be discarded. The goal of the additional standard is to be just that: a nice, small addition covering unmet core business needs, not a bloated standard of its own. If you end up defining dozens of data items that your standard is missing, you should consider using another standard in the first place.

If that is not the case, or you have already looked for a more suitable standard and found that nothing really fits, then continue to

write down the technical definition

Open the template spreadsheet, which will help you define your own data addition. Fill column A (Information) with the new metadata you decided upon, using a new row for every item. Then go through the items one by one and decide on the technical identifier (technical name), how often the item may occur (min, max), and the exact technical (that is, programmatically usable) structure that will carry it. Look at the structures you and your business partners already use to carry the information and consider at least a union of them; perhaps a fitting structure already exists, such as String or digit. Be careful not to define things too narrowly at this point, for it is easy to lose the usability of the whole definition if you define, for example, an order number as at most 6 digits while some of your partners have letters in their order references or use references that are simply 10 digits long.
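As a rough illustration of one spreadsheet row, the following Python sketch models an item with its technical name, occurrence bounds, and structure, and shows how a too-narrow structure rejects a partner's order reference. The class `ItemDefinition`, its field names, and the use of a regular expression as the "structure" are assumptions made for this example; they are not part of the template spreadsheet.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ItemDefinition:
    """One row of the definition (hypothetical model, not the real template)."""
    information: str           # human-readable description (column A)
    technical_name: str        # technical identifier
    min_occurs: int            # minimum number of occurrences
    max_occurs: Optional[int]  # maximum number of occurrences, None = unbounded
    pattern: str               # structure, expressed here as a regular expression

    def validate(self, values: List[str]) -> List[str]:
        """Return a list of problems; an empty list means the values fit."""
        problems = []
        if len(values) < self.min_occurs:
            problems.append(f"{self.technical_name}: fewer than {self.min_occurs} value(s)")
        if self.max_occurs is not None and len(values) > self.max_occurs:
            problems.append(f"{self.technical_name}: more than {self.max_occurs} value(s)")
        for value in values:
            if re.fullmatch(self.pattern, value) is None:
                problems.append(f"{self.technical_name}: {value!r} does not fit {self.pattern}")
        return problems


# Too narrow: at most 6 digits, no letters.
narrow = ItemDefinition("Order number", "orderNumber", 1, 1, r"[0-9]{1,6}")
# Broader: letters, digits and hyphens, up to 20 characters.
broad = ItemDefinition("Order number", "orderNumber", 1, 1, r"[A-Za-z0-9-]{1,20}")

for reference in ["4711", "1234567890", "PO-2024-001"]:
    print(reference, narrow.validate([reference]), broad.validate([reference]))
```

Running candidate definitions against real order references from every partner before freezing the spreadsheet is a cheap way to catch such mismatches early.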

Consider whether the information is of an additional or an alternative nature. If the information cannot be shipped with the underlying data structure at all, it is an addition. If, for instance, you would like to express information that is already contained in a different way, it is alternative data (e.g. the article name/description in English and an alternative naming in Chinese).

Furthermore, determine which references the additional data is allowed to refer to. At this point you are done with the spreadsheet and only have to consider a means of transport for your data. Since additional data is always just an addition to some basic data structure, you might want to transport the addition as close as possible to the data it is “adding” information to. You might want to only propose a way of transport instead of making it mandatory, but that ultimately depends on your situation and the transports already in use. To make identification of the additional data scheme in use as easy as possible, you should give the resulting XML file a name that closely follows the name of the data scheme.

The preferred way to name it is

<prefix>-<business sector>-<base object group>-<version>-<unique part>.xml

part                                   meaning
prefix                                 additional_data
business sector (defining party id)    e.g. logistics
base object group                      invoice
version                                1.0
unique part                            e.g. uuid, timestamp

At first sight the file name seems a little bulky, but it allows for easy identification of what is contained in the file without even parsing it, saving a lot of resources in processing.
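The naming scheme above can be built and taken apart again without opening the file. The following Python sketch shows one way to do that; the function and class names are hypothetical. It also assumes that the first four parts never contain a hyphen, while the unique part may (a UUID does), which the scheme itself does not state.

```python
import uuid
from typing import NamedTuple, Optional

SEPARATOR = "-"
EXTENSION = ".xml"


class AdditionalDataFileName(NamedTuple):
    prefix: str
    business_sector: str
    base_object_group: str
    version: str
    unique_part: str


def build_file_name(business_sector: str, base_object_group: str, version: str,
                    unique_part: Optional[str] = None,
                    prefix: str = "additional_data") -> str:
    """Join the parts into <prefix>-<sector>-<group>-<version>-<unique>.xml."""
    if unique_part is None:
        unique_part = str(uuid.uuid4())  # assumption: a random UUID as unique part
    leading = [prefix, business_sector, base_object_group, version]
    for part in leading:
        # Assumption: a hyphen inside these parts would make parsing ambiguous.
        if not part or SEPARATOR in part:
            raise ValueError(f"part must be non-empty and hyphen-free: {part!r}")
    return SEPARATOR.join(leading + [unique_part]) + EXTENSION


def parse_file_name(file_name: str) -> AdditionalDataFileName:
    """Split a file name into its five parts without reading the file."""
    if not file_name.endswith(EXTENSION):
        raise ValueError(f"not an {EXTENSION} file: {file_name!r}")
    stem = file_name[: -len(EXTENSION)]
    # Split at the first four hyphens only; the unique part keeps its own hyphens.
    fields = stem.split(SEPARATOR, 4)
    if len(fields) != 5 or not all(fields):
        raise ValueError(f"unexpected file name layout: {file_name!r}")
    return AdditionalDataFileName(*fields)


name = build_file_name("logistics", "invoice", "1.0", "20240131T120000")
print(name)
print(parse_file_name(name))
```

A receiving system could use something like `parse_file_name` to route or reject files by sector, object group, and version before any XML parsing takes place.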

The defining part of your work is done at this point. Of course, you still need to

create some documentation for it, to help technical implementation

Even if your group of partners is rather small, it is unlikely that every party sent the technician who will later implement the addition to the definition workshop. To make implementation as painless as possible, you should now take a few minutes to write a small piece of documentation for the additional data (as a minimum you can always use the spreadsheet plus a short document that motivates the business cases and sums up the means of transport). Just write down what you discussed earlier. For the items, that means what exactly every information item is and why it is helpful or needed in which business process(es).

At this point you should also consider extending the basic additional data xsd for your special case. You don’t need to, but a more elaborate xsd surely helps both ends of the process: senders can make sure they produce valid metadata structures, and receivers can make sure the data can indeed be processed without (technical) errors.

Feel free to create a more sophisticated document explaining the whys and hows in greater detail. But as all of us know, all theory is dry, and to moisten it for others to swallow you should

create example(s)

Be sure to create at least one sample file for your addition. It might even be best to create samples for all the business cases you had in mind when defining the addition. Name them after those business cases and you already have a nice little package for everybody who might be interested in joining your club of extended metadata users.
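A sample file per business case can be generated with a few lines of Python. The sketch below is only an illustration: the element and attribute names (`additionalData`, `entry`, `item`, `reference`, `name`), the item names, and the business case names are all made up for this example and would have to be replaced with the structure your xsd actually prescribes.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_sample(reference: str, items: Dict[str, str]) -> str:
    """Build one sample document as an XML string (hypothetical structure)."""
    root = ET.Element("additionalData", {"version": "1.0"})
    # The entry points at the base object the additional data belongs to.
    entry = ET.SubElement(root, "entry", {"reference": reference})
    for technical_name, value in items.items():
        item = ET.SubElement(entry, "item", {"name": technical_name})
        item.text = value
    return ET.tostring(root, encoding="unicode")


# One sample per business case, keyed by a (made-up) business case name.
BUSINESS_CASES = {
    "order-reference-with-letters": {"orderNumber": "PO-2024-001"},
    "alternative-article-name": {"articleNameAlternative": "Schraubendreher"},
}

samples = {
    case: build_sample("invoice-0001", items)
    for case, items in BUSINESS_CASES.items()
}

for case, xml_text in samples.items():
    print(case)
    print(xml_text)
```

Keeping the samples generated from one small script makes it easy to regenerate them whenever the definition changes.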

You should now have at least these components:
  • a document motivating business cases and declaring means of transport
  • a spreadsheet containing the technical definition in detail
  • an xsd file (derived from the base xsd or maybe just the base xsd)
  • at least one sample file

If you found this article rather dry as well, then take a look at our Quickstart.

[1] Throughout this article we shorten the phrase “additional or alternative” to “additional”.