Within the industry, we talk a lot about object-oriented languages. A great quote I picked up from David A. Taylor is "Objects themselves are naturally adoptive units of software, but they lose all their flexibility if they are locked into conventional applications." (Object Technology: A Manager’s Guide, 1998, Addison-Wesley) That is why I was so surprised in the Harrisburg Java User group this month when I saw a very neat demo using XML within an old-style SQL data-extract statement, contradicting this axiom.

XML (eXtensible Markup Language) is a way to code data for transfer between machines in a very portable way, just as HTML web pages are portable. Web pages are portable because the HTML Markup Language is universal -- it doesn’t care what the hardware is that is "serving" up the HTML and it doesn’t care what kind of PC the browser is loaded on. In a similar way, XML is a way to make data portable. Many products are now taking advantage of the XML way of coding data, from Office 2000 document-templates to HostBridge’s CICS Transaction Server for mainframes which intercepts transactions requests and converts them to XML on the fly.

To make data formatted to XML portable, the embedded tags are just identifiers (just as HTML has "identifiers" for formatting) which must be extracted or "parsed". There are 2 kinds of XML parsers. First is the SAX Simplified API for XML parser, a smaller, quicker parser that reads sequentially through the document. The second is a higher level DOM Document Object Model parser that is the heavier duty cousin, which keeps the entire document in memory for subsequent manipulation or update by an object-oriented language.

Although storing data into a document of ASCII text with the XML label on each data element can make your data be read universally, the parsing of the data is slow and it takes a lot of space to store these documents, since the tags are often repeated throughout the document / throughout multiple documents. Also, the document itself may include a lot of white space or blanks. Thus, XML will never replace a true database table for efficiency in hard drive space and retrieval speed.

New DB Tables called UDB (Universal DataBase) tables are now being provided by several vendors which allow pictures, documents and even XML docs to be stored next to regular data-fields within a table, providing some real flexibility. And now SQL (Structured Query Language, the 20-year-old extraction and report language used on all computers with database tables) is now able to handle the new XML formats.

The astounding demo I saw was using the SAX parser in a custom Java function to return parts of an "XML field" within a regular database table. It was also using regulay SQL syntax!

In Oracle 8i, I was very impressed to see the SQL statement "select xmlutil.getNode( 'STATE', xml-Doc-field ) from file-name" use a custom Java method "getNode" to extract a field from an XML document which was stored in a field of a UDB table.

This timed as being 20% faster than using PL SQL string manipulation. And a data base trigger (like a record add) can also force this customized Java function. So by adding an empty record containing only an XML document into your table, you would set off parser functions to auto-load all the other fields in the table.

Similarly, IBM’s DB2 XML Extender can store an XML in a field or provide a link or pointer from a table field to an external document (eg. a picture of a product) and even if the file is moved within the scope of the file system, the link is not broken, but remains intact. It can also map XML fields to regular fields or index the XML fields by use of a DAD (Document Access Definition). This IBM product is available for DB2 on AIX, Sun, Linux, NT, but not iSeries AS/400. (Redbook SG24-6130-00, Dec 2000). This approach also makes embedded XML documents appear within regular database SQL extractions.

So, Oracle’s use of Java within SQL contradicts David A. Taylor’s assumption. And IBM’s innovative approach melding tables and XML is something to make the rest of us sit up and take notice.

John Voris, Digica Inc www.digica.com


Return to Home Page