The following is a replicated copy of XML By Example from IBM Developer Connection. This copy is presented only for completeness and to avoid a broken link.

XML by example

by Richard Sinn

While humans communicate verbally, companies communicate with documents. In a traditional setting, companies use application forms, memos, accounts receivable documents, purchase orders, and so on to communicate within the organization or with other companies. In today’s e-business world, most companies have Web sites and use HTML documents as their main communication vehicle for customers and business partners.

Documents are traditionally in a human-readable or “what you see is what you get” format to enable communication between humans. Some of the most popular document editing tools produce formatted documents. While formatted documents look and print extremely well on paper, they often are not suitable for Web publishing. Most Web documents, such as HTML, are in ASCII text format. For comparison, Figure 1 shows a document named Profile00 in Word format and Figure 2 shows it in XML format.

Figure 1

Figure 2

In need of a structured document - XML

Businesses need a form of document that can be understood by both humans and computers. In addition, the document must carry enough added information (metadata) to convey its underlying structure as well as the meaning of its data. XML was created to do that. In general, an XML document consists of the following three parts:

  • Structure: Structure is the document type and the organization of its elements. Examples of document types include memos, application forms, and resumes. A set of rules enforces what kind of elements the document contains, in what order they occur, and what additional attributes the elements are allowed to have.
  • Presentation: This is the way information is presented to the reader, whether on the Web, on a piece of paper, or via voice synthesis. It also specifies details such as whether a block of text is in bold or italic and which fonts to use.
  • Data content: The informational data contained in a document.

This article presents an example to show how XML divides structure, presentation and data content. Before that, let's take a brief look at the history of XML as well as some well-known applications.

History and applications of XML

XML has been in development since the 1960s through its parent, SGML. SGML became an international standard in 1986 as the way to publish structured documents. Although SGML contains many useful concepts and capabilities for complex publishing, it found little application other than as a difficult technology for high-end systems used by corporations with deep pockets. In the mid-1990s, an SGML application called HTML emerged as the main publishing method for large-scale electronic documents on the World Wide Web. In 1996, a working group in the World Wide Web Consortium started developing XML. In a way, XML is a “cleaned up” version of SGML that retains its very powerful structured concept but removes portions that are very complex and have limited application. In other words, XML is a streamlined version of SGML designed for transmission of structured data over the Web.

To provide better customer service, most financial institutions offer online banking to their customers. Users can purchase goods online with their credit card, download their credit card statements, and pay their bills without any concern about how data is represented in different financial transactions by different institutions. Online banking like this can be done by any OFX-compliant application. OFX stands for Open Financial Exchange, an XML application developed jointly by Microsoft, Intuit, and Checkfree. (For more information, go to http://www.ofx.net/ofx/.) An OFX transaction in XML using Microsoft Money might look like the following:

<RequestStatement>
<BankAccount>
    <BankID>888</BankID>
    <AccountID>9394</AccountID>
    <AccountType>CHECKING</AccountType>
</BankAccount>
</RequestStatement>

Another common XML application initiative is Channel Definition Format (CDF). It is an XML application that enables the timely delivery of business-critical information. Users of “Channels” locate and register channel information of interest to them and their business. After registration, any changes to the selected information appear automatically, so users do not have to revisit the site and download the information again. CDFs are used in Windows Active Channel, Active Desktop, and Microsoft Software Update.

Where to start

There is a lot of diverse XML information on the Web. One of the most popular places to get started is IBM developerWorks at http://www.ibm.com/developer/. The developerWorks Web site is a good source of information on the latest technology. Under the XML Zone, articles on different XML topics, including XML tutorials, development tools, and sample code, are available to download.

XML example

Figure 3 shows how XML data is developed. The first component is an XML document, which contains character data marked up with XML tags. Next, an XML document can optionally be associated with a set of rules known as a Document Type Definition (DTD). The DTD specifies rules such as the ordering of elements, default values, and so on. The third component is the XML parser, which checks the XML document against the DTD and then splits the document into markup regions and character-data regions. After processing by the XML parser, the data is in a structured format and can be processed by any XML application.

Figure 3
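
The parsing step described above can be sketched in a few lines of Python. This is only a minimal sketch, not the tool chain used in this article: it assumes Python's standard xml.etree.ElementTree module, which checks that a document is well formed but does not validate it against a DTD. The cut-down sample document and the function name split_regions are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a cut-down profile, not the full example document.
SAMPLE = """<?xml version="1.0"?>
<profile>
  <owner type="STUDENT">
    <Name>
      <FirstName>Richard</FirstName>
      <LastName>Sinn</LastName>
    </Name>
  </owner>
</profile>"""


def split_regions(xml_text):
    """Parse the text and return (markup, character data) pairs.

    Each pair holds an element's tag name and its own text content,
    with surrounding whitespace stripped.
    """
    root = ET.fromstring(xml_text)
    regions = []
    for element in root.iter():  # document order, root element first
        text = (element.text or "").strip()
        regions.append((element.tag, text))
    return regions


for tag, text in split_regions(SAMPLE):
    print(f"{tag}: {text!r}")
```

Once the document has been split this way, an application can work with the tag names and the character data separately, which is the "structured format" the paragraph above refers to.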

Let’s make a personal profile as our first XML example. XML documents can be edited with any text editor, such as Notepad in Windows or vi in UNIX. However, if you are using a plain text editor, you have to type in all the tags manually. Some XML-specific editors, like the one shown in Figure 4, help eliminate the need to type the tags manually. However, most of these editors today are not as functionally rich as their everyday word processor counterparts.

Figure 4

<?xml version="1.0"?>
<!DOCTYPE profile SYSTEM "profile.dtd">
<profile>
<owner type = "STUDENT" age = "20">
<Name>
    <FirstName>Richard</FirstName>
    <MiddleName init = "P">Pong Nam</MiddleName>
    <LastName>Sinn</LastName>
</Name>
<Phone>
    <Home>(000)000-0000</Home>
    <Work>(000)000-0000</Work>
    <Fax/>
    <Pager/>
    <Cell/>
</Phone>
<Address type = "HOUSE">
    <StreetAddr>555 Bailey Avenue</StreetAddr>
    <City>San Jose</City>
    <State>Ca</State>
    <ZipCode>95141</ZipCode>
</Address>
<Email>
    <ul>
        <li>sinn@us.ibm.com</li>
        <li>sinn@mathcs.sjsu.edu</li>
        <li>webmaster@openloop.com</li>
    </ul>
</Email>

<Education>
    <Institution>
        <GraduationDate>1998</GraduationDate>
        <schoolName>University of Minnesota-Twin Cities</schoolName>
        <degree type = "MS" major = "CS" gpa = "3.97"/>
    </Institution>

    <Institution>
        <GraduationDate>1994</GraduationDate>
        <schoolName>University of Wisconsin-Madison</schoolName>
        <degree type = "BS" major = "CS" gpa = "3.90"/>
    </Institution>
</Education>
<techSkills>
    <Languages>Java</Languages>
    <Languages>C++</Languages>
    <Languages>C</Languages>
    <Languages>JavaScript</Languages>
    <Languages>XML</Languages>
    <Languages>HTML</Languages>
    <Languages>SQL</Languages>
    <System>Windows</System>
</techSkills>
</owner>
</profile>

In the above example, the XML declaration <?xml version="1.0"?> tells the parser that we are using standard XML version 1.0. The second line indicates that we are using profile.dtd as our Document Type Definition. The current XML document is checked against the rules stated in profile.dtd. The example also shows how start-tags and end-tags enclose content data. In a well-formed XML document, every start-tag is closed by a matching end-tag. (For example, the start-tag <profile> is ended with </profile>.) Element names are case-sensitive, so the names used in the document must match the names declared in the DTD exactly.

In the following code sample, the element Address has an attribute called "type" that is set to the value “HOUSE.” Address also contains four sub-elements in this order: StreetAddr, City, State, ZipCode.

<Address type = "HOUSE">
    <StreetAddr>555 Bailey Avenue</StreetAddr>
    <City>San Jose</City>
    <State>Ca</State>
    <ZipCode>95141</ZipCode>
</Address>
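
To see how an application might read this fragment, here is a short sketch in Python. It assumes the standard xml.etree.ElementTree module, which is not the parser used in this article, and the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

ADDRESS = """<Address type = "HOUSE">
    <StreetAddr>555 Bailey Avenue</StreetAddr>
    <City>San Jose</City>
    <State>Ca</State>
    <ZipCode>95141</ZipCode>
</Address>"""

address = ET.fromstring(ADDRESS)

# Attributes are looked up by name on the element.
address_type = address.get("type")

# Iterating over an element yields its sub-elements in document order.
child_names = [child.tag for child in address]

# findtext returns the character data of the first matching sub-element.
city = address.findtext("City")

print(address_type)
print(child_names)
print(city)
```

The attribute and the sub-elements are reached in different ways, which mirrors the distinction XML itself makes between an element's attributes and its content.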

Document type definition (DTD) example

In order to ensure authors follow certain rules when writing an XML document, a DTD is used. The following is the profile DTD used in our example.

<!-- Document type Definition for the Profile Application -->

<!-- A profile document contains one or more owners -->
<!ELEMENT profile (owner)+>

<!-- An owner contains these six sections in this sequence -->
<!ELEMENT owner (Name, Phone, Address, Email, Education, techSkills)>

<!-- Every owner is either a STUDENT or PROFESSIONAL 
This is indicated by its type attribute.
If a value is not supplied for this attribute,
it defaults to STUDENT -->
<!ATTLIST owner type (STUDENT|PROFESSIONAL) "STUDENT">

<!-- Every owner must also have an age attribute. -->
<!ATTLIST owner age CDATA #REQUIRED>

<!ELEMENT FirstName ANY>
<!ELEMENT LastName ANY>

<!ELEMENT Name (FirstName, MiddleName, LastName)>
<!ELEMENT MiddleName ANY>
<!ATTLIST MiddleName init
 (A|B|C|D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) #IMPLIED>

<!ELEMENT Home ANY>
<!ELEMENT Work ANY>
<!ELEMENT Fax ANY>
<!ELEMENT Pager ANY>
<!ELEMENT Cell ANY>

<!ELEMENT Phone (Home, Work, Fax, Pager, Cell)>

<!ELEMENT StreetAddr ANY>
<!ELEMENT City ANY>
<!ELEMENT State ANY>
<!ELEMENT ZipCode ANY>

<!ELEMENT Address (StreetAddr, City, State, ZipCode)>
<!ATTLIST Address type (HOUSE|APT) "APT">

<!ELEMENT Email (ul)+>

<!ELEMENT li ANY>
<!ELEMENT ul (li)+>

<!ELEMENT Education (Institution)+>

<!ELEMENT GraduationDate ANY>
<!ELEMENT schoolName ANY>
<!ELEMENT degree ANY>

<!ELEMENT Institution (GraduationDate, schoolName, degree)> 

<!ATTLIST degree 
            type (BS|MS|PhD) "BS"
            major (CS|Math|Other) "CS"
            gpa CDATA #REQUIRED>

<!ELEMENT System ANY>
<!ELEMENT Languages ANY>
<!ELEMENT techSkills (System|Languages)+>

Most of the rules are documented with comments. Let’s take a closer look at the rules regarding address.

<!ELEMENT Address (StreetAddr, City, State, ZipCode)>
<!ATTLIST Address type (HOUSE|APT) "APT">

The first line states that an element of type Address contains four sub-elements: the Address must have a StreetAddr element, then City, then State, and finally ZipCode. The second line states that an element of type Address has an attribute called "type" whose value is either HOUSE or APT. If the attribute is omitted, its value defaults to APT.
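
To make the two rules concrete, the following Python sketch restates them by hand and checks an Address element against them. It is an illustration only: Python's standard xml.etree.ElementTree module does not validate against a DTD, so the rules are hard-coded as constants here instead of being read from profile.dtd. The function name check_address and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# The two Address rules from the DTD, restated by hand as Python data.
REQUIRED_CHILDREN = ["StreetAddr", "City", "State", "ZipCode"]
ALLOWED_TYPES = {"HOUSE", "APT"}
DEFAULT_TYPE = "APT"


def check_address(xml_text):
    """Return a list of rule violations for one Address element."""
    address = ET.fromstring(xml_text)
    problems = []

    # Rule 1: exactly these sub-elements, in this order.
    children = [child.tag for child in address]
    if children != REQUIRED_CHILDREN:
        problems.append(f"expected {REQUIRED_CHILDREN}, found {children}")

    # Rule 2: type is HOUSE or APT; a missing attribute takes the default.
    address_type = address.get("type", DEFAULT_TYPE)
    if address_type not in ALLOWED_TYPES:
        problems.append(f"type {address_type!r} is not allowed")

    return problems


# No type attribute, so the default applies; sub-elements are in order.
good = ("<Address><StreetAddr>555 Bailey Avenue</StreetAddr>"
        "<City>San Jose</City><State>Ca</State>"
        "<ZipCode>95141</ZipCode></Address>")

# Wrong attribute value and City placed before StreetAddr.
swapped = ("<Address type='CONDO'><City>San Jose</City>"
           "<StreetAddr>555 Bailey Avenue</StreetAddr>"
           "<State>Ca</State><ZipCode>95141</ZipCode></Address>")

print(check_address(good))
print(len(check_address(swapped)))
```

A validating parser performs checks of this kind for every declaration in the DTD, so in practice the rules stay in the DTD and are not rewritten in application code.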

Checking your XML document

There are many XML parsers available. If you visit , there are more than 10 free parsers available for download in the XML section. In this article, I used a Microsoft command-line validation tool called XMLINT.EXE. It is an updated version of the XMLINT command-line tool that shipped in the Internet Explorer 4 SDK. The tool checks whether a given XML file is well formed. It also uses the XML DOM to check that the document is valid according to the DTD.

Figure 5 shows two error messages caused by a missing MiddleName element and a missing </Name> tag. When the document is correct, the parser returns no error messages.

Figure 5
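
A similar well-formedness check can be sketched in Python. This is not XMLINT.EXE and does not reproduce its messages: it assumes the standard xml.etree.ElementTree module, which stops at the first well-formedness error and does not check validity against a DTD. The function name well_formedness_error and the sample strings are made up for illustration.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if the text is well formed, else the parser's message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None


complete = "<Name><FirstName>Richard</FirstName><LastName>Sinn</LastName></Name>"

# Same content, but the closing </Name> tag is missing.
truncated = "<Name><FirstName>Richard</FirstName><LastName>Sinn</LastName>"

print(well_formedness_error(complete))
print(well_formedness_error(truncated))
```

Note that a missing MiddleName element is a validity error, not a well-formedness error, so this sketch would not report it. Catching that kind of problem requires a validating parser that reads the DTD.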

Viewing your XML document

You can view your XML document with a graphical user interface (GUI). Before the release of the Microsoft Internet Explorer 5.0 Web browser, the only way of viewing an XML document was by using a Java applet like the one shown in Figure 6. With Internet Explorer 5.0, you can view your XML document natively in a browser, as shown in Figure 7. Clicking the “+” sign expands the details of an XML section, as shown in Figure 8.

Figure 6

Figure 7

Figure 8

Conclusion

The information technology industry is full of buzzwords such as groupware, directory system, Internet, intranets, and extranets. Most of these technologies have been hyped to death with very little thought about what the Internet was designed for: improving how information and resources are shared. XML can help you organize your information and resources. It is the future of e-business communication and Web publishing. I hope this article gives you a quick introduction to where to download useful XML-related tools and gives you a jump start on learning XML.

Author

Richard Sinn is a Staff Software Engineer at the IBM Santa Teresa Laboratory in San Jose, California. He is also a lecturer at San Jose State University and a freelance writer for different magazines and journals. He can be reached via e-mail at webmaster@openloop.com or at his Web site at http://www.openloop.com/.

This document is maintained by devcon@us.ibm.com.
 
