Launching XML Web Sites with Rocket

Over the past 21 months I've used this space to extol the capabilities of XML and to present projects that both teach the craft and provide useful applications. During that time, you've had the opportunity to learn how the DOM can be used to report on XML documents, to navigate Web-site structures, and more. Together, we've examined how XSL can be used to transform XML documents into HTML. And we have explored the fine art of serving XML using Java servlets and Active Server Pages (ASP).

As Web Techniques moves to its new format, this month marks the final installment of Beyond HTML. As a parting gift, I'd like to present a framework I've created called Rocket (available at the Rocket home page). The name is a play on XML's "skyrocketing" success in virtually every phase of computing. In a nutshell, Rocket is a collection of skeletal XML documents, XSL style sheets, and DTDs that you can use as a basis for creating your own XML-based Web site. To serve XML documents, Rocket includes a collection of "interface" programs that perform browser detection, determine how documents should be served, select and apply style sheets, and more.

Using Rocket, you can quickly enable your site to serve XML documents. Rocket lets you transform XML into HTML that has been specially tuned for rendering in Netscape Navigator. Rocket also lets you exchange XML streams between XML-capable browsers and HTTP servers, so those browsers can communicate directly with your server. Rocket does all of this without interfering with your existing Web documents -- it coexists with all of your HTML, applets, and CGI scripts. Best of all, Rocket is easy to install and ready to run: All you need to do is customize it for your site.

Before you read on, I should mention that Rocket is currently designed to operate with ASP. However, there's nothing to prevent you from dropping the framework into a servlet environment, or running it in conjunction with Perl's XML::Parser module. In this case, the gateway programs will have to be rewritten to run in those environments -- a half day's work for any knowledgeable servlet programmer or Perl developer. The one caveat is that some style sheets may have to be tweaked to work with newer style-sheet processors.

How It Works

Figure 1 shows a typical interaction between a browser and an XML-enabled server. The interaction begins when the browser requests an XML document. On the server, the request is intercepted by some type of "gateway" program. Because Rocket currently uses ASP, this gateway is a JavaScript program contained in default.asp (Listing One) in the root directory. One task of the gateway program is to determine browser support and to select a method for serving XML content. For additional details see the box titled "Specifying Documents in the Query String." In the typical interaction, we assume that the client making the request is not XML savvy. (In other words, the browser is not Internet Explorer 5.) So, the gateway program processes the XML document on the server. For this purpose, Rocket puts browsers into one of two categories: Netscape Navigator and other browsers, which are treated generically. If Navigator is making the request, a Navigator-specific style sheet is applied; otherwise a generic style sheet is applied.
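The browser-detection step can be sketched in a few lines of JavaScript. This is not the code from Listing One, only a rough illustration: the class labels ("xml", "navigator", "generic"), the function name, and the user-agent tests are my assumptions, and the tests are heuristics rather than a reliable detection scheme.

```javascript
// Heuristic browser classification for a Rocket-style gateway.
// The class names ("xml", "navigator", "generic") are illustrative labels,
// not identifiers taken from Rocket's default.asp.
function classifyBrowser(userAgent) {
  var ua = String(userAgent || "");

  // Opera of this era can identify itself as MSIE, so rule it out first.
  if (/Opera/i.test(ua)) {
    return "generic";
  }

  // Internet Explorer 5 or later: assumed able to process XML and XSL itself.
  var msie = /MSIE (\d+)/.exec(ua);
  if (msie) {
    return parseInt(msie[1], 10) >= 5 ? "xml" : "generic";
  }

  // Netscape Navigator 4.x reports "Mozilla/4.x" without the
  // "compatible" token that other Mozilla-labelled browsers add.
  if (/^Mozilla\/4\./.test(ua) && !/compatible/i.test(ua)) {
    return "navigator";
  }

  return "generic";
}
```

In an ASP page, the string to classify would typically come from Request.ServerVariables("HTTP_USER_AGENT").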

XML documents are also placed into classes. The six classes are the home page, standard Web page, article, news story, archive, and biography. For example, if an index page is being requested, the gateway program attaches the webpage.xsl style sheet to the requested document. This style sheet knows how to transform and render index.xml documents. If a news document is being requested, the gateway program selects the news.xsl style sheet, and so on.
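One simple way to picture the class-to-style-sheet relationship is a lookup table. The sketch below is only that: the class keys and the function name are hypothetical, and while webpage.xsl, news.xsl, and homepage.xsl are Rocket file names, the other three file names are guesses made for illustration.

```javascript
// Document classes named in the article, mapped to style sheet files.
// webpage.xsl, news.xsl, and homepage.xsl appear in the article;
// the other three file names are guesses made for this sketch.
var STYLE_SHEETS = {
  home: "homepage.xsl",
  webpage: "webpage.xsl",
  article: "article.xsl",     // assumed name
  news: "news.xsl",
  archive: "archive.xsl",     // assumed name
  biography: "biography.xsl"  // assumed name
};

// Return the style sheet file for a document class, or fail loudly
// when the class is not one of the six known classes.
function styleSheetForClass(docClass) {
  if (!Object.prototype.hasOwnProperty.call(STYLE_SHEETS, docClass)) {
    throw new Error("Unknown document class: " + docClass);
  }
  return STYLE_SHEETS[docClass];
}
```

Failing on an unknown class, rather than falling back to a default style sheet, makes a mislabeled document show up during testing instead of rendering oddly in production.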

Once the browser and document classes have been determined, this information is combined to select the style sheet instance. With this information in hand, the gateway program can invoke the XML and XSL processors, load the XML document, apply its style sheet, and return the resulting transformed HTML. (For details on this process, see "Beyond HTML," Web Techniques, November 1999.)
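The load-and-transform step might look something like the following sketch. It assumes the MSXML DOM members async, load(), parseError, and transformNode(); the function names, the per-browser subdirectory under stylesheets, and the idea of passing in a factory function are my own inventions for illustration, not Rocket's actual code.

```javascript
// Combine the browser class and the class style sheet into a path.
// The "stylesheets/<browser>/" layout is an assumption of this sketch.
function styleSheetPath(browserClass, styleSheetFile) {
  return "stylesheets/" + browserClass + "/" + styleSheetFile;
}

// Load an XML document and a style sheet, then return the transformed text.
// createDom must return an object with the MSXML DOM members used here:
// async, load(), parseError, and transformNode().
function transformOnServer(createDom, xmlPath, xslPath) {
  var doc = loadDocument(createDom, xmlPath);
  var sheet = loadDocument(createDom, xslPath);
  return doc.transformNode(sheet);
}

// Load one file synchronously and report parse failures as exceptions.
function loadDocument(createDom, path) {
  var dom = createDom();
  dom.async = false;               // load synchronously
  if (!dom.load(path)) {
    throw new Error("Cannot load " + path + ": " + dom.parseError.reason);
  }
  return dom;
}
```

Under ASP, createDom could be a function that returns Server.CreateObject("Microsoft.XMLDOM"), the paths would usually pass through Server.MapPath, and the returned string would be sent with Response.Write. Passing the factory in as a parameter also lets you exercise the logic with a stand-in object outside the server.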

So, what happens when the request is made by an XML-capable client? In this case, Rocket redirects the client to the XML page (home.xml, for instance). This has the effect of serving the XML document directly to the client. In this case, the browser will process the XML document, apply its style sheets, and perform any necessary transformations. The benefit here is that the server doesn't need to process the XML document, thus avoiding overhead.
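The choice between redirecting and transforming can be reduced to a small decision function. In this sketch, the "xml" label for an XML-capable client, the function name, and the shape of the returned object are all hypothetical; they only illustrate the branch described above.

```javascript
// Decide how the gateway should answer a request.
// browserClass "xml" marks an XML-capable client; the label and the
// returned object shape are inventions of this sketch.
function planResponse(browserClass, xmlUrl) {
  if (browserClass === "xml") {
    // The client fetches the XML itself and applies the style sheet
    // that the document references.
    return { action: "redirect", location: xmlUrl };
  }
  // Everyone else gets HTML produced on the server.
  return { action: "transform", source: xmlUrl };
}
```

In an ASP gateway, the redirect branch would normally end in a call to Response.Redirect with the document's URL, while the transform branch would hand the document to the XML and XSL processors.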

Site Structure

Conceptually, Rocket is laid out like your typical Web site (see Figure 2). The home page provides a starting point for navigating the site. The one difference is that the documents being navigated are XML documents. In theory, these documents are stored in a document repository. (This leaves open the possibility of document storage and retrieval being managed through a database, by a publish-and-subscribe system, or by other means.) Rocket also provides the ability to access data through an ODBC data source. The data could be XML text or raw data. Finally, Rocket contains repositories for DTDs and XSL style sheets. Separating style sheets, DTDs, and documents in this manner allows, say, style-sheet developers to work on transformations and presentation, and authors to create new documents, independently of each other.

In practice, Rocket uses the Web's file-based model to store documents, as shown in Figure 3. Style sheets are stored in a common stylesheets directory and DTDs are maintained in a DTD directory. However, XML documents are spread throughout the site, similar to the way HTML documents are stored. Each subdirectory contains two files: index.xml and default.asp. This default.asp file is a slimmed-down version of the root default.asp file (discussed later). The other file, index.xml, provides the starting point for navigating that particular topic area. index.xml references a style sheet, webpage.xsl, which transforms index documents to HTML.

Installing Rocket

Before you install the framework, you first need to ensure that your server has everything Rocket needs: an HTTP connection, ASP support, and XML capabilities. If you're running Internet Information Server on Windows NT Server, the first two are given. (For more details on installing ASP support on Personal Web Server under Windows 95/98 or Peer Web Services on NT Workstation, see "Beyond HTML," Web Techniques, November 1999.)

The other piece of the equation is XML support. You can use any XML processor you wish. However, the easiest way to add XML support to your server is to install IE 5, which installs the XML and XSL processors, registers the text/xml MIME type, and creates associations for the .xml and .xsl file types. Because Rocket is designed to work with the MSXML processor, this is the recommended approach.

If you prefer to use a processor other than MSXML, such as IBM's XML for Java (XML4J) or James Clark's Expat, you'll have to manually register the MIME and file types with the OS. Either of these approaches has the benefit of being OS and Web-server independent. In the case of XML4J, you'll likely have to install a servlet engine and write servlets to replace the functionality of the ASPs (see "Beyond HTML," Web Techniques, May 1999 for details). In the case of Expat, you may want to consider using CGI scripts to perform your browser detection, and Perl's XML::Parser module, which provides an interface to the Expat parser, to process your XML documents.

In either case, you'll have to test Rocket's style sheets to ensure compatibility with these processors. The problem is that the XSL standard has changed dramatically since Microsoft's implementation was released. As experimental technologies, both XML4J and Expat have kept pace with the emerging XSL specifications. Unfortunately, this means that the style sheets may not behave correctly under these processors. The good news is that the style sheets should work with only a few minor tweaks.

Once you have the necessary support installed, adding Rocket is a simple matter of unzipping the Rocket archive into the root directory of your test Web server, typically the wwwroot directory on IIS. Of course, I highly encourage you to test Rocket on a test server before attempting to deploy it in a production environment. With that said, Rocket does nothing that could harm your existing system. Rocket is designed to coexist with HTML, which means you can add XML functionality to your site without reworking all of its content. For example, my site maintains HTML in the archives section and uses XML for all new content.

Using Rocket

As a Web-site template, Rocket provides topic areas titled Topic One, Topic Two, Topic Three, and Topic Four. There are corresponding entries in the navigation bars located along the left side and along the bottom of the browser window. When you unpack Rocket, it also creates directories corresponding to these topic areas and places template XML documents in them. To begin customizing Rocket, you'll first want to change these topic names to reflect those on your site. Because the navigation bars are formatted according to the browser class, this information is contained in the individual style sheets. Thus, you must alter all 18 of these style-sheet files. Why didn't Rocket use <xsl:include> or <xsl:import> to include the navigation code in all the style sheets? Unfortunately, the Microsoft XSL processor doesn't support these elements. However, I'm working on a navigation scheme based on one I presented in this column in October 1998. When implemented, it will let you make your navigation changes in a single document and have them reflected throughout the site.

With the navigation handled, you can begin customizing the template documents. For example, home.xml is shown in Listing Two. It provides XML element types that let you describe the title of your home page, set a publication date, and create a table of contents. This example contains three entries, each described by a headline and an abstract. The headline element supplies the headline text that will be displayed and carries an href attribute that points to the XML document instance. When this element is processed by its style sheet, the headline is created and positioned on the page. When the user clicks on the headline, he or she is taken to that article. The abstract element holds the text of a short abstract describing the article. To customize your home page, start by replacing the text and attributes within home.xml and test the results. You can do the same with other XML document types throughout the site.
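If you would rather generate the home-page document than edit it by hand, a small script can assemble it from a list of entries. The sketch below is hedged heavily: only headline, href, and abstract are names taken from the description above, while homepage, title, pubdate, contents, and entry are placeholders I made up, so check them against Listing Two before using anything like this.

```javascript
// Escape the characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a home-page document. Only headline, href, and abstract come from
// the article; homepage, title, pubdate, contents, and entry are
// placeholder names, so check Listing Two for the real ones.
function buildHomeXml(title, pubDate, entries) {
  var lines = ['<?xml version="1.0"?>', "<homepage>"];
  lines.push("  <title>" + escapeXml(title) + "</title>");
  lines.push("  <pubdate>" + escapeXml(pubDate) + "</pubdate>");
  lines.push("  <contents>");
  for (var i = 0; i < entries.length; i++) {
    var e = entries[i];
    lines.push("    <entry>");
    lines.push('      <headline href="' + escapeXml(e.href) + '">' +
               escapeXml(e.headline) + "</headline>");
    lines.push("      <abstract>" + escapeXml(e.abstract) + "</abstract>");
    lines.push("    </entry>");
  }
  lines.push("  </contents>");
  lines.push("</homepage>");
  return lines.join("\n");
}
```

Escaping every value before it goes into the document keeps an ampersand or angle bracket in a headline from producing XML that is not well formed.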

Eventually, you'll likely want to change the appearance of the home page. The styles for the home page are contained in homepage.xsl. Like all style sheets in Rocket, this style sheet transforms the document into HTML for presentation in the browser. In addition to navigation, the style sheet places a logo at the top of the window, formats the table of contents, and places copyright notices at the bottom. You may want to provide alternative navigation, say, a thin bar along the top of the screen with pull-down menus. This is the place to do it. Simply replace the existing HTML with your navigation scheme. The IE and Navigator style sheets also include a set of CSS rules that are used to format the transformed HTML. You can also add your own style rules to further customize your site.

Document Type Definitions

Currently DTDs don't figure prominently in Rocket. To be sure, DTDs and validation are extremely important. If you create a document that doesn't conform to the style sheets, Rocket won't be able to render it. By validating your documents, you ensure that they'll be displayed. However, validation should occur at the point when a document has been authored and is being placed into the system. The next version of Rocket will include a validator that will check your documents before they're entered into the system. In the meantime, your documents will be valid as long as you follow the templates. Alternatively, you can use an editor like SoftQuad's XMetaL to author your documents. Either way, you should have little difficulty creating valid documents that will be recognized by Rocket's style sheets.
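Until that validator exists, you can run a rough check of your own at authoring time. The following is not the planned Rocket validator, just a sketch under assumptions: it relies on the MSXML DOM members async, validateOnParse, load(), and parseError, and the function name, the factory parameter, and the shape of the result object are hypothetical.

```javascript
// Report whether a document parses and validates against its DTD.
// createDom must return an object exposing the MSXML DOM members
// async, validateOnParse, load(), and parseError.
function checkDocument(createDom, path) {
  var dom = createDom();
  dom.async = false;            // load synchronously
  dom.validateOnParse = true;   // ask the parser to validate against the DTD
  if (dom.load(path)) {
    return { valid: true, path: path };
  }
  // Pass along the parser's own explanation and line number.
  return {
    valid: false,
    path: path,
    reason: dom.parseError.reason,
    line: dom.parseError.line
  };
}
```

Returning a result object instead of throwing makes it easy to run the check over a whole directory of documents and print one report at the end.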

Not Rocket Science

The last word to be said about Rocket is that it's not rocket science. Rocket is a simple solution that's designed to give you a kick-start in launching your own XML-based Web sites. Over time I would like to add style sheets that give visitors the ability to dynamically change the user interface, create DOM scripts that let you select alternative navigation schemes, and include support for Java servlets and CGI. As you can see, there's a long list of enhancements to make before Rocket becomes a general-purpose framework. To that end, I invite anyone who's interested to join the project. I'm particularly interested in adding style sheets for other browsers. Anyone who contributes to the project will be credited. The source files are maintained at www.beyondhtml.com/rocket. Whether or not you decide to join in, I'd love to hear from anyone who makes Rocket fly in his or her own projects.

(Get the source code for this article here.)


Michael is the author of Building Web Sites with XML from Prentice Hall. He provides XML training to large companies and publishes LifestylesSantaCruz.com. Michael also serves as Web Techniques' editor at large. He can be reached at mfloyd@lifestylesSantaCruz.com.




Copyright © 2003 CMP Media LLC