Roll Your Own XML Editor
By Michael Floyd
When Tim Berners-Lee developed the concepts behind the World Wide Web, he never envisioned that HTML authors would work directly with the markup language. Instead, he theorized that they would use tools to generate the markup automatically. That, of course, was before the introduction of complex tags, scripting languages, applets, dynamic HTML, and the like. The complexity of these technologies and the control that authors prefer to exercise over their documents have kept them knee deep in code.
With XML, it's possible to revisit Berners-Lee's vision. XML simplifies documents by separating presentation and programming logic from content, and that allows authors to focus solely on their data. The data can be marked up with a tool that lets authors retain control over their documents without requiring them to deal with the underlying code. To take the tool concept one step further, I created a Web-based editor that generates and saves fully validated XML documents based on text supplied by authors. While the editor doesn't completely eliminate the need for markup, it greatly simplifies the process. That means content authors can spend more time creating and less time coding.
Because I serve most of my XML documents with Rocket (my own freely available back-end framework, which I first introduced in this magazine in "Beyond HTML" in the February 2000 issue), I designed the editor to operate seamlessly with it. The editor creates an XML document that's stored on my Rocket-enabled site and automatically transformed to HTML (with complete formatting, navigation, and so on) before being served to users. Users can even change the way the document is presented on the fly. If you're not using Rocket to serve documents, no problem. The techniques described here can be adapted to most XML-based Web applications. (See "Online" for links to other articles on Rocket and information on XML editors.)
How the Editor Works
The editor has two components: a client-side interface and a server-side script that processes the data and creates the document. I wanted the interface to be browser independent, so I used basic HTML to create a form that accepts text in a defined TEXTAREA field (see Figure 1). The form also provides a list of DTDs from which authors can choose. If one is chosen, the editor uses it to validate the supplied document. Finally, the form requires that authors specify a fully qualified URL indicating the filename and a storage location.
From an authoring perspective, using the form is straightforward. The author simply enters text for the headline, deck, byline, and publication date. Under the current design, the tool requires the author to use some incidental markup for the text of the document. For example, the author must enter paragraph elements to delineate paragraph boundaries, as shown in Figure 1. Likewise, if the author wants text to appear in bold or italics, he or she must add appropriate elements.
The form is mostly straightforward, but note that the FORM element uses a POST method rather than the default GET method to transmit the form's data to the server. This makes it a bit easier for the script to retrieve data from the fields. Also note that the form action points to an ASP script on the server (author.asp). This script constructs the document, validates it if necessary, and generates the final XML document.
Processing the Document
The author.asp script is shown in Listing 1. As with other scripts in the Rocket framework, I use JScript as my scripting language. The overall tasks for this script are to create a new DOM object, populate it with XML, validate the document, and save it. The script begins by creating the DOM object that will represent the document, using the ASP Server object's createObject method. The identifier, or progID, passed to createObject references the new MSXML 2 parser.
Next, the code in Listing 1 begins constructing the XML string. In most cases, it simply retrieves the form data (using the Request object's Form method) and assigns it to a variable. The DTD reference must contain a qualified name to an external DTD file. In this case, the name of the DTD is prefixed with path information and the .dtd file extension is appended to the string. The result is assigned to the documentType variable.
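As a rough illustration of the DTD reference step, here is a minimal sketch in plain JavaScript, written in a JScript-compatible style. It is not the code from Listing 1: the helper name buildDocumentType, the "/dtds/" directory, and the "article" DTD name are assumptions made for the example, and the form value is passed in as an ordinary string rather than read from the Request object.

```javascript
// Hypothetical helper: turn the DTD name chosen in the form into a
// qualified reference to an external DTD file.
// In the ASP script the name would come from the submitted form data;
// here it is a plain argument so the sketch runs on its own.
function buildDocumentType(dtdName, dtdDirectory) {
    if (!dtdName) {
        return ""; // no DTD selected, so there is nothing to reference
    }
    // Prefix the path information and append the .dtd extension.
    return dtdDirectory + dtdName + ".dtd";
}

// Assumed values, for illustration only.
var documentType = buildDocumentType("article", "/dtds/");
```

Keeping the path prefix in one place means the DTD list in the form only needs to carry bare names.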
The script then creates the document prolog and constructs the DOCTYPE declaration. For clarity, I've broken this up into three steps. First, the code generates the XML declaration and includes the standalone pseudo-attribute. By setting standalone to a value of yes, I've indicated that the DTD isn't required to process the resulting document. This is OK, because we want to validate the document before we save it, but not later when we serve it. The second step in the process is to take the documentType string created earlier, map it to a local path using the Server.MapPath method, and assign it to the dtdLocalPath variable. In the third and final step here, the DOCTYPE declaration is constructed from dtdLocalPath and assigned to the docTypeDecl variable.
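The three steps might be sketched as follows. This is an illustration under stated assumptions rather than the code in Listing 1: the root element name "document", the helper names, and the sample local path are hypothetical, and the Server.MapPath call is replaced by a hard-coded string so the sketch runs outside ASP.

```javascript
// Step 1: the XML declaration, including the standalone pseudo-attribute.
function buildXmlDeclaration() {
    return '<?xml version="1.0" standalone="yes"?>';
}

// Step 3: build the DOCTYPE declaration from a local DTD path.
// rootName must match the document's root element; "document" below is
// an assumed name, not necessarily the one the editor uses.
function buildDocTypeDecl(rootName, dtdLocalPath) {
    return '<!DOCTYPE ' + rootName + ' SYSTEM "' + dtdLocalPath + '">';
}

// Step 2 stand-in: in the ASP script, Server.MapPath(documentType) would
// produce the local path. This value is a made-up example.
var dtdLocalPath = "C:\\Inetpub\\wwwroot\\dtds\\article.dtd";

var prolog = buildXmlDeclaration();
var docTypeDecl = buildDocTypeDecl("document", dtdLocalPath);
```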
A drawback to this approach is that the DOCTYPE declaration in the resulting XML document will contain a hard-coded local path to the DTD. Hard coding a local path isn't a good idea, in part because the document can then be validated only on that machine. This is precisely why I've set the standalone attribute to yes in the XML declaration (so that an attempted validation doesn't cause an error). A second problem with this approach is that divulging your local paths to others could potentially introduce security issues. In Rocket's case, this isn't a problem because the entire document is transformed to HTML before it's sent down to the client, and the XML DOCTYPE declaration is never seen. However, you should avoid this if you plan to send the XML directly to the client.
With the prolog portion of the string set up, the script begins creating the document string. Again, this is broken up into steps for clarity. The first step combines the document prolog, the DOCTYPE declaration, and the root element's opening tag. In addition, you'll notice the <logo> and <blackoutLogo> elements are appended to this portion of the string. Rocket uses these two logos to display the Web site's logo in different formats, depending on which display theme is currently in use (for details on themes in Rocket, see "Separating Body and Soul," July 2000). In the second step, the individual elements of the document (including the headline, deck, byline, publication date, and article body) are appended to the string. Finally, the root element's closing tag is appended to the document.
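A minimal sketch of that assembly, again in JScript-compatible JavaScript, could look like the following. Only <logo> and <blackoutLogo> are element names taken from the article; the root name "document", the other element names (headline, deck, byline, pubdate, body), the helper names, and the sample values are assumptions for illustration.

```javascript
// Wrap content in an element. No escaping is applied, because the
// author is expected to supply incidental markup in the body text.
function element(name, content) {
    return "<" + name + ">" + content + "</" + name + ">";
}

// Hypothetical assembly routine following the steps described above.
function buildDocument(prolog, docTypeDecl, rootName, fields) {
    // Step 1: prolog, DOCTYPE, root opening tag, and the two logo elements.
    var xml = prolog + "\n" + docTypeDecl + "\n<" + rootName + ">\n";
    xml += element("logo", fields.logo) + "\n";
    xml += element("blackoutLogo", fields.blackoutLogo) + "\n";
    // Step 2: the individual elements of the document.
    xml += element("headline", fields.headline) + "\n";
    xml += element("deck", fields.deck) + "\n";
    xml += element("byline", fields.byline) + "\n";
    xml += element("pubdate", fields.pubdate) + "\n";
    xml += element("body", fields.body) + "\n";
    // Step 3: the root element's closing tag.
    xml += "</" + rootName + ">";
    return xml;
}

// Example input; in the ASP script these values would come from the form.
var xmlString = buildDocument(
    '<?xml version="1.0" standalone="yes"?>',
    '<!DOCTYPE document SYSTEM "article.dtd">',
    "document",
    {
        logo: "logo.gif",
        blackoutLogo: "logo-blackout.gif",
        headline: "Roll Your Own XML Editor",
        deck: "A Web-based editor for valid XML",
        byline: "Michael Floyd",
        pubdate: "2000",
        body: "<p>First paragraph.</p><p>Second paragraph.</p>"
    }
);
```

Because the string is built by concatenation, nothing guarantees it is well formed until the parser loads it, which is why the validation step that follows matters.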
With the XML string in hand, the next lines turn on the XML parser's validator by setting the validateOnParse flag to true. The string is then loaded into the DOM object created at the top of the script with the loadXML method. (Note that this is different from loading an external XML document, which uses the load method rather than loadXML.) Once the document string is loaded, you can check the parseError object for errors in your document. Because I set validateOnParse to true, the parseError object will report errors related to well-formedness as well as errors related to conformity with your DTD. If any errors occur, the script reports them to the client and ends processing. If, on the other hand, things go well, the validated document is saved using the path and filename specified by the user. Finally, a message is sent back indicating that all went well and the document has been saved.
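The validate-then-save flow can be sketched as a small function. This is an illustration, not the code from Listing 1. It uses the MSXML member names validateOnParse, loadXML, parseError (with its errorCode, reason, and line properties), and save, but the function name validateAndSave, the message wording, and the makeStubDom stand-in are assumptions. The stub only imitates the parser so the sketch can run outside ASP; in author.asp the object would come from the Server object's createObject method.

```javascript
// Validate an XML string with a DOM object and save it if it is clean.
// Returns the message that would be sent back to the client.
function validateAndSave(xmlDoc, xmlString, savePath) {
    xmlDoc.validateOnParse = true;   // turn on DTD validation
    xmlDoc.loadXML(xmlString);       // parse the string, not a file
    var err = xmlDoc.parseError;
    if (err.errorCode !== 0) {
        // Well-formedness or validity error: report it and stop.
        return "Error on line " + err.line + ": " + err.reason;
    }
    xmlDoc.save(savePath);
    return "Document saved to " + savePath;
}

// Hypothetical stand-in for the parser, for illustration and testing only.
// It reports whatever error values it was constructed with.
function makeStubDom(errorCode, reason, line) {
    return {
        validateOnParse: false,
        savedTo: null,
        parseError: { errorCode: 0, reason: "", line: 0 },
        loadXML: function (text) {
            this.parseError = { errorCode: errorCode, reason: reason, line: line };
            return errorCode === 0;
        },
        save: function (path) {
            this.savedTo = path;
        }
    };
}

var okMessage = validateAndSave(makeStubDom(0, "", 0), "<document/>", "newdoc.xml");
var failMessage = validateAndSave(makeStubDom(-1, "Element content is invalid", 3), "<document/>", "newdoc.xml");
```

Returning the message instead of writing it directly keeps the reporting logic separate from the ASP Response object, which makes the flow easier to test.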
Viewing the Results
At this point, we have taken plain text, marked it up with XML elements, validated the document and stored it on the server. The resulting XML file is shown in Listing 2. The problem is that this document is just plain XML sitting somewhere in the Web server environment. We need a way to serve it.
If you're using Rocket, simply enter the URL to view the document. For example, to view a newly created document on my site, I would direct my browser to get the file at www.beyondhtml.com/default.asp?newdoc.xml. At this point, Rocket will grab the document, load it into a DOM object, append navigation information to the document, and attach a style sheet that performs an HTML transformation to the document. The result is a neatly formatted document, like that shown in Figure 2. It contains full navigation bars, site logos, and copyright notices in accordance with the user interface guidelines that the Webmaster has set for the site.
All Gain, No Pain
The benefits to this approach are twofold. First, I can turn a simple plaintext document into a browser-ready document with minimal work. Because I don't have to remember which elements I've created or the context in which they must be used, creating a valid document is almost foolproof. In short, I've simplified the creation process and eliminated the need for specialized knowledge of XML.
On a larger scale, there are tremendous benefits to separating my content from its design and programming logic. For example, my newly created document can be viewed in any browser, and the user can even alter the document's appearance simply by changing the site's themes. (Setting a new theme simply selects a different style sheet that renders the document using that theme.)
Additionally, I've tamed the maintenance problems usually associated with altering a site's navigational structure. Systems like Rocket keep navigation information in a separate XML document. Whenever a document is requested, this navigation document is appended to it and each style sheet simply renders the navigation information along with the rest of the main document. Normally, this type of information is hard coded into an ASP page. Under this traditional method, whenever the Webmaster reorganizes the site, he or she must modify the navigation references in every script that renders a page. Using Rocket, I simply modify the XML description, and the changes are propagated through the site automatically.
Across the board, XML lets me serve my content to a broader set of Internet devices while reducing my workload and minimizing the specialized skills required to create XML applications. Best of all, I'm able to replace more expensive, proprietary systems with a simplified technology that truly separates data from semantics.
(Get the source code for this article here.)
Michael is the author of Building Web Sites with XML from Prentice Hall, and architect of the Rocket XML framework. He is also the publisher of LifestylesSantaCruz.com and carries the honorary title of editor at large for Web Techniques. You can reach him at mfloyd@lifestylesSantaCruz.com.