|
|
||||||
|
|
![]() |
|
![]() |
|
||
|
|
||||||
Searching XML Databases
By Michael Floyd
When teaching XML, I often explain that XSL can be used to perform many of the same functions as the DOM. The basis for this comment is that both the DOM and XSL (or more specifically, the combination of XSLT and XPath), provide mechanisms for walking document trees, creating new elements, attributes, and entities, and letting you modify them in various ways. Thus, a large amount of what you can do in XSL you can do using the DOM, and vice versa.
My students often ask whether they can altogether forgo either XSL or the DOM. While the short answer often is yes, a quiet voice inside my head always says, "use both." There's no reason why your decision to use one should exclude the other. I often combine DOM and XSL to get the most out of their document manipulation and presentation features.
This is particularly true when it comes to databases. When you have XML documents that represent the result sets of database queries, you can combine DOM and XSL techniques to search them, filter the result sets, perform refinements, and present the results.
What Data, Where?
As the old adage goes, what you see is what you get. In the case of database queries, the client application may see XML, HTML, or even raw data. What you get depends both upon the database format and upon how data is processed on the server.
In a low tech solution, the data could be stored on the server in a flat XML file format. If the client is capable of handling XML data, the server simply redirects the client to the file. In this case, the client must provide all of the database features including the ability to search the database, and to filter and display records. In this case, there's little or no server-side processing.
The other end of the spectrum is when the data is stored in a proprietary database and the server processes queries and returns an XML-ized data stream. In this situation, the client application makes a query, most likely a SQL query, and receives in return an XML stream representing the result set. Microsoft SQL Server 2000, for example, lets you query the database directly through a URL. By using a
FOR XMLclause in your SQL query, you can have the resulting record set back as an XML string. Another, more typical method is to invoke a server-side script that receives the query, posts it to a back-end database, converts the result set to XML, and returns it to the client.The middle of the road is similar to the previous scenario: The client application invokes a server-side script that receives the query, posts it to a back-end database, and converts the result set to XML. Rather than returning XML to the client, however, the server script can perform the XML processing on the server and ship out an XML transformation (for example, HTML) instead.
The bottom line is that you may not always have control of where your XML data is processed, especially as database vendors add the ability to output XML directly to the browser. The good news is that using XSL and the DOM will work equally well in all three situations, whether you're working with flat files, XML data that has been streamed to the browser, or server-side XML.
Filtering and Searching
To demonstrate how a browser like Internet Explorer might handle searching XML data and presenting the results, I've assembled the search page shown in Figure 1. For sample data, I've included a subset of records from the XML Tools database (see Listing 1), which is located on my Web site. The XML Tools database contains information on freely available XML tools that can be found on the Web.
When you first load the search page, it comes up with a couple of HTML forms. The Search By Category form lets you select a subset of tools based on their category. For example, to bring up a list of all the parsers in the database, select "parser" from the list. When you click on the Filter button, the table at the bottom of the screen is populated with a list of tools from that tool class.
The second form lets you enter a string and select a field type by which to search. When you click on the search button, the application performs a substring match, returning all entries that contain the search string in the field selected. For example, I can search for tools created by IBM by entering "IBM" in the form's edit box, selecting the Company radio button, and hitting the Search button. Alternatively, I can return a list of all tools with "Tool" in their title by entering that string in the edit box and selecting the Products radio button.
Setting It Up
Listing 2 presents the code for our application. This is a client-side script, so you can test it easily. However, similar code could be included in an Active Server Page to provide the same functionality using server-side processing.
Listing 2 breaks neatly into three parts: an
onloadscript that executes when the page loads; a script that processes filtering and searching; and a<FORM>element, that accepts user input. The purpose of theonloadscript is to create three DOM objectsone to hold the XML Tools database document; a second to contain intermediate result sets; and a third to hold the XSLT style sheet. Next, the database document (tools.xml) and an empty result-set document (prodDB.xml; see Listing 3) are loaded into their respective DOM objects. The empty document simply assists in the creation of the result, as I'll explain later.Similarly, a style sheet is loaded into its DOM object. Initially, the style sheet chosen (generic.xsl) simply displays all records in the database, giving the user a complete list of record entries. Once the style sheet has been selected and loaded, it's applied to the XML document using the DOM's
transformNodemethod. Thus the DOM is often a requirement when applying style sheets to XML documents.The third part of Listing 2 contains two HTML form elements. The first form (called
category) creates the drop-down list that lets the user filter the entries displayed based on their product category. It also creates a button that lets the user submit the filtering request to the application. Thus, when the user clicks on the Filter button, theONCLICKattribute calls thefilterfunction from the main script, passing the category selection as a parameter.Combination
The
filterfunction, in the main script, takes the value, determines which category was selected, and associates a particular style sheet with that category. Then, thefilterfunction calls thechangeXSLfunction to apply the style sheet to the XML database document. ThechangeXSLfunction simply takes the filename of a style sheet and calls on thetransformNodefunction to apply the style sheet to the database document.This is where processing is handed off from the DOM to XSL. You see, the purpose of the style sheet is to filter the results and define their layout and presentation. For example, if the user selects the parser category, the parser.xsl file is selected, and the
changeXSLfunction attaches that style sheet to the document. Parser.xsl (see Listing 4) includes an<xsl:for-each>element that iterates over all of the<product>elements in the database. Within this loop, an<xsl:if>element is used to check the<product>element's category attribute for a value ofparser. If the current attribute being tested is equal toparser, then the style sheet transforms that product element. Otherwise, it's ignored, and the style sheet iterates to the next<product>element in the document.Of course, you could do this all with just the DOM. In this case, you'd have to walk the document tree using DOM methods and include an
ifstatement to perform the filtering. You could then use the DOM object created for theresultSetobject to store a "result" tree. The downside is that you'd also have to generate the display table in code, which would be a bit more work. Moreover, you'd lose a major benefit that XSL transformations offernamely, the ability to separate presentation from logic. For example, the DOM approach doesn't let me generate custom HTML documents based on browser type, as I've described in previous columns.Searching for Godot
The second HTML form in Listing 2 lets the user enter a string for which to search and optionally select a field on which to search. By default, these searches are performed on the product's
prodNameelement. Again, the user initiates the search through a button. This time, when the button is selected, theONCLICKattribute invokes thesearchfunction within the main script. Thesearchfunction takes the search string and the name of the field (product name, description, price, or company) on which to search. After ensuring that the document root is notnull(which would indicate a problem loading the XML database document), thesearchfunction uses the DOM'sselectNodesmethod to retrieve a collection ofprodNameelements that's assigned to theproductsvariable.Because
productsis a collection, it supports alengthproperty, whichsearchuses in aforloop to traverse the list. Within theforloop, the function checks thesrchType(which is passed in by theformelement) to determine the field on which to search. That is,srchTypecontains a list of values from the radio buttons in our form. So,srchType[0]represents theprodSrchradio button,srchType[1]represents thedescrSrchbutton, and so on. By examining each radio button'scheckedproperty, we determine which radio button has been selected and use that information to select the field (or subelement) on which to search. IfsrchType[0].checkedreturns true, we know theprodNamefield should be searched so we assign the current item in the list ofprodNamesto theprodvariable.Once the search field has been assigned, an
ifstatement uses ECMAScript'ssearchmethod to determine whether thesearchStrpattern is contained within the current field. If the pattern occurs in the search string, the DOM'scloneNodemethod copies the entireproductelement and all of its subelements to thecloneTreevariable. This cloned tree fragment is then inserted into theresultSet's document tree.The
forloop continues iterating through the list of product elements and searching the specified subelement, inserting matching entries into theresultSetdocument. Once the search is completed, the generic style sheet is loaded and applied to theresultSetobject using anothertransformNodecall. Finally,searchresets theresultSetobject by reloading prodDB.xml. Recall that prodDB.xml contains an emptyprodDBelement, so this action reinitializes theresultSetfor the next search.Transitions
Moving this application to an Active Server Page should be straightforward. The HTML forms in Listing 2 have to be placed into separate .html files and the
ACTIONattribute of each form must point to an ASP page containing your script. Your main script has to add code that gets the query information from the forms, then calls the search and filter functions. Retrieving the form data is easy using ASP'sFormobject. Finally, the code contained in theonloadscript has to be placed in the main portion of the ASP script. Recall that this code creates and loads your DOM objects. Because the code used to create these differs on the server side, you'll have to replace the call tonew ActiveXObject()with ASP'sServer.CreateObjectfunction. The result can be passed to the variables as written, and the rest of the program should run as before.Ultimately, you can use the DOM and scripting to search, filter, and present your data. However, I think you'll find that you can build more robust applications with less work when you combine your DOM scripts with XSL programming.
(Get the source code for this article here.)
Michael is the author of Building Web Sites with XML from Prentice Hall, and architect of the Rocket XML framework. He is also the publisher of LifestylesSantaCruz.com and carries the honorary title of editor at large at Web Techniques. He can be reached at mfloyd@lifestylesSantaCruz.com.
|
|