Saturday 5 March 2011

XML parsing using SAX

SAX stands for Simple API for XML. Using SAX with JAXP allows developers to traverse XML data sequentially, one element at a time, using a delegation event model. Each time an element of the XML structure is encountered, an event is triggered. Developers write event handlers to define custom processing for the events they care about.

This program, SAXParserExample.java, parses an XML document and prints its contents to the console.
The following XML file (employees.xml) is used:

 

<?xml version="1.0" encoding="UTF-8"?>
<Personnel>
    <Employee type="permanent">
        <Name>Seagull</Name>
        <Id>3674</Id>
        <Age>34</Age>
    </Employee>
    <Employee type="contract">
        <Name>Robin</Name>
        <Id>3675</Id>
        <Age>25</Age>
    </Employee>
    <Employee type="permanent">
        <Name>Crow</Name>
        <Id>3676</Id>
        <Age>28</Age>
    </Employee>
</Personnel>


SAX parsing is event based. As the SAX parser reads an XML document, it calls the corresponding handler method every time it encounters a tag.

When it encounters a start tag, it calls this method:
    public void startElement(String uri,..

When it encounters an end tag, it calls this method:
    public void endElement(String uri,...
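
To make the order of these callbacks concrete, here is a minimal sketch that records every start and end event for a small XML string. The class name SaxEventTrace and its trace method are made up for this illustration; they are not part of SAXParserExample.java.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only to illustrate callback order.
public class SaxEventTrace extends DefaultHandler {

    // One entry per callback, in the order the parser fires them.
    private final List<String> events = new ArrayList<String>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Parses the given XML text and returns the recorded events.
    public static List<String> trace(String xml) throws Exception {
        SaxEventTrace handler = new SaxEventTrace();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Personnel><Employee type=\"permanent\">"
                + "<Name>Seagull</Name></Employee></Personnel>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

For the input in main, the events arrive in document order: start:Personnel, start:Employee, start:Name, end:Name, end:Employee, end:Personnel. Each end event closes the most recently opened element, which is what lets a handler keep track of "the current employee".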

Like the DOM example, this program parses the XML file, creates a list of employees, and prints it to the console. The steps involved are:


  • Create a SAX parser and parse the XML
  • In the event handlers, create the Employee objects
  • Print out the data

The class extends DefaultHandler to listen for callback events, and we register it as the handler with the SAX parser so that it is notified of those events. We are only interested in the start-element, end-element, and characters events.
In the start event we reset the temporary text value; if the element is Employee, we also create a new instance of the Employee object and read its type attribute.
In the end event, if the element is Employee, we know we have reached the end of that employee's data and add the Employee object to the list. If it is any other element, such as Name/Id/Age, we call the corresponding method (setName/setId/setAge) on the Employee object.
In the characters event we store the text in a temporary string variable. The parser may deliver the text of one element in more than one characters call, so the handler should append to that variable rather than overwrite it.
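
The setters mentioned above (setName, setId, setAge, setType) belong to Employee.java, which is not listed in this post. The following is only a minimal sketch of what such a class might look like, assuming plain fields with getters and setters; the field names, the getters, and the toString format are guesses and may differ from the real file.

```java
// Assumed shape of Employee.java; the original file is not shown in this post.
public class Employee {

    private String name;
    private int id;
    private int age;
    private String type;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    public int getAge() { return age; }
    public void setAge(int age) { this.age = age; }

    public String getType() { return type; }
    public void setType(String type) { this.type = type; }

    // printData() relies on toString() to produce one readable line per employee.
    @Override
    public String toString() {
        return "Employee [name=" + name + ", type=" + type
                + ", id=" + id + ", age=" + age + "]";
    }
}
```

Since printData() only calls toString() on each list entry, any format that includes the four values would do.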


a) Create a SAX parser and parse the XML


private void parseDocument() {

    //get a factory
    SAXParserFactory spf = SAXParserFactory.newInstance();
    try {

        //get a new instance of parser
        SAXParser sp = spf.newSAXParser();

        //parse the file and also register this class for call backs
        sp.parse("employees.xml", this);

    } catch (SAXException se) {
        se.printStackTrace();
    } catch (ParserConfigurationException pce) {
        pce.printStackTrace();
    } catch (IOException ie) {
        ie.printStackTrace();
    }
}

b) In the event handlers create the Employee object and call the corresponding setter methods.


//Event Handlers
public void startElement(String uri, String localName, String qName,
        Attributes attributes) throws SAXException {
    //reset
    tempVal = "";
    if (qName.equalsIgnoreCase("Employee")) {
        //create a new instance of employee
        tempEmp = new Employee();
        tempEmp.setType(attributes.getValue("type"));
    }
}


public void characters(char[] ch, int start, int length) throws SAXException {
    //append, because the parser may deliver one element's text in several chunks
    tempVal += new String(ch, start, length);
}

public void endElement(String uri, String localName,
        String qName) throws SAXException {

    if (qName.equalsIgnoreCase("Employee")) {
        //add it to the list
        myEmpls.add(tempEmp);

    } else if (qName.equalsIgnoreCase("Name")) {
        tempEmp.setName(tempVal);
    } else if (qName.equalsIgnoreCase("Id")) {
        tempEmp.setId(Integer.parseInt(tempVal));
    } else if (qName.equalsIgnoreCase("Age")) {
        tempEmp.setAge(Integer.parseInt(tempVal));
    }

}
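
One detail of the characters callback is easy to miss: a SAX parser is allowed to report the text of a single element in several characters calls, for example around an entity such as &amp;. The sketch below shows one way to handle that, collecting text in a StringBuilder and reading it only in endElement. The class NameCollector and its collect method are invented for this example and stand apart from SAXParserExample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that gathers the text of every Name element.
public class NameCollector extends DefaultHandler {

    // Accumulates the text seen since the last start tag.
    private final StringBuilder text = new StringBuilder();
    private final List<String> names = new ArrayList<String>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // A new element begins, so discard any text gathered so far.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append instead of overwrite: this may be called more than once per element.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Name")) {
            names.add(text.toString().trim());
        }
    }

    // Parses the given XML text and returns the collected names.
    public static List<String> collect(String xml) throws Exception {
        NameCollector handler = new NameCollector();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Personnel><Employee><Name>Sea&amp;gull</Name></Employee></Personnel>";
        System.out.println(collect(xml));
    }
}
```

Because the text is appended, the collected name is complete whether the parser delivers it in one call or in several.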



c) Iterating and printing.

private void printData() {

    System.out.println("No of Employees '" + myEmpls.size() + "'.");

    Iterator<Employee> it = myEmpls.iterator();
    while (it.hasNext()) {
        System.out.println(it.next().toString());
    }
}


Listing the full program:


import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;

import org.xml.sax.helpers.DefaultHandler;

public class SAXParserExample extends DefaultHandler {

    List<Employee> myEmpls;

    //holds the text of the element currently being read
    private String tempVal;

    //to maintain context
    private Employee tempEmp;


    public SAXParserExample() {
        myEmpls = new ArrayList<Employee>();
    }

    public void runExample() {
        parseDocument();
        printData();
    }

    private void parseDocument() {

        //get a factory
        SAXParserFactory spf = SAXParserFactory.newInstance();
        try {

            //get a new instance of parser
            SAXParser sp = spf.newSAXParser();

            //parse the file and also register this class for call backs
            sp.parse("employees.xml", this);

        } catch (SAXException se) {
            se.printStackTrace();
        } catch (ParserConfigurationException pce) {
            pce.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }

    /**
     * Iterate through the list and print
     * the contents
     */
    private void printData() {

        System.out.println("No of Employees '" + myEmpls.size() + "'.");

        Iterator<Employee> it = myEmpls.iterator();
        while (it.hasNext()) {
            System.out.println(it.next().toString());
        }
    }


    //Event Handlers
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        //reset
        tempVal = "";
        if (qName.equalsIgnoreCase("Employee")) {
            //create a new instance of employee
            tempEmp = new Employee();
            tempEmp.setType(attributes.getValue("type"));
        }
    }


    public void characters(char[] ch, int start, int length) throws SAXException {
        //append, because the parser may deliver one element's text in several chunks
        tempVal += new String(ch, start, length);
    }

    public void endElement(String uri, String localName, String qName) throws SAXException {

        if (qName.equalsIgnoreCase("Employee")) {
            //add it to the list
            myEmpls.add(tempEmp);

        } else if (qName.equalsIgnoreCase("Name")) {
            tempEmp.setName(tempVal);
        } else if (qName.equalsIgnoreCase("Id")) {
            tempEmp.setId(Integer.parseInt(tempVal));
        } else if (qName.equalsIgnoreCase("Age")) {
            tempEmp.setAge(Integer.parseInt(tempVal));
        }

    }

    public static void main(String[] args) {
        SAXParserExample spe = new SAXParserExample();
        spe.runExample();
    }

}






Running SAXParserExample (JDK 1.5+)


  1. Download SAXParserExample.java, Employee.java, employees.xml to c:\xercesTest
  2. Go to command prompt and type
    cd c:\xercesTest
  3. To compile, type
    javac -classpath . SAXParserExample.java
  4. To run, type
    java -classpath . SAXParserExample
