Grid Computing Research Laboratory

State University of New York (SUNY) Binghamton
Department of Computer Science


Wei Lu, Kenneth Chiu, Yinfei Pan
"A Parallel Approach to XML Parsing",
Grid 2006: The 7th IEEE/ACM International Conference on Grid Computing,
Barcelona, Spain, September 28-29, 2006

Abstract
Parallel XML parsing can leverage the growing prevalence of multicore architectures in all sectors of the computer market and yield significant performance improvements. This paper presents our design and implementation of parallel XML parsing. Our design consists of an initial preparsing phase that determines the structure of the XML document, followed by a full, parallel parse. The results of the preparsing phase are used to partition the XML document for data-parallel processing. Our parallel parsing phase is a modification of the libxml2 [1] XML parser, which shows that our approach applies to real-world, production-quality parsers. Our empirical study shows that our parallel XML parsing algorithm can improve XML parsing performance significantly and scales well.
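
The two-phase idea in the abstract (a light preparse to find structure, then a data-parallel full parse) can be illustrated with a minimal sketch. This is not the paper's implementation: it uses Python's standard xml.etree.ElementTree rather than libxml2, and the function names preparse and parse_parallel are hypothetical. The preparse here is a simple regex scan that assumes the document has no comments, CDATA sections, namespace prefixes, or '>' characters inside attribute values. Python threads also will not show a real multicore speedup for this workload; the sketch only shows how the preparse result drives the partitioning.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Matches start, end, and self-closing tags.
# Assumption: no comments, CDATA, or '>' inside attribute values.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^>]*?(/?)>")


def preparse(doc):
    """Phase 1: return (start, end) offsets of each child of the root element."""
    spans = []
    depth = 0
    start = None
    for m in TAG.finditer(doc):
        closing, _name, self_closing = m.groups()
        if closing:
            depth -= 1
            if depth == 1:          # a direct child of the root just closed
                spans.append((start, m.end()))
        elif self_closing:
            if depth == 1:          # an empty direct child of the root
                spans.append((m.start(), m.end()))
        else:
            if depth == 1:          # a direct child of the root just opened
                start = m.start()
            depth += 1
    return spans


def parse_parallel(doc, workers=4):
    """Phase 2: fully parse each partition found by preparse() concurrently."""
    chunks = [doc[s:e] for s, e in preparse(doc)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results follow document order.
        return list(pool.map(ET.fromstring, chunks))


if __name__ == "__main__":
    sample = '<root><item id="1"><v>a</v></item><item id="2"/></root>'
    for element in parse_parallel(sample):
        print(element.tag, element.attrib)
```

Partitioning at the children of the root is the simplest possible choice and balances work only when those children are of similar size; a production design would need a finer-grained partitioning strategy.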

Key Words