"In this paper we propose the TurboXPath path processor, which accepts a language equivalent to a subset of the
for-let-where constructs of XQuery over a single document.
TurboXPath can be extended to provide full XQuery support
or used to augment federated database engines for efficient
handling of queries over XML data streams produced by external sources. Internally, TurboXPath uses a tree-shaped path
expression with multiple outputs to drive the execution. The
result of a query execution is a sequence of tuples of XML
fragments matching the output nodes. Based on a streamed
execution model, TurboXPath scales up to large documents
and has limited memory consumption for increased concurrency."
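
The idea of a tree-shaped path expression with several output nodes, evaluated in one streamed pass and yielding a sequence of tuples, can be illustrated with a minimal sketch. This is not the TurboXPath algorithm; it is a simplified stand-in built on Python's standard `xml.etree.ElementTree.iterparse`. The function name `stream_tuples`, its parameters, and the `catalog`/`book`/`title`/`price` document are all hypothetical, chosen only for illustration.

```python
import io
import xml.etree.ElementTree as ET


def stream_tuples(source, context_path, output_children):
    """Yield one tuple of text values per element matching context_path.

    context_path:    tag names from the root, e.g. ["catalog", "book"]
                     (the trunk of the hypothetical tree-shaped pattern).
    output_children: child tag names whose text forms each tuple
                     (the output branches of the pattern).
    """
    stack = []  # tags of the currently open elements, root first
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
        else:
            # At the "end" event the element and all its children are complete.
            if stack == context_path:
                yield tuple(elem.findtext(tag) for tag in output_children)
                # Drop the matched element's children and text so that
                # memory does not grow with the number of matches.
                elem.clear()
            stack.pop()


# Hypothetical sample document, roughly corresponding to:
#   for $b in /catalog/book return ($b/title, $b/price)
SAMPLE = (
    b"<catalog>"
    b"<book><title>A</title><price>10</price></book>"
    b"<book><title>B</title><price>20</price></book>"
    b"</catalog>"
)

rows = list(stream_tuples(io.BytesIO(SAMPLE), ["catalog", "book"], ["title", "price"]))
```

Each yielded tuple has one slot per output branch, and only the element currently being matched is held in full, which is the property the abstract attributes to a streamed execution model.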
"Our project (titled xstream)
concentrated on evaluation of XPath over XML streams.
This research area contains multiple challenges resulting
from both the richness of the language and the
requirement of having only a single pass over the data.
We modified and extended one of the known algorithms,
TurboXPath [4], a tree-based IBM algorithm. We also
provide extensive comparative analysis between
TurboXPath and XSQ [5], currently the most advanced of
the finite automata (FA)-based algorithms."
"We present a novel streaming algorithm for evaluating XPath expressions that use backward axes
(parent and ancestor) and forward axes in a single document-order traversal of an XML document.
Other streaming XPath processors, such as YFilter, XTrie, and TurboXPath, handle only forward axes.
We show through experiments that our algorithm significantly outperforms (by more than a factor of
two) a traditional non-streaming XPath engine. Furthermore, since our algorithm only retains relevant
portions of the input document in memory, it scales better than traditional XPath engines. It can process
large documents; we have successfully tested documents over 1 GB in size. On the other hand, the
traditional XPath engine degrades considerably in performance for documents over 100 MB in size and
fails to complete for documents over 200 MB."
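
To make the difficulty of backward axes in a single pass concrete, here is a minimal sketch, assuming a query of the form `//item/ancestor::section`. It is not the algorithm from the abstract above; it only shows one simple way an ancestor step can be answered during a document-order traversal, by keeping a stack of open elements and flagging them when a matching descendant appears. The class `AncestorMatcher`, the function `ancestors_of`, and the `section`/`item`/`id` names are hypothetical. Results are reported in the order the elements close, which is not necessarily document order.

```python
import xml.sax


class AncestorMatcher(xml.sax.ContentHandler):
    """Collect ids of ancestor_tag elements that contain a target_tag descendant."""

    def __init__(self, target_tag, ancestor_tag):
        super().__init__()
        self.target_tag = target_tag
        self.ancestor_tag = ancestor_tag
        self.open_elements = []  # one [tag, id, matched] frame per open element
        self.results = []

    def startElement(self, name, attrs):
        if name == self.target_tag:
            # Every currently open ancestor_tag element is an ancestor of
            # this target, so flag it as a match.
            for frame in self.open_elements:
                if frame[0] == self.ancestor_tag:
                    frame[2] = True
        self.open_elements.append([name, attrs.get("id"), False])

    def endElement(self, name):
        _tag, elem_id, matched = self.open_elements.pop()
        if matched:  # only ancestor_tag frames are ever flagged
            self.results.append(elem_id)


def ancestors_of(xml_bytes, target_tag, ancestor_tag):
    handler = AncestorMatcher(target_tag, ancestor_tag)
    xml.sax.parseString(xml_bytes, handler)
    return handler.results


# Hypothetical sample document.
SAMPLE = (
    b'<doc>'
    b'<section id="s1"><para><item/></para></section>'
    b'<section id="s2"><para/></section>'
    b'<section id="s3"><section id="s4"><item/></section></section>'
    b'</doc>'
)

matched_ids = ancestors_of(SAMPLE, "item", "section")
```

The only state kept is the stack of open elements, so memory in this sketch grows with document depth rather than document size.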