org.xbis
Class XBISReader

java.lang.Object
  extended by org.xbis.XBISReader
All Implemented Interfaces:
XBISConstants
Direct Known Subclasses:
XBISToSAXAdapter

public abstract class XBISReader
extends java.lang.Object
implements XBISConstants

XML Binary Information Set input handler. This reads a compact representation of the data in an XML document. Compared with the standard text document representation, the compact form reduces document size and lowers processing overhead.

This class provides basic input handling, but the driver methods for reading the serial form and constructing the corresponding document representation must be implemented by a subclass specific to the representation used. The serial form is itself independent of the original representation, and may be read by the input handlers for representations other than the one from which it was generated.

Version:
0.8
Author:
Dennis M. Sosnoski

Nested Class Summary
static class XBISReader.CharBlock
          Data block describing a block of characters.
 
Field Summary
static int DEFAULT_BUFFER_SIZE
          Default size of input buffer.
static int INITIAL_HANDLE_SIZE
          Initial size of handle value arrays.
static int INITIAL_STRING_SIZE
          Initial size of String buffer (must be larger than short limit).
protected  int m_attrValueCount
          Number of attribute handle value Strings currently defined.
protected  java.lang.String[] m_attrValueTable
          Table of Strings corresponding to attribute handle values (created when needed).
static int SHORT_STRING_LIMIT
          Limit for reading String data a byte at a time.
 
Fields inherited from interface org.xbis.XBISConstants
ATTRIBUTE_HANDLE_MASK, ATTRIBUTE_NEWREF_FLAG, ATTRIBUTE_NOTEND_FLAG, ATTRIBUTE_VALUEREF_FLAG, DOM_SOURCE_ID, DOM4J_SOURCE_ID, ELEMENT_HANDLE_MASK, ELEMENT_HASATTRIBUTES_FLAG, ELEMENT_HASCHILDREN_FLAG, HEADER_VERSION_ID, JDOM_SOURCE_ID, NAME_LENGTH_MASK, NAME_NS_MASK, NAME_NS_SHIFT, NAMESPACEDECL_HANDLE_MASK, NO_NAMESPACE, NODE_ELEMENT_FLAG, NODE_NAMESPACEDECL_FLAG, NODE_PLAINTEXT_FLAG, NODE_TEXTREF_FLAG, NODE_TYPE_ATTRIBUTEDECL, NODE_TYPE_CDATA, NODE_TYPE_COMMENT, NODE_TYPE_DOCTYPE, NODE_TYPE_DOCUMENT, NODE_TYPE_ELEMENTDECL, NODE_TYPE_END, NODE_TYPE_EXTERNALENTITYDECL, NODE_TYPE_MASK, NODE_TYPE_NOTATION, NODE_TYPE_PI, NODE_TYPE_SKIPPEDENTITY, NODE_TYPE_UNPARSEDENTITY, NS_HANDLE_NONE, NS_HANDLE_XML, NSDEF_PRELENGTH_MASK, NSDEF_URIHANDLE_MASK, NSDEF_URIHANDLE_SHIFT, PLAINTEXT_LENGTH_MASK, SAX2_SOURCE_ID, TEXTREF_HANDLE_MASK, XML_NAMESPACE
 
Constructor Summary
XBISReader()
          Constructor.
 
Method Summary
protected  void activateNamespace(java.lang.Object ns)
          Activate a namespace.
protected  int addNamespace(java.lang.Object ns)
          Add namespace definition to table.
protected  int addNamespaceURI(java.lang.String uri)
          Add namespace URI definition to table.
protected abstract  java.lang.Object buildName(java.lang.Object ns, java.lang.String local)
          Build name instance.
protected abstract  java.lang.Object buildNamespace(java.lang.String prefix, java.lang.String uri)
          Build namespace instance.
protected  void clearArray(int count, java.lang.Object[] array)
          Clears an array of Objects.
protected  void clearReset()
          Clear reset state.
protected  void declareNamespace(java.lang.Object obj)
          Declare namespace.
protected  char[][] doubleArray(char[][] base)
          Doubles the size of an array of char[] values.
protected  java.lang.Object[] doubleArray(java.lang.Object[] base)
          Doubles the size of an array of Object values.
protected  java.lang.String[] doubleArray(java.lang.String[] base)
          Doubles the size of an array of String values.
 int getBytesRemaining()
          Get bytes remaining in buffer **TESTING ONLY**.
protected abstract  void initState()
          Initialize state information used during the serialization process.
protected  boolean isEnd()
          Checks if end of input reached.
protected  boolean isReset()
          Check reset state.
protected  int readByte()
          Read next byte from buffer.
protected  char[] readChars(int length)
          Read string of characters from stream.
protected  char[] readCharsDef(int lead)
          Read reusable text node from stream.
protected  java.lang.Object readNamespaceDecl(int initial)
          Read namespace declaration from stream.
protected  java.lang.Object readNamespaceDef()
          Read namespace definition from stream.
protected  java.lang.Object readNamespaceRef(int initial, int mask)
          Read active namespace reference from stream.
protected  java.lang.String readPlainText(int lead)
          Read plain text node from stream.
protected  java.lang.Object readQuickAttribute(int initial)
          Read attribute name definition from stream with partial lead byte.
protected  java.lang.Object readQuickElement(int initial)
          Read element name definition from stream with partial lead byte.
protected  int readQuickValue(int initial, int mask)
          Read non-negative integer value with partial lead byte from stream.
protected  java.lang.String readString()
          Read String directly from stream.
protected  XBISReader.CharBlock readStringChars()
          Read characters in a string directly from stream.
protected  java.lang.String readStringData(int length)
          Read String data from stream.
protected  java.lang.String readStringDef(int lead)
          Read reusable text node from stream.
protected  XBISReader.CharBlock readTextChars(int lead)
          Read characters in a plain text node directly from stream.
protected  int readValue()
          Read non-negative integer value from stream.
 void reset()
          Reset state information used during the serialization process.
 int setStream(java.io.InputStream is)
          Set input stream.
protected  void undeclareNamespace()
          Undeclare namespace.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT_BUFFER_SIZE

public static final int DEFAULT_BUFFER_SIZE
Default size of input buffer.

See Also:
Constant Field Values

INITIAL_HANDLE_SIZE

public static final int INITIAL_HANDLE_SIZE
Initial size of handle value arrays.

See Also:
Constant Field Values

SHORT_STRING_LIMIT

public static final int SHORT_STRING_LIMIT
Limit for reading String data a byte at a time.

See Also:
Constant Field Values

INITIAL_STRING_SIZE

public static final int INITIAL_STRING_SIZE
Initial size of String buffer (must be larger than short limit).

See Also:
Constant Field Values

m_attrValueTable

protected java.lang.String[] m_attrValueTable
Table of Strings corresponding to attribute handle values (created when needed).


m_attrValueCount

protected int m_attrValueCount
Number of attribute handle value Strings currently defined.

Constructor Detail

XBISReader

public XBISReader()
Constructor. Allocates and initializes instances of the data tables used for storing state information during the serialization process.

Method Detail

doubleArray

protected final java.lang.Object[] doubleArray(java.lang.Object[] base)
Doubles the size of an array of Object values. Constructs and returns the resized array, copying values from the old array across to the new one.

Parameters:
base - array to be resized
Returns:
replacement array with values copied

doubleArray

protected final char[][] doubleArray(char[][] base)
Doubles the size of an array of char[] values. Constructs and returns the resized array, copying values from the old array across to the new one.

Parameters:
base - array to be resized
Returns:
replacement array with values copied

doubleArray

protected final java.lang.String[] doubleArray(java.lang.String[] base)
Doubles the size of an array of String values. Constructs and returns the resized array, copying values from the old array across to the new one.

Parameters:
base - array to be resized
Returns:
replacement array with values copied
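
The three doubleArray overloads share one pattern: allocate an array of twice the length and copy the old values across. A minimal sketch of that pattern for the Object[] case follows. The class name DoubleArraySketch is hypothetical, and the real method may handle details (such as a zero-length input) differently.

```java
/** Illustrative stand-in, not the library class. */
final class DoubleArraySketch {

    /** Returns a new array twice the length of base, with the old values copied. */
    static Object[] doubleArray(Object[] base) {
        // Assumption: base is non-null and non-empty, so doubling makes progress.
        Object[] grown = new Object[base.length * 2];
        System.arraycopy(base, 0, grown, 0, base.length);
        return grown;
    }
}
```

The caller is expected to replace its reference with the returned array, since the original array is left unchanged.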

clearArray

protected final void clearArray(int count,
                                java.lang.Object[] array)
Clears an array of Objects.

Parameters:
count - number of references present in array
array - array to be cleared (may be null)
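
As a rough illustration of the parameters above, the sketch below nulls out the first count references and tolerates a null array. The class name ClearArraySketch is hypothetical, and clearing exactly the first count slots is an assumption drawn from the parameter descriptions, not from the implementation.

```java
/** Illustrative stand-in, not the library class. */
final class ClearArraySketch {

    /** Clears the first count references so the objects can be collected. */
    static void clearArray(int count, Object[] array) {
        if (array == null) {
            return; // nothing allocated yet, nothing to clear
        }
        for (int i = 0; i < count; i++) {
            array[i] = null;
        }
    }
}
```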

setStream

public int setStream(java.io.InputStream is)
              throws java.io.IOException
Set input stream. This first resets the state to clear any data that came from another stream, then sets the new input stream and reads and verifies the XML Serial Format header information.

Parameters:
is - serialization input stream
Returns:
output adapter identifier
Throws:
java.io.IOException - on error reading from stream

getBytesRemaining

public final int getBytesRemaining()
Get bytes remaining in buffer **TESTING ONLY**. This is a convenience method supplied for use in testing which will not be part of the released implementation.

Returns:
current position back from end of buffer

isEnd

protected final boolean isEnd()
                       throws java.io.IOException
Checks if end of input reached.

Returns:
true if end of input, false if data remaining
Throws:
java.io.IOException - on error reading from stream

readByte

protected final int readByte()
                      throws java.io.IOException
Read next byte from buffer. Reads another block of data from the input stream if the buffer is empty.

Returns:
byte value from buffer
Throws:
java.io.IOException - on error reading from stream
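
The refill-on-empty behavior can be pictured with the sketch below. BufferedByteSource and its fields are hypothetical names, and throwing EOFException at end of stream is an assumption; this page does not say how the real method reports end of input.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative stand-in, not the library class. */
final class BufferedByteSource {
    private final InputStream in;
    private final byte[] buffer;
    private int offset; // index of the next unread byte
    private int limit;  // number of valid bytes in the buffer

    BufferedByteSource(InputStream in, int bufferSize) {
        this.in = in;
        this.buffer = new byte[bufferSize];
    }

    /** Returns the next byte as a value in 0..255, refilling the buffer when empty. */
    int readByte() throws IOException {
        if (offset >= limit) {
            int n = in.read(buffer, 0, buffer.length);
            if (n <= 0) {
                // Assumption: end of stream is reported as an exception.
                throw new EOFException("unexpected end of stream");
            }
            offset = 0;
            limit = n;
        }
        return buffer[offset++] & 0xFF; // mask to avoid sign extension
    }
}
```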

readValue

protected final int readValue()
                       throws java.io.IOException
Read non-negative integer value from stream. This handles decoding of the variable-length representation used in the serial form, which gives seven bits of value per serialized byte by using the high-order bit of each byte as a continuation flag.

Returns:
integer value read from stream
Throws:
java.io.IOException - on error reading from stream
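
A minimal sketch of this kind of decoding is shown below. It makes two assumptions that this page does not confirm: a set high-order bit means more bytes follow, and the most significant seven-bit group comes first. The class name VarIntSketch is hypothetical, and the sketch reads from a plain InputStream instead of the reader's internal buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative stand-in, not the library class. */
final class VarIntSketch {

    /** Decodes one non-negative integer, seven value bits per byte. */
    static int readValue(InputStream in) throws IOException {
        int value = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a value");
            }
            // Assumption: groups arrive most significant first.
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0); // assumption: set bit means "continue"
        return value;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {(byte) 0x82, 0x2C});
        System.out.println(readValue(in)); // prints 300 under the assumptions above
    }
}
```

Under these assumptions, values below 128 occupy a single byte, which keeps small handles and lengths compact.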

readQuickValue

protected final int readQuickValue(int initial,
                                   int mask)
                            throws java.io.IOException
Read non-negative integer value with partial lead byte from stream. This handles decoding of the variable-length representation used in the serial form, with a partial leading byte. This format is used when a handle value or length is combined with flags. If the value from the initial byte is zero, the value is encoded in standard form in the following byte(s).

Parameters:
initial - byte containing value or continuation
mask - value mask within first byte (right justified)
Returns:
integer value read from stream
Throws:
java.io.IOException - on error reading from stream
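
The partial lead byte idea can be sketched as follows. The sketch assumes that a non-zero masked value is returned as is, and that the standard form uses the same hypothetical byte order as in the helper included here; whether the real format applies an offset to either form is not stated on this page. QuickValueSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative stand-in, not the library class. */
final class QuickValueSketch {

    /** Standard form: seven value bits per byte (assumed most significant first). */
    static int readValue(InputStream in) throws IOException {
        int value = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a value");
            }
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    /** Takes the value from the masked lead byte, or from following bytes if zero. */
    static int readQuickValue(int initial, int mask, InputStream in) throws IOException {
        int value = initial & mask; // flag bits outside the mask are ignored
        if (value == 0) {
            value = readValue(in);  // value did not fit beside the flags
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {(byte) 0x82, 0x2C});
        System.out.println(readQuickValue(0x85, 0x0F, in)); // value taken from lead byte
        System.out.println(readQuickValue(0x80, 0x0F, in)); // value taken from stream
    }
}
```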

readChars

protected final char[] readChars(int length)
                          throws java.io.IOException
Read string of characters from stream. This method uses a dual approach to building the array of characters, depending on the number of characters. If the count is low, byte-at-a-time reads are used to collect the characters (with between one and three bytes per character). If the count is high, as many characters as possible are taken from the buffer before checking for end conditions. This approach creates some duplicate code, but gives the best performance.

Parameters:
length - number of characters to read
Returns:
array containing characters read (starting at position 0, to the requested length), only valid until the next call to a method in this class
Throws:
java.io.IOException - on error reading from stream

readStringData

protected final java.lang.String readStringData(int length)
                                         throws java.io.IOException
Read String data from stream. Just uses readChars(int) to read the characters, then turns them into a string to be returned.

Parameters:
length - number of characters to read
Returns:
string read from input
Throws:
java.io.IOException - on error reading from stream

readStringChars

protected final XBISReader.CharBlock readStringChars()
                                              throws java.io.IOException
Read characters in a string directly from stream. This handles decoding of the string value representation used in the serial form, consisting of a leading character count plus one (with the value zero used to indicate a null string), in the standard integer value encoding, followed by the specified number of characters. Character data uses a separate encoding scheme. It is an error if this method is called for a null string.

Returns:
information for string of characters read from stream (only valid until the next call to a method in this class)
Throws:
java.io.IOException - on error reading from stream
See Also:
readChars(int)

readString

protected final java.lang.String readString()
                                     throws java.io.IOException
Read String directly from stream. This handles decoding of the String value representation used in the serial form, consisting of a leading character count plus one (with the value zero used to indicate a null String), in the standard integer value encoding, followed by the specified number of characters. Character data uses a separate encoding scheme.

Returns:
String read from stream (may be null)
Throws:
java.io.IOException - on error reading from stream
See Also:
readStringData(int)
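
The "count plus one" prefix can be illustrated with the sketch below. Only the prefix handling reflects the description above. The character decoding is a placeholder that assumes one byte per character, and the prefix is assumed to fit in a single byte; the real format uses its own integer and character encodings. StringPrefixSketch is a hypothetical name.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative stand-in, not the library class. */
final class StringPrefixSketch {

    /** Reads a String whose prefix is the character count plus one (0 = null). */
    static String readString(InputStream in) throws IOException {
        int lengthPlusOne = nextByte(in); // assumption: prefix fits in one byte
        if (lengthPlusOne == 0) {
            return null;                  // zero marks a null String
        }
        int length = lengthPlusOne - 1;   // so a prefix of 1 is the empty String
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            // Placeholder decoding: one byte per character.
            chars[i] = (char) nextByte(in);
        }
        return new String(chars);
    }

    private static int nextByte(InputStream in) throws IOException {
        int b = in.read();
        if (b < 0) {
            throw new EOFException("stream ended inside a string");
        }
        return b;
    }
}
```

Adding one to the count lets a single prefix distinguish a null String from an empty one.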

readTextChars

protected final XBISReader.CharBlock readTextChars(int lead)
                                            throws java.io.IOException
Read characters in a plain text node directly from stream. This first gets the length of the text either from the lead byte or from the following byte(s), as encoded. It then reads the required number of characters and returns them as a character block.

Parameters:
lead - node definition byte
Returns:
information for string of characters read from stream (only valid until the next call to a method in this class)
Throws:
java.io.IOException - on error reading from stream
See Also:
readChars(int)

readPlainText

protected final java.lang.String readPlainText(int lead)
                                        throws java.io.IOException
Read plain text node from stream. This first gets the length of the text either from the lead byte or from the following byte(s), as encoded. It then reads the required number of characters and returns them as a String.

Parameters:
lead - node definition byte
Returns:
text read from stream
Throws:
java.io.IOException - on error reading from stream
See Also:
readStringData(int)

readCharsDef

protected final char[] readCharsDef(int lead)
                             throws java.io.IOException
Read reusable text node from stream. If a handle is given in the input this returns the array of characters with that handle. Otherwise, it reads a string definition as an array of characters, assigning it the next consecutive handle.

Parameters:
lead - node definition byte
Returns:
text read from stream
Throws:
java.io.IOException - on error reading from stream

readStringDef

protected final java.lang.String readStringDef(int lead)
                                        throws java.io.IOException
Read reusable text node from stream. If a handle is given in the input this returns the String with that handle. Otherwise, it reads a String definition, assigning it the next consecutive handle.

Parameters:
lead - node definition byte
Returns:
text read from stream
Throws:
java.io.IOException - on error reading from stream
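
The handle scheme for reusable text amounts to a growable table in which each new definition receives the next consecutive index. The sketch below shows only that table. StringHandleTable is a hypothetical name, handles are assumed to start at zero, and how the stream distinguishes a handle from a new definition is not covered.

```java
import java.util.Arrays;

/** Illustrative stand-in, not the library class. */
final class StringHandleTable {
    private String[] table = new String[4]; // assumed initial size
    private int count;                      // number of handles defined so far

    /** Stores a new definition and returns the next consecutive handle. */
    int define(String text) {
        if (count == table.length) {
            table = Arrays.copyOf(table, table.length * 2); // double when full
        }
        table[count] = text;
        return count++;
    }

    /** Returns the String previously defined for the handle. */
    String lookup(int handle) {
        if (handle < 0 || handle >= count) {
            throw new IllegalArgumentException("undefined handle " + handle);
        }
        return table[handle];
    }
}
```

Because writer and reader assign handles in the same order, the handle values themselves never need to be transmitted with a definition.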

addNamespace

protected final int addNamespace(java.lang.Object ns)
Add namespace definition to table.

Parameters:
ns - namespace information
Returns:
index number of added namespace

addNamespaceURI

protected final int addNamespaceURI(java.lang.String uri)
Add namespace URI definition to table.

Parameters:
uri - namespace URI
Returns:
index number of added namespace URI

activateNamespace

protected void activateNamespace(java.lang.Object ns)
                          throws XBISException
Activate a namespace. This assigns an active namespace handle to the namespace, and also adds it to the table of active namespaces.

Parameters:
ns - namespace to be activated
Throws:
XBISException - on error in processing XML data

readNamespaceDef

protected final java.lang.Object readNamespaceDef()
                                           throws java.io.IOException,
                                                  XBISException
Read namespace definition from stream. This is used to define a new namespace (prefix+URI pair). The format is a single byte with a handle value for the namespace URI and a quick length field for the prefix. This also assigns an active namespace handle to the new namespace and adds it to the active namespace stack for automatic cleanup at the end of the element.

Returns:
namespace object reconstructed from stream
Throws:
java.io.IOException - on error reading from stream
XBISException - on error in processing XML data

readNamespaceDecl

protected final java.lang.Object readNamespaceDecl(int initial)
                                            throws java.io.IOException,
                                                   XBISException
Read namespace declaration from stream. If a previously defined namespace handle is supplied this reactivates that namespace for the following element. Otherwise, it reads a new namespace definition and assigns it a new namespace handle. Either way, the namespace is assigned an active namespace handle and is added to the active namespace stack for automatic cleanup at the end of the element.

Parameters:
initial - byte containing handle or continuation
Returns:
namespace object reconstructed from stream
Throws:
java.io.IOException - on error reading from stream
XBISException - on error in processing XML data

readNamespaceRef

protected final java.lang.Object readNamespaceRef(int initial,
                                                  int mask)
                                           throws java.io.IOException,
                                                  XBISException
Read active namespace reference from stream. This method interprets the namespace handle value from the lead byte of a name definition, if necessary reading additional bytes for the handle or defining a new namespace with the next consecutive handle value.

Parameters:
initial - byte containing handle or continuation
mask - value mask within first byte (right justified)
Returns:
namespace object reconstructed from stream
Throws:
java.io.IOException - on error reading from stream
XBISException - on error in processing XML data

readQuickAttribute

protected final java.lang.Object readQuickAttribute(int initial)
                                             throws java.io.IOException,
                                                    XBISException
Read attribute name definition from stream with partial lead byte. If a valid handle is given in the input this returns the existing name with that handle. Otherwise, it reads a namespace definition and local name from the stream to construct the new name, assigning the name the next consecutive handle value.

Parameters:
initial - byte containing value or continuation
Returns:
name object reconstructed from stream
Throws:
java.io.IOException - on error reading from stream
XBISException - on error in processing XML data

readQuickElement

protected final java.lang.Object readQuickElement(int initial)
                                           throws java.io.IOException,
                                                  XBISException
Read element name definition from stream with partial lead byte. If a valid handle is given in the input this returns the existing name with that handle. Otherwise, it reads a namespace definition and local name from the stream to construct the new name, assigning the name the next consecutive handle value.

Parameters:
initial - byte containing value or continuation
Returns:
name object reconstructed from stream
Throws:
java.io.IOException - on error reading from stream
XBISException - on error in processing XML data

clearReset

protected void clearReset()
Clear reset state. This method must be called before beginning any operations which read from the input stream.


isReset

protected boolean isReset()
Check reset state.

Returns:
true if reset, false if not

reset

public void reset()
Reset state information used during the serialization process. This reinitializes the serialization state so that an instance of this class can be reused to deserialize multiple independent documents. Subclasses overriding this method to perform their own reinitialization must call the base class method before returning to the caller.
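
The override contract can be sketched with stand-in classes. ResettableBase and DepthTrackingReader below are hypothetical and only mimic the reset(), clearReset() and isReset() methods listed on this page; they are not the library classes.

```java
/** Hypothetical stand-in for the base class reset handling. */
class ResettableBase {
    private boolean reset = true;

    public void reset() {
        reset = true;
    }

    protected void clearReset() {
        reset = false;
    }

    protected boolean isReset() {
        return reset;
    }
}

/** Hypothetical subclass with state of its own to reinitialize. */
class DepthTrackingReader extends ResettableBase {
    private int depth;

    void startElement() {
        clearReset(); // state is no longer pristine once reading begins
        depth++;
    }

    int depth() {
        return depth;
    }

    @Override
    public void reset() {
        depth = 0;     // subclass reinitialization
        super.reset(); // base class method is called before returning
    }
}
```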


initState

protected abstract void initState()
Initialize state information used during the serialization process. This abstract method must be implemented by each subclass to set up predefined state information in the tables.


buildNamespace

protected abstract java.lang.Object buildNamespace(java.lang.String prefix,
                                                   java.lang.String uri)
Build namespace instance. This abstract method must be implemented by each subclass to build a namespace object of the appropriate type.

Parameters:
prefix - namespace prefix
uri - namespace URI
Returns:
constructed namespace object

declareNamespace

protected void declareNamespace(java.lang.Object obj)
                         throws XBISException
Declare namespace. This method may be overridden by subclasses to handle the beginning of a namespace scope.

Parameters:
obj - namespace object
Throws:
XBISException - on error in processing XML data

undeclareNamespace

protected void undeclareNamespace()
Undeclare namespace. This method may be overridden by subclasses to handle the ending of a namespace scope, but they must call this base class method any time a namespace is closed.


buildName

protected abstract java.lang.Object buildName(java.lang.Object ns,
                                              java.lang.String local)
Build name instance. This abstract method must be implemented by each subclass to build a name object of the appropriate type.

Parameters:
ns - namespace for name
local - local name
Returns:
constructed name object
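
To show how a subclass might supply representation-specific objects, the sketch below implements the two build methods against a stand-in base class. NameBuilderBase and StringNameBuilder are hypothetical, and representing a namespace as a prefix and URI pair and a name as a qualified String is an arbitrary choice made for illustration.

```java
/** Hypothetical stand-in declaring the same two abstract build methods. */
abstract class NameBuilderBase {
    protected abstract Object buildNamespace(String prefix, String uri);

    protected abstract Object buildName(Object ns, String local);
}

/** Hypothetical subclass that represents names as plain Strings. */
final class StringNameBuilder extends NameBuilderBase {

    @Override
    protected Object buildNamespace(String prefix, String uri) {
        // Namespace object for this representation: a prefix and URI pair.
        return new String[] {prefix, uri};
    }

    @Override
    protected Object buildName(Object ns, String local) {
        String[] pair = (String[]) ns;
        String prefix = pair[0];
        // Name object for this representation: "prefix:local" or just "local".
        return (prefix == null || prefix.isEmpty()) ? local : prefix + ":" + local;
    }
}
```

The base class passes these objects around as java.lang.Object, so each subclass is free to choose the types that suit its document model.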

