Elm  1.0
ELM is a library providing generic data structures, OS-independent interface, plugins and XML.
elm::xom::Serializer Class Reference

#include <elm/xom/Serializer.h>

Public Member Functions

 Serializer (io::OutStream &out_stream)
 
 Serializer (io::OutStream &out, string encoding)
 
const string & getEncoding (void) const
 
int getIndent (void) const
 
const string & getLineSeparator (void) const
 
int getMaxLength (void) const
 
bool getPreserveBaseURI (void) const
 
bool getUnicodeNormalizationFormC () const
 
void setIndent (int indent)
 
void setLineSeparator (string line_separator)
 
void setMaxLength (int max_length)
 
void setOutputStream (io::OutStream &out)
 
void setPreserveBaseURI (bool preserve)
 
void setUnicodeNormalizationFormC (bool normalize)
 
virtual void write (Document *doc)
 
void flush (void)
 

Protected Member Functions

int getColumnNumber (void)
 
virtual void breakLine (void)
 
virtual void write (Attribute *attribute)
 
virtual void write (Comment *comment)
 
virtual void write (DocType *doctype)
 
virtual void write (Element *element)
 
virtual void write (ProcessingInstruction *instruction)
 
virtual void write (Text *text)
 
virtual void writeAttributes (Element *element)
 
virtual void writeAttributeValue (String value)
 
virtual void writeChild (Node *node)
 
virtual void writeEmptyElementTag (Element *element)
 
virtual void writeEndTag (Element *element)
 
virtual void writeEscaped (String text)
 
virtual void writeNamespaceDeclaration (const string &prefix, const string &uri)
 
virtual void writeNamespaceDeclarations (Element *element)
 
virtual void writeRaw (String text, int length=-1)
 
virtual void writeStartTag (Element *element)
 
virtual void writeXMLDeclaration (void)
 

Detailed Description

Outputs a Document object in a specific encoding using various options for controlling white space, normalization, indenting, line breaking, and base URIs. However, in general these options do affect the document's infoset. In particular, if you set either the maximum line length or the indent size to a positive value, then the serializer will not respect input white space. It may trim leading and trailing space, condense runs of white space to a single space, convert carriage returns and linefeeds to spaces, add extra space where none was present before, and otherwise muck with the document's white space. The defaults, however, preserve all significant white space including ignorable white space and boundary white space.

Warning
This is a very limited version of the serializer:
  • supports only UTF-8 encoding,
  • no indentation, space, or newline support.
Author
H. Cassé casse@irit.fr

Constructor & Destructor Documentation

elm::xom::Serializer::Serializer ( io::OutStream &  out_stream)

Create a new serializer that uses the UTF-8 encoding.

Parameters
out_stream  the output stream to write the document on
elm::xom::Serializer::Serializer ( io::OutStream &  out,
string  encoding 
)

Create a new serializer that uses the specified encoding. The encoding must be recognized by libxml.

Parameters
out  the output stream to write the document on
encoding  the character encoding for the serialization

Member Function Documentation

void elm::xom::Serializer::breakLine ( void  )
protected virtual

Writes the current line break string onto the underlying output stream and indents as specified by the current level and the indent property.

Referenced by writeEmptyElementTag(), writeEndTag(), writeStartTag(), and writeXMLDeclaration().
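
The contract described for breakLine() can be pictured with a small stand-alone sketch. It is not ELM's implementation (the Warning above notes that this version has no indentation support); `line_break` is a hypothetical helper, and it assumes the indent property means "spaces per nesting level".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of ELM: builds what a breakLine()-style
// method would emit, i.e. the line separator followed by level * indent spaces.
std::string line_break(const std::string &separator, int level, int indent) {
    assert(level >= 0 && indent >= 0);  // negative values have no meaning here
    return separator + std::string(static_cast<std::size_t>(level) * indent, ' ');
}
```

With a separator of "\n", a level of 2 and an indent of 4, the helper returns a newline followed by eight spaces.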

void elm::xom::Serializer::flush ( void  )

Flush the out stream.

References elm::io::Output::flush().

int elm::xom::Serializer::getColumnNumber ( void  )
protected

Returns the current column number of the output stream. This method is useful for subclasses that implement their own pretty printing strategies by inserting white space and line breaks at appropriate points. Columns are counted based on Unicode characters, not UTF-8 chars. A surrogate pair counts as one character in this context, not two. However, a character followed by a combining character (e.g. e followed by combining accent acute) counts as two characters. This latter choice (treating combining characters like regular characters) is under review, and may change in the future if it's not too big a performance hit.

Returns
the current column number
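
The counting rule above can be illustrated without ELM. The sketch below assumes the line is held as UTF-8 bytes and counts one column per Unicode code point by skipping continuation bytes; `count_columns` is a hypothetical name, not an ELM function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of ELM: counts columns in a UTF-8 encoded
// line as Unicode code points, by skipping continuation bytes (10xxxxxx).
std::size_t count_columns(const std::string &utf8_line) {
    std::size_t columns = 0;
    for (unsigned char byte : utf8_line) {
        if ((byte & 0xC0) != 0x80)  // ASCII or lead byte: starts a new code point
            ++columns;
    }
    return columns;
}
```

Under this rule a precomposed é (bytes C3 A9) is one column, while e followed by the combining acute accent U+0301 (bytes 65 CC 81) is two columns, which matches the behaviour described above.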
const string& elm::xom::Serializer::getEncoding ( void  ) const
int elm::xom::Serializer::getIndent ( void  ) const
const string& elm::xom::Serializer::getLineSeparator ( void  ) const
int elm::xom::Serializer::getMaxLength ( void  ) const
bool elm::xom::Serializer::getPreserveBaseURI ( void  ) const
bool elm::xom::Serializer::getUnicodeNormalizationFormC ( ) const
void elm::xom::Serializer::setIndent ( int  indent)
void elm::xom::Serializer::setLineSeparator ( string  line_separator)
void elm::xom::Serializer::setMaxLength ( int  max_length)
void elm::xom::Serializer::setOutputStream ( io::OutStream &  out)
void elm::xom::Serializer::setPreserveBaseURI ( bool  preserve)
void elm::xom::Serializer::setUnicodeNormalizationFormC ( bool  normalize)
void elm::xom::Serializer::write ( Document *  doc)
virtual

Serializes a document onto the output stream using the current options.

Parameters
doc  the Document to serialize

References elm::xom::Document::getRootElement(), and writeXMLDeclaration().

Referenced by writeAttributes(), and writeChild().

void elm::xom::Serializer::write ( Attribute *  attribute)
protected virtual

Writes an attribute in the form name="value". Characters in the attribute value are escaped as necessary.

Parameters
attribute  the Attribute to write

References elm::xom::Attribute::getLocalName(), elm::xom::Attribute::getValue(), writeAttributeValue(), and writeRaw().

void elm::xom::Serializer::write ( Comment *  comment)
protected virtual

Writes a comment onto the output stream using the current options. Since character and entity references are not resolved in comments, comments can only be serialized when all characters they contain are available in the current encoding.

Parameters
comment  the Comment to serialize

References elm::xom::Comment::getText(), and writeRaw().

void elm::xom::Serializer::write ( DocType *  doctype)
protected virtual

Writes a DocType object onto the output stream using the current options.

Parameters
doctype  the document type declaration to serialize
void elm::xom::Serializer::write ( Element *  element)
protected virtual

Serializes an element onto the output stream using the current options. The result is guaranteed to be well-formed. If element does not have a parent element, the output will also be namespace well-formed.

If the element is empty, this method invokes writeEmptyElementTag. If the element is not empty, then:

  1. It calls writeStartTag.
  2. It passes each of the element's children to writeChild in order.
  3. It calls writeEndTag.

It may break lines or add white space if the serializer has been configured to indent or use a maximum line length.

References elm::xom::ParentNode::getChild(), elm::xom::ParentNode::getChildCount(), writeChild(), writeEmptyElementTag(), writeEndTag(), and writeStartTag().
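
The steps above can be sketched on a toy tree. `MiniElement` and `write_element` are hypothetical stand-ins rather than ELM types; the sketch ignores attributes, namespaces and text nodes and only shows the order in which the tags are produced.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an element node; ELM's Element differs.
struct MiniElement {
    std::string name;
    std::vector<MiniElement> children;
};

// Mirrors the documented control flow, appending the markup to `out`.
void write_element(const MiniElement &element, std::string &out) {
    assert(!element.name.empty());             // a tag needs a name
    if (element.children.empty()) {
        out += "<" + element.name + "/>";      // empty-element tag
        return;
    }
    out += "<" + element.name + ">";           // 1. start-tag
    for (const MiniElement &child : element.children)
        write_element(child, out);             // 2. each child, in order
    out += "</" + element.name + ">";          // 3. end-tag
}
```

For an element `a` with two empty children `b` and `c`, the sketch appends `<a><b/><c/></a>`.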

void elm::xom::Serializer::write ( ProcessingInstruction *  instruction)
protected virtual

Writes a processing instruction onto the output stream using the current options. Since character and entity references are not resolved in processing instructions, processing instructions can only be serialized when all characters they contain are available in the current encoding.

Parameters
instruction  the ProcessingInstruction to serialize
void elm::xom::Serializer::write ( Text *  text)
protected virtual

Writes a Text object onto the output stream using the current options. Reserved characters such as <, > and " are escaped using the standard entity references such as &lt;, &gt;, and &quot;.

Characters which cannot be encoded in the current character set (for example, Ω in ISO-8859-1) are encoded using character references.

Parameters
text  the Text to serialize

References elm::xom::Text::getValue(), and writeEscaped().
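
The fallback to character references can be sketched for the ISO-8859-1 case mentioned above. This is an illustration only, since this version of the class supports UTF-8 alone; `to_latin1_with_refs` is a hypothetical helper that takes already-decoded code points and leaves reserved-character escaping aside.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of ELM: encodes code points as ISO-8859-1
// bytes, falling back to a hexadecimal character reference above U+00FF.
std::string to_latin1_with_refs(const std::u32string &text) {
    std::string out;
    for (char32_t code_point : text) {
        if (code_point <= 0xFF) {
            out += static_cast<char>(code_point);  // directly representable
        } else {
            char ref[16];  // "&#x" + at most 8 hex digits + ";" + NUL fits
            std::snprintf(ref, sizeof ref, "&#x%lX;",
                          static_cast<unsigned long>(code_point));
            out += ref;
        }
    }
    return out;
}
```

Given Ω (U+03A9), which has no ISO-8859-1 byte, the helper yields `&#x3A9;`.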

void elm::xom::Serializer::writeAttributes ( Element *  element)
protected virtual

Writes all the attributes of the specified element onto the output stream, one at a time, separated by white space. If preserveBaseURI is true, and it is necessary to add an xml:base attribute to the element in order to preserve the base URI, then that attribute is also written here. Each individual attribute is written by invoking write(Attribute).

Parameters
element  the Element whose attributes are written

References elm::xom::Element::getAttribute(), elm::xom::Element::getAttributeCount(), and write().

Referenced by writeEmptyElementTag(), and writeStartTag().

void elm::xom::Serializer::writeAttributeValue ( String  value)
protected virtual

Writes a string onto the underlying output stream. Non-ASCII characters that are not available in the current character set are escaped using hexadecimal numeric character references. Carriage returns, line feeds, and tabs are also escaped using hexadecimal numeric character references in order to ensure their preservation on a round trip. The four reserved characters <, >, &, and " are escaped using the standard entity references &lt;, &gt;, &amp;, and &quot;. The single quote is not escaped.

Parameters
value  the attribute value to serialize

References elm::xom::escapeSimple(), elm::xom::isAttrEscape(), elm::CString::length(), and writeRaw().

Referenced by write().
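
A minimal sketch of the attribute-value escaping rules listed for writeAttributeValue(), written independently of ELM. `escape_attribute_value` is a hypothetical name, and the sketch assumes every character is encodable, so the "not available in the current character set" case is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of ELM: escapes the four reserved characters
// and the three white space characters that must survive a round trip.
std::string escape_attribute_value(const std::string &value) {
    std::string out;
    for (char c : value) {
        switch (c) {
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '&':  out += "&amp;";  break;
            case '"':  out += "&quot;"; break;
            case '\t': out += "&#x9;";  break;  // tab
            case '\n': out += "&#xA;";  break;  // line feed
            case '\r': out += "&#xD;";  break;  // carriage return
            default:   out += c;        break;  // includes the single quote
        }
    }
    return out;
}
```

The white space characters are written as character references because an XML parser would otherwise normalize them to plain spaces when reading the attribute back.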

void elm::xom::Serializer::writeChild ( Node *  node)
protected virtual

Writes a child node onto the output stream using the current options. It is invoked when walking the tree to serialize the entire document. It is not called, and indeed should not be called, for either the Document node or for attributes.

Parameters
node  the Node to serialize

References elm::xom::Node::COMMENT, elm::xom::Node::DOCUMENT, elm::xom::Node::ELEMENT, elm::xom::Node::kind(), elm::xom::Node::TEXT, and write().

Referenced by write().

void elm::xom::Serializer::writeEmptyElementTag ( Element *  element)
protected virtual

Writes an empty-element tag for the element including all its namespace declarations and attributes.

The writeAttributes method is called to write all the non-namespace-declaration attributes. The writeNamespaceDeclarations method is called to write all the namespace declaration attributes.

If subclasses don't wish empty-element tags to be used, they can override this method to simply invoke writeStartTag followed by writeEndTag.

Parameters
element  the element whose empty-element tag is written

References breakLine(), elm::xom::Element::getLocalName(), writeAttributes(), and writeRaw().

Referenced by write().
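
The override suggested for writeEmptyElementTag() can be sketched on a miniature class. `MiniTagWriter` and `ExpandedTagWriter` are hypothetical; they echo the method names above but take a plain tag name instead of an Element, and they are not ELM's Serializer.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the tag-writing hooks, collecting output in a string.
class MiniTagWriter {
public:
    virtual ~MiniTagWriter() = default;
    std::string output;
    virtual void writeStartTag(const std::string &name) { output += "<" + name + ">"; }
    virtual void writeEndTag(const std::string &name) { output += "</" + name + ">"; }
    virtual void writeEmptyElementTag(const std::string &name) { output += "<" + name + "/>"; }
};

// Subclass that never emits <name/>: it writes <name></name> instead.
class ExpandedTagWriter : public MiniTagWriter {
public:
    void writeEmptyElementTag(const std::string &name) override {
        writeStartTag(name);
        writeEndTag(name);
    }
};
```

Because the base class calls the virtual hook, swapping in the subclass changes the output for empty elements without touching the tree-walking code.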

void elm::xom::Serializer::writeEndTag ( Element *  element)
protected virtual

Writes the end-tag for an element in the form </name>.

Parameters
element  the element whose end-tag is written

References breakLine(), elm::xom::Node::ELEMENT, elm::xom::String::free(), elm::xom::ParentNode::getChild(), elm::xom::ParentNode::getChildCount(), elm::xom::Element::getQualifiedName(), and writeRaw().

Referenced by write().

void elm::xom::Serializer::writeEscaped ( String  text)
protected virtual

Writes a string onto the underlying output stream. Non-ASCII characters that are not available in the current character set are encoded with numeric character references. The three reserved characters <, >, and & are escaped using the standard entity references &lt;, &gt;, and &amp;. Double and single quotes are not escaped.

Parameters
text  the parsed character data to serialize

References elm::xom::escapeSimple(), elm::xom::isTextEscape(), elm::CString::length(), and writeRaw().

Referenced by write().
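
For comparison with attribute values, a minimal sketch of the character-data rules given for writeEscaped(): only three characters are replaced and quotes pass through. `escape_text` is a hypothetical helper, not an ELM function, and it assumes all characters are encodable.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of ELM: escapes <, > and & in character data.
std::string escape_text(const std::string &text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            case '&': out += "&amp;"; break;
            default:  out += c;       break;  // quotes are left untouched
        }
    }
    return out;
}
```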

void elm::xom::Serializer::writeNamespaceDeclaration ( const string &  prefix,
const string &  uri 
)
protected virtual

Writes a namespace declaration in the form xmlns:prefix="uri" or xmlns="uri". It does not write the spaces on either side of the namespace declaration. These are written by writeNamespaceDeclarations.

Parameters
prefix  the namespace prefix; the empty string for the default namespace
uri  the namespace URI
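
The two output forms of writeNamespaceDeclaration() can be sketched as follows. `namespace_declaration` is a hypothetical helper, not ELM code, and it assumes the URI contains no character that would need escaping inside an attribute value.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of ELM: formats one namespace declaration
// without surrounding spaces. An empty prefix selects the default namespace.
std::string namespace_declaration(const std::string &prefix, const std::string &uri) {
    const std::string name =
        prefix.empty() ? std::string("xmlns") : "xmlns:" + prefix;
    return name + "=\"" + uri + "\"";
}
```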
void elm::xom::Serializer::writeNamespaceDeclarations ( Element *  element)
protected virtual

Writes all the namespace declaration attributes of the specified element onto the output stream, one at a time, separated by white space. Each individual declaration is written by invoking writeNamespaceDeclaration.

Parameters
element  the Element whose namespace declarations are written
void elm::xom::Serializer::writeRaw ( String  text,
int  length = -1 
)
protected virtual

Writes a string onto the underlying output stream without escaping any characters. Non-ASCII characters that are not available in the current character set cause an IOException.

Parameters
text  the String to serialize
length  length of the string (optional)

References elm::CString::length(), elm::io::Output::stream(), and elm::io::OutStream::write().

Referenced by write(), writeAttributeValue(), writeEmptyElementTag(), writeEndTag(), writeEscaped(), writeStartTag(), and writeXMLDeclaration().

void elm::xom::Serializer::writeStartTag ( Element *  element)
protected virtual

Writes the start-tag for the element including all its namespace declarations and attributes.

The writeAttributes method is called to write all the non-namespace-declaration attributes. The writeNamespaceDeclarations method is called to write all the namespace declaration attributes.

Parameters
element  the element whose start-tag is written

References breakLine(), elm::xom::Node::ELEMENT, elm::xom::String::free(), elm::xom::ParentNode::getChild(), elm::xom::ParentNode::getChildCount(), elm::xom::Element::getNamespaceDeclarationCount(), elm::xom::Element::getNamespacePrefix(), elm::xom::Element::getNamespaceURI(), elm::xom::Element::getQualifiedName(), elm::xom::Node::kind(), writeAttributes(), and writeRaw().

Referenced by write().

void elm::xom::Serializer::writeXMLDeclaration ( void  )
protected virtual

Writes the XML declaration onto the output stream, followed by a line break.

References breakLine(), elm::String::toCString(), and writeRaw().

Referenced by write().
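
As an illustration of the output shape, the sketch below builds a standard XML 1.0 declaration followed by a line break. `xml_declaration` is a hypothetical helper; the exact text emitted by ELM's writeXMLDeclaration() is not specified on this page, so treat the format as an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of ELM: builds an XML 1.0 declaration for the
// given encoding, terminated by the configured line separator.
std::string xml_declaration(const std::string &encoding,
                            const std::string &line_separator) {
    assert(!encoding.empty());  // the encoding pseudo-attribute needs a value
    return "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>" + line_separator;
}
```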


The documentation for this class was generated from the following files: