PEG markup_xhtmlmodularized--tjl:

Author: Tuomas J. Lukka
Last-Modified: 2003-04-09
Revision: 1.4
Status: Current

Marked-up text has always been a difficult but focal point for us. This PEG proposes representing it with some modules from modularized XHTML (http://www.w3.org/TR/xhtml-modularization/, http://www.w3.org/TR/xhtml11/).

This PEG extends and changes the Alph PEG styled_text--benja by giving a concrete proposal for implementing the "formatted xanalogical text", with some properties that differ from that PEG.

Issues

Changes

The only module suitable for the current proposal appears to be the Text module, possibly at first only its Inline Phrasal subset.

In the future, the next modules to consider are probably the List module, the Presentation module, the Style Sheet module, and the Style Attribute module.

For example, the following would be a legal marked-up xanalogical text fragment. The namespace h refers to our subset of modularized XHTML and the namespace a to the alph namespace.

...<a:ts b="..." s="15" e="20"/><h:em><a:ts b="..." s="20" e="22"/> <a:uts b="..." s="42" t="[his]"/></h:em>...

This shows the two text span types, ts and uts, and how the ts span has been split at the onset of the emphasis.
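The split can be sketched in Java as follows. This is only an illustration: the class TextSpan, its methods, the block id "blockA" used below, and the reading of s and e as half-open offsets are all assumptions, not part of this PEG.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an <a:ts> element: a block id plus offsets.
// Assumption: s is inclusive and e is exclusive, i.e. the span is [s, e).
final class TextSpan {
    final String block;
    final int start;
    final int end;

    TextSpan(String block, int start, int end) {
        if (start > end) {
            throw new IllegalArgumentException("start must not exceed end");
        }
        this.block = block;
        this.start = start;
        this.end = end;
    }

    // Split at an absolute offset inside the block.
    // Returns two pieces if the offset falls strictly inside the span,
    // otherwise a single-element list containing this span unchanged.
    List<TextSpan> splitAt(int offset) {
        List<TextSpan> parts = new ArrayList<>();
        if (offset <= start || offset >= end) {
            parts.add(this);
        } else {
            parts.add(new TextSpan(block, start, offset));
            parts.add(new TextSpan(block, offset, end));
        }
        return parts;
    }

    // Serialize in the attribute layout used by the fragment above.
    String toXml() {
        return "<a:ts b=\"" + block + "\" s=\"" + start + "\" e=\"" + end + "\"/>";
    }
}
```

Under these assumptions, new TextSpan("blockA", 15, 22).splitAt(20) yields one piece covering 15 to 20 and one covering 20 to 22, which is the pair of ts elements on either side of the opening h:em tag in the fragment.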

We need to define a suitable API for accessing the text. A miniature variant of DOM, with xanalogical operators, seems appropriate.

First, some parts of DOM will be disabled for this use; in fact, only the classes Attr, CharacterData, DocumentFragment, Element, and Node will be used.
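As a rough illustration of this restriction, the check below uses the standard org.w3c.dom interfaces as a stand-in for the miniature DOM, which this PEG does not yet define. The class DomSubset and the method isAllowed are hypothetical names, and mapping CharacterData to the text, CDATA section, and comment node types is an assumption based on the standard DOM interface hierarchy.

```java
import org.w3c.dom.Node;

// Hypothetical helper: decides whether a standard DOM node belongs to
// the subset named above (Attr, CharacterData, DocumentFragment, Element).
final class DomSubset {
    private DomSubset() {}

    static boolean isAllowed(Node node) {
        switch (node.getNodeType()) {
            case Node.ATTRIBUTE_NODE:          // Attr
            case Node.TEXT_NODE:               // CharacterData (Text)
            case Node.CDATA_SECTION_NODE:      // CharacterData (CDATASection)
            case Node.COMMENT_NODE:            // CharacterData (Comment)
            case Node.DOCUMENT_FRAGMENT_NODE:  // DocumentFragment
            case Node.ELEMENT_NODE:            // Element
                return true;
            default:
                // Document, DocumentType, Entity, Notation, and so on.
                return false;
        }
    }
}
```

A real implementation would more likely define its own reduced interfaces than filter the full DOM; the sketch only makes the boundary of the subset explicit.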

An additional node type, XuText, will be defined; each XuText corresponds to one or more Xanalogical spans. This implies that the alph text elements shall never be seen as DOM elements.

Also, the interface Node shall be extended with the Xanalogical overlap query:

boolean overlaps(SpanCollection coll);

where SpanCollection is some type of overlap interface for an unordered collection of spans, which Node shall also implement.
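One possible shape for this interface is sketched below. Everything except the overlaps signature is an assumption: the Span class, the spans() accessor, the half-open offsets, the block id strings, and the pairwise comparison are placeholders for whatever Alph actually provides. The XuText class here is only a stand-in for the proposed node type and omits the rest of the Node interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical span: a block id plus half-open offsets [start, end).
final class Span {
    final String block;
    final int start;
    final int end;

    Span(String block, int start, int end) {
        this.block = block;
        this.start = start;
        this.end = end;
    }

    // Two spans overlap if they address the same block and their ranges intersect.
    boolean overlaps(Span other) {
        return block.equals(other.block) && start < other.end && other.start < end;
    }
}

// Assumed shape of the overlap interface; only overlaps() comes from the PEG.
interface SpanCollection {
    List<Span> spans();

    // Naive pairwise check; a real implementation would likely use an index.
    default boolean overlaps(SpanCollection coll) {
        for (Span mine : spans()) {
            for (Span theirs : coll.spans()) {
                if (mine.overlaps(theirs)) {
                    return true;
                }
            }
        }
        return false;
    }
}

// Stand-in for the proposed XuText node: a leaf holding one or more spans.
final class XuText implements SpanCollection {
    private final List<Span> spans;

    XuText(List<Span> spans) {
        this.spans = Collections.unmodifiableList(new ArrayList<>(spans));
    }

    @Override
    public List<Span> spans() {
        return spans;
    }
}
```

In this sketch an Element would answer spans() by collecting the spans of its children, so that overlaps can be asked of any node in a fragment.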