Generalized Content Markup Languages
(An Expert Handlers SIG Reference)
SGML-Based Languages and Derivatives
HTML: HyperText Markup Language, version 4.01
HyperText Markup Language (HTML) is the base markup language of the World Wide Web. This specification defines HTML 4.01, which is a subversion of HTML 4. In addition to the text, multimedia, and hyperlink features of the previous versions of HTML (HTML 3.2 and HTML 2.0), HTML 4 supports more multimedia options, scripting languages, style sheets, better printing facilities, and documents that are more accessible to users with disabilities. HTML 4 also takes great strides towards the internationalization of documents, with the goal of making the Web truly World Wide. HTML 4 is an SGML application conforming to International Standard ISO 8879 -- Standard Generalized Markup Language. Note that work has begun on the fifth revision of HTML.
XML: the Extensible Markup Language
The Extensible Markup Language (XML) is a subset of SGML. Its goal is to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML.
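As a rough illustration of the "ease of implementation" goal, the sketch below reads a small XML document with Python's standard-library xml.etree.ElementTree module. The element and attribute names (catalog, language, id, parent) and the helper list_languages are made up for this example and do not come from any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the vocabulary is invented for illustration.
SAMPLE = """<catalog>
  <language id="xml" parent="sgml">Extensible Markup Language</language>
  <language id="html" parent="sgml">HyperText Markup Language</language>
</catalog>"""


def list_languages(xml_text):
    """Return (id, name) pairs for every <language> element in the document."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    return [(el.get("id"), el.text) for el in root.iter("language")]


if __name__ == "__main__":
    for lang_id, name in list_languages(SAMPLE):
        print(lang_id, "=", name)
```

Because a generic XML parser only needs the document to be well-formed, the same few lines work for any vocabulary, which is roughly what "generic SGML on the Web" is aiming at.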
XHTML: eXtensible HyperText Markup Language
XHTML 1.0 is a reformulation of HTML 4.01 as an XML 1.0 application; its specification also defines three DTDs corresponding to the ones defined by HTML 4. The semantics of the elements and their attributes are defined in the W3C Recommendation for HTML 4. These semantics provide the foundation for the future extensibility of XHTML. Compatibility with existing HTML user agents is possible by following a small set of guidelines.
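One practical consequence of XHTML being an XML application is that an ordinary XML parser can read it, while HTML-style markup with unclosed tags is rejected. The sketch below assumes Python's standard-library xml.etree.ElementTree; the two sample documents and the helper names is_well_formed and page_title are invented for illustration, and the DOCTYPE declaration is omitted for brevity.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # the XHTML namespace URI

# Well-formed XHTML: every element is closed, including the empty <br />.
XHTML_DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>First line<br />second line</p></body>
</html>"""

# HTML-style markup: <br> is never closed, so this is not well-formed XML.
HTML_STYLE_DOC = """<html>
  <head><title>Example</title></head>
  <body><p>First line<br>second line</p></body>
</html>"""


def is_well_formed(text):
    """Return True if an XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def page_title(text):
    """Return the text of the XHTML <title> element, or None if absent."""
    root = ET.fromstring(text)
    title = root.find(f"{{{XHTML_NS}}}head/{{{XHTML_NS}}}title")
    return title.text if title is not None else None


if __name__ == "__main__":
    print(is_well_formed(XHTML_DOC))
    print(is_well_formed(HTML_STYLE_DOC))
    print(page_title(XHTML_DOC))
```

Writing the empty element as `<br />`, with a space before the slash, keeps the document well-formed while remaining readable by older HTML user agents.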
- SGML: Standard Generalized Markup Language
- SGML is a meta-language in which one can define markup languages for documents.
- recommended resource: The W3C has an excellent Overview of SGML Resources.
Other (Non-SGML-Based) General Markup Languages
- DocBook :
- DocBook is a schema (available in several languages, including RELAX NG, SGML and XML DTDs, and W3C XML Schema) maintained by the DocBook Technical Committee of OASIS. DocBook is particularly well suited to books and papers about computer hardware and software (though it is by no means limited to these applications). Because it is a large and robust schema, and because its main structures correspond to the general notion of what constitutes a "book", DocBook has been adopted by a large and growing community of authors writing books of all kinds. DocBook is supported "out of the box" by a number of commercial tools, and there is rapidly expanding support for DocBook in a number of free software environments. These features have combined to make DocBook a generally easy to understand, widely useful, and very popular schema. Dozens of organizations are using DocBook for millions of pages of documentation, in various print and online formats, worldwide.
- recommended resource: DocBook.org, "The Source for Documentation", maintained by Norman Walsh and Leonard Muellner
- EBML: Extensible Binary Meta Language :
- EBML is a generalized file format for any kind of data. The goal of EBML is to be a binary equivalent to XML. It provides a basic framework for storing data in XML-like tags. EBML is not extensible in the same way that XML is, as the Document Type Definition must be known in advance.
- LaTeX :
- LaTeX is a document markup language and document preparation system for the TeX typesetting program. It is widely used by mathematicians, scientists, philosophers, engineers, and scholars in academia and the commercial world. LaTeX is used as a primary or intermediate format (for example, when translating DocBook and other XML-based formats to PDF) because of the quality of typesetting achievable by TeX. The typesetting system offers programmable desktop publishing features and extensive facilities for automating most aspects of typesetting and desktop publishing, including numbering and cross-referencing, tables and figures, page layout and bibliographies. LaTeX is intended to provide a high-level language that accesses the power of TeX. LaTeX essentially comprises a collection of TeX macros and a program to process LaTeX documents. Because the TeX formatting commands are very low-level, it is usually much simpler for end-users to use LaTeX.
- recommended resource: Official LaTeX Project website (dedicated to open development of LaTeX; the site contains links to and documentation for LaTeX2e, available only as a PDF file, and to experimental pre-release code which may be used in LaTeX3)
- YAML: YAML Ain't Markup Language :
- YAML is a human-readable data serialization format that takes concepts from languages such as XML, C, Python, and Perl, as well as the format for electronic mail as specified by RFC 2822. YAML is a recursive acronym meaning "YAML Ain't Markup Language". Early in its development, YAML stood for "Yet Another Markup Language"; the name was later reinterpreted to emphasize that its purpose is data-centric, rather than document markup.
- recommended resource: YAML Cookbook: Equivalent data structures in YAML and Ruby
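To make the "data serialization" description of YAML concrete, the sketch below renders a nested Python structure as block-style YAML text using only the standard library. It is a deliberately tiny, hand-rolled emitter rather than a conforming YAML implementation: the function names scalar and to_yaml and the sample entry are invented here, mapping keys are assumed to be plain identifier-like strings, and real projects would normally use an established YAML library instead.

```python
import json


def scalar(value):
    """Render one simple value as a YAML scalar (small subset only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return json.dumps(value)  # a JSON string, used here as a double-quoted scalar
    if value == {}:
        return "{}"
    if value == []:
        return "[]"
    raise TypeError(f"unsupported value: {value!r}")


def to_yaml(value, indent=0):
    """Return a list of block-style YAML lines for dicts, lists, and scalars."""
    pad = "  " * indent
    lines = []
    if isinstance(value, dict) and value:
        for key, item in value.items():
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}{key}:")  # nested collection on following lines
                lines.extend(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}{key}: {scalar(item)}")
    elif isinstance(value, list) and value:
        for item in value:
            if isinstance(item, (dict, list)) and item:
                lines.append(f"{pad}-")
                lines.extend(to_yaml(item, indent + 1))
            else:
                lines.append(f"{pad}- {scalar(item)}")
    else:
        lines.append(f"{pad}{scalar(value)}")
    return lines


# Hypothetical sample data, loosely based on the YAML entry above.
ENTRY = {
    "name": "YAML",
    "markup_language": False,
    "influences": ["XML", "C", "Python", "Perl"],
    "spec": {"mail_format": "RFC 2822"},
}

if __name__ == "__main__":
    print("\n".join(to_yaml(ENTRY)))
```

The nesting of the output is carried by indentation alone, which is the main visual difference from the tag-based languages listed earlier in this reference.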