Markup Languages

A markup language is
  • a system for annotating a text in a way that is syntactically distinguishable from that text. (Wikipedia, 4/20/2010)
  • a set of symbols and rules for their use when doing a markup of a document. (WordNet Search)
  • a formal way of annotating a document or collection of digital data using embedded encoding tags to indicate the structure of the document or datafile and the contents of its data elements. (eGovernment Resource Center)
The term "markup" is derived from the traditional publishing practice of "marking up" a manuscript, which involves adding handwritten annotations, in the form of conventional symbolic printer's instructions, in the margins and text of a paper manuscript or printed proof.

Taxonomy of Markup

There are three categories of electronic markup: presentational, procedural, and descriptive (Coombs et al., 1987):
  • Presentational markup - This category of markup is used by traditional word-processing systems, where binary codes are embedded in the document text to produce the WYSIWYG effect. Such markup is usually designed to be hidden from human users, even those who are authors or editors.
  • Procedural markup - In many text-processing systems, presentational markup is replaced by procedural markup, which consists of commands indicating how text should be formatted. Well-known examples include troff, LaTeX, and PostScript.
  • Descriptive markup - Under the descriptive system of markup, authors identify the element types of text tokens. This category of markup, often described as "semantic," is used to label parts of the document rather than to provide specific instructions as to how they should be processed.  Examples include SGML, HTML, XHTML, and XML.
(Source: Coombs et al., 1987)
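
The difference between descriptive and procedural markup can be illustrated with a small sketch. The Python example below is only an illustration under stated assumptions: the element names (memo, title, para, emph) and both rendering functions are invented for this example and do not come from any standard. It shows one descriptively marked-up document being turned into two different presentations, one of which is a procedural form loosely modeled on troff requests.

```python
import xml.etree.ElementTree as ET

# A tiny descriptively marked-up document: the tags say what each part is,
# not how it should look. The element names are hypothetical.
DOC = """<memo>
  <title>Budget review</title>
  <para>Figures are due <emph>Friday</emph>.</para>
</memo>"""


def inline_text(para, open_mark, close_mark):
    """Flatten a <para>, wrapping each <emph> child in the given marks."""
    text = para.text or ""
    for child in para:
        if child.tag == "emph":
            text += open_mark + (child.text or "") + close_mark
        else:
            text += child.text or ""
        text += child.tail or ""
    return text


def render_plain(root):
    """One presentation: upper-case title, asterisks around emphasis."""
    lines = [root.findtext("title").upper()]
    for para in root.iter("para"):
        lines.append(inline_text(para, "*", "*"))
    return "\n".join(lines)


def render_procedural(root):
    """Another presentation: formatting commands loosely modeled on troff
    (.ce centers the next line, .sp adds space, \\fI and \\fP switch fonts)."""
    lines = [".ce 1", root.findtext("title")]
    for para in root.iter("para"):
        lines.append(".sp")
        lines.append(inline_text(para, "\\fI", "\\fP"))
    return "\n".join(lines)


root = ET.fromstring(DOC)
print(render_plain(root))
print(render_procedural(root))
```

The document itself never changes; only the function that processes it does. This is the sense in which descriptive markup labels parts of a document rather than prescribing how they are processed.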


  • 1986 - SGML (Standard Generalized Markup Language) was published as an ISO standard (ISO 8879) for defining generalized markup languages for documents.
  • October 1991 - "HTML Tags," an informal CERN (European Organization for Nuclear Research) document and the first publicly available description of HTML, was mentioned on the Internet by Berners-Lee.
  • July 1992 - HTML DTD 1.1 was published. (A DTD, "Document Type Definition" or "DOCTYPE," is a set of markup declarations that define a document type for SGML-family markup languages.)
  • June 1993 - HTML (Hypertext Markup Language), considered as an application of SGML, was formally defined by the Internet Engineering Task Force (IETF).
  • 1995 - HTML 2.0, completed by an HTML Working Group of the IETF, was the first HTML specification intended to be treated as a standard for future implementations.
  • 1996 - IETF closed its HTML Working Group, and the HTML specifications started to be maintained, with input from commercial software vendors, by the World Wide Web Consortium (W3C). 
  • July 1996 - The first Working Draft of an XML (Extensible Markup Language) specification was published by the XML Working Group.
  • January 1997 - HTML 3.2 was published as the first version developed and standardized exclusively by the W3C. (IETF closed its HTML Working Group in September 1996.)
    • Math formulas dropped entirely.
    • Various proprietary extensions reconciled.
    • Most of Netscape's visual markup tags adopted. 
    • Netscape's blink element and Microsoft's marquee element omitted by mutual agreement between the two companies.
  • December 1997 - HTML 4.0, initially code-named "Cougar," was published with the aim of separating structure and presentation. It offers three variations:
    • Strict - in which deprecated elements are forbidden,
    • Transitional - in which deprecated elements are allowed,
    • Frameset - in which mostly frame-related elements are allowed.
  • February 1998 - XML 1.0 became a W3C Recommendation.
  • 1999 - HTML 4.01 was published by W3C.
  • 2000 - HTML became an international standard (ISO/IEC 15445:2000). 
  • January 2000 - XHTML (Extensible Hypertext Markup Language) 1.0, as a separate language and a reformulation of HTML 4.01 using XML 1.0, was published as a W3C Recommendation.
  • May 2001 - XHTML 1.1, based on XHTML 1.0 Strict, was published as a W3C Recommendation. 
  • January 2008 - HTML 5, which aims to reduce the need for proprietary plug-in-based rich internet application (RIA) technologies such as Adobe Flash, Microsoft Silverlight, Apache Pivot, and Sun JavaFX, was published as a Working Draft by the W3C.
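
The timeline entry on XHTML 1.0 describes it as a reformulation of HTML 4.01 using XML 1.0. One practical consequence is that XHTML markup must be well-formed XML, so a generic XML parser can read it. The sketch below is an illustration only: the helper name is_well_formed_xml and the two sample fragments are invented for this example, and it checks well-formedness, not validity against any DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(fragment: str) -> bool:
    """Return True if a generic XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML 4.01 allows <br> without a closing tag; XML does not.
html_style = "<p>First line<br>second line</p>"
# XHTML writes the same element as an empty-element tag.
xhtml_style = "<p>First line<br />second line</p>"

print(is_well_formed_xml(html_style))   # False
print(is_well_formed_xml(xhtml_style))  # True
```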


  • Coombs, Renear, and DeRose (November 1987). "Markup systems and the future of scholarly text processing." Communications of the ACM.
  • Wikipedia