.. -*- rst-mode -*-

   :Copyright: 2006 Guenter Milde.
               Released under the terms of the GNU General Public License
               (v. 2 or later)

Literate Programming
********************

The Philosophy of Literate Programming
======================================

The philosophy of *literate programming* (LP) and the term itself were the
creation of Donald Knuth, the author of :title-reference:`The Art of
Computer Programming`, the TeX typesetting system, and other works of the
programming art.

  I believe that the time is ripe for significantly better documentation of
  programs, and that we can best achieve this by considering programs to be
  works of literature. Hence, my title: "Literate Programming."

  Let us change our traditional attitude to the construction of programs:
  Instead of imagining that our main task is to instruct a computer what to
  do, let us concentrate rather on explaining to humans what we want the
  computer to do.

  -- Donald E. Knuth, `Literate Programming`_,  1984

.. _definition:

The `Literate Programming FAQ`_ gives the following definition:

   *Literate programming* is the combination of documentation and source
   together in a fashion suited for reading by human beings.

and the `Wikipedia`_ says:

   *Literate programming* is a philosophy of computer programming based on
   the premise that a computer program should be written with human
   readability as a primary goal, similarly to a work of literature.

   According to this philosophy, programmers should aim for a "literate"
   style in their programming just as writers aim for an intelligible and
   articulate style in their writing. This philosophy contrasts with the
   mainstream view that the programmer's primary or sole objective is to
   create source code and that documentation should only be a secondary
   objective.

John Max Skaller put the philosophy of literate programming in one sentence

    The idea is that you do not document programs (after the fact), but
    write documents that *contain* the programs.

    -- John Max Skaller in a `Charming Python interview`_.


.. _`Literate Programming FAQ`:
   http://www.literateprogramming.com/lpfaq.pdf
.. _Literate Programming: WEB_
.. _Charming Python interview:
   http://www.ibm.com/developerworks/library/l-pyth7.html


LP Technique and Systems
==========================================

Donald Knuth not only conceived the *philosophy* of literate programming,
he also wrote the first LP *system* `WEB`_ that
implements the *technique* of literate programming:

.. _`technique of literate programming`:

  WEB itself is chiefly a combination of two other languages:

  (1) a document formatting language and
  (2) a programming language.

  My prototype WEB system uses TEX as the document formatting language
  and PASCAL as the programming language, but the same principles would
  apply equally well if other languages were substituted. [...]. The main
  point is that WEB is inherently bilingual, and that such a combination
  of languages proves to be much more powerful than either single
  language by itself. WEB does not make the other languages obsolete; on
  the contrary, it enhances them.

  -- Donald E. Knuth, `Literate Programming`_,  1984

.. _`literate programming system`:
.. _`LP system`:

There exists a famous *statement of requirements* for a LP
system:

.. _statement of requirements:

  *Literate programming systems* have the following properties:

  1. Code and extended, detailed comments are intermingled.

  2. The code sections can be written in whatever order is best for people
     to understand, and are re-ordered automatically when the computer needs
     to run the program.

  3. The program and its documentation can be handsomely typeset into a
     single article that explains the program and how it works. Indices and
     cross-references are generated automatically.

  ..  POD only does task 1, but the other tasks are much more important.

  -- Mark-Jason Dominus `POD is not Literate Programming`_


The article [contemporary-literate-programming]_ provides a survey on
different implementations of the literate programming technique.


LP with reStructuredText
==========================================

PyLit fails the `statement of requirements`_ for a LP
system, as it does not provide for code re-ordering, automatic indices and
cross-references.

According to the Wikipedia, PyLit qualifies as *semi-literate* (while POD is
classified as `embedded documentation`_):

  It is the fact that documentation can be written freely whereas code must
  be marked in a special way that makes the difference between semi-literate
  programming and excessive documenting, where the documentation is embedded
  into the code as comments.

  -- `Wikipedia entry on literate programming`_

However, it is argued that PyLit is an *educated choice* as tool for the
`technique of literate programming`_:

* PyLit is inherently bilingual. The document containing the program is
  a combination of:

  (1) a document formatting language (`reStructuredText`_) and
  (2) a programming language.

* Program and documentation can be typeset in a single article as well as
  converted to a hyperlinked HTML document representing the program's
  "web-like" structure. For indexing and cross-references, PyLit relies
  on the rich features of the `reStructuredText`_ document formatting
  language. Auto-generating of cross references and indices is
  possible if the reStructuredText sources are converted with Sphinx_

Dispensing with code re-ordering enables the `dual-source`_ concept. Code
re-ordering is considered less important when LP is not
seen as an *alternative* to structured programming but a technique
*orthogonal* to structured, object oriented or functional programming
styles:

    The computer science and programming pioneers, like Wirth and Knuth,
    used another programming style than we recommend today. One notable
    difference is the use of procedure abstractions. When Wirth and Knuth
    wrote their influential books and programming, procedure calls were used
    with care, partly because they were relatively expensive. Consequently,
    they needed an extra level of structuring within a program or procedure.
    'Programming by stepwise refinement' as well as 'literate programming'
    can be seen as techniques and representations to remedy the problem. The
    use of many small (procedure) abstractions goes in another direction. If
    you have attempted to use 'literate programming' on a program (maybe an
    object-oriented program) with lots of small abstractions, you will
    probably have realized that there is a misfit between, on the one hand,
    being forced to name literal program fragments (scraps) and on the
    other, named program abstractions. We do not want to leave the
    impression that this is a major conceptual problem, but it is the
    experience of the author that we need a variation of literate
    programming, which is well-suited to programs with many, small (named)
    abstractions.

    -- Kurt Nørmark: `Literate Programming - Issues and Problems`_

Ordering code for optimal presentation to humans is still a problem in
many "traditional" programming languages. Pythons dynamic name resolution
allows a great deal of flexibility in the order of code.

.. _POD is not Literate Programming:
     http://www.perl.com/pub/a/tchrist/litprog.html
.. _Wikipedia entry on literate programming:
.. _Wikipedia: http://en.wikipedia.org/wiki/Literate_programming
.. _embedded documentation:
    https://en.wikipedia.org/wiki/Literate_programming
    #Contrast_with_documentation_generation
.. _`Literate Programming - Issues and Problems`:
    http://www.cs.aau.dk/~normark/litpro/issues-and-problems.html
.. _dual-source: /features/index.html#dual-source
.. _Sphinx: https://www.sphinx-doc.org/


Existing Frameworks
===================

There are many LP systems available, both sophisticated
and lightweight. This non-comprehensive list compares them to PyLit.


WEB and friends
---------------

The first published literate programming system was WEB_ by Donald Knuth. It
uses Pascal as programming language and TeX as document formatting language.

CWEB_ is a newer version of WEB for the C programming language. Other
implementations of the concept are the "multi-lingual" noweb_ and
FunnelWeb_. All of them use TeX as document formatting language.

Some of the problems with WEB-like systems are:

* The steep learning curve for mastering TeX, special WEB commands and
  the actual programming language in parallel discourages potential literate
  programmers.

* While a printout has the professional look of a TeX-typeset essay, the
  programmer will spend most time viewing and editing the source. A WEB
  source in a text editor is not "suited for reading by human beings":

    In Knuth's work, beautiful literate programs were woven from ugly mixes of
    markups in combined program and documentation files. The literate program
    was directly targeted towards paper and books. Of obvious reasons,
    programmers do not very often print their programs on paper. (And as
    argued above, few of us intend to publish our programs in a book). Within
    hours, such a program print out is no longer up-to-date. And furthermore,
    being based on a paper representation, we cannot use the computers dynamic
    potentials, such as outlining and searching.

    -- Kurt Nørmark: `Literate Programming - Issues and Problems`_

How does PyLit differ from the web-like LP frameworks
(For a full discussion of PyLit's concepts see the Features_ page.):

* PyLit is not a complete `LP system`_ but a simple tool.
  It does not support re-arranging of named code chunks.

* PyLit uses `reStructuredText`_ as document formatting language.
  The literate program source is legible in any text editor.

  Even the code source is "in a fashion suited for reading by human beings",
  in opposition to source code produced by a full grown web system:

    Just consider the output of ``tangle`` to be object code: You can run
    it, but you don't want to look at it.

    -- http://www.perl.com/pub/a/tchrist/litprog.html

* PyLit is a *bidirectional* text <-> code converter.

  + You can start a literate program from a "normal" code source by
    converting to a text source and adding the documentation parts.
  + You can edit code in its "native" mode in any decent text editor or IDE.
  + You can debug and fix code with the standard tools using the code source
    Line numbers are the same in text and code format. Once ready, forward
    your changes to the text version with a code->text conversion.

  - PyLit cannot support export to several code files from one text source.


.. _Docutils: https://docutils.sourceforge.io/
.. _Features: /features/index.html

.. _WEB: http://www.literateprogramming.com/knuthweb.pdf
.. _CWEB: https://tug.ctan.org/web/cweb/cwebman.pdf
.. _noweb: https://www.cs.tufts.edu/~nr/noweb/
.. _FunnelWeb: http://www.ross.net/funnelweb/



Interscript
-----------

Interscript_ is a complex framework that uses Python both for its
implementation and to supply scripting services to client sourceware.

  Interscript is different, because the fundamental paradigm is extended so
  that there are three kinds of LP sections in file:

  1. Code sections
  2. Documentation sections
  3. Scripting sections

  where the scripting sections [...] control and possibly generate both
  documentation and code.

  -- Interscript_ homepage

.. _Interscript: http://interscript.sourceforge.net/

Ly
--

Ly_, the "lyterate programming thingy." is an engine for Literate Programming.
The design considerations look very much like the ones for PyLit:

  So why would anybody use Ly instead of the established Literate
  Programming tools like WEB or noweb?

    * Elegance. I do not consider the source code of WEB or noweb to be
      particularly nice. I especially wanted a syntax that allows me to easily
      write a short paragraph, then two or three lines of code, then another
      short paragraph. Ly accomplishes this by distinguishing code from text
      through indentation.
    * HTML. WEB and CWEB are targeted at outputting TeX. This is a valid
      choice, but not mine, as I want to create documents with WWW links, and
      as I want to re-create documents each time I change something-- both of
      these features don't go well together with a format suited to printout,
      like TeX is. (Of course you can create HTML from LaTeX, but if my output
      format is HTML, I want to be able to write HTML directly.)
    * Python. I originally created Ly for a Python-based project, and
      therefore I wanted it to run everywhere where Python runs. WEB and noweb
      are mostly targeted at UNIX-like platforms.

  -- Ly_ introduction

However, Ly introduces its own language with

* a syntax for code chunks,
* some special ways for entering HTML tags ("currently undocumented").

whereas PyLit relies on the established `reStructuredText`_ for
documentation markup and linking.

.. _Ly: http://lyterate.sourceforge.net/intro.html
.. _reStructuredText: https://docutils.sourceforge.io/rst.html


XMLTangle
---------

XMLTangle_ is an generic tangle  program written in Python. It differs from
PyLit mainly by its choice of XML as documenting language (making the source
document harder to read for a human being).

.. _XMLTangle: http://literatexml.sourceforge.net/


pyreport
--------

Pyreport_ is quite similar to PyLit in its focus on Python and the use of
reStructuredText. It is a combination of `documentation generator`__
(processing embedded documentation) and report tool (processing Python
output).

__ http://en.wikipedia.org/wiki/Documentation_generator

  pyreport is a program that runs a python script and captures its output,
  compiling it to a pretty report in a PDF or an HTML file. It can display
  the output embedded in the code that produced it and can process special
  comments (literate comments) according to markup languages (rst or LaTeX)
  to compile a very readable document.

  This allows for extensive literate programming in python, for generating
  reports out of calculations written in python, and for making nice
  tutorials.

However, it is not a tool to "write documents that contain programs".

.. _pyreport:
    http://gael-varoquaux.info/programming/
    pyreport-literate-programming-in-python.html


References
==========

.. [contemporary-literate-programming]
   Vreda Pieterse, Derrick G. Kourie, and Andrew Boake:
   `A case for contemporary literate programming`,
   in SAICSIT, Vol. 75, Pages: 2-9,
   Stellenbosch, Western Cape, South Africa: 2004,
   available at: http://portal.acm.org/citation.cfm?id=1035054
