
           Implementation of a Turing Machine as Sed Script

                     Christophe Blaess, 2001.


  This Sed script emulates a Turing machine, reading its instruction
  table on the standard input and writing the content of the tape on
  the standard output.
  

Introduction
------------

A Turing machine is a very simple computation device, introduced by Alan
Turing in 1935.  This  machine  is  made  of a magnetic tape and a read-
write head,  which can move along the tape.  This tape is  divided  into
cells, each of them could be  empty, or could contain a symbol  (letter,
digit, etc.).  The tape is theorically infinite,  but the number of non-
empty cells is finite  (this is important because we  can  always  start
with a blank tape and use a finite number of program steps to initialize
the tape).

The  head  can move to the left or to the right one cell at a time.  The
Turing  machine contains also a cell of memory that can accept a symbol.
This memory represents the state of the machine.

The behavior of the machine  at a given moment  is totally determined by
its state  and the symbole lying in front of the read head.  The machine
is programmable: we insert a list of instructions, i.e. a list of action
to perform  for  each  state  and  each possible symbol.  An instruction
contains a new symbol  to  write on the tape in place of the previous, a
direction of movement for the head, and a new state for the machine.

The machine act as follows:

- The machine reads the content of the cell being in front of the head.
  If there  isn't any instruction corresponding  to this symbol and to
  the current state of the machine, it stops.

- The  machine  writes on  the tape the new symbol as specified in the
  selected instruction, erasing the previous content of the cell. This
  can be the same symbol.

- The machine moves the read-write head one step to the left or to the
  right according to the direction  in the instruction.  In our imple-
  mentation, we add the possibility to keep the head on the same cell.

- The machine loads  its memory with  the  symbol representing the new
  state. If there isn't any instruction for this state of the machine,
  a  final  stable  state  has  been  reached,  and  the machine stops.
  Otherwise the cycle restarts.

The result of the execution of a  Turing  machine is the printing of the
tape.  For  teaching  purposes,  we choose  to  display this tape at the
begining of each cycle.


The instructions are given as five characters strings:

  - The first character is the current state,  i.e. the symbol corres-
    ponding to the state of the machine at the beginning of the cycle.
    In our implementation a symbol can be a letter,  a digit,  or  any
    punctuation mark excepted `|', `#' and `:'.  At the beginning, the
    state of the machine is always `0'.
    
  - The  second  character  is the symbol read on the tape.  The cells
    contain any letter, digit, or punctuation sign except `|', `#' and
    `:'. By default a blank cell appears as ` '.
    
The first two fields select an instruction,  the next ones represent the
action to perform:

  - The third character is the symbol to write on the tape in place of
    the previous one.
    
  - The fourth is  the direction to move the head.  It can be  `L'  or
    `l' to move to the left, `R' or `r' to move to the right,  or  ` '
    to stay at the same place.
    
  - Finally,  the  fifth  character  represents  the  new state of the
    machine to switch to at the end of the cycle.

The instructions  are  given  on the standard input of the machine.  The
machine  reads  them  all and stores them before starting, so we can put
them in a script file.  We added the possibility  to put comments in the
script file, beginning with a `#' and extending to the end of the line.

The user can also give an initial tape, as a string  beginning with `|'.
The head will be  automatically set in front of the  leftmost  cell.  Of
course,  if no tape are given,  the script provides a blank one.  As the
tape is theorically infinite,  we will extend it automatically as needed.


Implementation
--------------

We will use the  Sed  buffer  as  a storage area containing all we need:
tape, machine's state, instructions.  We will use the second buffer only
for  storage  when  displaying  the  tape  at each cycle.  If you prefer
displaying  the  content of the tape only at the end of the program, you
can avoid totally use of the second buffer.

This  script  was  written on the Gnu version of Sed, which one gives no
limit for the size of the buffer,  except  the  total  amount  of memory
available on the system. So we consider the tape as unlimited.

The first character in the Sed  buffer  is  the  current  state  of  the
machine  (initially   `0').  We  also  reserve  a  second  character  as
temporary storage area for the symbole read by the head.

Next comes the tape, as a character string terminated by a colon `:'. To
mark  the  position  of the read head, the current cell is surrounded by
two `|'.

After the  tape are the instructions,  separated by colons.  To find the
instruction corresponding  to the current  state,  we  will search for a
substring beginning after a colon, and whose first two characters (state
and current cell) are the same as the first two of the Sed buffer.

The understanding of the script is not very easy,  but  the figure below
describing  the  Sed  buffer  may  help  to read line by line our Turing
machine emulator.


   current state of the machine
   |  content of the read cell
   |  |
   |  |         Tape              Instruction       Instruction
   v  v <------------------->   <------------->   <------------->
  [0][1][2]...[|][1][|]...[0][:][0][1][1][L][1][:][1][1][0][R][2][:]...
              <------->          ^  ^  ^  ^  ^
           read-write head       |  |  |  |  |
                                 |  |  |  |  next state of the machine   
                                 |  |  |  direction of movement
                                 |  |  new symbol to write
                                 |  symbol read on the tape
                                 current state of the machine
			       
              Fig. 1 - Content of the Sed buffer.


Usage
-----

To use our machine,  we send  the instruction table and the initial tape
on its standard input,  using script files,  with ".tm" (Turing Machine)
extension.  It is not very  easy  to  write  real  programs for a Turing
machine.  But,  for  example, we  can see a script to increment a binary
number.  The  number is written on the tape before starting the program.
Our machine will use four states (`0',  `1',  `2',  and `3') and a final
stable state `F'.


State 0: We look for the number,  moving  the head to the right until
encoutering a `1' or a `0'. When done we switch to state 1.  It takes
three instructions:

   +-----------+-----------+-----------+-----------+-----------+
   |  Current  |  Symbol   | Symbol to |   Move    |   Next    |
   |   state   |   read    |   write   |   Move    |   State   |
   +-----------+-----------+-----------+-----------+-----------+
   |     0     |    ' '    |    ' '    |     R     |     0     |
   |     0     |    '0'    |    '0'    |     R     |     1     |
   |     0     |    '1'    |    '1'    |     R     |     1     |
   +-----------+-----------+-----------+-----------+-----------+

If we find another symbol on the tape, the machine will automatically
stop  (line 54 in the script "turing.sed"), but not on a final stable
state.


State 1:  We have found the number,  we will continue until the first
space.  Then we move the head  backward one step,  to put it in front
of the rightmost binary digit (least significant).

   +-----------+-----------+-----------+-----------+-----------+
   |  Current  |  Symbol   | Symbol to |   Move    |   Next    |
   |   state   |   read    |   write   |   Move    |   State   |
   +-----------+-----------+-----------+-----------+-----------+
   |     1     |    ' '    |    ' '    |     L     |     2     |
   |     1     |    '0'    |    '0'    |     R     |     1     |
   |     1     |    '1'    |    '1'    |     R     |     1     |
   +-----------+-----------+-----------+-----------+-----------+


State 2:  We want to  increment the binary digit.  If it's a `0',  we
overwrite it with a `1' and switch to next state.  If it's a `1',  we
overwrite it with a `0', then move the head to the left (on the upper
digit) and repeat the state 2.  If  we  have  found a space (the most
significant digit was a `1'),  we overwrite it with a  `1' and switch
to the next state.

   +-----------+-----------+-----------+-----------+-----------+
   |  Current  |  Symbol   | Symbol to |   Move    |   Next    |
   |   state   |   read    |   write   |   Move    |   State   |
   +-----------+-----------+-----------+-----------+-----------+
   |     2     |    ' '    |    '1'    |     R     |     3     |
   |     2     |    '0'    |    '1'    |     R     |     3     |
   |     2     |    '1'    |    '0'    |     L     |     2     |
   +-----------+-----------+-----------+-----------+-----------+


State 3:  We move the  head along the tape to the right of the number
for  aesthetic  reason  during  final display of the tape.  Then,  we
switch to the final stable state `F'.

   +-----------+-----------+-----------+-----------+-----------+
   |  Current  |  Symbol   | Symbol to |   Move    |   Next    |
   |   state   |   read    |   write   |   Move    |   State   |
   +-----------+-----------+-----------+-----------+-----------+
   |     3     |    ' '    |    ' '    |     R     |     F     |
   |     3     |    '0'    |    '0'    |     R     |     3     |
   |     3     |    '1'    |    '1'    |     R     |     3     |
   +-----------+-----------+-----------+-----------+-----------+

Let's see the full script,  with  an initial tape containing the decimal
number 151, i.e. binary "10010111".

inc.tm :

   # This script allows a Turing machine to increment a binary number
   
   # The initial tape with the number:
   | 10010111

    # State 0
    0  R0
    011R1
    000R1

    # State 1
    1  L2
    100R1
    111R1

    # State 2
    2 1R3
    201R3
    210L2

    # State 3
    3  RF
    300R3
    311R3 

Here's a  sample run of the script.  We can see the state of the machine
in parenthesis on the left, and the content of the tape with the current
cell surrounded by two `|'.

    $ ./turing.sed inc.tm
    (0) | |10010111
    (0)  |1|0010111
    (1)  1|0|010111
    (1)  10|0|10111
    (1)  100|1|0111
    (1)  1001|0|111
    (1)  10010|1|11
    (1)  100101|1|1
    (1)  1001011|1|
    (1)  10010111| |
    (2)  1001011|1|
    (2)  100101|1|0
    (2)  10010|1|00
    (2)  1001|0|000
    (3)  10011|0|00
    (3)  100110|0|0
    (3)  1001100|0|
    (3)  10011000| |
    (F)  10011000 | |
    Final state F reached... end of processing.
    $ 

binary 10011000 = decimal 152 : it works !

Conclusion
----------

Writing  scripts  for  a  Turing  machine  is  an intellectual challenge
sometimes quite complex.  The  goal  is not to write  real applications.
It  would  be too difficult,  and  the  results  would  be  dramatically 
inefficient.

The  real  interest is the fact, as stated in the  Church-Turing thesis,
that any computable task  (compilation, text processing, calculation...)
can be written on a Turing machine.

While  emulating  a Turing machine in a Sed script,  we proove that this
language is sufficient to perform any computable task !


                                Christophe Blaess <ccb@club-internet.fr>
                                      http://perso.club-internet.fr/ccb/

