Metadata-Version: 2.1
Name: tdb-py
Version: 0.9.3
Summary: A pure Python library supporting Tdb “Text DataBase” format, a plain text human readable typed database storage format superior to CSV.
Home-page: https://github.com/mark-summerfield/tdb-py
Author: Mark Summerfield
Author-email: mark@qtrac.eu
License: GPLv3
Description: # Tdb Overview
        
        Tdb “Text DataBase” format is a plain text human readable typed database
        storage format.
        
        Tdb is an ideal alternative to CSV. A Tdb file can store any number of
        tables. Every table is named, and every field has a name and a type. Types
        are not-null by default, but can be nullable if required. The seven
        supported types include strings which respect all whitespace (including
        newlines), and which may contain any UTF-8 characters (using XML-escaping
        conventions), binary (e.g., for images), Booleans, numbers (integer and
        real), and dates and datetimes.
        
        Tdb libraries are available in Go and Python with a Rust library _in
        development_. The Tdb format is designed to be very easy to parse, so
        creating a Tdb library in virtually any language should be straightforward.
        
        - [Datatypes](#datatypes)
        - [Examples](#examples)
            - [CSV](#csv)
            - [Database](#database)
            - [Minimal Tdb Files](#minimal-tdb-files)
        - [Timezones and Metadata](#timezones-and-metadata)
        - [Libraries](#libraries) (Go, Python, Rust)
        - [BNF](#bnf)
        - [Supplementary](#supplementary)
            - [Vim Support](#vim-support)
            - [Tdb Logo](#tdb-logo)
        
        ## Datatypes
        
        Tdb supports the following seven built-in datatypes.
        
        |**Type**   |**Example(s)**        |**Notes**|
        |-----------|----------------------|---------|
        |`bool`     |`F`|A Tdb reader should also accept 'f', 'N', 'n', 't', 'Y', 'y', '0', '1'|
        |`bytes`    |`(20AC 65 66 48)`|There must be an even number of case-insensitive hex digits; whitespace (spaces, newlines, etc.) optional.|
        |`date`     |`2022-04-01`|Basic ISO8601 YYYY-MM-DD format.|
        |`datetime` |`2022-04-01T16:11:51`|ISO8601 YYYY-MM-DDTHH[:MM[:SS]] format; 1-sec resolution no timezone support.|
        |`int`      |`-192` `234` `7891409`|Standard integers.|
        |`real`     |`0.15` `0.7e-9` `2245.389`|Standard and scientific notation.|
        |`str`      |`<Some text which may include newlines>`|For &, <, >, use \&amp;, \&lt;, \&gt; respectively.|
        
        All fields are _not null_ by default and must contain a valid value of the
        field's type. To make a field _nullable_, append `?` to its typename, e.g.,
        `int?`.
        
        Strings may not include `&`, `<` or `>`, so if they are needed, they must be
        replaced by the XML/HTML escapes `&amp;`, `&lt;`, and `&gt;` respectively.
        Strings respect any whitespace they contain, including newlines.
        
        Each field value is separated from its neighbor by whitespace, and
        conventionally records are separated by newlines. However, in practice,
        since every field in every record must be present (even if only a null value
        or an empty bytes or string), records may be laid out however you like.
        
        Where whitespace is allowed (or required) it may consist of one or more
        spaces, tabs, or newlines in any combination.
        
        ## Examples
        
        ### CSV
        
        Although widely used, the CSV format is not standardized and has a number of
        problems. Tdb is a standardized alternative that can distinguish fieldnames
        from data records, can handle multiline text (including text with commas and
        quotes) without formality, and can store one—or more—tables in a single Tdb
        file.
        
        Here's a simple CSV file:
        
            Date,Price,Quantity,ID,Description
            "2022-09-21",3.99,2,"CH1-A2","Chisels (pair), 1in & 1¼in"
            "2022-10-02",4.49,1,"HV2-K9","Hammer, 2lb"
            "2022-10-02",5.89,1,"SX4-D1","Eversure Sealant, 13-floz"
        
        Here's a Tdb equivalent:
        
            [PriceList Date date Price real Quantity int ID str Description str
            %
            2022-09-21 3.99 2 <CH1-A2> <Chisels (pair), 1in &amp; 1¼in> 
            2022-10-02 4.49 1 <HV2-K9> <Hammer, 2lb> 
            2022-10-02 5.89 1 <SX4-D1> <Eversure Sealant, 13-floz> 
            ]
        
        Every table starts with a tablename followed by one or more fields. Each
        field consists of a fieldname and a type.
        
        Superficially this may not seem much of an improvement on CSV (apart from
        Tbd's superior string handling and strong typing), but as the next example
        shows, a Tdb file can contain one _or more_ tables, not just one like CSV.
        
        ### Database
        
        Database files aren't normally human readable and usually require
        specialized tools to read and modify their contents. Yet many databases are
        relatively small (both in size and number of tables), and would be more
        convenient to work with if human readable. For these, Tdb format provides a
        viable alternative. For example:
        
            [Customers CID int Company str Address str? Contact str Email str
            %
            50 <Best People> <123 Somewhere> <John Doe> <j@doe.com> 
            19 <Supersuppliers> ? <Jane Doe> <jane@super.com> 
            ]
            [Invoices INUM int CID int Raised_Date date Due_Date date Paid bool Description str?
            %
            152 50 2022-01-17 2022-02-17 no <COD> 
            153 19 2022-01-19 2022-02-19 yes ?
            ]
            [Items IID int INUM int Delivery_Date date Unit_Price real Quantity int Description str
            %
            1839 152 2022-01-16 29.99 2 <Bales of hay> 
            1840 152 2022-01-16 5.98 3 <Straps> 
            1620 153 2022-01-19 11.5 1 <Washers (1-in)> 
            ]
        
        In the Customers table the second customer's Address and in the Invoices
        table, the second invoice's Description both have nulls as their values. (No
        other fields may have nulls only these fields are nullable).
        
        ### Minimal Tdb Files
        
        	[T f int
        	%
        	]
        
        This file has a single table called `T` which has a single field called `f`
        of type `int`, and no records.
        
        	[T f int
        	%
        	0
        	]
        
        This is like the previous table but now with one record containing the value
        `0`.
        
        	[T f int?
        	%
        	0
        	?
        	]
        
        Again like the previous table, but now with two records, the first
        containing the value `0`, and the second containing null which is permitted
        since the field's type is nullable.
        
        ### Timezones and Metadata
        
        Tdb does not have direct timezone support. There are three simple solutions
        for this.
        
        If all the dates in the database are in the same timezone, then one approach
        is to store all the dates as UTC. Alternatively, add a tiny configuration
        table with the timezone data, for example:
        
            [Config key str value str?
            %
            <timezone> <+02:30>
            ]
        
        If, however, the dates being stored have varying timezones, then add another
        column specifically for the timezone. Something along these lines:
        
            [Readings meter str reading real when date timezone str
            %
            <EX194B4> 1932.49 2024-11-17 <-03:00>
            <V1938DX> 8492.1 2024-10-30 <+02:30>
            ]
        
        If comments or metadata are required, simply create an additional table to
        store this data and add it to the Tdb. For example, use a Config table as
        shown above.
        
        ## Libraries
        
        |**Library**|**Language**|**Homepage**                 |
        |-----------|------------|-----------------------------|
        |tdb-go|Go|https://pkg.go.dev/github.com/mark-summerfield/tdb-go|
        |tdb-py|Python|https://pypi.org/project/tdb-py|
        |tdb-rs|Rust|https://crates.io/crates/tdb-rs _(in development)_|
        
        We will happily add links to implementations in other languages.
        
        ## BNF
        
        A Tdb file consists of one or more tables.
        
            TDB         ::= TABLE+
            TABLE       ::= OWS '[' OWS TABLEDEF OWS '%' OWS RECORD* OWS ']' OWS
            TABLEDEF    ::= IDENFIFIER (RWS FIELDDEF)+ # IDENFIFIER is the tablename
            FIELDDEF    ::= IDENFIFIER RWS FIELDTYPE # IDENFIFIER is the fieldname
            FIELDTYPE   ::= ('bool' | 'bytes' | 'date' | 'datetime' | 'int' | 'real' | 'str') NULL?
            RECORD      ::= OWS VALUE (RWS VALUE)*
            VALUE       ::= BOOL | BYTES | DATE | DATETIME | INT | REAL | STR | NULL # NULL is only allowed for nullable field types
            BOOL        ::= /[FfTtYyNn01]/
            BYTES       ::= '(' (OWS [A-Fa-f0-9]{2})* OWS ')'
            DATE        ::= /\d\d\d\d-\d\d-\d\d/  # basic ISO8601 YYYY-MM-DD format
            DATETIME    ::= /\d\d\d\d-\d\d-\d\dT\d\d(\d\d(\d\d)?)?/ 
            INT         ::= /[-+]?\d+/ 
            REAL        ::= ... # standard or scientific notation
            STR         ::= /[<][^<>]*?[>]/ # newlines allowed, and &amp; &lt; &gt; supported i.e., XML
            NULL        ::= '?'
            IDENFIFIER  ::= /[_\p{L}]\w{0,31}/ # Must start with a letter or underscore; may not be a built-in constant
            OWS         ::= /[\s\n]*/
            RWS         ::= /[\s\n]+/ # in some cases RWS is actually optional
        
        _Notes_
        
        - Every field is _not null_ by default and must contain a valid value of the
          field's type. To make a field _nullable_, append `?` to its typename,
          e.g., `str?`; for nullable fields the value must either be one of the
          field's type (e.g., `str`) _or_ null `?`.
        - A Tdb file _must_ contain at least one table even if it is empty, i.e.,
          has no records.
        - A Tdb writer should always write ``bool``s as `F` or `T`; but a Tdb reader
          should accept any of `F`, `f`, `N`, `n`, `0`, for false, and any of `T`,
          `t`, `Y`, `y`, `1`, for true.
        - Within any `.tdb` file each tablename must be unique, and within each
          table each fieldname must be unique.
        - No tablename or fieldname (i.e., no identifier) may be the same as a
          built-in constant or `bool` value:  
          `bool`, `bytes`, `date`, `datetime`, `f`, `F`, `int`, `n`, `N`, `real`, `str`, `t`, `T`, `y`, `Y`
        
        ## Supplementary
        
        ### Vim Support
        
        If you use the vim editor, simple color syntax highlighting is available.
        Copy `tdb.vim` into your `$VIM/syntax/` folder and add this line (or
        similar) to your `.vimrc` or `.gvimrc` file:
        
            au BufRead,BufNewFile,BufEnter *.tdb set ft=tdb|set expandtab|set textwidth=80
        
        ### Tdb Logo
        
        ![tdb logo](tdb.svg)
        
        ---
        
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.8
Description-Content-Type: text/markdown
