Metadata-Version: 2.1
Name: lexpr
Version: 0.1
Summary: A parser for simple logical expressions of identifiers
Home-page: https://github.com/ggonnella/lexpr
Author: Giorgio Gonnella
Author-email: gonnella@zbh.uni-hamburg.de
License: ISC
Keywords: bioinformatics sequence features
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: ISC License (ISCL)
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Software Development :: Libraries
Description-Content-Type: text/markdown
License-File: LICENSE.txt
License-File: AUTHORS.txt

# Lexpr: A simple logical expressions parser

Lexpr is a small Python package that provides a parser for
logical expressions, implemented using a Lark grammar.

The expressions may contain:
- entity identifiers
- the binary operators ``|`` (or) and ``&`` (and)
- the unary operator ``!`` (not)
- balanced pairs of round parentheses
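
To make the syntax above concrete, the following is a minimal pure-Python
sketch of a recursive-descent parser for expressions of this shape. It is
only an illustration and not the Lark grammar shipped with the package.
Three things are assumptions made for the sketch: the identifier character
set (letters, digits and underscores), the operator precedence (``!`` binds
tighter than ``&``, which binds tighter than ``|``), and the nested-tuple
output format. The names ``tokenize`` and ``parse`` are hypothetical.

```python
import re

# Assumption: identifiers consist of letters, digits and underscores.
# The actual lexpr grammar may accept a different character set.
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z0-9_]+)|(?P<op>[|&!()]))")
OPERATORS = {"|", "&", "!", ")"}


def tokenize(text):
    """Split text into identifier and operator tokens; reject anything else."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"invalid character near position {pos}")
        tokens.append(match.group("id") or match.group("op"))
        pos = match.end()
    return tokens


def parse(text):
    """Return a nested-tuple tree such as ('or', left, right)."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def or_expr():  # lowest precedence (assumed)
        node = and_expr()
        while peek() == "|":
            advance()
            node = ("or", node, and_expr())
        return node

    def and_expr():
        node = not_expr()
        while peek() == "&":
            advance()
            node = ("and", node, not_expr())
        return node

    def not_expr():  # highest precedence (assumed)
        if peek() == "!":
            advance()
            return ("not", not_expr())
        return atom()

    def atom():
        token = peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        advance()
        if token == "(":
            node = or_expr()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return node
        if token in OPERATORS:
            raise ValueError(f"unexpected token {token!r}")
        return token  # a plain identifier

    tree = or_expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree


print(parse("(G1 & G2) | !G3"))
# ('or', ('and', 'G1', 'G2'), ('not', 'G3'))
```

In this sketch, incomplete input such as ``G1 &`` and input with characters
outside the assumed identifier set, such as ``G1 & G$``, both raise
``ValueError``.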

## Installation

The package can be installed using ``pip install lexpr``.

## Usage

To parse an expression, create a ``Parser`` instance and pass the text
to its ``parse`` method, as in the following example:
```
import lexpr
lp = lexpr.Parser()
lp.parse("(G1 & G2) | !G3")
#
# output:
#
#  Tree(
#    Token('RULE', 'start'),
#    [Tree(Token('RULE', 'entity'),
#      [Tree(Token('RULE', 'or_expr'),
#        [Tree(Token('RULE', 'entity'),
#          [Tree(Token('RULE', 'enclosed_expr'),
#            [Tree(Token('RULE', 'entity'),
#              [Tree(Token('RULE', 'and_expr'),
#                [Tree(Token('RULE', 'entity'), [Token('IDENTIFIER', 'G1')]),
#                 Tree(Token('RULE', 'entity'), [Token('IDENTIFIER', 'G2')])])]
#            )]
#          )]
#        ), Tree(Token('RULE', 'entity'),
#             [Tree(Token('RULE', 'not_expr'),
#               [Tree(Token('RULE', 'entity'), [Token('IDENTIFIER', 'G3')])]
#             )]
#           )]
#      )]
#    )]
#  )

```

If an invalid string is passed to the parser, an
exception is raised:
```
import lexpr
lp = lexpr.Parser()
lp.parse("G1 &")
# raises LexprParserError, unbalanced expression
lp.parse("G1 & G$")
# raises LexprParserError, invalid character in identifier
```

## Implementation

The grammar is contained in the file ``lexpr/data/lexpr.g``.
The parser is in the module ``lexpr/parser.py``.
Errors raised by the module are defined in ``lexpr/error.py``
and are instances of the class ``LexprError`` or its
subclasses.

## History

The package has been developed to support the parsing of the EGC format, for
expressing expectations about the contents of prokaryotic genomes. In this
format, groups of organisms can be combined using logical expressions of the
form parsed by this package. The main implementation of the format is based on
TextFormats, which, however, does not support non-regular, indefinitely nested
expressions, such as the logical expressions parsed here. Thus the parsing of
these expressions has been developed separately in this package.

## Acknowledgements

This package has been created in the context of the DFG project GO 3192/1-1
“Automated characterization of microbial genomes and metagenomes by collection
and verification of association rules”. The funders had no role in study
design, data collection and analysis.



