Metadata-Version: 2.1
Name: pii-extract-plg-regex
Version: 0.1.1
Summary: Regex modules for the extraction of PII from text chunks
Home-page: https://github.com/piisa/pii-extract-plg-regex
Download-URL: https://github.com/piisa/pii-extract-plg-regex/tarball/v0.1.1
Author: Paulo Villegas
Author-email: paulo.vllgs@gmail.com
License: Apache
Keywords: PIISA, PII
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Development Status :: 4 - Beta
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Provides-Extra: test
License-File: LICENSE

# Pii Extractor plugin: regex

This repository builds a Python package that installs a pii-extract-base
plugin for PII detection in text data, based on regular expressions
(with optional context).

The PII Tasks in the package are structured by language & country, since many
of the PII elements are language- and/or country-dependent.


## Requirements

The package
 * needs at least Python 3.8
 * needs the base packages pii-data and pii-extract-base
 * uses the python-stdnum package to validate numeric identifiers
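
To illustrate why a validation step complements regex matching, here is a
minimal sketch that finds card-like digit runs with a regular expression and
keeps only those passing a Luhn checksum. It is not the package's code: the
pattern and the names `CANDIDATE`, `luhn_valid` and `find_card_numbers` are
hypothetical, and the checksum is written by hand so the example stays
self-contained (the package itself delegates this kind of check to
python-stdnum).

```python
import re

# Loose pattern for 13-19 digit card-like numbers, allowing a space or hyphen
# between digits (illustrative only, not the package's actual pattern).
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one
    for pos, digit in enumerate(reversed(digits)):
        if pos % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list:
    """Return the regex matches in `text` that also pass the checksum."""
    return [m.group() for m in CANDIDATE.finditer(text) if luhn_valid(m.group())]
```

The regex alone would accept any digit run of the right length; the checksum
filters out most random matches, which reduces false positives.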


## Usage

The package has no user-facing entry points; the PIISA framework uses it
automatically.
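
Since there is nothing to call directly, one way to confirm that the plugin
and its base packages are present in the current environment is to query the
installed distribution metadata. The sketch below uses only the standard
library; the helper name `installed_version` is hypothetical, and the check
says nothing about how the framework itself discovers the plugin.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names taken from this README
    for name in ("pii-data", "pii-extract-base", "pii-extract-plg-regex"):
        print(name, installed_version(name) or "not installed")
```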


## Building

The provided Makefile can be used to process the package:
 * `make pkg` will build the Python package, creating a file that can be
   installed with `pip`
 * `make unit` will launch all unit tests (using pytest, so pytest must be
   available)
 * `make install` will install the package in a Python virtualenv. The
   virtualenv is chosen in this order:
     - the one defined in the `VENV` environment variable, if it is defined
     - otherwise, the virtualenv currently activated in the shell, if any
     - otherwise, the default `/opt/venv/bigscience` (it will be created if
       it does not exist)
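
The selection order for `make install` can be sketched in Python as follows.
This is an illustration of the precedence described above, not the Makefile's
actual logic: the function name `choose_venv` is hypothetical, an empty `VENV`
is treated as undefined, and an activated virtualenv is assumed to be detected
through the `VIRTUAL_ENV` variable that standard activation scripts set.

```python
import os
from typing import Mapping

# Default location named in this README
DEFAULT_VENV = "/opt/venv/bigscience"


def choose_venv(env: Mapping[str, str] = os.environ) -> str:
    """Pick the target virtualenv path following the documented precedence."""
    # 1. An explicit VENV variable wins (assumption: empty means undefined)
    if env.get("VENV"):
        return env["VENV"]
    # 2. Otherwise use the virtualenv activated in the shell, if any
    #    (assumption: detected via VIRTUAL_ENV)
    if env.get("VIRTUAL_ENV"):
        return env["VIRTUAL_ENV"]
    # 3. Otherwise fall back to the default location
    return DEFAULT_VENV
```

Passing the environment as a mapping makes the precedence easy to check
without touching the real shell environment, for example
`choose_venv({"VIRTUAL_ENV": "/tmp/myenv"})`.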


## Contributing

To add a new PII processing task, please see the contributing instructions.


