Metadata-Version: 2.1
Name: aiutil
Version: 0.77.1
Summary: A utils Python package for data scientists.
Home-page: https://github.com/legendu-net/aiutil
License: MIT
Keywords: AI,Machine Learning,tools,utils
Author: Benjamin Du
Author-email: longendu@yahoo.com
Requires-Python: >=3.10,<3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Provides-Extra: admin
Provides-Extra: all
Provides-Extra: cv
Provides-Extra: docker
Provides-Extra: jupyter
Provides-Extra: pdf
Requires-Dist: GitPython (>=3.0.0)
Requires-Dist: PyPDF2 (>=1.26.0) ; extra == "pdf" or extra == "all"
Requires-Dist: PyYAML (>=5.3.1)
Requires-Dist: black (>=22.12.0,<23.0.0) ; extra == "jupyter" or extra == "all"
Requires-Dist: dateparser (>=0.7.1)
Requires-Dist: docker (>=4.4.0) ; extra == "docker" or extra == "all"
Requires-Dist: dulwich (>=0.20.24)
Requires-Dist: loguru (>=0.3.2)
Requires-Dist: nbconvert (>=5.6.1) ; extra == "jupyter" or extra == "all"
Requires-Dist: nbformat (>=5.0.7) ; extra == "jupyter" or extra == "all"
Requires-Dist: networkx (>=2.5) ; extra == "docker" or extra == "all"
Requires-Dist: notifiers (>=1.2.1)
Requires-Dist: numba (>=0.53.0rc1.post1)
Requires-Dist: opencv-python (>=4.0.0.0) ; extra == "cv" or extra == "all"
Requires-Dist: pandas (>=1.2.0)
Requires-Dist: pandas-profiling (>=2.9.0)
Requires-Dist: pathspec (>=0.8.1)
Requires-Dist: pillow (>=7.0.0) ; extra == "cv" or extra == "all"
Requires-Dist: psutil (>=5.7.3) ; extra == "admin" or extra == "all"
Requires-Dist: pytest (>=3.0)
Requires-Dist: python-magic (>=0.4.0)
Requires-Dist: requests (>=2.20.0) ; extra == "docker" or extra == "all"
Requires-Dist: scikit-image (>=0.18.3)
Requires-Dist: sqlparse (>=0.4.1)
Requires-Dist: toml (>=0.10.0)
Requires-Dist: tqdm (>=4.59.0)
Project-URL: Repository, https://github.com/legendu-net/aiutil
Description-Content-Type: text/markdown

# [aiutil](https://github.com/legendu-net/aiutil): Data Science Utils

This is a Python package that contains miscellaneous utilities for data science.

1. Misc enhancements of Python's built-in functionality.
    - string
    - collections
    - pandas DataFrame
    - datetime
2. Misc other tools
    - `aiutil.filesystem`: misc tools for querying and manipulating filesystems; convenient tools for manipulating text files.
    - `aiutil.url`: URL formatting for HTML, Excel, etc.
    - `aiutil.sql`: SQL formatting
    - `aiutil.cv`: some more tools (in addition to OpenCV) for image processing
    - `aiutil.shell`: parse command-line output to a pandas DataFrame
    - `aiutil.shebang`: auto-correct the shebang of scripts
    - `aiutil.poetry`: tools that make it even easier to manage Python projects using Poetry
    - `aiutil.pdf`: easy and flexible extraction of PDF pages
    - `aiutil.memory`: query and consume memory to a specified range
    - `aiutil.notebook`: Jupyter/Lab notebook related tools
    - `aiutil.dockerhub`: managing Docker images on DockerHub in batch mode using Python
    - `aiutil.hadoop`:
        - A Spark application log analysis tool for identifying root causes of failed Spark applications.
        - Pythonic wrappers for the `hdfs` command.
        - An auto-authentication tool for Kerberos.
        - An improved version of `spark_submit`.
        - Other misc PySpark functions.

## Supported Operating Systems and Python Versions

Python 3.10 (the package metadata requires `>=3.10,<3.11`) on Linux, macOS and Windows.

## Installation

```bash
pip3 install --user -U aiutil
```
Use the following command to install aiutil with all optional components.
The available extras are `cv`, `docker`, `pdf`, `jupyter`, `admin` and `all` (which installs every one of them).
```bash
pip3 install --user -U "aiutil[all]"
```
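
After installing, it can be useful to confirm which version is present and which dependencies a given extra pulls in. The following is a minimal sketch that uses only the standard library's `importlib.metadata`; the helper names `installed_version` and `extra_requirements` are hypothetical and are not part of aiutil. It assumes the extras appear in the distribution metadata as markers of the form `extra == "pdf"`.

```python
from importlib import metadata
from typing import List, Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def extra_requirements(dist_name: str, extra: str) -> List[str]:
    """List the requirement strings whose marker mentions the given extra.

    Hypothetical helper: this is a plain substring match on the marker text,
    not a full PEP 508 marker evaluation.
    """
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    marker = f'extra == "{extra}"'
    # Normalize quote style, since build tools may emit single or double quotes.
    return [req for req in requirements if marker in req.replace("'", '"')]


if __name__ == "__main__":
    print("aiutil version:", installed_version("aiutil"))
    for requirement in extra_requirements("aiutil", "pdf"):
        print("pdf extra requires:", requirement)
```

Both helpers return an empty result (`None` or `[]`) instead of raising when the distribution is not installed, so the script can double as a quick installation check.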

