Metadata-Version: 2.1
Name: ingesture
Version: 0.1.0
Summary: Ingest gesturally-structured data into models with multiple export
Home-page: https://github.com/auto-pi-lot/ingest
License: MPL-2.0
Author: sneakers-the-rat
Author-email: JLSaunders987@gmail.com
Requires-Python: >=3.9,<3.11
Classifier: License :: OSI Approved
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.9
Provides-Extra: docs
Provides-Extra: nwb
Requires-Dist: PyYAML (>=6.0,<7.0)
Requires-Dist: Sphinx (>=4.4.0,<5.0.0); extra == "docs"
Requires-Dist: furo (>=2022.3.4,<2023.0.0); extra == "docs"
Requires-Dist: pandas (>=1.1,<2.0)
Requires-Dist: parse (>=1.19.0,<2.0.0)
Requires-Dist: pydantic (>=1.9.0,<2.0.0)
Requires-Dist: pynwb (>=2.5.1,<3.0.0); extra == "nwb"
Requires-Dist: scipy (>=1.8.0,<2.0.0)
Project-URL: Repository, https://github.com/auto-pi-lot/ingest
Description-Content-Type: text/markdown

# ingesture
Ingest gesturally-structured data into models with multiple export

This package is **not** even close to usable, and is just a sketch at the moment.
If for some reason you see it and would like to work on it with me, feel free to
open an issue :)


# Declare your data

Even the most disorganized data system has *some* structure. We want to be able
to recover it without demanding that the entire acquisition process be reworked.

To do that, we can use a family of specifiers to tell `ingesture` where to get metadata:

```python
from datetime import datetime
from ingesture import Schema, spec
from pydantic import Field

class MyData(Schema):
    # parse metadata in a filename
    subject_id: str = Field(..., 
        description="The ID of a subject of course!",
        spec = spec.Path('electrophysiology_{subject_id}_*.csv')
    )
    # parse multiple values at once
    date: datetime
    experimenter: str
    date, experimenter = Field(...,
        spec = spec.Path('{date}_{experimenter}_optodata.h5')
    )
    
    
    # from inside a .mat file
    other_meta: int = Field(...,
        spec = spec.Mat(
            path='**/notebook.mat', # 2 **s mean we can glob recursively
            field = ('nb', 1, 'user') # index recursively through the .mat
        )
    )
    # and so on
```
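To make the idea of the specifiers concrete, here is a minimal sketch of the two lookups the example relies on: pulling named fields out of a filename template, and indexing recursively into a nested structure. It uses only the standard library and is **not** how `ingesture` is implemented. `template_to_regex`, `parse_filename`, and `index_recursively` are hypothetical helpers, the filenames are made up, and the nested dict only stands in for a loaded `.mat` file (real `.mat` data is indexed differently).

```python
import re
from functools import reduce
from typing import Any, Dict, Optional, Sequence

# Matches a `{name}` field or a `*` wildcard inside a template.
_TOKEN = re.compile(r"(\{\w+\}|\*)")


def template_to_regex(template: str) -> re.Pattern:
    """Compile a template like 'ephys_{subject_id}_*.csv' into a regex."""
    parts = []
    # Splitting on a capturing group keeps the fields and wildcards in the result.
    for token in _TOKEN.split(template):
        if token == "*":
            parts.append(".*")  # wildcard: match anything, capture nothing
        elif _TOKEN.fullmatch(token):
            parts.append(f"(?P<{token[1:-1]}>.+?)")  # named, non-greedy field
        else:
            parts.append(re.escape(token))  # literal text
    return re.compile("".join(parts))


def parse_filename(template: str, filename: str) -> Optional[Dict[str, str]]:
    """Return the named fields found in `filename`, or None if it doesn't match."""
    match = template_to_regex(template).fullmatch(filename)
    return match.groupdict() if match else None


def index_recursively(obj: Any, keys: Sequence[Any]) -> Any:
    """Apply each key or index in turn: obj[keys[0]][keys[1]]..."""
    return reduce(lambda current, key: current[key], keys, obj)


single = parse_filename(
    "electrophysiology_{subject_id}_*.csv",
    "electrophysiology_mouse01_session3.csv",
)
print(single)  # {'subject_id': 'mouse01'}

multiple = parse_filename(
    "{date}_{experimenter}_optodata.h5",
    "2022-03-04_jonny_optodata.h5",
)
print(multiple)  # {'date': '2022-03-04', 'experimenter': 'jonny'}

# Stand-in for a loaded notebook.mat; plain Python indexing is 0-based.
notebook = {"nb": [{"user": 3}, {"user": 7}]}
print(index_recursively(notebook, ("nb", 1, "user")))  # 7
```

Every parsed value comes back as a string. Converting `date` to a `datetime` would be the model's job.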

Then, parse your schema from a folder:

```python
data = MyData.make('/home/lab/my_data')
```

Or a bunch of them!

```python
data = MyData.make('/home/lab/my_datas/*')
```
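One way to read the glob form is "expand the pattern, then make one model per matching directory". The sketch below covers only the expansion step, using the standard library. `expand_data_dirs` is a hypothetical helper and the temporary folder layout is invented for the demo; how `make` actually handles a glob is not settled yet.

```python
import tempfile
from glob import glob
from pathlib import Path
from typing import List


def expand_data_dirs(pattern: str) -> List[Path]:
    """Return every directory matching a glob pattern, in sorted order."""
    return sorted(Path(match) for match in glob(pattern) if Path(match).is_dir())


# Build a throwaway layout: two dataset folders and one stray file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "session_a").mkdir()
    (root / "session_b").mkdir()
    (root / "notes.txt").write_text("not a dataset")

    found_names = [path.name for path in expand_data_dirs(str(root / "*"))]

print(found_names)  # ['session_a', 'session_b']
```

Each of the returned directories would then be passed to the single-folder form on its own.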

## Multiple Strategies

`todo`

## Hierarchical Modeling

Our data is rarely a single type. Often there is a repeatable substructure that
is paired with different macro-structures: e.g. you have open-ephys data in a directory
with behavioral data in one experiment, and paired with optical data in another.

Make submodels and recombine them freely...

`todo`


# Export Data

Once we have data in an abstract model, we want to be able to export it to
multiple formats! To do that we need an interface that describes
the basic methods of interacting with that format (e.g. .csv files are
written differently than hdf5 files) and a mapping from our model fields
to locations, attributes, and names in the target format.
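As a rough sketch of that split (a per-format interface plus a field mapping), the example below defines a small abstract exporter and two concrete formats using only the standard library. None of these names exist in `ingesture`: `Exporter`, `JSONExporter`, `CSVExporter`, and the `dumps` method are hypothetical, and the record is a plain dict standing in for a model.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from typing import Any, Dict


class Exporter(ABC):
    """Hypothetical interface: one subclass per target format."""

    def __init__(self, mapping: Dict[str, str]):
        # mapping: model field name -> name in the target format
        self.mapping = mapping

    def rename(self, record: Dict[str, Any]) -> Dict[str, Any]:
        """Apply the mapping; unmapped fields keep their own name."""
        return {self.mapping.get(key, key): value for key, value in record.items()}

    @abstractmethod
    def dumps(self, record: Dict[str, Any]) -> str:
        """Serialize one record in this exporter's format."""


class JSONExporter(Exporter):
    def dumps(self, record: Dict[str, Any]) -> str:
        return json.dumps(self.rename(record), sort_keys=True)


class CSVExporter(Exporter):
    def dumps(self, record: Dict[str, Any]) -> str:
        renamed = self.rename(record)
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=list(renamed), lineterminator="\n")
        writer.writeheader()
        writer.writerow(renamed)
        return buffer.getvalue()


record = {"subject_id": "mouse01", "experimenter": "jonny"}
mapping = {"subject_id": "Subject"}

json_text = JSONExporter(mapping).dumps(record)
csv_text = CSVExporter(mapping).dumps(record)
print(json_text)  # {"Subject": "mouse01", "experimenter": "jonny"}
print(csv_text)
```

The same record and the same mapping go into both exporters. Only the format-specific writing differs, which is the part the interface is meant to isolate.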

## Pydantic base export

### json

## From the Field specification

```python
class MyData(Schema):
    subject_id: str = Field(
        spec = ...,
        nwb_field = "NWBFile:subject_id"
    )
```

## From a `Mapping` object

```python
class NWB_Map(Mapping):
    subject_id = 'NWBFile:subject_id'

class MyData(Schema):
    subject_id: str = Field(...)

    __mapping__ = NWB_Map
```
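Both forms above point a field at a target string like `'NWBFile:subject_id'`. Assuming that string means `container:attribute` (a guess, since the format isn't specified yet), a mapping could be resolved roughly like this. `split_target` and `apply_mapping` are hypothetical helpers, and plain dicts stand in for both the model and the NWB objects.

```python
from typing import Any, Dict, Tuple


def split_target(target: str) -> Tuple[str, str]:
    """Split 'NWBFile:subject_id' into ('NWBFile', 'subject_id')."""
    container, _, attribute = target.partition(":")
    if not container or not attribute:
        raise ValueError(f"expected 'container:attribute', got {target!r}")
    return container, attribute


def apply_mapping(
    values: Dict[str, Any], mapping: Dict[str, str]
) -> Dict[str, Dict[str, Any]]:
    """Group model values by target container, keyed by target attribute."""
    grouped: Dict[str, Dict[str, Any]] = {}
    for field, target in mapping.items():
        container, attribute = split_target(target)
        grouped.setdefault(container, {})[attribute] = values[field]
    return grouped


values = {"subject_id": "mouse01"}
nwb_map = {"subject_id": "NWBFile:subject_id"}

resolved = apply_mapping(values, nwb_map)
print(resolved)  # {'NWBFile': {'subject_id': 'mouse01'}}
```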
    
