Metadata-Version: 2.1
Name: pyinaturalist-convert
Version: 0.4.0
Summary: Convert iNaturalist observation data to and from multiple formats
Home-page: https://github.com/pyinat/pyinaturalist-convert
License: MIT
Keywords: inaturalist,biodiversity,export,convert,csv,darwin-core,dataframe,gpx
Author: Jordan Cook
Requires-Python: >=3.8,<4.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Typing :: Typed
Provides-Extra: all
Provides-Extra: db
Provides-Extra: docs
Provides-Extra: dwc
Provides-Extra: feather
Provides-Extra: geojson
Provides-Extra: gpx
Provides-Extra: hdf
Provides-Extra: html
Provides-Extra: odp
Provides-Extra: parquet
Provides-Extra: xlsx
Requires-Dist: boto3 (>=1.20); extra == "odp" or extra == "all"
Requires-Dist: flatten-dict (>=0.4.0,<0.5.0)
Requires-Dist: furo (>=2022.2.14.1,<2023.0.0.0); extra == "docs"
Requires-Dist: geojson (>=2.5); extra == "geojson" or extra == "all"
Requires-Dist: gpxpy (>=1.4.2,<2.0.0); extra == "gpx" or extra == "all"
Requires-Dist: myst-parser (>=0.17.0,<0.18.0); extra == "docs"
Requires-Dist: openpyxl (>=2.6); extra == "xlsx" or extra == "all"
Requires-Dist: pandas (>=1.2); extra == "feather" or extra == "hdf" or extra == "parquet" or extra == "xlsx" or extra == "all"
Requires-Dist: pyarrow (>=4.0); extra == "feather" or extra == "parquet" or extra == "all"
Requires-Dist: pyinaturalist (>=0.17.2)
Requires-Dist: sphinx (>=4.2.0,<5.0.0); extra == "docs"
Requires-Dist: sphinx-autodoc-typehints (>=1.17,<2.0); extra == "docs"
Requires-Dist: sphinx-automodapi (>=0.14,<0.15); extra == "docs"
Requires-Dist: sphinx-copybutton (>=0.5); extra == "docs"
Requires-Dist: sphinx-inline-tabs (>=2022.1.2b11,<2023.0.0); extra == "docs"
Requires-Dist: sphinx-panels (>=0.6.0,<0.7.0); extra == "docs"
Requires-Dist: sqlalchemy (>=1.4.36,<2.0.0); extra == "db" or extra == "all"
Requires-Dist: tables (>=3.6); extra == "hdf" or extra == "all"
Requires-Dist: tablib (>=3.0,<4.0)
Requires-Dist: xmltodict (>=0.12); extra == "dwc" or extra == "all"
Project-URL: Documentation, https://pyinaturalist-convert.readthedocs.io
Project-URL: Repository, https://github.com/pyinat/pyinaturalist-convert
Description-Content-Type: text/markdown

# pyinaturalist-convert
[![Build status](https://github.com/pyinat/pyinaturalist-convert/workflows/Build/badge.svg)](https://github.com/pyinat/pyinaturalist-convert/actions)
[![codecov](https://codecov.io/gh/pyinat/pyinaturalist-convert/branch/main/graph/badge.svg?token=Mt3V5H409C)](https://codecov.io/gh/pyinat/pyinaturalist-convert)
[![Docs](https://img.shields.io/readthedocs/pyinaturalist-convert/stable)](https://pyinaturalist-convert.readthedocs.io)
[![PyPI](https://img.shields.io/pypi/v/pyinaturalist-convert?color=blue)](https://pypi.org/project/pyinaturalist-convert)
[![Conda](https://img.shields.io/conda/vn/conda-forge/pyinaturalist-convert?color=blue)](https://anaconda.org/conda-forge/pyinaturalist-convert)
[![PyPI - Python Versions](https://img.shields.io/pypi/pyversions/pyinaturalist-convert)](https://pypi.org/project/pyinaturalist-convert)

This package provides tools to convert iNaturalist observation data to and from a wide variety of
useful formats. It is mainly intended for use with the iNaturalist API
via [pyinaturalist](https://github.com/niconoe/pyinaturalist), but also works with other data sources.

Complete project documentation can be found at [pyinaturalist-convert.readthedocs.io](https://pyinaturalist-convert.readthedocs.io).

# Formats
## Import
* CSV (from either [API results](https://www.inaturalist.org/pages/api+reference#get-observations)
  or the [iNaturalist export tool](https://www.inaturalist.org/observations/export))
* JSON (from API results)
* [`pyinaturalist.Observation`](https://pyinaturalist.readthedocs.io/en/stable/modules/pyinaturalist.models.Observation.html) objects
* Dataframes, Feather, Parquet, and anything else supported by [pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html)
* [iNaturalist GBIF Archive](https://www.inaturalist.org/pages/developers)
* [iNaturalist Taxonomy Archive](https://www.inaturalist.org/pages/developers)
* [iNaturalist Open Data on Amazon](https://github.com/inaturalist/inaturalist-open-data)
* Note: see [API Recommended Practices](https://www.inaturalist.org/pages/api+recommended+practices)
  for details on which data sources are best suited to different use cases

## Export
* CSV, Excel, and anything else supported by [tablib](https://tablib.readthedocs.io/en/stable/formats/)
* Dataframes, Feather, Parquet, and anything else supported by [pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html)
* Darwin Core
* GeoJSON
* GPX
* SQLite
* SQLite + FTS5 text search for taxonomy

# Installation
Install with pip:
```bash
pip install pyinaturalist-convert
```

Or with conda:
```bash
conda install -c conda-forge pyinaturalist-convert
```

To keep things modular, many format-specific dependencies are not installed by default, so you may
need to install additional packages depending on which features you want. Each module's docs list
any extra dependencies needed, and a full list can be found in
[pyproject.toml](https://github.com/pyinat/pyinaturalist-convert/blob/main/pyproject.toml#L27).

To get started, we recommend installing all optional dependencies:
```bash
# Quotes keep shells such as zsh from interpreting the square brackets
pip install 'pyinaturalist-convert[all]'
```
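
If you install only some extras, it can help to check which optional modules are importable before calling a format-specific function. The following is a minimal sketch using only the standard library. The `EXTRA_MODULES` mapping and the `missing_modules` helper are hypothetical and not part of pyinaturalist-convert. The module names are taken from the package's declared optional dependencies.

```python
from importlib.util import find_spec

# Importable module names for each extra, based on the package metadata.
# This mapping and the helper below are illustrative, not a library API.
EXTRA_MODULES = {
    'db': ['sqlalchemy'],
    'dwc': ['xmltodict'],
    'feather': ['pandas', 'pyarrow'],
    'geojson': ['geojson'],
    'gpx': ['gpxpy'],
    'hdf': ['pandas', 'tables'],
    'odp': ['boto3'],
    'parquet': ['pandas', 'pyarrow'],
    'xlsx': ['pandas', 'openpyxl'],
}


def missing_modules(extra: str) -> list:
    """Return the modules for an extra that cannot be found in this environment."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


# Print one status line per extra
for extra in sorted(EXTRA_MODULES):
    missing = missing_modules(extra)
    status = 'ok' if not missing else 'missing: ' + ', '.join(missing)
    print(f'{extra:<8} {status}')
```

An extra reported as missing can then be installed on its own, for example with `pip install 'pyinaturalist-convert[gpx]'`.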

# Usage

## Export
Get your own observations and save them to CSV:
```python
from pyinaturalist import get_observations
from pyinaturalist_convert import *

observations = get_observations(user_id='my_username')
to_csv(observations, 'my_observations.csv')
```

Or any other supported format:
```python
to_dwc(observations, 'my_observations.dwc')
to_excel(observations, 'my_observations.xlsx')
to_feather(observations, 'my_observations.feather')
to_geojson(observations, 'my_observations.geojson')
to_gpx(observations, 'my_observations.gpx')
to_hdf(observations, 'my_observations.hdf')
to_parquet(observations, 'my_observations.parquet')
df = to_dataframe(observations)
```

## Import
Load your observations from the iNaturalist export tool, convert them to be consistent with
API results, and save them to Parquet:
```python
df = load_csv_exports('my_observations.csv')
df.to_parquet('my_observations.parquet')
```

## Download
Download the complete research-grade observations dataset:
```python
download_dwca_observations()
```

And load it into a SQLite database:
```python
load_dwca_observations()
```

And do the same with the complete taxonomy dataset:
```python
download_dwca_taxa()
load_dwca_taxa()
```
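
Once the data is loaded, the resulting SQLite file can be inspected with Python's built-in `sqlite3` module. The sketch below lists every table in a database file along with its row count. It makes no assumptions about table names, and `summarize_tables` is a hypothetical helper rather than part of this package.

```python
import sqlite3
from contextlib import closing


def summarize_tables(db_path: str) -> dict:
    """Return a {table_name: row_count} mapping for every table in a SQLite file."""
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Names are read from sqlite_master, so they are quoted as identifiers
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

Call it with the path to your database file, for example `summarize_tables('observations.db')`, where the filename is a placeholder. Note that `sqlite3.connect` creates an empty database if the path does not exist, in which case the result is an empty dict.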

Load taxonomy data into a full text search database:
```python
load_taxon_fts_table(languages=['english', 'german'])
```

And get lightning-fast autocomplete results from it:
```python
ta = TaxonAutocompleter()
ta.search('aves')
ta.search('flughund', language='german')
```
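
For readers curious about how this kind of search can work, here is a minimal standalone sketch of prefix matching with SQLite's FTS5 extension. It is not the implementation used by `TaxonAutocompleter`. The table name, column names, sample rows, and the `autocomplete` function are all made up for illustration, and it assumes your Python build of SQLite includes FTS5.

```python
import sqlite3

# In-memory database with a tiny hand-written sample. Real data would come
# from the taxonomy dataset; this schema is illustrative only.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE VIRTUAL TABLE taxon_names USING fts5(name, language UNINDEXED)')
conn.executemany(
    'INSERT INTO taxon_names (name, language) VALUES (?, ?)',
    [
        ('Aves', 'latin'),
        ('Birds', 'english'),
        ('Flughunde', 'german'),
        ('Flying foxes', 'english'),
    ],
)


def autocomplete(prefix: str, limit: int = 10) -> list:
    """Return names containing a word that starts with ``prefix``, best match first."""
    # Double quotes make FTS5 treat the text as a literal string,
    # and the trailing * turns it into a prefix query
    query = '"' + prefix.replace('"', '""') + '"*'
    rows = conn.execute(
        'SELECT name FROM taxon_names WHERE taxon_names MATCH ? ORDER BY rank LIMIT ?',
        (query, limit),
    )
    return [row[0] for row in rows]


print(autocomplete('flug'))
print(autocomplete('fly'))
```

Because the prefix lookup uses the full text index rather than scanning every row, queries like this stay fast even on large tables.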

