Metadata-Version: 2.1
Name: mudatasets
Version: 0.0.2
Summary: Multimodal Datasets in MuData format
Home-page: https://github.com/PMBio/mudatasets
Author: Danila Bredikhin
Author-email: danila.bredikhin@embl.de
Requires-Python: >= 3.7
Description-Content-Type: text/markdown
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Intended Audience :: Science/Research
Requires-Dist: mudata
Requires-Dist: tqdm
Requires-Dist: requests
Requires-Dist: muon
Requires-Dist: sphinx >= 4.0 ; extra == "docs"
Requires-Dist: sphinx-rtd-theme ; extra == "docs"
Requires-Dist: readthedocs-sphinx-search ; extra == "docs"
Requires-Dist: nbsphinx ; extra == "docs"
Requires-Dist: sphinx_automodapi ; extra == "docs"
Requires-Dist: insegel ; extra == "docs"
Requires-Dist: muon ; extra == "muon"
Project-URL: Documentation, https://mudatasets.readthedocs.io/en/latest/
Provides-Extra: docs
Provides-Extra: muon

# Multimodal Datasets

`mudatasets` provides public multimodal datasets in MuData format, with a primary focus on multimodal omics data.

[MuData library](https://github.com/PMBio/mudata) | [MuData documentation](https://mudata.readthedocs.io/)

## Installation

[![PyPi version](https://img.shields.io/pypi/v/mudatasets)](https://pypi.org/project/mudatasets)

```sh
# Stable, with muon
pip install "mudatasets[muon]"
# Dev
pip install git+https://github.com/gtca/mudatasets
```

## Getting started

```py
import mudatasets as mds
```

### Find available datasets

```py
mds.list_datasets()
```

### Load a dataset

```py
mdata = mds.load("pbmc3k_multiome")
print(mdata)
```

Commonly used arguments of `.load()`:

- `data_dir=` sets the location where the dataset is saved (`~/mudatasets/` by default)
- `with_info=True` also returns the dataset description as a dictionary, as a second return value (`False` by default)
- `backed=True` reads the data in backed mode, which applies only to `.h5mu` and `.h5ad` files (`True` by default)
- `files=` downloads only specific files from the dataset
- `full=True` downloads all the files defined for the dataset (`False` by default)
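
As a minimal sketch of how these arguments combine, the helper below requests the dataset description together with the data and turns backed mode off. The name `load_in_memory` is hypothetical and not part of `mudatasets`; the sketch also assumes that `with_info=True` makes `.load()` return a `(data, info)` pair, as the list above suggests.

```py
from pathlib import Path


def load_in_memory(name, data_dir=None):
    """Load a dataset into memory and return (mdata, info).

    Hypothetical helper for illustration; only the keyword arguments
    passed to mds.load() are taken from the list above.
    """
    # Imported lazily so the helper can be defined without mudatasets installed
    import mudatasets as mds

    kwargs = {"with_info": True, "backed": False}
    if data_dir is not None:
        # Expand "~" so the files end up where the caller expects
        kwargs["data_dir"] = str(Path(data_dir).expanduser())

    # Assumption: with_info=True yields a (data, info) pair
    mdata, info = mds.load(name, **kwargs)
    return mdata, info
```

Calling `load_in_memory("pbmc3k_multiome", data_dir="~/data/mudatasets")` would then download the dataset into that directory if it is not there yet; the directory shown here is only an example.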

### Get dataset info

```py
mds.info("pbmc3k_multiome")
```

### List dataset file names

```py
mds.list_files("pbmc3k_multiome")
```
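
The file names can be combined with the `files=` argument of `.load()` to fetch only part of a dataset. The sketch below rests on two assumptions that are not confirmed here: that `list_files()` returns an iterable of file names, and that `files=` accepts a list of such names. The helper name `load_matching_files` is hypothetical.

```py
def load_matching_files(name, suffix=".h5mu"):
    """Load only the dataset files whose names end with `suffix`.

    Hypothetical helper for illustration, not part of mudatasets.
    """
    # Imported lazily so the helper can be defined without mudatasets installed
    import mudatasets as mds

    # Assumption: list_files() returns an iterable of file names
    names = [str(f) for f in mds.list_files(name) if str(f).endswith(suffix)]
    if not names:
        raise ValueError(f"no files ending with {suffix!r} in dataset {name!r}")

    # Assumption: files= accepts a list of file names
    return mds.load(name, files=names)
```

For example, `load_matching_files("pbmc3k_multiome", suffix=".h5")` would restrict the download to files with that extension, if the dataset defines any.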

### Webpage with all the files

```py
mds.serve_webpage(port=8000)
```

This command launches a local server that serves a simple, temporarily created HTML page at http://localhost:8000 listing the files of all the datasets.

