# `dag-cbor`: A Python implementation of [DAG-CBOR](https://ipld.io/specs/codecs/dag-cbor/spec/)

[![Generic badge](https://img.shields.io/badge/python-3.7+-green.svg)](https://docs.python.org/3.7/)
![PyPI version shields.io](https://img.shields.io/pypi/v/dag-cbor.svg)
[![PyPI status](https://img.shields.io/pypi/status/dag-cbor.svg)](https://pypi.python.org/pypi/dag-cbor/)
[![Checked with Mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](https://github.com/python/mypy)
[![Python package](https://github.com/hashberg-io/py-dag-cbor/actions/workflows/python-pytest.yml/badge.svg)](https://github.com/hashberg-io/py-dag-cbor/actions/workflows/python-pytest.yml)


This is a fully compliant Python implementation of the [DAG-CBOR codec](https://ipld.io/specs/codecs/dag-cbor/spec/), a subset of the [Concise Binary Object Representation (CBOR)](https://cbor.io/) supporting the [IPLD Data Model](https://ipld.io/docs/data-model/) and enforcing a unique (strict) encoded representation of items.

You can install this package with `pip`:

```
pip install dag-cbor
```

The [documentation](https://hashberg-io.github.io/py-dag-cbor/dag_cbor/index.html) for this package is automatically generated by [pdoc](https://pdoc3.github.io/pdoc/).

## Basic usage

The core functionality of the library is provided by the `encode` and `decode` functions:

```python
>>> import dag_cbor
>>> dag_cbor.encode({'a': 12, 'b': 'hello!'})
b'\xa2aa\x0cabfhello!'
>>> dag_cbor.decode(b'\xa2aa\x0cabfhello!')
{'a': 12, 'b': 'hello!'}
```
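To make the encoded bytes above easier to read, here is a minimal sketch of a decoder for a tiny subset of CBOR (unsigned integers below 24, short text strings and small maps), written with no dependencies. The helper name `decode_tiny` is hypothetical, and this is not how `dag_cbor.decode` is implemented: it only illustrates how each leading byte splits into a 3-bit major type and a 5-bit argument.

```python
def decode_tiny(data: bytes, pos: int = 0):
    """Decode one item from a tiny CBOR subset; return (value, next_pos)."""
    head = data[pos]
    major, arg = head >> 5, head & 0x1F  # 3-bit major type, 5-bit argument
    if arg >= 24:
        raise ValueError("this sketch only handles arguments below 24")
    pos += 1
    if major == 0:  # unsigned integer: the argument is the value itself
        return arg, pos
    if major == 3:  # text string: the argument is the UTF-8 byte length
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 5:  # map: the argument is the number of key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode_tiny(data, pos)
            value, pos = decode_tiny(data, pos)
            result[key] = value
        return result, pos
    raise ValueError(f"major type {major} is not handled by this sketch")


value, end = decode_tiny(b'\xa2aa\x0cabfhello!')
print(value)  # {'a': 12, 'b': 'hello!'}
```

Read this way, `\xa2` announces a map with two pairs, `a` (`0x61`) announces a one-byte text string, `\x0c` is the integer 12, and `f` (`0x66`) announces a six-byte text string.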

## Usage with binary streams

A buffered binary stream (i.e. an instance of `io.BufferedIOBase`) can be passed to the `encode` function using the optional keyword argument `stream`, in which case the encoded bytes are written to the stream rather than returned:

```python
>>> from io import BytesIO
>>> mystream = BytesIO()
>>> dag_cbor.encode({'a': 12, 'b': 'hello!'}, stream=mystream)
>>> mystream.getvalue()
b'\xa2aa\x0cabfhello!'
```

A buffered binary stream can be passed to the `decode` function instead of a `bytes` object, in which case the contents of the stream are read in their entirety and decoded:

```python
>>> mystream = BytesIO(b'\xa2aa\x0cabfhello!')
>>> dag_cbor.decode(mystream)
{'a': 12, 'b': 'hello!'}
```
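Since this section refers to instances of `io.BufferedIOBase`, it may help to see which standard-library streams fall into that category. The sketch below uses only the standard library; the helper name `is_buffered_binary` is hypothetical and not part of `dag_cbor`, and how the library reacts to other stream types is not covered here.

```python
import io
import os
import tempfile


def is_buffered_binary(stream) -> bool:
    """Return True if `stream` is an instance of io.BufferedIOBase."""
    return isinstance(stream, io.BufferedIOBase)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.cbor")
    with open(path, "wb") as f:  # default buffering: io.BufferedWriter
        wb_ok = is_buffered_binary(f)
        f.write(b'\xa2aa\x0cabfhello!')
    with open(path, "rb") as f:  # default buffering: io.BufferedReader
        rb_ok = is_buffered_binary(f)
    with open(path, "rb", buffering=0) as f:  # raw stream: io.FileIO
        raw_ok = is_buffered_binary(f)
    with open(path, "r", encoding="latin-1") as f:  # text stream: io.TextIOWrapper
        text_ok = is_buffered_binary(f)

bytesio_ok = is_buffered_binary(io.BytesIO())
print(bytesio_ok, wb_ok, rb_ok, raw_ok, text_ok)  # True True True False False
```

In short, `BytesIO` objects and files opened in binary mode with default buffering qualify, while unbuffered or text-mode files do not.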

## Random data

The `random` module contains functions to generate random data compatible with DAG-CBOR encoding. The functions are named `rand_X`, where `X` is one of:

- `int` for uniformly distributed integers
- `float` for uniformly distributed floats, with fixed decimals
- `bytes` for byte-strings of uniformly distributed length, with uniformly distributed bytes
- `str` for strings of uniformly distributed length, with uniformly distributed codepoints (all valid UTF-8 strings, by rejection sampling)
- `bool` for `False` or `True` (50% each)
- `bool_none` for `False`, `True` or `None` (33.3% each)
- `list` for lists of uniformly distributed length, with random elements of any type
- `dict` for dictionaries of uniformly distributed length, with distinct random string keys and random values of any type
- `cid` for CID data (instance of `BaseCID` from the [`py-cid`](https://github.com/ipld/py-cid) package)
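The rejection sampling mentioned for `str` can be illustrated with a short standard-library sketch. This is an assumption about the general idea, not the library's actual code: the function name `rand_utf8_str` and the `max_len` parameter are hypothetical, while `min_codepoint` and `max_codepoint` mirror the option names used in the example further down. Codepoints are drawn uniformly, and any codepoint that cannot be encoded as UTF-8 (the surrogates `U+D800` to `U+DFFF`) is discarded and redrawn.

```python
import random


def rand_utf8_str(rng, max_len=8, min_codepoint=0x00, max_codepoint=0x10FFFF):
    """Return a random string whose characters all encode to valid UTF-8."""
    length = rng.randint(0, max_len)  # uniformly distributed length
    chars = []
    while len(chars) < length:
        c = chr(rng.randint(min_codepoint, max_codepoint))
        try:
            c.encode("utf-8")  # raises for surrogate codepoints
        except UnicodeEncodeError:
            continue  # reject this sample and draw again
        chars.append(c)
    return "".join(chars)


rng = random.Random(0)  # seeded only to make the sketch reproducible
samples = [rand_utf8_str(rng, min_codepoint=0x41, max_codepoint=0x5A) for _ in range(3)]
print(samples)
```

Note that the loop only terminates if the chosen codepoint range contains at least one non-surrogate codepoint.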

The function call `rand_X(n)` returns an iterator yielding a stream of `n` random values of type `X`:

```python
>>> import pprint
>>> import dag_cbor
>>> options = dict(min_codepoint=0x41, max_codepoint=0x5a, include_cid=False)
>>> with dag_cbor.random.rand_options(**options):
...     for d in dag_cbor.random.rand_dict(3):
...         pprint.pp(d)
...
{'BIQPMZ': b'\x85\x1f\x07/\xcc\x00\xfc\xaa',
 'EJEYDTZI': {},
 'PLSG': {'G': 'JFG',
          'HZE': -61.278,
          'JWDRKRGZ': b'-',
          'OCCKQPDJ': True,
          'SJOCTZMK': False},
 'PRDLN': 39.129,
 'TUGRP': None,
 'WZTEJDXC': -69.933}
{'GHAXI': 39.12,
 'PVUWZLC': 4.523,
 'TDPSU': 'TVCADUGT',
 'ZHGVSNSI': [-57, 9, -78.312]}
{'': 11, 'B': True, 'FWD': {}, 'GXZBVAR': 'BTDWMGI', 'TDICHC': 87}
```

The function call `rand_X()`, without the positional argument `n`, instead yields an infinite stream of random values. The `rand_options(**options)` context manager sets options temporarily: in the example above, it restricts string characters to uppercase ASCII letters (codepoints `0x41`-`0x5a`) and excludes CID values from the generated data. For the full list of functions and options, please refer to the [`dag_cbor.random` documentation](https://hashberg-io.github.io/py-dag-cbor/dag_cbor/random.html).

