Metadata-Version: 2.1
Name: pandas-nosql
Version: 1.1.0
Summary: A Module to add read and write capabilities to pandas for several nosql databases
Home-page: https://bitbucket.org/blacklotus231/pandas-nosql
Author: James Baker Jr
License: Apache License 2.0
Keywords: pandas,nosql,mongo,mongodb,elasticsearch,redis,cassandra,apache-cassandra
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE

# Pandas-NoSQL
__Pandas-NoSQL adds read and write capabilities to pandas for several NoSQL databases__

---
Pandas read and write methods for:

* MongoDB
* Elasticsearch
* Redis
* Apache Cassandra

The dependencies used for the functions are:

* pymongo by MongoDB
* elasticsearch by Elastic
* redis-py by Redis
* cassandra-driver by DataStax

---

## Installation
```bash
$ pip install pandas-nosql
```
---
## Documentation

### MongoDB
```python
# Read a MongoDB collection into a DataFrame
# Defaults: host='localhost' port=27017
# For nested collections use normalize=True

import pandas as pd
import pandas_nosql

df = pd.read_mongo(database = 'test_db', collection = 'test_col', normalize = True)

# Write DataFrame to MongoDB collection
# The modify_collection parameter helps prevent accidental overwrites

df.to_mongo(database = 'test_db', collection = 'test_col2', mode = 'a')
```
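
To make the effect of `normalize=True` on nested collections more concrete, here is a minimal standard-library sketch of flattening a nested document into dotted column names. It is an illustration only, not the library's implementation: the `flatten` helper and the sample document are hypothetical, and the assumption is that the reader yields dotted names similar to what `pandas.json_normalize` produces by default.

```python
def flatten(document, parent_key='', sep='.'):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in document.items():
        full_key = f'{parent_key}{sep}{key}' if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-documents and prefix their keys
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# Hypothetical nested document, as it might be stored in a collection
doc = {'_id': 1, 'owner': {'name': 'Ann', 'address': {'city': 'Austin'}}, 'miles': 12000}
flat_doc = flatten(doc)
print(flat_doc)
```

Each key of the flattened document would then map to one DataFrame column, so nested fields become regular columns instead of dict-valued cells.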

### Elasticsearch
```python
import pandas as pd
import pandas_nosql

# Read an Elastic index into a DataFrame
# To Access localhost: 
#   * hosts='https://localhost:9200'
#   * verify_certs=False
# If "xpack.security.enabled: false" use http instead
# To split out _source use split_source=True

elastic_cols = ('make', 'model', 'purchase_date', 'miles')

df = pd.read_elastic(
    hosts = 'https://localhost:9200',
    username = 'elastic',
    password = 'strong_password',
    index = 'test_index',
    fields = elastic_cols,
    verify_certs = False,
    split_source = True
    )

# Write DataFrame to Elastic Index
df.to_elastic(
    hosts = 'https://localhost:9200',
    username = 'elastic',
    password = 'strong_password',
    index = 'test_index',
    mode = 'w'
    )
```
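
Elasticsearch search hits carry metadata keys such as `_index` and `_id` next to a `_source` dict that holds the document itself. The sketch below shows one plausible reading of `split_source=True`: lifting the fields in `_source` to the top level so each becomes its own column. The `split_source` helper and the sample hits are hypothetical, and the library may handle this differently.

```python
def split_source(hits):
    """Return one flat row per hit, merging the _source fields into the row."""
    rows = []
    for hit in hits:
        # Keep the metadata keys (_index, _id, ...) and drop the nested _source
        row = {key: value for key, value in hit.items() if key != '_source'}
        # Lift the document fields to the top level
        row.update(hit.get('_source', {}))
        rows.append(row)
    return rows

# Hypothetical hits, shaped like response['hits']['hits'] from a search
hits = [
    {'_index': 'test_index', '_id': '1', '_source': {'make': 'Ford', 'miles': 12000}},
    {'_index': 'test_index', '_id': '2', '_source': {'make': 'Audi', 'miles': 8000}},
]
rows = split_source(hits)
print(rows)
```

Without this step, the whole document would sit in a single `_source` column as a dict.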

### Redis
```python
import pandas as pd
import pandas_nosql

# Read a DataFrame that was sent to Redis using the to_redis method
# A DataFrame not sent to Redis using the to_redis method is not guaranteed to be read properly

# To Access localhost:
#   * host='localhost'
# To persist the DataFrame in Redis use expire_seconds=None
# To set an expiration for the DataFrame pass an integer to expire_seconds

df = pd.read_redis(host = 'localhost', port = 6379, redis_key = 'test_key', expire_seconds = None)

# Write a DataFrame to Redis
df.to_redis(host = 'localhost', port = 6379, redis_key = 'test_key')
```
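
The warning about values not written by `to_redis` comes down to serialization: Redis stores bytes, so the reader has to know the format the writer used. The sketch below illustrates that round trip with JSON and a plain dict standing in for the Redis server. The JSON format, the helper names, and the stand-in store are all assumptions made for illustration; the library's actual format is not documented here and may differ.

```python
import json

def serialize_table(columns):
    """Encode a column-oriented table (dict of lists) as UTF-8 JSON bytes."""
    return json.dumps(columns).encode('utf-8')

def deserialize_table(payload):
    """Decode bytes produced by serialize_table; raise ValueError otherwise."""
    decoded = json.loads(payload.decode('utf-8'))
    if not isinstance(decoded, dict):
        raise ValueError('payload is not a serialized table')
    return decoded

# A plain dict stands in for the Redis server in this sketch
fake_redis = {}
fake_redis['test_key'] = serialize_table({'make': ['Ford', 'Audi'], 'miles': [12000, 8000]})
fake_redis['other_key'] = b'written by another program'

# A value written in the expected format reads back unchanged
round_trip = deserialize_table(fake_redis['test_key'])

# A value written by other means cannot be decoded
try:
    deserialize_table(fake_redis['other_key'])
    foreign_ok = True
except ValueError:
    foreign_ok = False

print(round_trip, foreign_ok)
```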

### Apache Cassandra
```python
import pandas as pd
import pandas_nosql

# Read an Apache Cassandra table into a pandas DataFrame
# To Access localhost:
#   * contact_points=['localhost']
#   * contact_points must be a list

df = pd.read_cassandra(
    contact_points = ['localhost'],
    port = 9042,
    keyspace = 'test_keyspace',
    table = 'test_table'
    )

# Append a DataFrame to Apache Cassandra
# DataFrame columns must match the table columns
replication = {'class' : 'SimpleStrategy', 'replication_factor' : 1}
df.to_cassandra(
    contact_points = ['localhost'],
    port = 9042,
    keyspace = 'test_keyspace',
    table = 'test_table',
    mode = 'w',
    replication = replication
)
```
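
The `replication` dict has the same shape as the replication map in a CQL `CREATE KEYSPACE` statement. As a rough sketch, and assuming the dict is used when the keyspace has to be created (which this README does not confirm), the statement could be built as follows. The `create_keyspace_cql` helper is hypothetical and does no escaping, so it is only suitable for trusted input.

```python
def create_keyspace_cql(keyspace, replication):
    """Build a CREATE KEYSPACE statement from a replication options dict."""
    # Render each option as 'key': value, quoting string values like CQL does
    options = ', '.join(f"'{key}': {value!r}" for key, value in replication.items())
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{options}}}"

replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
statement = create_keyspace_cql('test_keyspace', replication)
print(statement)
```

`SimpleStrategy` with a `replication_factor` of 1 keeps a single copy of each row, which suits a single-node local setup such as the `localhost` example above.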
