Metadata-Version: 2.1
Name: memory-recommender
Version: 0.0.1
Summary: A memory-based recommender system
Home-page: https://github.com/k-w-lee/
Author: Morris Lee
Author-email: info.leekahwin@gmail.com
Project-URL: Bug Tracker, https://github.com/k-w-lee/
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE


# A package to make common data analysis easier

**Objective**: To implement a memory-based recommender.

To install the package:

```
pip install memory-recommender
```

The examples below show how the package works.

**Input [1]**:

```python
# df3 is assumed to be an existing ratings DataFrame with userId, title and rating columns
df2 = df3
index_col = 'userId'
columns_col = 'title'
values_col = 'rating'
random_state_value = 42
proposed_test_size = 0.2

(X_train, X_test, matrix_train_norm,
 matrix_train_norm_treated_pearson, matrix_test) = recommender_pre_processing(
    df2, index_col=index_col, columns_col=columns_col, values_col=values_col,
    random_state_value=random_state_value, proposed_test_size=proposed_test_size,
)
```

**Output [1]**:

```
STATUS: Unique value for userId = 386
STATUS: The proportion of the stratified splitting is 0.4246 to be able to perform stratify split
STATUS: The dataframe is splitted with test size of 0.4246
STATUS: Dimension of "X_train" = (471, 4)
STATUS: Dimension of "X_test" = (349, 4)
STATUS: Pivoted for matrix training set
STATUS: Dimension of "train matrix" = (206, 422)
STATUS: Pivoted for matrix testing set
STATUS: Dimension of "test matrix" = (206, 312)
STATUS: Matrix train first 5 rows
```
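The internals of `recommender_pre_processing` are not shown in this README. As a rough illustration of the steps the status messages describe (a per-user split, a pivot into a user-item matrix, normalisation and Pearson similarity), here is a minimal pandas-only sketch. The toy data, the `groupby(...).sample(...)` split and the name `user_similarity` are assumptions made for illustration, not the package's actual implementation.

```python
import pandas as pd

# Hypothetical toy ratings; only the column names come from the call above.
ratings = pd.DataFrame({
    "userId": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "title": ["A", "B", "C", "A", "B", "D", "B", "C", "D"],
    "rating": [5.0, 3.0, 4.0, 4.0, 2.0, 5.0, 1.0, 5.0, 4.0],
})

# Hold out roughly a third of each user's ratings so that every user
# appears in both the training and the test set (assumed split strategy).
X_test = ratings.groupby("userId").sample(frac=0.34, random_state=42)
X_train = ratings.drop(X_test.index)

# Pivot the training ratings into a user-item matrix:
# one row per user, one column per title, NaN where a user gave no rating.
matrix_train = X_train.pivot_table(index="userId", columns="title", values="rating")

# Mean-centre each user's ratings so generous and harsh raters are comparable.
matrix_train_norm = matrix_train.sub(matrix_train.mean(axis=1), axis=0)

# Pearson correlation between users, computed over co-rated titles only.
user_similarity = matrix_train_norm.T.corr(method="pearson")

print(X_train.shape, X_test.shape)
print(matrix_train.shape)
print(user_similarity.shape)
```

With so few ratings per user the similarity values themselves are not meaningful; the sketch is only meant to show the shape of the pipeline.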

**Input [2]**:
```python
# Check whether the DataFrame contains any null values:
m.null(df, 'df')

# Print the dimensions of a DataFrame:
m.shape(df, 'df')
```

**Output [2]**:
```
STATUS: There is null value in dataframe
STATUS: Nulls of df = {'col3': '1 (20.0%)', 'col4': '1 (20.0%)'} of total 5
STATUS: Dimension of "df" = (5, 4)
```
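For readers who want to see what such a null summary involves, the following is a plain-pandas sketch that produces the same kind of information. The sample frame is reconstructed from the outputs shown in this README, and the code is an assumption about an equivalent check, not the source of `m.null` or `m.shape`.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the outputs in this README (an assumption).
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [3, 4, 5, 6, 7],
    "col3": ["dog", None, "dog", "elephant", "dragon"],
    "col4": [9, 8, np.nan, 6, 5],
})

# Count nulls per column and keep only the columns that have any.
null_counts = df.isnull().sum()
null_counts = null_counts[null_counts > 0]

# Format each count together with its share of the total row count.
summary = {
    col: f"{count} ({count / len(df):.1%})"
    for col, count in null_counts.items()
}

print(summary)
print(df.shape)
```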

**Input [3]**:
```python
# Check whether a column contains any duplicate values:
m.duplicate(df, 'col3')
```
**Output [3]**:
```
STATUS: There are 1 duplicate values in the column of "col3"
```
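A comparable count can be obtained directly in pandas. This sketch assumes that a "duplicate" means a repeat of an earlier value in the column; whether `m.duplicate` counts them the same way is not confirmed here.

```python
import pandas as pd

# col3 values reconstructed from the outputs in this README (an assumption).
col3 = pd.Series(["dog", None, "dog", "elephant", "dragon"], name="col3")

# duplicated() flags every repeat of a value that has already been seen.
n_duplicates = int(col3.duplicated().sum())

print(f'There are {n_duplicates} duplicate values in the column of "{col3.name}"')
```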

**Input [4]**:
```python
# Print the value counts of a column, including percentages:
m.vc(df, 'col3')
```
**Output [4]**:
```
+----------+---------+------------------+
| col3     |   count |   percentage (%) |
+==========+=========+==================+
| dog      |       2 |               50 |
+----------+---------+------------------+
| dragon   |       1 |               25 |
+----------+---------+------------------+
| elephant |       1 |               25 |
+----------+---------+------------------+
```
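The same table can be assembled by hand from two `value_counts` calls. This is a sketch of one possible equivalent, not the implementation of `m.vc`; note that `value_counts` ignores nulls by default, which is why the percentages are taken over four non-null values.

```python
import pandas as pd

# col3 values reconstructed from the outputs in this README (an assumption).
col3 = pd.Series(["dog", None, "dog", "elephant", "dragon"], name="col3")

# Absolute counts and relative frequencies; nulls are excluded by default.
counts = col3.value_counts()
percentages = col3.value_counts(normalize=True) * 100

# Combine both into one table indexed by the distinct values.
table = pd.DataFrame({"count": counts, "percentage (%)": percentages})

print(table)
```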

**Input [5]**:
```python
# Drop a column:
m.drop(df, 'col3')
```
**Output [5]**:
```
+----+--------+--------+--------+
|    |   col1 |   col2 |   col4 |
+====+========+========+========+
|  0 |      1 |      3 |      9 |
+----+--------+--------+--------+
|  1 |      2 |      4 |      8 |
+----+--------+--------+--------+
|  2 |      3 |      5 |    nan |
+----+--------+--------+--------+
|  3 |      4 |      6 |      6 |
+----+--------+--------+--------+
|  4 |      5 |      7 |      5 |
+----+--------+--------+--------+
```
**Input [6]**:
```python
# One-hot encode a column:
m.one_hot_encode(df, 'col3')
```
**Output [6]**:
```
+----+--------+--------+--------+-------+----------+------------+
|    |   col1 |   col2 |   col4 |   dog |   dragon |   elephant |
+====+========+========+========+=======+==========+============+
|  0 |      1 |      3 |      9 |     1 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  1 |      2 |      4 |      8 |     0 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  2 |      3 |      5 |    nan |     1 |        0 |          0 |
+----+--------+--------+--------+-------+----------+------------+
|  3 |      4 |      6 |      6 |     0 |        0 |          1 |
+----+--------+--------+--------+-------+----------+------------+
|  4 |      5 |      7 |      5 |     0 |        1 |          0 |
+----+--------+--------+--------+-------+----------+------------+
```
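Dropping a column and one-hot encoding it can both be reproduced with standard pandas calls. The sketch below assumes the sample frame reconstructed from the outputs above and uses `pd.get_dummies`; it is an illustration of the technique rather than the code behind `m.drop` or `m.one_hot_encode`.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the outputs in this README (an assumption).
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5],
    "col2": [3, 4, 5, 6, 7],
    "col3": ["dog", None, "dog", "elephant", "dragon"],
    "col4": [9, 8, np.nan, 6, 5],
})

# One indicator column per distinct value; a null row gets zeros everywhere.
dummies = pd.get_dummies(df["col3"], dtype=int)

# Replace the original column with its indicator columns.
encoded = df.drop(columns="col3").join(dummies)

print(encoded)
```

The row whose `col3` value is null ends up with zeros in every indicator column, which matches row 1 of the output above.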

## Merging: a simplified and smarter way to merge your datasets

```python
mergex(df1, df2, column1, column2, df1_name=None, df2_name=None)
```
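The behaviour of `mergex` is not described further in this README. For comparison, this is what joining two frames on differently named key columns looks like with plain `pandas.merge`; the frames, the column names and the choice of an outer join with `indicator=True` are all hypothetical and say nothing about how `mergex` itself works.

```python
import pandas as pd

# Hypothetical frames whose key columns have different names.
df1 = pd.DataFrame({"user": [1, 2, 3], "rating": [5, 3, 4]})
df2 = pd.DataFrame({"uid": [2, 3, 4], "title": ["A", "B", "C"]})

# Outer join on the two key columns; indicator=True adds a "_merge" column
# recording whether each row came from the left frame, the right frame or both.
merged = pd.merge(
    df1, df2, left_on="user", right_on="uid", how="outer", indicator=True
)

print(merged)
print(merged["_merge"].value_counts())
```

The `_merge` column is a quick way to see how many keys matched before deciding which join type to keep.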

This package is contributed by [Morris Lee](http://www.morris-lee.com/).
