Whats-this-rock
================

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
<p>
This project deploys a Telegram bot that classifies rock images into one
of seven types.<br>
<img src="https://i.imgur.com/cDrrfqF.jpg" alt="What's my name?" width=20% align="right"/>
</p>

![GitHub Workflow
Status](https://img.shields.io/github/workflow/status/udaylunawat/Whats-this-rock/Lint%20Code%20Base.png)
![GitHub
issues](https://img.shields.io/github/issues-raw/udaylunawat/Whats-this-rock.png)
[![GitHub
Super-Linter](https://github.com/nvuillam/npm-groovy-lint/workflows/Lint%20Code%20Base/badge.svg)](https://github.com/marketplace/actions/super-linter)

![code-size](https://img.shields.io/github/languages/code-size/udaylunawat/Whats-this-rock.png)
![repo-size](https://img.shields.io/github/repo-size/udaylunawat/Whats-this-rock.png)
![top-language](https://img.shields.io/github/languages/top/udaylunawat/Whats-this-rock.png)

![Python](https://img.shields.io/badge/python-v3.8.0+-success.svg)
![Tensorflow](https://img.shields.io/badge/tensorflow-v2.9.0+-success.svg)

[![contributions
welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/dwyl/esta/issues)
[![HitCount](https://hits.dwyl.com/udaylunawat/Whats-this-rock.svg?style=flat)](http://hits.dwyl.com/udaylunawat/Whats-this-rock)

![](https://img.shields.io/twitter/follow/udaylunawat?style=social.png)

This package uses [tensorflow](https://github.com/tensorflow/tensorflow)
to accelerate deep learning experimentation.

The MLOps workflow was handled with [Weights & Biases](https://wandb.ai):

- Experiment Tracking
- Model Management
- Hyperparameter Tuning

Additionally, [nbdev](https://github.com/fastai/nbdev) was used to:

- develop the package
- produce documentation based on a series of notebooks
- run CI
- publish to [PyPI](https://pypi.org/project/rocks-classifier/)

## Inspiration

> [The common complaint that you need massive amounts of data to do deep
> learning can be a very long way from the
> truth!](https://youtu.be/J6XcP4JOHmk?t=2029)
>
> You very often don’t need much data at all, a lot of people are looking
> for ways to share data and aggregate data, but that’s unnecessary. They
> assume they need more data than they do, cause they’re not familiar with
> the basics of transfer learning which is this critical technique for
> needing orders of magnitudes less data.
>
> [Jeremy
> Howard](https://en.wikipedia.org/wiki/Jeremy_Howard_(entrepreneur))

## Installation & Training Steps

### Install

To install, use `pip`:

    pip install git+https://github.com/udaylunawat/Whats-this-rock.git

### Use the Telegram Bot

You can try the bot [here](https://t.me/test7385_bot) on Telegram.

> Type `/help` to get instructions in chat.

### Deploy Telegram Bot

``` bash
rocks_deploy_bot
```

### Train Model

Run this command:

``` bash
rocks_train_model epochs=3
```

You can try different models and parameters by editing `config.json`.

With Hydra, parameters can also be overridden directly on the command
line:

``` bash
rocks_train_model wandb.project=Whats-this-rockv \
                  dataset_id=[1,2] \
                  epochs=50 \
                  backbone=resnet
```

<p align="left">
<img src="https://i.imgur.com/1nBpPC5.png" alt="result" width=100%/>
</p>

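To make the override syntax above concrete, here is a minimal sketch of how dotted `key=value` arguments can be merged into a nested configuration. It uses only the standard library and is not Hydra's implementation; the default values and the `apply_overrides` and `parse_value` helpers are hypothetical, and value parsing is simplified with `ast.literal_eval`.

```python
import ast
import copy

# Hypothetical defaults; the real values live in the project's config files.
DEFAULTS = {
    "epochs": 3,
    "backbone": "efficientnet",
    "dataset_id": [1],
    "wandb": {"project": "Whats-this-rock"},
}


def parse_value(raw):
    """Interpret a raw override value as a Python literal, else keep the string."""
    try:
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        return raw


def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    merged = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = merged
        # Walk (or create) the nested dictionaries named by the dotted key.
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return merged


cfg = apply_overrides(
    DEFAULTS,
    ["wandb.project=Whats-this-rockv", "dataset_id=[1,2]", "epochs=50", "backbone=resnet"],
)
print(cfg)
```

Copying the defaults before merging keeps the base configuration untouched, so several runs can start from the same values.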
### Wandb Sweeps (Hyperparameter Tuning)

Edit `configs/sweeps.yaml`, then create the sweep:

``` bash
wandb sweep \
--project Whats-this-rock \
--entity udaylunawat \
configs/sweeps.yaml
```

This prints a `wandb agent` command containing the sweep ID. Run it,
replacing `$sweepid` with that ID:

``` bash
wandb agent udaylunawat/Whats-this-rock/$sweepid
```
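The same two steps can also be driven from Python with `wandb.sweep` and `wandb.agent`. The sketch below is an assumption about what a sweep definition for this project might look like: the metric name, the parameter values and the `train` callable are hypothetical and are not taken from `configs/sweeps.yaml`.

```python
# Hypothetical sweep definition; the project's real one lives in configs/sweeps.yaml.
SWEEP_CONFIG = {
    "method": "random",
    "metric": {"name": "val_accuracy", "goal": "maximize"},  # assumed metric name
    "parameters": {
        "epochs": {"values": [10, 30, 50]},
        "backbone": {"values": ["resnet", "efficientnet"]},
    },
}


def launch_sweep(train, count=5):
    """Register the sweep, then run `count` trials of the `train` callable."""
    import wandb  # imported lazily so the config can be inspected without wandb

    sweep_id = wandb.sweep(
        SWEEP_CONFIG, entity="udaylunawat", project="Whats-this-rock"
    )
    wandb.agent(
        sweep_id,
        function=train,
        entity="udaylunawat",
        project="Whats-this-rock",
        count=count,
    )
    return sweep_id
```

Calling `launch_sweep(train)` requires a logged-in W&B account and a `train` function that reads its hyperparameters from `wandb.config`.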

## Demo

|                                                                                                                                                                          |                                                                                                                                              |                                                                                                                                                                                     |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ![alt colab](https://www.tensorflow.org/images/colab_logo_32px.png)[Run in Colab](https://colab.research.google.com/drive/1N1CIqdOKlJSJla5PU53Yn9KWSao47eMv?usp=sharing) | ![alt Source](https://www.tensorflow.org/images/GitHub-Mark-32px.png)[View Source on GitHub](https://github.com/udaylunawat/Whats-this-rock) | ![alt notebook](https://www.tensorflow.org/images/download_logo_32px.png)[Download Notebook](https://github.com/udaylunawat/Whats-this-rock/blob/main/notebooks/03_training.ipynb) |

## Features

<table border="0" class="left">
<tr>
<td>
<b style="font-size:37px">Features added</b>
</td>
<td>
<b style="font-size:37px">Features planned</b>
</td>
</tr>
<tr>
<td>

- Wandb

- Datasets

  - 4 Datasets

- Augmentation

  - keras-cv
  - Regular Augmentation

- Sampling

  - Oversampling
  - Undersampling
  - Class weights

- Remove Corrupted Images

- Try Multiple Optimizers (Adam, RMSProp, AdamW, SGD)

- Generators

  - TFDS datasets
  - ImageDataGenerator

- Models

  - ConvNextTiny
  - BaselineCNN
  - Efficientnet
  - Resnet101
  - MobileNetv1
  - MobileNetv2
  - Xception

- LRScheduler, LRDecay

  - Baseline without scheduler
  - Step decay
  - Cosine annealing
  - Classic cosine annealing with batch steps w/o restart

- Model Checkpoint, Resume Training

- Evaluation

  - Confusion Matrix
  - Classification Report

- Deploy Telegram Bot

  - Heroku - Deprecated
  - Railway
  - Show confusion matrix and classification report in bot

- Docker

- GitHub Actions

  - Deploy Bot when bot.py is updated.
  - Lint code using GitHub super-linter

- Configuration Management

  - ml-collections
  - Hydra

- Performance improvement

  - Convert to tf.data.Dataset

- Linting & Formatting

  - Black
  - Flake8
  - isort
  - pydocstyle

- Add Badges

  - Linting

- Found the classes the model performs worst on

- nbdev

- CI

- documentation

  </td>
  <td>

- [ ] Deploy to Huggingface spaces

- [ ] Accessing the model through FastAPI (Backend)

- [ ] Streamlit (Frontend)

- [ ] Convert `models.py` to classes in a more OOP style

- [ ] Group Runs

  - [ ] k-fold cross-validation

- [ ] [WandB
  Tables](https://twitter.com/ayushthakur0/status/1508962184357113856?s=21&t=VRL-ZXzznXV_Hg2h7QnjuA)

- [ ] Find the long-tail or hard examples

- [ ] Add Badges

  - [ ] Railway

  </td>

</tr>
</table>

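For reference, the step-decay and cosine-annealing schedules listed above can be written as plain functions of the epoch index. This is a generic sketch of the two formulas, not the project's callbacks; the function names and default values are illustrative assumptions.

```python
import math


def step_decay(epoch, initial_lr=1e-3, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def cosine_annealing(epoch, total_epochs, max_lr=1e-3, min_lr=0.0):
    """Decay from `max_lr` to `min_lr` along half a cosine wave, without restarts."""
    progress = min(epoch, total_epochs) / total_epochs
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Print both schedules at a few epochs of a 50-epoch run.
for epoch in (0, 10, 25, 50):
    print(epoch, step_decay(epoch), cosine_annealing(epoch, total_epochs=50))
```

To use one of these with Keras, wrap it so that only the epoch is forwarded, for example `tf.keras.callbacks.LearningRateScheduler(lambda epoch, lr: step_decay(epoch))`.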
## Technologies Used

|                                                                                                                                                                                                                                                    |                                                                                                                                                                                                                 |                                                                                                                                                                                                  |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [![Google Colab](https://img.shields.io/badge/Compute-Google%20Colab-F9AB00?logo=googlecolab&logoColor=fff&style=for-the-badge.png)](https://colab.research.google.com/drive/1N1CIqdOKlJSJla5PU53Yn9KWSao47eMv?usp=sharing "Google collaboratory") | [![python-telegram-bot](https://img.shields.io/badge/ChatBot-Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=black.png)](https://github.com/python-telegram-bot/python-telegram-bot "Telegram Bot") | [![Railway](https://img.shields.io/badge/Deployment-Railway-131415?style=for-the-badge&logo=railway&logoColor=black.png)](https://railway.app "Railway")                                         |
| [![Jupyter Notebook](https://img.shields.io/badge/Coding-jupyter-%23FA0F00.svg?style=for-the-badge&logo=jupyter&logoColor=black)](https://jupyter.org "Jupyter")                                                                                   | [![Python](https://img.shields.io/badge/Language-python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54.png)](https://www.python.org/ "Python")                                                         | [![GitHub Actions](https://img.shields.io/badge/CI-github%20actions-%232671E5.svg?style=for-the-badge&logo=githubactions&logoColor=black)](https://github.com/features/actions "Github Actions") |
| [![Weights & Biases](https://img.shields.io/badge/MLOps-Weights%20%26%20Biases-FFBE00?logo=weightsandbiases&logoColor=000&style=for-the-badge.png)](http://wandb.ai "Weights & Biases")                                                            | [![TensorFlow](https://img.shields.io/badge/ML_Framework-TensorFlow-%23FF6F00.svg?style=for-the-badge&logo=TensorFlow&logoColor=black)](https://www.tensorflow.org/ "Tensorflow")                               | [![macOS](https://img.shields.io/badge/OS-mac%20os-000000?style=for-the-badge&logo=macos&logoColor=F0F0F0.png)](https://apple.com/macos "macOS")                                                 |
| [![Docker](https://img.shields.io/badge/Container-docker-%230db7ed.svg?style=for-the-badge&logo=docker&logoColor=black)](http://docker.com "Docker")                                                                                               | [![Git](https://img.shields.io/badge/Version_Control-git-%23F05033.svg?style=for-the-badge&logo=git&logoColor=black)](https://git-scm.com "Git")                                                                | [![Hydra](https://img.shields.io/badge/config-hydra1.1-89b8cd?style=for-the-badge&labelColor=gray)](http://hydra.cc "Hydra")                                                                     |
| [![Black](https://img.shields.io/badge/code%20style-black-black.svg?style=for-the-badge&labelColor=gray)](http://github.com/psf/black "Black")                                                                                                     |                                                                                                                                                                                                                 |                                                                                                                                                                                                  |


## Directory Tree

    ├── imgs                              <- Images for skill banner, project banner and other images
    │
    ├── configs                           <- Configuration files
    │   ├── configs.yaml                  <- config for single run
    │   └── sweeps.yaml                   <- configuration file for sweeps hyperparameter tuning
    │
    ├── data
    │   ├── corrupted_images              <- corrupted images will be moved to this directory
    │   ├── misclassified_images          <- misclassified images will be moved to this directory
    │   ├── bad_images                    <- Bad images will be moved to this directory
    │   ├── duplicate_images              <- Duplicate images will be moved to this directory
    │   ├── sample_images                 <- Sample images for inference
    │   ├── 0_raw                         <- The original, immutable data dump.
    │   ├── 1_external                    <- Data from third party sources.
    │   ├── 2_interim                     <- Intermediate data that has been transformed.
    │   └── 3_processed                   <- The final, canonical data sets for modeling.
    │
    ├── notebooks                         <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                                        the creator's initials, and a short `-` delimited description, e.g.
    │                                        `1.0-jqp-initial-data-exploration`.
    │
    │
    ├── rocks_classifier                  <- Source code for use in this project.
    │   │
    │   ├── data                          <- Scripts to download or generate data
    │   │   ├── download.py
    │   │   ├── preprocess.py
    │   │   └── utils.py
    │   │
    │   ├── callbacks                     <- functions that are executed during training at given stages of the training procedure
    │   │   └── callbacks.py
    │   │
    │   ├── models                        <- Scripts to train models and then use trained models to make
    │   │   │                                predictions
    │   │   ├── evaluate.py
    │   │   ├── models.py
    │   │   ├── predict.py
    │   │   ├── train.py
    │   │   └── utils.py
    │   │
    │   │
    │   └── visualization                 <- Scripts for visualizations
    │
    ├── .dockerignore                     <- Docker ignore
    ├── .gitignore                        <- GitHub's excellent Python .gitignore customized for this project
    ├── LICENSE                           <- Your project's license.
    ├── README.md                         <- The top-level README for developers using this project.
    ├── CHANGELOG.md                      <- Release changes.
    ├── CODE_OF_CONDUCT.md                <- Code of conduct.
    ├── CONTRIBUTING.md                   <- Contributing Guidelines.
    ├── settings.ini                      <- configuration.
    ├── requirements.txt                  <- The requirements file for reproducing the analysis environment, e.g.
    │                                        generated with `pip freeze > requirements.txt`
    └── setup.py                          <- makes project pip installable (pip install -e .) so src can be imported

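As an illustration of how files could end up in `data/corrupted_images`, the sketch below moves files whose first bytes do not match a JPEG or PNG signature. This is a deliberately simple stand-in: the project's actual check is not described here and may decode each image fully, and the `move_corrupted` name and the example paths are hypothetical.

```python
import shutil
from pathlib import Path

# Leading bytes ("magic numbers") of the two formats handled in this sketch.
SIGNATURES = (b"\xff\xd8\xff", b"\x89PNG\r\n\x1a\n")


def move_corrupted(source_dir, corrupted_dir):
    """Move files without a known image signature and return their names."""
    source, target = Path(source_dir), Path(corrupted_dir)
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for path in sorted(source.rglob("*")):
        if not path.is_file() or target in path.parents:
            continue
        with path.open("rb") as handle:
            header = handle.read(8)
        if not header.startswith(SIGNATURES):
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

A call such as `move_corrupted("data/0_raw", "data/corrupted_images")` would return the names of the files it moved. A signature check only catches wrong file types and files damaged at the start, so decoding each image is the stricter option.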
## Bug / Feature Request

If you find a bug (the bot couldn’t handle an image and / or gave
undesired results), kindly open an issue
[here](https://github.com/udaylunawat/Whats-this-rock/issues) and
include the image you sent and the expected result.

If you’d like to request a new feature, feel free to do so by opening
an issue [here](https://github.com/udaylunawat/Whats-this-rock/issues).
Please include sample inputs and their expected results.

<!-- CONTRIBUTING -->

## Contributing

- Contributions make the open source community such an amazing place to
  learn, inspire, and create.
- Any contributions you make are **greatly appreciated**.
- Check out our [contribution guidelines](../CONTRIBUTING.md) for more
  information.

## License

Whats-this-rock is licensed under the MIT License - see the [LICENSE](LICENSE)
file for details.

## Credits

- [Dataset 1 - by Mahmoud
  Alforawi](https://www.kaggle.com/datasets/mahmoudalforawi/igneous-metamorphic-sedimentary-rocks-and-minerals)
- [Dataset 2 - by
  salmaneunus](https://www.kaggle.com/datasets/salmaneunus/rock-classification)
- nbdev inspiration - [tmabraham](https://github.com/tmabraham/UPIT)

## Support

This project needs a ⭐️ from you. Don’t forget to leave a star ⭐️

<br>
<p align="center">
Walt might be the one who knocks <br> but Hank is the one who rocks.
<br>
</p>
