Metadata-Version: 2.1
Name: mnn-meter
Version: 1.0.2
Summary: Tools for quickly building operator latency tables and for accurately predicting model latency (based on Pytorch and MNN)
Author: Haolin Yan
Author-email: haolinyan_xdu@163.com
License: UNKNOWN
Project-URL: Source, https://github.com/makerlin1/MMT
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.5
Description-Content-Type: text/markdown

![img.png](img.png)
---
Tools for quickly building operator 
latency tables and for accurately predicting 
model latency (based on PyTorch and MNN).

[中文版](README_zh.md)
## 1.Installation
MMT runs in two places, the server side and the inference (deployment) side: 
* On the server side, the operator list is generated from the specified operator space, 
and the latency of a given model is predicted from the operator latency table. 
* On the inference side, the latency of each operator in the operator list is measured to produce the operator latency table. 

The server side needs both `PyTorch` and `MNN(C++)` installed, 
and the inference side needs `MNN(C++)`.

**Note: Be sure to add the `build` folder generated by compiling `MNN` to the environment variable!**

After configuring the above dependencies, install MMT:
```
pip install mnn-meter
```

## 2.Start
### 2.1 Modify your models
For each custom model or layer, override `__repr__()` so that it returns a unique representation of the instance's parameters, for example:
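```python
import torch.nn as nn

class ResNetBasicBlock(nn.Module):
    def __init__(self, in_channels, out_channels, stride, kernel):
        super().__init__()
        # Encode every constructor parameter so each configuration gets a unique name
        self.name = "ResNetBasicBlock-%d-%d-%d-%d-" % (in_channels, out_channels, stride, kernel)
        ...

    def __repr__(self):
        return self.name
```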
**If `__repr__()` returns the same string for two operators of the same type that were built with different parameters, 
the operators cannot be told apart, which easily leads to runtime errors or wrong measurements!**
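
One way to catch colliding representations early is to build the layer over a few parameter combinations and compare the `repr` strings. The sketch below is only an illustration: `Block` is a hypothetical plain-Python stand-in for a custom layer (so it runs without PyTorch), and `find_repr_collisions` is not part of MMT.

```python
from itertools import product


class Block:
    """Hypothetical stand-in for a custom layer; not part of MMT."""

    def __init__(self, in_channels, out_channels, stride, kernel):
        self.name = "ResNetBasicBlock-%d-%d-%d-%d-" % (
            in_channels, out_channels, stride, kernel)

    def __repr__(self):
        return self.name


def find_repr_collisions(configs):
    """Return repr strings that are shared by more than one configuration."""
    seen = {}
    for cfg in configs:
        seen.setdefault(repr(Block(*cfg)), []).append(cfg)
    return {name: cfgs for name, cfgs in seen.items() if len(cfgs) > 1}


# Eight configurations: 2 in_channels x 2 out_channels x 1 stride x 2 kernels
configs = list(product([64, 128], [64, 128], [1], [3, 5]))
collisions = find_repr_collisions(configs)
print(collisions or "all representations are unique")
```

If a parameter were left out of the format string (for example `kernel`), the returned dictionary would list the configurations that share a name.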

[See how to modify your model](docs/configuration.md)
### 2.2 Write an operator description file
The latency of an operator is determined by three things: 
the operator type, its instantiation parameters, 
and the input shape. 
Describe the operator space 
in the following format:
```yaml
resnet18:
    ResNetBasicBlock:
        in_channels: [64, 128, 256, 512]
        out_channels: [64, 128, 256, 512]
        stride: [1]
        kernel: [3, 5, 7]
        input_shape: [[1, 64, 112, 112], [1, 128, 56, 56], [1, 256, 28, 28], [1, 512, 14, 14]]

torch.nn:
    Conv2d:
        in_channels: [3]
        out_channels: [64]
        kernel_size: [7]
        stride: [2]
        padding: [3]
        input_shape: [[1, 3, 224, 224]]

    BatchNorm2d:
        num_features: [64]
        input_shape: [[1, 64, 112, 112]]

    ReLU:
        no_params: true
        input_shape: [[1, 64, 112, 112]]
```
[Refer to how to describe your operator](docs/configuration.md)
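
To get a feel for how large such an operator space can become, the lists in the `ResNetBasicBlock` entry can be expanded by hand. The sketch below assumes every listed value is combined with every other (a Cartesian product) and then filters out combinations whose input channel count does not match `in_channels`. Whether MMT enumerates or filters the space this way is not stated in this document; `space`, `expand`, and the filter are illustrative only.

```python
from itertools import product

# Hypothetical in-memory copy of the ResNetBasicBlock entry shown above.
space = {
    "in_channels": [64, 128, 256, 512],
    "out_channels": [64, 128, 256, 512],
    "stride": [1],
    "kernel": [3, 5, 7],
    "input_shape": [[1, 64, 112, 112], [1, 128, 56, 56],
                    [1, 256, 28, 28], [1, 512, 14, 14]],
}


def expand(space):
    """Yield one dict per combination of the listed values (assumed Cartesian product)."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


all_configs = list(expand(space))
# Keep only combinations whose input channel dimension matches the layer.
valid = [c for c in all_configs if c["input_shape"][1] == c["in_channels"]]
print(len(all_configs), len(valid))
```

Under these assumptions the entry yields 192 raw combinations, 48 of which have a consistent channel count, so keeping the value lists short keeps the measurement step manageable.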
### 2.3 Create a list of operators and export the operators to MNN format

```python
from mmt.converter import generate_ops_list

generate_ops_list("ops.yaml", "/path/ops_folder")
```
`ops.yaml` is the operator description file and 
`/path/ops_folder` is the directory where 
the exported operators are saved. A 
`meta.pkl` file is also generated in this directory 
to store the operators' metadata.

### 2.4 Measure operator latency on the deployment side and build an operator latency table

```python
from mmt.meter import meter_ops

meter_ops("./ops", times=100)
```
`./ops` is the folder that contains the exported operators and `meta.pkl`, 
and `times` is the number of repeated test runs. 
Running the script above measures the latency of each operator 
and saves the operator latency table 
as `./ops/meta_latency.pkl`. This file 
records the metadata and the corresponding 
latency of all operators.
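
What a repeat count such as `times` usually controls in a benchmark can be illustrated with a generic timing loop. This is a sketch and not MMT's measurement code: `measure_latency_ms`, the `warmup` parameter, and the use of a simple mean are all assumptions made for illustration.

```python
import time


def measure_latency_ms(fn, times=100, warmup=10):
    """Run fn repeatedly and return the mean wall-clock latency in milliseconds."""
    for _ in range(warmup):  # warm-up runs are executed but not timed
        fn()
    start = time.perf_counter()
    for _ in range(times):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / times * 1000.0


# Hypothetical workload standing in for one operator's forward pass.
latency = measure_latency_ms(lambda: sum(range(10_000)), times=100)
print("mean latency: %.4f ms" % latency)
```

A larger repeat count smooths out scheduling noise at the cost of a longer measurement run.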

### 2.5 Predicting model latency on the server side

```python
from mmt.parser import predict_latency

...
model = ResNet18()
pred_latency = predict_latency(model, path, [1, 3, 224, 224], verbose=False)
```
`path` is the path to `meta_latency.pkl`. 
Note that the shape of the input tensor must 
match the `input_shape` set in the operator 
description file.
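
The shape requirement is easier to see with a toy latency table. The sketch below assumes that a table maps an (operator representation, input shape) pair to a measured latency and that a prediction is the sum of the matching entries. The key format, the example values, and `predict_from_table` are hypothetical and do not describe the actual layout of `meta_latency.pkl`.

```python
# Hypothetical latency table: (operator repr, input shape) -> latency in ms.
latency_table = {
    ("Conv2d-3-64-7-2-3", (1, 3, 224, 224)): 1.5,
    ("BatchNorm2d-64", (1, 64, 112, 112)): 0.25,
    ("ReLU", (1, 64, 112, 112)): 0.25,
}


def predict_from_table(ops, table):
    """Sum the table latency of each (repr, input_shape) pair; fail loudly on a miss."""
    total = 0.0
    for key in ops:
        if key not in table:
            raise KeyError("no measured latency for operator %r" % (key,))
        total += table[key]
    return total


ops_in_model = list(latency_table)  # pretend the model consists of exactly these three ops
print(predict_from_table(ops_in_model, latency_table))
```

With such a table, an operator that sees an unmeasured input shape, such as `("ReLU", (1, 64, 56, 56))`, has no entry to look up, which is why the prediction input has to agree with the shapes in the operator description file.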

## 3 Test the prediction error of MMT
Coming soon.

