Metadata-Version: 2.1
Name: netml
Version: 0.0.2
Summary: Network anomaly detection via machine learning
Home-page: https://github.com/chicago-cdac/netml
License: UNKNOWN
Description: # netml
        
        `netml` is a network anomaly detection library written in Python.
        
        This library contains two primary submodules:
        
        * pcap parser: `pparser`\
        `pparser` is for parsing pcaps to flow features, using [Scapy](https://scapy.net/).
        
        * novelty detection modeling: `ndm`\
        `ndm` is for detecting novelty / anomaly, via different models, such as OCSVM.
        
            
        ## Installation
        
        `netml` is available on PyPI:
        
            pip install netml
        
        Or, from a repository clone:
        
            pip install .
        
        
        ## Use
        
        ### PCAP to features
        
        ```python3
        import os
        
        from netml.pparser.parser import PCAP
        from netml.utils.tool import dump_data
        
        RANDOM_STATE = 42
        
        pcap_file = 'data/demo.pcap'
        pp = PCAP(pcap_file, flow_ptks_thres=2, verbose=10, random_state=RANDOM_STATE)
        
        # extract flows from pcap
        pp.pcap2flows(q_interval=0.9)
        
        # label each flow with a label
        label_file = 'data/demo.csv'
        pp.label_flows(label_file=label_file)
        
        # extract features from each flow given feat_type
        feat_type = 'IAT'
        pp.flow2features(feat_type, fft=False, header=False)
        
        # dump data to disk
        X, y = pp.features, pp.labels
        out_dir = os.path.join('out', os.path.dirname(pcap_file))
        dump_data((X, y), out_file=f'{out_dir}/demo_{feat_type}.dat')
        
        print(pp.features.shape, pp.pcap2flows.tot_time, pp.flow2features.tot_time)
        ```
        
        ### Novelty detection
        
        ```python3
        import os
        
        from sklearn.model_selection import train_test_split
        
        from netml.ndm.model import MODEL
        from netml.ndm.ocsvm import OCSVM
        from netml.utils.tool import dump_data, load_data
        
        RANDOM_STATE = 42
        
        # load data
        data_file = 'out/data/demo_IAT.dat'
        X, y = load_data(data_file)
        # split train and test test
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=RANDOM_STATE)
        
        # create detection model
        model = OCSVM(kernel='rbf', nu=0.5, random_state=RANDOM_STATE)
        model.name = 'OCSVM'
        ndm = MODEL(model, score_metric='auc', verbose=10, random_state=RANDOM_STATE)
        
        # learned the model from the train set
        ndm.train(X_train, y_train)
        
        # evaluate the learned model
        ndm.test(X_test, y_test)
        
        # dump data to disk
        out_dir = os.path.dirname(data_file)
        dump_data((model, ndm.history), out_file=f'{out_dir}/{ndm.model_name}-results.dat')
        
        print(ndm.train.tot_time, ndm.test.tot_time, ndm.score)
        ```
        
        For more examples, see the `examples/` directory in the source repository.
        
        
        ## Architecture
        
        - docs/: 
            includes all documents (such as APIs)
        - examples/: 
            includes toy examples and datasets for you to play with it 
        - ndm/: 
            includes different detection models (such as OCSVM)
        - pparser/: 
            includes pcap propcess (feature extraction from pcap) 
        - scripts/: 
            others (such as xxx.sh, make) 
        - tests/: 
            includes test cases
        - utils/: 
            includes common functions (such as load data and dump data)
        - visul/: 
            includes visualization functions
        - LICENSE.txt
        - readme.md
        - requirements.txt
        - setup.py
        
        
        ## To Do
        
        The current version just implements basic functions. We still need to further evaluate and optimize them continually. 
        
        - Evaluate 'pparser' performance on different pcaps
        - Add 'test' cases
        - Add license
        - Add more examples
        - Generated docs from docs-string automatically
        
        Welcome to make any comments to make it more robust and easier to use!
        
        
        ## Development
        
        Development dependencies may be installed via the `dev` extras (below assuming a source checkout):
        
            pip install --editable .[dev]
        
        (Note: the installation flag `--editable` is also used above to instruct `pip` to place the source checkout directory itself onto the Python path, to ensure that any changes to the source are reflected in Python imports.)
        
        Development tasks are then managed via [`argcmdr`](https://github.com/dssg/argcmdr) sub-commands of `manage …`, (as defined by the repository module `manage.py`), _e.g._:
        
            manage bump patch -m "initial release of netml" --build --release
        
        
        ## Thanks
        
        `netml` is based on the initial work of the ["Outlier Detection" library `odet`](https://github.com/Learn-Live/odet) 🙌
        
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: System :: Networking :: Monitoring
Requires-Python: >=3.7.3,<4
Description-Content-Type: text/markdown
Provides-Extra: dev
