Metadata-Version: 2.1
Name: accelo
Version: 0.0.1
Summary: UNKNOWN
Home-page: https://acceldata.io
Author: Acceldata
Author-email: support@acceldata.io
License: BSD
Description: # Acceldata ML Observability SDK
        
        The SDK helps data organizations track the ML models and data that deliver business value.
        
        ## Pre-requisites
        - Register with the Acceldata Data Observability Cloud platform (driven through the Acceldata Cloud Platform)
        - Enable the ML Observability toolkit (driven through the Acceldata Cloud Platform)
        - Generate API keys (driven through the Acceldata ML Observability UI)
        - Set up env vars
        ```bash
        export CLOUD_ACCESS_KEY=XXXX0000
        export CLOUD_SECRET_KEY=XXXX0000
        export ACCELO_API_ACCESS_KEY=XXXX0000
        export ACCELO_API_SECRET_KEY=XXXX0000
        export ACCELO_API_ENDPOINT=https://some_acceldata_endpoint
        ```
        - Install the SDK
        ```bash
        pip install accelo
        ```
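        
        Since the SDK depends on the environment variables listed above, it can help to check them before a pipeline starts. The following is a minimal sketch, not part of the SDK: the helper name `missing_env_vars` is hypothetical, and only the variable names come from the export statements above.
        
        ```python
        import os
        
        # Variable names taken from the export statements above.
        REQUIRED_ENV_VARS = [
            "CLOUD_ACCESS_KEY",
            "CLOUD_SECRET_KEY",
            "ACCELO_API_ACCESS_KEY",
            "ACCELO_API_SECRET_KEY",
            "ACCELO_API_ENDPOINT",
        ]
        
        
        def missing_env_vars(environ=None):
            """Return the required variable names that are unset or empty.
        
            Hypothetical helper for illustration; it is not an SDK function.
            """
            env = os.environ if environ is None else environ
            return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
        
        
        missing = missing_env_vars()
        if missing:
            print("Missing environment variables: " + ", ".join(missing))
        else:
            print("All required environment variables are set.")
        ```
        
        Running a check like this at the start of a training or serving job surfaces configuration problems before any API call is made.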
        
        Ready, set, go!
        
        ## Sample Usage Patterns
        Before we delve into code, let's look at a typical pattern for using the SDK.
        
        ### Project Creation
        #### Modes
        1. UI - Users can create projects via the Catalog UI, where they can choose either a model view or a project view.
        2. API - Users can create a project from within their training pipeline. If the project already exists, the API raises a custom error that can be caught so that the training pipeline does not fail.
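        
        The API mode can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: `ProjectAlreadyExistsError` and `StubClient` are hypothetical stand-ins (the name of the SDK's custom error is not given here), and only the `create_project(name=..., description=...)` call shape is taken from this document.
        
        ```python
        class ProjectAlreadyExistsError(Exception):
            """Hypothetical stand-in for the SDK's custom 'project exists' error."""
        
        
        class StubClient:
            """Hypothetical in-memory stand-in for the SDK client, so the sketch runs."""
        
            def __init__(self):
                self._projects = {}
        
            def create_project(self, name, description=""):
                if name in self._projects:
                    raise ProjectAlreadyExistsError(name)
                self._projects[name] = description
                return name
        
        
        def ensure_project(client, name, description=""):
            """Create the project, treating 'already exists' as success.
        
            Returns True if the project was created, False if it already existed.
            """
            try:
                client.create_project(name=name, description=description)
                return True
            except ProjectAlreadyExistsError:
                return False
        
        
        client = StubClient()
        print(ensure_project(client, "marketing-team"))  # first run creates the project
        print(ensure_project(client, "marketing-team"))  # a rerun does not fail the pipeline
        ```
        
        Catching only the specific error, rather than every exception, keeps genuine failures such as bad credentials visible.
        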
        ### Model Registration and Baseline logging (training pipeline)
        - The user registers a model against a project.
        - The model registration API expects the project id, the model name, and other metadata that can be used to track models on the Catalog UI.
        ### Prediction logging (serving pipeline)
        - The serving pipeline can be used to log predictions to the Acceldata datastore.
        - The API expects model id, model version, and predictions along with their id columns as mandatory params.
        ### Actual logging (actuals pipeline)
        The actuals for any features may arrive at a later point, and the API provides two ways to log them.
        - UUIDs: generated by the API during the serving pipeline stage. Users are expected to keep track of them and map them to the appropriate actuals.
        - ID COLUMNS: If users specify certain columns to be considered as the IDs, the API can automatically log the actuals against those IDs, and the backend services can compare the actuals to the predictions based on these ID COLUMNS.
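        
        For the UUID approach, the bookkeeping on the user's side might look like the sketch below. It uses plain Python and illustrative values. The helper names `build_uuid_index` and `match_actuals` are hypothetical, and the sketch assumes one UUID is returned per predicted row, in row order, which this document does not confirm.
        
        ```python
        def build_uuid_index(business_keys, prediction_uuids):
            """Map each business key (e.g. a campaign id) to the UUID returned at prediction time."""
            if len(business_keys) != len(prediction_uuids):
                raise ValueError("expected one UUID per predicted row")
            return dict(zip(business_keys, prediction_uuids))
        
        
        def match_actuals(uuid_index, actuals_by_key):
            """Pair each actual with its prediction UUID; collect keys with no logged prediction."""
            matched = {}
            unmatched = []
            for key, actual in actuals_by_key.items():
                if key in uuid_index:
                    matched[uuid_index[key]] = actual
                else:
                    unmatched.append(key)
            return matched, unmatched
        
        
        # Illustrative values only; real UUIDs would come from the prediction-logging call.
        index = build_uuid_index(["c-1", "c-2"], ["uuid-a", "uuid-b"])
        matched, unmatched = match_actuals(index, {"c-2": 1, "c-3": 0})
        print(matched)    # {'uuid-b': 1}
        print(unmatched)  # ['c-3']
        ```
        
        In practice the index would need to be persisted between the serving and actuals pipelines, for example in a table keyed by the business identifier.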
        
        **Note**: Please refer to the API documentation for more information. 
        
        ## Basic APIs
        Finally, let's see how you can integrate the SDK into your production code pipelines. Below are some examples of how a Data Scientist or ML Engineer can add the SDK to 
        existing ML code and observe the models using the Acceldata ML Observability platform.
        
        ### Import the library
        ```python
        from accelo_mlops import AcceloClient
        ``` 
        
        ### Creating a client with a workspace
        The workspace is the top-level name that you want to associate with your organization. 
        It can also be thought of as a tenant name. 
        ```python
        client = AcceloClient(workspace='your_organization_name')
        ```
        
        ### Creating a Project
        In code, the atomic unit is a `Project`. The project name can be a team name, a domain name within 
        a company, or any other logical separation of Data Science groups.
        
        ```python
        client.create_project(name='marketing-team', 
                              description='All models related to the marketing team reside here. '
        )
        ```
        
        ### Register a Model
        Now, assume that you have developed a model that you want to observe using the Acceldata ML Observability platform, 
        and that the model object is called `classifier`.  
        ```python
        model_metadata = {
            'frequency': 'DAILY', 
            'model_type': 'binary_classification',
            'performance_metric': 'f1_score', 
            'model_obj': classifier
        }
        additional_params = {
            'owner': 'research@preview.com',
            'last_trained': '2021-08-01',
            'training_job_name': 'click_prediction_ml_pipeline',
            'label': 'flower_type',
            'total_consumers': 2
        }
        
        client.register_model(project_id=12, 
                              model_name='click_prediction_model', 
                              model_version='v1', 
                              model_metadata=model_metadata, 
                              **additional_params
        )
        ```
        
        Let's see what the variables above mean.
        - **classifier**: this is the model object.
        - **model_metadata**: this is a mandatory dictionary that users pass to the register model call to make the most of the ML Observability platform. 
        - **additional_params**: this is an optional dictionary that users can use to log any additional details about the model that might be useful when viewed in the ML Catalog.   
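        
        Because `model_metadata` is mandatory, a pipeline may want to check it before calling the registration API. The sketch below is an assumption-laden illustration: it treats the four keys shown in the example above as the expected ones, which the SDK may or may not enforce, and `missing_metadata_keys` is a hypothetical helper rather than an SDK function.
        
        ```python
        # Assumption: these are the keys shown in the registration example above;
        # the SDK's actual list of required keys may differ.
        EXPECTED_METADATA_KEYS = ("frequency", "model_type", "performance_metric", "model_obj")
        
        
        def missing_metadata_keys(model_metadata):
            """Return the expected keys that are absent or set to None."""
            return [key for key in EXPECTED_METADATA_KEYS if model_metadata.get(key) is None]
        
        
        metadata = {
            "frequency": "DAILY",
            "model_type": "binary_classification",
            "performance_metric": "f1_score",
        }
        print(missing_metadata_keys(metadata))  # ['model_obj']
        ```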
        
        Now, it's time to log the data that was used to train the model. 
        ### Log baseline data
        ```python
        client.log_baseline(
            model_id=client.model_id,
            model_version='v1',
            baseline_data=X_train,
            labels=y_train,
            label_name='click',
            id_cols=['campaign_id'],
            publish_date='2021-08-02'
        )
        ```
        This API call logs your baseline data to the Acceldata data store, where it is used for the analyses that you sign up for. 
        
        ### Log predictions
        ```python
        ids = client.log_predictions(
            model_id=client.model_id,
            model_version='v1',
            feature_data=feature_data,
            predictions=preds,
            publish_date='2021-06-02'
        )
        ```
        
        **Note**: As of now, only batch predictions are supported. Support for logging online 
        predictions is planned. 
        
        ### Log actuals
        Later, when the actuals arrive, you can log them using the API below.
        ```python
        client.log_actuals(
            model_id=client.model_id,
            model_version='v1',
            id_cols_df=id_columns_frame,
            actuals=y_test,
            publish_date='2021-06-03'
        )
        ```
        
        You are now done logging both metadata and the data itself. 
        
        Detailed activity logs can be viewed in the `ad-mlops.log` file in the directory where your code file exists. The location of the log file is configurable.
        
        ## What happens after you create a project and register a model?
        ### Metadata
        The model and the other metadata are now part of Acceldata ML Catalog and can be viewed on the UI. 
        
        ### Data
        The `baseline`, `prediction`, and `actual` data are logged into the Acceldata Store. This data will be used for further analysis. 
        
        ### Dashboard
        You can track model performance, data drifts, etc. by visiting the dashboard. 
        
        ### Alerts
        You can set alerts on charts, generate reports, etc. using the dashboard or the catalog.
        
        ## Contact Us
        Please get in touch with us at `contact@acceldata.io` for access to Acceldata catalog, dashboard, and assistance with bringing ML Observability into your organization.   
        
Keywords: accelo
Platform: UNKNOWN
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
