Configuration Setup JSON
------------------------
During sensor and reference setup, a record of the setup configuration
is saved locally to a ``setup.json`` file. This file tells the ingestion
module how to convert the recorded data into the
`sensortoolkit Data Formatting Scheme (SDFS) <../sdfs/index.html>`_.

This file is passed to the ``sensortoolkit.sensor_ingest.standard_ingest()`` subroutine,
which imports the recorded dataset and converts headers and date/time-like columns to SDFS formatting.
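
Because ``setup.json`` is plain JSON, it can also be inspected directly with the
Python standard library. The following is a minimal sketch and not part of
sensortoolkit: the helper names ``load_setup`` and ``summarize_setup`` are
hypothetical, and the inline values stand in for the contents of a real file.
Only the key names are taken from the examples in this section.

.. code-block:: python

  import json


  def load_setup(path):
      """Read a setup.json file into a dictionary."""
      with open(path, encoding="utf-8") as f:
          return json.load(f)


  def summarize_setup(setup):
      """Collect a few top-level entries (hypothetical helper)."""
      return {
          "data_type": setup.get("data_type"),
          "file_extension": setup.get("file_extension"),
          "header_row": setup.get("header_iloc"),
          "sdfs_params": setup.get("sdfs_header_names", []),
          "n_files": len(setup.get("file_list", [])),
      }


  # Inline stand-in for the contents of a setup.json file (illustrative values).
  example = json.loads("""
  {
      "data_type": "sensor",
      "file_extension": ".csv",
      "header_iloc": 5,
      "sdfs_header_names": ["NO2", "O3"],
      "file_list": ["a.csv", "b.csv"]
  }
  """)
  summary = summarize_setup(example)
  print(summary)

For a file on disk, ``summarize_setup(load_setup(path))`` would give the same kind of summary.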

Sensor ``setup.json``
^^^^^^^^^^^^^^^^^^^^^
``setup.json`` files for air sensors are generated by running the
``sensortoolkit.AirSensor.sensor_setup()`` method and contain information
about the recorded sensor datasets that is used by the standard ingestion module.

Because sensors often record data with different formatting and header naming schemes,
these files assist in converting data from its original format into the SDFS
scheme for parameter names and date/time formatting.

The sensor setup.json file is named ``[sensor_name]_setup.json``, where ``[sensor_name]``
is the name assigned to the sensor via ``sensor.name``. This file is located within
the user's project directory at the following relative path:
``\Data and Figures\sensor_data\[sensor_name]\[sensor_name]_setup.json``

Below is an example setup.json file for a sensor named ``Toco_Toucan``.

.. code-block:: json

  {
      "path": "C:/Users/.../Documents/toucan_evaluation",
      "data_rel_path": "/data/sensor_data/Toco_Toucan/raw_data",
      "data_type": "sensor",
      "file_extension": ".csv",
      "header_iloc": 5,
      "data_row_idx": null,
      "sdfs_header_names": [
          "NO2",
          "O3",
          "PM25",
          "Temp",
          "RH",
          "DP"
      ],
      "col_headers": {
          "col_idx_0": {
              "Time": {
                  "sdfs_param": "DateTime",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "header_class": "datetime",
                  "drop": false,
                  "dt_format": "%Y/%m/%d %H:%M:%S",
                  "dt_timezone": "EST"
              }
          },
          "col_idx_1": {
              "NO2 (ppb)": {
                  "sdfs_param": "NO2",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_2": {
              "O3 (ppb)": {
                  "sdfs_param": "O3",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_3": {
              "PM2.5 (\u00b5g/m\u00b3)": {
                  "sdfs_param": "PM25",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_4": {
              "TEMP (\u00b0C)": {
                  "sdfs_param": "Temp",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_5": {
              "RH (%)": {
                  "sdfs_param": "RH",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_6": {
              "DP (\u00b0C)": {
                  "sdfs_param": "DP",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_7": {
              "Inlet": {
                  "sdfs_param": "",
                  "in_file_list_idx": [
                      0,
                      1,
                      2
                  ],
                  "header_class": "parameter",
                  "drop": true
              }
          }
      },
      "name": "Toco_Toucan",
      "dataset_kwargs": {
          "name": "Toco_Toucan"
      },
      "_dataset_selection": "files",
      "file_list": [
          "C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT01_raw.csv",
          "C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT02_raw.csv",
          "C:/Users/.../Documents/toucan_evaluation\\data\\sensor_data\\Toco_Toucan\\raw_data\\toco_toucan_RT03_raw.csv"
      ],
      "encoding_predictions": {},
      "serials": {
          "1": "RT01",
          "2": "RT02",
          "3": "RT03"
      },
      "number_of_sensors": 3
  }
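
The nested ``col_headers`` entries can be flattened into a rename mapping, a list
of dropped columns, and the date/time parsing details. The sketch below shows one
way to do this with the standard library; it is not sensortoolkit's own ingestion
code. The function name ``sdfs_column_plan`` and the sample timestamp are
hypothetical, while the keys and header names follow the example above.

.. code-block:: python

  from datetime import datetime


  def sdfs_column_plan(col_headers):
      """Split a col_headers mapping into renames, drops, and datetime info."""
      rename, drop, datetime_cols = {}, [], {}
      # Each col_idx_N entry maps one original header to its attributes.
      for entry in col_headers.values():
          for header, attrs in entry.items():
              if attrs.get("drop"):
                  drop.append(header)
                  continue
              rename[header] = attrs["sdfs_param"]
              if attrs.get("header_class") == "datetime":
                  datetime_cols[header] = (attrs["dt_format"], attrs["dt_timezone"])
      return rename, drop, datetime_cols


  # Subset of the col_headers section from the sensor example.
  col_headers = {
      "col_idx_0": {"Time": {"sdfs_param": "DateTime",
                             "header_class": "datetime",
                             "drop": False,
                             "dt_format": "%Y/%m/%d %H:%M:%S",
                             "dt_timezone": "EST"}},
      "col_idx_1": {"NO2 (ppb)": {"sdfs_param": "NO2",
                                  "header_class": "parameter",
                                  "drop": False}},
      "col_idx_7": {"Inlet": {"sdfs_param": "",
                              "header_class": "parameter",
                              "drop": True}},
  }
  rename, drop, datetime_cols = sdfs_column_plan(col_headers)

  # Parse a made-up timestamp with the format string stored for the "Time" column.
  dt_format, dt_timezone = datetime_cols["Time"]
  timestamp = datetime.strptime("2021/07/01 13:05:00", dt_format)

Here ``rename`` maps ``"Time"`` to ``"DateTime"`` and ``"NO2 (ppb)"`` to ``"NO2"``,
and ``drop`` contains only ``"Inlet"``. The parsed timestamp is timezone-naive;
attaching the ``dt_timezone`` value is left out of this sketch.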

Reference ``setup.json``
^^^^^^^^^^^^^^^^^^^^^^^^

The reference setup.json file is named ``reference_setup.json`` and is located within
the user's project directory at the following relative path:
``\Data and Figures\reference_data\[data_type]\[site_name]_[site_id]\reference_setup.json``,
where ``[data_type]`` is the name of the reference data source (e.g., 'airnowtech' or 'local'),
``[site_name]`` is the name of the monitoring site with spaces replaced by '_', and
``[site_id]`` is the AQS site ID (if applicable).
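
The naming rule above can be sketched as a small path helper. This is an
illustration under stated assumptions, not sensortoolkit code: the function name
``reference_setup_relpath`` is hypothetical, hyphens are stripped from the AQS ID
to mirror the ``fmt_site_aqs`` value in the reference example in this section, and
omitting the ID suffix when no site ID is given is a guess.

.. code-block:: python

  from pathlib import PurePath


  def reference_setup_relpath(data_type, site_name, site_id=None):
      """Build the relative path to reference_setup.json (hypothetical helper)."""
      # Spaces in the site name are replaced by underscores.
      fmt_name = site_name.strip().replace(" ", "_")
      # Assumption: hyphens are removed from the AQS ID ("37-063-0099" -> "370630099").
      fmt_id = site_id.replace("-", "") if site_id else ""
      # Assumption: without a site ID, the folder is named after the site only.
      folder = f"{fmt_name}_{fmt_id}" if fmt_id else fmt_name
      return PurePath("Data and Figures", "reference_data", data_type,
                      folder, "reference_setup.json")


  rel_path = reference_setup_relpath("local", "Burdens Creek", "37-063-0099")
  print(rel_path)

``PurePath`` joins the components with the separator of the current platform, so
the printed path uses backslashes on Windows and forward slashes elsewhere.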

Below is an example reference_setup.json for a reference monitor dataset corresponding
to EPA's RTP campus ambient monitoring site for air sensor testing. The sensor and
reference setup.json files share many attributes. The reference file also contains
attributes specific to the reference monitor or monitoring site (for example ``agency``,
``site_name``, ``site_aqs``, ``site_lat``, ``site_lon``, and per-parameter entries such as
``PM25_Method``) that are important for creating a processed (SDFS-formatted) version
of the reference dataset.

.. code-block:: json

  {
      "path": "C:\\Users\\...\\Documents\\sensortoolkit_testing",
      "data_rel_path": "/data/reference_data/local/raw/Burdens_Creek_370630099/",
      "data_type": "reference",
      "file_extension": ".csv",
      "header_iloc": 2,
      "data_row_idx": null,
      "sdfs_header_names": [
          "PM25",
          "PM10"
      ],
      "col_headers": {
          "col_idx_0": {
              "Date & Time": {
                  "sdfs_param": "DateTime",
                  "in_file_list_idx": [0, 1],
                  "header_class": "datetime",
                  "drop": false,
                  "dt_format": "%-m/%-d/%Y %-I:%M %p",
                  "dt_timezone": "EST"
              }
          },
          "col_idx_1": {
              "Grimm PM2.5": {
                  "sdfs_param": "",
                  "in_file_list_idx": [0, 1],
                  "header_class": "parameter",
                  "drop": true
              }
          },
          "col_idx_2": {
              "Grimm PM10": {
                  "sdfs_param": "",
                  "in_file_list_idx": [0, 1],
                  "header_class": "parameter",
                  "drop": true
              }
          },
          "col_idx_3": {
              "T640_2_PM25": {
                  "sdfs_param": "PM25",
                  "in_file_list_idx": [0, 1],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          },
          "col_idx_4": {
              "T640_2_PM10": {
                  "sdfs_param": "PM10",
                  "in_file_list_idx": [0, 1],
                  "unit_transform": null,
                  "header_class": "parameter",
                  "drop": false
              }
          }
      },
      "dataset_kwargs": {
          "ref_data_source": "local",
          "site_name": "Burdens_Creek",
          "site_aqs": "370630099"
      },
      "agency": "OAQPS",
      "site_name": "Burdens Creek",
      "site_aqs": "37-063-0099",
      "site_lat": "35.88",
      "site_lon": "-78.87",
      "fmt_site_name": "Burdens_Creek",
      "fmt_site_aqs": "370630099",
      "ref_data_subfolder": "Burdens_Creek_370630099",
      "_dataset_selection": "files",
      "file_list": [
          "C:\\Users\\...\\Documents\\sensortoolkit_testing\\data\\reference_data\\local\\raw\\Burdens_Creek_370630099\\min_201908_PM.csv",
          "C:\\Users\\...\\Documents\\sensortoolkit_testing\\data\\reference_data\\local\\raw\\Burdens_Creek_370630099\\min_201909_PM.csv"
      ],
      "PM25_Unit": "Micrograms/cubic meter (LC)",
      "PM25_Param_Code": "Micrograms/cubic meter (LC)",
      "PM25_Method_Code": 238,
      "PM25_Method": "Teledyne T640X at 16.67 LPM",
      "PM25_Method_POC": "1",
      "PM10_Unit": "Micrograms/cubic meter (LC)",
      "PM10_Param_Code": "Micrograms/cubic meter (LC)",
      "PM10_Method_Code": 239,
      "PM10_Method": "Teledyne API T640X at 16.67 LPM",
      "PM10_Method_POC": "1"
  }
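
The per-parameter entries at the end of the file follow a ``[parameter]_[attribute]``
naming pattern, so they can be grouped by SDFS parameter name. The sketch below
assumes that pattern holds for every name listed in ``sdfs_header_names``; the
function name ``parameter_metadata`` is hypothetical and the dictionary is a subset
of the example above.

.. code-block:: python

  def parameter_metadata(setup):
      """Group [parameter]_[attribute] keys by SDFS parameter name."""
      grouped = {}
      for param in setup.get("sdfs_header_names", []):
          prefix = f"{param}_"
          # Keep only keys that start with this parameter's prefix and strip it.
          grouped[param] = {
              key[len(prefix):]: value
              for key, value in setup.items()
              if key.startswith(prefix)
          }
      return grouped


  # Subset of the reference example.
  setup = {
      "sdfs_header_names": ["PM25", "PM10"],
      "site_name": "Burdens Creek",
      "PM25_Unit": "Micrograms/cubic meter (LC)",
      "PM25_Method_Code": 238,
      "PM25_Method_POC": "1",
      "PM10_Unit": "Micrograms/cubic meter (LC)",
      "PM10_Method_Code": 239,
      "PM10_Method_POC": "1",
  }
  metadata = parameter_metadata(setup)

Site-level entries such as ``site_name`` do not match any parameter prefix and are
left out of the grouped result.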
