kedro.extras.datasets.api.APIDataSet

class kedro.extras.datasets.api.APIDataSet(url, method='GET', data=None, params=None, headers=None, auth=None, json=None, timeout=60, credentials=None)

APIDataSet loads data from HTTP(S) APIs. It uses the Python requests library: https://requests.readthedocs.io/en/master/

Example:
    from kedro.extras.datasets.api import APIDataSet

    data_set = APIDataSet(
        url="https://quickstats.nass.usda.gov",
        params={
            "key": "SOME_TOKEN",
            "format": "JSON",
            "commodity_desc": "CORN",
            "statisticcat_des": "YIELD",
            "agg_level_desc": "STATE",
            "year": 2000
        }
    )
    data = data_set.load()
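To make the role of params more concrete, the sketch below builds the query string for the same values using only the Python standard library. It is an illustration of how URL parameters are encoded in general, not the code path APIDataSet or requests actually runs; requests prepares URLs itself, so the exact URL it sends may differ in small details.

```python
from urllib.parse import urlencode

# Same values as the example above; "SOME_TOKEN" is a placeholder, not a real key.
url = "https://quickstats.nass.usda.gov"
params = {
    "key": "SOME_TOKEN",
    "format": "JSON",
    "commodity_desc": "CORN",
    "statisticcat_des": "YIELD",
    "agg_level_desc": "STATE",
    "year": 2000,
}

# urlencode joins the mapping into "name=value" pairs separated by "&",
# converting non-string values such as the integer year to text.
query_string = urlencode(params)

# Assumption for illustration: the query is appended directly after "?".
full_url = f"{url}?{query_string}"
print(full_url)
```

Keeping the parameters in a dictionary, rather than concatenating them into the URL by hand, leaves escaping of special characters to the library.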
Methods

exists() – Checks whether a data set’s output already exists by calling the provided _exists() method.
from_config(name, config[, load_version, …]) – Create a data set instance using the configuration provided.
load() – Loads data by delegation to the provided load method.
release() – Release any cached data.
save(data) – Saves data by delegation to the provided save method.

__init__(url, method='GET', data=None, params=None, headers=None, auth=None, json=None, timeout=60, credentials=None)

Creates a new instance of APIDataSet to fetch data from an API endpoint.

Parameters:
- url (str) – The API URL endpoint.
- method (str) – The HTTP method of the request: GET, POST, PUT, DELETE, HEAD, etc.
- data (Optional[Any]) – The request payload, used for POST, PUT, etc. requests. https://requests.readthedocs.io/en/master/user/quickstart/#more-complicated-post-requests
- params (Optional[Dict[str, Any]]) – The URL parameters of the API. https://requests.readthedocs.io/en/master/user/quickstart/#passing-parameters-in-urls
- headers (Optional[Dict[str, Any]]) – The HTTP headers. https://requests.readthedocs.io/en/master/user/quickstart/#custom-headers
- auth (Union[Iterable[str], AuthBase, None]) – Anything requests accepts. Normally it is either ('login', 'password'), or an AuthBase or HTTPBasicAuth instance for more complex cases. Any iterable will be cast to a tuple.
- json (Union[List, Dict[str, Any], None]) – The request payload, used for POST, PUT, etc. requests, passed to the json kwarg of the requests call. https://requests.readthedocs.io/en/master/user/quickstart/#more-complicated-post-requests
- timeout (int) – The wait time in seconds for a response. Defaults to 60 (1 minute). https://requests.readthedocs.io/en/master/user/quickstart/#timeouts
- credentials (Union[Iterable[str], AuthBase, None]) – Same as auth. Allows specifying auth secrets in credentials.yml.

Raises:
- ValueError – If both credentials and auth are specified.
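The rules stated for auth and credentials (they are mutually exclusive, and any iterable is cast to a tuple) can be sketched as a small standalone function. The helper name resolve_auth and the error message are hypothetical and chosen for illustration; this is not kedro's implementation, only a restatement of the documented behaviour.

```python
from collections.abc import Iterable
from typing import Any, Optional


def resolve_auth(auth: Optional[Any] = None, credentials: Optional[Any] = None) -> Any:
    """Hypothetical helper mirroring the documented auth/credentials rules."""
    # Documented: specifying both arguments is an error.
    if auth is not None and credentials is not None:
        raise ValueError("Specify either 'auth' or 'credentials', not both.")

    chosen = auth if auth is not None else credentials

    # Documented: any iterable (e.g. a list read from credentials.yml)
    # is cast to a tuple such as ('login', 'password').
    if isinstance(chosen, Iterable):
        return tuple(chosen)

    # Anything else (e.g. an AuthBase-style object, or None) passes through.
    return chosen


print(resolve_auth(credentials=["login", "password"]))
```

Casting to a tuple matters because values loaded from YAML arrive as lists, while requests expects a tuple for basic authentication.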

exists()

Checks whether a data set’s output already exists by calling the provided _exists() method.

Return type: bool
Returns: Flag indicating whether the output already exists.
Raises:
- DataSetError – When the underlying exists method raises an error.

classmethod from_config(name, config, load_version=None, save_version=None)

Create a data set instance using the configuration provided.

Parameters:
- name (str) – Data set name.
- config (Dict[str, Any]) – Data set config dictionary.
- load_version (Optional[str]) – Version string to be used for the load operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.
- save_version (Optional[str]) – Version string to be used for the save operation if the data set is versioned. Has no effect on the data set if versioning was not enabled.

Return type: AbstractDataSet
Returns: An instance of an AbstractDataSet subclass.
Raises:
- DataSetError – When the function fails to create the data set from its config.

load()

Loads data by delegation to the provided load method.

Return type: Any
Returns: Data returned by the provided load method.
Raises:
- DataSetError – When the underlying load method raises an error.
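The delegation described for load() (a public method that calls a subclass-provided load method and reports failures as DataSetError) can be pictured with the pattern below. All class names here, including SketchDataSetError, are hypothetical stand-ins; this is a minimal sketch of the pattern, not kedro's source code.

```python
from typing import Any


class SketchDataSetError(Exception):
    """Hypothetical stand-in for kedro's DataSetError."""


class SketchDataSet:
    """Hypothetical base class: public load() delegates to _load()."""

    def load(self) -> Any:
        try:
            return self._load()
        except SketchDataSetError:
            # Already the expected error type; re-raise unchanged.
            raise
        except Exception as exc:
            # Wrap any other failure so callers handle a single error type.
            raise SketchDataSetError(f"Failed while loading data: {exc}") from exc

    def _load(self) -> Any:
        raise NotImplementedError


class StaticDataSet(SketchDataSet):
    """Returns a fixed payload, standing in for a successful API call."""

    def __init__(self, payload: Any) -> None:
        self._payload = payload

    def _load(self) -> Any:
        return self._payload


class BrokenDataSet(SketchDataSet):
    """Simulates an unreachable endpoint."""

    def _load(self) -> Any:
        raise ConnectionError("endpoint unreachable")


print(StaticDataSet({"status": "ok"}).load())
```

With this structure, calling code needs to catch only one exception type, and the original cause stays available through the exception's __cause__ attribute.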

release()

Release any cached data.

Raises:
- DataSetError – When the underlying release method raises an error.
Return type: None

save(data)

Saves data by delegation to the provided save method.

Parameters:
- data (Any) – The value to be saved by the provided save method.

Raises:
- DataSetError – When the underlying save method raises an error.
- FileNotFoundError – When the save method got a file instead of a directory, on Windows.
- NotADirectoryError – When the save method got a file instead of a directory, on Unix.
Return type: None