Metadata-Version: 2.1
Name: oai_repo
Version: 0.3.3
Summary: OAI-PMH Repository Server
Home-page: https://github.com/MSU-Libraries/oai_repo
Author: Nate Collins
Author-email: npcollins@gmail.com
License: Apache License 2.0
Platform: UNKNOWN
Classifier: Intended Audience :: Developers
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE

# oai_repo
The `oai_repo` Python module provides a configurable implementation of an
[OAI-PMH](http://openarchives.org/OAI/openarchivesprotocol.html) compatible repository.

At its simplest, using `oai_repo` involves:
1. Implementing a `DataInterface` class to perform several pre-defined actions.
2. Adding a few lines of Python code similar to:
```python
import oai_repo
from .myoaidata import MyOAIData

# Create the repository, passing your implemented DataInterface
repo = oai_repo.OAIRepository(MyOAIData())

# Pass in URL arguments as a dict to be processed
response = repo.process( { "verb": "Identify" } )
print( type(response.root()) )  # lxml.etree.Element
print( bytes(response) )        # XML byte response
```
Resulting in a complete OAI response:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
    <responseDate>2022-03-24T05:50:06Z</responseDate>
    <request>https://d.lib.msu.edu/oai</request>
    <Identify>
        <repositoryName>MSU Libraries Digital Repository</repositoryName>
        <baseURL>https://d.lib.msu.edu/oai</baseURL>
        <protocolVersion>2.0</protocolVersion>
        <adminEmail>admin@example.edu</adminEmail>
        <earliestDatestamp>2012-08-21T13:49:50Z</earliestDatestamp>
        <deletedRecord>no</deletedRecord>
        <granularity>YYYY-MM-DDThh:mm:ssZ</granularity>
        <description>
            <oai-identifier xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai-identifier
              http://www.openarchives.org/OAI/2.0/oai-identifier.xsd">
                <scheme>oai</scheme>
                <repositoryIdentifier>d.lib.msu.edu</repositoryIdentifier>
                <delimiter>:</delimiter>
                <sampleIdentifier>oai:d.lib.msu.edu:123</sampleIdentifier>
            </oai-identifier>
        </description>
    </Identify>
</OAI-PMH>
```

## Features
* Completely customizable to work with any backend you have.
* Compliant with the OAI-PMH 2.0 specification.
* Easy to integrate within any Python application.

## Installation
Requires Python 3.10+

Installation via `pip` is recommended:
```
pip install oai_repo
```

## Implementing the DataInterface Class
For `oai_repo` to function, you must define a custom class that specifies
how to pull and process metadata for your repository. This custom class must be
of the type `oai_repo.DataInterface`.

TODO ToC for DataInterface methods and return classes

TODO mkdocstrings for DataInterface

TODO mkdocstrings for return classes:
- oai_data.Identify
- oai_data.MetadataFormat
- oai_data.RecordHeader
- oai_data.Set

## Available Helper Methods
To help you write your custom `DataInterface` implementation, `oai_repo`
includes a number of helpers.

TODO ToC for helpers

TODO mkdocstrings for helper functions

TODO mkdocstrings for Transform class

### URL/Path Pairs
To have the OAI repository load data dynamically, the config file allows for
querying an API and applying either JSONPath or XPath to the result. In the config
field list, this is specified by `url`/`*path`. It can be either of:

**URL/JSONPath**  
* `url` _(String)_: A URL to call
* `jsonpath` _(String)_: A JSONPath to call on the results of the URL, retrieving the first match.

**URL/XPath**  
* `url` _(String)_: A URL to call
* `xpath` _(String)_: An XPath to call on the results of the URL, retrieving the first match.
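
As a rough illustration of the "call a URL, then take the first match" idea, here is a minimal sketch using only the standard library. The function names `fetch`, `first_xpath_match`, and `first_json_match` are hypothetical and are not part of `oai_repo`. The JSON resolver handles only a tiny dotted subset of JSONPath (such as `$.repo.name`), and `xml.etree.ElementTree` supports only a limited XPath subset with relative paths, so treat this as a way to reason about the behavior rather than as the module's implementation.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def fetch(url, timeout=10):
    """Return the body of `url` decoded as UTF-8 text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def first_xpath_match(xml_text, xpath):
    """Return the text of the first element matching `xpath`, or None."""
    root = ET.fromstring(xml_text)
    found = root.find(xpath)  # ElementTree accepts relative paths only
    return None if found is None else found.text


def first_json_match(json_text, path):
    """Resolve a dotted path like `$.repo.sets.1`; return None if absent.

    This is a simplified stand-in for JSONPath, not a full implementation.
    """
    node = json.loads(json_text)
    keys = [k for k in path.removeprefix("$").split(".") if k]
    try:
        for key in keys:
            # Lists are indexed by integer position, dicts by key name
            node = node[int(key)] if isinstance(node, list) else node[key]
    except (KeyError, IndexError, ValueError, TypeError):
        return None
    return node
```

In this sketch a URL/XPath pair would be evaluated as `first_xpath_match(fetch(url), xpath)`, and a URL/JSONPath pair as `first_json_match(fetch(url), jsonpath)`.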

## The Code
Once your `DataInterface` is implemented, adding `oai_repo` to your application is simple.

Create a repository instance, passing in your `DataInterface` implementation:
```python
import oai_repo
from .myoaidata import MyOAIData

repo = oai_repo.OAIRepository(MyOAIData())
```

Pass in URL arguments as a dict to process the request:
```python
response = repo.process( request.args )
```
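
If your framework does not hand you a ready-made mapping such as `request.args`, the dict can be built from the raw query string. The helper below is a minimal sketch; `query_to_args` is a hypothetical name, not something `oai_repo` provides, and it assumes a flat dict of string values is what you want to pass to `process()`.

```python
from urllib.parse import parse_qsl


def query_to_args(query_string):
    """Convert a raw query string into a flat dict of request arguments.

    Blank values are kept so that an empty argument is still visible to
    the caller. If an argument is repeated, the last value wins.
    """
    return dict(parse_qsl(query_string, keep_blank_values=True))
```

For example, `query_to_args("verb=ListRecords&metadataPrefix=oai_dc")` yields a dict with the keys `verb` and `metadataPrefix`. OAI-PMH treats a repeated argument as a `badArgument` condition, so a stricter version might check for duplicates before collapsing them.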

The response can be accessed directly as XML:
```python
xml_root_element = response.root()
```

Or the response can be cast into a fully formed XML document:
```python
xml_doc_as_bytes = bytes(response)
```

### Exceptions
You must implement every method in the `DataInterface`. Any method left
undefined will raise a `NotImplementedError`.
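
The pattern can be pictured with a small stand-in. `SketchInterface` and its two method names below are hypothetical and exist only for illustration; they are not the real `oai_repo.DataInterface`, whose methods you should take from the module's own documentation. The sketch shows a base class whose methods raise `NotImplementedError` until a subclass overrides them.

```python
class SketchInterface:
    """Hypothetical stand-in for a base interface (not oai_repo's class)."""

    def get_identify(self):
        raise NotImplementedError("get_identify() must be overridden")

    def list_set_specs(self):
        raise NotImplementedError("list_set_specs() must be overridden")


class PartialData(SketchInterface):
    """Overrides only one of the two required methods."""

    def get_identify(self):
        return {"repositoryName": "Example Repository"}


def missing_methods(obj, names):
    """Return the names in `names` whose call raises NotImplementedError."""
    missing = []
    for name in names:
        try:
            getattr(obj, name)()
        except NotImplementedError:
            missing.append(name)
    return missing
```

A check like `missing_methods(PartialData(), ["get_identify", "list_set_specs"])` could be run in a test suite to catch an incomplete implementation before it reaches production.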

With a fully implemented `DataInterface`, the `OAIRepository` may raise the
`OAIRepoInternalException` and `OAIRepoExternalException` exceptions.
```python
try:
    repo = oai_repo.OAIRepository(MyOAIData())
    response = repo.process( args )
except oai_repo.OAIRepoExternalException as exc:
    # An API call timed out or returned a non-200 HTTP code.
    # Log the failure and abort with server HTTP 503.
    raise  # placeholder: replace with your logging and HTTP 503 response
except oai_repo.OAIRepoInternalException as exc:
    # There is a fault in how the DataInterface was implemented.
    # Log the failure and abort with server HTTP 500.
    raise  # placeholder: replace with your logging and HTTP 500 response
```
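
One way to turn the two cases above into HTTP status codes is sketched here. `process_with_status` is a hypothetical helper, not part of `oai_repo`. It takes the exception classes as parameters so the sketch runs without the package installed; in a real application you would pass `oai_repo.OAIRepoExternalException` and `oai_repo.OAIRepoInternalException`. The logger name and the empty error bodies are also assumptions.

```python
import logging

logger = logging.getLogger("oai")  # assumed logger name


def process_with_status(repo, args, external_exc, internal_exc):
    """Return (http_status, body_bytes) for a single OAI request.

    `repo` is any object with a `process(args)` method whose result can
    be converted with `bytes()`.
    """
    try:
        return 200, bytes(repo.process(args))
    except external_exc:
        # An upstream API call timed out or returned a non-200 HTTP code.
        logger.exception("External data source failed")
        return 503, b""
    except internal_exc:
        # The DataInterface implementation itself is at fault.
        logger.exception("DataInterface implementation fault")
        return 500, b""
```

Your web framework's view function can then return the status and body directly, keeping the error policy in one place.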

## Author and License
The `oai_repo` module was developed at the Michigan State University Libraries.
It is released under the Apache License version 2.0.

## Copyright
Copyright (c) 2022 Michigan State University Board of Trustees


