Metadata-Version: 2.1
Name: FITSxtractor
Version: 0.9.2
Summary: This package extracts xml metadata from FITS output to a csv or xlsx file
Home-page: https://github.com/ovanov/FITSxtractor
Author: ovanov
Author-email: ovanov@protonmail.com
License: MIT
Keywords: FITS,xml,csv,CLI program
Platform: any
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE

# FITSxtractor - The metadata extractor

With FITSxtractor, you can extract metadata from XML files (generated by FITS) and save it to a .csv or .xlsx file by calling the program from the command line.


The code is made for the XML tree structure that the Harvard tool 'FITS' (File Information Tool Set) outputs. Pay a visit to their [Website](https://projects.iq.harvard.edu/fits/home) for more information about their project and have a look at the [Documentation](https://projects.iq.harvard.edu/fits/standard-metadata-schemas) regarding the output format.

This project was brought to life with the help of the [AfZ](https://www.afz.ethz.ch/) (Archive of Contemporary History) at [ETH Zürich](https://ethz.ch/en.html).

# Overview

The FITS metadata extractor was written as a tool for simplifying digital long-term archiving workflows.

A key aspect of archiving is keeping track of the most useful metadata without drowning in information. The extractor reads the extensive output that FITS provides and saves only the following metadata:
- identification
    - identity
        - externalIdentifier
    - mimetype
    - format
- fileinfo
    - filepath
    - md5checksum
    - size
- filestatus
    - well-formed
    - well-formed status
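
As a rough illustration of how the fields listed above can be read from a FITS output file, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not FITSxtractor's actual code. The sample XML is a hand-written, heavily simplified stand-in for real FITS output, the helper names (`text_of`, `attr_of`, `extract_record`) are hypothetical, and the sketch assumes that `mimetype` and `format` are attributes of `identity` and that "well-formed status" refers to the `status` attribute of the `well-formed` element.

```python
import xml.etree.ElementTree as ET

# Namespace of FITS output documents (assumed here; verify against your own files).
NS = {"fits": "http://hul.harvard.edu/ois/xml/ns/fits/fits_output"}

# Hand-written, simplified sample; real FITS output contains many more elements.
SAMPLE_XML = """<fits xmlns="http://hul.harvard.edu/ois/xml/ns/fits/fits_output">
  <identification>
    <identity format="Portable Document Format" mimetype="application/pdf">
      <externalIdentifier toolname="Droid" type="puid">fmt/18</externalIdentifier>
    </identity>
  </identification>
  <fileinfo>
    <filepath>/data/report.pdf</filepath>
    <md5checksum>d41d8cd98f00b204e9800998ecf8427e</md5checksum>
    <size>1024</size>
  </fileinfo>
  <filestatus>
    <well-formed toolname="Jhove" status="SINGLE_RESULT">true</well-formed>
  </filestatus>
</fits>"""


def text_of(root, path):
    """Return the stripped text of the first matching element, or ''."""
    node = root.find(path, NS)
    return node.text.strip() if node is not None and node.text else ""


def attr_of(root, path, name):
    """Return an attribute of the first matching element, or ''."""
    node = root.find(path, NS)
    return node.get(name, "") if node is not None else ""


def extract_record(xml_text):
    """Collect the fields from the list above into one flat dictionary."""
    root = ET.fromstring(xml_text)
    identity = "fits:identification/fits:identity"
    well_formed = "fits:filestatus/fits:well-formed"
    return {
        "externalIdentifier": text_of(root, identity + "/fits:externalIdentifier"),
        "mimetype": attr_of(root, identity, "mimetype"),
        "format": attr_of(root, identity, "format"),
        "filepath": text_of(root, "fits:fileinfo/fits:filepath"),
        "md5checksum": text_of(root, "fits:fileinfo/fits:md5checksum"),
        "size": text_of(root, "fits:fileinfo/fits:size"),
        "well-formed": text_of(root, well_formed),
        "well-formed status": attr_of(root, well_formed, "status"),
    }


record = extract_record(SAMPLE_XML)
print(record)
```

Missing elements yield empty strings rather than errors, which keeps one incomplete file from aborting a whole batch; whether FITSxtractor behaves the same way is not stated here.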

More metadata will be added to the program if needed.
By creating a table from a batch of metadata, the tool serves as a key component of an interface that keeps track of the data. Furthermore, the user can work with the metadata in MS Excel or other programs, which enables more in-depth analysis of the received data.
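
To make the idea of "a table from a batch of metadata" concrete, the following sketch writes one row per file with the standard `csv` module. It is only an illustration under assumptions: the column order in `FIELDS`, the function name `write_table`, and the sample records are all hypothetical and do not describe FITSxtractor's real output layout.

```python
import csv
import io

# Hypothetical column order; the real tool may arrange its columns differently.
FIELDS = ["filepath", "mimetype", "format", "md5checksum", "size", "well-formed"]


def write_table(records, handle):
    """Write one CSV row per metadata record, preceded by a header row."""
    # Missing keys are written as empty cells (DictWriter's default restval);
    # keys that are not in FIELDS are silently skipped.
    writer = csv.DictWriter(handle, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


# Made-up records standing in for metadata extracted from two FITS files.
records = [
    {"filepath": "/data/a.pdf", "mimetype": "application/pdf",
     "format": "Portable Document Format", "md5checksum": "abc",
     "size": "1024", "well-formed": "true"},
    {"filepath": "/data/b.txt", "mimetype": "text/plain", "size": "12"},
]

buffer = io.StringIO()  # stands in for a file opened with newline=""
write_table(records, buffer)
lines = buffer.getvalue().splitlines()
print(lines[0])
```

When writing to a real file, open it with `newline=""` as the `csv` documentation recommends, so that line endings are not doubled on Windows.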

## Guide

The following shows how to get and use FITSxtractor.

### Installation

    $ pip install FITSxtractor

Note that you might have to add the installation folder to your PATH.

If you would rather customize the code to your needs, grab a stable version under "Releases". All the files are extensively commented to make them more user friendly.

### Usage

In a terminal, run:

    $ FITSxtractor path_to_dir --output path_to_outputfile

The program takes a directory containing **at least one** FITS XML output file as its positional argument, and the location of the output file through the `--output` option.

Accepted output formats are files ending with a *.csv* or *.xlsx* extension. If neither is given, the program defaults to .xlsx.
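
The command line shape and the extension rule described above could be modelled roughly as follows. This is a sketch, not the program's real argument handling: `build_parser`, `resolve_output`, and the default file name `fits_metadata.xlsx` are hypothetical. It also assumes that "defaults to .xlsx" means a path without any extension gets `.xlsx` appended, and it rejects other extensions, which the description above does not specify.

```python
import argparse
from pathlib import Path

ALLOWED_SUFFIXES = {".csv", ".xlsx"}


def build_parser():
    """Mirror the documented call: FITSxtractor path_to_dir --output path."""
    parser = argparse.ArgumentParser(prog="FITSxtractor")
    parser.add_argument("input_dir", help="directory with FITS XML output files")
    # The default file name is a placeholder chosen for this sketch.
    parser.add_argument("--output", default="fits_metadata.xlsx",
                        help="output file ending in .csv or .xlsx")
    return parser


def resolve_output(path_string):
    """Return the output path, appending .xlsx when no extension is given."""
    path = Path(path_string)
    suffix = path.suffix.lower()
    if suffix in ALLOWED_SUFFIXES:
        return path
    if suffix == "":
        return path.with_suffix(".xlsx")
    # Assumption of this sketch: any other extension is treated as an error.
    raise ValueError("unsupported output extension: " + suffix)


args = build_parser().parse_args(["xml_dir", "--output", "report"])
print(resolve_output(args.output))
```

Keeping the extension check in its own function makes the rule easy to test without touching the file system.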

