Metadata-Version: 2.1
Name: xport
Version: 3.0.0
Summary: SAS XPORT file reader
Home-page: https://github.com/selik/xport
Author: Michael Selik
Author-email: michael.selik@gmail.com
License: MIT
Description: ########################################################################
          Xport
        ########################################################################
        
        .. sphinx-page-start
        
        Read and write SAS Transport files (``*.xpt``).
        
        SAS uses a handful of archaic file formats: XPORT/XPT, CPORT, SAS7BDAT.
        If someone publishes their data in one of those formats, this Python
        package will help you convert the data into a more useful format.  If
        someone, like the FDA, asks you for an XPT file, this package can write
        it for you.
        
        
        What's it for?
        ==============
        
        XPORT is the binary file format used by a bunch of `United States
        government agencies`_ for publishing data sets. It made a lot of sense
        if you were trying to read data files on your IBM mainframe back in
        1988.
        
        The official `SAS specification for XPORT`_ is relatively
        straightforward. The hardest part is converting IBM-format floating
        point to IEEE-format, which the specification explains in detail.
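        
        As a rough illustration of that conversion (a sketch, not this
        package's actual implementation), the snippet below decodes one 8-byte
        IBM double: a sign bit, a 7-bit base-16 exponent biased by 64, and a
        56-bit fraction.  The name ``ibm_to_ieee`` is hypothetical, and the
        sketch assumes it can ignore the SAS missing-value encodings.
        
        .. code:: python
        
            import struct
        
        
            def ibm_to_ieee(ibm: bytes) -> float:
                """Decode one 8-byte big-endian IBM double into a Python float."""
                if len(ibm) != 8:
                    raise ValueError('expected exactly 8 bytes')
                (bits,) = struct.unpack('>Q', ibm)
                sign = -1.0 if bits >> 63 else 1.0
                exponent = (bits >> 56) & 0x7F      # power of 16, biased by 64
                fraction = bits & ((1 << 56) - 1)   # 56-bit fraction, no hidden bit
                if fraction == 0:
                    # Assumption: missing values are not distinguished from zero.
                    return 0.0
                return sign * (fraction / 2 ** 56) * 16.0 ** (exponent - 64)
        
        
            print(ibm_to_ieee(bytes.fromhex('4110000000000000')))  # 1.0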
        
        There was an `update to the XPT specification`_ for SAS v8 and above.
        This module *has not yet been updated* to work with the new version.
        However, if you're using SAS v8+, you're probably not using XPT
        format. The changes to the format appear to be trivial changes to the
        metadata, but this module's current error-checking will raise a
        ``ValueError``. If you'd like an update for v8, please let me know by
        `submitting an issue`_.
        
        .. _United States government agencies: https://www.google.com/search?q=site:.gov+xpt+file
        
        .. _SAS specification for XPORT: http://support.sas.com/techsup/technote/ts140.pdf
        
        .. _update to the XPT specification: https://support.sas.com/techsup/technote/ts140_2.pdf
        
        .. _submitting an issue: https://github.com/selik/xport/issues/new
        
        
        
        Installation
        ============
        
        This project requires Python v3.7+.  Grab the latest stable version from
        PyPI.
        
        .. code:: bash
        
            $ python -m pip install --upgrade xport
        
        
        
        Reading XPT
        ===========
        
        This module follows the common pattern of providing ``load`` and
        ``loads`` functions for reading data from a SAS file format.
        
        .. code:: python
        
            import xport.v56
        
            with open('example.xpt', 'rb') as f:
                library = xport.v56.load(f)
        
        
        The XPT decoders, ``xport.v56.load`` and ``xport.v56.loads``, return
        an ``xport.Library``, which is a mapping (``dict``-like) of
        ``xport.Dataset`` objects.  An ``xport.Dataset`` is a subclass of
        ``pandas.DataFrame`` with SAS metadata attributes (name, label, etc.).
        The columns of an ``xport.Dataset`` are ``xport.Variable`` objects,
        which are subclasses of ``pandas.Series`` with SAS metadata (name,
        label, format, etc.).
        
        If you're not familiar with `Pandas`_ dataframes, you can think of one
        as a dictionary of columns, mapping variable names to variable data.
        
        The SAS Transport (XPORT) format only supports two kinds of data.  Each
        value is either numeric or character, so ``xport.v56.load`` decodes the
        values as either ``str`` or ``float``.
        
        Note that since XPT files are in an unusual binary format, you should
        open them using mode ``'rb'``.
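        
        To show why binary mode matters, here is a minimal sketch that peeks at
        the first bytes of a file before handing it to the decoder.  The helper
        ``looks_like_xport`` and the constant ``XPORT_MAGIC`` are hypothetical
        names, and the byte prefix is my reading of the v5 specification's
        library header record, so treat it as an assumption.
        
        .. code:: python
        
            # Assumed start of the first 80-byte record in a v5 transport file.
            XPORT_MAGIC = b'HEADER RECORD*******LIBRARY HEADER RECORD!!!!!!!'
        
        
            def looks_like_xport(path):
                """Return True if the file starts like a SAS Transport library."""
                # Mode 'rb' yields raw bytes, with no text decoding applied.
                with open(path, 'rb') as f:
                    return f.read(len(XPORT_MAGIC)) == XPORT_MAGIC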
        
        .. _Pandas: http://pandas.pydata.org/
        
        
        You can also use the ``xport`` module as a command-line tool to convert
        an XPT file to a CSV (comma-separated values) file.  The ``xport``
        executable is a friendly alias for ``python -m xport``.
        
        .. code:: bash
        
            $ xport example.xpt > example.csv
        
        
        Writing XPT
        ===========
        
        The ``xport`` package follows the common pattern of providing ``dump``
        and ``dumps`` functions for writing data to a SAS file format.
        
        .. code:: python
        
            import xport
            import xport.v56
        
            ds = xport.Dataset()
            with open('example.xpt', 'wb') as f:
                xport.v56.dump(ds, f)
        
        
        Because the ``xport.Dataset`` is an extension of ``pandas.DataFrame``,
        you can create datasets in a variety of ways, converting easily from a
        dataframe to a dataset.
        
        .. code:: python
        
            import pandas as pd
            import xport
            import xport.v56
        
            df = pd.DataFrame({'NUMBERS': [1, 2], 'TEXT': ['a', 'b']})
            ds = xport.Dataset(df, name='MAX8CHRS', label='Up to 40!')
            with open('example.xpt', 'wb') as f:
                xport.v56.dump(ds, f)
        
        
        SAS Transport v5 restricts variable names to 8 characters (with a
        strange preference for uppercase) and labels to 40 characters.  If you
        want the relative comfort of SAS Transport v8's limit of 246 characters,
        please `make an enhancement request`_.
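        
        If you want to catch those limits before writing a file, a small check
        along these lines may help.  It is only a sketch: ``check_v5_limits``
        is a hypothetical helper, not part of ``xport``, and it counts
        characters rather than encoded bytes.
        
        .. code:: python
        
            def check_v5_limits(name, label=''):
                """Return a list of problems; an empty list means the values fit."""
                problems = []
                if len(name) > 8:
                    problems.append(f'name {name!r} is longer than 8 characters')
                if name != name.upper():
                    problems.append(f'name {name!r} is not uppercase')
                if len(label) > 40:
                    problems.append(f'label is {len(label)} characters; the limit is 40')
                return problems
        
        
            print(check_v5_limits('MAX8CHRS', 'Up to 40!'))  # []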
        
        
        Feature requests
        ================
        
        I'm happy to fix bugs, improve the interface, or make the module
        faster. Just `submit an issue`_ and I'll take a look.
        
        .. _make an enhancement request: https://github.com/selik/xport/issues/new
        .. _submit an issue: https://github.com/selik/xport/issues/new
        
        
        Contributing
        ============
        
        This project is configured to be developed in a Conda environment.
        
        .. code:: bash
        
            $ git clone git@github.com:selik/xport.git
            $ cd xport
            $ make install          # Install into a Conda environment
            $ conda activate xport  # Activate the Conda environment
            $ make install-html     # Build the docs website
        
        
        Authors
        =======
        
        Original version by `Jack Cushman`_, 2012.
        
        Major revisions by `Michael Selik`_, 2016 and 2020.
        
        .. _Jack Cushman: https://github.com/jcushman
        
        .. _Michael Selik: https://github.com/selik
        
        Change Log
        ==========
        
        v0.1.0, 2012-05-02
          Initial release.
        
        v0.2.0, 2016-03-22
          Major revision.
        
        v0.2.0, 2016-03-23
          Add numpy and pandas converters.
        
        v1.0.0, 2016-10-21
          Revise API to the pattern of from/to <format>.
        
        v2.0.0, 2016-10-21
          Reader yields regular tuples, not namedtuples.
        
        v3.0.0, 2020-04-20
          Revise API to the load/dump pattern.
          Enable specifying dataset name, variable names, labels, and formats.
        
Keywords: sas,xport,xpt,cport,sas7bdat
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Topic :: Text Processing
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: MIT License
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Requires-Python: >=3.7
Provides-Extra: dev
