#!/usr/bin/env python
"""
This program (gristle_differ)  is used to compare two files and writes the
differences to five output files named after the inputs with these suffixes:
   .insert, .delete, .same, .chgold, .chgnew
The input files are typically provided as arguments, and are sometimes referred
to as 'the old file' and 'the new file' - a reflection of the fact that the
most common usage is to compare two snapshots of the same file.  However, this
occasional reference should not lead one to assume that the utility is limited
to this scenario.

The program sorts the two input files based on unique key columns, then removes
any duplicates based on this key (leaving 1 of any duplicate set behind).

Comparisons of matching records can be limited to specific columns in two ways:
    - using --compare-cols to assume no cols will be compared other than these
      specifically identified.
    - using --ignore-cols to assume all cols will be compared, and explicitly
      idenitify those not to compare.

After the comparison post-delta transformations can be performed in order to
ready the file for subsequent processing.  Examples of these transformations
include:
    - incrementing a key column in the .insert & .chgnew files
    - populating a delete flag in the .delete file
    - populating a row version starting timestamp in the .insert and .chgnew
      files
    - populating a row version ending timestamp in the .delete and .chgold
      files
    - populating a batch_id in any or all files

See gristle_differ on the wiki for the most thorough documentation here:
    https://github.com/kenfar/DataGristle/wiki/gristle_differ

Example 1: Simple Comparison
    $ gristle_differ file0.dat file1.dat --key-cols 0 2 --ignore-cols 19 22 33

    In this case the two files are each sorted and deduped on columns 0 & 2, and
    then all columns except 19, 22, and 33 are used for the comparison.

    Produces the following files:
       - file1.dat.insert
       - file1.dat.delete
       - file1.dat.same
       - file1.dat.chgnew
       - file1.dat.chgold

Example 2: Complex Operation
    $ gristle_differ file0.dat file1.dat --config-fn ./foo.yml  \
        --variables batchid:919 --variables pkid:82304

    Produces the same output file names as example 1.

    But in this case it gets the majority of its configuration items from
    the config file ('foo.yml').  This could include key columns, comparison
    columns, ignore columns, post-delta transformations, and other information.

    The two variables options are used to pass in user-defined variables that
    can be referenced by the post-delta transformations.  The batchid will get
    copied into a batch_id column for every file, and the pkid is a sequence
    that will get incremented and used for new rows in the insert, delete and
    chgnew files.

See the wiki page for more examples and information on the operation and config
file.

Positional arguments:
  files                 Specifies the two input files.  The first argument is
                        also referred to as the 'old file', the second as the
                        'new file'.

Keyword arguments:
  -h, --help            Show this help message and exit
  --long-help           Print more verbose help.
  -k [KEY_COLS [KEY_COLS ...]], --key-cols [KEY_COLS [KEY_COLS ...]]
                        Specifes the columns that constitute a unique row with
                        a comma-delimited list of field positions using a
                        0-offset.  If col_names are provided then names can be
                        used as well as positions.  This is a required option.
  -c [COMPARE_COLS [COMPARE_COLS ...]], --compare-cols [COMPARE_COLS [COMPARE_COLS ...]]
                        Columns to compare - described by their position using
                        a zero-offset.  If col_names are provided then names
                        can be used as well as positions.  This option is
                        mutually exclusive with ignore-cols and key-cols.
                        If no value is provided then all non-key cols will be compared.
  -i [IGNORE_COLS [IGNORE_COLS ...]], --ignore-cols [IGNORE_COLS [IGNORE_COLS ...]]
                        Columns to ignore - described by their position using a
                        zero-offset. If col_names are provided then names can be
                        used as well as positions.  This option is mutually
                        exclusive with compare-cols and key-cols.  If this column
                        is provided then all cols except for these identified here and
                        key-cols will be compared.
  --col_names [COL_NAMES [COL_NAMES ...]]
                        Column names - allows other args to reference col names
                        as well as positions.  If nothing is provided but the files
                        have headers they will automatically be added to col_names
                        with a couple of changes: all values are turned to lowercase,
                        and spaces are replaced with underscores.
  --variables [VARIABLES [VARIABLES ...]]
                        Variables to reference with post-delta assignments.
                        Typical examples are a batch_id, sequence starting num,
                        or extract timestamp to insert into output recs.  The
                        format is "<name>:<value>"
  --already-sorted      Causes program to bypass sorting step.
  --already-uniq        Causes program to bypass deduping step.
  --temp-dir TEMP_DIR   Used for temporary files.
  --out-dir OUT_DIR     Where the output files will be written.  Defaults to
                        the directory of the second file.
  --dry-run             Performs most processing except for final changes.
  --stats               Writes detailed processing stats
  --nostats             Turns off stats
  --config-name CONFIG_NAME
                        Name of config within XDG dir.  On linux this dir would
                        be: $HOME/.confg/gristle_differ.  The config is a YAML
                        file that can contain any of these arguments plus
                        others used for post-delta record transformations.
  --config-fn CONFIG_FN
                        Name of config file.  This allows the user to use any
                        name and any directory.
  -d DELIMITER, --delimiter DELIMITER
                        Specify a quoted single-column field delimiter. This
                        will otherwise be determined automatically by the pgm.
                        Specifying it explicitely is useful when the pgm cannot
                        accurately determine the csv dialect, especially with
                        very small files.
  --escapechar ESCAPECHAR
                        Specify escaping character. Otherwise, like delimiter,
                        the program will automatically determine it.
  --quoting {quote_all,quote_minimal,quite_nonnumeric,quote_none}
                        Specify field quoting.  Otherwise, like delimiter, the
                        pgm will automatically determine it.  Valid values are:
                        quote_none, quote_minimal, quote_nonnumeric, quote_all.
  --quotechar QUOTECHAR
                        Specify field quoting character.  Otherwise, like
                        delimiter the pgm will automatically determine it.
  --recdelimiter RECDELIMITER
                        Specify a quoted end-of-record delimiter.
  --has-header          Indicate that there is a header in the file. This is a
                        standard datagristle option - but not yet supported
                        within this pgm.
  --has-no-header       Indicate that there is no header in the file.  This is
                        a standard datagristle option - but not yet supported
                        within this pgm.
  -V, --version         Show program's version number and exits.

The config file is useful when the there are many arguments, especially when
using post-delta transforms.  The program can be referred to a config file in
either of two ways:
   - config-fn argument: that references a specific file name
   - config-name argument: that refers to a file name within the XDG standard
     directory.
        - On linux: $HOME/.config/gristle_differ/<name>.yml
        - On MacOS: $HOME/

This source code is protected by the BSD license.  See the file "LICENSE"
in the source code root directory for the full language or refer to it here:
    http://opensource.org/licenses/BSD-3-Clause
Copyright 2011-2020 Ken Farmer

"""
import sys
import logging
import copy
import csv
import os
from  os.path import exists, dirname, basename
from pprint import pprint as pp
from signal import signal, SIGPIPE, SIG_DFL
from typing import List, Union, Dict, Any, Tuple

import cletus.cletus_config as conf

import datagristle.common       as comm
from   datagristle.common  import abort
import datagristle.file_type    as gftype
import datagristle.file_delta   as gdelta
import datagristle.file_sorter  as gsorter
import datagristle.file_deduper as gdeduper

import jsonschema

#Ignore SIG_PIPE and don't throw exceptions on it...
#(http://docs.python.org/library/signal.html)
signal(SIGPIPE, SIG_DFL)

APP_NAME = basename(__file__)
logging.basicConfig()



def main():
    """ runs all processes:
            - gets args & config
            - compares files
            - writes counts
    """
    # collect any config file items then override them with args and assemble the
    # consolidated info in the 'config' var:
    desc = ("gristle_differ is used to compare two files and writes the differences to "
            "five output files named after the inputs with the suffixes: "
            ".insert, .delete, .same, .chgold, .chgnew.  The program sorts the "
            "two files based on unique key columns, can ignore certain columns "
            "for the comparison, and then can perform some simple transformations."
            "The input files are referred to as old & new, file0 & file1, or f0 & f1 "
            "this inconsistency reflects the fact that the program may be used "
            "to compare two historical snapshots of a file (old vs new) or two independent "
            "files (file0 vs file1)."
            "\n"
            "   example:  gristle_differ file0rdat file1.dat --compare-cols '0, 1' --ignore_cols '19,22,33'"
            "\n\n")
    arg_processor = ArgProcessor(desc, __doc__)
    args = vars(arg_processor.args)
    config = setup_config(args)

    # either file may be empty - but at least one must have data in it.
    assert len(config['files']) == 2
    if (os.path.getsize(config['files'][0]) == 0
            and os.path.getsize(config['files'][1]) == 0):
        return 1

    dialect = gftype.get_dialect(config['files'], config['delimiter'], config['quoting'],
                                 config['quotechar'], config['recdelimiter'], config['has_header'])

    config['col_names'] = comm.get_best_col_names(config, dialect)
    print_config(config, dialect)

    file_delta = delta_runner(config, dialect)

    if config['stats']:
        print('')
        print('Read Stats:')
        print('   File0/oldfile records:  %d' % file_delta.old_read_cnt)
        print('   File1/newfile records:  %d' % file_delta.new_read_cnt)
        print('')
        print('Write Stats:')
        print('   *.delete records:       %d' % file_delta.out_counts.get('delete', 0))
        print('   *.insert records:       %d' % file_delta.out_counts.get('insert', 0))
        print('   *.same records:         %d' % file_delta.out_counts.get('same', 0))
        print('   *.chgold records:       %d' % file_delta.out_counts.get('chgold', 0))
        print('   *.chgnew records:       %d' % file_delta.out_counts.get('chgnew', 0))

    return 0



def delta_runner(config: Dict[str, Any], dialect: csv.Dialect) -> gdelta.FileDelta:
    """ sets up config items for delta class,
        prepares input files,
        instantiates delta class and runs comparison
        removes temporary files
    """
    adj_temp_dir = config['temp_dir'] or dirname(config['files'][1])
    adj_out_dir = config['out_dir']  or dirname(config['files'][1])

    #--- handle all key, compare and ignore logic --------------
    def convert_cols(col_title: str, lookup_cols: Dict[str, Any]):
        try:
            off0 = comm.colnames_to_coloff0(config['col_names'], lookup_cols)
        except (KeyError, ValueError, TypeError) as err:
            print(err)
            print('Column lookup list: %s' % ','.join(lookup_cols))
            print('Column names known: %s' % ','.join(config['col_names']))
            abort('Invalid config cols for %s: ' % col_title)
        return off0

    keys_off0 = convert_cols('key_cols', config['key_cols'])
    compare_off0 = convert_cols('compare_cols', config['compare_cols'])
    ignore_off0 = convert_cols('ignore_cols', config['ignore_cols'])

    delta = gdelta.FileDelta(adj_out_dir, dialect)
    for col in keys_off0:
        delta.set_fields('join', col)
    for col in ignore_off0:
        delta.set_fields('ignore', col)
    for col in compare_off0:
        delta.set_fields('compare', col)

    for kv_pair in config['variables']:
        key, value = kv_pair.split(':')
        delta.dass.set_special_values(key, value)

    #--- handle all assignment logic --------------
    if 'assignments' in config:
        asgn_offsets = get_assign_with_offsets_for_names(config['col_names'],
                                                         config['assignments'])
        for asgn in asgn_offsets:
            delta.dass.set_assignment(**asgn)

    #--- calc any sequences that refer to old file: -------
    delta.dass.set_sequence_starts(dialect, config['files'][0])

    #--- sort & dedupe the two source files -----
    f0_sorted_uniq_fn, _ = prep_file(config['files'][0],
                                     dialect, keys_off0,
                                     adj_temp_dir, adj_out_dir,
                                     config['already_sorted'],
                                     config['already_uniq'])
    f1_sorted_uniq_fn, _ = prep_file(config['files'][1],
                                     dialect, keys_off0,
                                     adj_temp_dir, adj_out_dir,
                                     config['already_sorted'],
                                     config['already_uniq'])

    #--- finally, run the comparison ---
    delta.compare_files(f0_sorted_uniq_fn, f1_sorted_uniq_fn,
                        dry_run=config['dry_run'])

    #--- housekeeping ---
    os.remove(f0_sorted_uniq_fn)
    os.remove(f1_sorted_uniq_fn)

    return delta


def get_assign_with_offsets_for_names(col_names: List[str],
                                      assign_config: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """ Return assignment config section after converting col names to offsets.

    Args:
        col_names: a list of column names
        assign_config: assignment section of config structure
    Returns:
        copy of config - with colnames replaced with 0-offsets
    Raises:
        KeyError - if colname from config not in col_names
                 - or if col position from config beyond col_name list
    """
    new_assign_config: List[Dict[str, Any]] = copy.deepcopy(assign_config)
    for asgn in new_assign_config:
        if asgn.get("src_field", None) is not None:
            asgn["src_field"] = comm.colnames_to_coloff0(col_names, [asgn["src_field"]])[0]
        if asgn.get("dest_field", None) is not None:
            asgn["dest_field"] = comm.colnames_to_coloff0(col_names, [asgn["dest_field"]])[0]
    return new_assign_config



def prep_file(filename: str,
              dialect: csv.Dialect,
              key_cols: List[int],
              temp_dir: str,
              out_dir: str,
              already_sorted: bool,
              already_uniq: bool) -> Tuple[str, int]:
    """ Set up a file for the delta class comparison.

    Args:
        filename: name of input file
        dialect:  csv dialect
        key_cols: list of key columns, offsets only
        temp_dir: used for sorting
        out_dir:  directory used to write intermittent & prepared files
        already_sorted: boolean, if True will skip sort
        already_uniq: boolean, if True will skip deduping
    Returns:
        final_name: provides final name of prepared file
        dups_removed: count of duplicates removed
    Raises:
        Multiple (see CSVSorter & CSVDeDuper
    Misc:
        - sorted_fn - ends in .sorted or .csv (if already-sorted)
        - deduped_fn - ends in .uniq
        - final_name - ends in .uniq or .csv (if already-sorted)
    """
    if already_sorted:
        sorted_fn = filename
    else:
        sorter = gsorter.CSVSorter(dialect, key_cols, temp_dir, out_dir)
        sorted_fn = sorter.sort_file(filename)

    if already_uniq:
        final_name = sorted_fn
        dups_removed = 0
    else:
        deduper = gdeduper.CSVDeDuper(dialect, key_cols, out_dir)
        final_name, read_cnt, write_cnt = deduper.dedup_file(sorted_fn)
        dups_removed = read_cnt - write_cnt
        if sorted_fn != filename:
            os.remove(sorted_fn)

    return final_name, dups_removed




class ArgProcessor(comm.ArgProcessor):
    """" Manages standard datagristle and custom differ arguments.

    Args:
         desc:  a short description of gristle_differ
         long_desc: a long detailed desc of gristle_differ
    Returns:
         nothing
    Raises:
         see comm.ArgProcessor
    """

    def add_custom_args(self):

        self.add_positional_file_args(stdin=False)

        par_add_arg = self.parser.add_argument
        par_add_arg('-k', '--key-cols',
                    default=None,
                    nargs='*',
                    help='Specify the columns that constitute a unique row with a comma-'
                         'delimited list of numbers using a 0-offset.\n'
                         'This is a required option.')
        par_add_arg('-c', '--compare-cols',
                    nargs='*',
                    default=None,
                    help=('Columns to compare - described by their position using a zero-offset.\n'
                          'Mutually exclusive with ignore-cols and key-cols.\n'
                          'If no value is provided then all columns except for key-cols will be compared.\n'
                          'Indicate multiple columns with a space-delimited list of numbers.'))
        par_add_arg('-i', '--ignore-cols',
                    nargs='*',
                    default=None,
                    help=('Columns to ignore - described by their position using a zero-offset.\n'
                          'Mutually exclusive with compare-cols.\n'
                          'If this option is provided then all columns will be compared except '
                          'for those specified here and key-cols.\n'
                          'Indicate multiple columns with a space-delimited list of numbers.'))
        par_add_arg('--col-names',
                    nargs='*',
                    default=None,
                    help=('Column names - allows other cols to use names rather than offsets.\n'
                          'This will default to the file headers if they exist.\n'
                          'Note that those file header values will be converted to lowercase and have spaces converted to underscores.'))
        par_add_arg('--variables',
                    nargs='*',
                    default=None,
                    help=('Variables to reference with assignments.  Format is "<name>:<value>"'))

        par_add_arg('--already-sorted',
                    default=False,
                    action='store_true',
                    help='Causes program to bypass sorting step.')
        par_add_arg('--already-uniq',
                    default=False,
                    action='store_true',
                    help='Causes program to bypass deduping step.')
        par_add_arg('--temp-dir',
                    help='Used for keeping temporary files.')
        par_add_arg('--out-dir',
                    help=('Where the output files will be written.  Defaults to the'
                          ' directory of the second file.'))

        self.add_option_dry_run()
        self.add_option_stats()
        self.add_option_config_name()
        self.add_option_config_fn()
        self.add_option_csv_dialect()
        #self.add_option_logging()


def setup_config(args: Dict[str, Any]) -> Dict[str, Any]:
    """ Manages config file reading, config consolidation and config validation.

    Used to consolidate CLI args & config file items into a single config
    dictionary in which the CLI overrides the file:

    1. read in config file if a config name or config filename were provided
       as args.
    2. remove any args that have a value of None
    3. read in the args - override any config file items with the same key.
    4. apply defaults to any missing config items.
    5. apply any needed conversions or fixes to config values
    6. validate config dict based on json schema
    7. validate config dict based on custom validation code

    Args:
        args: a dictionary of CLI args and their values
    Returns:
        confd: a dictionary of all config items and their values
    Raises:
        sys.exit - most errors if caught here will terminate pgm directly
        see cletus ConfigManager
    Note:
        Since multiple types of validation exist - the code is in multiple
        places.  Some of this redundancy is intended to provide improved
        messaging for common errors.
    """
    config_schema = {'type': 'object',
                     'properties': {
                         'files':           {"type":     "array"},
                         'temp_dir':        {"required": False},
                         'out_dir':         {"type":     ["null", "string"],
                                             "required": False},
                         'config_fn':       {"required": False},
                         'config_name':     {"required": False},
                         'col_names':       {"type":     "array",
                                             "required": False},
                         'key_cols':        {"type":     "array"},
                         'ignore_cols':     {"type":     "array"},
                         'compare_cols':    {"type":     "array"},
                         'variables':       {"type":     "array"},
                         'already_sorted':  {"type":     "boolean"},
                         'already_uniq':    {"type":     "boolean"},
                         'assignments':     {"type":     "array",
                                             "required": False,
                                             "properties": {
                                                 "dest_file":       {"type":    "string"},
                                                 "dest_field":      {"type":    "string"},
                                                 "src_type":        {"type":    "string"},
                                                 "src_val":         {"type":    "string"},
                                                 "src_file":        {"type":    "string"},
                                                 "src_field":       {"type":    "string"},
                                                 "comments":        {"type":    "string"}}},
                         'delimiter':       {"type":     ["null", "string"]},
                         'has_header':      {'type':     ["null", "boolean"]},
                         'recdelimiter':    {"type":     ["null", "string"]},
                         'quoting':         {'type':     ["null", "string"],
                                             "enum":     ["quote_none", "quote_all",
                                                          "quote_minimal", "quote_nonnumeric",
                                                          "0", "1", "2", "3"]},
                         'quotechar':       {"type":     ["null", "string"]},
                         'escapechar':      {"type":     ["null", "string"],
                                             "required": False},
                         'stats':           {"type":     "boolean",
                                             "required": False},
                         'dry_run':         {"type":     "boolean"},
                         'long_help':       {"type":     "boolean",
                                             "required": False},
                         'help':            {"type":     "boolean",
                                             "required": False},
                         'version':         {'type':     'string',
                                             "required": False},
                         },
                     'additionalProperties': False
                    }
    config_schema2 = {'type': 'object',
                     'properties': {
                         'files':           {"type":     "array"},
                         'temp_dir':        {},
                         'out_dir':         {"type":     ["null", "string"]},
                         'config_fn':       {},
                         'config_name':     {},
                         'col_names':       {},
                         'key_cols':        {"type":     "array"},
                         'ignore_cols':     {"type":     "array"},
                         'compare_cols':    {"type":     "array"},
                         'variables':       {"type":     "array"},
                         'already_sorted':  {"type":     "boolean"},
                         'already_uniq':    {"type":     "boolean"},
                         'assignments':     {"type":     "array",
                                             "properties": {
                                                 "dest_file":       {"type":    "string"},
                                                 "dest_field":      {"type":    "string"},
                                                 "src_type":        {"type":    "string"},
                                                 "src_val":         {"type":    "string"},
                                                 "src_file":        {"type":    "string"},
                                                 "src_field":       {"type":    "string"},
                                                 "comments":        {"type":    "string"}}},
                         'delimiter':       {"type":     ["null", "string"]},
                         'has_header':      {'type':     ["null", "boolean"]},
                         'recdelimiter':    {"type":     ["null", "string"]},
                         'quoting':         {'type':     ["null", "string"],
                                             "enum":     ["quote_none", "quote_all",
                                                          "quote_minimal", "quote_nonnumeric",
                                                          "0", "1", "2", "3"]},
                         'quotechar':       {"type":     ["null", "string"]},
                         'escapechar':      {"type":     ["null", "string"]},
                         'stats':           {"type":     "boolean"},
                         'dry_run':         {"type":     "boolean"},
                         'long_help':       {"type":     "boolean"},
                         'help':            {"type":     "boolean"},
                         'version':         {'type':     'string'},
                         },
                     'additionalProperties': False,
                     'required': []
                    }


    config_defaults = {'dry_run':      False,
                       'key_cols':     [],
                       'compare_cols': [],
                       'ignore_cols':  [],
                       'variables':    [],
                       'col_names':    [],
                       'quotechar':    None,
                       'already_sorted': False}

    config = conf.ConfigManager(config_schema)

    #--- if the args refer to a config file load it now: ---
    if 'config_fn' in args and args['config_fn']:
        config.add_file(config_fqfn=args['config_fn'])
    elif 'config_name' in args and args['config_name']:
        config.add_file(app_name=APP_NAME,
                        config_fn='%s.yml' % args['config_name'])

    #--- remove empty args to enable config to provide them
    if args.get('files', None) == []:
        args.pop('files')

    #--- load cli args
    config.add_iterable(args)

    #--- fix correctables - this handles tabs, etc for delimiters:
    if config.cm_config.get('delimiter') is not None:
        new_val = comm.dialect_del_fixer(config.cm_config.get('delimiter'))
        config.add_iterable({'delimiter': new_val})

    #--- finally add defaults
    config.add_defaults(config_defaults)

    #--- validate the consolidated config: ---
    jsonschema.validate(instance=config.cm_config, schema=config_schema2)


    try:
        config.validate()
    except validictory.validator.RequiredFieldValidationError as err:
        if '''Required field '<obj>.files' is missing''' in str(err):
            abort(summary='Error: no files were provided',
                  details='For usage: gristle_differ --help')
        else:
            print(__doc__)
        raise
    config_bonus_validator(config.cm_config)

    return config.cm_config



def config_bonus_validator(config: Dict[str, Any]) -> None:
    """ Provide additional validation of config.

    Purpose is to provide additional validation beyond what's done by JSON
    Schema.  Generally either because we want a better message for a common
    error or because that validation isn't able to be performed by JSON Schema.

    Args:
        config:  the configuration dictionary
    Returns:
        nothing
    Raises:
        sys.exit
    """

    if len(config['files']) != 2:
        abort("Error: Two file names must be provided, what was found: %s" % config['files'])
    elif not exists(config['files'][0]):
        abort("Error: The first file does not exist: %s" % config['files'][0])
    elif not exists(config['files'][1]):
        abort("Error: The second file does not exist: %s" % config['files'][1])

    if len(config['key_cols']) == 0:
        abort("Error: at least one key_col must be provided")

    if config['compare_cols'] and config['ignore_cols']:
        abort("Error: Provide only one of compare_cols or ignore_cols, not both")

    if len(list(set(config['ignore_cols']) & set(config['key_cols']))) > 0:
        config['ignore_cols'] = [x for x in config['ignore_cols'] if x not in config['key_cols']]
        print("Warning: some key-cols removed from ignore-cols")
        print("Revised config['ignore_cols']: %s" % config.get('ignore_cols', None))
    elif len(list(set(config['compare_cols']) & set(config['key_cols']))) > 0:
        config['compare_cols'] = [x for x in config['compare_cols'] if x not in config['key_cols']]
        print("Warning: some key-cols removed from compare-cols")
        print("Revised config['compare_cols']: %s" % config.get('compare_cols', None))

    if config['has_header']:
        abort("Error: has_header is not supported yet.")

    for kv_pair in config['variables']:
        if ':' not in kv_pair:
            abort('Invalid variable: must be name:value.  Was: %s' % kv_pair)

    if 'assignments' in config:
        for assign in config['assignments']:
            if isinstance(assign['src_field'], list):
                abort('Assignment src_field must be a string (refers to col_name) '
                      'or an integer - it is a list')
            if isinstance(assign['dest_field'], list):
                abort('Assignment dest_field must be a string (refers to col_name)'
                      'or an integer - it is a list')


def print_config(config, dialect):
    print("config['files']: %s" % config.get('files'))
    print("config['key_cols']: %s" % config.get('key_cols', None))
    print("config['compare_cols']: %s" % config.get('compare_cols', None))
    print("config['ignore_cols']: %s" % config.get('ignore_cols', None))
    print("config['variables']: %s" % config.get('variables', None))
    print("config['col_names']: %s" % config.get('col_names', None))
    print("dialect.has_header']: %s" % dialect.has_header)


if __name__ == '__main__':
    sys.exit(main())
