Metadata-Version: 2.1
Name: temporython
Version: 0.8.1
Summary: Generate temporary Python scripts to quickly process lines of text or whole text files.
Home-page: https://github.com/waterimp/temporython
Author: Lee Bush
License: MIT
Description: # temporython
        
        Generate temporary Python scripts to quickly process lines of text or whole text files.
        
        ## Synopsis
        
        `temporython` is both a command line tool and a Python library. It creates boilerplate Python scripts to help you quickly solve text processing problems.
        
        If you want to build quick, one-off scripts in Python to process lines of text from files or standard input, this tool can save you time by giving you a boilerplate script that already handles command line arguments and input processing.
        All you need to do is edit the code to process the lines of input, which may be as little as one line of code.
        
        ## Quick example
        
        Let's say your manager emails you two text files in a random format and you need to analyze/convert/transform/correct/import that data in some manner.
        The files are named `bounced_email_logs_yesterday.log` and `bounced_email_logs_today.log`.
        She asks if you can get your processing work done quickly because an important customer is waiting.
        If you find yourself writing custom Python scripts or writing in an interactive shell, then `temporython` could help accelerate getting you set up. Let's see how that is done.
        
        You type the command...
        
        ```console
        temporython lines process_bounced_email_logs.py
        ```
        ...and the file `process_bounced_email_logs.py` is created. Here is what is inside of that file.
        
        ```python
        #! /usr/bin/env python3
        
        # a quick script initially generated by `temporython` that:
        #   * reads in a list of filenames provided on the command line, or defaults to stdin.
        #   * processes each line
        
        # nice reference for Python 3 text processing: https://docs.python.org/3/library/text.html
        import string
        import textwrap
        import re
        
        from temporython import main
        
        
        ### custom processing #########################################################
        
        class LineProcessor:
            def __init__(self):
                """
                called before processing begins
                """
                pass
        
            def process_line(self, filename, line_number, line):
                # NOTE: filename can be '-' if processing stdin.
                line = line.strip()
                print(line)
        
            def post_process(self):
                """
                called once after all lines in all files have been processed.
                """
                pass
        
        
        if __name__ == '__main__':
            main(LineProcessor)
        ```
        
        Nice! We have a pretty clear boilerplate script. Now let's customize it.
        
        You open up your new script in your favorite text editor and edit the `process_line()` method to your liking. You write code to parse each line and print out the digested results if certain conditions are met (as per your manager's email).
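        
        As an illustration, here is a minimal sketch of what a customized `process_line()` might look like. The log format, the `BOUNCE_PATTERN` regular expression, and the `parse_bounce()` helper are hypothetical and invented for this example; your real logs will differ. The last few lines feed sample lines to the class by hand so the snippet runs on its own. In the generated script, `main(LineProcessor)` does that job.
        
        ```python
        import re
        
        # Hypothetical log format, assumed only for this example:
        #   2021-03-04 10:15:00 BOUNCED alice@example.com reason=mailbox full
        BOUNCE_PATTERN = re.compile(r'BOUNCED (\S+) reason=(.*)')
        
        
        def parse_bounce(line):
            """Return (address, reason) for a bounce line, or None for other lines."""
            match = BOUNCE_PATTERN.search(line.strip())
            return match.groups() if match else None
        
        
        class LineProcessor:
            def __init__(self):
                pass
        
            def process_line(self, filename, line_number, line):
                # print only the lines that record a bounce
                bounce = parse_bounce(line)
                if bounce:
                    address, reason = bounce
                    print('{}:{}: {} ({})'.format(filename, line_number, address, reason))
        
            def post_process(self):
                pass
        
        
        # Stand-in for main(LineProcessor): feed two sample lines by hand.
        processor = LineProcessor()
        samples = [
            '2021-03-04 10:15:00 BOUNCED alice@example.com reason=mailbox full\n',
            '2021-03-04 10:16:00 DELIVERED bob@example.com\n',
        ]
        for line_number, line in enumerate(samples, start=1):
            processor.process_line('-', line_number, line)
        processor.post_process()
        ```
        
        Only the first sample line matches the pattern, so only that line is reported.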
        
        Now let's process the files with the new script.
        
        ```console
        $ ./process_bounced_email_logs.py bounced_email_logs_yesterday.log bounced_email_logs_today.log
        ```
        
        Or we can pipe the files in like this instead...
        
        ```console
        $ cat bounced_email_logs_yesterday.log bounced_email_logs_today.log | ./process_bounced_email_logs.py
        ```
        
        And now the data is processed and your manager is happy because you completed the task quickly.
        Yay!
        `temporython` set up a great boilerplate to get you processing the log data quickly and allowed you to focus on writing your custom logic.
        
        ### Features
        
        #### template types
        `temporython` generates three main kinds of text processing scripts:
        * **lines** - boilerplate is set up so that you can process lines of text, and know which file and line number each line comes from. The lines can be piped in to `stdin` or filenames can be provided to the generated script via command line arguments.
        * **pipe** - boilerplate is set up so that you can process lines of text that only come from `stdin`.
        * **files** - boilerplate is set up so that you can process contents of whole files. Filenames are provided as command line arguments.
        
        #### importing (default) vs. inlining
        
        By default, the generated scripts will rely on the `temporython` library to provide functionality.
        This helps keep the generated scripts short.
        But if you do not want your scripts to depend on the `temporython` library, you can use the `--inline` option.
        
        ## Installation
        
        
        ```bash
        pip3 install temporython
        ```
        
        
        ### Requirements
        
        This software requires Python 3.5 or above.
        
        
        ## Usage
        
        ### Display help
        
        You can display help with `temporython --help` or `temporython -h`.
        
        ```console
        $ temporython --help
        usage: temporython [-h] [-i] {files,lines,pipe} [FILENAME]
        
        positional arguments:
          {files,lines,pipe}
          FILENAME            Name of file to generate
        
        optional arguments:
          -h, --help          show this help message and exit
          -i, --inline        Inline the temporython library in the generated code instead
                              of including it.
        ```
        
        ### Command line switches
        
        * **`-i`, `--inline`** - Inline the required pieces of the `temporython` library into the generated script instead of relying on `temporython` being available for import when the script is run. This option is useful if you want to create a script that has no external dependencies.
        
        ### Generate a script to process lines of text
        
        This command generates a script called 'my_cool_script.py'
        
        ```console
        $ temporython lines my_cool_script.py
        ```
        
        Note: If you do not provide a name, then a default name of `process_lines.tmp` will be chosen.
        
        Now edit the code to your liking.
        
        * `LineProcessor.__init__()` - add code in here that you wish to run only once when your script begins.
        * `LineProcessor.process_line(filename, line_number, line)` - edit the code here to process each line as you wish.
        * `LineProcessor.post_process()` - add code here that you wish to run once after all files & lines have been processed.
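        
        The three hooks above can work together by keeping state on the instance. The following is a rough sketch, not code produced by `temporython`: the `lines_per_file` attribute is a name chosen for this example, and the loop at the bottom is a hand-written stand-in for `main(LineProcessor)` so that the snippet runs by itself.
        
        ```python
        from collections import Counter
        
        
        class LineProcessor:
            def __init__(self):
                # runs once, before any input is read
                self.lines_per_file = Counter()
        
            def process_line(self, filename, line_number, line):
                # runs for every line; blank lines are not counted
                if line.strip():
                    self.lines_per_file[filename] += 1
        
            def post_process(self):
                # runs once, after all input has been read
                for filename, count in sorted(self.lines_per_file.items()):
                    print('{}: {} non-blank lines'.format(filename, count))
        
        
        # Stand-in for main(LineProcessor): feed three sample lines by hand.
        processor = LineProcessor()
        for line_number, line in enumerate(['first\n', '\n', 'third\n'], start=1):
            processor.process_line('-', line_number, line)
        processor.post_process()
        ```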
        
        
        You can process lines by running your new script like this...
        
        ```console
        $ ./my_cool_script.py file1.txt file2.text file3.text
        ```
        
        Or you can process lines by piping them in like this...
        
        ```console
        $ cat file1.txt file2.text file3.text | ./my_cool_script.py
        ```
        
        
        
        ### Generate a script to process text piped in via `stdin`
        
        This command generates a script called 'my_cool_script.py'
        
        ```console
        $ temporython pipe my_cool_script.py
        ```
        
        Note: If you do not provide a name, then a default name of `process_pipe.tmp` will be chosen.
        
        Now edit the code to your liking.
        
        * `LineProcessor.__init__()` - add code in here that you wish to run only once when your script begins.
        * `LineProcessor.process_line(filename, line_number, line)` - edit the code here to process each line as you wish.
        * `LineProcessor.post_process()` - add code here that you wish to run once after all files & lines have been processed.
        
        You can run your script by piping in content like this...
        
        ```console
        $ cat file1.txt file2.text file3.text | ./my_cool_script.py
        ```
        
        
        
        ### Generate a script to process whole files
        
        **Performance warning for large files:** `temporython` generates a script that loads entire files into memory at once. If you process very large files, your script may run slowly or run out of memory.
        
        This command generates a script called 'my_cool_script.py'
        
        ```console
        $ temporython files my_cool_script.py
        ```
        
        Note: If you do not provide a name, then a default name of `process_files.tmp` will be chosen.
        
        Now edit the code to your liking.
        
        * `FileProcessor.__init__()` - add code in here that you wish to run only once when your script begins.
        * `FileProcessor.process_file(filename, contents)` - edit the code here to process the contents of each file as you wish.
        * `FileProcessor.post_process()` - add code here that you wish to run once after all files have been processed.
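        
        As a rough sketch of how these hooks might be filled in, the example below counts words per file. It assumes that `contents` arrives as a single string holding the whole file; the `total_words` attribute is a name chosen for this example. The last lines call the class by hand, standing in for the driver that the generated script provides.
        
        ```python
        class FileProcessor:
            def __init__(self):
                # runs once, before any file is read
                self.total_words = 0
        
            def process_file(self, filename, contents):
                # assumption: contents is the whole file as one string
                word_count = len(contents.split())
                self.total_words += word_count
                print('{}: {} words'.format(filename, word_count))
        
            def post_process(self):
                # runs once, after every file has been handled
                print('total: {} words'.format(self.total_words))
        
        
        # Stand-in driver: process one in-memory "file" named '-'.
        processor = FileProcessor()
        processor.process_file('-', 'one two\nthree\n')
        processor.post_process()
        ```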
        
        You can process files by running your new script like this...
        
        ```console
        $ ./my_cool_script.py file1.txt file2.text file3.text
        ```
        
        
        You can also pipe data into your file processing script and it will process all input as one large file named `-`.
        
        ```console
        $ cat file1.txt file2.text file3.text | ./my_cool_script.py
        ```
        
        
        ## What if my temporary script isn't temporary any more?
        
        Perhaps after creating your script you find yourself reusing or maintaining what was supposed to be a one-off script.
        This is not a problem.
        
        You could ensure your script is named well and then check it in to the code repo of the project it supports.
        If your script imports the library, install `temporython` in the environment where the script runs. If you used the `--inline` option, you do not need to install it.
        
        It is also possible that your script may outgrow `temporython`.
        You can refactor your script to remove the dependency on `temporython`.
        This may include writing your own command line argument parser using Python's built-in `argparse` library.
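        
        For instance, a stand-alone replacement could look roughly like the sketch below. This is not `temporython`'s own code or its inlined output; the function names `parse_args()`, `iter_lines()`, and `main()` are chosen for this example. It mimics the behavior described earlier: read the files named on the command line, or fall back to `stdin` (reported as `-`) when none are given.
        
        ```python
        import argparse
        import sys
        
        
        def parse_args(argv=None):
            parser = argparse.ArgumentParser(description='Process lines of text.')
            parser.add_argument('filenames', nargs='*', metavar='FILE',
                                help='files to read; stdin is used when none are given')
            return parser.parse_args(argv)
        
        
        def iter_lines(filenames):
            """Yield (filename, line_number, line) tuples; '-' means stdin."""
            for filename in filenames or ['-']:
                if filename == '-':
                    for line_number, line in enumerate(sys.stdin, start=1):
                        yield filename, line_number, line
                else:
                    with open(filename) as handle:
                        for line_number, line in enumerate(handle, start=1):
                            yield filename, line_number, line
        
        
        def main(argv=None):
            args = parse_args(argv)
            for filename, line_number, line in iter_lines(args.filenames):
                # replace this with your own per-line logic
                print(line.strip())
        
        
        # Example: parse an explicit argument list instead of sys.argv.
        print(parse_args(['today.log', 'yesterday.log']).filenames)
        ```
        
        To use this as a script, call `main()` under the usual `if __name__ == '__main__':` guard.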
        
        ## Alternatives
        
        Text processing problems can be solved in a variety of ways other than using `temporython`.
        
        Here is a short list of possibilities...
        * shell scripts - using `grep`, `sed`, `awk`, etc.
        * spreadsheets
        * Jupyter notebooks
        * interactive Python shell
        * other text processing Python libraries
        * other code generators
        * text editor automation
        * manual editing
        
        Each of these has advantages and disadvantages, but in the end the choice of tool(s) comes down to your personal preference, your comfort level, and the constraints/requirements of the environment that you work in.
        
        If you are comfortable writing ad hoc code to slice and dice strings in Python, `temporython` may be a great tool to add to your toolbelt.
        You can use `temporython` instead of or along with the alternatives listed above, depending on the text processing problem you face.
        
        ## License
        
        MIT
        
Keywords: python code generator data processing
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Text Processing
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.5
Description-Content-Type: text/markdown
