Metadata-Version: 2.1
Name: psola
Version: 0.0.1
Summary: Time-domain pitch-synchronous overlap-add
Home-page: https://github.com/maxrmorrison/psola
Author: Max Morrison
Author-email: maxrmorrison@gmail.com
License: GPLv3
Description: # Time-domain pitch-synchronous overlap-add (TD-PSOLA)
        [![PyPI](https://img.shields.io/pypi/v/psola.svg)](https://pypi.python.org/pypi/psola)
        [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
        <!-- [![Downloads](https://pepy.tech/badge/psola)](https://pepy.tech/project/psola) -->
        
        This module permits contant- and variable-rate pitch-shifting and
        time-stretching of speech. It is a wrapper around the
        [`parselmouth`](https://github.com/YannickJadoul/Parselmouth) [1]
        wrapper around the Praat [2] implementation of TD-PSOLA [3]. Pitch-shifting
        is performed by providing a numpy array of target pitch values equally spaced
        over time. Variable-rate time stretching uses forced phoneme alignment via
        [`pypar`](https://github.com/maxrmorrison/pypar).
        
        If you need to extract pitch features or phoneme alignments, see
        [`torchcrepe`](https://github.com/maxrmorrison/torchcrepe) for pitch estimation
        and [`pyfoal`](https://github.com/maxrmorrison/pyfoal) for forced alignment.
        If you only want to perform pitch-shifting, you do not need to extract
        forced alignments. If you want to do variable-rate time stretching, you do not
        need to perform pitch estimation.
        
        
        ## Installation
        
        `pip install psola`
        
        
        ## Usage
        
        If you want to perform pitch-shifting or time-stretching on audio already
        loaded into memory, use `psola.vocode`. If you want to do this with audio
        saved in a file, use `psola.from_file`. You can use `psola.to_file` or
        `psola.from_file_to_file` to save the results to a file. To process many
        files at once with multiprocessing, use `psola.from_files_to_files`.
        Each of these functions is documented below. The command-line interface
        wraps the arguments of `psola.from_files_to_files` and is described in
        the next section.
        
        
        ### `psola.vocode`
        
        ```python
        """Performs pitch vocoding using Praat
        
        Arguments
            audio : np.array(shape=(samples,))
                The speech signal to process
            sample_rate : int
                The audio sampling rate.
            source_alignment : pypar.Alignment
                The current alignment if performing time-stretching
            target_alignment : pypar.Alignment
                The target alignment if performing time-stretching
            target_pitch : np.array(shape=(frames,))
                The target pitch contour
            constant_stretch : float or None
                A constant value for time-stretching
            fmin : int
                The minimum allowable frequency in Hz.
            fmax : int
                The maximum allowable frequency in Hz.
        
        Returns
            audio : np.array(shape=(samples,))
                The vocoded audio
        """
        ```
        
        
        ### `psola.from_file`
        
        ```python
        """Performs vocoding using Praat
        
        Arguments
            audio_file : string
                The file containing the speech signal to process
            source_alignment_file : string or None
                The file containing the original alignment
            target_alignment_file : string or None
                The file containing the target alignment
            target_pitch_file : string or None
                The file containing the target pitch
            constant_stretch : float or None
                A constant value for time-stretching
            fmin : int
                The minimum allowable frequency in Hz.
            fmax : int
                The maximum allowable frequency in Hz.
        
        Returns
            audio : np.array(shape=(samples,))
                The vocoded audio
            sample_rate : int
                The audio sampling rate
        """
        ```
        
        
        ### `psola.to_file`
        
        ```python
        """Performs pitch vocoding and saves audio to disk
        
        Arguments
            audio : np.array(shape=(samples,))
                The speech signal to process
            sample_rate : int
                The audio sampling rate
            output_file : string
                The file to save the vocoded speech
            source_alignment : pypar.Alignment
                The current alignment if performing time-stretching
            target_alignment : pypar.Alignment
                The target alignment if performing time-stretching
            target_pitch : np.array(shape=(frames,))
                The target pitch contour
            constant_stretch : float or None
                A constant value for time-stretching
            fmin : int
                The minimum allowable frequency in Hz.
            fmax : int
                The maximum allowable frequency in Hz.
        """
        ```
        
        
        ### `psola.from_file_to_file`
        
        ```python
        """Performs vocoding using Praat and save to disk
        
        Arguments
            audio_file : string
                The file containing the speech signal to process
            output_file : string
                The file to save the vocoded speech
            source_alignment_file : string or None
                The file containing the original alignment
            target_alignment_file : string or None
                The file containing the target alignment
            target_pitch_file : string or None
                The file containing the target pitch
            constant_stretch : float or None
                A constant value for time-stretching
            fmin : int
                The minimum allowable frequency in Hz.
            fmax : int
                The maximum allowable frequency in Hz.
        """
        ```
        
        
        ### `psola.from_files_to_files`
        
        ```python
        """Performs vocoding using Praat and save to disk
        
        Arguments
            audio_files : list
                The files containing the speech signals to process
            output_files : list
                The files to save the vocoded speech
            source_alignment_files : string or None
                The files containing the original alignments
            target_alignment_files : list or None
                The files containing the target alignments
            target_pitch_files : list or None
                The files containing the target pitch
            constant_stretch : float or None
                A constant value for time-stretching
            fmin : int
                The minimum allowable frequency in Hz.
            fmax : int
                The maximum allowable frequency in Hz.
        """
        ```
        
        
        ## Command-line interface
        
        ```
        usage: python -m psola
            [-h]
            [--audio_files AUDIO_FILES [AUDIO_FILES ...]]
            [--source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]]
            [--target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]]
            [--constant_stretch CONSTANT_STRETCH]
            [--target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]]
            [--fmin FMIN]
            [--fmax FMAX]
            [--output_files OUTPUT_FILES [OUTPUT_FILES ...]]
        
        optional arguments:
          -h, --help            show this help message and exit
          --audio_files AUDIO_FILES [AUDIO_FILES ...]
                                The speech signal to process
          --source_alignment_files SOURCE_ALIGNMENT_FILES [SOURCE_ALIGNMENT_FILES ...]
                                The files containing the original alignments
          --target_alignment_files TARGET_ALIGNMENT_FILES [TARGET_ALIGNMENT_FILES ...]
                                The files containing the target alignments
          --constant_stretch CONSTANT_STRETCH
                                A constant value for time-stretching
          --target_pitch_files TARGET_PITCH_FILES [TARGET_PITCH_FILES ...]
                                The target pitch contour
          --fmin FMIN           The minimum allowable frequency in Hz
          --fmax FMAX           The maximum allowable frequency in Hz
          --output_files OUTPUT_FILES [OUTPUT_FILES ...]
                                Where to save the vocoded audio
        ```
        
        
        ## References
        
        [1] Y. Jadoul, B. Thompson, and B. De Boer, "Introducing parselmouth: A python interface to praat," Journal of Phonetics, vol. 71, pp. 1â€“15, 2018.
        
        [2] P. Boersma, "Praat: doing phonetics by computer", http://www.praat.org/, 2006.
        
        [3] E. Moulines and F. Charpentier, "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones," Speech communication, 1990.
        
Keywords: audio,duration,pitch,speech,stretch,vocode
Platform: UNKNOWN
Description-Content-Type: text/markdown
