Metadata-Version: 2.1
Name: dpdata
Version: 0.1.19
Summary: Manipulating data formats of DeePMD-kit, VASP, QE, PWmat, and LAMMPS, etc.
Home-page: https://github.com/deepmodeling/dpdata
Author: Han Wang
Author-email: wang_han@iapcm.ac.cn
License: UNKNOWN
Description: 
        **dpdata** is a Python package for manipulating the data formats of DeePMD-kit, VASP, LAMMPS, and other software.
        dpdata only works with Python 3.x.
        
        Installation
        ============
        
        One can download the source code of dpdata with
        
        .. code-block:: bash
        
           git clone https://github.com/deepmodeling/dpdata.git dpdata
        
        Then use ``setup.py`` to install the module:
        
        .. code-block:: bash
        
           cd dpdata
           python setup.py install
        
        ``dpdata`` can also be installed via pip:
        
        .. code-block:: bash
        
           pip3 install dpdata
        
        Quick start
        ===========
        
        This section gives some examples of how dpdata works. First, import the module in Python 3.x compatible code.
        
        .. code-block:: python
        
           import dpdata
        
        The typical workflow of ``dpdata`` is
        
        #. Load data from VASP, LAMMPS, or DeePMD-kit data files.
        #. Manipulate the data.
        #. Dump the data in a desired format.
        
        Load data
        ---------
        
        .. code-block:: python
        
           d_poscar = dpdata.System('POSCAR', fmt = 'vasp/poscar')
        
        or let dpdata infer the format (``vasp/poscar``) of the file from the file name extension:
        
        .. code-block:: python
        
           d_poscar = dpdata.System('my.POSCAR')
        
        The number of atoms, atom types, and coordinates are loaded from the ``POSCAR`` and stored in a data ``System`` called ``d_poscar``.
        A data ``System`` (a concept used by `deepmd-kit <https://github.com/deepmodeling/deepmd-kit>`_) contains frames that have the same number of atoms of the same type. The order of the atoms should be consistent among the frames in one ``System``.
        Note that a ``POSCAR`` contains only one frame.
        If the multiple frames stored in, for example, an ``OUTCAR`` are wanted, use
        
        .. code-block:: python
        
           d_outcar = dpdata.LabeledSystem('OUTCAR')
        
        The labels provided in the ``OUTCAR``, i.e. energies, forces, and virials (if any), are loaded by ``LabeledSystem``. Note that the forces of atoms are always assumed to exist. ``LabeledSystem`` is a derived class of ``System``.
        
        The ``System`` or ``LabeledSystem`` can be constructed from the following file formats with the ``format key`` in the table passed to argument ``fmt``\ :
        
        .. list-table::
           :header-rows: 1
        
           * - Software
             - format
             - multi frames
             - labeled
             - class
             - format key
           * - vasp
             - poscar
             - False
             - False
             - System
             - 'vasp/poscar'
           * - vasp
             - outcar
             - True
             - True
             - LabeledSystem
             - 'vasp/outcar'
           * - vasp
             - xml
             - True
             - True
             - LabeledSystem
             - 'vasp/xml'
           * - lammps
             - lmp
             - False
             - False
             - System
             - 'lammps/lmp'
           * - lammps
             - dump
             - True
             - False
             - System
             - 'lammps/dump'
           * - deepmd
             - raw
             - True
             - False
             - System
             - 'deepmd/raw'
           * - deepmd
             - npy
             - True
             - False
             - System
             - 'deepmd/npy'
           * - deepmd
             - raw
             - True
             - True
             - LabeledSystem
             - 'deepmd/raw'
           * - deepmd
             - npy
             - True
             - True
             - LabeledSystem
             - 'deepmd/npy'
           * - gaussian
             - log
             - False
             - True
             - LabeledSystem
             - 'gaussian/log'
           * - gaussian
             - log
             - True
             - True
             - LabeledSystem
             - 'gaussian/md'
           * - siesta
             - output
             - False
             - True
             - LabeledSystem
             - 'siesta/output'
           * - siesta
             - aimd_output
             - True
             - True
             - LabeledSystem
             - 'siesta/aimd_output'
           * - cp2k
             - output
             - False
             - True
             - LabeledSystem
             - 'cp2k/output'
           * - cp2k
             - aimd_output
             - True
             - True
             - LabeledSystem
             - 'cp2k/aimd_output'
           * - QE
             - log
             - False
             - True
             - LabeledSystem
             - 'qe/pw/scf'
           * - QE
             - log
             - True
             - False
             - System
             - 'qe/cp/traj'
           * - QE
             - log
             - True
             - True
             - LabeledSystem
             - 'qe/cp/traj'
           * - Fhi-aims
             - output
             - True
             - True
             - LabeledSystem
             - 'fhi_aims/md'
           * - Fhi-aims
             - output
             - False
             - True
             - LabeledSystem
             - 'fhi_aims/scf'
           * - quip/gap
             - xyz
             - True
             - True
             - MultiSystems
             - 'quip/gap/xyz'
           * - PWmat
             - atom.config
             - False
             - False
             - System
             - 'pwmat/atom.config'
           * - PWmat
             - movement
             - True
             - True
             - LabeledSystem
             - 'pwmat/movement'
           * - PWmat
             - OUT.MLMD
             - True
             - True
             - LabeledSystem
             - 'pwmat/out.mlmd'
           * - Amber
             - multi
             - True
             - True
             - LabeledSystem
             - 'amber/md'
           * - Gromacs
             - gro
             - False
             - False
             - System
             - 'gromacs/gro'
        
        
        The class ``dpdata.MultiSystems`` can read data from a directory that may contain many files of different systems, or from a single xyz file that contains different systems.
        
        Use ``dpdata.MultiSystems.from_dir`` to read from a directory. ``dpdata.MultiSystems`` walks the directory
        recursively and finds all files with the specified ``file_name``. It supports all the file formats that ``dpdata.LabeledSystem`` supports.
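        
        The recursive search described above can be pictured with a small standard-library sketch. It is not dpdata's implementation: ``find_files`` is a hypothetical helper, and treating ``file_name`` as a shell-style pattern matched against each base name is an assumption made for illustration.
        
        .. code-block:: python
        
           import fnmatch
           import os
        
        
           def find_files(dir_name, file_name):
               """Collect paths under dir_name whose base name matches file_name.
        
               Hypothetical helper for illustration only; file_name may contain
               shell-style wildcards such as '*OUTCAR'.
               """
               matches = []
               # os.walk visits dir_name and every subdirectory below it.
               for root, _dirs, files in os.walk(dir_name):
                   for name in files:
                       if fnmatch.fnmatch(name, file_name):
                           matches.append(os.path.join(root, name))
               return sorted(matches)
        
        With such a helper, ``find_files('./mgal_outcar', '*OUTCAR')`` would list every file whose name ends in ``OUTCAR`` at any depth, and each path could then be loaded as one ``LabeledSystem``.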
        
        Use ``dpdata.MultiSystems.from_file`` to read from a single file. Currently only the ``quip/gap/xyz`` format is supported.
        
        For example, for ``quip/gap xyz`` files, a single .xyz file may contain many different configurations with different atom numbers and atom types.
        
        The following commands relating to the class ``dpdata.MultiSystems`` may be useful.
        
        .. code-block:: python
        
           # load data
        
           xyz_multi_systems = dpdata.MultiSystems.from_file(file_name='tests/xyz/xyz_unittest.xyz', fmt='quip/gap/xyz')
           vasp_multi_systems = dpdata.MultiSystems.from_dir(dir_name='./mgal_outcar', file_name='OUTCAR', fmt='vasp/outcar')
        
           # use wildcard
           vasp_multi_systems = dpdata.MultiSystems.from_dir(dir_name='./mgal_outcar', file_name='*OUTCAR', fmt='vasp/outcar')
        
           # print the multi-system information
           print(xyz_multi_systems)
           print(xyz_multi_systems.systems) # returns a dictionary
        
           # print the system information
           print(xyz_multi_systems.systems['B1C9'].data)
        
           # dump a system's data to ./my_work_dir/B1C9_raw folder
           xyz_multi_systems.systems['B1C9'].to_deepmd_raw('./my_work_dir/B1C9_raw')
        
           # dump all systems
           xyz_multi_systems.to_deepmd_raw('./my_deepmd_data/')
        
        You may also use the following code to parse multi-system data:
        
        .. code-block:: python
        
           from glob import glob
        
           from dpdata import LabeledSystem, MultiSystems
        
           # process multiple systems
           fs = glob('./*/OUTCAR')  # remember to change this pattern!
           ms = MultiSystems()
           for f in fs:
               try:
                   ls = LabeledSystem(f)
               except Exception:
                   # report the file that failed to load and skip it
                   print(f)
                   continue
               if len(ls) > 0:
                   ms.append(ls)
        
           ms.to_deepmd_raw('deepmd')
           ms.to_deepmd_npy('deepmd')
        
        Access data
        -----------
        
        The properties stored in ``System`` and ``LabeledSystem`` can be accessed with the operator ``[]`` by supplying the key of the property, for example:
        
        .. code-block:: python
        
           coords = d_outcar['coords']
        
        Available properties are (nframes: number of frames in the system, natoms: total number of atoms in the system, ntypes: number of atom types)
        
        .. list-table::
           :header-rows: 1
        
           * - key
             - type
             - dimension
             - are labels
             - description 
           * - 'atom_names'
             - list of str
             - ntypes
             - False
             - The name of each atom type
           * - 'atom_numbs'
             - list of int
             - ntypes
             - False
             - The number of atoms of each atom type
           * - 'atom_types'
             - np.ndarray
             - natoms
             - False
             - Array assigning type to each atom
           * - 'cells'
             - np.ndarray
             - nframes x 3 x 3
             - False
             - The cell tensor of each frame
           * - 'coords'
             - np.ndarray
             - nframes x natoms x 3
             - False
             - The atom coordinates
           * - 'energies'
             - np.ndarray
             - nframes
             - True
             - The frame energies
           * - 'forces'
             - np.ndarray
             - nframes x natoms x 3
             - True
             - The atom forces
           * - 'virials'
             - np.ndarray
             - nframes x 3 x 3
             - True
             - The virial tensor of each frame
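        
        The relation between ``'atom_names'``, ``'atom_numbs'``, and ``'atom_types'`` can be illustrated with plain Python lists. This is only a sketch: it assumes that each entry of ``'atom_types'`` is an index into ``'atom_names'``, and the B1C9 values below are made up for illustration rather than read from a file.
        
        .. code-block:: python
        
           # Hypothetical values for a system with 1 boron and 9 carbon atoms.
           atom_names = ['B', 'C']                       # length ntypes
           atom_types = [0, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # length natoms
        
           # Count how many atoms carry each type index (length ntypes).
           atom_numbs = [atom_types.count(i) for i in range(len(atom_names))]
        
           # Look up the element name of every atom (length natoms).
           symbols = [atom_names[t] for t in atom_types]
        
           print(atom_numbs)
           print(symbols)
        
        Under this assumption the entries of ``'atom_numbs'`` always sum to natoms, which is a convenient consistency check after loading data.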
        
        
        Dump data
        ---------
        
        The data stored in ``System`` or ``LabeledSystem`` can be dumped in 'lammps/lmp' or 'vasp/poscar' format, for example:
        
        .. code-block:: python
        
           d_outcar.to('lammps/lmp', 'conf.lmp', frame_idx=0)
        
        The first frame of ``d_outcar`` will be dumped to 'conf.lmp'.
        
        .. code-block:: python
        
           d_outcar.to('vasp/poscar', 'POSCAR', frame_idx=-1)
        
        The last frame of ``d_outcar`` will be dumped to 'POSCAR'.
        
        The data stored in ``LabeledSystem`` can be dumped to deepmd-kit raw format, for example
        
        .. code-block:: python
        
           d_outcar.to('deepmd/raw', 'dpmd_raw')
        
        Or a simpler command:
        
        .. code-block:: python
        
           dpdata.LabeledSystem('OUTCAR').to('deepmd/raw', 'dpmd_raw')
        
        Frame selection can be implemented by
        
        .. code-block:: python
        
           dpdata.LabeledSystem('OUTCAR').sub_system([0,-1]).to('deepmd/raw', 'dpmd_raw')
        
        by which only the first and last frames are dumped to ``dpmd_raw``.
        
        replicate
        ---------
        
        ``replicate`` creates a supercell of the current atom configuration.
        
        .. code-block:: python
        
           dpdata.System('./POSCAR').replicate((1, 2, 3))
        
        The tuple ``(1, 2, 3)`` means that the atom configuration is not copied in the x direction, while 2 copies are made in the y direction and 3 copies are made in the z direction.
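        
        What building such a supercell involves can be sketched in plain Python for a single frame. This is an illustration and not dpdata's code: ``replicate_frame`` is a hypothetical helper, the rows of ``cell`` are assumed to be the three cell vectors, and the order of the atoms in the result may differ from what dpdata produces.
        
        .. code-block:: python
        
           from itertools import product
        
        
           def replicate_frame(cell, coords, ncopy):
               """Tile one frame ncopy[v] times along cell vector v (illustrative only)."""
               new_coords = []
               for i, j, k in product(range(ncopy[0]), range(ncopy[1]), range(ncopy[2])):
                   # Shift of this image: i*a + j*b + k*c, where a, b, c are the cell rows.
                   shift = [i * cell[0][d] + j * cell[1][d] + k * cell[2][d] for d in range(3)]
                   for atom in coords:
                       new_coords.append([atom[d] + shift[d] for d in range(3)])
               # The supercell vectors are the original vectors scaled by the copy counts.
               new_cell = [[ncopy[v] * cell[v][d] for d in range(3)] for v in range(3)]
               return new_cell, new_coords
        
        
           cell = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
           coords = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
           new_cell, new_coords = replicate_frame(cell, coords, (1, 2, 3))
           print(len(new_coords))  # 2 atoms x (1 * 2 * 3) images
        
        The number of atoms grows by the product of the three copy counts, here a factor of 6.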
        
        perturb
        -------
        
        In the following example, each frame of the original system (``dpdata.System('./POSCAR')``) is perturbed to generate three new frames. For each frame, the cell is perturbed by 5% and the atom positions are perturbed by 0.6 Angstrom. ``atom_pert_style`` indicates that the perturbation to the atom positions follows a normal distribution. Other available options for ``atom_pert_style`` are ``uniform`` (uniform in a ball) and ``const`` (uniform on a sphere).
        
        .. code-block:: python
        
           perturbed_system = dpdata.System('./POSCAR').perturb(pert_num=3, 
               cell_pert_fraction=0.05, 
               atom_pert_distance=0.6, 
               atom_pert_style='normal')
           print(perturbed_system.data)
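        
        The difference between the three styles can be sketched with the standard library alone. This is an illustration and not dpdata's implementation: ``random_unit_vector`` and ``displacement`` are hypothetical helpers, and the scaling chosen for the ``normal`` style (root-mean-square length equal to the requested distance) is an assumption.
        
        .. code-block:: python
        
           import math
           import random
        
        
           def random_unit_vector(rng):
               """Direction drawn uniformly on the unit sphere."""
               while True:
                   v = [rng.gauss(0.0, 1.0) for _ in range(3)]
                   norm = math.sqrt(sum(x * x for x in v))
                   if norm > 0.0:
                       return [x / norm for x in v]
        
        
           def displacement(distance, style, rng):
               """Random displacement for one atom (illustrative, not dpdata's code)."""
               if style == 'normal':
                   # Independent Gaussian components; the length is unbounded.
                   # Assumed scaling: the root-mean-square length equals distance.
                   scale = distance / math.sqrt(3.0)
                   return [rng.gauss(0.0, scale) for _ in range(3)]
               direction = random_unit_vector(rng)
               if style == 'uniform':
                   # The cube root of a uniform number gives uniform density in the ball.
                   radius = distance * rng.random() ** (1.0 / 3.0)
               elif style == 'const':
                   # Every displacement has exactly the requested length.
                   radius = distance
               else:
                   raise ValueError('unknown style: ' + style)
               return [radius * x for x in direction]
        
        In this sketch ``const`` always moves an atom by exactly 0.6 Angstrom, ``uniform`` moves it by at most 0.6 Angstrom, and ``normal`` has no upper bound on the displacement.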
        
Keywords: lammps vasp deepmd-kit
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3.6
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Description-Content-Type: text/x-rst
