Metadata-Version: 2.1
Name: pgn2data
Version: 0.0.6
Summary: Converts a chess pgn file into a csv dataset containing game information and move information
Home-page: 
Author: zq99
Author-email: zq99@hotmail.com
License: GPL-3.0+
Keywords: CHESS,PGN,NOTATION,DATA,FORSYTH–EDWARDS NOTATION,CSV,DATASET,DATABASE,NORMALIZATION,TABULATION,STRUCTURED DATA
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Education
Classifier: Operating System :: Microsoft :: Windows :: Windows 10
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
License-File: LICENSE

# pgn2data

This library converts chess PGN files into tabulated CSV data sets.

A PGN file can contain one or more chess games. The library parses the PGN file and creates two CSV files:

- Games file: contains high-level information (e.g. date, site, event, score, players)

- Moves file: contains the moves for each game (e.g. notation, squares, FEN position, whether there is a check)

The two files can be mapped together using a GUID which the process inserts into both files.
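
As a rough illustration of that mapping, the sketch below joins move rows to their game row on `game_id` using only the standard library. The sample rows and the `g1` identifier are hand-written placeholders (real files hold a generated GUID and many more columns), and `join_moves_to_games` is a hypothetical helper, not part of the library.

```python
import csv
import io

# Hypothetical, hand-written sample rows; real output files have more columns
# and use a generated GUID rather than "g1".
GAMES_CSV = """game_id,white,black,result
g1,Tal,Bronstein,1-0
"""

MOVES_CSV = """game_id,move_no,notation
g1,1,e4
g1,2,c5
"""


def join_moves_to_games(games_file, moves_file):
    """Attach the matching game row to every move row via game_id."""
    games = {row["game_id"]: row for row in csv.DictReader(games_file)}
    joined = []
    for move in csv.DictReader(moves_file):
        game = games[move["game_id"]]
        # Move columns are merged on top of the game columns.
        joined.append({**game, **move})
    return joined


rows = join_moves_to_games(io.StringIO(GAMES_CSV), io.StringIO(MOVES_CSV))
for row in rows:
    print(row["white"], row["black"], row["move_no"], row["notation"])
```

With real output, the two `io.StringIO` objects would be replaced by the opened games and moves files.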


## Installation

The library requires Python 3.7 or later.  
 
To install, run the following command in a terminal:

    pip install pgn2data
    
  
## Implementation

Here is a basic example of how to convert a PGN file:

    from converter.pgn_data import PGNData
    
    pgn_data = PGNData("tal_bronstein_1982.pgn")
    pgn_data.export()

The following example combines multiple PGN files into the same output, using "output" as the output file name:

    pgn_data = PGNData(["file1.pgn","file2.pgn"],"output")
    pgn_data.export()
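
When the input files live in one folder, the list can be built rather than typed by hand. This is a minimal sketch using only the standard library; `collect_pgn_files` is a hypothetical helper and is not provided by pgn2data.

```python
from pathlib import Path


def collect_pgn_files(folder):
    """Return the paths of all .pgn files directly inside folder, sorted by name."""
    return sorted(str(path) for path in Path(folder).glob("*.pgn"))


# List the PGN files in the current directory (may be empty).
print(collect_pgn_files("."))
```

The returned list can then be passed as the first argument to `PGNData`, in the same way as the hard-coded list above.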
    
The export function returns a result object that lets you quickly check the size and location of the files created:

    pgn_data = PGNData("tal_bronstein_1982.pgn")
    result = pgn_data.export()
    result.print_summary()

To check that the files have been created before doing further processing, you can do the following:

    pgn_data = PGNData("tal_bronstein_1982.pgn")
    result = pgn_data.export()
    if result.is_complete:
        print("Files created!")
    else:
        print("Files not created!")

The result object also provides methods to import the created files into pandas dataframes:

    pgn_data = PGNData("tal_bronstein_1982.pgn")
    result = pgn_data.export()
    if result.is_complete:
        
        # read the games file
        games_df = result.get_games_df()
        
        # read the moves file
        moves_df = result.get_moves_df()
        
        # read both files joined together
        combined_df = result.get_combined_df()
        
        print(games_df.head())
        print(moves_df.head())
        print(combined_df.head())


## Examples

The 'samples' folder in this repository contains some examples of the library's output.

You can also see a [Kaggle project](https://www.kaggle.com/datasets/zq1200/magnus-carlsen-lichess-games-dataset) that converted all of Magnus Carlsen's online Bullet games into CSV format.


## Columns

This is a full list of the columns in each output file:

### Games File

| Field                 | Description                        |
|-----------------------|------------------------------------|
| game_id               | ID of game generated by process    |
| game_order            | Order of game in PGN file          |
| event                 | Event                              |
| site                  | Site                               |
| date_played           | Date played                        |
| round                 | Round                              |
| white                 | White player                       |
| black                 | Black player                       |
| result                | Result                             |
| white_elo             | White player rating                |
| white_rating_diff     | White rating difference from Black |
| black_elo             | Black player rating                |
| black_rating_diff     | Black rating difference from White |
| white_title           | White player title                 |
| black_title           | Black player title                 |
| winner                | Winning player name                |
| winner_elo            | Winning player rating              |
| loser                 | Losing player name                 |
| loser_elo             | Losing player rating               |
| winner_loser_elo_diff | Difference in rating               |
| eco                   | Opening                            |
| termination           | How game ended                     |
| time_control          | Time control                       |
| utc_date              | Date played                        |
| utc_time              | Time played                        |
| variant               | Game type                          |
| ply_count             | Ply Count                          |
| date_created          | Extract date                       |
| file_name             | PGN source file                    |


### Moves File

| Field                          | Description                                                             |
|--------------------------------|-------------------------------------------------------------------------|
| game_id                        | ID of game that maps to games file                                      |
| move_no                        | Order of moves                                                          |
| move_no_pair                   | Chess move number                                                       |
| player                         | Player name                                                             |
| notation                       | Standard notation of move                                               |
| move                           | Before and after piece location                                         |
| from_square                    | Piece location before                                                   |
| to_square                      | Piece location after                                                    |
| piece                          | Initial of piece name                                                   |
| color                          | Piece color                                                             |
| fen                            | FEN position                                                            |
| is_check                       | Is check on board                                                       |
| is_check_mate                  | Is checkmate on board                                                   |
| is_fifty_moves                 | Is 50-move rule reached                                                 |
| is_fivefold_repetition         | Is fivefold repetition on board                                         |
| is_game_over                   | Is game over                                                            |
| is_insufficient_material       | Is game over from lack of mating material                               |
| white_count                    | Count of white pieces                                                   |
| black_count                    | Count of black pieces                                                   |
| white_{piece}_count            | Count of the specified white piece                                      |
| black_{piece}_count            | Count of the specified black piece                                      |
| captured_score_for_white       | Total of black pieces captured                                          |
| captured_score_for_black       | Total of white pieces captured                                          |
| fen_row{number}_{colour}_count | Number of pieces for the specified colour on this row of the board      |
| fen_row{number}_{colour}_value | Total value of pieces for the specified colour on this row of the board |
| move_sequence                  | Sequence of moves up to the current position                            |
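
The piece-count columns can be understood in terms of the FEN string, where uppercase letters are white pieces and lowercase letters are black pieces. The sketch below shows one way such counts could be derived from the `fen` column. `count_pieces` and `row_counts` are hypothetical helpers, and whether the library numbers its `fen_row` columns in the same order as the FEN rows (rank 8 first) is an assumption.

```python
def count_pieces(fen):
    """Return (white_count, black_count) from the piece-placement field of a FEN string."""
    placement = fen.split()[0]
    white = sum(1 for ch in placement if ch.isupper())
    black = sum(1 for ch in placement if ch.islower())
    return white, black


def row_counts(fen, colour):
    """Count pieces of one colour on each FEN row, listed from rank 8 down to rank 1."""
    is_own = str.isupper if colour == "white" else str.islower
    rows = fen.split()[0].split("/")
    return [sum(1 for ch in row if is_own(ch)) for row in rows]


# Standard starting position.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(count_pieces(START))         # (16, 16)
print(row_counts(START, "white"))  # [0, 0, 0, 0, 0, 0, 8, 8]
```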


## Acknowledgements

This project makes use of the [python-chess](https://github.com/niklasf/python-chess) library.
