Metadata-Version: 2.1
Name: tf-word2vec
Version: 1.0.7
Summary: Word2Vec implementation with TensorFlow Estimators and Datasets
Home-page: https://github.com/akb89/word2vec
Author: Alexandre Kabbach
Author-email: akb@3azouz.net
License: MIT
Download-URL: https://github.com/akb89/word2vec/archive/1.0.3.tar.gz
Description: # Word2Vec
        
        [![GitHub release][release-image]][release-url]
        [![PyPI release][pypi-image]][pypi-url]
        [![Build][travis-image]][travis-url]
        [![MIT License][license-image]][license-url]
        
        This is a re-implementation of Word2Vec relying on TensorFlow
        [Estimators](https://www.tensorflow.org/guide/estimators) and
        [Datasets](https://www.tensorflow.org/guide/datasets_for_estimators).
        
        Works with Python >= 3.6 and TensorFlow v2.0.
        
        ## Install
        via pip:
        ```shell
        pip3 install tf-word2vec
        ```
        or, after a git clone:
        ```shell
        python3 setup.py install
        ```
        
        ## Get data
        You can download a sample of the English Wikipedia here:
        ```shell
        wget http://129.194.21.122/~kabbach/enwiki.20190120.sample10.0.balanced.txt.7z
        ```
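        
        The download is a 7z archive, while the training command expects a
        plain `.txt` file, so it presumably needs to be extracted first. A
        minimal sketch, assuming the p7zip `7z` command-line tool is installed
        (the original instructions do not name an extraction tool):
        ```shell
        # Assumption: the p7zip command-line tool `7z` is available on the PATH.
        ARCHIVE=enwiki.20190120.sample10.0.balanced.txt.7z
        # Extract into the current directory, but only if the download is present.
        if [ -f "$ARCHIVE" ]; then
          7z x "$ARCHIVE"
        fi
        ```
        The extracted file should be the `.txt` corpus whose absolute path is
        passed to `--data`.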
        
        ## Train Word2Vec
        ```shell
        w2v train \
          --data /absolute/path/to/enwiki.20190120.sample10.0.balanced.txt \
          --outputdir /absolute/path/to/word2vec/models \
          --alpha 0.025 \
          --neg 5 \
          --window 2 \
          --epochs 5 \
          --size 300 \
          --min-count 50 \
          --sample 1e-5 \
          --train-mode skipgram \
          --t-num-threads 20 \
          --p-num-threads 25 \
          --keep-checkpoint-max 3 \
          --batch 1 \
          --shuffling-buffer-size 10000 \
          --save-summary-steps 10000 \
          --save-checkpoints-steps 100000 \
          --log-step-count-steps 10000
        ```
        
        [release-image]:https://img.shields.io/github/release/akb89/word2vec.svg?style=flat-square
        [release-url]:https://github.com/akb89/word2vec/releases/latest
        [pypi-image]:https://img.shields.io/pypi/v/tf-word2vec.svg?style=flat-square
        [pypi-url]:https://pypi.org/project/tf-word2vec/
        [travis-image]:https://img.shields.io/travis/akb89/word2vec.svg?style=flat-square
        [travis-url]:https://travis-ci.org/akb89/word2vec
        [license-image]:http://img.shields.io/badge/license-MIT-000000.svg?style=flat-square
        [license-url]:LICENSE.txt
        
Keywords: word2vec,word embeddings,tensorflow,estimators,datasets
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
Description-Content-Type: text/markdown
