pycrossword
0.4
Pure-Python implementation of a crossword puzzle generator and editor
A single import task to import words from a DIC file (downloaded from the Hunspell repo) to an SQLite database *.db file.
Public Member Functions
| def | __init__ (self, lang, dicfile=None, posrules=None, posrules_strict=False, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_stopcheck=None, id=0) |
| def | run (self) |
| Overridden worker method called when the task is started: does the import job. |
Public Attributes
| signals | HunspellImportSignals signals emitted by the import task |
| lang | str short name of the language, e.g. 'en' |
| dicfile | str | None full path to the DIC file to import words from |
| posrules | dict part-of-speech regular expression parsing rules |
| posrules_strict | bool whether to import only the parts of speech listed in posrules (True) or all parts of speech (False) |
| posdelim | str delimiter separating the word from its part of speech (default = '/') |
| lcase | bool import words in lower case |
| replacements | dict character replacement rules |
| remove_hyphens | bool remove all hyphens from words |
| filter_out | dict regex-based rules to exclude words |
| rows | 2-tuple | None the start and end rows (indices) of the words to import |
| commit_each | int threshold of DB insert operations after which the changes are written to the DB |
| on_stopcheck | callback callback function called periodically to check for interrupt condition |
| id | int unique ID of this task (in the thread pool) |
Private Member Functions
| def | _delete_db (self, db) |
| Deletes the existing DB file. |
| def | _get_pos (self, cur) |
| Retrieves the list of parts of speech present in the DB. |
A single import task to import words from a DIC file (downloaded from the Hunspell repo) to an SQLite database *.db file.
Derived from QtCore.QRunnable so the task can be run in a thread pool concurrently with other tasks.
def pycross.dbapi.HunspellImportTask.__init__(self, lang, dicfile=None, posrules=None, posrules_strict=False, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, rows=None, commit_each=1000, on_stopcheck=None, id=0)
| lang | str short name of the language, e.g. 'en' |
| dicfile | str | None full path to the DIC file to import words from (None means the default path will be assumed: pycross/assets/dic/<LANGUAGE>.dic) |
| posrules | dict part-of-speech regular expression parsing rules in the format: {'N': 'regex for nouns', 'V': 'regex for verbs', ...}
Possible keys are: 'N' [noun], 'V' [verb], 'ADV' [adverb], 'ADJ' [adjective],
'P' [participle], 'PRON' [pronoun], 'I' [interjection],
'C' [conjunction], 'PREP' [preposition], 'PROP' [proposition],
'MISC' [miscellaneous / other], 'NONE' [no POS] |
| posrules_strict | bool if True, only the parts of speech present in the posrules dict will be imported [all other words will be skipped]. If False (default), such words will be imported with 'MISC' and 'NONE' POS markers. |
| posdelim | str delimiter separating the word from its part of speech [default = '/'] |
| lcase | bool if True (default), found words will be imported in lower case; otherwise, the original case will remain |
| replacements | dict character replacement rules in the format: {'char_from': 'char_to', ...}. None (default) means no replacements. |
| remove_hyphens | bool if True (default), all hyphens ['-'] will be removed from the words |
| filter_out | dict regex-based rules to filter out [exclude] words in the format: {'word': ['regex1', 'regex2', ...], 'pos': ['regex1', 'regex2', ...]}. None (default) means no filter rules apply. |
| rows | 2-tuple | None the start and end rows (indices) of the words to import; e.g. (20, 100) means start import from row 20 and end import after row 100. If the second element in the tuple is negative (e.g. -1), only the start row will be considered and the import will go on till the last word in the source DIC file. None means ALL available words. |
| commit_each | int threshold of insert operations after which the transaction will be committed (default = 1000) |
| on_stopcheck | callback callback function called periodically to check for interrupt condition; takes 3 parameters:
|
| id | int unique ID of this task (in the thread pool) |
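The word-level parameters above can be illustrated with a minimal sketch. It is not the code of HunspellImportTask: `parse_entry` is a hypothetical helper, the order in which the rules are applied is an assumption, and so is the choice of 'MISC' for unmatched flags versus 'NONE' for entries without flags. The flag letters in the sample rules are made up for the example.

```python
import re

def parse_entry(line, posrules=None, posrules_strict=False, posdelim='/',
                lcase=True, replacements=None, remove_hyphens=True,
                filter_out=None):
    """Hypothetical helper: turn one DIC line into (word, pos), or None if skipped."""
    # Split 'word/FLAGS' at the first delimiter; flags is '' when there is none.
    word, _, flags = line.strip().partition(posdelim)
    if not word:
        return None
    if lcase:
        word = word.lower()
    if remove_hyphens:
        word = word.replace('-', '')
    for char_from, char_to in (replacements or {}).items():
        word = word.replace(char_from, char_to)
    # Exclusion rules: any matching regex drops the entry.
    rules = filter_out or {}
    if any(re.search(rx, word) for rx in rules.get('word', [])):
        return None
    if any(re.search(rx, flags) for rx in rules.get('pos', [])):
        return None
    # Map the flags to a POS key using the first matching rule.
    pos = None
    if flags:
        for key, rx in (posrules or {}).items():
            if re.search(rx, flags):
                pos = key
                break
    if pos is None:
        if posrules_strict:
            return None  # strict mode: skip words with no recognised POS
        # Assumption: 'MISC' for unmatched flags, 'NONE' when there are no flags.
        pos = 'MISC' if flags else 'NONE'
    return word, pos

# Made-up flag letters, for illustration only.
sample_rules = {'N': r'^[MS]+$', 'V': r'[DG]'}
noun = parse_entry('Well-being/MS', posrules=sample_rules)
verb = parse_entry('run/DG', posrules=sample_rules)
bare = parse_entry('hello', posrules=sample_rules)
skipped = parse_entry('xyz/Q', posrules=sample_rules, posrules_strict=True)
```

In this sketch the word is normalised before the filter rules run, so a 'word' regex in filter_out sees the lower-cased, hyphen-free form; the real task may order these steps differently.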
def pycross.dbapi.HunspellImportTask._delete_db(self, db) [private]
Deletes the existing DB file.
| db | Sqlitedb a single SQLite database to delete |
def pycross.dbapi.HunspellImportTask._get_pos(self, cur) [private]
Retrieves the list of parts of speech present in the DB.
| cur | SQLite cursor object the DB cursor |
Returns: list parts of speech in the short form, e.g. ['N', 'V']
def pycross.dbapi.HunspellImportTask.run(self)
Overridden worker method called when the task is started: does the import job.
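The following is a rough sketch of how an import job could honour the rows and commit_each settings using the standard sqlite3 module. It is not the body of run(): `select_rows`, `import_words`, and the `words` table layout are hypothetical, row indices are assumed to be 0-based with an inclusive end row, and the on_stopcheck callback is left out.

```python
import sqlite3

def select_rows(entries, rows=None):
    """Hypothetical helper: apply the (start, end) window described for `rows`."""
    if rows is None:
        return list(entries)            # None means all available words
    start, end = rows
    if end < 0:
        return list(entries[start:])    # negative end: import up to the last word
    return list(entries[start:end + 1]) # assumed inclusive end row

def import_words(con, entries, commit_each=1000):
    """Insert (word, pos) pairs, committing after every `commit_each` inserts."""
    cur = con.cursor()
    # Assumed table layout, for illustration only.
    cur.execute('CREATE TABLE IF NOT EXISTS words (word TEXT, pos TEXT)')
    pending = 0
    imported = 0
    for word, pos in entries:
        cur.execute('INSERT INTO words (word, pos) VALUES (?, ?)', (word, pos))
        pending += 1
        imported += 1
        if pending >= commit_each:
            con.commit()                # flush the current batch to the DB
            pending = 0
    con.commit()                        # commit whatever is left over
    return imported

con = sqlite3.connect(':memory:')
entries = [('word%d' % i, 'N') for i in range(10)]
imported = import_words(con, select_rows(entries, (2, 5)), commit_each=3)
```

Committing in batches rather than after every insert keeps the number of transactions small, which is the usual reason for a threshold such as commit_each.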
| pycross.dbapi.HunspellImportTask.commit_each |
int threshold of DB insert operations after which the changes are written to the DB
| pycross.dbapi.HunspellImportTask.dicfile |
str | None full path to the DIC file to import words from
| pycross.dbapi.HunspellImportTask.filter_out |
dict regex-based rules to exclude words
| pycross.dbapi.HunspellImportTask.id |
int unique ID of this task (in the thread pool)
| pycross.dbapi.HunspellImportTask.lang |
str short name of the language, e.g. 'en'
| pycross.dbapi.HunspellImportTask.lcase |
bool import words in lower case
| pycross.dbapi.HunspellImportTask.on_stopcheck |
callback callback function called periodically to check for interrupt condition
| pycross.dbapi.HunspellImportTask.posdelim |
str delimiter separating the word from its part of speech (default = '/')
| pycross.dbapi.HunspellImportTask.posrules |
dict part-of-speech regular expression parsing rules
| pycross.dbapi.HunspellImportTask.posrules_strict |
bool whether to import only the parts of speech listed in posrules (True) or all parts of speech (False)
| pycross.dbapi.HunspellImportTask.remove_hyphens |
bool remove all hyphens from words
| pycross.dbapi.HunspellImportTask.replacements |
dict character replacement rules
| pycross.dbapi.HunspellImportTask.rows |
2-tuple | None the start and end rows (indices) of the words to import
| pycross.dbapi.HunspellImportTask.signals |
HunspellImportSignals signals emitted by the import task