# This is a non-breaking prefix list for the Turkish language.
# The file is used for sentence tokenization (text -> sentence splitting).
#
# The list was compiled by hand by a programmer (not a linguist) who does not speak Turkish, so it can surely be improved.
#

# Any entry in this file, when followed by a period (and then an upper-case word),
# does NOT mark the end of a sentence.
# Special cases are included for prefixes that ONLY appear before 0-9 numbers.
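# Illustration, assuming a Moses-style sentence splitter consumes this list:
#   "Prof. Yılmaz geldi." stays one sentence, because "Prof" is listed below;
#   a period after an unlisted word, followed by an upper-case word, is still treated as a sentence end.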

# Any single upper case letter followed by a period is not a sentence ender
# (excluding I occasionally, but we leave it in).
# Usually upper case letters are initials in a name.
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z

# Usually upper case letters are initials in a name (Turkish alphabet)
Ç
Ğ
İ
Ö
Ş
Ü

# Roman Numerals
I
II
III
IV
V
VI
VII
VIII
IX
X
XI
XII
XIII
XIV
XV
XVI
XVII
XVIII
XIX
XX

# English -- but these work globally for all languages:
Mr
Mrs
No
pp
St
no
Sr
Jr
Bros
etc
vs
esp
Fig
fig
Jan
Feb
Mar
Apr
Jun
Jul
Aug
Sep
Sept
Oct
Okt
Nov
Dec
Ph.D
PhD
# in "et al."
al
cf
Inc
Ms
Gen
Sen
Prof
Dr
Corp
Co

# http://en.wiktionary.org/wiki/Category:Turkish_abbreviations
Av

# http://en.bab.la/phrases/business/abbreviations/english-turkish/
no
Başk.yard
Bşk.yrd

# http://www.learningpracticalturkish.com/turkish-acronyms.html
Akad
Alb
Alm
anat
ant
Apt
Ar
Ar. Gör
ark
As
Asb
Asist
astr
astrol
Atğm
atm
Av
bağ
Bçvş
B.E
bitb
biy
bk
Bl
Bn
Bnb
bot
Böl
bs
Bşk
Bul
Bulg
C
Cad
coğ
çev
Çvş
D
dam
db
dbl
Doç
doğ
Dr
Dz. Kuv. K
dzş
e
Ecz
ed
ekon
Ens
Erm
F
f
Fak
Far
fel
fil
fiz
Fr
Gen
geom
gn
Gnkur
Gön
H.O
Hv. Kuv
Hv. Kuv. K
Hz
İbr
is
İsp
İt
A
AA
AAM
AB
ABB
ABD
ABO
ABS
ABTA
ADF
ADSL
AEA
AET
AFFB
AGE
AGIK
AGIT
AGM
AGY
AHA
AIHAD
AIHM
AK
AKBIL
AKDTYK
AKM
AKPM
AKT
AKUT
ALB
ALM
ANAT
ANT
AO
AOÇ
AÖF
APO
APS
APT
AR
ARAD
ARGE
ARK
ARMTS
ARS G
AS
AS IZ
ASB
ASELSAN
ASKI
ASO
ASPAVA
AST
ASTB
ASTI
ASTR
ASTROL
AT
ATAA
ATC
ATCA
ATFA
ATFC
ATGM
ATKY
ATM
ATO
AÜ
AUA
AV
AYB
B
BÜ
BAE
BAG
BAG-KUR
BAIB
BB
BBB
BCVS
BDDK
BDT
BE
BEDD
BEK
BELGEÇ
BIFO
BITB
BIY
BK
BL
BM
BMCYT
BMIYK
BMT
BMTT
BMUTB
BN
BNB
BO
BOB
BOL
BOSB
BOT
BOTAS
BS
BSK
BTTO
BUL
BULG
BYOB
C
CÜ
CAD
ÇAYKUR
CE
ÇEV
CHP
CM
CMUK
COG
COK
COMECON
COS
ÇS
CSO
ÇÜ
CUF
ÇUKOBIRLIK
CUM BSK
ÇVS
D
DÜ
DAKA
DAL
DAM
DAP
DB
DBL
DDY
DEÜ
DG
DGM
DGS
DHMI
DIE
DK
dL
DLH
DM
DMO
DNA
DOC
DOG
DOKA
DPT
DR
DRL
DSÖ
DSI
DT
DTÖ
DTCF
DTP
DVDB
DVK
DZ KUV
DZ KUV K
DZKK
DZL
E
EAT
EBKT
EBYB
ECZ
ED
EFT
EGB
EGO
EIEI
EKON
EMK
ENS
EPDK
ERDEMIR
ERM
ESHOT
ET
ETAE
ETTO
EÜ
EYY
F
FAB
FAK
FAR
FBA
FBSK
FEL
FIL
FISKOBIRL…
FIZ
FIZY
FKB
FM
FR
FÜ
G
GAHD
GAMHK
GAP
GATA
GAZÜ
GB
GD
GEN
GEOM
GHAT
GN
GNKUR
GON
GOÜ
GR
GRT
GSMH
GSYH
GTIB
GTIP
GÜ
H
HÜ
HA
HAVHG
HDIUB
HEK
HIV
HKMO
HL
HLK
HMUK
HO
HOCG
HS UZM
HST
HUK
HV KUV
HV KUV K
HZ
HZ OZ
HZL
IÜ
IAL
IBB
IBR
IETT
IFM
IFMO
IGHAO
IKÖ
IKB
IMKB
ING
IRTK
IS
ISDEMIR
ISIB
ISKI
ISKUR
ISL
ISMD
ISO
ISOT
ISP
IT
ITÜ
ITB
ITHS
ITO
ITTU
ITU
IZTB
IZTO
J
JAP
JEOL
JGK
JHY
JKDB
JTAD
K
KÜ
KAAT
KAG
KAL
KARDEMIR
KAUR
KB
KBB
KD
KDV
KEIB
KG
KH
KHK
KIBB
KIK
KIM
KIT
KKK
KKTC
KM
KMFB
KOB
KOBI
KOI
KOOR
KOR
KORA
KORG
KOSGEB
KPDS
KRS
KTÜ
KTFKD
KTLN
KUP
KUR
KUR BSK
L
LAT
LBGFO
LCV
LES
LIM
M
MAC
MAH
MAN
MAO
MAT
MCT
MD
MEB
MEC
MESAM
MGK
MHAT
MIM
MIN
MIT
MK
MKE
MKS
MKYK
MM
MÖ
MPI
MPM
MS
MSÜ
MSN
MTA
MTCT
MTDS
MTFS
MTTB
MÜ
MUH
MUR
MUSAD
MUZ
MYK
Naber
NBDK
NETIS
NKFVAS
NO
NO SB
NTIC
NTNU
NU
ODTU
ÖIB
OKT
ONB
OPR
OR
ORA
ORD
ORG
ORT
OSB
OSM T
ÖSS
ÖSYM
ÖT
OTSI
ÖTV
OYAK
ÖYK
ÖYS
ÖZ
PED
PETKIM
PIEAUO
PIEUT
PK
PO
POAS
PORT
PROF
PSIKOL
PTT
QTM
RNA
RTÜK
RUM
RUS
S
SÜ
SA
SAS
SAT
SATBA
SB
SBF
SBTA
SEK
SEKA
SF
SHÇEK
SHKD
SKT
SL
SN
SNT
SOK
SOS
SP
SPG
SPK
SSCB
SSK
SSM
SSSG
STÖ
STFF
STK
SYS
T
TÜ
TÜFE
TAAB
TAAM
TACA
TACC
TAD
TAEK
TAI
TALA
TAN
TAO
TAR
TARC
TAS
TASM
TB
TBB
TBMM
TC
TCAA
TCCE
TCDD
TCK
TCL
TCLR
TCMA
TCMB
TCZB
TDÇI
TDD
TDI
TDK
TEAS
TEDAS
TEFE
TEIAS
TEK
TEL
TELG
TETAS
TFSA
TFSC
TGC
TGIC
TGM
TGNA
TGOD
TGS
THA
THK
THS
THSD
THTDC
THUS
THY
TIGEM
TIKA
TIM
TIRET
TISE
TISK
TITC
TIY
TJEM
TJG
TJOS
TKAE
TKB
TKF
TKI
TL
TLAR
TLCE
TLCP
TLFC
TLKS
TLS
TM
Tmm
TMMOB
TMO
TNPS
TOBB
TODAIE
TOKI
TOP
TP
TPAO
TPF
TPLA
TPOC
TPTO
TR
TRNC
TRT
TRTC
TSBO
TSE
TSI
TSIG
TSK
TSKA
TSMS
TSO
TSOR
TT
TTB
TTK
TTKB
TTOK
TTPP
TÜBA
TÜBITAK
TUGA
TUGG
TÜIK
TUM
TÜMA
TÜMG
TUOK
TÜPRAS
TURANT
TÜRDOK
TURK
TÜRKSAT
TÜRSAB
TUS
TÜTAV
TV
TZDK
TZOB
Ü
UA
UAA
UAAF
UAD
UAEA
UAGF
UAK
UAT
UBAM
Ubb
UBE
UBF
UBOK
UCPOLK
UCT
ÜÇVS
ÜDS
ÜFE
UFFA
UHD
UHTB
UHUHM
UK
UKA
UKB
UKK
UKYT
UMT
UN
UNESCO
UNICEF
ÜNL
UOK
UPB
UPF
USA
USAS
USDN
USF
USHÖ
USIAD
USKN
USMN
USO
ÜT
ÜTGM
UTWM
UÜ
UYB
UYF
UYIU
UZM
VA
VB
VD
VDMK
VET
VGHD
VIOP
VS
WTWA
Y. MIM
Y. MÜH
YAY
YB
YD SB
YDOIT
YDS
YGK
YKR
YOK
YÖS
YRD DOC
YSE
YSK
YTÜ
YTL
YTML
YTUDAK
YURTKUR
YY
YYÜ
YZB
ZF
ZM
ZMO
ZOOL

# Number indicators
# add #NUMERIC_ONLY# after the word if it should ONLY be non-breaking when a 0-9 digit follows it
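# Example (hypothetical entry, shown for illustration only and not active here):
#   pp #NUMERIC_ONLY#
# With such an entry, "pp. 12" would be kept together, while "pp." followed by a word could still end a sentence.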
No #NUMERIC_ONLY#

# Turkish writes ordinals with a trailing period: "1." corresponds to English "1st".
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99