Module tf.writing.hebrew
Hebrew characters
Disclaimer
This is just a look-up table, not a full exposition of the organization of the Masoretic system.
Transcriptions
The ETCBC transcription is used by the ETCBC.
It has entries for all accents, but not for text-critical annotations such as uncertainty and correction.
The Abegg transcription is used in the Dead Sea scrolls.
It has no entries for accents, but it has a repertoire of text-critical marks.
We have back-translated the latter to ETCBC-compatible variants and entered them in the ETCBC column, although they are not strictly ETCBC marks.
Phonetics
The phonetic representation is meant as a tentative 1-1 correspondence with pronunciation, not with the script. See phono.ipynb, where the phonetic transcription is computed and thoroughly documented.
Consonants
Details
- For most consonants: an inner dot is a dagesh forte.
- For the בגדכפת consonants: an inner dot is either a dagesh forte or a dagesh lene.
- When the ה contains a dot, it is called a mappiq.
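The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of Text-Fabric: the function name `classify_inner_dots` and the wording of the returned labels are assumptions. Deciding between forte and lene for a בגדכפת consonant needs context that this sketch does not look at, and the special case for ו follows the remark in the consonant table that its dot can also mark a vowel.

```python
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ
HE = "\u05D4"
VAV = "\u05D5"
# bet, gimel, dalet, kaf, pe, tav (non-final forms only)
BEGADKEFAT = set("\u05D1\u05D2\u05D3\u05DB\u05E4\u05EA")


def classify_inner_dots(text):
    """Return (consonant, reading) pairs for each consonant that carries an inner dot.

    The labels are illustrative; they only restate the rules in the list above.
    """
    results = []
    base = None  # most recent Hebrew letter seen
    for ch in text:
        if "\u05D0" <= ch <= "\u05EA":
            base = ch
        elif ch == DAGESH and base is not None:
            if base == HE:
                reading = "mappiq"
            elif base in BEGADKEFAT:
                reading = "dagesh forte or dagesh lene"
            elif base == VAV:
                # per the consonant table, the dot in vav may also mark a vowel
                reading = "dagesh forte or vowel"
            else:
                reading = "dagesh forte"
            results.append((base, reading))
    return results
```

For example, `classify_inner_dots("\u05DE\u05BC")` reports a mem with a dagesh forte, while a ה followed by the same point is reported as a mappiq.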
transcription (ETCBC) | transcription (Abegg) | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|---|
> | a | א | ʔ | when not mater lectionis | letter alef | 05D0 |
B | b | ב | bb, b, v | forte, lene, normal | letter bet | 05D1 |
G | g | ג | gg, g, ḡ | forte, lene, normal | letter gimel | 05D2 |
D | d | ד | dd, d, ḏ | forte, lene, normal | letter dalet | 05D3 |
H | h | ה | h | also with mappiq; when not mater lectionis | letter he | 05D4 |
W | w | ו | ww, w, û | forte, when not part of a long vowel, with dagesh as vowel | letter vav | 05D5 |
Z | z | ז | zz, z | forte, normal | letter zayin | 05D6 |
X | j | ח | ḥ | | letter het | 05D7 |
V | f | ט | ṭ | | letter tet | 05D8 |
J | y | י | yy, y, ʸ | forte, when not part of long vowel, in front of final ו | letter yod | 05D9 |
K | k | כ | kk, k, ḵ | forte, lene, normal | letter kaf | 05DB |
k | K | ך | k, ḵ | forte, normal | letter final kaf | 05DA |
L | l | ל | ll, l | forte, normal | letter lamed | 05DC |
M | m | מ | mm, m | forte, normal | letter mem | 05DE |
m | M | ם | m | | letter final mem | 05DD |
N | n | נ | nn, n | forte, normal | letter nun | 05E0 |
n | N | ן | n | | letter final nun | 05DF |
S | s | ס | ss, s | forte, normal | letter samekh | 05E1 |
< | o | ע | ʕ | | letter ayin | 05E2 |
P | p | פ | pp, p, f | forte, lene, normal | letter pe | 05E4 |
p | P | ף | p, f | forte, normal | letter final pe | 05E3 |
Y | x | צ | ṣṣ, ṣ | forte, normal | letter tsadi | 05E6 |
y | X | ץ | ṣ | | letter final tsadi | 05E5 |
Q | q | ק | qq, q | forte, normal | letter qof | 05E7 |
R | r | ר | rr, r | forte, normal | letter resh | 05E8 |
# | C | ש | ŝ | | letter shin without dot | 05E9 |
C | v | שׁ | šš, š | forte, normal | letter shin with shin dot | FB2A |
F | c | שׂ | śś, ś | forte, normal | letter shin with sin dot | FB2B |
T | t | ת | tt, t, ṯ | forte, lene, normal | letter tav | 05EA |
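As an illustration of how the ETCBC and UNICODE columns of this table relate, here is a minimal sketch that turns a string of ETCBC consonant codes into Hebrew characters. The names `ETCBC_CONSONANTS` and `etcbc_consonants_to_hebrew` are hypothetical and not part of Text-Fabric; the sketch covers consonants only and leaves any other character unchanged.

```python
# ETCBC consonant code -> Unicode code point, copied from the table above.
ETCBC_CONSONANTS = {
    ">": 0x05D0, "B": 0x05D1, "G": 0x05D2, "D": 0x05D3, "H": 0x05D4,
    "W": 0x05D5, "Z": 0x05D6, "X": 0x05D7, "V": 0x05D8, "J": 0x05D9,
    "K": 0x05DB, "k": 0x05DA, "L": 0x05DC, "M": 0x05DE, "m": 0x05DD,
    "N": 0x05E0, "n": 0x05DF, "S": 0x05E1, "<": 0x05E2, "P": 0x05E4,
    "p": 0x05E3, "Y": 0x05E6, "y": 0x05E5, "Q": 0x05E7, "R": 0x05E8,
    "#": 0x05E9, "C": 0xFB2A, "F": 0xFB2B, "T": 0x05EA,
}


def etcbc_consonants_to_hebrew(coded):
    """Map each ETCBC consonant code to its Hebrew character.

    Characters that are not consonant codes (vowels, accents, digits)
    are passed through unchanged, because this sketch does not cover them.
    """
    return "".join(
        chr(ETCBC_CONSONANTS[ch]) if ch in ETCBC_CONSONANTS else ch
        for ch in coded
    )
```

With this mapping, `etcbc_consonants_to_hebrew("BR>CJT")` yields the consonants of the first word of Genesis, with the shin rendered as the precomposed character FB2A.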
Vowels
Qere Ketiv
The phonetics follows the qere, not the ketiv, when they are different.
In that case a * is added.
Tetragrammaton
The tetragrammaton יהוה is (vowel)-pointed in different ways; the phonetics follows the pointing, but the tetragrammaton is put between [ ].
transcription (ETCBC) | transcription (Abegg) | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|---|
A | A Å | ַ | a, ₐ | normal, *furtive* | point patah | 05B7 |
:A | S | ֲ | ᵃ | | point hataf patah | 05B2 |
@ | D ∂ Î | ָ | ā, o | gadol, qatan | point qamats | 05B8 |
:@ | F ƒ Ï | ֳ | ᵒ | | point hataf qamats | 05B3 |
E | R ® ‰ | ֶ | e, eʸ | normal, with following י | point segol | 05B6 |
:E | T | ֱ | ᵉ, ᵉʸ | normal, with following י | point hataf segol | 05B1 |
; | E é ´ | ֵ | ê, ē | with following י, alone | point tsere | 05B5 |
I | I ˆ î Ê | ִ | î, i | with following י, alone | point hiriq | 05B4 |
O | O ø | ֹ | ô, ō | with following ו, alone | point holam | 05B9 |
U | U ü ¨ | ֻ | u | | point qubuts | 05BB |
: | V √ J ◊ | ְ | ᵊ | left out if silent | point sheva | 05B0 |
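The name and UNICODE columns can be cross-checked against the Unicode character database with Python's standard `unicodedata` module. The sketch below does this for the vowel rows; the list `VOWEL_ROWS` and the helper `unicode_name_matches` are illustrative names, and the check assumes that each table name is the official Unicode name without its `HEBREW` prefix.

```python
import unicodedata

# (ETCBC code, name column, UNICODE column) for the vowel rows above.
VOWEL_ROWS = [
    ("A", "point patah", "05B7"),
    (":A", "point hataf patah", "05B2"),
    ("@", "point qamats", "05B8"),
    (":@", "point hataf qamats", "05B3"),
    ("E", "point segol", "05B6"),
    (":E", "point hataf segol", "05B1"),
    (";", "point tsere", "05B5"),
    ("I", "point hiriq", "05B4"),
    ("O", "point holam", "05B9"),
    ("U", "point qubuts", "05BB"),
    (":", "point sheva", "05B0"),
]


def unicode_name_matches(name, code):
    """True if the table name equals the official Unicode name minus 'HEBREW '."""
    official = unicodedata.name(chr(int(code, 16)))
    return official == "HEBREW " + name.upper()


# Rows whose name and code point disagree; empty if the table is consistent.
mismatches = [row for row in VOWEL_ROWS if not unicode_name_matches(row[1], row[2])]
```

The same check can be applied to the other tables, as long as rows with non-Unicode names such as paleo-divider and morpheme-break are left out.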
Other points and marks
transcription (ETCBC) | transcription (Abegg) | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|---|
. | ; … Ú ¥ Ω | ּ | | | point dagesh or mapiq | 05BC |
.c | | ׁ | | | point shin dot | 05C1 |
.f | | ׂ | | | point sin dot | 05C2 |
, | | ֿ | | | point rafe | 05BF |
35 | | ֽ | ˈ | | point meteg | 05BD |
45 | | ֽ | ˈ | | point meteg | 05BD |
75 | | ֽ | ˈ | | point meteg | 05BD |
95 | | ֽ | ˈ | | point meteg | 05BD |
52 | | ׄ | ˈ | | mark upper dot | 05C4 |
53 | | ׅ | ˈ | | mark lower dot | 05C5 |
* | | ֯ | | | mark masora circle | 05AF |
Punctuation
Details
Some specialities in the Masoretic system are not reflected in the phonetics: setumah ס; petuhah ף; nun-hafuka ̇׆.
transcription (ETCBC) | transcription (Abegg) | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|---|
00 | . | ׃ | . | punctuation sof pasuq | 05C3 | |
ñ | ׆ | punctuation nun hafukha | 05C6 | |||
& | - | ־ | - | punctuation maqaf | 05BE | |
_ | (non breaking space) | space | 0020 | |||
0000 | ± | ׃׃ | Dead Sea scrolls. We use as Hebrew character a double sof pasuq. | paleo-divider | 05C3 05C3 | |
' | / | ׳ | Dead Sea scrolls. We use as Hebrew character a geresh. | morpheme-break | 05F3 |
Hybrid
Details
There is a character that is mostly punctuation, but that can also influence the nature of some accents occurring in the word before. Such a character is a hybrid between punctuation and accent. See also the documentation of the BHSA about cantillation.
transcription | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|
05 | ׀ | | | punctuation paseq | 05C0 |
Accents
Details
Some accents play a role in deciding whether a schwa is silent or mobile and whether a qamets is gadol or qatan.
In the phonetics those accents appear as ˈ or ˌ.
Implied accents are also added.
transcription | glyph | phonetic | remarks | name | UNICODE |
---|---|---|---|---|---|
94 | ֧ | ˈ | | accent darga | 05A7 |
13 | ֭ | ˈ | | accent dehi | 05AD |
92 | ֑ | ˈ | | accent etnahta | 0591 |
61 | ֜ | ˈ | | accent geresh | 059C |
11 | ֝ | ˈ | | accent geresh muqdam | 059D |
62 | ֞ | ˈ | | accent gershayim | 059E |
64 | ֬ | ˈ | | accent iluy | 05AC |
70 | ֤ | ˈ | | accent mahapakh | 05A4 |
71 | ֥ | ˌ | | accent merkha | 05A5 |
72 | ֦ | ˈ | | accent merkha kefula | 05A6 |
74 | ֣ | ˈ | | accent munah | 05A3 |
60 | ֫ | ˈ | | accent ole | 05AB |
03 | ֙ | | | accent pashta | 0599 |
83 | ֡ | ˈ | | accent pazer | 05A1 |
33 | ֨ | ˈ | | accent qadma | 05A8 |
63 | ֨ | ˌ | | accent qadma | 05A8 |
84 | ֟ | ˈ | | accent qarney para | 059F |
81 | ֗ | ˈ | | accent revia | 0597 |
01 | ֒ | | | accent segol | 0592 |
65 | ֓ | ˈ | | accent shalshelet | 0593 |
04 | ֩ | | | accent telisha qetana | 05A9 |
24 | ֩ | | | accent telisha qetana | 05A9 |
14 | ֠ | | | accent telisha gedola | 05A0 |
44 | ֠ | | | accent telisha gedola | 05A0 |
91 | ֛ | ˈ | | accent tevir | 059B |
73 | ֖ | ˌ | | accent tipeha | 0596 |
93 | ֪ | ˈ | | accent yerah ben yomo | 05AA |
10 | ֚ | ˈ | | accent yetiv | 059A |
80 | ֔ | ˈ | | accent zaqef qatan | 0594 |
85 | ֕ | ˈ | | accent zaqef gadol | 0595 |
82 | ֘ | ˈ | | accent zarqa | 0598 |
02 | ֮ | ˈ | | accent zinor | 05AE |
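A common use of this table is to remove cantillation accents from pointed text while keeping consonants, vowels, and other points. The sketch below builds the set of accent code points from the UNICODE column above and filters them out. The names `ACCENT_CODE_POINTS` and `strip_accents` are hypothetical, and the sketch deliberately leaves meteg (05BD) in place, since this page lists it under points rather than accents.

```python
# Code points from the UNICODE column of the accent table above.
ACCENT_CODE_POINTS = {
    0x0591, 0x0592, 0x0593, 0x0594, 0x0595, 0x0596, 0x0597, 0x0598,
    0x0599, 0x059A, 0x059B, 0x059C, 0x059D, 0x059E, 0x059F, 0x05A0,
    0x05A1, 0x05A3, 0x05A4, 0x05A5, 0x05A6, 0x05A7, 0x05A8, 0x05A9,
    0x05AA, 0x05AB, 0x05AC, 0x05AD, 0x05AE,
}


def strip_accents(text):
    """Remove the accents listed in the table; keep every other character."""
    return "".join(ch for ch in text if ord(ch) not in ACCENT_CODE_POINTS)
```

An explicit set is used instead of a code point range so that only characters that actually occur in the table are affected.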
Numerals
Details
These signs occur in the Dead Sea scrolls.
We represent them with conventional Hebrew characters for numbers and use the geresh accent or another accent to mark the letter as a numeral.
The ETCBC codes are obtained by translating back from the UNICODE.
transcription (ETCBC) | transcription (Abegg) | glyph | remarks | name |
---|---|---|---|---|
>' | A | א֜ | | number 1 |
>52 | å | אׄ | alternative for 1, often at the end of a number, we use the upper dot to distinguish it from the other 1 | number 1 |
>53 | B | אׅ | alternative for 1, often at the end of a number, we use the lower dot to distinguish it from the other 1 | number 1 |
>35 | ∫ | אֽ | alternative for 1, often at the end of a number, we use the meteg to distinguish it from the other 1 | number 1 |
J' | C | י֜ | | number 10 |
k' | D | ך֜ | | number 20 |
Q' | F | ק֜ | | number 100 |
& | + | ־ | we use the maqaf to represent addition between numbers | add |
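To show how the numeral codes combine, here is a small sketch that adds up a numeral written in the ETCBC-compatible codes of this table. It assumes that the parts of a number are joined by `&` (the add sign) with optional surrounding spaces; that layout, and the names `NUMERAL_VALUES` and `numeral_value`, are assumptions made for illustration and not a description of how the Dead Sea scrolls data is actually stored.

```python
# ETCBC-compatible numeral code -> value, copied from the table above.
NUMERAL_VALUES = {
    ">'": 1,
    ">52": 1,  # alternative 1, marked with the upper dot
    ">53": 1,  # alternative 1, marked with the lower dot
    ">35": 1,  # alternative 1, marked with the meteg
    "J'": 10,
    "k'": 20,
    "Q'": 100,
}
ADD = "&"  # the maqaf code, used here as the addition sign


def numeral_value(coded):
    """Sum the numeral codes in a string such as "Q' & k' & >52"."""
    total = 0
    for part in coded.split(ADD):
        part = part.strip()
        if part not in NUMERAL_VALUES:
            raise ValueError(f"unknown numeral code: {part!r}")
        total += NUMERAL_VALUES[part]
    return total
```

Under these assumptions, `numeral_value("Q' & k' & >52")` gives 121.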
Text-critical
Details
These signs occur in the Dead Sea scrolls. They are used to indicate uncertainty and editing acts by ancient scribes or modern editors. They do not have an associated glyph in UNICODE.
The ETCBC does not have codes for them, but we propose an ETCBC-compatible encoding for them. The ETCBC codes are surrounded by space, except for the brackets, where a space at the side of the ( or ) is not necessary.
Codes that are marked as flag apply to the preceding character.
Codes that are marked as brackets apply to the material within them.
transcription (Abegg) | transcription (ETCBC) | remarks | name |
---|---|---|---|
0 | ε | token | missing |
? | ? | token | uncertain (degree 1) |
\ | # | token | uncertain (degree 2) |
� | #? | token | uncertain (degree 3) |
Ø | ? | flag, applies to preceding character | uncertain (degree 1) |
« | # | flag, applies to preceding character | uncertain (degree 2) |
» | #? | flag, applies to preceding character | uncertain (degree 3) |
| | ## | flag, applies to preceding character | uncertain (degree 4) |
« » | (# #) | brackets | uncertain (degree 2) |
≤ ≥ | (- -) | brackets | vacat (empty space) |
( ) | ( ) | brackets | alternative |
[ ] | [ ] | brackets | reconstruction (modern) |
{ } | { } | brackets | removed (modern) |
{{ }} | {{ }} | brackets | removed (ancient) |
< > | (< >) | brackets | correction (modern) |
<< >> | (<< >>) | brackets | correction (ancient) |
^ ^ | (^ ^) | brackets | correction (supralinear, ancient) |