Gustek <gustek@riseup.net> | PGP Key | My code
Welcome. This page shouldn't even exist. I initially intended to publish a GPLv3 translation on this site, but I thought it would be weird to publish only a single, unintroduced file. Anyway, I'm Gustek, a second-year university student living in France. I mostly program in Rust, C and Common Lisp, but I enjoy some more functional stuff as well. I enjoy learning about languages; I speak French, Polish, English and Spanish.
I wrote this page using a single 68KiB Lisp expression because it's fun, I guess. You can find it here if you want (it's Beerware, so feel free to do whatever you want with it). It was written for SBCL, though, so you might need to modify it a bit, at least the shebang, to be able to run it with your implementation.
As consonants, ي, و, ه, ن, م, ل, ك, ق, ف, غ, ش, س, ز, ر, ذ, د, خ, ج, ث, ت, ب are simply rendered as b, t, th, j, kh, d, dh, r, z, s, sh, gh, f, q, k, l, m, n, h, w, y.
As vowels, ي, و, ى/آ/ا are romanised as ā, ū, ī or â, û, î. The short vowels ـُ, ـِ, ـَ are rendered as a, i, u. The tanwīn vowels ـٌ, ـٍ, ـً are never written.
ء, whether on top of a long vowel or not, is transliterated as ʔ, ʾ, ` or '. Similarly, ع is transliterated as ʕ, ʿ or ´. If a vowel follows the ع or ء after the definite marker ال, and the ʔ/ʕ form is not used, the vowel shall be capitalised. For instance, العقيدة could be written as either al-ʿAqīdah or al-ʕaqīdah.
ة is rendered as ah. Tanwīn vowels are never written after ة.
The pharyngeal(ised) consonants ظ, ط, ض, ص, ح can be written either ḥ, ṣ, ḍ, ṭ, ẓ or ẖ, s̱, ḏ, ṯ, ẕ.
Before shamsiyyah letters, ال is not written as al; rather, the following consonant is doubled. For instance, الدين is written ad-Dīn. Monosyllabic prepositions and proclitics are linked to the word by a hyphen. When these precede a definite word, the a of ال is dropped. For instance, في الدين is written fī-d-Dīn.
The grammatical short vowel endings are never written, thus ال is never rendered as ul or il.
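The article rules above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the input is an already-romanised, non-empty stem using the ḥ/ṣ/ḍ/ṭ/ẓ style, the list of shamsiyyah letters is the usual fourteen, and the function names `definite` and `with_proclitic` are made up for this example.

```python
import unicodedata

# Romanised shamsiyyah (sun) letters; digraphs are tested first so that
# "sh" is treated as one consonant rather than as "s" followed by "h".
SUN_DIGRAPHS = ("th", "dh", "sh")
SUN_SINGLES = set("tdrzsṣḍṭẓln")
# Marks for ء and ع after which the following vowel is capitalised.
HAMZA_AYN_MARKS = set("ʾʿ`'´")


def capitalise(stem):
    """Capitalise a stem, skipping a leading hamza/ayn mark."""
    if stem[0] in "ʔʕ":  # IPA-style forms: nothing is capitalised
        return stem
    if stem[0] in HAMZA_AYN_MARKS:
        return stem[0] + stem[1:2].upper() + stem[2:]
    return stem[0].upper() + stem[1:]


def definite(stem):
    """Prefix the definite article, assimilating it before sun letters."""
    stem = unicodedata.normalize("NFC", stem)
    for digraph in SUN_DIGRAPHS:
        if stem.startswith(digraph):
            return f"a{digraph}-{capitalise(stem)}"
    if stem[0] in SUN_SINGLES:
        return f"a{stem[0]}-{capitalise(stem)}"
    return "al-" + capitalise(stem)


def with_proclitic(proclitic, definite_word):
    """Link a proclitic with a hyphen and drop the a of the article."""
    return proclitic + "-" + definite_word[1:]


print(definite("dīn"))                        # ad-Dīn
print(definite("ʿaqīdah"))                    # al-ʿAqīdah
print(with_proclitic("fī", definite("dīn")))  # fī-d-Dīn
```

The three printed forms are the ones given as examples in the rules; anything beyond them (for instance the s̱/ḏ/ṯ/ẕ variants) is not handled by this sketch.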
Here are a few examples:
Default indentation is super-rigid and it manages to get math mode wrong every single time when editing LaTeX.
The Kazakh language is mainly written in two alphabets: Cyrillic (majority) and Latin (official). They are both bad, but for different reasons.
Let's start with the Cyrillic alphabet. It goes like this:
А /ɑ̝/ Ә /æ̝/
Б /b/ Г /g/ Ғ /ʁ/ Д /d/
Е /je̘/
Ж /ʑ/ З /z/
И /əj/
Й /j/
К /k/ Қ /q/ Л /l/ М /m/ Н /n/ Ң /ŋ/
О /o̞/ Ө /ɵ/
П /p/ Р /ɾ/ С /s/
Т /t/
У /ʊw/ /w/ Ұ /o̙/ Ү /ʉ/
Х /χ/ Ш /ɕ/
Ы /ə/ І /ɪ̞/ Э /e/
I removed the letters found only in loanwords because they're not important here. I put the vowels in bold typeface because they are the problem here.
First of all, as we can see, the /w/ and /ʊw/ sounds are represented by the same letter. This is problematic, for obvious reasons. The improvement idea is, as the /v/ sound is only found in loanwords, to use the В letter to represent /w/, as it is done in Kalmyk and Mongolian.
The next problem is the letters Ә and Ы. First, having a letter visually identical to the schwa not represent the schwa sound /ə/ is at best asinine. In other Turkic (and also Mongolic) languages using the Cyrillic alphabet, ы usually represents /ɯ/, and it is illogical to have a yery ⟨ы⟩, which visually contains the letter ⟨i⟩, represent a sound so far from the /i/-ish sounds. The improvement idea is to write the /æ/ sound as Ӕ (as done in Ossetian) and the /ə/ sound as Ә, as simple as that.
Next is Ұ. Although it is not a major problem, it feels a bit unnatural to have a very у-looking letter representing neither a /u/-ish nor a /y/-ish sound. The improvement idea is, once again, as the /ju/ sound only exists in loanwords, to use the Ю letter to write the /o̙/ sound.
Last but not least: И and І. It simply makes no sense to write /əj/ with a letter representing /i/ everywhere else. Thus, the improvement idea is to simply write it ӘЙ. Then І is not a problem in itself, as there is precedent (notably in Ruthenian) for writing the /ɪ̞/ sound like this, although it may be simpler to write it И.
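The vowel respellings proposed above can be tried out with a small Python sketch. It assumes the simpler option of writing /ɪ̞/ as И, and it leaves У alone, because telling /w/ from /ʊw/ needs knowledge of the word that a character table does not have. The name `respell` is made up for this example.

```python
# Current Cyrillic letter -> proposed spelling. str.translate applies all
# substitutions in one pass, so Ы -> Ә is not re-mapped to Ӕ afterwards,
# and І -> И is not re-mapped to ӘЙ.
RESPELL = str.maketrans({
    "Ә": "Ӕ", "ә": "ӕ",    # /æ/, as in Ossetian
    "Ы": "Ә", "ы": "ә",    # /ə/ gets the schwa-looking letter
    "И": "ӘЙ", "и": "әй",  # /əj/ spelled out
    "І": "И", "і": "и",    # /ɪ̞/
    "Ұ": "Ю", "ұ": "ю",    # /o̙/
})


def respell(text):
    """Rewrite text in the current orthography using the proposed vowels."""
    return text.translate(RESPELL)


print(respell("ұлы"))
print(respell("тіл"))
```

A chain of `str.replace` calls would get this wrong unless the order were chosen carefully, which is why a single translation table is used here.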
Now, the Latin alphabet (things get much worse).
A /ɑ̝/ Ä /æ/ B /b/ D /d/ E /e/ /je/ G /g/ Ğ /ʁ/ H /χ/ I /ɪ̞/ İ /j/ /əj/
J /ʑ/ K /k/ L /l/ M /m/ N /n/ Ñ /ŋ/ O /o/ Ö /ɵ/ P /p/ Q /q/
R /ɾ/ S /s/ Ş /ɕ/ T /t/ U /ʊw/ /w/ Ū /o̙/ Ü /ʉ/ Y /ə/ Z /z/
The latinization process, which the Kazakh government hopes to complete by 2025, is but an attempt to break away from and destroy the memory of the Soviet period and of the country's historical ties to Russia, while showing a desperate desire to befriend the West. This is of the utmost stupidity and ungratefulness. But we're not here to discuss politics, thus I'll simply show why their Latin alphabet is pure trash.
First, and maybe the biggest of the problems: ⟨ñ⟩. It's also used by the ALA-LC (it's made by Americans, thus it's not a surprise to find such stupidities) and by the Common Turkic Alphabet project, where it serves the same role as in the latinization of Kazakh. Anyway, in its original language (Spanish) and in more than 16 other languages, it represents /ɲ/. Its use for representing /ŋ/ simply makes no sense at all. Simple solution: just use the IPA symbol ⟨Ŋ⟩ or latinize the Cyrillic innovation (which is to me more visually pleasant) ⟨Ꞑ⟩.
Now, Ū. It's obviously a simple latinization of the Cyrillic letter, and it doesn't make more sense than it does in Cyrillic. Thus, the solution would be more or less the same: in this case, simply use Ō, which is visually more straightforward.
For U, the solution is identical: move the /w/ to W, and only keep one sound per letter.
Then, İ, I and Y. The desire to resemble Turkish is obvious, and it shouldn't be blamed; Kazakh is a Turkic language after all. But representing both /əj/ and /j/ with the same letter is more than simian. Similarly, representing a neighbour of /i/ with I instead of İ is actually the contrary of what is done in the Turkish language. My solution here is to represent /j/ with Y, /ə/ with I, /ɪ̞/ with İ and /əj/ with IY. As simple as that.
Next is E. It has been taken as phonetically identical to the similarly looking Cyrillic letter, and the Э has been merged in it. Once again, it does nothing but confuse the language learner, thus it should simply represent /e/ with E, and /je̘/ with YE.
Lastly, Ğ. Once again, its use arises from the desire to resemble Turkish. But it makes little sense here. Although its value can be deduced just by looking at it if you know Kazakh phonology, it can be confusing and accidentally read as in Turkish. The solution here would be to use either Ģ or Ġ, the latter already being used for representing a closely neighbouring sound in Arabic.
To summarize, here are the two alphabets, with my modifications added:
А Ӕ Б В Г Ғ Д Е Ж З И Й К Қ Л М Н Ң О Ө П Р С Т У Ү Х Ш Ә ӘЙ Э Ю
A Ä B D E Ə G Ģ H I İ J K L M N Ꞑ O Ō Ö P Q R S Ş T U Ü W Y YƏ Z
The Cyrillic orthography I drafted for the Polish language this year is exceptionally bad, for various reasons:
For these reasons, I came up with another proposal for a Cyrillic writing system for Polish, available here
It sucks mainly for two reasons. The first one is the extreme difficulty that comes with trying to use a writing system other than Latin, or even some special diacritics, let alone actually mixing different writing systems. If anyone knows how to do this by only loading packages (I can accept redefining one command or two, but not writing 20 lines of code for each writing system I wanna load), please send me an email, I really need this. The next problem, sadly the biggest one and the hardest to fix, is error reporting. If you haven't seen the error fifty times yet, you will struggle to find out where it happened and why it happened. It's still the greatest typesetting tool for a lot of reasons though. Use LaTeX.
תורה לסוף— סנהדרין 72a
כָּךְ הָיָה הַקָּדוֹשׁ בָּרוּךְ הוּא מַבִּיט בַּתּוֹרָה וּבוֹרֵא אֶת הָעוֹלָם, וְהַתּוֹרָה אָמְרָה בְּרֵאשִׁית בָּרָא אֱלֹהִים. וְאֵין רֵאשִׁית אֶלָּא תּוֹרָה, הֵיאַךְ מָה דְּאַתְּ אָמַר (משלי ח, כב): ה' קָנָנִי רֵאשִׁית דַּרְכּוֹ.— בּראשׁית רבּה, א:א
אז, ברא אלוהים את העלם באמצעות Lisp.
— source
The Lisp programming language has four pillars: READ, EVAL, PRINT and LOOP. Lacking even a single one of them disqualifies a language from being considered a Lisp.
Those do not need to form a REPL as we understand it nowadays. Rather, they must be present and available to the user, so that they can write a complete REPL using code of the following form:
(LOOP (PRINT (EVAL (READ))))
The READ and PRINT functions must also satisfy the following equality:
id = read ∘ print
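To make the equality concrete for readers who don't have a Lisp at hand, here is a rough analogue in Python. It is only an analogy and rests on an assumption of mine: `repr` stands in for PRINT and `ast.literal_eval` stands in for READ, and the helper names `print_form`, `read_form` and `round_trips` are made up for this example.

```python
import ast


def print_form(value):
    """Stand-in for PRINT: turn a value into its textual form."""
    return repr(value)


def read_form(text):
    """Stand-in for READ: turn a textual form back into a value."""
    return ast.literal_eval(text)


def round_trips(value):
    """Check whether read(print(value)) gives back an equal value."""
    try:
        return read_form(print_form(value)) == value
    except (ValueError, SyntaxError):
        return False


samples = [42, -3.5, 'a "quoted" string', [1, [2, 3]], {"k": (1, 2)}, None, True]
print(all(round_trips(v) for v in samples))  # plain literals round-trip
print(round_trips(object()))                 # an opaque object does not
```

Python only satisfies the equality for plain literals, since the printed form of an arbitrary object cannot be read back. That gap is precisely what the requirement on READ and PRINT is meant to rule out.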
Arabic — rablermorna-pa — رابلهرمۆرنا-پا
An Arabic orthography already exists for Lojban, called rablermorna, but I think it has multiple problems, making it difficult to use and largely suboptimal.
Hebrew — xeblermorna — חֵבּלֵרמֹרןַ
This Hebrew orthography for Lojban cannot simply be derived from the Arabic orthography, most notably regarding vowels: whereas Standard Hebrew is richer than Arabic regarding consonants and vowels, the lack of diversity of languages using the Hebrew alphabet prevents the existence of non-native long vowels such as the Kurdish [ɔ] mentioned earlier. Thus, the orthography is required to rely on diacritics, called nīqqūd, to represent vowels.
Here is my proposal for adapting the Cyrillic alphabet to suit the Polish language. Don't use it, it's terrible, but everybody makes mistakes, right?
I wanted to keep most of the distinctions that exist in the current alphabet, even between letters that sound alike.
Following this principle, I tried to keep some letters that could have been omitted without impacting readability too much.
This can lead to the illogical situation of the Cyrillic text being the same length as the Latin one, the shorthands offered by automatically managing palatalisation being left unused.
I don't like digraphs, thus I used some ligatures from the Serbian Cyrillic alphabet. Due to the lack of ligatures for the [ɕ] and [ʑ] sounds, I needed to pick those letters from somewhere else. Unfortunately, the letter corresponding to the [ʑ] sound is not present in any Slavic-language alphabet, thus I decided to pick it from the Tatar alphabet instead.
The overall design is influenced by the Russian, Belarusian and Serbian Cyrillic alphabets.
Polish letter(s) | Cyrillic equivalent(s) | Notes |
---|---|---|
A a | А а | |
Ą ą | Ѫ ѫ | Stole that from Common Slavonic. |
B b | Б б | |
C c | Ц ц | |
Ć ć | Ћ ћ | |
Cz cz | Ч ч | |
Ci ci | Ћи ћи | Note that if the „ci” is followed by a vowel, the soft version of the latter shall be used, e.g. „Ciebie”: « Ћебе ». This applies to all consonants (although the special ones mentioned in this table shall be softened using a ь or a ligature). |
D d | Д д | |
Dz dz | Ѕ ѕ | |
Dź dź | Ђ ђ | |
Dż dż | Џ џ | |
Dzi dzi | Ђи ђи | The ligature spares us the hideous « Дзьи ». |
E e | Э э | |
Ę ę | Ѧ ѧ | See ‚ą’. |
F f | Ф ф | |
G g | Г г | |
H h / Ch ch | Х х | |
I i | И и | |
J j | Й й | |
K k | К к | |
L l | Л л | |
Ł ł | Љ љ | I first wanted to use the Belarusian « ў », but it would have lost etymology, thus I decided to use the Serbian « љ », although it was originally designed for the „lj” sound. |
M m | М м | |
N n | Н н | |
Ń ń | Њ њ | |
Ni ni | Њи њи | |
O o | О о | |
Ó ó | О́ о́ | For the sake of etymology. It was either this or 1) an also ugly « у » with an acute accent or 2) losing etymology. |
P p | П п | |
R r | Р р | |
Rz rz | Р̌ р̌ | This sucks. I use it to preserve etymology, but it sucks. |
S s | С с | |
Ś ś | Щ щ | Decided to reuse the letter making the same sound in Russian. |
Sz sz | Ш ш | |
T t | Т т | |
U u | У у | |
W w | В в | |
Y y | Ы ы | |
Z z | З з | |
Ź ź | Җ җ | Here comes the Tatar alphabet. It looks kind of symmetrical to the Russian Щ, thus I like it. |
Ż ż | Ж ж | |
Ja ja | Я я | |
Ją ją | Ѭ ѭ | See also ‚ą’ and ‚ę’. |
Je je | Е е | |
Ję ję | Ѩ ѩ | See also ‚ą’, ‚ę’ and ‚ją’. |
Ji ji / Jy jy | І і | See below. |
Jo jo | Ё ё | |
Jó jó / Ju ju | Ю ю | Never seen a „jó” in my life, thus I assume that it does not exist and mix it with the „ju”. |
Here is the alphabet in traditional Eastern-Slavic order:
А Б В Г Д Ѕ Ђ Џ Е Ё Ж Җ З И Й К Л Љ М Н Њ О О́ П Р Р̌ С Т У Ф Х Ц Ћ Ч Ш Щ Ы Ь Э Ю Я І Ѫ Ѧ Ѭ Ѩ
This leaves us with 46 letters (including ligatures), compared to 35 letters for the Latin-ish alphabet, although the latter uses a lot of digraphs.
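For anyone who wants to play with the table, here is a minimal Python sketch of a greedy, longest-match-first converter. It is deliberately incomplete: it works on lowercase input only, it ignores the palatalisation rule entirely (so words like „ciebie” come out wrong), and it will misread the rare words where „r” and „z” are separate sounds. The name `to_cyrillic` is made up for this example.

```python
import unicodedata

# Lowercase subset of the table above; two-letter keys are tried first.
TABLE = {
    "cz": "ч", "sz": "ш", "rz": "р̌", "ch": "х",
    "dz": "ѕ", "dź": "ђ", "dż": "џ",
    "ja": "я", "ją": "ѭ", "je": "е", "ję": "ѩ", "jo": "ё", "ju": "ю",
    "a": "а", "ą": "ѫ", "b": "б", "c": "ц", "ć": "ћ", "d": "д",
    "e": "э", "ę": "ѧ", "f": "ф", "g": "г", "h": "х", "i": "и",
    "j": "й", "k": "к", "l": "л", "ł": "љ", "m": "м", "n": "н",
    "ń": "њ", "o": "о", "ó": "о́", "p": "п", "r": "р", "s": "с",
    "ś": "щ", "t": "т", "u": "у", "w": "в", "y": "ы", "z": "з",
    "ź": "җ", "ż": "ж",
}


def to_cyrillic(text):
    """Convert lowercase Polish text, trying digraphs before single letters."""
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in TABLE:
            out.append(TABLE[pair])
            i += 2
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_cyrillic("zobaczyłem"))
print(to_cyrillic("dżungli"))
```

Trying the two-letter keys first is what keeps „cz” from being split into « ц » and « з »; handling soft consonants would need a second pass that looks at the vowel following „i”.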
Following this line comes an example of the script in action on the beginning of Le Petit Prince by Antoine de Saint-Exupéry:
Mały Książę, Antoine de Saint-Exupéry | Маљы Кщѭжѧ, Антљан ды Санкт-Экзупэри |
---|---|
Kiedy miałem sześć lat, zobaczyłem pewnego razu wspaniały obrazek w książce o dżungli | Кеды мяљэм шэщћ лат, зобачыљэм пэвнэго разу вспаниаљы образэк в кщѭжцэ о џунгли |
The (Internet) Voskhod Protocol is a small Internet protocol designed for fast and simple document exchange on a small to medium size network.
It is roughly similar to Gopher, while trying to fix its flaws. You can find its specification here.
Here's my attempt to translate the GNU General Public License version 3 into French. It probably contains a lot of typos and incomprehensible parts; if you spot some, please let me know by e-mail.
By the way, a quick note on the GPL that I cannot include in the translation: please don't add the “or any later version” clause to your copyright headers.
Please don't automatically accept a license you haven't read. Even if GNU says that “Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns”, those details can cause problems, just as the tivoization clause did for quite a lot of people when updating from version 2 to version 3.
So please just say “GPL-3.0-only”.