Understanding how Singlish maps English characters to Sinhala phonemes is key to using the library effectively.
Phonetic Transliteration
Singlish uses a phonetic mapping scheme inspired by the standard method popularized by Helakuru. This means you type the sound of a letter rather than its key position (as in the Wijesekara layout).
Writing using the Phonetic Method
“The Phonetic Method is a Sinhalese typing system first introduced by Helakuru for mobile phone keyboards. It allows you to type Sinhala characters by using English letters that match their sound.” (Helakuru.lk)
We acknowledge and credit Helakuru for their innovation in establishing this now-standard mapping for digital Sinhala typing.
- Short vowels: `a` (අ), `i` (ඉ), `u` (උ), `e` (එ), `o` (ඔ)
- Long vowels: Double the letter (`aa` -> ආ, `ii` -> ඊ)
- Consonant base tokens: A bare consonant maps to a hal form (`k` -> ක්, `g` -> ග්).
- Inherent vowel commit: Appending `a` commits the consonant with its inherent vowel (`ka` -> ක, `ga` -> ග).
- Aspirated (Mahaprana): Add `h` (`kh` -> ඛ්, `kha` -> ඛ).
- Retroflex / legacy forms: Uppercase and specific digraphs map to retroflex/aspirated variants (`D` -> ඪ්, `Da` -> ඪ, `N` -> ණ්).
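The rules above can be sketched as a pair of lookup tables plus one function. This is only an illustration: the names `VOWELS`, `CONSONANTS`, `has`, and `mapToken` are hypothetical, the tables contain just the examples listed above, and the library's real data and API may look quite different.

```typescript
// Illustrative tables only: hypothetical names, not the library's real data.
const HAL = "\u0DCA"; // Sinhala hal kirima (virama) sign

// Independent vowels keyed by their phonetic spelling.
const VOWELS: { [key: string]: string } = {
  a: "අ", aa: "ආ", i: "ඉ", ii: "ඊ", u: "උ", e: "එ", o: "ඔ",
};

// Consonant bases stored without hal; hal is appended for a bare consonant.
const CONSONANTS: { [key: string]: string } = {
  k: "ක", kh: "ඛ", g: "ග", D: "ඪ", N: "ණ",
};

function has(table: { [key: string]: string }, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(table, key);
}

// Maps one already-isolated token to Sinhala, or undefined if unknown.
function mapToken(token: string): string | undefined {
  if (has(VOWELS, token)) return VOWELS[token];
  if (has(CONSONANTS, token)) return CONSONANTS[token] + HAL;
  // A trailing "a" commits the inherent vowel, so no hal is added.
  if (token.length > 1 && token.charAt(token.length - 1) === "a") {
    const base = token.slice(0, -1);
    if (has(CONSONANTS, base)) return CONSONANTS[base];
  }
  return undefined;
}

console.log(mapToken("k"), mapToken("ka"), mapToken("kha"), mapToken("aa"));
```

Storing consonants without the hal and appending it only for bare tokens keeps the "inherent vowel commit" rule to a single branch, which is one possible design among several.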
Greedy Tokenization
The engine uses a greedy tokenization strategy. It always tries to match the longest possible pattern from the current position.
Example: `zdha` vs `z` + `d` + `h` + `a`
The engine checks patterns in descending order of length:
- Does `zdha` match a known pattern? Yes (matches `SANYAKA_DHA` -> ඳ).
- It consumes all 4 characters and outputs ඳ.
For `zda`:
- Does `zda` match? Yes (matches `SANYAKA_DA` -> ඬ).
For `ka`:
- It resolves `k` as a consonant base with hal (ක්).
- Then `a` applies the inherent vowel commit.
- Final output becomes ක.
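The longest-match behaviour walked through above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the engine's actual code: `PATTERNS`, `MAX_LEN`, and `greedyConvert` are hypothetical names, the table holds only a handful of entries, and the inherent vowel commit is modelled as simply removing a trailing hal.

```typescript
// Hypothetical pattern table for illustration; not the library's real data.
const HAL = "\u0DCA"; // Sinhala hal kirima (virama) sign
const PATTERNS: { [key: string]: string } = {
  zdha: "ඳ",
  zda: "ඬ",
  k: "ක" + HAL,
  g: "ග" + HAL,
  a: "අ",
};
const MAX_LEN = Object.keys(PATTERNS).reduce((m, p) => Math.max(m, p.length), 0);

function greedyConvert(input: string): string {
  let out = "";
  let pos = 0;
  while (pos < input.length) {
    // Inherent vowel commit: "a" right after a hal form removes the hal.
    if (input.charAt(pos) === "a" && out.charAt(out.length - 1) === HAL) {
      out = out.slice(0, -1);
      pos += 1;
      continue;
    }
    // Try the longest candidate first, then progressively shorter ones.
    let consumed = 0;
    for (let len = Math.min(MAX_LEN, input.length - pos); len >= 1; len--) {
      const candidate = input.slice(pos, pos + len);
      if (Object.prototype.hasOwnProperty.call(PATTERNS, candidate)) {
        out += PATTERNS[candidate];
        consumed = len;
        break;
      }
    }
    if (consumed === 0) {
      out += input.charAt(pos); // unknown character: pass through unchanged
      consumed = 1;
    }
    pos += consumed;
  }
  return out;
}

console.log(greedyConvert("zdha"), greedyConvert("zda"), greedyConvert("ka"));
```

Because the loop starts at the longest candidate length, `zdha` is consumed as one token and is never split into `z` + `d` + `h` + `a`.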
The Trie Structure
To support real-time typing efficiently, Singlish uses a prefix trie. This allows the engine to:
- Instantly validate if the current input could be the start of a longer sequence.
- “Look ahead” without converting prematurely.
For example, if you type `k`, the engine converts it to ක් (with hal).
If you then type `a`, the engine sees `ka`, backtracks/updates the buffer, and produces ක.
This buffering is handled transparently by the SinglishIME class and the useSinglishConverter hook.
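To make the prefix lookup concrete, here is a minimal trie sketch. It is not the implementation behind `SinglishIME` or `useSinglishConverter`; the names `TrieNode`, `buildTrie`, `lookup`, and `canExtend` are hypothetical, and the four patterns are taken from the examples on this page.

```typescript
// Hypothetical trie sketch; not the library's internal data structure.
interface TrieNode {
  children: { [ch: string]: TrieNode };
  output?: string; // set when the path to this node is a complete pattern
}

function buildTrie(patterns: { [key: string]: string }): TrieNode {
  const root: TrieNode = { children: {} };
  for (const key of Object.keys(patterns)) {
    let node = root;
    for (let i = 0; i < key.length; i++) {
      const ch = key.charAt(i);
      if (!Object.prototype.hasOwnProperty.call(node.children, ch)) {
        node.children[ch] = { children: {} };
      }
      node = node.children[ch];
    }
    node.output = patterns[key];
  }
  return root;
}

interface Lookup {
  output?: string;    // conversion for the buffer as typed so far, if any
  canExtend: boolean; // true if a longer pattern starts with this buffer
}

// Returns undefined when the buffer is not a prefix of any known pattern.
function lookup(root: TrieNode, buffer: string): Lookup | undefined {
  let node = root;
  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer.charAt(i);
    if (!Object.prototype.hasOwnProperty.call(node.children, ch)) return undefined;
    node = node.children[ch];
  }
  return { output: node.output, canExtend: Object.keys(node.children).length > 0 };
}

const trie = buildTrie({ k: "ක\u0DCA", ka: "ක", zda: "ඬ", zdha: "ඳ" });
console.log(lookup(trie, "k"), lookup(trie, "ka"), lookup(trie, "zd"));
```

In this sketch, a result with an `output` and `canExtend: true` (as for `k`) is the case where an IME could show a provisional conversion while keeping the buffer open for the next keystroke.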
