Cree SRO/Syllabics

Function documentation

cree_sro_syllabics.sro2syllabics(sro: str, hyphens: str = '\u202f', sandhi: bool = True) → str

Convert Cree words written in SRO text to syllabics.

Finds instances of SRO words in strings, and converts them all to syllabics.

>>> sro2syllabics('Eddie nitisiyihkâson')
'Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ'

You should be able to write words in Y-dialect (a.k.a., Plains Cree):

>>> sro2syllabics('niya')
'ᓂᔭ'

…and Th-dialect (a.k.a., Woods Cree):

>>> sro2syllabics('nitha')
'ᓂᖬ'

Any word that does not have the “structure” of a Cree word is not converted:

>>> sro2syllabics('Maskêkosihk trail')
'ᒪᐢᑫᑯᓯᕽ trail'
>>> sro2syllabics('Maskêkosihk tireyl')
'ᒪᐢᑫᑯᓯᕽ ᑎᕒᐁᕀᓬ'

Roman full-stops/periods (“.”) are converted into syllabics full-stops:

>>> sro2syllabics('Eddie nitisiyihkâson.')
'Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ᙮'

Note that full stops are substituted only when they follow syllabics; a period that follows text that is clearly not Cree (like most English) is left unchanged:

>>> sro2syllabics("tânisi. ninêhiyawân.")
'ᑖᓂᓯ᙮ ᓂᓀᐦᐃᔭᐚᐣ᙮'
>>> sro2syllabics("Howdy, English text.")
'Howdy, English text.'
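
The full-stop rule above can be approximated with a single regular expression. This is only a sketch and not necessarily how the library implements it: it assumes that "after syllabics" means "immediately preceded by a character in the Unified Canadian Aboriginal Syllabics block (U+1400 to U+167F)", and the helper name convert_full_stops is hypothetical.

```python
import re

SYLLABICS_FULL_STOP = "\N{CANADIAN SYLLABICS FULL STOP}"  # U+166E

# Assumption: a period counts as "after syllabics" when the character
# directly before it lies in the block U+1400..U+167F.
_PERIOD_AFTER_SYLLABIC = re.compile("(?<=[\u1400-\u167f])\\.")


def convert_full_stops(text: str) -> str:
    """Replace '.' with the syllabics full stop, but only after syllabics."""
    return _PERIOD_AFTER_SYLLABIC.sub(SYLLABICS_FULL_STOP, text)
```

A period that follows Latin letters never matches the lookbehind, so English sentences pass through unchanged.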

sro2syllabics() can handle variations in orthography. For example, it can convert circumflexes (âêîô):

>>> sro2syllabics('êwêpâpîhkêwêpinamahk')
'ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ'

It can convert macrons (āēīō):

>>> sro2syllabics('ēwēpâpīhkēwēpinamahk')
'ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ'

And it can convert an unaccented “e” just as if it had the appropriate accent:

>>> sro2syllabics('ewepapihkewepinamahk')
'ᐁᐍᐸᐱᐦᑫᐍᐱᓇᒪᕽ'
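
If you want SRO input in one consistent orthography before handing it to sro2syllabics(), you could fold macrons into circumflexes yourself. The following is a minimal sketch using only the standard library; normalize_long_vowels is a hypothetical helper, not part of cree_sro_syllabics, and the library does not require this step.

```python
import unicodedata

# Macron vowels (lower and upper case) mapped to their circumflex counterparts.
_MACRON_TO_CIRCUMFLEX = str.maketrans(
    "\u0101\u0113\u012b\u014d\u0100\u0112\u012a\u014c",  # ā ē ī ō Ā Ē Ī Ō
    "\u00e2\u00ea\u00ee\u00f4\u00c2\u00ca\u00ce\u00d4",  # â ê î ô Â Ê Î Ô
)


def normalize_long_vowels(sro: str) -> str:
    """Return SRO text with long vowels written as pre-composed circumflexes."""
    # Compose "e" + combining macron into a single character first,
    # so the translation table only has to handle pre-composed vowels.
    composed = unicodedata.normalize("NFC", sro)
    return composed.translate(_MACRON_TO_CIRCUMFLEX)
```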

Additionally, an apostrophe is interpreted as a short “i”. For example, converting “tânsi” without an apostrophe does not produce the expected result:

>>> sro2syllabics("tânsi")
'ᑖᐣᓯ'

However, add an apostrophe after the ‘n’ and it will work correctly:

>>> sro2syllabics("tân'si")
'ᑖᓂᓯ'

Hyphens in Cree words are replaced with <U+202F NARROW NO-BREAK SPACE> (NNBSP) by default. This is a space that is narrower than the normal space character. NNBSP also prevents the word from being broken across lines. We chose the NNBSP character as the default because it helps visually distinguish meaningful sub-elements within words, while being less likely to be mistaken for word-separating whitespace by most text processing applications.

Compare the following hyphen replacement schemes:

Replace hyphens with kâ-mahihkani-pimohtêt isiyihkâsow
(nothing) ᑳᒪᐦᐃᐦᑲᓂᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ
NNBSP ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ
Space ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ

We discourage using an ordinary space character (U+0020), as it is often interpreted as separating words by computers and people alike. If you are viewing this documentation in a web browser, try double-clicking the syllabics rendition of “kâ-mahihkani-pimohtêt” with NNBSP separators versus the one with space separators. Double-clicking typically selects an entire word, and this is often what happens for the rendition with NNBSP characters; for the rendition with space characters, however, only one sub-element is selected.

Despite this, you can choose any character you like to replace hyphens in syllabics by providing the hyphens= keyword argument:

>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens='\N{NARROW NO-BREAK SPACE}')
'ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ'
>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens='')
'ᑳᒪᐦᐃᐦᑲᓂᐱᒧᐦᑌᐟ'
>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens=' ')
'ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ'
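
Because NNBSP and an ordinary space look nearly identical on screen, it can help to check which separator a converted string actually contains. The sketch below uses only the standard library; separator_names is a hypothetical helper, not part of cree_sro_syllabics.

```python
import unicodedata

NNBSP = "\N{NARROW NO-BREAK SPACE}"  # the same character as "\u202f"


def separator_names(text: str) -> list:
    """Return the sorted Unicode names of all whitespace characters in text."""
    return sorted(
        # Fall back to a code point label for characters without a name
        # (for example, control characters such as "\n").
        {unicodedata.name(ch, "U+%04X" % ord(ch)) for ch in text if ch.isspace()}
    )
```

For a string produced with the default hyphens setting, the result should be ['NARROW NO-BREAK SPACE']; with hyphens=' ' it should be ['SPACE'].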

In SRO, the most orthographically correct way to write certain compounds is to separate two morphemes with a hyphen. For example:

pîhc-âyihk — inside
nîhc-âyihk — outside

However, both words are pronounced as if discarding the hyphen:

pîhcâyihk — inside
nîhcâyihk — outside

This is called sandhi. When transliterated into syllabics, the transcription should follow the latter, blended interpretation, rather than the former, separated interpretation. By default, sro2syllabics() applies the sandhi rule and joins the syllable as if there were no hyphen:

>>> sro2syllabics('pîhc-âyihk')
'ᐲᐦᒑᔨᕽ'

However, if this is not desired, you can set sandhi=False as a keyword argument:

>>> sro2syllabics('pîhc-âyihk', sandhi=False)
'ᐲᐦᐨ ᐋᔨᕽ'
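
The effect of sandhi on the SRO side can be sketched as removing a hyphen that sits between a consonant and a vowel. This is an illustration under an assumption, not the library's implementation: the documentation above does not state the exact condition, so the consonant and vowel sets and the helper name join_sandhi are hypothetical.

```python
import re

# Assumed letter inventories, for illustration only.
_CONSONANTS = "ptkcmnshwy"
_VOWELS = "aeio\u00e2\u00ea\u00ee\u00f4\u0101\u0113\u012b\u014d"  # plus â ê î ô ā ē ī ō

# A hyphen preceded by a consonant and followed by a vowel.
_SANDHI_HYPHEN = re.compile(
    "(?<=[" + _CONSONANTS + "])-(?=[" + _VOWELS + "])", re.IGNORECASE
)


def join_sandhi(sro: str) -> str:
    """Drop hyphens where a consonant meets a vowel across the hyphen."""
    return _SANDHI_HYPHEN.sub("", sro)
```

Under this assumption, “pîhc-âyihk” becomes “pîhcâyihk”, while the hyphens in “kâ-mahihkani-pimohtêt” are kept because each one follows a vowel.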

Parameters:
  • sro (str) – the text with Cree words written in SRO.
  • hyphens (str) – what to replace hyphens with (default: <U+202F NARROW NO-BREAK SPACE>).
  • sandhi (bool) – whether to apply the sandhi orthography rule (default: True).

Returns:
  the text with Cree words written in syllabics.

Return type:
  str

cree_sro_syllabics.syllabics2sro(syllabics: str, produce_macrons=False) → str

Convert Cree words written in syllabics to SRO.

Finds all instances of syllabics in the given string, and converts them to SRO. Anything that is not written in syllabics is left unchanged:

>>> syllabics2sro('Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ᙮')
'Eddie nitisiyihkâson.'

You should be able to convert words written in Y-dialect (a.k.a., Plains Cree):

>>> syllabics2sro('ᓂᔭ')
'niya'

… and Th-dialect (a.k.a., Woods Cree):

>>> syllabics2sro('ᓂᖬ')
'nitha'

By default, the SRO will be produced with circumflexes (âêîô):

>>> syllabics2sro('ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ')
'êwêpâpîhkêwêpinamahk'

This can be changed to macrons (āēīō) by setting produce_macrons to True:

>>> syllabics2sro('ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ', produce_macrons=True)
'ēwēpāpīhkēwēpinamahk'

In both cases, the character produced will be a pre-composed character, rather than an ASCII character followed by a combining diacritical mark. That is, vowels are returned in NFC normalization form.
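
The difference between a pre-composed vowel and a letter followed by a combining mark matters when you compare strings, since the two forms look the same but are not equal. The sketch below uses the standard unicodedata module; to_nfc and is_nfc are hypothetical helper names.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text with letter + combining mark sequences composed (NFC)."""
    return unicodedata.normalize("NFC", text)


def is_nfc(text: str) -> bool:
    """Report whether text is already in NFC form."""
    return to_nfc(text) == text


precomposed = "k\u00e2"  # "k" + LATIN SMALL LETTER A WITH CIRCUMFLEX
decomposed = "ka\u0302"  # "k" + "a" + COMBINING CIRCUMFLEX ACCENT
```

Normalizing your own SRO strings with to_nfc() before comparing them with the output of syllabics2sro() avoids mismatches caused by decomposed input.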

For compatibility with cree_sro_syllabics.sro2syllabics(), syllabics2sro() converts any instance of <U+202F NARROW NO-BREAK SPACE> to a hyphen in the SRO transliteration:

>>> syllabics2sro('ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ')
'kâ-mahihkani-pimohtêt'

In some syllabics text, syllabics with a ‘w’ dot are rendered as two characters: the syllabic without the ‘w’ dot followed by <U+1427 CANADIAN SYLLABICS FINAL MIDDLE DOT>; this differs from the more appropriate pre-composed syllabic character with the ‘w’ dot. For example,

ᐃᑘᐏᓇ — pre-composed syllabic
ᐃᑌᐧᐃᐧᓇ — syllabic + CANADIAN SYLLABICS FINAL MIDDLE DOT

syllabics2sro() can convert both cases appropriately:

>>> syllabics2sro('ᐃᑘᐏᓇ')
'itwêwina'
>>> syllabics2sro('ᐃᑌᐧᐃᐧᓇ')
'itwêwina'
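
The two-character form can also be folded into the pre-composed form on the syllabics side. The sketch below covers only the two pairs from the example above; a complete table would need an entry for every ‘w’ syllabic. The helper name compose_w_dots is hypothetical, and this is not necessarily how syllabics2sro() handles the dot internally.

```python
DOT = "\N{CANADIAN SYLLABICS FINAL MIDDLE DOT}"  # U+1427

# Syllabic + middle dot -> pre-composed 'w' syllabic.
# Illustrative subset: only the pairs that occur in the example word.
_PRECOMPOSED = {
    "ᑌ" + DOT: "ᑘ",
    "ᐃ" + DOT: "ᐏ",
}


def compose_w_dots(syllabics: str) -> str:
    """Replace known syllabic + middle dot pairs with pre-composed characters."""
    for pair, composed in _PRECOMPOSED.items():
        syllabics = syllabics.replace(pair, composed)
    return syllabics
```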

Some syllabics converters produce erroneous yet very similar-looking characters. syllabics2sro() recognizes the following look-alike characters:

Look-alike                          Correct character
ᐩ CANADIAN SYLLABICS FINAL PLUS     ᕀ CANADIAN SYLLABICS WEST-CREE Y
ᑦ CANADIAN SYLLABICS T              ᒼ CANADIAN SYLLABICS WEST-CREE M
ᕁ CANADIAN SYLLABICS SAYISI YI      ᕽ CANADIAN SYLLABICS HK

syllabics2sro() automatically interprets erroneous look-alikes as their visually equivalent characters:

>>> syllabics2sro('ᒌᐯᐦᑕᑳᐧᐱᑲᐧᓂᐩ')
'cîpêhtakwâpikwaniy'
>>> syllabics2sro('ᐊᓴᒧᐱᑕᑦ')
'asamopitam'
>>> syllabics2sro('ᒫᒥᕁ')
'mâmihk'
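
If you need to repair the syllabics text itself rather than only transliterate it, the look-alike table above can be applied with str.translate(). This is a sketch under the assumption that the input is Western Cree text, where none of the look-alike characters is used legitimately; fix_lookalikes is a hypothetical helper, not part of cree_sro_syllabics.

```python
# Look-alike -> correct character, taken from the table above.
_LOOKALIKES = str.maketrans({
    "\N{CANADIAN SYLLABICS FINAL PLUS}": "\N{CANADIAN SYLLABICS WEST-CREE Y}",
    "\N{CANADIAN SYLLABICS T}": "\N{CANADIAN SYLLABICS WEST-CREE M}",
    "\N{CANADIAN SYLLABICS SAYISI YI}": "\N{CANADIAN SYLLABICS HK}",
})


def fix_lookalikes(syllabics: str) -> str:
    """Replace known look-alike characters with the intended ones."""
    return syllabics.translate(_LOOKALIKES)
```

Text in other orthographies may use some of these characters on purpose, so a blanket replacement like this should be applied only where the Western Cree assumption holds.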

Parameters:
  • syllabics (str) – the text with Cree words written in syllabics.
  • produce_macrons (bool) – if True, produces macrons (āēīō) instead of circumflexes (âêîô) (default: False).

Returns:
  the text with Cree words written in SRO.

Return type:
  str