Plains Cree Orthography

Function documentation

crk_orthography.sro2syllabics(sro: str, hyphens: str = '\u202f', sandhi: bool = True) → str

Convert Cree words written in SRO text to syllabics.

Finds instances of SRO words in strings, and converts them all to syllabics.

>>> sro2syllabics('Eddie nitisiyihkâson')
'Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ'

Any word that does not have the “structure” of a Plains Cree word is not converted:

>>> sro2syllabics('Maskêkosihk trail')
'ᒪᐢᑫᑯᓯᕽ trail'
>>> sro2syllabics('Maskêkosihk tireyl')
'ᒪᐢᑫᑯᓯᕽ ᑎᕒᐁᕀᓬ'

Roman full-stops/periods (“.”) are converted into syllabics full-stops.

>>> sro2syllabics('Eddie nitisiyihkâson.')
'Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ᙮'

Note that full-stops are substituted only when they directly follow syllabics; if the preceding text doesn’t “look” like Cree, the full-stop is left as it is:

>>> sro2syllabics("tânisi. ninêhiyawân.")
'ᑖᓂᓯ᙮ ᓂᓀᐦᐃᔭᐚᐣ᙮'
>>> sro2syllabics("Howdy. This be English.")
'Howdy. This be English.'
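The "only after syllabics" behaviour can be illustrated with a small standard-library sketch. This is not the library's implementation: `replace_full_stops` and the regular expression are hypothetical names introduced here, and the sketch assumes "syllabics" means any character in the Unified Canadian Aboriginal Syllabics block (U+1400 to U+167F).

```python
import re

# U+166E CANADIAN SYLLABICS FULL STOP
SYLLABICS_FULL_STOP = "\u166e"

# Match a "." only when the character right before it lies in the
# Unified Canadian Aboriginal Syllabics block (U+1400 to U+167F).
PERIOD_AFTER_SYLLABICS = re.compile(r"(?<=[\u1400-\u167f])\.")


def replace_full_stops(text: str) -> str:
    """Hypothetical helper: swap '.' for the syllabics full stop after syllabics."""
    return PERIOD_AFTER_SYLLABICS.sub(SYLLABICS_FULL_STOP, text)


print(replace_full_stops("ᑖᓂᓯ. Howdy."))  # ᑖᓂᓯ᙮ Howdy.
```

The period after the English word is preceded by a Latin letter, so the lookbehind does not match and it is left alone.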

sro2syllabics() can handle variations in orthography. For example, it can convert circumflexes (âêîô):

>>> sro2syllabics('êwêpâpîhkêwêpinamahk')

It can convert macrons (āēīō):

>>> sro2syllabics('ēwēpâpīhkēwēpinamahk')

And it can convert an unaccented “e” just as if it had the appropriate accent:

>>> sro2syllabics('ewepapihkewepinamahk')
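If you would rather store SRO text in a single convention before converting it, the macron and circumflex spellings can be unified with the standard library alone. The sketch below is an assumption about how such a pre-processing step might look; `to_circumflexes` is a hypothetical helper and not part of crk_orthography.

```python
import unicodedata

# Precomposed macron vowels mapped to their circumflex counterparts.
# Escapes are used so the table is unambiguous about code points.
MACRON_TO_CIRCUMFLEX = str.maketrans(
    "\u0101\u0113\u012b\u014d\u0100\u0112\u012a\u014c",  # ā ē ī ō Ā Ē Ī Ō
    "\u00e2\u00ea\u00ee\u00f4\u00c2\u00ca\u00ce\u00d4",  # â ê î ô Â Ê Î Ô
)


def to_circumflexes(sro: str) -> str:
    """Hypothetical helper: rewrite macron vowels as circumflex vowels."""
    # NFC first, so "e" + COMBINING MACRON becomes the single character "ē".
    composed = unicodedata.normalize("NFC", sro)
    return composed.translate(MACRON_TO_CIRCUMFLEX)


print(to_circumflexes("ēwēpâpīhkēwēpinamahk"))  # êwêpâpîhkêwêpinamahk
```

This step is optional, since sro2syllabics() accepts both spellings directly.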

Additionally, apostrophes are interpreted as short-i’s. For example, converting “tânsi” will not work as expected:

>>> sro2syllabics("tânsi")

However, add an apostrophe after the ‘n’ and it will work correctly:

>>> sro2syllabics("tân'si")
'ᑖᓂᓯ'

Hyphens in Plains Cree words are replaced with <U+202F NARROW NO-BREAK SPACE> (NNBSP) by default. This is a space that is narrower than the normal space character. NNBSP also prevents the word from being broken across lines. We chose the NNBSP character as the default because it helps visually distinguish meaningful sub-elements within words, while being less likely than an ordinary space to be mistaken for word-separating whitespace by most text processing applications.
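The properties of NNBSP can be inspected with Python's standard library. The sketch below uses only `unicodedata` and `str.split`; the variable names are introduced here for illustration. It also shows one caveat worth knowing: Python's argument-less `str.split()` treats every Unicode whitespace character as a separator, NNBSP included, so "less likely" does not mean "never".

```python
import unicodedata

NNBSP = "\N{NARROW NO-BREAK SPACE}"  # the same character as "\u202f"

print(unicodedata.name(NNBSP))      # NARROW NO-BREAK SPACE
print(unicodedata.category(NNBSP))  # Zs

phrase = f"ᑳ{NNBSP}ᒪᐦᐃᐦᑲᓂ{NNBSP}ᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ"

# Splitting on the ordinary space (U+0020) keeps the NNBSP-joined word whole.
words_on_space = phrase.split(" ")

# Splitting with no argument breaks on any Unicode whitespace, NNBSP included.
words_on_whitespace = phrase.split()

print(len(words_on_space), len(words_on_whitespace))  # 2 4
```

If downstream code tokenizes with `str.split()` or the regular expression `\s`, it may be worth splitting on `" "` explicitly instead.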

Compare the following hyphen replacement schemes:

Replace hyphens with    kâ-mahihkani-pimohtêt isiyihkâsow
(nothing)               ᑳᒪᐦᐃᐦᑲᓂᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ
NNBSP                   ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ
Space                   ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ ᐃᓯᔨᐦᑳᓱᐤ

We discourage using an ordinary space character (U+0020), as both computers and people often interpret it as separating words. If you are viewing this documentation in a web browser, try double clicking the syllabics rendition of “kâ-mahihkani-pimohtêt” with NNBSP separators versus the one with space separators. Double clicking typically selects an entire word, and this is often what happens for the rendition with NNBSP characters; for the rendition with space characters, however, it fails.

Despite this, you can choose any character you like to replace hyphens in syllabics by providing the hyphens= keyword argument:

>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens='\N{NARROW NO-BREAK SPACE}')
'ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ'
>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens='')
'ᑳᒪᐦᐃᐦᑲᓂᐱᒧᐦᑌᐟ'
>>> sro2syllabics('kâ-mahihkani-pimohtêt', hyphens=' ')
'ᑳ ᒪᐦᐃᐦᑲᓂ ᐱᒧᐦᑌᐟ'

In SRO, the most orthographically correct way to write certain compounds is to separate two morphemes with a hyphen. For example:

pîhc-âyihk — inside
nîhc-âyihk — outside

However, both words are pronounced as if the hyphen were not there:

pîhcâyihk — inside
nîhcâyihk — outside

This is called sandhi. When transliterated into syllabics, the transcription should follow the latter, blended interpretation, rather than the former, separated interpretation. By default, sro2syllabics() applies the sandhi rule and joins the syllable as if there were no hyphen:

>>> sro2syllabics('pîhc-âyihk')

However, if this is not desired, you can set sandhi=False as a keyword argument:

>>> sro2syllabics('pîhc-âyihk', sandhi=False)
'ᐲᐦᐨ ᐋᔨᕽ'
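To make the sandhi rule concrete, here is a minimal sketch of the idea in plain Python. It is not the library's implementation: `join_across_sandhi` is a hypothetical helper, and the sketch assumes the rule applies exactly when a hyphen has a consonant on its left and a vowel on its right, using a simplified circumflex-only SRO inventory.

```python
import re

# Assumed SRO inventory for this sketch (lowercase, circumflex spelling only).
CONSONANTS = "chkmnpstwy"
VOWELS = "aio\u00e2\u00ea\u00ee\u00f4"  # a i o â ê î ô

# A hyphen with a consonant directly before it and a vowel directly after it.
SANDHI_HYPHEN = re.compile(rf"(?<=[{CONSONANTS}])-(?=[{VOWELS}])")


def join_across_sandhi(word: str) -> str:
    """Hypothetical helper: drop hyphens where the two sides blend in speech."""
    return SANDHI_HYPHEN.sub("", word)


print(join_across_sandhi("pîhc-âyihk"))             # pîhcâyihk
print(join_across_sandhi("kâ-mahihkani-pimohtêt"))  # kâ-mahihkani-pimohtêt
```

In the second word each hyphen follows a vowel, so nothing blends and the hyphens stay for the hyphens= replacement to handle.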

Parameters:

  • sro (str) – the text with Cree words written in SRO.
  • hyphens (str) – what to replace hyphens with (default: <U+202F NARROW NO-BREAK SPACE>).
  • sandhi (bool) – whether to apply the sandhi orthography rule (default: True).

Returns:

the text with Cree words written in syllabics.

Return type:

str

crk_orthography.syllabics2sro(syllabics: str, produce_macrons=False) → str

Convert Cree words written in syllabics to SRO.

Finds all instances of syllabics in the given string, and converts them to SRO. Anything that is not written in syllabics is left unchanged:

>>> syllabics2sro('Eddie ᓂᑎᓯᔨᐦᑳᓱᐣ᙮')
'Eddie nitisiyihkâson.'

By default, the SRO will be produced with circumflexes (âêîô):

>>> syllabics2sro('ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ')

This can be changed to macrons (āēīō) by setting produce_macrons to True:

>>> syllabics2sro('ᐁᐍᐹᐲᐦᑫᐍᐱᓇᒪᕽ', produce_macrons=True)

In both cases, the character produced will be a pre-composed character, rather than an ASCII character followed by a combining diacritical mark. That is, vowels are returned in NFC normalization form.
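The difference between a pre-composed character and a base letter followed by a combining mark can be checked with the standard `unicodedata` module. The sketch below is independent of crk_orthography and only illustrates what NFC means for a vowel such as “â”.

```python
import unicodedata

precomposed = "\u00e2"  # LATIN SMALL LETTER A WITH CIRCUMFLEX
combining = "a\u0302"   # "a" followed by COMBINING CIRCUMFLEX ACCENT

# The two strings look the same when rendered but are not equal.
print(precomposed == combining)          # False
print(len(precomposed), len(combining))  # 1 2

# NFC normalization composes the pair into the single character.
print(unicodedata.normalize("NFC", combining) == precomposed)  # True
```

This matters when comparing the output of syllabics2sro() with text from another source: normalizing both sides to NFC first avoids spurious mismatches.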

In some syllabics text, syllabics with a ‘w’ dot are rendered as two characters: the syllabic without the ‘w’ dot followed by <U+1427 CANADIAN SYLLABICS FINAL MIDDLE DOT>; this differs from the more appropriate pre-composed syllabic character with the ‘w’ dot. For example,

ᐃᑘᐏᓇ — pre-composed syllabic

syllabics2sro() converts both cases appropriately:

>>> syllabics2sro('ᐃᑘᐏᓇ')
>>> syllabics2sro('ᐃᑌᐧᐃᐧᓇ')
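The two spellings really are different code point sequences, and Unicode normalization does not merge them. The sketch below shows this with the standard library and then rewrites the two-character form using a tiny hand-written table. The table and `compose_w_dots` are hypothetical, cover only the syllabics in this example, and are not how syllabics2sro() works internally.

```python
import unicodedata

MIDDLE_DOT = "\u1427"  # CANADIAN SYLLABICS FINAL MIDDLE DOT

precomposed = "\u1403\u1458\u140f\u14c7"                # ᐃᑘᐏᓇ
two_character = "\u1403\u144c\u1427\u1403\u1427\u14c7"  # ᐃᑌᐧᐃᐧᓇ

print(len(precomposed), len(two_character))  # 4 6

# NFC leaves the two-character form as it is.
print(unicodedata.normalize("NFC", two_character) == precomposed)  # False

# Hand-written pairs for this example only: syllabic + dot -> pre-composed.
PRECOMPOSED_W = {
    "\u144c" + MIDDLE_DOT: "\u1458",  # ᑌ + dot -> ᑘ
    "\u1403" + MIDDLE_DOT: "\u140f",  # ᐃ + dot -> ᐏ
}


def compose_w_dots(text: str) -> str:
    """Hypothetical helper: replace known syllabic + dot pairs."""
    for pair, single in PRECOMPOSED_W.items():
        text = text.replace(pair, single)
    return text


print(compose_w_dots(two_character) == precomposed)  # True
```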

Parameters:

  • syllabics (str) – the text with Cree words written in syllabics.
  • produce_macrons (bool) – if True, produces macrons (āēīō) instead of circumflexes (âêîô) (default: False).

Returns:

the text with Cree words written in SRO.

Return type:

str