glibc 2.26: New and Updated Locales

On August 2, 2017, glibc (the GNU C Library) version 2.26 was released. Among other changes, many issues related to the supported locales were addressed, most of them shortly before the release. Let’s see what has changed.

New locales

Compared to the previous version, this release introduces support for 6 new languages: Aguaruna, Bislama, Fiji Hindi, Samoan, Tok Pisin, and Tongan, as well as 2 new variants: South Azerbaijani for Iran and Maithili for Nepal.

Aguaruna is a language spoken by about 38,000–45,000 indigenous people in Peru. Bislama is an official language of Vanuatu, although it is spoken by only about 10,000 people. Fiji Hindi is a language descended from, but distinct from, Hindi. It is spoken by about 300,000 citizens of Fiji, roughly ⅓ of the total population, and is one of the official languages of the country. It is written in both the Latin and the Devanagari scripts. This release introduces the Latin script only, but a Devanagari version is being considered for the future. Tok Pisin is one of the official languages of Papua New Guinea. Although it has only about 120,000 native speakers, or 1.7% of the total population, it is the most widely used language in the country. No wonder, since Papua New Guinea has about 850 native languages.

South Azerbaijani is a variant of the Azerbaijani language spoken by about 13 million people (16% of the total population) in Iran, and Maithili is spoken by about 3 million people (11.5% of the total population) in Nepal. Both languages were previously represented only by their locales for Azerbaijan and India, respectively. Now their users can enjoy more granularity.
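
Whether the locales described above are usable on a given machine depends on the installed glibc version and on which locales have been generated there. As a minimal sketch, the following Python snippet checks which of them can be activated. The locale identifiers (agr_PE, bi_VU, and so on) are assumptions based on glibc's usual language_TERRITORY naming and are not taken from the text above; compare them with the output of `locale -a` on your system.

```python
import locale

# Assumed identifiers for the locales added in glibc 2.26.
# They are not quoted from the article; verify them with `locale -a`.
NEW_LOCALES = {
    "agr_PE": "Aguaruna (Peru)",
    "bi_VU": "Bislama (Vanuatu)",
    "hif_FJ": "Fiji Hindi (Fiji)",
    "sm_WS": "Samoan (Samoa)",
    "tpi_PG": "Tok Pisin (Papua New Guinea)",
    "to_TO": "Tongan (Tonga)",
    "az_IR": "South Azerbaijani (Iran)",
    "mai_NP": "Maithili (Nepal)",
}


def is_locale_available(name):
    """Return True if the C library accepts `name` in setlocale()."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Always restore the previous setting so the check has no side effects.
        locale.setlocale(locale.LC_CTYPE, saved)


if __name__ == "__main__":
    for name, description in NEW_LOCALES.items():
        status = "available" if is_locale_available(name) else "not available"
        print(f"{name:8} {description:30} {status}")
```

A locale reported as not available may simply not have been generated on that system, so the result says nothing definite about the glibc version itself.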

Updates

Bugs in alphabetic sorting in Hungarian and Malayalam have been fixed. Many other fixes were made to date and time elements, mostly month names. Typos in full names, abbreviated names, or both were fixed in, among others, Arabic (many variants), Belarusian, Breton, Friulian, Hindi, Kannada, Konkani, Malayalam, Marathi, Mongolian, Northern Sami, Serbian (Latin only), Spanish (Peru and Uruguay), Uzbek, Yoruba, and Zulu. In total, 55 languages were updated to match the content of CLDR version 31. Weekday names were updated in Arabic, Chechen, and Kashmiri; until now, Saudi Arabian users had them displayed in English. Translated “yes” and “no” strings were added or fixed in many languages.
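
The collation fixes mentioned above affect any program that sorts strings through the C library. Here is a minimal sketch of such sorting in Python, assuming a Hungarian locale named hu_HU.UTF-8 is installed; the locale name and the sample words are illustrative assumptions. If the locale is missing, the function falls back to the C locale, so the resulting order depends on the system.

```python
import locale


def sort_in_locale(words, locale_name):
    """Sort `words` using the collation rules of `locale_name`.

    Falls back to the "C" locale if `locale_name` is not installed.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, locale_name)
        except locale.Error:
            locale.setlocale(locale.LC_COLLATE, "C")
        # strxfrm() transforms each string so that plain comparison of the
        # results follows the active locale's collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    sample = ["zebra", "csók", "cukor", "alma"]  # arbitrary sample words
    print(sort_in_locale(sample, "hu_HU.UTF-8"))
```

Running the same script against glibc 2.25 and 2.26 is one way to see whether a particular word list is affected by the fix.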

Incorrectly appended trailing spaces were removed in several locales, usually from weekday names. The affected locales are mainly those for languages of India, but they also include Albanian (where the issue was first spotted), Haitian, Maltese, and more. This change tidies up date formatting in these locales.
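
Stray whitespace of this kind can be detected by reading the weekday names from the locale data and comparing each one with its stripped form. The sketch below does that through Python's `locale.nl_langinfo`, which is available on Unix-like systems. The Albanian locale name sq_AL.UTF-8 is an assumption; the helper function names are made up for this illustration.

```python
import locale

# nl_langinfo() items for full and abbreviated weekday names.
DAY_ITEMS = [locale.DAY_1, locale.DAY_2, locale.DAY_3, locale.DAY_4,
             locale.DAY_5, locale.DAY_6, locale.DAY_7]
ABDAY_ITEMS = [locale.ABDAY_1, locale.ABDAY_2, locale.ABDAY_3, locale.ABDAY_4,
               locale.ABDAY_5, locale.ABDAY_6, locale.ABDAY_7]


def weekday_names(locale_name):
    """Return the 7 full and 7 abbreviated weekday names of a locale."""
    saved_time = locale.setlocale(locale.LC_TIME)
    saved_ctype = locale.setlocale(locale.LC_CTYPE)
    try:
        # LC_CTYPE is switched too so the names are decoded with the
        # locale's own character encoding.
        locale.setlocale(locale.LC_CTYPE, locale_name)
        locale.setlocale(locale.LC_TIME, locale_name)
        return [locale.nl_langinfo(item) for item in DAY_ITEMS + ABDAY_ITEMS]
    finally:
        locale.setlocale(locale.LC_TIME, saved_time)
        locale.setlocale(locale.LC_CTYPE, saved_ctype)


def names_with_stray_whitespace(names):
    """Return the names that have leading or trailing whitespace."""
    return [name for name in names if name != name.strip()]


if __name__ == "__main__":
    try:
        names = weekday_names("sq_AL.UTF-8")  # assumed locale name
    except locale.Error:
        print("locale not available on this system")
    else:
        print(names_with_stray_whitespace(names))
```

On a system with the fix applied, the printed list would be expected to be empty for the affected locales.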

Unicode 10.0

This version also introduces full support for Unicode 10.0. The changes mainly concern new emoji characters.

It’s worth mentioning that full Unicode 10.0 support was added to glibc only 2 days after the standard’s official release by the Unicode Consortium.