REMINDER: Every slide lasts 20 seconds. So don’t complain! STARTS IN… 3 2 1 0.
This talk lasts 三十分钟
This talk lasts Localisation is easy
Administrative Notes
• @pilif on twitter
• pilif on github
• working at Sensational AG
• warming up to shirts
Thanks Richard for the Recording
About that 💩
Maybe ES6…?
My host name is a horrible spoiler if you're into JRPGs. Disregard
however…
close enough.
Back to the topic at hand
Let’s talk terms
• Language is a language as it is spoken or written
• Locale is the name given to a set of parameters that define how things should be done for users speaking a certain language in a certain place
• There are many more locales than countries
Locale
• Locales consist of a language…
• … and a country
• … and sometimes specific variants
Specifying locales
• IETF BCP-47 document
• See RFC 5646 and RFC 4647
• Use language-script-territory-variant
• POSIX uses language_territory.encoding@modifier
fr-Latn-CH
fr-CH
fr_CH.utf-8
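The two notations above can be converted mechanically in the simple case. A minimal sketch, assuming the POSIX name has the shape language_territory.encoding@modifier; `posixToBcp47` is a hypothetical helper, and it drops the encoding and the modifier instead of mapping them:

```javascript
// Hypothetical helper: turn a POSIX locale name into a BCP 47 language tag.
// Assumption: the encoding (".utf-8") and "@modifier" are dropped, not mapped.
function posixToBcp47(posixName) {
  const withoutModifier = posixName.split('@')[0];
  const withoutEncoding = withoutModifier.split('.')[0];
  const tag = withoutEncoding.replace(/_/g, '-');
  // Intl.getCanonicalLocales validates the tag and normalises its casing.
  return Intl.getCanonicalLocales(tag)[0];
}

console.log(posixToBcp47('fr_CH.utf-8')); // fr-CH

// Intl.Locale splits a tag into its parts.
const locale = new Intl.Locale('fr-Latn-CH');
console.log(locale.language, locale.script, locale.region); // fr Latn CH
```

Special POSIX names such as "C" are not valid language tags and would need separate handling.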
The Locale affects many things
Number formatting
• Probably the most obvious of the bunch.
• Decimal separator
• Thousands separator
• Sign
• Also: Currency information
Some Samples
                     de-CH  de-DE  en-US
decimal separator    .      ,      .
thousands separator  '      .      ,
12,435
en-US twelve thousand four hundred and thirty five
de-DE twelve comma four three five
de-CH error
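The separators do not have to be hard-coded; they can be read from the formatter. A small sketch, assuming a runtime with full locale data; `separators` is a hypothetical helper, and the exact characters depend on the runtime's data (de-CH may use a typographic apostrophe instead of a straight one):

```javascript
// Hypothetical helper: ask the number formatter which separators a locale uses.
function separators(locale) {
  const parts = new Intl.NumberFormat(locale).formatToParts(12345.6);
  const find = (type) => parts.find((part) => part.type === type).value;
  return { decimal: find('decimal'), group: find('group') };
}

for (const locale of ['de-CH', 'de-DE', 'en-US']) {
  // Print the separators and the sample number as each locale writes it.
  console.log(locale, separators(locale), new Intl.NumberFormat(locale).format(12435));
}
```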
Date Formatting
• Obviously names of months and weekdays
• Order of distinct parts
• Separator character
• Commonly used formats in different contexts
Date Formatting
• Libraries usually provide a generic short/medium/long format
• Libraries also provide templates
• If your library’s template language has any characters that are not for replacement, they are doing it wrong
• Apple has been doing it right since OS X 10.11 and iOS 9
2015-07-18 17:47

        Long                               Medium                     Short
en-US   July 18, 2015 at 4:58:00 PM CEST   Jul 18, 2015, 4:58:00 PM   7/18/15, 4:58 PM
fr-CA   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     15-07-18 16:58
fr-CH   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     18.07.15 16:58
fr-FR   18 juillet 2015 16:58:00 UTC+2     18 juil. 2015 16:58:00     18/07/2015 16:58
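A table like this one can be produced with the generic long/medium/short styles. A sketch, assuming a runtime that supports the `dateStyle` and `timeStyle` options of `Intl.DateTimeFormat`; `formatAll` is a hypothetical helper, and the exact strings depend on the runtime's locale data:

```javascript
// 2015-07-18 16:58:00 in Zurich (CEST is UTC+2), stored as a plain instant.
const instant = new Date(Date.UTC(2015, 6, 18, 14, 58, 0));

// Hypothetical helper: format the same instant in all three generic styles.
function formatAll(locale) {
  const make = (style) =>
    new Intl.DateTimeFormat(locale, {
      dateStyle: style,
      timeStyle: style,
      timeZone: 'Europe/Zurich',
    }).format(instant);
  return { long: make('long'), medium: make('medium'), short: make('short') };
}

for (const locale of ['en-US', 'fr-CA', 'fr-CH', 'fr-FR']) {
  console.log(locale, formatAll(locale));
}
```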
Choice of calendar
• Most of the world is using the Gregorian calendar
• The Julian calendar uses the same month names but is off by 13 days (they have July 5th right now)
• Other calendars use different month names
• Might affect holiday calculations
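To see the effect of the calendar choice, the same day can be formatted in several calendars. A sketch, assuming the runtime ships data for the Hebrew and Persian calendars; the calendar is selected with the standard `-u-ca-` Unicode extension of the locale tag, and `formatInCalendar` is a hypothetical helper:

```javascript
// One fixed day, formatted in different calendars with English month names.
const date = new Date(Date.UTC(2015, 6, 18, 12, 0, 0));

// Hypothetical helper: pick the calendar through the "-u-ca-" locale extension.
function formatInCalendar(calendar) {
  return new Intl.DateTimeFormat(`en-US-u-ca-${calendar}`, {
    dateStyle: 'long',
    timeZone: 'UTC',
  }).format(date);
}

for (const calendar of ['gregory', 'hebrew', 'persian']) {
  console.log(calendar, formatInCalendar(calendar));
}
```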
Collation order
• How to compare two strings. Which one is first?
• Where to put the characters with pesky accents?
• How to deal with case differences?
• What about non-Latin scripts?
Collation fun*
• Phonebook German vs. ordinary German vs. Austrian German (dealing with umlauts)
• Contractions (Spanish ch counts as one letter, ch in Czech sorts after h, but c after b, etc)
• Handling of accents is language-dependent
• Case insensitive is a mess
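The phonebook case can be tried out directly. A sketch, assuming a runtime with full ICU collation data; the `co-phonebk` key is the standard Unicode extension for German phonebook order, which treats ä like ae, and the sample words are made up for illustration:

```javascript
const names = ['Afrika', 'Ärzte', 'Adler'];

// Ordinary German: ä sorts with a, the umlaut is only a tie-breaker.
const standard = [...names].sort(new Intl.Collator('de').compare);

// Phonebook German: ä sorts like "ae".
const phonebook = [...names].sort(new Intl.Collator('de-u-co-phonebk').compare);

console.log(standard);  // [ 'Adler', 'Afrika', 'Ärzte' ]
console.log(phonebook); // [ 'Adler', 'Ärzte', 'Afrika' ]
```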
Case folding
• Some languages don’t differentiate between upper- and lowercase
• Inconsistent mapping between upper- and lowercase (ß => SS, the reverse is not always true)
• Uppercasing accented characters is language (and sometimes locale) dependent. French characters often lose accents when uppercasing
• Inconsistent uppercasing for some languages (uppercase Turkish i is İ. Lowercase Turkish I is ı)
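The ß and Turkish i cases are easy to reproduce with the built-in string methods. A short sketch, assuming a runtime with Intl support so that the locale argument is honoured:

```javascript
// ß has no single-character uppercase mapping here, and the round trip is lossy.
console.log('ß'.toUpperCase());  // SS
console.log('SS'.toLowerCase()); // ss

// The default mapping for i/I versus the Turkish one.
console.log('i'.toUpperCase());           // I
console.log('i'.toLocaleUpperCase('tr')); // İ (U+0130)
console.log('I'.toLocaleLowerCase('tr')); // ı (U+0131)
```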
Double the fun
• Collation and Case-Folding provide an interesting team
• Depending on locale, upper- and lowercase should be sorted together or apart
• In some locales, case doesn’t matter at all when sorting
• In some locales, case always matters when sorting
• Depends on the use-case
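How upper- and lowercase are ordered relative to each other can be chosen per use case. A sketch using the `caseFirst` option of `Intl.Collator`; the letter list is made up for illustration:

```javascript
const letters = ['b', 'A', 'a', 'B'];

// Same letters, two policies for which case comes first.
const upperFirst = [...letters].sort(new Intl.Collator('en', { caseFirst: 'upper' }).compare);
const lowerFirst = [...letters].sort(new Intl.Collator('en', { caseFirst: 'lower' }).compare);

console.log(upperFirst); // [ 'A', 'a', 'B', 'b' ]
console.log(lowerFirst); // [ 'a', 'A', 'b', 'B' ]
```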
Collation strength
• ICU created the concept of “collation strength”
• strength 1 is the most lenient
• strength 5 is the most exact
• Example: Strength 1 ignores accents unless the language is Danish, where å is a letter of its own
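In JavaScript a comparable knob is the `sensitivity` option of `Intl.Collator`, which uses names instead of numbers. A sketch, assuming a runtime with Danish collation data; how the names line up with ICU's numeric strengths is not shown here:

```javascript
// 'base' compares base letters only; 'accent' also tells accents apart.
const base = new Intl.Collator('en', { sensitivity: 'base' });
const accent = new Intl.Collator('en', { sensitivity: 'accent' });

console.log(base.compare('a', 'á'));         // 0, the accent is ignored
console.log(accent.compare('a', 'á') !== 0); // true, the accent counts
console.log(accent.compare('a', 'A'));       // 0, case is still ignored

// Danish treats å as a separate letter, so even 'base' keeps it apart from a.
const danish = new Intl.Collator('da', { sensitivity: 'base' });
console.log(danish.compare('a', 'å') !== 0);
```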
‘nough said
RTL
Perspectives matter
Context matters
• “This slide lasts one minute”
• “This talk lasts 30 minutes”
• “Lunch lasted 1:30 hours”
• “Tomorrow I’ll sleep in”
• “August, 1th is a national holiday”
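Two of these cases, plural forms and ordinals, can be handled with `Intl.PluralRules`. A sketch for English only; the word tables and the `lasts` and `ordinal` helpers are hypothetical, and other languages need their own tables with their own plural categories:

```javascript
const cardinalRules = new Intl.PluralRules('en');
const ordinalRules = new Intl.PluralRules('en', { type: 'ordinal' });

// Hypothetical message tables, English only.
const minuteForms = { one: 'minute', other: 'minutes' };
const ordinalSuffixes = { one: 'st', two: 'nd', few: 'rd', other: 'th' };

// Pick the plural form that matches the number.
function lasts(subject, minutes) {
  return `${subject} lasts ${minutes} ${minuteForms[cardinalRules.select(minutes)]}`;
}

// Pick the ordinal suffix that matches the number.
function ordinal(n) {
  return `${n}${ordinalSuffixes[ordinalRules.select(n)]}`;
}

console.log(lasts('This slide', 1)); // This slide lasts 1 minute
console.log(lasts('This talk', 30)); // This talk lasts 30 minutes
console.log(`August ${ordinal(1)}`); // August 1st
```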
Let’s get practical
Locale handling is like escaping
• Always store raw unformatted data
• Format near the end of the chain
• Just before you escape
• Parse user input as early as possible
• Use native data types
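The rules above could look like this in practice. A minimal sketch, assuming a payment record stored as an ISO 8601 timestamp plus an integer amount in minor units; the record shape and the `parseRecord` and `render` helpers are hypothetical:

```javascript
// Stored form: raw, unformatted data.
const stored = { paidAt: '2015-07-18T14:58:00Z', amountCents: 1250, currency: 'USD' };

// Parse early: turn raw input into native types as soon as it enters the program.
function parseRecord(raw) {
  return {
    paidAt: new Date(raw.paidAt),
    amount: raw.amountCents / 100,
    currency: raw.currency,
  };
}

// Format late: the locale only comes into play when building the output.
function render(record, locale) {
  const amount = new Intl.NumberFormat(locale, {
    style: 'currency',
    currency: record.currency,
  }).format(record.amount);
  const date = new Intl.DateTimeFormat(locale, {
    dateStyle: 'medium',
    timeZone: 'UTC',
  }).format(record.paidAt);
  return `${amount} (${date})`;
}

const record = parseRecord(stored);
console.log(render(record, 'en-US'));
console.log(render(record, 'de-CH'));
```

HTML escaping, if needed, would be applied to the string that `render` returns.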
UI Language is not locale
• Users might prefer to use the OS in a different language than the one inferred from their locale
• Just because I’m in de_CH doesn’t mean I want your software to speak German to me
• UI language is completely different from the user’s locale
Avoid this mess
Mixing Locales
• Forming sentences in UI language with locale formatted data is… challenging
• Be mindful that language might influence some locale formatting.
• “This talk lasts ”
• or rather “This talk lasts 30 minutes”
• It depends. Does the locale also use hours and minutes?
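One way to keep the two settings apart is to pass them separately. A sketch, assuming the UI strings live in a table keyed by UI language; the `messages` table and the `totalLine` helper are hypothetical:

```javascript
// Hypothetical UI strings, keyed by UI language (not by locale).
const messages = {
  en: (formatted) => `Total: ${formatted}`,
  de: (formatted) => `Summe: ${formatted}`,
};

// The sentence comes from the UI language, the number format from the locale.
function totalLine(uiLanguage, formatLocale, value) {
  const formatted = new Intl.NumberFormat(formatLocale).format(value);
  return messages[uiLanguage](formatted);
}

console.log(totalLine('en', 'de-DE', 1234.5)); // Total: 1.234,5
console.log(totalLine('de', 'en-US', 1234.5)); // Summe: 1,234.5
```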
Never be helpful* and translate units
1 kg in de_CH is not 1 lb in en_US
Btw: Apple’s APIs are really good at this
What about web sites?
• Never, ever infer UI language by IP Geolocation.
• Ever. Ever. EVER.
• Promise!
• You may infer Locale from IP Geolocation though
People from Google: This slide is for you!
Rely on HTTP
• Trust Accept-Language: by now, browsers set it correctly
• Use the header to determine UI language
• Use the header to determine default locale
• But ask the user
• Same goes for time zones
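Reading the header takes little code. A minimal sketch, assuming the header value arrives as a plain string; `parseAcceptLanguage` and `pickLanguage` are hypothetical helpers, and the matching is deliberately simple (exact tag first, then the bare language):

```javascript
// Parse an Accept-Language value into language tags, most preferred first.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((item, index) => {
      const [tag, ...params] = item.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number(qParam.slice(2)) : 1; // q defaults to 1
      return { tag: tag.trim(), q, index };
    })
    .filter((entry) => entry.tag !== '' && entry.q > 0) // drops q=0 and invalid q
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map((entry) => entry.tag);
}

// Pick the first preferred tag the application supports, else a fallback.
function pickLanguage(header, supported, fallback) {
  for (const tag of parseAcceptLanguage(header)) {
    const exact = supported.find((s) => s.toLowerCase() === tag.toLowerCase());
    if (exact) return exact;
    const language = tag.split('-')[0].toLowerCase();
    const byLanguage = supported.find((s) => s.toLowerCase() === language);
    if (byLanguage) return byLanguage;
  }
  return fallback;
}

console.log(parseAcceptLanguage('de-CH,de;q=0.9,en;q=0.8')); // [ 'de-CH', 'de', 'en' ]
console.log(pickLanguage('fr-CH, en;q=0.5', ['en', 'fr'], 'en')); // fr
```

The result is only a default; the user's explicit choice should still win.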
SHOW ME SOME CODE ALREADY!!!
The past
• There has always been date formatting (Date.toLocaleString). Mostly useless
• People were self-nebling (search YouTube for “ich neble selber”), for example in date pickers and libraries
• hint: applying substr() to Date.toDateString() is not a correct solution.
• same goes for using replace('.', ',') on a number
The present
• Microsoft has donated a huge chunk of localisation code to the jQuery project.
• It’s not integrated into jQuery, but maintained by the jQuery project
• Check out https://github.com/jquery/globalize
• Doesn’t support collation
• The library is big
• But most of it is data and this problem can only be solved with a huge database of special cases
// Assumes the CLDR data for fr-CH has already been loaded with Globalize.load()
Globalize.locale("fr-CH");
console.log(Globalize.formatDate(new Date(), { datetime: "medium" }));
console.log(Globalize.formatDate(new Date(), { skeleton: "yMMMM" }));
console.log(Globalize.formatNumber(12345.6789));
console.log(Globalize.formatCurrency(1956.3334, "EUR"));
console.log(Globalize.formatRelativeTime(-35, "second"));
The future
• ECMA-402 from 2012
• Yes. Specs from 2012 are “the future” in JS land
• Provides the global Intl object
• Date, Number formatting and Collation
• see: http://www.ecma-international.org/ecma-402/1.0/
Could be worse
Node.js is still bikeshedding because of ICU
var f = new Intl.DateTimeFormat('de-CH', {
  weekday: 'long',
  year: 'numeric',
  month: 'long',
  day: 'numeric'
});
console.log(f.format(new Date()));

var n = new Intl.NumberFormat('de-CH', { style: "decimal", minimumFractionDigits: 2 });
console.log(n.format(1234.5));

var currency = new Intl.NumberFormat('de-CH', { style: "currency", currency: 'EUR' });
console.log(currency.format(1234.5));

// sort() needs a comparison function, so pass the collator's compare method
var comp = new Intl.Collator('de-CH');
var words = [ "Swissjs", "swissjs", "is", "loads", "of", "fun" ];
console.log(words.sort(comp.compare));
Conclusion
• Proper localisation is part of our job to make the web useful for everybody
• Use the libraries provided
• Whenever you think you know better than the library: No. You don’t.
• Remember that UI language and Locale are not always connected
• Don’t do IP geolocation for language choice
• When in doubt: Ask the user. She’ll know for sure.
Before I leave
"👨‍❤️‍💋‍👨".length
[..."👨‍❤️‍💋‍👨"].length
In case you answered 11 and 8, I salute you
Thanks everyone and enjoy your evening
• U+1F468 (MAN) 👨
• U+200D (ZERO WIDTH JOINER)
• U+2764 (HEAVY BLACK HEART) ❤
• U+FE0F (VARIATION SELECTOR-16)
• U+200D (ZERO WIDTH JOINER)
• U+1F48B (KISS MARK) 💋
• U+200D (ZERO WIDTH JOINER)
• U+1F468 (MAN) 👨
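The two numbers follow from the list above: `length` counts UTF-16 code units, while spreading the string iterates over code points. A sketch that builds the string from escapes so that the invisible joiners are visible in the source; the `Intl.Segmenter` part only runs where the runtime provides it:

```javascript
// The eight code points listed above, written as escapes.
const kiss = '\u{1F468}\u200D\u2764\uFE0F\u200D\u{1F48B}\u200D\u{1F468}';

// Three of the code points lie outside the BMP and take two code units each.
console.log(kiss.length);      // 11
console.log([...kiss].length); // 8

// Where available, Intl.Segmenter counts user-perceived characters instead.
if (typeof Intl.Segmenter === 'function') {
  const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
  console.log([...segmenter.segment(kiss)].length);
}
```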