Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the month detection for lowercase and long abbreviated month names. Fix ddd and mmm formats to make them fit more locales. #10

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Exoth
Copy link

@Exoth Exoth commented Aug 19, 2013

Let's try the usual case:

>> I18n.locale = :en
:en

>> "18 October 2013".to_date
Fri, 18 Oct 2013

>> "18 october 2013".to_date
Fri, 18 Oct 2013

>> "18 oct 2013".to_date
Fri, 18 Oct 2013

>> "18 Oct 2013".to_date
Fri, 18 Oct 2013

So far everything is ok. But in fact there are a lot of locales, which have lowercase and long abbreviated month names, like 'octobre' and 'oct.' in French.

>> I18n.locale = :fr
:fr

>> "18 Octobre 2013".to_date
ArgumentError: invalid date
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `new'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `to_date'
  from (irb):39
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
  from /home/exoth/projects/jetradar/script/rails:6:in `require'
  from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
  from -e:1:in `load'
  from -e:1:in `<main>'

>> "18 octobre 2013".to_date
ArgumentError: invalid date
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `new'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:12:in `to_date'
  from (irb):41
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
  from /home/exoth/projects/jetradar/script/rails:6:in `require'
  from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
  from -e:1:in `load'
  from -e:1:in `<main>'

>> "18 oct. 2013".to_date
NoMethodError: undefined method `map' for nil:NilClass
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:11:in `to_date'
  from (irb):43
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
  from /home/exoth/projects/jetradar/script/rails:6:in `require'
  from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
  from -e:1:in `load'
  from -e:1:in `<main>'

>> "18 Oct. 2013".to_date
NoMethodError: undefined method `map' for nil:NilClass
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/bundler/gems/timeliness-9c8161f4ab08/lib/timeliness/core_ext/string.rb:11:in `to_date'
  from (irb):45
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:47:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands/console.rb:8:in `start'
  from /home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/railties-3.2.14/lib/rails/commands.rb:41:in `<top (required)>'
  from /home/exoth/projects/jetradar/script/rails:6:in `require'
  from /home/exoth/projects/jetradar/script/rails:6:in `<top (required)>'
  from -e:1:in `load'
  from -e:1:in `<main>'

So that pull request fixes the month detection behavior to properly handle such situations.

And the second part is about regexp used for ddd and mmm formats.

Here's a simple code, checking how well the regexps perform against translations from all the locales found in the rails-i18n gem:

def check key, regexp
  bad = []
  locales = Dir['/home/exoth/.rvm/gems/ruby-1.9.3-p448@jetradar/gems/rails-i18n-0.7.4/rails/locale/*']
  locales.each do |file| 
    q = YAML.load(File.read(file))
    bad << q.keys.first if q.values.first["date"][key].compact.any?{|i| i !~ regexp} 
  end
  puts "#{bad.count}/#{locales.count} miss the regexp: #{bad.join(', ')}."
end

>> check 'month_names', /^\w{3,9}$/
check 'month_names', /^\w{3,9}$/
59/80 miss the regexp: es-AR, th, es-VE, el, is, az, de, tr, he, cy, zh-CN, fr-CH, lv, ca, hi-IN, pl, et, zh-HK, fr, mk, hi, hr, sk, uk, lt, es-PE, ro, pt, fa, pt-BR, lo, ne, es-419, kn, gl, ar, hu, fi, cs, es, sw, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, wo, fr-CA, ko, vi, bn, de-CH, mn.

>> check 'abbr_month_names', /^\w{3,9}$/
43/80 miss the regexp: th, el, is, az, de, tr, he, zh-CN, fr-CH, lv, ca, uz, pl, et, zh-HK, fr, mk, hi, hr, sk, uk, fa, lo, ne, gl, ar, hu, fi, cs, zh-TW, ja, eo, bg, sr, or, de-AT, ru, fr-CA, ko, vi, bn, de-CH, mn.

>> check 'day_names', /^\w{3,9}$/
66/80 miss the regexp: es-AR, th, es-VE, el, sl, is, az, de, tr, he, cy, sv, zh-CN, id, lv, nb, uz, hi-IN, pl, et, zh-HK, mk, hi, hr, sk, uk, lt, es-PE, it, ro, nn, bs, pt, it-CH, fa, pt-BR, da, lo, ne, tl, es-419, kn, gl, ar, hu, fi, eu, cs, es, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, wo, ko, vi, bn, de-CH, mn.

>> check 'abbr_day_names', /^\w{3,9}$/
check 'abbr_day_names', /^\w{3,9}$/
61/80 miss the regexp: es-AR, th, es-VE, el, sl, is, az, de, tr, he, sv, zh-CN, rm, lv, ca, nb, uz, hi-IN, pl, et, zh-HK, mk, hi, hr, sk, uk, lt, es-PE, ro, nn, bs, pt, fa, pt-BR, da, lo, ne, es-419, kn, ar, hu, fi, cs, es, sw, zh-TW, ja, es-MX, es-CL, eo, es-CO, bg, sr, or, de-AT, ru, ko, vi, bn, de-CH, mn.

So there are quite a lot of misses. So far these are the best results I've managed to get by making minor changes to the regexps:

>> check 'month_names', /^[[[:alnum:]]\.]{2,11}$/
11/80 miss the regexp: th, hi-IN, hi, ne, kn, sw, or, wo, vi, bn, mn.

>> check 'abbr_month_names', /^[[[:alnum:]]\.]{2,11}$/
8/80 miss the regexp: ca, hi, lo, ne, or, vi, bn, mn.

>> check 'day_names', /^[[[:alnum:]]\.]{1,14}$/
13/80 miss the regexp: th, az, cy, id, hi-IN, hi, uk, fa, ne, kn, or, vi, bn.

>> check 'abbr_day_names', /^[[[:alnum:]]\.]{1,14}$/
8/80 miss the regexp: hi-IN, hi, lo, ne, kn, or, vi, bn.

Much better already, though there are still some locales left, but they mostly involve multiword names, and this probably can't be fixed just by changing the format regexps. The way I see is to use the list of month/day names of the current locale at the stage of format processing.

Anyway, the pull request includes the changes in formats as specified above.

@Exoth
Copy link
Author

Exoth commented Dec 13, 2013

Now when this solution couldn't match the Thai February month name translation I decided to go even further and defined mmm like this:

months = I18n.available_locales.flat_map do |locale| 
  I18n.t('date.month_names', locale: locale) + I18n.t('date.abbr_month_names', locale: locale)
end 
months.compact.uniq.map { |month| Regexp.escape(month) }.join('|')

and enabled case insensitive search. This way I'm guaranteed to match any possible month translation. But regexp is much bigger now and I don't know how it affects the performance. Anyway tell me if you find this solution suitable for the timeliness gem and I'll make a pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant