Question

In Chronic 0.9.1, parsing Febr 2013 gives me June 2013. Feb 2013 is parsed fine, but Febr 2013 is not.

I think the issue occurs when the month abbreviation has four letters.

I need to:

  • Parse Febr 2013 to February 2013, or
  • Invalidate Febr 2013.

To validate a date I use:

Chronic.parse(params[:date]).blank?

Is this a bug? Is there a workaround? Or is there a right way to validate this?


Solution

Technically it's a bug, but I'm more inclined to call it a hole in their logic. Here's how Chronic::Repeater.scan_for_month_names decides what a month is:

# File 'lib/chronic/repeater.rb', line 38

def self.scan_for_month_names(token)
  scan_for token, RepeaterMonthName,
  {
    /^jan[:\.]?(uary)?$/ => :january,
    /^feb[:\.]?(ruary)?$/ => :february,
    /^mar[:\.]?(ch)?$/ => :march,
    /^apr[:\.]?(il)?$/ => :april,
    /^may$/ => :may,
    /^jun[:\.]?e?$/ => :june,
    /^jul[:\.]?y?$/ => :july,
    /^aug[:\.]?(ust)?$/ => :august,
    /^sep[:\.]?(t[:\.]?|tember)?$/ => :september,
    /^oct[:\.]?(ober)?$/ => :october,
    /^nov[:\.]?(ember)?$/ => :november,
    /^dec[:\.]?(ember)?$/ => :december
  }
end

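Each pattern accepts either the three-letter abbreviation (optionally followed by "." or ":") or the entire name; September also allows "sept". A four-letter form such as Febr matches none of them.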
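To see the gap in isolation, here is a minimal sketch that tests a few tokens against the February pattern copied from the method above. The constant FEBRUARY_PATTERN and the helper february_token? are names made up for this illustration, and the downcase call is only an assumption standing in for whatever normalization Chronic applies before scanning.

```ruby
# Pattern copied from the scan_for_month_names listing above.
FEBRUARY_PATTERN = /^feb[:\.]?(ruary)?$/

# Hypothetical helper (not part of Chronic): true when the token would be
# recognized as February by that pattern. Downcasing is an assumption here.
def february_token?(token)
  FEBRUARY_PATTERN.match?(token.downcase)
end

%w[feb feb. february febr].each do |token|
  puts "#{token} => #{february_token?(token)}"
end
```

Only febr prints false: the optional group has to match all of "ruary" or nothing, so a partial spelling falls through.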

You could extract that method from the source, modify the patterns to fit your needs, then overwrite that method, along with submitting it as a patch so the tweak gets added to future revisions of the gem. Or, you could modify the incoming string by searching for the three-letter abbreviations at the beginning of a word, and trimming extraneous characters.
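The second option, cleaning up the incoming string, could look something like this. It is a sketch under a few assumptions rather than anything Chronic provides: MONTH_NAMES and normalize_month_abbreviations are hypothetical names, and only words of four or more letters that are a prefix of a month name get trimmed to three letters.

```ruby
MONTH_NAMES = %w[
  january february march april may june
  july august september october november december
].freeze

# Hypothetical helper: trims any word of four or more letters that is a
# prefix of a month name down to its three-letter abbreviation.
# Other words are left untouched.
def normalize_month_abbreviations(text)
  text.gsub(/\b[a-z]{4,}\b/i) do |word|
    month = MONTH_NAMES.find { |name| name.start_with?(word.downcase) }
    month ? month[0, 3] : word
  end
end

puts normalize_month_abbreviations('Febr 2013')
puts normalize_month_abbreviations('next friday')
```

The cleaned string can then be handed to Chronic.parse as before. Every four-letter prefix of a month name is unique, so find never has to choose between two candidates.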


OK, here's something to chew on:

require 'abbrev'

MONTHS = %w[
  january
  february
  march
  april
  may
  june
  july
  august
  september
  october
  november
  december
]

MONTHS_ABBREV = Abbrev.abbrev(MONTHS)
MONTHS_REGEX = /\b(?:j(?:a(?:n(?:u(?:a(?:ry?)?)?)?)?|u(?:ly?|ne?))|s(?:e(?:p(?:t(?:e(?:m(?:b(?:er?)?)?)?)?)?)?)?|a(?:u(?:g(?:u(?:st?)?)?)?|p(?:r(?:il?)?)?)|d(?:e(?:c(?:e(?:m(?:b(?:er?)?)?)?)?)?)?|f(?:e(?:b(?:r(?:u(?:a(?:ry?)?)?)?)?)?)?|n(?:o(?:v(?:e(?:m(?:b(?:er?)?)?)?)?)?)?|o(?:c(?:t(?:o(?:b(?:er?)?)?)?)?)?|ma(?:r(?:ch?)?|y))\b/i

%w[j ja jan janu january f fe feb febr february].each do |m|
  puts "#{ m } => #{ MONTHS_ABBREV[m[MONTHS_REGEX]] }" 
end

Which outputs:

j =>
ja => january
jan => january
janu => january
january => january
f => february
fe => february
feb => february
febr => february
february => february

In other words, j isn't unique, so there isn't a hit. ja is unique and is associated with january, as are the rest of the ja... tests. f is unique so it hits, as do all the rest of the f... tests.

What does Abbrev.abbrev do? It maps every unambiguous prefix of each word passed in, plus the word itself, to the full word. Here's what it looks like if I only use four months:

require 'abbrev'

MONTHS = %w[
  march
  may
  june
  july
]

MONTHS_ABBREV = Abbrev.abbrev(MONTHS)
pp MONTHS_ABBREV

Resulting in:

{"marc"=>"march",
 "mar"=>"march",
 "jun"=>"june",
 "jul"=>"july",
 "march"=>"march",
 "may"=>"may",
 "june"=>"june",
 "july"=>"july"}

Those make wonderful seed values for a regular expression.
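As a rough illustration of seeding a pattern from those keys, the sketch below builds a plain alternation with Regexp.union. It is not the compact trie-style regex shown earlier, just a simpler stand-in; ABBREVIATIONS, MONTH_PATTERN and month_for are hypothetical names. It assumes the abbrev library is available (it ships as a bundled gem from Ruby 3.4 on).

```ruby
require 'abbrev'

MONTH_NAMES = %w[
  january february march april may june
  july august september october november december
].freeze

# Every unambiguous prefix (and each full name) mapped to the full name.
ABBREVIATIONS = Abbrev.abbrev(MONTH_NAMES)

# Plain alternation of all keys, longest first, anchored on word boundaries.
MONTH_PATTERN = /\b(?:#{Regexp.union(ABBREVIATIONS.keys.sort_by { |key| -key.length }).source})\b/i

# Hypothetical helper: full month name for a word, or nil when the word is
# ambiguous or not a month prefix at all.
def month_for(word)
  ABBREVIATIONS[word[MONTH_PATTERN]&.downcase]
end

%w[j ja Febr sept ma may].each do |word|
  puts "#{word} => #{month_for(word).inspect}"
end
```

Downcasing the match before the hash lookup means mixed-case input such as Febr still resolves, since the keys produced by Abbrev.abbrev are all lowercase here.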

Where did I get MONTHS_REGEX? Heh... it was generated by some magical Perl code using a little-known module called Regexp::Assemble, which I dearly miss in Ruby. It's skanky... no, it's... diabolically good and closely tied to how Perl does things, and it makes my head hurt when I read through it; otherwise I'd have ported it.

Licensed under: CC-BY-SA with attribution