April 2013 – Weapon of Choice

2013-04-27

Fun with the Gregorian Calendar

Without a proper method it’s not really possible to calculate the day of the week of a given date. A trivial proper method is to look it up in a calendar, be it analog or digital. Of course you can do that, but it’s not nearly as fun as the method I’m going to describe here.

First and foremost, we know that a year is 365 days, which in turn is 52 weeks plus one extra day. So, if it were that easy, we could infer that the same date one year in the future falls on the next day of the week. For example, today is 2013-04-27 and it is a Saturday, so 2014-04-27 should be a Sunday, which it is. Analogously, going the other way around, 2012-04-27 should be a Friday, which it is. OK, so a proper method could simply be to add as many days as there are years into the future, or subtract as many days as there are years into the past, to get from today to the desired date.

Sadly, it’s not that easy, even if we stay within the limits of the Gregorian Calendar. In fact it’s much more difficult because of leap years, which happen once every four years… approximately. The real, succinct rule for finding out whether a given year is a leap year is as follows:

A year is a leap year if it's divisible by 400 or it's divisible by 4 but not by 100.

For example, year 2000 was a leap year, but 1900 was not, and 2012 was, but 2013 is not.
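
The rule translates almost word for word into code. Here is a minimal sketch in Ruby; the method name leap_year? is my own choice for illustration, and the standard library's Date.gregorian_leap? can be used to double-check it.

```ruby
require 'date'

# A year is a leap year if it's divisible by 400,
# or it's divisible by 4 but not by 100.
def leap_year?(year)
  (year % 400).zero? || ((year % 4).zero? && !(year % 100).zero?)
end

# The four years mentioned above.
[2000, 1900, 2012, 2013].map { |y| leap_year?(y) }
# => [true, false, true, false]
```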

The next important bit is to find out how many leap years there are between any two dates. We could easily find it if we already knew how many leap years there are between the 1st day of March of the years of those two dates. In fact,

leap_years(date1, date2) = leap_years(date1, march1st(year1)) + leap_years(march1st(year1), march1st(year2)) + leap_years(march1st(year2), date2)

For example, if date1 was 1800-01-10 and date2 was 2000-10-01, then

leap_years(1800-01-10, 2000-10-01) = leap_years(1800-01-10, 1800-03-01) + leap_years(1800-03-01, 2000-03-01) + leap_years(2000-03-01, 2000-10-01)

and in this case it would be

leap_years(1800-01-10, 2000-10-01) = 0 + x + 0 = x

Both the first and the last term are 0: the first because, although 1800-01-10 comes before 1800-03-01, 1800 is not a leap year; the last because, although 2000 is a leap year, 2000-10-01 comes after 2000-03-01, so February 29 does not fall in that interval.

To find out how many leap years there are between the 1st day of March of any two years, we can use a little trick, derived from the rule for identifying a leap year. It is a trick because what follows only works for time intervals, and only if both limits are after the year 1582, when the Gregorian Calendar was first adopted by Catholic countries.

leap_years(march1st_year1, march1st_year2) = fake_leap_years(year2) - fake_leap_years(year1)

fake_leap_years(y) = y:4 - y:100 + y:400, where : stands for the integer division

Following up with our example, we’d have

leap_years(1800-01-10, 2000-10-01) =
= leap_years(1800-03-01, 2000-03-01) = 
= fake_leap_years(2000) - fake_leap_years(1800) = 
= (500 - 20 + 5) - (450 - 18 + 4)  =
= 485 - 436 =
= 49
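
The decomposition and the trick can be sketched in Ruby as follows. This is only a sketch under a few assumptions of mine: date1 is not after date2, both are Gregorian dates, and "leap years between two dates" is read as the number of February 29ths in the half-open interval from date1 (included) to date2 (excluded). The first and last terms are written out as simple checks rather than as recursive calls.

```ruby
require 'date'

# y:4 - y:100 + y:400, where / on Ruby integers is the integer division.
def fake_leap_years(y)
  y / 4 - y / 100 + y / 400
end

# Number of February 29ths between date1 (included) and date2 (excluded).
# Assumes date1 <= date2.
def leap_years(date1, date2)
  # First term: date1 up to March 1st of its own year.
  first = Date.gregorian_leap?(date1.year) && date1 < Date.new(date1.year, 3, 1) ? 1 : 0
  # Middle term: March 1st of year1 up to March 1st of year2.
  middle = fake_leap_years(date2.year) - fake_leap_years(date1.year)
  # Last term: March 1st of year2 back or forward to date2.
  last = Date.gregorian_leap?(date2.year) && date2 < Date.new(date2.year, 3, 1) ? -1 : 0
  first + middle + last
end

leap_years(Date.new(1800, 1, 10), Date.new(2000, 10, 1)) # => 49
```

The last term is negative when date2 falls before March 1st of a leap year, because the middle term has already counted a February 29th that date2 has not reached yet.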

If Monday = 1, Tuesday = 2, …, Saturday = 6, Sunday = 0; and today is 2013-04-27, which is a Saturday (T = 6), and we want to know which day of the week (W) 2011-08-30 was; then here is how we can proceed. All sums and differences of days of the week are meant modulo 7.

2013-04-27 >>> 2013-08-30
1. 2013-04-27 >>> 2013-04-01
  1. A: offset of Apr 27, 2013 from Apr 1, 2013
    = (27 – 1) % 7 = 26 % 7 = 5, because 26 = 3 * 7 + 5
  2. B: day of week of Apr 1, 2013
    = T – A
    = 6 – 5 = 1 = Monday
    (always use a minus here because the 1st is never after any other day of the month)
2. 2013-04-01 >>> 2013-08-01
  1. C: # days in between Apr 1, 2013 and Aug 1, 2013
    = # days of Apr + # days of May + # days of Jun + # days of Jul
    = 30 + 31 + 30 + 31 = 122
  2. D: offset of Aug 1, 2013 from Apr 1, 2013
    = C % 7 = 122 % 7 = 3, because 122 = 17 * 7 + 3
  3. E: day of week of Aug 1, 2013
    = B + D
    = 1 + 3 = 4 = Thursday
    (use a minus or a plus depending on whether the month to get to is before or after)
3. 2013-08-01 >>> 2013-08-30
  1. F: offset of Aug 30, 2013 from Aug 1, 2013
    = (30 – 1) % 7 = 29 % 7 = 1, because 29 = 4 * 7 + 1
  2. G: day of week of Aug 30, 2013
    = E + F
    = 4 + 1 = 5 = Friday
    (always use a plus here because the 1st is never after any other day of the month)
2011-08-30 >>> 2013-08-30
1. H: # leap years in between Aug 30, 2011 and Aug 30, 2013
  => leap_years(2011-08-30, 2013-08-30)
  = leap_years(2011-08-30, 2011-03-01) + leap_years(2011-03-01, 2013-03-01) + leap_years(2013-03-01, 2013-08-30)
  = -0 + fake_leap_years(2013) – fake_leap_years(2011) + 0
  = (503 – 20 + 5) – (502 – 20 + 5)
  = 1
2. J: # years in between Aug 30, 2011 and Aug 30, 2013
  = 2013 – 2011 = 2
3. K: offset of Aug 30, 2013 from Aug 30, 2011
  = H + J
  = 1 + 2 = 3
4. W: day of week of Aug 30, 2011
  => day of week of Aug 30, 2013 = day of week of Aug 30, 2011 + offset of Aug 30, 2013 from Aug 30, 2011
  => G = W + K
  => W = G – K = 5 – 3 = 2 = Tuesday
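
For the curious, the whole procedure can be sketched in Ruby. This is just one possible transcription, not a polished implementation: the method names are made up for illustration, days of the week are numbered as above (Sunday = 0), every result is reduced modulo 7, and a target of February 29 only works if the known year is a leap year too.

```ruby
require 'date'

DAY_NAMES  = %w[Sunday Monday Tuesday Wednesday Thursday Friday Saturday]
MONTH_DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def fake_leap_years(y)
  y / 4 - y / 100 + y / 400
end

# February 29ths between from (included) and to (excluded), from <= to.
def leap_days(from, to)
  first = Date.gregorian_leap?(from.year) && from < Date.new(from.year, 3, 1) ? 1 : 0
  last  = Date.gregorian_leap?(to.year) && to < Date.new(to.year, 3, 1) ? 1 : 0
  first + fake_leap_years(to.year) - fake_leap_years(from.year) - last
end

def days_in_month(year, month)
  month == 2 && Date.gregorian_leap?(year) ? 29 : MONTH_DAYS[month - 1]
end

# known is a date whose day of the week (known_dow) we already know.
def day_of_week(target, known, known_dow)
  year = known.year
  # 1. Back to the 1st of the known month (B).
  b = (known_dow - (known.day - 1)) % 7
  # 2. To the 1st of the target month, staying in the known year (E).
  lo, hi = [known.month, target.month].minmax
  c = (lo...hi).sum { |m| days_in_month(year, m) }
  e = target.month >= known.month ? (b + c) % 7 : (b - c) % 7
  # 3. Forward to the target day of the month (G).
  g = (e + (target.day - 1)) % 7
  # 4. Shift by whole years: one day per year plus one per February 29th (K).
  same_day = Date.new(year, target.month, target.day)
  if target <= same_day
    (g - ((year - target.year) + leap_days(target, same_day))) % 7
  else
    (g + ((target.year - year) + leap_days(same_day, target))) % 7
  end
end

w = day_of_week(Date.new(2011, 8, 30), Date.new(2013, 4, 27), 6)
puts DAY_NAMES[w] # prints "Tuesday"
```

The result can be checked against Ruby's own Date#wday, which uses the same numbering.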

2013-04-19, updated 2021-09-26

How to share code among ActiveRecord models

My last project was migrating a Rails 2.3 app to Rails 3.2. At some point I was updating tens of definitions of named scopes, a pretty easy but tedious task.

When I see many similar code snippets, generalizations pop up in my mind. That never fails, and this time I found many named scopes that could be parametrized in some way or another. So I did it and started replacing those occurrences with calls to my generic scopes.

Later on, when it came time to test whether my changes worked or not… (f*ck) I found out there was a pretty good reason why a bunch of them were the way they were, i.e. not parametrized. Aargh

The first generic scope: or_where

To explain the issue I need to take a little detour. One scope I could not live without anymore was the where with an OR, i.e. the where that makes the OR of its arguments. It’s very strange that Rails implements only the where with an AND. I mean, everyone knows that AND and OR go hand in hand…

module GenericScopes

  module ClassMethods

    # Same job as ActiveRecord's merge_conditions, but the sanitized
    # segments are joined with "or" instead of "AND".
    def merge_conditions_with_or(*conditions)
      segments = []
      conditions.each do |condition|
        unless condition.blank?
          sql = sanitize_sql(condition)
          segments << sql unless sql.blank?
        end
      end
      "(#{segments.join(') or (')})" unless segments.empty?
    end

  end

  def self.included(model)

    model.extend ClassMethods

    model.class_eval do

      # Each key/value pair of a Hash becomes a condition of its own,
      # so that the pairs get ORed together like any other argument.
      model.named_scope :or_where, lambda { |*conditions|
        split_conditions = []
        conditions.each { |condition|
          if condition.is_a?(Hash)
            condition.each { |key, value|
              split_conditions << {key => value}
            }
          else
            split_conditions << condition
          end
        }
        sql = merge_conditions_with_or(*split_conditions)
        where(sql)
      }

    end

  end

end

About the GenericScopes module, the interesting thing to note is that it will work without modifications with the final solution I’m going to describe here. In fact, its code could be buggy (I didn’t even try to make it work as nicely as the default where) but it works well enough for my own needs. (See an example at the end of this post.)

The problem with the default where is that the AND of the arguments is hardcoded into the merge_conditions method. Instead of changing anything in the ActiveRecord code, I opted for copy-and-pasting that method and replacing the AND with an OR. I use lower case SQL so that when I inspect queries I can easily understand if a piece of query was generated by me or by Rails.
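
To see what the splitting and the joining do without booting Rails, here is a plain Ruby sketch. Be warned that naive_sanitize is a made-up stand-in for ActiveRecord's sanitize_sql: it does no quoting or escaping at all and only exists to make the example runnable. The other two method names are hypothetical as well.

```ruby
# Made-up stand-in for sanitize_sql: NOT safe, for illustration only.
def naive_sanitize(condition)
  if condition.is_a?(Hash)
    condition.map { |key, value| "#{key} = '#{value}'" }.join(' AND ')
  else
    condition.to_s
  end
end

# Every key/value pair of a Hash becomes a condition of its own.
def split_conditions(conditions)
  conditions.flat_map do |condition|
    condition.is_a?(Hash) ? condition.map { |key, value| { key => value } } : [condition]
  end
end

# Join the sanitized segments with a lower case "or".
def merge_with_or(conditions)
  segments = conditions.map { |c| naive_sanitize(c) }.reject(&:empty?)
  "(#{segments.join(') or (')})" unless segments.empty?
end

merge_with_or(split_conditions([{ :full_name => 'John Doe', :admin => true }, 'age > 30']))
# => "(full_name = 'John Doe') or (admin = 'true') or (age > 30)"
```

Without the splitting step, the two pairs of the Hash would end up in the same segment and be ANDed together.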

The issue I’m trying to explain pops up with the call to sanitize_sql from merge_conditions_with_or. If the method were public, it could be called directly like this

User.sanitize_sql(:full_name => 'John Doe')
  # => ["`users`.`full_name` = 'John Doe'"]

but, if an exception wasn’t raised, it would be like this

User.or_where(:full_name => 'John Doe') 
  # internally, the sanitize_sql would return 
  # => ["``.`full_name` = 'John Doe'"]

In fact, when the call is applied to User, and when it takes place inside of merge_conditions_with_or (even if called from or_where which is in turn applied to User), the values of self are completely different. While in the former case it’s User (concrete), in the latter self is ActiveRecord::Base (abstract).
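
The same effect can be reproduced in plain Ruby, with no Rails involved. The sketch below is an analogy, not the actual ActiveRecord code: SelfProbe, AbstractBase and scope_owner are hypothetical names, and the assumption is that the scope's lambda is simply invoked with call, so it keeps the self it had when it was created.

```ruby
module SelfProbe
  def self.included(model)
    model.class_eval do
      # This lambda is created while self is `model`,
      # and it keeps that self whenever it is called later.
      who = lambda { name }
      model.define_singleton_method(:scope_owner) { who.call }
    end
  end
end

class AbstractBase; end

# Included once, into the abstract class.
AbstractBase.send(:include, SelfProbe)

class User < AbstractBase; end

User.scope_owner # => "AbstractBase"
```

Even though the call is applied to User, the code inside the lambda runs as if it were applied to the abstract class.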

Problem

The problem was the way generic scopes were made available to models.

ActiveRecord::Base.send(:include, GenericScopes)

The inclusion of GenericScopes into ActiveRecord::Base was the cause of all the trouble with sanitize_sql. It had been working fine for years in production, but for my new scopes I needed to move the inclusion of the module from the abstract ActiveRecord::Base to each and every concrete model, like this:

class User < ActiveRecord::Base
  include GenericScopes
  #...
end

class Post < ActiveRecord::Base
  include GenericScopes
  #...
end

#...

Solution

That is quite tedious, and you must remember to include GenericScopes in every model. Luckily Ruby provides the inherited hook, which is called on a class every time it gets subclassed. The following code is then a pretty good way to share code among ActiveRecord models. The nice things are that (1) it follows the DRY principle, (2) it is completely automatic, (3) shared code is properly bound to the inheriting class.

module ActiveRecord
  class Base
    class << self
      def inherited_with_generic_scopes(model)
        inherited_without_generic_scopes(model)
        model.send(:include, GenericScopes)
      end
      alias_method_chain :inherited, :generic_scopes
    end
  end
end
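
Here is the same idea in plain Ruby, as a sketch to try out in irb. SelfProbe, HookedBase and scope_owner are made-up names, and the hook is overridden with a plain super call instead of alias_method_chain, which is a Rails helper.

```ruby
module SelfProbe
  def self.included(model)
    model.class_eval do
      # Created while self is `model`, i.e. the class including the module.
      who = lambda { name }
      model.define_singleton_method(:scope_owner) { who.call }
    end
  end
end

class HookedBase
  # Ruby calls this hook every time HookedBase gets subclassed.
  def self.inherited(model)
    super
    model.send(:include, SelfProbe)
  end
end

class User < HookedBase; end
class Post < HookedBase; end

User.scope_owner # => "User"
Post.scope_owner # => "Post"
```

Each subclass now includes the module on its own, so the shared code is bound to the concrete class and nobody has to remember the include.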

being or being_not, that is the question

It’s very common to have many boolean columns in the same table, so it’s useful to have a generic scope for them. Using where and or_where, these are all the combinations:

model.named_scope :being,        lambda { |*columns|    where(columns.map{|x| {:"#{x}" => true}}.reduce(&:merge)) }
model.named_scope :or_being,     lambda { |*columns| or_where(columns.map{|x| {:"#{x}" => true}}.reduce(&:merge)) }
model.named_scope :being_not,    lambda { |*columns|    where(columns.map{|x| {:"#{x}" => false}}.reduce(&:merge)) }
model.named_scope :or_being_not, lambda { |*columns| or_where(columns.map{|x| {:"#{x}" => false}}.reduce(&:merge)) }

And here is an example of how to use them:

Song.or_being('reggae', 'ska', 'ska_punk')
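
To see what gets passed to where and or_where, the map and reduce part can be tried in plain Ruby. The helper name boolean_conditions is made up for this sketch.

```ruby
# Builds one Hash with every given column set to the same boolean value.
def boolean_conditions(columns, value)
  columns.map { |x| { :"#{x}" => value } }.reduce(&:merge)
end

boolean_conditions(['reggae', 'ska', 'ska_punk'], true)
# => { :reggae => true, :ska => true, :ska_punk => true }
```

Note that with no columns at all the reduce returns nil, not an empty Hash.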

2013-04-17, updated 2021-09-26

How to highlight Ruby code on the web

I’m using my own Chili plugin on this website, and I made it exactly for that: highlighting my snippets on my websites. It is very good for many languages, but not for Ruby. Some years ago I started developing a new version, much more powerful than ever, but later I was sucked back into life and I left that behind…

So now that I wanted to publish some Ruby code here, I recently wandered around the internet looking for some kind of replacement, or at least something that I could use side by side with Chili, and which could highlight Ruby. I mean, highlight Ruby as real Ruby, because I could easily write any fake Ruby recipe for Chili myself… But if I have this kind of code, I want it highlighted as it should be:

{[ .snippet | 1.hljs(=ruby=) ]}

Note that such a snippet is perfectly valid Ruby code (results in a == “#}1”) but it is easy to understand why a highlighter could fail on it. And many do.

Luckily I eventually stumbled upon highlight.js (by Ivan Sagalaev) which is excellent for Ruby (and I think all other languages it supports, too). As you see above, it correctly understands how string interpolation works in Ruby. That is thanks to its context-based architecture (which Chili lacks).

Use this CSS to reset PRE tags:

{[ .reset-pre | 1.hilite(=css=) ]}

Use this HTML to set up highlight.js for coloring PRE blocks containing CODE blocks with a class “hljs”. As usual Chili will ignore blocks with languages it doesn’t have a recipe for, and highlight.js will instead consider only its own blocks.

{[ .js-config | 1.hilite(=html=) ]}