
That's Not My Job!

Posted by andy, Mon Apr 20 14:22:00 UTC 2009

Methods of Delegation with Ruby and ActiveRecord

What do you do when your data domain is well-factored but just isn't 'convenient'? For example, suppose you create a data model in which both a Person and a Building have an Address and you've pulled out a ZipCode class so that you can prompt your user to select from a list of previously known city/state pairs. Do you compromise your award-winning data model and repeat (gasp!) data? Do you provide a series of convenience methods? No! Teach your objects (and maybe yourself) to delegate.

In this talk we'll look at ways that ActiveRecord simplifies delegation and how you can take advantage of Ruby to build similar delegation in non-ActiveRecord classes.
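
As a taste of the non-ActiveRecord side, here is a minimal sketch using Ruby's standard Forwardable module. The Address and ZipCode classes below are plain-Ruby stand-ins invented for illustration, not the models from the talk.

```ruby
require 'forwardable'

# Hypothetical plain-Ruby models (no ActiveRecord involved).
ZipCode = Struct.new(:zip, :city, :state)

class Address
  extend Forwardable

  attr_reader :street

  # Forward these readers to the ZipCode held in @zip_code.
  def_delegators :@zip_code, :zip, :city, :state

  def initialize(street, zip_code)
    @street = street
    @zip_code = zip_code
  end
end

address = Address.new('123 Main St.', ZipCode.new('12345', 'Anytown', 'XX'))
puts "#{address.street}, #{address.city}, #{address.state} #{address.zip}"
```

The caller talks only to the address; where the city actually lives stays an internal detail.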

Meeting: Tuesday April 28 at High Noon

Check here for directions if you have not joined us before.

0 comments | Filed Under: Meetings | Tags:

New article: Globbing up javascript

Posted by andy, Mon Mar 16 12:01:00 UTC 2009

I've just published another article called 'Globbing up javascript' that covers the basics of Dir.glob and how it can be used to clean up all the javascript (stylesheet, etc.) includes in your layouts. If you've gotten frustrated by having to maintain that 'golden list' of javascript files or ever been frustrated when some piece of javascript logic for your Web2.0 app failed to appear because you forgot to update the list then head on over and check it out. If not, you might still find it a useful technique for your bag of tricks.

0 comments | Filed Under: | Tags:

Globbing up javascript

Posted by andy, Mon Mar 16 12:00:00 UTC 2009

The Motivation

When you first started with Ruby on Rails you were probably intrigued by the simplicity of adding AJAX to your applications by simply including a line like this:

<%= javascript_include_tag :defaults %>

It was short, concise, and invisible. All the Rails "magic" took over and you didn't need to worry much about which files were included because everything just worked.

Then you ran into jquery. Whether you did not want to be outdone by the "cool kids" or you really were motivated by the clean separation between view and code even in the browser, you bought in. Then you discovered a few nifty plugins for paging and ajax forms. And there was a nifty auto-complete that you just had to have. Before you knew it, there was more javascript_include_tag logic in your application layout than anything else.

Compounding the Issue

For me it really started with the ExtJs library. Once we bought into it at work we quickly realized that we were now developing a client-server app over http and that our javascript views needed to be just as neatly organized as our html views would have otherwise been. We developed a scheme for organizing our javascript views that looked remarkably like the Rails organization:


public
  javascripts
    my_company
      components
        ...files...
      views
        area1
          ...files...
        area2
          ...files...

The good news for us was that the code was pretty well organized. We knew where to find a file that needed some love when things went awry. The bad news was that we also needed to write a lot of includes. Too many includes to keep up with, really. There was also the not-so-minor inconvenience of working hard on a new view, refreshing the browser, and getting javascript errors (or worse, a silent failure) because the new file had not been included. Argh!

Glob to the rescue

This is where 'globbing' came in very handy. If you are not familiar with it, glob is a class method of Dir that searches your directory tree for files that match a pattern that you supply. The pattern for the file names is similar to but not quite the same as regex. As the documentation explains, it's closer to the shell's glob.

For this problem, though, that's all we need. The pattern matching matches literals as you might expect and it includes a pattern for recursing subdirectories ('**'). That fits perfectly well with our problem since what we want to do is find all the javascript files in subfolders of /public/javascripts.
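
To see the '**' recursion in isolation, here is a small sketch that builds a throwaway directory tree and globs it. The file names are made up for the example; only Dir.glob and the standard tmpdir/fileutils libraries are used.

```ruby
require 'tmpdir'
require 'fileutils'

glob_demo_matches = Dir.mktmpdir do |root|
  # Build a temporary tree resembling the layout described above.
  %w[
    my_company/components/grid.js
    my_company/views/area1/list.js
    my_company/views/area2/form.js
    my_company/views/area2/notes.txt
  ].each do |relative_path|
    full_path = File.join(root, relative_path)
    FileUtils.mkdir_p(File.dirname(full_path))
    FileUtils.touch(full_path)
  end

  # '**' recurses through subdirectories; '*.js' keeps only javascript files.
  Dir.glob(File.join(root, 'my_company', '**', '*.js'))
     .map { |path| path.sub("#{root}/", '') }
     .sort
end

puts glob_demo_matches
```

The .txt file is skipped and the three .js files come back regardless of how deeply they are nested.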

Of course, there's a catch. glob begins its work in the current directory and it returns file names relative to that directory. The javascript_include_tag, however, expects file references to be relative to /public/javascripts. That leaves you with two options: change working directories or alter the file paths. We chose the former path. The results looked like this:

<%= javascript_include_tag(
  *Dir.chdir(File.join(Rails.root, 'public', 'javascripts')) {
    Dir.glob("my_company/**/*.js").sort
  }
) %>

Okay, so what's going on there??? Simple. First, we change directory (Dir.chdir) to start our directory globbing relative to the javascripts folder. By using the block-form of chdir we allow Ruby to reset the working directory for us when the block exits. Within the block we simply invoke Dir.glob asking for all the javascript files ("*.js") in subdirectories of the "my_company" directory. The great thing here is that glob really does recurse the subdirectories so we can add as many organizational layers as is necessary.
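
The block form of chdir is easy to try outside of Rails. This is a rough sketch against a temporary directory (the paths are invented for the example): the block's value is returned, the paths come back relative, and the working directory is restored afterwards.

```ruby
require 'tmpdir'
require 'fileutils'

chdir_demo = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'my_company', 'views'))
  FileUtils.touch(File.join(root, 'my_company', 'views', 'index.js'))

  before = Dir.pwd
  # Dir.chdir with a block returns whatever the block returns.
  relative = Dir.chdir(root) { Dir.glob('my_company/**/*.js').sort }

  { :relative => relative, :restored => (Dir.pwd == before) }
end

puts chdir_demo.inspect
```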

Don't miss the splat ('*') and parentheses around all that. It's pretty important. Dir.glob is going to return an array of strings describing file paths. javascript_include_tag, however, is still looking for a list of arguments. The splat comes in handy here, expanding the array.
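
If the splat is new to you, this small sketch shows the difference it makes. include_tags is a hypothetical stand-in for a helper that accepts a variable number of sources; it is not the Rails helper itself.

```ruby
# Hypothetical helper that takes any number of source arguments.
def include_tags(*sources)
  sources.map { |source| %(<script src="/javascripts/#{source}"></script>) }
end

files = ['my_company/a.js', 'my_company/b.js']

splatted   = include_tags(*files) # the array is expanded into two arguments
unsplatted = include_tags(files)  # the whole array arrives as one argument

puts splatted.size
puts unsplatted.size
```

With the splat you get one tag per file; without it the helper sees a single (array) argument.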

The Net Result

The net result of all that is a single javascript_include_tag statement that includes all of the nested javascript files we've created. Even better, since they are included together in a single call, when we turn on javascript caching they will be combined into one js bundle before being delivered to the client. Even better than that, we can drop in new files, reorganize existing files, and drop out that crummy javascript file we always hated without ever having to re-build the list of includes.

0 comments | Filed Under: Rails Ruby | Tags:

Problem with named_scope and :include?

Posted by andy, Thu Jun 26 12:30:00 UTC 2008

Recently I've been using the great Rails 2.1 addition called named_scope to help simplify the process of building queries with meaningful names. Certainly you could do something similar by creating a custom method, but as I wrote earlier, the advantage of named_scope is that the scopes are composable: you can chain them together to build ever more powerful queries.

That's why I've been really frustrated trying to get named scopes to work with the :include option. In one scenario I have been trying to sort a collection by a field that exists in an included table. If I were doing freight-forwarding the models involved would look something like this.

class Customer < ActiveRecord::Base
  has_many :product_offerings
  has_many :products, :through=>:product_offerings
end

class Product < ActiveRecord::Base
  has_many :product_offerings
end

class ProductOffering < ActiveRecord::Base
  belongs_to :customer
  belongs_to :product

  named_scope :by_name, :include=>:product, :order=>:name
  named_scope :for_customers, lambda{|customer_ids| {:conditions=>{:customer_id=>customer_ids}}}
end

The objective I have in mind is to build a named_scope that allows me to list all the products for a particular customer alphabetically by the name of the product so each customer can check their inventory. In a day and age when holding companies are involved, an individual user may be authorized for several companies and thus want to be able to get a unified inventory list for all the companies for whom he works. That's where the second named_scope comes into play. I'd like to be able to do this:

# all the product offerings listed by product name
ProductOffering.by_name

# all the product offerings for a set of customers
ProductOffering.for_customers([1, 2, 3])

# all the product offerings by name for a user's companies
ProductOffering.by_name.for_customers(@current_user.customer_ids)

But it doesn't work. For some reason the :include option seems to get dropped and I end up getting SQL errors reporting an unknown column name. I've found one stray comment that suggests that you may be able to fix the issue by adding the conditions necessary to make the SQL join work. That is we could expand the named_scope to something like

named_scope :by_name, :include=>:product, :order=>:name, :conditions=>["products.id = product_offerings.product_id"]

That's ugly. Really ugly. It also pushes perilously close to letting ProductOffering peek into Product too much. If you begin to go that route be careful because you'll be on the slippery slope of a brittle solution that won't survive refactoring.

The simple solution is to bypass :include in favor of :joins. I don't understand why :joins works more reliably but I'm sure the answer is down there in the source code if you want to dig around. I suspect that the answer lies in the way that Rails 2.1 breaks joins into two distinct fetches now in order to reduce the cost of spinning up redundant ActiveRecord instances (see the Relationship Optimised Eager Loading discussion). Whatever the case, the workable, concise solution looks like this:

named_scope :by_name, :joins=>:product, :order=>:name

If you already need to include conditions on the join table you can probably continue along as you normally would. In this situation the only requirement on the join table is that the IDs match so I prefer the concise solution.

1 comment | Filed Under: | Tags:

New in Rails 2.1: Timestamped migrations

Posted by andy, Mon Jun 02 22:00:00 UTC 2008

What was wrong with migrations?

If you've been part of any development team that was larger than the "Army of One" you've probably run into an issue with migrations. It's happened to me a few times: one member of the team goes head-down on some problem and it takes longer than expected. Not wanting to check in 'broken' code, the patch builds up for a while... and so do the migrations to fix db issues. Finally ready, the change gets checked in and ... poof... What worked no longer works. Why?

While Mr. Fix-It was head down the trunk was updated with other migrations. But these migrations had overlapping numbers so when they merged into the code base it was unpredictable which ones would be run on any given system. To be clear, the migrations will be run in a very definite order. They're run in alphanumeric order, but only one migration of a specific 'version' will be executed. As a result, which migrations are run on your system depends on how many you'd already checked out and run and the alphabetical naming of each script. Now it's up to you and your team to rename all the migrations, backing them out one by one and adding them back to make sure all the database changes are applied appropriately. Yikes!

Enter Sandman

Disclaimer: I've only heard 'Sandman' when certain closers enter baseball games but I thought someone would appreciate the reference

The new timestamped migrations may put all these issues to rest. Instead of prefixing the migrations with 001, 002, 003, etc. the prefix will now be a timestamp. So, the result of running a 'script/generate scaffold MyObject attribute1:string attribute2:integer' will be a file with a name like 20080601214508_create_my_object. The likelihood that you and a teammate create a migration at the exact same time is pretty small so the 'level collisions' are almost surely a thing of the past.
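
For a feel of how such a prefix comes about, here is a rough sketch. It assumes the prefix is simply the UTC time formatted as YYYYMMDDHHMMSS; migration_file_name is a hypothetical helper, not part of Rails.

```ruby
# Hypothetical helper: build a timestamp-prefixed migration file name.
# Assumption: the prefix is the UTC time formatted as YYYYMMDDHHMMSS.
def migration_file_name(name, time = Time.now.utc)
  "#{time.strftime('%Y%m%d%H%M%S')}_#{name}.rb"
end

example_file_name = migration_file_name('create_my_object', Time.utc(2008, 6, 1, 21, 45, 8))
puts example_file_name
```

Two teammates would have to generate a migration within the same second to collide.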

Tracking revisions, not the version

Even better, though, is that the database will now track revisions, not only the latest version: the single-value schema_info table gives way to a schema_migrations table. That is, every migration that is run via rake db:migrate will be recorded in the database. As a result, whenever Mr. Fix-It decides to enlighten the rest of the team with his update, a rake db:migrate will be able to identify the individual migrations that have not been run whether they were on Mr. Fix-It's machine (when he finally updates from trunk/master) or on a teammate's machine (when the patch is loaded).
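
The difference between remembering one version and remembering every version can be sketched in a few lines of plain Ruby. The version strings are invented, and this only models the bookkeeping, not how Rails actually implements it.

```ruby
require 'set'

# Every migration file now present after merging Mr. Fix-It's patch.
available = %w[20080520101500 20080601214508 20080603090000]

# Versions this machine has already run; the middle one arrived late.
recorded = Set.new(%w[20080520101500 20080603090000])

# Old bookkeeping: only the highest version is remembered,
# so anything with a lower number is silently skipped.
latest = recorded.max
pending_by_latest = available.select { |version| version > latest }

# New bookkeeping: every version that ran is remembered,
# so the late arrival is still detected as pending.
pending_by_set = available.reject { |version| recorded.include?(version) }.sort

puts pending_by_latest.inspect
puts pending_by_set.inspect
```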

Even better, there are new rake db:migrate:up and rake db:migrate:down commands. These commands accept an individual migration 'version' (the time stamp) and either run it (up) or back it out (down). Remember that table that you created and decided you'd overengineered? Now it's a lot easier to back that one out.

Could it get even better?

Yes, it could get better. Those of you who've read The Rails Way (Obie Fernandez, Addison-Wesley Professional Series, 2007) may have come across a recommendation to accumulate migration changes until they are pushed to production. That is, rather than create three migrations for a table and some additional fields while the table is in development, there is a recommendation to have one create script that gets updated until it's pushed to production. I've tried this on a couple of apps and really liked the approach because it cuts down on the 'noise' in the migration collections. I'm willing to accept the argument that migrations are not really necessary for tracking database changes until they change something beyond development.

What could be even better than the current implementation of the timestamped migration would be if it could detect these changes in the migration files. It should be possible to check the creation and update times of the files to see if they've been updated and then validate the updated time in the migrations db. This particular idea has some drawbacks, particularly if a production migration were ever touched accidentally.

0 comments | Filed Under: Rails | Tags:

New in Rails 2.1: named_scope

Posted by andy, Sun Jun 01 22:30:00 UTC 2008

Rails 2.1 Released

If you haven't heard, the release of Rails 2.1 was announced during core member Jeremy Kemper's keynote Saturday morning (but it didn't actually get released until around 2am the next morning).

named_scope

One of my favorite additions to the framework is the absorption of the has_finder plugin into the framework. If you've used has_finder in the past the only thing you'll need to do in order to 'go native' with it in your application is replace the 'has_finder' invocations with 'named_scope'. If you are not familiar with has_finder, it gives you the ability to declare custom finders in a concise, semantically meaningful way. For example:

class Task < ActiveRecord::Base
  named_scope :incomplete, :conditions=>{:completed_at=>nil}
end

In the Task class above, we've used named_scope to add a class-level finder called 'incomplete' that will perform the equivalent of this:

Task.find(:all, :conditions=>{:completed_at=>nil})

Great, so now I can write Task.incomplete and get a list of the incomplete tasks. But so what? I could have written a class-level method myself. Is this anything more than syntactic sugar? Yes!

named_scopes can combine

The real beauty of named scopes is that they chain together. Well crafted named_scopes are fine-grained pieces of find parameters that have clear purposes and meaningful names. Consider these:

class Task < ActiveRecord::Base
  named_scope :incomplete, :conditions=>{:completed_at=>nil}
  named_scope :past_due, lambda{ {:conditions=>['due_on < ?', Date.today]}}
  named_scope :due_today, lambda{ {:conditions=>['due_on = ?', Date.today]}}
end

Task.incomplete.past_due is the same as Task.find(:all, :conditions=>['completed_at IS NULL AND due_on < ?', Date.today])

Sweet. The chaining of the named_scopes means that you can create really nice 'sentences' in your Rails code that are clear and easy to read.
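
If you are curious how chaining can work at all, here is a toy model in plain Ruby. ToyScope is invented for illustration and is not how ActiveRecord implements named_scope; it only shows the idea that each link returns a new object carrying the accumulated conditions.

```ruby
# A toy stand-in for a chainable scope (not ActiveRecord's implementation).
class ToyScope
  attr_reader :conditions

  def initialize(conditions = [])
    @conditions = conditions
  end

  # Each "scope" returns a new ToyScope with one more condition.
  def incomplete
    ToyScope.new(conditions + ['completed_at IS NULL'])
  end

  def past_due(today = '2008-06-01')
    ToyScope.new(conditions + ["due_on < '#{today}'"])
  end

  # Only at the end are the conditions combined into one query.
  def to_sql
    where = conditions.empty? ? '' : " WHERE #{conditions.join(' AND ')}"
    "SELECT * FROM tasks#{where}"
  end
end

toy_sql = ToyScope.new.incomplete.past_due.to_sql
puts toy_sql
```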

named_scopes play nicely with association proxies

Even better, the named_scopes work through association proxies as well. Without getting into the details too much, assume that your User class has_many Tasks. Now, your boss wants you to help him through a common scenario. "Ryan, how can I figure out how many tasks that slacker Chris has let slide?" Easy.

@chris = User.find_by_name('chris')
@chris.tasks.incomplete.past_due

In the first place I used this, the code became a lot easier to read. I added a named_scope that at first did not make a lot of sense: it added a model-related scope to the find. In this case it helped the code because I was normally accessing the data through another 'owner' and this named_scope helped me chain in finer focus.

# @grades ||= @student.unreported_grades.find(:all, :conditions=>{:subject_id=>params[:subject_id]})
@grades ||= @student.grades.unreported.for_subject(params[:subject_id])

The commented code is the original version that used a (now deprecated) class-level method. Passing the find conditions there worked but it was a little ugly. The named_scope version is both clearer and easier to maintain.

0 comments | Filed Under: Rails | Tags:

5 Tips for ActiveResource

Posted by andy, Thu Apr 24 16:30:00 UTC 2008

The first couple of tips have an indirect impact on ActiveResource. Still, they are worth keeping in mind because they simplify the data with which ActiveResource deals.

Tip 1: Use delegate and :methods for encapsulation

If your crash course in Ruby involved reading the Agile book, then the delegate method may be new to you. Delegate is a class-level command that allows you to pass certain method calls on to an associated model. For example, if you have a highly-factored address book you might have a pair of models like this:

class Address < ActiveRecord::Base
  belongs_to :zip_code
end

class ZipCode < ActiveRecord::Base
  has_many :addresses
end

That's a model with some theoretical purity... but in practice it's cumbersome. You really want to deal with an address that has all the information you'd like to render (street, city, state, zip) in one model. At least it should feel that way. That's precisely where the delegate command comes into play.

class Address < ActiveRecord::Base
  belongs_to :zip_code
  delegate :city, :state, :zip, :to=>:zip_code
  delegate 'city=', 'state=', 'zip=', :to=>:zip_code
end

Modeled as shown above you can ask an address for its city and the address will pass the request on to the zip_code object to which it belongs, retrieve the answer, and return it to you. (It's taking advantage of the fact that Rails is doing some method_missing magic to provide getters and setters for your attributes). That level of encapsulation will become increasingly important when you begin to use ActiveResource heavily. In many cases you may want to return only a few fields from an associated model and, as in the example above, you do not want or need to reveal how you've organized your data to the outside world.

The final piece to the puzzle with respect to ActiveResource will be making sure you use the :methods option when you serialize the delegating object to xml.

addresses_controller.rb

...
def show
  @address = Address.find(params[:id], :include=>:zip_code)
  respond_to do |format|
    format.html # show.html.erb
    format.xml  { render :xml => @address.to_xml(:methods=>[:city, :state, :zip]) }
  end
end
...

As shown, the call to @address.to_xml tries to include the results of calling the city, state, and zip getter methods on address. The delegate command causes the Address object to pass that request on to the associated ZipCode object and the results are returned and placed into the xml envelope as if they were attributes of the address (which they are, indirectly). The application that's consuming all this through ActiveResource remains blissfully unaware of your modeling nirvana. It simply receives some nicely formatted xml along the lines of this:

<home-address>
  <id type="integer">1</id>
  <street>123 Main St.</street>
  <city>Anytown</city>
  <state>XX</state>
  <zip>12345</zip>
</home-address>

Tip 2: Clean up the delegation you just learned to keep the code clean and clear

If you start maximizing your use of delegate your code can get untidy especially since delegate introduces some duplication when you're dealing with attribute accessors. If we keep in mind that class declarations are still Ruby scripts then we can clean the attribute accessor delegation pretty easily while making the intent very clear.

class Address < ActiveRecord::Base
  belongs_to :zip_code
  [:city, :state, :zip].each do |delegated_accessor|
    delegate "#{delegated_accessor}", "#{delegated_accessor}=", :to=>:zip_code
  end
end

On to some tips with more direct bearing on ActiveResource itself.

Tip 3: Use AppConfig to get your site information out of the class file!

The Core did a great job modeling ActiveResource along the lines of ActiveRecord so that using ActiveResource feels very natural to any Rails programmer. But it's also left me stumped as to why there is no equivalent to /config/database.yml. I suppose that in some cases you will be using a well-known, established, public REST interface but I'm finding ActiveResource to be a very natural way to develop 'sub-applications' that can be shared to create a larger application. Because of that I need to be able to have different site information for development, test, and production. Clearly some configuration is needed.

Even though I shudder at the thoughts that a name like 'AppConfig' brings to mind, it's a great part of the solution to this problem. If you're not familiar with it, AppConfig allows you to provide yaml config files for global (/config/app_config.yml) and environment-specific (e.g., /config/environments/development.yml) configuration. The plugin reads these config files, merges information as necessary, and provides all the options as class-level attributes of the AppConfig class.

sites:
  addressbook: http://localhost:3001
  financials: 
    url: http://localhost:3002
    username: money
    password: talks

The yaml above shows two different types of configuration that would be useful for ActiveResource, organized together under a 'sites' attribute. The first one (addressbook) is the way I started before I ran into an application that needed http basic authentication. The site info consists only of the url. The second one (financials) came out of the latter need. A quick extension of ActiveResource causes these to spring into action.

class ActiveResource::Base
  protected
  def self.establish_site_connection(site_id)
    raise(ArgumentError, "#{site_id} is not defined for #{RAILS_ENV}") unless AppConfig.sites.respond_to?(site_id)
    site_info = AppConfig.sites.send(site_id)
    return site_info.respond_to?(:url) ? site_with_basic_auth_info(site_info) : site_info
  end
  
  def self.site_with_basic_auth_info(site_info)
    site = URI.parse(site_info.url)
    site.userinfo = "#{site_info.username}:#{site_info.password}"
    return site.to_s
  end
end

I've been dropping the code above into /lib/core_ext/active_resource_extension.rb. The first method (establish_site_connection) is meant to emulate ActiveRecord::Base.establish_connection. It accepts a site id in the form of a symbol or string and retrieves the site configuration matching that id. If that site info is already a simple string, that string is returned unmodified. If the site_info is further broken down into the url, user name and password for http basic authentication then that is handed off to the site_with_basic_auth_info method to build up a simple string.
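
The url-building step can be tried in isolation with nothing but the standard uri library. In this sketch SiteInfo is a hypothetical Struct standing in for whatever object AppConfig hands back, and site_for is an invented name that mirrors the two branches described above.

```ruby
require 'uri'

# Hypothetical stand-in for the config object returned for `financials`.
SiteInfo = Struct.new(:url, :username, :password)

# Plain strings pass through untouched; structured entries get the
# basic-auth credentials folded into the url.
def site_for(site_info)
  return site_info unless site_info.respond_to?(:url)

  site = URI.parse(site_info.url)
  site.userinfo = "#{site_info.username}:#{site_info.password}"
  site.to_s
end

plain_site = site_for('http://localhost:3001')
auth_site  = site_for(SiteInfo.new('http://localhost:3002', 'money', 'talks'))

puts plain_site
puts auth_site
```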

It's true that the http basic authentication credentials could be written into the url. In fact, that's exactly what the site_with_basic_auth_info does. If that's the case, then why add the username and password to the config file?

Tip 4: Share your site AppConfig settings between your applications

When you have the fortunate advantage of controlling both your ActiveResource-based application and your ActiveRecord-based application you can share the configuration information between the applications. Specifically, you can share the username and password information used for http basic authentication so that both sides can be externally configured... and reconfigured. By sharing the configuration files and including the use of AppConfig in the source application for the ActiveResource your http basic authentication will be as simple as

def basically_authenticated(user, password)
  user==AppConfig.sites.financials.username && password==AppConfig.sites.financials.password
end

What makes this even more compelling is that AppConfig (as anything leaning on yaml) allows you to use ERb in your configuration files. Why is that significant?

Tip 5: Use Embedded Ruby in your configuration files to automatically change your user/password

Clearly with http basic authentication you will want to go the extra step of passing through a secure connection, but if you're too tired to add an 's' to your http, then you'll want to change your clear-text password. Often. Embedding Ruby might be just the trick because you could share a single algorithm between your applications that would change the password for you.

sites:
  addressbook: http://localhost:3001
  financials: 
    url: http://localhost:3002
    username: <%= %w{money cash penny moulah dineiros pennywise poundfoolish}[Date.today.wday] %>
    password: <%= Digest::SHA1.hexdigest("#{Date.today.to_s}---financials") %>

There is a potential pitfall here. With this type of approach -- shifting the user/password each day -- the application servers will have to be kept in step. A reboot on one machine will require a reboot or restart on the other to make sure the applications share the same username/password since the AppConfig object will be re-loaded when the webserver starts. Pick the scheme that works best for you.
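
To see what the ERb pass actually produces, here is a minimal sketch using only the standard erb and yaml libraries. It does not use the AppConfig plugin; it simply assumes the file is run through ERb before being parsed as yaml.

```ruby
require 'erb'
require 'yaml'
require 'date'
require 'digest/sha1'

# A trimmed-down copy of the config above, kept in a string for the demo.
config_template = <<-'YAML_TEMPLATE'
sites:
  financials:
    url: http://localhost:3002
    password: <%= Digest::SHA1.hexdigest("#{Date.today.to_s}---financials") %>
YAML_TEMPLATE

# First evaluate the embedded Ruby, then parse the result as yaml.
rendered = ERB.new(config_template).result
erb_config = YAML.load(rendered)
erb_password = erb_config['sites']['financials']['password']

puts erb_password
```

Any application that renders the same template on the same day ends up with the same password.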

0 comments | Filed Under: Rails | Tags:

Why assign site in ActiveResource?

Posted by andy, Tue Apr 22 16:30:00 UTC 2008

ActiveResource is a great tool for helping your business keep not only its business logic DRY, but even keep its business applications DRY. If you're not familiar with ActiveResource, think of ActiveRecord using an internet-based datastore. It's a bit more complicated than that but you can do all the basic CRUD methods, custom methods, etc.

The advantage that ActiveResource brings, though, is that you only need to create the object once. Ever. Used effectively, you don't need to create an object in one project that you import or somehow reuse in another. You create a small, targeted application and share the application with other applications. For example, you could create an accounting engine that deals with ledgers and accounts and journals and expose the RESTful HTTP interface to higher level apps that simply consume the Journals and Ledgers and Accounts using ActiveResource. Within a single company it might be the ultimate in DRY.

For Rails developers, ActiveResource is very clearly modeled on ActiveRecord. If you've gotten used to one set of methods you should almost seamlessly be used to the other. With one painful exception: setting the site in the class. I honestly cannot understand why there is no configuration yaml equivalent to database.yml for ActiveResource. Maybe it was unnecessary since the creators already had some RESTful applications with which to work. Whatever the case, it's a real pain in the neck.

In an attempt to keep the ActiveRecord-like API going, I've come up with the following code that I've been dropping in /lib/core_ext/active_resource_extension.rb

require 'yaml'
class ActiveResource::Base
  protected
  def self.establish_site_connection(site_id)
    # load_file opens and closes the file for us
    site_yaml = File.join(RAILS_ROOT, 'config', 'sites.yml')
    environment_configurations = YAML.load_file(site_yaml)
    site_configurations = environment_configurations[RAILS_ENV]
    return site_configurations[site_id.to_s]
  end
end

The code is supposed to emulate ActiveRecord::Base.establish_connection. As implemented above it will add an establish_site_connection method to your ActiveResource class that will read a sites.yml file in your /config folder. sites.yml is structured similarly to database.yml -- you have entries for each environment (development, test, production, etc.) along with site names and urls for each site.

development:
  activity_center: http://localhost:3002/
  church_member: http://localhost:3001/

test:
  activity_center: http://testy:3002/
  church_member: http://testy:3001/

With such a configuration file, of course, you have a few luxuries. First, you can use different sites while running in different environments. This might make it easier, for example, to create mocks for testing ActiveResource objects. Second, you can more quickly adapt to external changes (e.g., remote resource down or relocated) since it's just a yaml change and not a source code change.
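
The per-environment lookup is simple enough to try without Rails. In this sketch the yaml lives in a string and the environment name is passed in explicitly; site_for_environment is a hypothetical helper, not part of the extension above.

```ruby
require 'yaml'

sites_yaml = <<-'SITES'
development:
  activity_center: http://localhost:3002/
test:
  activity_center: http://testy:3002/
SITES

# Look up a site url for one environment; fetch raises a KeyError
# if the environment or the site is missing from the file.
def site_for_environment(yaml_text, environment, site_id)
  YAML.load(yaml_text).fetch(environment).fetch(site_id.to_s)
end

development_site = site_for_environment(sites_yaml, 'development', :activity_center)
test_site        = site_for_environment(sites_yaml, 'test', :activity_center)

puts development_site
puts test_site
```

Using fetch rather than [] means a typo in a site name fails loudly instead of quietly setting the site to nil.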

I've typically gone one step further with the ActiveResource hack. As alluded to above, I have sites split into separate sub-applications each responsible for part of the end solution. As a result I have a whole family of ActiveResources that use one source application. For this reason I have emulated the multiple database solution for Rails with the following for ActiveResource.

require File.join(RAILS_ROOT, 'lib', 'core_ext', 'active_resource_extension')
class ActivityCenterResource < ActiveResource::Base
  # see /lib/core_ext/active_resource_extension.rb
  self.site = self.establish_site_connection(:activity_center)
end

class ActivityCenter < ActivityCenterResource
  ...
end

0 comments | Filed Under: Rails | Tags:

STI Factory

Posted by andy, Mon Mar 17 17:30:00 UTC 2008

Single Table Inheritance

One of the abstractions that I really like in Rails is its implementation of Single Table Inheritance (STI). If you're not familiar with STI, it is a simple design pattern in which you model an inheritance hierarchy in a single database table (Martin Fowler does it more justice here). Since ActiveRecord, Rails' primary domain modeling base class, is also db-centric the marriage of the two is fairly straightforward: include a column called 'type' in your database table and you're done. Simple.

But type is a real headache

In practice it turns out that it's not always so simple. In a number of applications that I've worked on we like to put the user in the driver's seat by allowing them to select the subtype they are going to create. For example, assume that we start with a domain modeling different types of vehicles. Without getting into all the attributes that might distinguish the vehicles, the class model might look something like this:

class Vehicle < ActiveRecord::Base
  def self.inheritance_column
    'vehicle_type' # we'll see why in a bit...
  end
end

class Car < Vehicle
end

class Truck < Vehicle
end

What I'd really like to do is give the user a select and let them pick either 'Car' or 'Truck'. That in itself should not be too difficult. There is one little gotcha: type is not a safe name in Ruby, because every object already responds to a (deprecated) type method. If you try to use a select or select_tag helper Rails (Ruby) will complain with an error that will probably leave you scratching your head for a while. The simple way to avoid this problem is shown above. You override the class-level inheritance column method and return the name of the column that you will use to discriminate among classes in the inheritance hierarchy. In this case we're using the column 'vehicle_type' to hold the name of the subclass.

Things get trickier when you get back to the controller. It turns out that Rails musters up some righteous indignation about any attempt to change the class. To see what I mean, let's simulate what you might see back in the VehiclesController if you let the user request a Porsche Cayenne...

params = HashWithIndifferentAccess.new(:vehicle=>HashWithIndifferentAccess.new(:vehicle_type=>'Car', :make=>'Porsche', :model=>'Cayenne'))
=> {"vehicle"=>{"vehicle_type"=>"Car", "make"=>"Porsche", "model"=>"Cayenne"}}
vehicle = Vehicle.new params[:vehicle]
=> #<Vehicle id: nil, name: nil, vehicle_type: nil, created_at: nil, updated_at: nil, make: "Porsche", model: "Cayenne">

Already we can see there is a problem. The user sent back a request to build a Car, but what is being assembled is a generic Vehicle. The reasoning is pretty straightforward: you asked for a new Vehicle, not a new Car, so you got a new Vehicle. Perhaps too graciously, Rails assumed that you knew what you were asking for. Unfortunately, you don't -- the user knows what is being created but you are clueless. You could try to build a big case statement, but that's very messy and you have to update it each time you add or remove a class from the hierarchy. It also suffers from the fact that it's too concrete; you can't transport your knowledge to any other STI implementation.

An STI Factory Method

This sounds like a classic case for the Factory Method design pattern. One of my favorite design pattern books (Head First Design Patterns) says that this pattern "defines an interface for creating an object, but lets subclasses decide which class to instantiate." If we translate that to Rubyisms and consider our problem, it sounds like we need a module (our interface) that mixes into a class and helps that class pick from among its subclasses when it's asked for something new.
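
Stripped of ActiveRecord, the idea might look something like the sketch below. StiFactorySketch, known_subclasses, and build are hypothetical names, and the subclass list is tracked by hand with Ruby's inherited hook, which is an assumption made for the sake of a runnable example.

```ruby
# Plain-Ruby sketch of a factory mixin. All names here are hypothetical.
module StiFactorySketch
  # Ruby invokes this hook on the parent each time a subclass is defined.
  def inherited(subclass)
    super
    known_subclasses << subclass
  end

  # Direct subclasses only: a grandchild registers with its own parent.
  def known_subclasses
    @known_subclasses ||= []
  end

  # Builds the named subclass, or the receiver when the name is unknown.
  def build(class_name, *args)
    target = known_subclasses.detect { |subclass| subclass.name == class_name } || self
    target.new(*args)
  end
end

class Vehicle
  extend StiFactorySketch

  attr_reader :attributes

  def initialize(attributes = {})
    @attributes = attributes
  end
end

class Car < Vehicle; end
class Truck < Vehicle; end

puts Vehicle.build('Car', :model => 'Cayenne').class   # Car
puts Vehicle.build('Boat').class                        # Vehicle
```

The class-level array is the fragile part of this version: it only knows about direct subclasses, and it only knows about classes that have already been loaded.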

I've taken a stab at this a few times and never liked the results. Most of the time it felt like I was injecting too much code. I also got somewhat inconsistent results from the class-level array I was trying to build to maintain the list of subclasses. Recently I was working on a different problem and stumbled on some information that I'd forgotten from my first dance with Rails. I know that David Black told me that ActiveRecord maintained a protected list of subclasses just for STI, but it was washed away in the grey matter (probably because I read it at the beach... I'm a geek.)

Having been reintroduced to ARec#subclasses, I've worked out an abstract STI factory. I built it as a plugin, and the essence of it is the code that gets mixed into the ActiveRecord base_class of the inheritance hierarchy.

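# Intercepts new (and therefore create) so that a request carrying a
# subclass name is rerouted to that subclass.
def new(*args)
  target_class_name = requested_class_name(args)
  return base_class.factory(target_class_name, *args) unless name == target_class_name
  super
end

# Instantiates the named subclass, falling back to the base class when
# the name is not part of this STI tree.
def factory(requested_class_name, *args)
  # base_class is a Class, so take its name before calling constantize
  requested_class_name = base_class.name unless has_subclass_named?(requested_class_name)
  requested_class = requested_class_name.constantize
  requested_class.new(*args)
end

# Returns [label, class name] pairs for select/select_tag
def type_options_for_select
  subclasses.collect { |subclass| [subclass.name.humanize, subclass.name] }
end

protected

# Returns true if the STI tree includes a subclass with the specified name
def has_subclass_named?(subclass_name)
  subclasses.any? { |subclass| subclass.name == subclass_name }
end

# Removes the inheritance column from the attributes hash (if present)
# and returns the class name to build, defaulting to the receiver's own.
def requested_class_name(args)
  class_name = name
  if args.last.is_a?(Hash)
    requested_class = args.last.delete(inheritance_column.to_sym)
    class_name = requested_class unless requested_class.blank? || !has_subclass_named?(requested_class)
  end
  class_name
end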

We'll read the code from the bottom up, mostly so that the helper methods make sense when we see them in context.

  • requested_class_name
    This helper method attempts to determine the name of the subclass that is being requested. Like Rails, it begins with the assumption that you knew what you were asking for (class_name = self.name) and then it checks the parameters it was passed to see if a value for the inheritance column was included. If so, it tries to return the value that was requested. There are two exceptions: if the class name is blank, it assumes that you wanted the class on which you called new. If you supplied a value but that value is not a subclass, it assumes it was a typo (kinder than assuming you were a fool :-) In both cases it falls back to the name of the original class; otherwise it overrides it with the subclass name. An important thing to note is that the subclass name is deleted from the options passed to the method. This is done to prevent an infinite loop, but it means that you've got to keep a copy of the returned value.
  • has_subclass_named?
    This helper method checks the list of subclasses for the (base) class and makes sure that the requested class actually exists as a subclass.
  • type_options_for_select
    This is a convenience method for the select/select_tag. It builds an array that can be used as the options source with a human readable name for the class as the text and the class name as the value.
  • factory
    This method requires the name of the subclass. It takes advantage of the fact that Ruby classes are global constants (hence the call to constantize) to get a handle on the requested class and then invokes 'new' on it, passing in the parameters it received.
  • new
    This is the part that I like most about the plugin. The first thing that it does is check to see if the class you requested differs from the class that is trying to fill the request. If so, it automatically delegates to the factory method; if not, it proceeds with the default 'new' behavior. The advantage to this is that you never have to know what you're trying to create and you don't have to remember to use the factory method. You can just use new (or create) on this class like every other class... and it will figure out what you meant to do.
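
The note about the deleted key is easy to trip over, so here is a small plain-Ruby sketch of the side effect. pull_requested_class_name is a hypothetical helper that only mimics the delete step; it is not part of the plugin.

```ruby
# Hypothetical helper that mimics the delete step described above.
def pull_requested_class_name(attributes, inheritance_column = :vehicle_type)
  attributes.delete(inheritance_column)
end

params = { :vehicle_type => 'Car', :model => 'Cayenne' }

# Work on a copy if the caller's hash must stay intact.
attributes = params.dup
requested  = pull_requested_class_name(attributes)

puts requested                          # Car
puts attributes.key?(:vehicle_type)     # false
puts params.key?(:vehicle_type)         # true
```

If you skip the dup, the hash you passed in comes back without its type entry, so hold on to the returned name (or a copy of the hash) if you need it later.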

With that in place things look a bit different for the VehiclesController.

class Vehicle < ActiveRecord::Base
  has_sti_factory
  
  def self.inheritance_column
    'vehicle_type'
  end
end
...

params = HashWithIndifferentAccess.new(:vehicle=>HashWithIndifferentAccess.new(:vehicle_type=>'Car', :make=>'Porsche', :model=>'Cayenne'))
=> {"vehicle"=>{"vehicle_type"=>"Car", "make"=>"Porsche", "model"=>"Cayenne"}}

vehicle = Vehicle.new params[:vehicle]
=> #<Car id: nil, name: nil, vehicle_type: "Car", created_at: nil, updated_at: nil, make: "Porsche", model: "Cayenne">

Now all I have to do is figure out how to make the VehiclesController fulfill that "create Porsche Cayenne" request. :-)

0 comments | Filed Under: Rails | Tags: