
New Collective Idea site

We launched a new Collective Idea site this week, and I am really excited about it. The design is gorgeous, I got to play with a lot of HTML5 and CSS3 stuff, and we have a lot of ideas to continue to make it better.

Screenshot of new Collective Idea site

Most of all, I am excited that we are also going to start blogging as a company. I work with a lot of smart people. This will be a great forum for us to share our ideas and wrestle with questions. So head over to the Collective Idea Blog and subscribe.

Code: collectiveidea Jun 11, 2010 ● updated Jun 11, 2010 0 comments

Capybaras eating cucumbers

(or, testing static sites with Cucumber and Capybara)

While working on an app built purely in HTML and JavaScript, we needed a good way to write integration tests. We played around with a few different approaches, including “functional” tests using one of the JavaScript unit testing libraries. But for now, we settled on using Selenium.

OMG you’re crazy!

No, we’re not. The latest version of Cucumber comes with Capybara, which makes it super simple to use Selenium. Capybara can drive any Rack app, so we made a simple Rack app that serves our static files. So here is what we ended up with in features/support/env.rb:

require 'rubygems'
require 'spec'
require 'cucumber'
require 'rack'
require 'capybara'
require 'capybara/dsl'

Capybara.app = Rack::Builder.new do
  map "/" do
    use Rack::Static, :urls => ["/"], :root => 'public'
    # Fallback for anything Rack::Static does not serve; the body must respond to #each
    run lambda { |env| [404, {}, []] }
  end
end.to_app

require 'capybara/cucumber'
require 'capybara/session'

Capybara.default_selector = :css
Capybara.default_driver = :selenium

Now just copy over the web_steps.rb that cucumber generates from another project, and proceed as normal.

Code: cucumber, selenium, testing May 11, 2010 ● updated May 11, 2010 1 comment

The quest for the Holy object serialization format

One of my favorite features of delayed_job is that you can delay execution of any method on any object. In order to get this to work, you have to be able to serialize the object into a database field, and then load it in a separate process.

For certain objects, like ActiveRecord or MongoMapper models, you don’t actually want to “serialize” this object, but instead just reload the record from the database. To accomplish this, delayed_job previously would call #to_yaml on the job, and do a nasty hack to store any objects that were ActiveRecord objects. This has always bothered me, so yesterday I set out to find a proper solution to serializing jobs.

Scene 1: enter YAML

YAML had 2 major problems in the context of how delayed_job was using it:

  1. It would serialize the attributes of the ActiveRecord class and reload them in the same state that they were in when the job was created. In most instances, I want to load the class in its current state from the database.
  2. You can’t call #to_yaml on a class, which delayed_job required to delay execution of class methods:

String.to_yaml
# TypeError: can't dump anonymous class Class

Scene 2: who needs documentation?

It turns out that YAML has an undocumented feature (and YAML was originally written by _why, so you have to be a mad genius to understand the code) where you can define how objects should be serialized and unserialized.

Here is how it works for ActiveRecord:

class ActiveRecord::Base
  yaml_as "tag:ruby.yaml.org,2002:ActiveRecord"

  def self.yaml_new(klass, tag, val)
    klass.find(val['attributes']['id'])
  end

  def to_yaml_properties
    ['@attributes']
  end
end

Problem 1: solved.
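
The hooks above belong to the YAML library that shipped with Ruby at the time. On a current Ruby, where the YAML library is Psych, I believe the closest equivalents are encode_with and init_with. Here is a minimal sketch of the same "store only the id, reload on the way back in" idea under that assumption. FakeRecord, its in-memory STORE, and load_job_yaml are hypothetical stand-ins, not part of delayed_job or ActiveRecord.

```ruby
require 'yaml'

# Hypothetical stand-in for a database-backed model.
class FakeRecord
  STORE = {} # pretend database table, keyed by id

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
    STORE[id] = { 'name' => name }
  end

  # Simulates another process updating the row.
  def self.update(id, name)
    STORE[id]['name'] = name
  end

  # Psych hook: emit only the id instead of every attribute.
  def encode_with(coder)
    coder['id'] = @id
  end

  # Psych hook: rebuild the object from the current "database" state.
  def init_with(coder)
    @id = coder['id']
    @name = STORE.fetch(@id)['name']
  end
end

# Newer Psych versions refuse arbitrary classes in YAML.load,
# so use unsafe_load where it exists (assumes trusted input).
def load_job_yaml(yaml)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
end

record = FakeRecord.new(1, 'draft')
yaml = YAML.dump(record)          # contains the id, not the name
FakeRecord.update(1, 'published') # the row changes after the job was queued
reloaded = load_job_yaml(yaml)    # sees the current state, not the stale one
```

Unlike yaml_new, init_with fills in an already allocated instance rather than returning whatever find gives you, so this is a cousin of the trick above, not a drop-in replacement.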

Scene 3: a Class act

As luck would have it, someone submitted a patch to YAML back in 2006 to add #to_yaml to Class and Module. _why was reluctant to accept the patch because “reloading these objects causes trouble if you haven’t required the right libraries”. This doesn’t worry me with delayed_job because the worker will be running in the same environment.

Here’s the monkey patch in all its glory:

class Module
  yaml_as "tag:ruby.yaml.org,2002:module"

  def Module.yaml_new( klass, tag, val )
    if String === val
      val.split(/::/).inject(Object) {|m, n| m.const_get(n)}
    else
      raise YAML::TypeError, "Invalid Module: " + val.inspect
    end
  end

  def to_yaml( opts = {} )
    YAML::quick_emit( nil, opts ) { |out|
      out.scalar( "tag:ruby.yaml.org,2002:module", self.name, :plain )
    }
  end
end

class Class
  yaml_as "tag:ruby.yaml.org,2002:class"

  def Class.yaml_new( klass, tag, val )
    if String === val
      val.split(/::/).inject(Object) {|m, n| m.const_get(n)}
    else
      raise YAML::TypeError, "Invalid Class: " + val.inspect
    end
  end

  def to_yaml( opts = {} )
    YAML::quick_emit( nil, opts ) { |out|
      out.scalar( "tag:ruby.yaml.org,2002:class", self.name, :plain )
    }
  end
end

Problem 2: solved.
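
The interesting line in that patch is the split/inject/const_get chain, which turns a stored name back into the constant. Here it is on its own in plain Ruby, no YAML involved. resolve_constant and the Billing::Invoice class are made-up names for illustration.

```ruby
# Hypothetical namespaced class to look up.
module Billing
  class Invoice
  end
end

# Walk the namespace one segment at a time, starting from Object.
def resolve_constant(name)
  name.split('::').inject(Object) { |mod, const| mod.const_get(const) }
end

klass = resolve_constant(Billing::Invoice.name)

# Anonymous classes have no name, so there is nothing to store for them.
anonymous_name = Class.new.name
```

This is also why _why's concern applies: if the worker has not loaded the file that defines the constant, const_get raises a NameError.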

Scene 4: Finale

This all seems to work wonderfully, but I’m left wondering if there’s something I’m missing. Anyone see any problems with using YAML in this way, or have I found the Holy Grail?

Code: ruby, yaml May 03, 2010 ● updated May 03, 2010 0 comments

Cucumber and Sunspot

If you haven’t checked out Sunspot yet, you should. I have tried every solution for doing real full-text searching in a Rails app. Sunspot, backed by Solr, is the only one that I haven’t had issues with in production. Hopefully I’ll get around to posting more about my experience with it.

But for today, here’s how I got Sunspot working with Cucumber.

First, add a configuration for the cucumber environment in config/sunspot.yml. I usually just use the same settings as test.

test: &TEST
  solr:
    hostname: localhost
    port: 8981
    log_level: WARNING

cucumber:
  <<: *TEST

Then, throw this bit of nastiness in the bottom of your env.rb file.

$original_sunspot_session = Sunspot.session

Before("~@search") do
  Sunspot.session = Sunspot::Rails::StubSessionProxy.new($original_sunspot_session)
end

Before("@search") do
  unless $sunspot
    $sunspot = Sunspot::Rails::Server.new
    pid = fork do
      STDERR.reopen('/dev/null')
      STDOUT.reopen('/dev/null')
      $sunspot.run
    end
    # shut down the Solr server
    at_exit { Process.kill('TERM', pid) }
    # wait for solr to start
    sleep 5
  end
  Sunspot.session = $original_sunspot_session

  MyModel.remove_all_from_index!
end

This will start up Sunspot for scenarios tagged with @search, and mock out the sunspot connection for ones that aren’t.
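
The fixed sleep 5 is the part I like least: too short on a slow machine, wasted time on a fast one. One possible alternative is to poll the port until something accepts a connection. This is only a sketch; wait_for_port is a hypothetical helper, and an open port does not guarantee that Solr has finished loading.

```ruby
require 'socket'

# Returns true as soon as host:port accepts a TCP connection,
# or false if nothing is listening before the timeout expires.
def wait_for_port(host, port, timeout_seconds = 30)
  deadline = Time.now + timeout_seconds
  begin
    TCPSocket.new(host, port).close
    true
  rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH
    return false if Time.now >= deadline
    sleep 0.1 # brief pause before trying again
    retry
  end
end
```

In the hook above, the sleep could then become something like wait_for_port('localhost', 8981), using the port from config/sunspot.yml.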

Code: Apr 07, 2010 ● updated Apr 26, 2010 1 comment

delayed_job 2.0

I’ve pushed out the delayed_job 2.0 gem from the Collective Idea fork on GitHub. See the changelog for a summary of changes, or see the full list of changes.

Multiple Backends

One of the most significant changes was adding support for multiple backends. You can now use Active Record, MongoMapper, or DataMapper as backends for your job queue. See the README for more info.

Benchmarks

The Active Record backend in delayed_job 2.0 is much faster (6x in the benchmarks I ran), primarily due to reversing the priority column and adding an index. Here are benchmarks for running 10,000 simple jobs on my laptop:

                      user     system      total        real
delayed_job 1.8.5 195.670000  14.020000  209.690000 (230.887172)
delayed_job 2.0    36.200000   0.940000  37.140000  ( 39.959233)

While we’re looking at benchmarks, here is how the current backends compare:

                     user     system      total        real
active_record      36.200000   0.940000  37.140000 ( 39.959233)
mongo_mapper       69.270000   3.220000  72.490000 ( 90.783220)
data_mapper       255.620000   2.880000 258.500000 (275.550383)

I have not done anything to optimize the mongo_mapper or data_mapper backend, so performance patches would be appreciated.
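
If you want to compare numbers on your own machine, tables in the user/system/total/real layout above are what Ruby’s Benchmark module prints. Here is a minimal sketch of that kind of measurement. SimpleJob is a hypothetical stand-in that runs in-process, so it measures nothing about a real backend; a real comparison would enqueue the jobs and time the worker draining the queue.

```ruby
require 'benchmark'

# Hypothetical stand-in for a trivial job.
class SimpleJob
  def perform
    :done
  end
end

JOB_COUNT = 10_000

# Prints a header row plus one row of user/system/total/real timings.
results = Benchmark.bm(20) do |x|
  x.report('in-process perform') do
    JOB_COUNT.times { SimpleJob.new.perform }
  end
end
```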

Upgrading

To take full advantage of the Active Record performance improvements, you’ll want to add an index:

add_index :delayed_jobs, [:priority, :run_at], :name => 'delayed_jobs_priority'

The only other issue that most people will run into is that all of the configuration options have been moved to Delayed::Worker. Here’s how to change the options now:

Delayed::Worker.destroy_failed_jobs = false   # Delayed::Job.destroy_failed_jobs = false
Delayed::Worker.max_attempts = 3              # Delayed::Job.const_set("MAX_ATTEMPTS", 3)
Delayed::Worker.max_run_time = 5.minutes      # Delayed::Job.const_set("MAX_RUN_TIME", 5.minutes)
Delayed::Worker.sleep_delay = 60              # Delayed::Worker.const_set("SLEEP", 60)

Feel free to post any comments or questions on the mailing list.

Code: delayed_job, gem, ruby Apr 03, 2010 ● updated Apr 03, 2010 3 comments

Great Lakes Ruby Bash

The Great Lakes Ruby Bash is now accepting talk proposals. The conference will be held on Michigan State University’s campus in East Lansing, Michigan on Saturday, April 17th.

We’re looking for passionate speakers to give 25 and 40 minute presentations about their experiences with Ruby and related technologies. Our goal is to engage attendees and inspire them to create great software, empower users, and continue learning with others.

Proposals are due Feb 28th, 2010 at 12:59pm EST. We hope to have all proposals reviewed and speakers chosen by March 8th, 2010. Visit the website for more info.

Sponsors

We are also looking for companies and freelancers to sponsor the event. This year’s conference will be the third Ruby conference in this region. The previous two conferences attracted 60-70 attendees each. As a result of more interest, greater community involvement, and a more aggressive marketing campaign, we are anticipating 100-150 attendees this year.

Visit the sponsorship page to become a sponsor or find additional information about available sponsorship packages.

Code: conference, local, ruby Feb 17, 2010 ● updated Feb 17, 2010 5 comments

Active Resource in practice

I’m working on an app to integrate Pivotal Tracker and Harvest. There’s a great Ruby wrapper around Harvest’s API, but there isn’t a decent Ruby wrapper for Tracker’s v3 API, so I thought I would just build one as I needed it.

If this app were read only, I would probably use HTTParty and HappyMapper, but since I also want to be able to update timers and stories, Active Resource seemed like the right tool for the job. Active Resource in theory is great. Active Resource in practice is not so great. I’ve toyed around with it in the past, but using it for something real I found it…lacking.

Fortunately the Harvest gem had solved a lot of these problems. I write about them here in hopes that they will be useful to you.

Challenge 1: Headers don’t inherit

Pivotal Tracker uses a token for authentication and looks for it in a header called “X-TrackerToken”. It would be nice if you could just set this once, and all Active Resource classes would use it. But unfortunately, headers don’t inherit.

So the trick is to define a base class for all of your models to inherit from, and in that override how Active Resource treats headers.

module PivotalTracker
  class Resource < ActiveResource::Base
    Resource.site = "https://www.pivotaltracker.com/services/v3"

    class << self
      # If headers are not defined in a given subclass, then obtain
      # headers from the superclass.
      def headers
        if defined?(@headers)
          @headers
        elsif superclass != Object && superclass.headers
          superclass.headers
        else
          @headers ||= {}
        end
      end
    end
  end

  class Project < Resource
  end
end

Now we can set our token once and subclasses will inherit it:

PivotalTracker::Resource.headers['X-TrackerToken'] = "mytoken"
projects = PivotalTracker::Project.find(:all)
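
If the lookup is hard to see through the Active Resource noise, here is the same fallback in plain Ruby. Base and Child are hypothetical names; the only idea shown is "use my own headers if I have them, otherwise ask my superclass".

```ruby
class Base
  class << self
    def headers
      if defined?(@headers)
        @headers                      # this class has its own headers
      elsif superclass.respond_to?(:headers)
        superclass.headers            # fall back to the parent's headers
      else
        @headers ||= {}               # top of the chain: start an empty hash
      end
    end
  end
end

class Child < Base
end

Base.headers['X-TrackerToken'] = 'mytoken'
inherited = Child.headers['X-TrackerToken']
```

Note that Child gets the very same hash object as Base, so writing to Child.headers changes the headers for every class in the hierarchy.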

Challenge 2: “Associations”

I find it strange that Active Resource doesn’t support associations. Rails has a standard way of defining embedded resources, so you would think that Active Resource would have a standard way of consuming them (I know, I should get off my lazy duff and contribute a patch, but it’s so much easier to just complain about it).

So for APIs that have nested resources like Pivotal Tracker’s, Active Resource forces you to hard code the parent resource id. If you want to get the iterations for a project, then you have to set the project_id on the Iteration resource.

PROJECT_ID = 1738

module PivotalTracker
  class Iteration < Resource
    # Active Resource appends the collection name directly, so keep the trailing slash
    self.prefix = "/services/v3/projects/#{PROJECT_ID}/"
  end
end

This is just not a scalable solution. I’m going to need to be able to access multiple projects in the app that I’m working on. So the Harvest gem had a really clever (and evil) solution, which I’ve modified a bit here.

It basically involves creating an anonymous subclass of our resource, and setting the prefix just for that subclass.

module PivotalTracker
  class Project < Resource
    def iterations
      Iteration.build_subclass.tap do |iteration|
        # Active Resource appends the collection name directly, so keep the trailing slash
        iteration.prefix = "/services/v3/projects/#{self.id}/"
      end
    end
  end
end

Now we can access iterations for any project.

iterations = Project.find(x).iterations.find(:all)

The #build_subclass method is defined on the base resource and just creates an anonymous subclass and copies some settings that don’t inherit.
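
Here is a stripped-down sketch of that trick in plain Ruby, without Active Resource. It is not the Harvest gem’s code: the Resource class, its prefix accessor, and iterations_for are hypothetical, and the real method also copies settings that this toy version does not have.

```ruby
class Resource
  class << self
    attr_writer :prefix

    # Use this class's own prefix if set, otherwise the parent's.
    def prefix
      if defined?(@prefix)
        @prefix
      elsif superclass.respond_to?(:prefix)
        superclass.prefix
      else
        '/'
      end
    end

    # An anonymous subclass: same behavior, but its own class-level settings.
    def build_subclass
      Class.new(self)
    end
  end
end

class Iteration < Resource
end

# Each call returns a fresh class scoped to one project.
def iterations_for(project_id)
  Iteration.build_subclass.tap do |klass|
    klass.prefix = "/services/v3/projects/#{project_id}/"
  end
end

first = iterations_for(1)
second = iterations_for(2)
```

Because the prefix lives on a throwaway subclass, two projects can be queried side by side without stepping on each other, and Iteration itself is never modified.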

Onward Ho!

I don’t have a lot built out yet for the new Pivotal Tracker wrapper, but you can check out the latest progress on GitHub. I feel like I’ve overcome most of the big barriers, so it shouldn’t take much to finish it up.

Do you have any other tips or tricks for working with Active Resource?

Code: activeresource, rails Feb 16, 2010 ● updated Feb 16, 2010 4 comments

Passenger and browser testing in virtual machines

If you’re running Passenger in development, here is how to make Windows running in a virtual machine connect to your app in Passenger.

  1. Boot up the VM and open up the Windows command prompt (go to “Start->Run…”, enter “cmd” and press enter)
  2. Type ipconfig to see the network configuration. Take note of the “Default Gateway” address.
  3. Navigate to C:\WINDOWS\system32\drivers\etc and edit the hosts file. Add a line with the gateway address pointing to your app’s domain (you can even list multiple on the same line).
    172.16.248.2    awesomeapp.local otherawesomeapp.local
  4. Open up a browser in the VM and type in the address.

Tada! Now you can test out your app in those other browsers.

Code: passenger, rails, vmware, windows Feb 08, 2010 ● updated Feb 08, 2010 0 comments

Things that will rock the (my) world in 2010

Here are a few things that I’m really looking forward to using and abusing this year:

MongoDB

We’ve all been trying to shoehorn our Web 2.x applications into a 20 year old technology with an antiquated query language. If you haven’t looked into MongoDB, you need to, and you also need to check out MongoMapper.

I went out on a limb back in October and stated that a year from now, we’ll be using MongoDB for most new web apps. I think Mongo will be one of those things like Git that catches on like wildfire.

I have worked on a couple apps now with MongoDB, and it is a lot of fun. It blows my mind just how much our understanding of data modeling is tied to relational databases.

Sammy

Sammy is an awesome little JavaScript framework for building RESTful and evented web applications. I’ve only built one little toy app with it so far, but I really enjoyed it and plan to update a couple existing apps and use it on some new ones this year.

Sunspot

If you need to do full-text searching, then Sunspot is your man. I have tried and given up on almost every single solution for doing real full-text searching in applications. I have a reputation around the office for enthusiastically touting several different search solutions as “the one”, only to watch all of them fall on their face in production.

But Sunspot has not let me down yet, and I have one app that has been using it in production for a few months with zero problems. I look forward to abusing it this year.

Rails 3

Two weeks ago, Rails 3 would not have made it on this list. In fact, it would have probably made it on my “things that are totally going to suck” list. I was afraid I was becoming a curmudgeon.

But the Rails core team has kicked it into high gear the past couple weeks. I spent some time playing with Rails (setting up a new app was no easy feat, although now there’s a guide) and digging through the internals and I can now say that I am officially excited.

What are you looking forward to?

Code: collectiveidea Jan 04, 2010 ● updated Jan 04, 2010 5 comments

acts_as_audited and authlogic

For those using authlogic that have had issues with auditing your User model, version 1.0.2 of acts_as_audited should cure your woes.

All you need to do is exclude the last_request_at and perishable_token fields from being audited. We also excluded a few other fields that don’t need to be audited:

class User < ActiveRecord::Base
  acts_as_audited :except => [
    :crypted_password, 
    :persistence_token,
    :single_access_token,
    :perishable_token,
    :last_request_at,
  ]
end

Code: acts_as_audited, authlogic, collectiveidea, plugin Oct 29, 2009 ● updated Oct 29, 2009 0 comments

How to Gemify your Rails Plugins

Ever since Rails added support for declaring gem dependencies, there is really no (good) reason to use plain ol’ plugins. We’ve been slowly gemifying all of our plugins as we need them. There are a few hoops you have to jump through to get Rake tasks and Capistrano recipes working, but it’s fairly straightforward.

First, you need something that will help you generate the gemspec and build the gem. You can do this by hand, but there are several great tools out there that make it easy. We recommend Jeweler. Follow the directions in the Jeweler README for “Using in an existing project”.

1. Move init.rb to rails/init.rb

Rails plugins have the magical init.rb that gets loaded when the plugin is initialized. To make this work in a gem, all you have to do is move it to rails/init.rb. Recent versions of Rails will look there whether you install it as a plugin or a gem, so you can just move it to the new location if you don’t care about ancient versions of Rails.

2. Move rake tasks to lib/

Rails will load tasks/*.rake defined in any plugins. Unfortunately, these don’t get loaded from gems. To make your rake tasks work from a gem, you will need to move them into the lib directory, and explicitly require them from your app’s Rakefile:

require 'mygem/tasks'

If you want your tasks to still be available when your code is installed as a plugin, you can just explicitly require the tasks from tasks/mygem.rake:

require File.expand_path(File.join(File.dirname(__FILE__), '..', 'lib', 'mygem', 'tasks'))

3. Move Capistrano tasks to lib/

Capistrano recipes defined at recipes/*.rb in your plugin are also automatically loaded by Rails. Unfortunately, they suffer the same fate as Rake tasks and have to be moved to the lib directory and be explicitly required from config/deploy.rb.

When moving the recipes to the lib directory, we have to jump through a hoop to get Capistrano to load it properly.

Capistrano::Configuration.instance.load do
  # put cap recipes here
end

And in your config/deploy.rb:

require 'mygem/recipes'

As with Rake tasks, if you want your recipes to still work when installed as a plugin, add the following to a file in the recipes/ directory of your plugin:

require File.expand_path(File.join(File.dirname(__FILE__), '..', 'lib', 'mygem', 'recipes'))

4. Generators

You don’t have to do anything, they just work.

That’s it

Publish your gem and go buy yourself a drink. Check out one of our gems if you need more examples.

Code: Oct 05, 2009 ● updated Oct 05, 2009 5 comments

Capistrano, Git and SSH keys

This trick has been around for a while, but I’ve talked with several people that didn’t know about it.

When deploying apps with Capistrano, your server needs to have access to the Git repository. Generating an SSH key for each server is a bit of a pain, but there’s an easier way: SSH agent forwarding enables you to use any of your local SSH keys on the server. It’s really easy to set up.

Enable SSH forwarding in deploy.rb:

set :ssh_options, {:forward_agent => true}

The only other thing you need to do is tell the SSH agent about your key.

$ ssh-add -K

The -K option only works on OS X and it adds your key to your keychain so you don’t have to run ssh-add after you reboot (and if you have a passphrase set, you don’t have to type it every time). You can also pass it the path to an SSH private key in a different location.

Now the server can pull from any Git repository that you have access to.

Code: capistrano, deployment, git, rails, ssh Jun 23, 2009 ● updated Jun 25, 2009 3 comments

Cucumber scenarios that depend on Sphinx

I love writing apps that make heavy use of search indexes, but testing them can be a bit of a pain. Here is how I got ThinkingSphinx to play nice with Cucumber.

Here is the relevant part of what I put in features/support/env.rb:

# Cucumber::Rails.use_transactional_fixtures

# http://github.com/bmabey/database_cleaner
require 'database_cleaner'
DatabaseCleaner.strategy = :truncation
Before do
  DatabaseCleaner.clean
end

ts = ThinkingSphinx::Configuration.instance
ts.build
FileUtils.mkdir_p ts.searchd_file_path
ts.controller.index
ts.controller.start
at_exit do
  ts.controller.stop
end
ThinkingSphinx.deltas_enabled = true
ThinkingSphinx.updates_enabled = true
ThinkingSphinx.suppress_delta_output = true

# Re-generate the index before each Scenario
Before do
  ts.controller.index
end

What’s going on here?

Start by commenting out the line about using transactional fixtures in env.rb. Using transactional fixtures will run each scenario inside of a transaction and roll it back at the end of the scenario to revert the database state. Thinking Sphinx uses an after_commit callback for kicking off the delta indexing, but the callback never gets run when transactional fixtures are enabled because the entire scenario is run inside of a big transaction.

Once we’ve disabled transactional fixtures, our test database will start to fill up, likely causing some problems. So we need to add a Before block that clears out the database before each scenario. I’m using database_cleaner, which gives you some different strategies for cleaning the database. Alternatively, the brute-force solution is just to reload the schema before each scenario, but this is slower than truncating the data.

Before do
  ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations['test'])
  ActiveRecord::Schema.verbose = false
  load "#{RAILS_ROOT}/db/schema.rb"
end

Next, we start Sphinx when env.rb is loaded, and shut it down when the Ruby process exits. We also enable deltas and updates, which are disabled by default in test mode. Finally, we define a Before block that updates the index before each scenario so we don’t end up with a stale index.

Putting it all together

I’m using Sphinx’s delayed delta support, so whenever I update records, I need to have delayed_job process jobs. Instead of trying to get delayed_job to run in the background, I took the easy way out and defined a step: “When the system processes jobs”.

Scenario: Posting a new listing
  Given I am logged in as "MovinMan" 
  When I create a new listing titled "Lots of Boxes" near "49423" 
  And the system processes jobs
  And I browse listings near "49423" 
  Then I can see a listing titled "Lots of Boxes" 

Which is just implemented as:

When 'the system processes jobs' do
  Delayed::Job.work_off
end

If you’re just using the default deltas, and not delayed deltas, then you can update the index like this:

When /^the system updates the index$/ do
  MyModel.sphinx_indexes.first.delta_object.index(MyModel)
end

I hope that helps. Post your suggestions in the comments for improving this.

Code: bdd, cucumber, rails, search, sphinx Jun 01, 2009 ● updated Jun 25, 2009 13 comments

Site-specific app for Rails docs

Unless you’re a Rails genius, you probably need to frequently reference the Rails API docs. And if you haven’t discovered it yet, railsapi.com is awesome.

John Nunemaker suggested that I create a site-specific browser and point it to a local copy of the docs from railsapi.com. I did and have been loving it, so I’m suggesting that you do it too.

Download Fluid (or the comparable app for your platform if you’re not on a Mac). Then download a copy of the docs from railsapi.com and create a new app pointing to that local copy.

Better yet, head over to railslogo.com and grab the Creative Commons licensed logo to use as the icon.

Now my only complaint is that I don’t have docs for Ruby and other gems in this app, but I have a hunch that it won’t be long until that changes.

Code: api, docs, mac, rails May 05, 2009 ● updated May 06, 2009 4 comments

Keepin' Sphinx Indexes Fresh

<infomercial-voice>Stale indexes got you down? Embarrassed that your users’ searches are coming up empty? Act now and you can avoid stale indexes with NEW and IMPROVED delayed delta indexing!!</infomercial-voice>

Ok, maybe it’s not new and improved. It’s actually been around since January, but it’s still awesome. Thinking Sphinx can use delayed_job to keep indexes fresh.

I was slow at jumping on the Sphinx bandwagon for one reason: the index has to be rebuilt to incorporate new data. Delta indexing alleviated some of this by storing frequent changes in a small separate index, but it still had to be occasionally reindexed. It also seemed to only index existing records in my trials with it. New records didn’t ever seem to show up until I rebuilt the whole index.

From what I can tell, delayed delta indexing makes everything Just Work™, and here’s how to use it…

After you’ve set up ThinkingSphinx, set the :delta property to :delayed in your index:

class Listing < ActiveRecord::Base
  define_index do
    indexes title
    indexes description
    indexes user.login, :as => :user

    set_property :delta => :delayed
  end
end

The delayed delta support depends on delayed_job, but if you’re using the gem version, it’s already bundled in. I’m using delayed job for some other things in my project, so I still have it installed separately and that seems to be working just fine.

delayed_job uses a table to keep track of the jobs that need to be run, so create a migration containing:

create_table :delayed_jobs, :force => true do |table| 
   table.integer  :priority, :default => 0 
   table.integer  :attempts, :default => 0 
   table.text     :handler 
   table.string   :last_error 
   table.datetime :run_at 
   table.datetime :locked_at 
   table.datetime :failed_at 
   table.string   :locked_by 
   table.timestamps 
end

And lastly, all you need to do is fire up the worker process:

$ rake thinking_sphinx:delayed_delta

Now whenever changes are made to your models, the index will be updated moments later. And that’s how you keep it fresh!

Code: rails, search, sphinx Apr 29, 2009 ● updated Apr 29, 2009 2 comments
