April 3, 2008

Guide to Getting Started with Merb and ActiveRecord

Spent a while today getting up and running with Merb, the minimalist modular alternative Ruby framework from Ezra and the good people at Engine Yard. Merb has been in a bit of chaos these recent months as it's gone through a major reworking to achieve a whole new level of performance as well as honest-to-goodness modularity, including choosing your own ORM and templating system. I've been watching Merb's development for some time, waiting for it to reach a level of stability that looked safe enough to dive in; their most recent release, 0.9.2, combined with pressing needs in a few exciting new Grabb.it features, made today the day.

My first day with Merb has been mostly great, but the one thing I found really sorely missing was a tutorial on how to get started. In all fairness, the Merb team promises left and right that copious documentation will be coming as they settle down to 1.0. In the meantime, I thought I'd pitch in for any brave early adopters out there with this Guide to Installing Merb and ActiveRecord.

Acquire and Install the Source

So, my goal here was to get from 0 (no Merb on my machine whatsoever), to running a hello world for accessing the database from an existing Rails project using ActiveRecord from within Merb. The first step was to acquire the source. One of the downsides to Merb's modular architecture is the complexity involved with installing it (again, all relevant disclaimers here about how the Merb team will, I'm sure, be working to simplify and improve the process as they reach 1.0). At least for now, you actually have to get your hands on three different packages: merb-core (the base of the framework, a requirement), merb-more (has more advanced features like the command for actually creating a new app), and merb-plugins (this is where things like ORMs and templating systems live). Let's do that:

    $ git clone git://github.com/wycats/merb-core.git
    $ git clone git://github.com/wycats/merb-more.git
    $ git clone git://github.com/wycats/merb-plugins.git

That'll get us the bleeding edge trunk version (about which more here).

Now that we've got the pieces, we need to install them, thusly:

    $ cd merb-core ; rake install ; cd ..
    $ cd merb-more ; rake install ; cd ..
    $ cd merb-plugins/merb_activerecord; rake install; cd ..

Create a New Project and Configure it for Active Record

We've got all the pieces; it's time to create our project and get it set up to use ActiveRecord. Go to the place you want to create your app and do this:

    $ merb-gen app my_app

This is equivalent to the 'rails' command and will create your project directory with most of what you need. But since a basic project in Merb is assumed to be simpler than a basic project in Rails, you'll quickly notice that you don't have a models directory. Since we're actually going to need a model if we want to connect up to a database with ActiveRecord, go ahead and create that directory and create a file inside it for your model just as you would in Rails, for example my_resource.rb, which could look like this:

    class MyResource < ActiveRecord::Base
    end

We'll probably want a controller as well, so create a new file, controllers/my_resources.rb:

    class MyResources < Application
      def show
        r = MyResource.find :first
        render r.some_method
      end
    end

Notice that all Merb controllers inherit from Application just like Rails controllers inherit from ApplicationController. The naming choice there is kind of interesting because it reveals Merb's controller-centric history and philosophy (remember that the framework doesn't assume that we need models by default; it turns out there's a lot you can do with just controllers).

Since we're using ActiveRecord, we'll obviously need to tell Merb that we want it to go ahead and actually load AR as our ORM. Go into config/init.rb in your project and uncomment the line that says "use_orm :activerecord".

We're almost there! These last few steps will feel familiar from setting up a Rails app: letting the framework know about our route and database configuration. To set up your route, open up config/router.rb and, inside the 'prepare' block, add a line like this:

    r.resources :my_resources

Merb's routes work pretty much like Rails's, but with a few more advanced features, some of which are explained in the comments at the top of that file. If you need something other than standard RESTful routing, read those.

Finally, all we've got to do is configure database access and we'll be ready to roll. In Merb this looks exactly like Rails; in fact, I simply copied the database.yml file over from the Rails project that usually manages the db I wanted to access, dropped it in config/database.yml, and it worked straight out of the box.
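For anyone who hasn't looked inside a Rails database.yml, here's a minimal sketch of the kind of file I mean and how its per-environment sections can be read with Ruby's standard YAML library. The adapter, database names, and credentials are made-up placeholders, not the settings from my project, and database_settings is a helper name I invented. This only illustrates the file format; it isn't how Merb or ActiveRecord load the file internally.

```ruby
require 'yaml'

# Hypothetical contents of config/database.yml (all values are placeholders).
DATABASE_YML = <<~YAML
  development:
    adapter: mysql
    database: my_app_development
    username: root
    password:
    host: localhost
  production:
    adapter: mysql
    database: my_app_production
    username: my_app
    password: secret
    host: localhost
YAML

# Returns the settings hash for one environment, or raises if it is missing.
def database_settings(yaml_text, environment)
  config = YAML.load(yaml_text)
  config.fetch(environment) do
    raise ArgumentError, "no '#{environment}' section in database.yml"
  end
end

settings = database_settings(DATABASE_YML, "development")
puts "#{settings['adapter']} -> #{settings['database']}"
```

Each top-level key is an environment name, which is why the same file can serve development and production without changes.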

Ok! If you've made it this far, then you're probably more than ready for the big reveal. In your project directory do this:

    $ merb

The server will start and you'll get a few log messages in your terminal that look like this:

    $ merb
     ~ Loaded DEVELOPMENT Environment...
     ~ loading gem 'merb_activerecord' from ...
     ~ loading gem 'activerecord' from ...
     ~ Connecting to database...
     ~ Compiling routes...
     ~ Using 'share-nothing' cookie sessions (4kb limit per client)
     ~ Using Mongrel adapter

Once that settles down, browse to http://localhost:4000 and you should see the Merb welcome screen. Go to your expected URL to see the result of your handiwork, e.g. http://localhost:4000/my_resources/7

If this works, you're officially up and running with Merb and ActiveRecord. If not, you'll see one of Merb's very stylish error screens and it'll be time to go to #merb on irc.freenode.net where all the friendly merbfolk hang out and are more than willing to help.

Posted by Greg at 9:35 PM | Comments (1)

March 7, 2008

Automating Firefox for Web Application Integration

This post explains how to control Firefox from the command line with Telnet and Ruby. After presenting some context to explain why I think this hack represents an important area of concern in contemporary web application development, I'll show how to execute it with actual install directions and code samples.

Ok, I'll say it: I think JavaScript is cool. One of my favorite effects of the move to the modern AJAX-oriented web application architecture has been the opportunity to move ever more functionality into the client. At Grabb.it, we like to say, "Anything you can implement in JavaScript is free." Instead of running on our servers, the JavaScript portion of our app runs on a distributed grid of thousands of machines maintained for us by our users. Also, despite the reputation given it by the Browser Wars, JavaScript is incredibly fun to develop in: it's lightweight and extremely flexible in a unique way that somehow forces you to constantly keep your code very closely tied to the data it's manipulating.

The one big downside to JavaScript is its runtime environment. Not only does code running in the browser confront a Gordian Knot of browser compatibility problems, but it's also irretrievably isolated from interoperating with other application code. While JavaScript libraries (like the inestimable jQuery) are increasingly proving to be the Alexander's sword for the browser compatibility Knot, the issue of lack of application interoperability is only just beginning to get serious. As JavaScript's innate advantages lure more and more application code into the browser, the question will be unavoidable: How do you get modules implemented in JavaScript to interact with those built in other languages that live in more traditional environments? How do you avoid duplicating all functionality that you put into the JavaScript portion of the application so that you can call it from outside the browser?

This week, trying to solve exactly these types of problems, I discovered a tantalizing avenue towards addressing some of these questions: browser automation from the command line and from scripting languages. Here was my situation.

As part of an upcoming Grabbit project, I've built a highly interactive data browser for our customers. The JavaScript running on that page makes a series of JSON GET requests to gather all of the necessary information to compose its display and it makes a few AJAX POST requests to report back to the server on certain bits of status. But now, I wanted to trigger those POSTs programmatically on a schedule rather than waiting for customers to trigger them. The dilemma is that I'd already written this relatively sophisticated JavaScript application that makes all the necessary requests, implements the business logic, and knows how to POST in the data. I had two options: redo all of that work again in my server-side application (ick!) or figure out a way to trigger this JavaScript code by automating its runtime environment (the browser).

After a half day's research, here's what I discovered: there's a Firefox extension that allows other applications to establish JavaScript shell connections to a running Mozilla process via TCP/IP. It's called JSSH. Once you've got JSSH installed and running in Firefox, you can open a telnet connection to the browser that allows you to automate it using JavaScript commands to do things like load new pages or even manipulate the DOM on pages you've loaded. You can then automate this interaction using any scripting language with a telnet library. For the remainder of this post, I'll provide step-by-step instructions for running JSSH and for automating it with Ruby.

Install JSSH

The easiest way to install JSSH is to download the JSSH.xpi and open it with Firefox which will offer to install the extension (if you're interested in compiling Firefox with it from scratch or installing an existing binary, you should read these instructions).

Start Firefox with JSSH

Once you've got a copy of Firefox with JSSH installed, you'll need to run it. You can do this by providing the correct options when launching Firefox from the command line. On Mac OS X, that looks like this:

    /Applications/Firefox.app/Contents/MacOS/firefox -jssh &

The "&" at the end of that line will background your command so it doesn't take over your terminal session.

Telnet into the JavaScript Shell

Once Firefox is running, we can use telnet to log into JSSH like so:

    $ telnet localhost 9997
    Trying ::1...
    telnet: connect to address ::1: Connection refused
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    Welcome to the Mozilla JavaScript Shell!
    >

Load a URL from JSSH

Now that we're in, we can tell Firefox to load pages for us, thusly:

    var w0 = getWindows()[0]
    var browser = w0.getBrowser()
    browser.loadURI("http://www.urbanhonking.com/ideasfordozens")

And that's it! If the JavaScript application I wanted to run lived at "http://urbanhonking.com/ideasfordozens", we'd be done. That command would load the page and Firefox would interpret and run the JavaScript it found there.

Now, all we've got left to do is make it so that we can trigger this process from our application code. So, we'll...

Automate the Process with Ruby

Like any good scripting language, Ruby has a telnet library, which means that once we've got an instance of Firefox running with JSSH enabled, we can talk to it from Ruby whenever we want. Here's an example script that logs into the telnet shell and loads a series of URLs one at a time:

    require 'net/telnet'

    my_urls = ["http://urbanhonking.com/ideasfordozens",
               "http://atduskmusic.com",
               "http://grabb.it",
               "http://pdxpopnow.com"]

    # start telnet session with the Firefox javascript shell and setup browser object
    puts "starting telnet session"
    firefox = Net::Telnet::new("Host" => "localhost", "Port" => 9997)
    firefox.cmd "var w0 = getWindows()[0]"
    firefox.cmd "var browser = w0.getBrowser()"

    # load each page
    my_urls.each do |url|
      puts "loading...#{url}"
      firefox.cmd "browser.loadURI('#{url}')"
      sleep 10 # so that the browser has time to load even if the page is slow
    end

    firefox.close
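One fragile spot in the script above is that each URL gets pasted straight into a single-quoted JavaScript string, so a URL containing a quote would break the command. Here's a minimal sketch of one way around that, assuming Ruby's standard json library is available; load_uri_command is a name I made up for illustration, and I haven't run it against every odd URL out there.

```ruby
require 'json'

# Hypothetical helper: builds the JSSH command for loading a URL.
# For ordinary URLs a JSON string literal is also a valid JavaScript string
# literal, so to_json takes care of the quoting and escaping.
def load_uri_command(url)
  "browser.loadURI(#{url.to_json})"
end

puts load_uri_command("http://grabb.it")
puts load_uri_command("http://example.com/?q='quoted'")
```

In the loop above, firefox.cmd load_uri_command(url) would then stand in for the interpolated string.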

Further Research: Screen Scraping JavaScript Heavy Sites

What else might this rickety bridge we've built to the JavaScript runtime environment be good for? One thing that immediately occurs to me is: screen scraping for sites with a lot of JavaScript. Another side effect of the rise of rich JavaScript applications has been to create intractable problems for people trying to do screen scraping. If the data you want is not in the page's HTML when you request it in the first place but is only written in later when the page's JavaScript runs, then traditional spidering and screen scraping techniques will fail to find it. Freebase, the open database application built by Danny Hillis and his team, for example, uses a highly dynamic interface for presenting its data that is almost entirely based in JavaScript. Or, on the low-brow side, MySpace uses JavaScript throughout the forms in its interface to help with date picking and such. If you wanted to scrape or automate interaction with either of these sites, you'd need access to a runtime environment that could execute JavaScript.

I haven't really tackled this problem with JSSH, but I do have some leads. For example, here's how you get the HTML of the document:

    > browser.contentDocument
    [object XPCNativeWrapper [object HTMLDocument]]
    > domDumpFull(domNode(browser.contentDocument))
    <HTML><HEAD><META content="text/html...

If you want to explore this avenue further, one of the best places to look is Firewatir, a project to add Firefox support to the WATIR browser testing framework. They do lots of click-by-click automation and checking for results, so I'm sure they've figured out approaches for a lot of what you'd confront when screen scraping. The JSSH documentation itself is useful and clear but not the most in depth.
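To connect that lead back to the Ruby side, here's a rough, untested-against-real-Firefox sketch of how the load-then-dump sequence might be wrapped in one method. fetch_rendered_html and FakeSession are names I invented; the sketch assumes the session has already defined the browser variable as in the script earlier, and that whatever object you pass responds to cmd the way a Net::Telnet connection does. The fake session is only there so the code runs without a browser.

```ruby
# Hypothetical helper: loads a URL in the JSSH-controlled browser, waits, and
# returns whatever the shell prints for the DOM dump.
def fetch_rendered_html(session, url, wait_seconds = 10)
  session.cmd "browser.loadURI('#{url}')"
  sleep wait_seconds # crude: give the page's JavaScript time to run
  session.cmd "domDumpFull(domNode(browser.contentDocument))"
end

# A stand-in session so the sketch runs without Firefox; it records the
# commands it receives and returns a canned reply for the DOM dump.
class FakeSession
  attr_reader :commands

  def initialize
    @commands = []
  end

  def cmd(command)
    @commands << command
    command.start_with?("domDumpFull") ? "<HTML><HEAD></HEAD><BODY></BODY></HTML>" : ""
  end
end

session = FakeSession.new
html = fetch_rendered_html(session, "http://grabb.it", 0)
puts html
```

With a real connection, the returned string would likely also include the shell's prompt, so expect to trim it before handing the HTML to a parser.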

Happy automating! Let me know what you discover...

Posted by Greg at 10:54 AM | Comments (0)

March 1, 2008

Developing Single Serving Sites using Ruby CGI scripts on Dreamhost

There's been a lot of hullabaloo lately about Single Serving Sites. Stimulated by the inexplicable runaway success of Barack Obama is Your New Bicycle, these simple sites that provide a small dollop of amusement (isitchristmas.com) or utility (istwitterdown.com) have become all the rage.

Of course, I've been making them for awhile now, e.g. Largehearted Goat and The NY Times Explains the Ratings. At first, creating a new SSS is a satisfying experience. You have a whacky idea and within a few hours, you've registered a domain, written some simple code, and put something up. But, over time, each one gets to be more and more of a burden. My ideas, at least, have involved sites that need to be constantly updated with new information over time. Since I've always implemented these sites with simple Ruby scripts that run on my local machine and then upload the static finished versions of the sites, this has meant keeping an eye on unreliable cron jobs and, sometimes, hand maintenance. And, over the years, I've wondered if there was a better solution.

Today, I took the first steps towards finding one. It turns out that good-ole CGI scripts, so foreign to those of us whose main experience has taken place in the age of sophisticated web frameworks like Rails, make a great basis for SSS development. What follows is a basic introduction to writing and running CGI scripts with Ruby. I'll focus on Dreamhost as a deployment target since it's the service I have access to for this kind of thing, and the issues that arise there are probably not that dissimilar from what comes up with the other shared hosting services that are the natural habitat of SSSes.

Step One: Get Ruby and Rubygems up and running

If you plan on keeping your SSSes extremely spartan, you might be able to skip this step. Dreamhost accounts come with an old version of Ruby already installed. If you don't need to install any custom gems for your scripts or control anything else about your Ruby environment, you can skip straight down to step two. However, if you plan on doing anything the least bit more sophisticated (from installing one individual gem all the way up to using Rails itself), you've got a bit of work to do before you can get started on the fun part.

Being far from an expert on Unix build processes and dependencies, I followed Nate Clark's excellent instructions for building Ruby and Rubygems on Dreamhost with just a few discrepancies. My biggest note is that the version of the Ruby source to which he links is out-of-date. I changed his line for downloading the Ruby source to:

    $ wget ftp://ftp.ruby-lang.org/pub/ruby/1.8/ruby-1.8.6-p111.tar.gz

But don't trust me! The most recent version of Ruby changes all the time and you can always find it at ruby-lang.org. The one other thing worth noting is that in any of the compilation steps make may die by timeout (Dreamhost kills processes that last longer than a given interval, a major pain point we'll be returning to in a moment when we try to install individual gems). It won't always be clear that a timeout was at fault, so if you see a 'configure', 'make', or 'make install' command die mysteriously, just go ahead and try again.
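If retyping the same command gets old, the "just try again" advice can be scripted. This is only a sketch under my own assumptions: run_with_retries is a helper name I made up, three attempts is an arbitrary default, and I use a harmless Ruby one-liner as the demonstration command so it runs anywhere.

```ruby
require 'rbconfig'

# Hypothetical helper: runs a command, retrying if it exits unsuccessfully
# or is killed. Returns the attempt number that succeeded, or nil if every
# attempt failed.
def run_with_retries(command, max_attempts = 3)
  1.upto(max_attempts) do |attempt|
    return attempt if system(*command)
    warn "attempt #{attempt} of #{max_attempts} failed: #{command.join(' ')}"
  end
  nil
end

# Demonstration with harmless commands; on Dreamhost the command would be
# something like ["make"] or ["make", "install"].
ruby = RbConfig.ruby
p run_with_retries([ruby, "-e", "exit 0"])
p run_with_retries([ruby, "-e", "exit 1"], 2)
```

A retry loop won't help if the build genuinely can't finish inside the time limit, but it saves some babysitting when the kills are intermittent.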

Step Two: CGI Hello World with Ruby

Now that we've got Ruby and Rubygems installed, the first thing we want to do is to just run a basic 'hello world' CGI script to make sure that we've got things configured correctly. Here are the steps:

  1. Log into dreamhost over ssh.
  2. Navigate to the desired directory within your web space.
  3. Create a new file, e.g. test.cgi.
  4. Fill it with the following content (if you skipped Step One and are using the built-in DH Ruby, the first 'shebang' line should read "#!/usr/bin/ruby" instead of what is shown):

         #!/usr/bin/env ruby
         require "cgi"
         cgi = CGI.new("html3")
         cgi.out("text/plain"){ "Hello World #{Time.now}" }

  5. Change the permissions of that file by running:

         $ chmod 775 test.cgi

That should do it. Navigate to that path in your browser and you should see output that looks something like this:

    Hello World Sat Mar 01 19:04:53 -0800 2008
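If you're curious what cgi.out is doing for you, a CGI response is just a block of headers, a blank line, and then the body, all written to standard output. Here's a minimal sketch that skips the cgi library entirely; cgi_response is a name I made up, and the real library sets more headers than this one does.

```ruby
#!/usr/bin/env ruby

# Hypothetical helper: builds a bare-bones CGI response by hand.
# The blank line (the second "\r\n") separates the headers from the body.
def cgi_response(body, content_type = "text/plain")
  "Content-Type: #{content_type}\r\n\r\n#{body}"
end

print cgi_response("Hello World #{Time.now}")
```

Forgetting that blank line is a classic cause of a 500 error from a hand-rolled CGI script, so it's worth knowing it's there even if you let the library handle it.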

Step Three: Install Gems by Hand

Now, if you plan on doing anything more interesting than displaying the date, the odds are you're going to want to take advantage of Ruby's great and growing wealth of libraries made available as Gems. Unfortunately, the RubyGems server system running at RubyForge has lately become incredibly slow. In order to find packages and detect dependencies, the default 'gem install' command has to communicate with that server and so becomes subject to the ruthless Dreamhost killing of long-running processes. In my experience, trying to install gems on Dreamhost will universally end in frustration:

    $ gem install feedalizer
    Bulk updating Gem source index for: http://gems.rubyforge.org
    Killed

What to do?

Fortunately, in addition to using the RubyForge server to search for gems, it's possible to install them directly from .gem files if you can find them for your desired packages. For example, when building a recent SSS I wanted to use Feedalizer — a great library that scrapes web pages and automatically creates RSS feeds from the resulting content — so I hunted down the .gem file via the RubyForge website, downloaded it, attempted local installation, and then (when that failed) chased down the sources for the dependencies (in this case just Hpricot) to follow the same process. To give you a taste of things here were the gory details:

Installing Feedalizer:

    $ wget http://rubyforge.org/frs/download.php/13797/feedalizer-0.1.0.gem
    $ gem install feedalizer-0.1.0.gem

which threw an error complaining about the need for the Hpricot dependency, so:

    $ wget http://code.whytheluckystiff.net/dist/hpricot-0.5.140.tgz
    $ tar xzf hpricot-0.5.140.tgz
    $ cd hpricot-0.5.140
    $ rake gem
    $ gem install pkg/hpricot-0.5.gem

The 'rake gem' step there is necessary because _why only makes the source directory available for direct download rather than an explicit .gem file. After this completes successfully, return to the 'gem install' step for Feedalizer above (be sure to return to the directory into which you downloaded the Feedalizer .gem before running it).

Much bumpier than the smooth ride normally provided by 'gem install', but it'll get the job done. You could clamber your way down similar rutted paths for most any other gem you needed for your script.

And that should be more than enough to get you started. If you run into any troubles, dive into the comments here and on the other blog posts linked for help. Also, if anyone has any experience using lightweight persistence strategies in this context, I'd love to hear about them, especially if they're file-system only; a lot of my scripts require saving records and I tend to use a rickety YAML-based system for them that could stand improving.
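For anyone wondering what I mean by a YAML-based system, here's a minimal sketch of the general idea using only Ruby's standard library. It isn't my actual code: save_records, load_records, and the record fields are names invented for illustration, and the write-then-rename step is just one way to make it a little less rickety.

```ruby
require 'yaml'
require 'tmpdir'

# Hypothetical helper: writes all records to a YAML file.
# Writing to a temp file and renaming it means a killed process should not
# leave a half-written records file behind.
def save_records(path, records)
  temp_path = "#{path}.tmp"
  File.open(temp_path, "w") { |f| f.write(records.to_yaml) }
  File.rename(temp_path, path)
end

# Hypothetical helper: reads the records back, or returns an empty list if
# nothing has been saved yet.
def load_records(path)
  return [] unless File.exist?(path)
  YAML.load_file(path) || []
end

# Demonstration in a throwaway directory.
Dir.mktmpdir do |dir|
  path = File.join(dir, "records.yml")
  records = load_records(path)
  records << { "title" => "Hello World", "url" => "http://grabb.it" }
  save_records(path, records)
  p load_records(path)
end
```

It keeps everything in one human-readable file, which suits an SSS, but it rewrites the whole file on every save and does nothing about two scripts writing at once.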

Note: Thanks to The Pug Automatic for inspiring me to start down this path in the first place.

Posted by Greg at 7:27 PM | Comments (0)

December 26, 2007

50 Blog Posts I'd Like to See in 2008

A while back, noted social media commentator Chris Brogan published a list of 100 blog posts he'd like to see other people write. The list was meant as a spur in the backside of readers who suffer from Blogger's Block, a final blow to the excuse that they can't think of anything to write about. While Brogan's actual topics were too focused on the incestuously insider world of social media for my taste ("44. The Difference Between Fark and Truemors"), the format was inspiring; I immediately started putting together my own list of posts I'd like to see.

Inevitably, my list reflects my interests and bias just as Brogan's does his. The posts I'd like to see are largely aimed at people like me: people who picked up Ruby in the last couple of years because of Rails and have got the language and framework pretty handily under control but lack the deep rooting in technical fundamentals and culture that comes from having gone to school for this stuff or having had a long career in it.

The theme is: "I know Ruby. Now what?"

Like Brogan, I'll ask that if you write on one of these topics, you link back here or come back to comment so I can find your post and gradually turn this list into something of a directory of this kind of info. Now, without further ado, here's my list:

  1. My First C Extension
  2. A Set Theory Primer for Relational Database Users
  3. What I Get Out of My Local Ruby Users' Group
  4. All About Postgres Indexes
  5. Ruby Was My First Language, Here's My Second
  6. The Five Most Useful Things I Learned in Computer Science
  7. My First Contribution to an Open Source Project
  8. Participating Actively on Mailing Lists
  9. Participating Actively on IRC
  10. Understanding and Using Threads
  11. Keeping Up to Date with New Developments to the Libraries You Care About
  12. How to Write Good API Documentation
  13. An Intro to Code Research, Or: Is There A Library for That?
  14. An Intro to Queues, Pools, Runners, and Inter-Process Communication
  15. Unicode Once and for All
  16. Timezones Once and for All
  17. How Do Migrations Actually Work?
  18. Basic System Administration for Developers
  19. Domain-based Programming in Javascript
  20. Instrumenting My Rails App
  21. Approaches to Processing Large Data Sets
  22. A Developer's Guide to Deployment
  23. Bootstrapping Rails Development for the Absolute Beginner
  24. Hey Look, I Did Something Useful with LISP!
  25. Approaching Your Idols: How to Start Conversations with Gray Beards and Gurus
  26. My Painless Gem Integration System
  27. Using Your Own Documentation
  28. Building a Rails App from Someone Else's Excel Spreadsheet
  29. Useful Ruby Outside of Rails
  30. Ruby as PHP Replacement
  31. Writing Simple CGI Scripts with Ruby
  32. Finding and Fixing Memory Leaks in Rails, Or: Why Are My Mongrels So Big?
  33. Writing and Managing Long-Running Processes
  34. SMS Integration in Rails
  35. What I Learned from Working with Statically Typed Languages
  36. My First Profiling Session
  37. Big Team Tools and Small Teams, Or: Why Is My Trac Empty?
  38. Rules of Thumb for Performant Ruby Code
  39. A Survey of Persistence Strategies Beyond Relational Databases
  40. Living with a Large Schema
  41. My First Cocoa Program
  42. Writing and Distributing Rails Apps for Desktop Installation
  43. Domain Registration and Hosting for Rails Apps
  44. Setting Up Subdomains and Pointing Them at Rails Apps
  45. My First Apache Configuration
  46. My First Nginx Configuration
  47. Is It Worth Releasing?: When to Open Source Your Work
  48. An Introduction to GUI Programming
  49. Lessons from Java for Someone Who's Never Written Any
  50. Practical Tips for Learning Protocols and Reading Specs

Posted by Greg at 1:18 PM | Comments (3)

September 20, 2007

git_tools: the Beginnings of a Rake Library for Git

There's been quite a lot of buzz lately amongst Rubyists (or at least amongst those I follow on Twitter) about the new version control system Git. As Linus Torvalds explained in a Google Tech Talk last May, licensing issues drove him to start work on Git to replace BitKeeper, under which Linux kernel developers had previously operated. Torvalds had three main requirements for the project: it had to be fast, it had to be distributed, and it had to be guaranteed never to lose any data.

For mere mortal developers like me, the most attractive one of these three is the second: Git's distributed architecture. In practice, the fact that Git is distributed means that you can separate the act of committing your changes from the act of publishing them. With Subversion, and other centralized systems, the only way to get your changes under version control is to push them out to the central server from which all of your collaborators pull. So, if you've got a set of changes that make a logical atomic commit, but break the build in a major way, you can't commit them without disrupting the work of others. Git, on the other hand, lets you make commits on your local machine without pushing any changes out to anyone. The result is that you can make a series of commits as you go that leave the app in whatever state you want -- even including creating your own experimental branches for trying out ideas, a process Git makes almost trivially easy -- and then when you're ready to push your changes out into the wider world, you can do so without anyone having to see the mess you made along the way. (A cool side effect of this architecture is that you can commit even if you don't have network access, for example if you're on a plane, which is where I wrote most of the code I'm about to tell you about...)

Anyway, as soon as I started using Git myself, I noticed the lack of a tool I use so often I'd stopped even noticing it was there: jchris's svn_tools plugin. It's a series of rake tasks for controlling Subversion. It provides a bunch of tasks to do common things like add all existing unrecognized files all the way up to bootstrapping a brand new Rails project into a new checkout. And Git definitely has some bumpy edges that would be nice to abstract out in this same way (like the need to type "git commit -a" for almost every commit and the need to manually tell Git about each new file individually when they're first being added). So, I set out to write a version of the same library for git, one that could provide some utilities that would speed up everyday activities and, eventually, some more complex macros for doing things like bootstrapping a new Rails project.

I didn't manage to accomplish that second goal, but I made enough progress towards the first that I thought it would be useful to throw the project out there to see if one of those feverishly twittering Rubyists busy trying out Git might want to pick it up and run it into the end zone. So, without further ado, here it is:

git_tools_0.1.zip

The main tasks of interest are:

  • rake git:add
    - adds all files that need a manual git add
  • rake git:commit
    - commits with the -a flag thrown so your new content will actually get committed
  • rake git:ignore
    - removes the given files (or all files matching a quotation-surrounded pattern if given) from version control
For those looking to develop the library further, a full framework for parsing Git's response messages is already in place as well as a system for executing Git commands (the design of which is yoinked from the svn_tools project). Bits here and there may be a touch wonky (for example, if you send a pattern to rake git:ignore, git raises an error even though it completes the task successfully).
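To give a flavor of what parsing Git's output for a task like git:add can look like, here's a small sketch. It is not the code in git_tools: untracked_files is a name I made up, and it assumes a Git recent enough to support git status --porcelain, where untracked paths appear on lines starting with "?? ". The sample text stands in for real command output so the sketch runs without a repository.

```ruby
# Hypothetical sketch: find the files that would need a manual `git add`
# by reading `git status --porcelain` style output.
def untracked_files(porcelain_output)
  porcelain_output.lines
                  .map { |line| line.chomp }
                  .select { |line| line.start_with?("?? ") }
                  .map { |line| line[3..-1] }
end

# Sample output: one modified tracked file and two untracked files.
sample = <<~STATUS
   M app/models/user.rb
  ?? lib/tasks/git.rake
  ?? README
STATUS

p untracked_files(sample)
```

A rake task could then pass each of those paths to git add. Paths with spaces or unusual characters come back quoted in this format, which the sketch doesn't handle.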

A note on contributing: The archive of git_tools is itself a Git repository. One of the nice things about Git is that the files containing the versioning info are very compact and portable, so it's easy to ship around copies of a full repository with its entire history. If you make changes or improvements you'd like to see incorporated, just send me an archived version of the updated repository and I'll merge them in.

If you're intrigued and would like to learn more about Git I'd recommend the excellent tutorial intro to Git on Kernel.org and this how-to on building Git on OS X.

Happy Gitting!

Posted by Greg at 1:13 AM | Comments (2)

June 15, 2007

Presenting THUMBNAIL: a Ruby wrapper for the AWS Alexa Site Thumbnail service

I'm proud to announce the release of my very first gem: THUMBNAIL. It's a Ruby wrapper for the Amazon Web Services Alexa Site Thumbnail Service, which lets you automatically download thumbnails of any website or dynamically embed them in your own pages for an incredibly small fee ($0.20 per 1000 images). And it has a badass logo (see above)!

You can get it from RubyForge thusly:

    $ sudo gem install thumbnail

You've got to do some bureaucratic overhead at Amazon before you can play (more about that on the THUMBNAIL homepage), but once you do it's just as easy as pie to download pix of sites from around the web like so:

    require 'rubygems'
    require 'thumbnail'
    require 'open-uri'

    t = Thumbnail::Client.new :access_key_id => YOUR_ACCESS_KEY_ID,
                              :secret_access_key => YOUR_SECRET_ACCESS_KEY
    url = t.get("www.urbanhonking.com")[:thumbnail][:url]
    File.open("urho.jpg", "w") { |f| f.write open(url).read }

where YOUR_ACCESS_KEY_ID and YOUR_SECRET_ACCESS_KEY are things you get from Amazon when signing up for the service. Running such code would make you the proud owner of a local copy of an image like this:
urho.jpg

THUMBNAIL will also build you a URL you can include in any webpage that will redirect to the site thumbnail you want like so:

    require 'thumbnail'

    t = Thumbnail::Client.new :access_key_id => YOUR_ACCESS_KEY_ID,
                              :secret_access_key => YOUR_SECRET_ACCESS_KEY,
                              :action => :redirect
    url = t.get("www.twitter.com")
    #=> http://ast.amazonaws.com/?Action=Redirect&AWSAccessKeyId=YOUR_ACCESS_KEY_ID&Signature=sdhfiawrkjw3h9bncoa8ue&Timestamp=2007-06-14T09:09:18.000Z&Url=www.twitter.com&Size=Large

I'm working on a Rails plugin that will provide a view helper so you can easily do this from any template, but for now you'll have to be satisfied with the additional sample code available on the THUMBNAIL RDoc.

There are all kinds of other details available at the THUMBNAIL homepage. And the code is, of course, available to all for free under the MIT License. So, Go! Get your thumbs dirty! Enjoy!

Posted by Greg at 12:33 AM | Comments (2)

June 12, 2007

A Beginner's Guide to Practical Syntactic Magic: the tale of Hpricot's sudo-constructor

I spent much of today working with Hpricot. And so, as when spending significant solo time with any of why the lucky stiff's code, I found myself admiring all the neat little syntactic nicknacks strewn about to cozy up the place.

One of the best is the way you get started. Hpricot is a toolkit for parsing and manipulating XHTML. So, obviously enough, just about every time you invoke it, you're going to want to pass it an XHTML document so it can, you know, prep it for parsing and manipulation. And how do you do that? What's the syntax?

    Hpricot(my_document)

That's it. There's no "Hpricot::Base.new(my_document).parse" nonsense, or any of the other more or less torturous common options. Not a single character of syntax is wasted.

But, if you're a mere Ruby mortal, like me, you're probably looking at that code and going: 'Huh?' Isn't Hpricot a constant? It's capitalized. But it's taking an argument like a method. How is that even valid Ruby? How can the parser tell if it's a constant or a method?

Well, it turns out that there's no rule against having capitalized method names; the parser can tell it's a method because it's got an argument (or at least a pair of parentheses) after it. And that's all that's required for it to be sent off to method- instead of constant-dispatch (as Chris pointed out, this is one advantage of not having Ruby be "turtles all the way down"; Smalltalk couldn't do this).

Beyond providing fodder for a Language Nerd Attack, though, what's the upshot? How's this fact help the man on the street? Well: there's nothing actually sophisticated going on here. So: you can do it too.

Here's an admittedly contrived (and useless) example:

class Dogger
  def initialize
    puts "dog"
  end
end

def Dogger()
  Dogger.new
end

a simple class definition followed by a simple method invoking it.

Which leaves us with the ability to write two snippets of code that, while they may look nearly the same, do very different things:

>> Dogger
=> Dogger
>> Dogger()
dog
=> #<Dogger:0x15d2478>

and that is exactly from where _why's use of this little quirk derives its leverage. This trick makes you feel like you're invoking a constructor or calling some other kind of class method when you are, in fact, doing nothing of the sort. Just as our Dogger() method above needn't have done anything remotely related to the Dogger class, _why could have named his method Clown() or ChunkyBacon() while still calling Hpricot.parse(input, opts) inside it (which is exactly what Hpricot() does).
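
To make the delegation a bit more concrete, here's a minimal sketch of the same pattern with a made-up library. The Markup module, its parse method, and the Doc struct are all hypothetical stand-ins of mine, not Hpricot's actual internals; the only point is that a capitalized method can share its name with a constant and simply hand its arguments off to the real entry point.

```ruby
# Hypothetical stand-in for a parsing library; none of these names come from Hpricot.
module Markup
  Doc = Struct.new(:source, :options)

  # The "real" entry point that does the work.
  def self.parse(input, opts = {})
    Doc.new(input, opts)
  end
end

# A capitalized method with the same name as the module.
# Ruby treats the name as a method call when it is followed by arguments or parentheses.
def Markup(input, opts = {})
  Markup.parse(input, opts)
end

doc = Markup("<p>hello</p>")
puts doc.source    # the string we passed in
puts Markup.class  # the bare constant is still the module
```

The constant and the method live side by side without stepping on each other, which is all the trick needs.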

But his chosen usage is particularly inspired. In one fell swoop, he gives his whole complex feature-ful library a single welcoming point of entry. You need never concern yourself with the internal machinery; just heave a document over the transom and let the library figure out what to do with it. And this is the wider lesson of _why: real power comes from combining the playfulness (better: the insouciance) needed to probe, question, and even bend the limits of the language with the discipline and aesthetic sense required to use what you find not to obfuscate and confuse, but to write elegant and, above all, more humane code.

I mean, Hpricot would definitely not be a better library if that method was called ChunkyBacon(). Right?


Posted by Greg at 1:29 AM | Comments (10)

May 2, 2007

My First Mongrel Handler: A Recipe for providing a JSONP Callback

One fact of life is: every technical community has its own recurring bugaboo, an unshakeable criticism that dogs it through its greatest successes, enduring in spite of furious defense and fact marshaling by supporters. Linux? Won't work on the desktop. Lisp? No standard libraries.

In the Rails community our bugaboo is performance. Whether you're a hawk or a dove on the issue, you probably weren't drawn to the framework because of its performance, but because it makes building even complex web apps a joy. You're also probably more than a little sick of hearing about performance in Rails.

But sometimes you're building something dead simple; you may not need Rails' leverage. And sometimes performance may actually be a showstopper. What then?

We encountered both of these conditions recently building a system to provide some JSONP callback functionality. Here's the spec: given a request for the JSON representation of some resource and the name of a callback function, return the JSON wrapped as an argument to the named function. In other words, take a request like:
http://mydomain.com/resource/2d8dbb1119a8.js?callback=myFunction
And return a response in the form:
myFunction({...resource represented as JSON...})
The point is for the user's function to get executed on their page with our JSON as an argument, allowing for data integration without the domain restrictions of AJAX.
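
Stripped of everything else, the spec comes down to one line of string building. Here's a minimal sketch using Ruby's standard json library; the jsonp method name and the sample hash are just illustrations, not part of our actual code.

```ruby
require 'json'

# Wrap a Ruby object's JSON representation in a call to the named function.
def jsonp(callback, resource)
  "#{callback}(#{resource.to_json})"
end

puts jsonp("myFunction", { "id" => "2d8dbb1119a8" })
# prints myFunction({"id":"2d8dbb1119a8"})
```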

Now, what does the code on our side have to do? It has to get the Javascript representation of the resource located at "mydomain.com/resource/2d8dbb1119a8" and then wrap the result in the function named in the "callback" param. That's it. That's the whole feature. All we need to implement it is: access to our data and some kind of structured access to the request itself. And, since this feature is part of our public data API (and so could potentially get called programmatically by other people's code) it would be great if it was fast and cheap.

One way to meet these demands would be to write an action in each of our controllers that loaded up the right objects and concatenated their JSON representations with the contents of the callback param to compose the proper text response. But, this would pollute our existing code and use all kinds of unnecessary resources. There must be a better way. If only we had a piece of code lying around that specialized in parsing requests and serving up representations of our resources...

And, of course, we do: Mongrel, everyone's favorite pure-Ruby web server. As a web server, Mongrel speaks HTTP natively; as a kick ass web server, Mongrel is very high performing; and as our web server, it sees all of our requests anyway before they even reach Rails.

At this point, you probably won't be surprised to learn that Mongrel has a feature specifically designed to do this kind of thing: Mongrel handlers. For our purposes, you can think of Mongrel handlers like controllers: they examine the request, do something based on what they find there, and return a response. The Mongrel handler sees the request before it ever reaches our Rails app, so it doesn't have to load up cgi.rb or any of the other resource-intensive, performance-troubled parts of the framework. Unlike Rails, it can also handle multiple requests per Mongrel instance. All of which adds up to a major performance win.

So, how would we build our JSONP callback code in a Mongrel handler? This is how:

require 'open-uri'

class JsonpHandler < Mongrel::HttpHandler
  def process(request, response)
    response.start(200) do |head, out|
      head["Content-Type"] = "text/javascript"
      callback = Mongrel::HttpRequest.query_parse(request.params["QUERY_STRING"])["callback"]
      json = open("http://#{DEPLOY_DOMAIN}#{request.params["PATH_INFO"]}").read
      out.write "#{callback}(#{json})"
    end
  end
end

uri "/jsonp", :handler => JsonpHandler.new, :in_front => true

Let's go through this bit by bit starting with the last line. When our script gets loaded up on Mongrel's startup (more nitty gritty on that below) that line tells the server that our handler class should be invoked for any uri that matches 'http://mydomain.com/jsonp/*'. (The eagle eyed amongst you will notice that this means the url for requests to this service will have changed slightly from what we mentioned above: http://mydomain.com/jsonp/resource/2d8dbb1119a8.js?callback=myFunction instead of http://mydomain.com/resource/2d8dbb1119a8.js?callback=myFunction. This is the one price we pay for using Mongrel handlers; we need these requests to have something unique after the first slash so that Mongrel can know to intercept them before they reach Rails.) The in_front flag means that Mongrel should check for a match before waking up Rails at all.

When our handler gets invoked, its 'process' method gets called. As you can see, that method takes two arguments: the request and the response. We read the request to figure out what to do, writing the response as we go. We're going to return a response with code 200, so we invoke the start method with that as the argument. That method also takes a block with two local variables representing our response's header and its body (for more info, see the Mongrel::HttpResponse docs).

Inside the block, the first thing we do is set a header: since we're going to be returning executable Javascript, we set the appropriate Content-Type, "text/javascript". Now we get to the real action. We want to get the name of the callback value out of the query, so we grab the query string out of the request's params hash and use the Mongrel::HttpRequest's relevant class method to parse it. The result is a hash with key-value pairs representing everything after the "?" in our url. We pull the callback out of that and remember it.
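
If you want to see what that parsing step produces without firing up Mongrel, Ruby's standard library can do roughly the same job. This is a sketch using URI.decode_www_form as a stand-in for Mongrel's query_parse, not a look at Mongrel's own implementation, and the "format" param in the sample query string is made up for illustration.

```ruby
require 'uri'

# Everything after the "?" in the request url.
query_string = "callback=myFunction&format=js"

# Turn the query string into a hash of key-value pairs.
params = URI.decode_www_form(query_string).to_h

callback = params["callback"]
puts callback
# prints myFunction
```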

Now, we've gotten the name of the callback function out of the request, so we're halfway there. All we've got left is to get the JSON representation of our resource. We've got two choices: we could load up our actual Rails objects, or we could use their urls. If we wanted to go the first route, we'd end up using Merb, a whole microframework that combines Mongrel handlers with ActiveRecord and ERB templates to provide a complete lightweight alternative to Rails. We don't need to go that far. All we've got to do is make a request for our own resource's .js url and we'll have what we need. At first glance, this may seem to be a security flaw (since it causes us to grab a url that is interpolated from a request that comes in), but we're only going for urls on our own domain, and we're just storing their contents in a string and then returning them to the agent who asked. The worst someone could do would be to request an image file or other expensive resource, which would then be garbled by having some text prepended to it and returned to them. We use open-uri to request the resource: DEPLOY_DOMAIN is a project-wide constant (basically localhost or our real domain) and request.params["PATH_INFO"] gives us the request from the first slash up to the '?' -- just what we need to construct the request for our resource. (There may be some performance downside from having an open-uri call in the midst of a Mongrel handler, but I don't know nearly enough about threading to speak intelligently on the subject.)

The last line of the block just puts the pieces together and writes the result as the response's body.
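
Since the callback name comes straight from the query string and ends up in executable Javascript, one extra precaution you might consider (it's not in the handler above) is checking that it looks like a plain function name before interpolating it. Here's a sketch under that assumption; the SAFE_CALLBACK pattern and safe_jsonp method are hypothetical names of mine.

```ruby
# Accept only dotted identifiers such as "myFunction" or "app.handlers.onData".
SAFE_CALLBACK = /\A[a-zA-Z_]\w*(\.[a-zA-Z_]\w*)*\z/

# Build the JSONP response body, refusing callback names that contain
# anything other than identifier characters and dots.
def safe_jsonp(callback, json)
  raise ArgumentError, "bad callback name" unless callback.to_s =~ SAFE_CALLBACK
  "#{callback}(#{json})"
end

puts safe_jsonp("myFunction", '{"a":1}')
# prints myFunction({"a":1})
```

Inside a handler you'd probably want to turn that ArgumentError into a 400 response rather than let it bubble up.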

Once we've got this code written all we've got to do is tell Mongrel to load it up on launch:

$ mongrel_rails start -S path/to/jsonp_handler.rb

This post would not have been possible without: the slides from ezmobius's Merb/Mongrel talk. Read them for more (and much better informed) info on the subject.


Posted by Greg at 10:23 PM | Comments (1)

February 7, 2007

Dipping a Toe in the C

Last night, I took an unexpected jackknife into the underground wellspring of C-code that burbles beneath the calm surface of Ruby. As part of some preliminary research for a super secret and incredibly exciting project I'm planning, I learned about two related libraries, Ruby2C and Ruby Inline, that dip into that pool to accomplish opposite, but complementary, goals: translating Ruby into C for portability and flexibility, and writing custom Ruby methods in C in order to improve their performance.

Ruby was originally written in C and, for the most part, Ruby-related C coding is for the serious gray-bearded core language contributor. These two libraries give merely mortal programmers like me a chance to play with some of the power that comes from manipulating the language's internals. The downside of this dynamic is that I'm not smack dab in the center of the target audience for these projects the way I am with Rails, and so I ran into a whole series of unaccustomed obstacles and inconveniences in the course of my dive, including non-existent installation instructions, thin documentation, and incomplete, experimental code. While these may be familiar surroundings for the above-mentioned gray beards, they certainly aren't for me, and so I thought I'd take a moment to document what I learned about dealing with them on behalf of the next poor, desperately Googling soul to follow in my footsteps.

Take Ruby2C, or, should I call it "ruby_to_ansi_c"? Part of the Metaruby project, which aims -- seemingly in a spirit of pure language geekery -- to rewrite Ruby's core classes in Ruby itself, this library provides machinery for translating ruby code into its C equivalent. Beyond whatever self-referentialist uses this might have, there's definitely a practical upside to the portability it provides. I don't want to tip my hand too thoroughly, but I can think of some neat places where I'd like to stick code that require C for entry.

'So far, this all sounds fine and dandy,' you say 'so what's the problem?' Well, my rhetorical question about the library's name hints at one facet of the unfriendliness involved. While the library usually calls itself Ruby2C in public, that is, in fact, almost never formally its name. Its rubygem goes by RubyToC and it includes two parallel libraries, one called "ruby_to_ansi_c" and one called "ruby_to_ruby_c". And, when it comes to code, names matter: installation, inclusion, and invocation are all impossible without getting them exactly right.

Another indicator of the size of the problem is the section of the README.txt that falls under the heading of Installation: "Um. Please don't install this crap yet..." So, I guess that leaves you with me. Anyway, without further ado, here's what I learned from installing Ruby2C and getting to hello world with it:

Install happens, like with pretty much any other gem (make sure you get the capitalization right), thusly:

$ sudo gem install RubyToC

Now that we've got the library, let's write some code. We're going to write a class that, when translated into C will just print some text to the screen when compiled and run:

require 'ruby_to_ansi_c'
class MyTest
  def say_hello
    puts "hello"
  end

  def main
    say_hello
    return 0
  end
end
Note the require line. The use of the "main" method is just a cute little hack from one of the RubyToC examples that makes the resulting program runnable. When translated, the new program will have a "main" function which is what gets called when you run a compiled C program from the command line.

Now, let's go ahead and do the translation:

result = RubyToAnsiC.translate_all_of MyTest
puts result
This produces the following C code:
long main();
void say_hello();

long
main() {
say_hello();
return 0;
}

void
say_hello() {
puts("hello");
}
That C source may look a little funny since the RubyToC generator isn't much for producing aesthetic whitespace, but it'll compile and run, which is what matters. To test it out, copy that snippet into my_c_test.c and do the following:
$ gcc my_c_test.c -o my_c_test
$ ./my_c_test
hello
We could also just build the C for one individual method, like so:
RubyToAnsiC.translate(MyTest, "say_hello")
which would return just the second of the two functions in the above C source.

All of this is pretty simple and very powerful once you get it working. Of course, the code itself has a beautiful and clean user interface (in the form of these class 'translate' methods). It's just the websites and documentation that suck!

Now, the opposite of the ability to convert Ruby to C is the ability to write your own Ruby methods in C, just like the gray beards do. Unlike its converse, this process has obvious and wide-ranging benefits in the form of significant performance enhancements. Ruby is a high level language and not an especially zippy one. C, being closer to the machine, will almost always take care of an equivalent task in less time. The point of Ruby Inline is to let you rewrite the biggest performance choke points of your code in C to speed them along. Here's a basic usage example that adds an instance method called "say_hello" to our MyTest class that simply prints the text "hello" as many times as is asked:

require 'rubygems'
require 'inline'

class MyTest
  inline do |builder|
    builder.c <<-CODE
      void say_hello(int i){
        int n = 0;
        while(n < i){
          puts("hello");
          n++;
        }
      }
    CODE
  end
end

MyTest.new.say_hello 10
While this is a trivial example, its form is a common one for optimization based on Ruby Inline: a central slow loop or algorithm that we plan on running many times.
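
Before rewriting anything in C, it helps to confirm where the time is actually going. Here's a minimal sketch using the Benchmark module from Ruby's standard library to time a candidate loop; slow_sum is a made-up stand-in for whatever hot spot you suspect in your own code.

```ruby
require 'benchmark'

# Hypothetical hot spot: a plain Ruby loop we might later move into C.
def slow_sum(n)
  total = 0
  i = 0
  while i < n
    total += i
    i += 1
  end
  total
end

# Benchmark.realtime returns the elapsed wall-clock time in seconds as a Float.
elapsed = Benchmark.realtime { slow_sum(1_000_000) }
puts "slow_sum took #{elapsed} seconds"
```

Timing the same call before and after an Inline rewrite tells you whether the C version was worth the trouble.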

A few things of note. Ruby Inline offers a number of options for including libraries, providing compilation instructions, and such. Take a good look at the documentation for details. Also, I've had some problems while playing with Ruby Inline in irb. They tend to take the form of "Errno::ENOENT: No such file or directory" errors. I don't think that this kind of code was really meant to run in a shell. It wants a source file to compile from and a static place to compile to. In addition, I think that wirble, a set of irb-enhancing tools I use, exacerbates things. In summary, run your Inline examples from files.


Posted by Greg at 3:04 AM | Comments (0)

January 26, 2007

Subverting Twitter

If it hasn't yet skittered across your radar, Twitter is a tiny web app that works a little like a universal IM away message. Created as a side project by Ev Williams and the gang at Odeo, Twitter lets you post tiny fragments of text (announcements of where you are and what you're doing, random thoughts and observations, etc.) via text message, IM, or the web. No more than 150 characters are allowed. It then broadcasts those messages to all of your friends and 'followers' and posts them to your own customizable page.

Here, for example, is my twitter page (warning: clicking may cause extreme boredom).

Anyway, it may be totally contrary to the spirit of such a pleasantly pointless thing, but I think I've found a use for Twitter that is actually…uh…useful: publishing commit messages from version-controlled coding projects.

Commit messages are the atomic unit of change in a project and can be the best way of keeping up with programming progress. Unfortunately, they all too often end up banished to obscure and unwieldy diff logs, never to be regularly read. And worse, the knowledge of this sad fate leads harried coders to be lazy and uninformative in their composition, making commit messages doubly useless.

Maybe, publishing our commits to project-specific twitter pages, which our collaborators (and customers, and bosses) can follow in real time, will get us to give our messages some real zing as they become, not lost log entries, but comments in a conversation.

Or, at least, that's the theory, anyway. To test it out, I whipped up a solution for my current coding environment: a Rails project managed under Subversion. Specifically, I wrote a Rake task that prompts you for a commit message, runs the necessary svn commands to commit your code, and then forwards your commit message on to Twitter.

Well, more accurately, I extended Chris's svn tools plugin so that it uses addictedtonew's ruby twitter API library to post the message. Really, I did almost none of the work here, just tied together some real projects by others much more skilled than myself. Here's how you can do the same:

The first step is to get the svn tools plugin installed if you don't already have it. Chris mentioned today that he's thinking about refactoring it into a gem for use in Ruby projects more generally, but for now you can install it in your current Rails project like so:

> script/plugin install http://svn.rtra.in/public/plugins/svn_tools/

If you installed the plugin before January 26, 2007, you should reinstall it now, since Chris kindly made a little tweak to it to make my hack possible:

> script/plugin install http://svn.rtra.in/public/plugins/svn_tools/ --force

Once that's done, you're most of the way there. The next step is to install the twitter gem with hpricot upon which it depends:

> sudo gem install hpricot --source code.whytheluckystiff.net
> sudo gem install twitter
If you've already got the most avant-garde version of hpricot (today, it was 0.4.2), then you can skip the first of these two lines, but without it things will be quite bumpy.

Now, finally, all that's left is to add a new task to your project's own Rakefile. Anywhere in there (but not inside of a pre-existing namespace), add:

namespace :svn do
  task :twitter => :commit do
    email = 'me@mydomain.com'
    password = 'mydogsname'
    Twitter::Base.new(email, password).post(@message) 
  end
end
Obviously, this needs to get filled out with a valid twitter email address and password. And don't forget to stick "require 'twitter'" somewhere above this to make the gem accessible.
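
If you'd rather not leave a real password sitting in a Rakefile that gets committed along with everything else, one option is to read the credentials from the environment instead. This is only a sketch: the TWITTER_EMAIL and TWITTER_PASSWORD variable names and the twitter_credentials helper are my own inventions, not something the twitter gem or svn tools plugin provides.

```ruby
# Look up the login details in the given environment (ENV by default).
# fetch raises a KeyError if either variable is missing, so a
# misconfigured machine fails loudly instead of posting with blank credentials.
def twitter_credentials(env = ENV)
  [env.fetch("TWITTER_EMAIL"), env.fetch("TWITTER_PASSWORD")]
end

# Passing a plain hash here just to show the shape of the result.
email, password = twitter_credentials({
  "TWITTER_EMAIL"    => "me@mydomain.com",
  "TWITTER_PASSWORD" => "mydogsname"
})
puts email
```

Inside the task you would then call twitter_credentials with no arguments and pass the two values on to the twitter gem as before.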

That's it, you're totally set up. Run it from your project's root directory with:

> rake svn:twitter
It will prompt you for the commit message, commit your code, and then send your message off to twitter, as promised.

You can see an example of this system in action by following the twitter page for the commits on grabb.it, a new semi-super-secret mfdz skunkworks project Chris and I are working on.

If you try it out, stop by and let me know how it works for you. Plus, if there's demand, I can always package this up as a gem once Chris does likewise with his svn tools, which would greatly simplify the install process and make it valid for non-Rails projects as well.

Ok, Twitter away!


Posted by Greg at 2:58 AM | Comments (5)

December 9, 2006

Getting ComputerKrafty: Arduino, Ruby, and Blurry Video of Some Blinking LEDs

(Arduino Serial Ruby on YouTube)

For the last month or so, Brett and Marcus from Tables Turned and I have been meeting weekly to teach ourselves Physical Computing, the use of micro-controllers like those found in cell phones and Roombas to build all kinds of interactive projects, from multimedia installations to scientific equipment.

We're using Arduino, a cheap and simple micro-controller chip and programming framework that's great for beginners. Between the three of us, we've got lots of ambitious projects we'd like to build, from immersive sound installations to wifi-enabled street walking robots, but in order to learn the basics, we're starting with a pretty simple project: building our own version of the children's toy Simon. If you're interested, you can follow our progress on the ComputerKraft wiki.

The two videos I've posted here show some early experiments we tried out while learning the ropes. The one below is amongst the first things we ever tried: reading the analog input from a knob and using its position to light up a changing number of LEDs.

The video at the top is from this week and I'm pretty proud of it. It shows a Ruby program running on my computer that reads input from a user and then lights up a different LED depending on what number it receives. This doesn't sound too impressive; after all, it's just another 'hello world'. But the elements involved are really exciting to me. With them in place, pretty much anything you can do in Ruby scripts, Arduino can know about -- reading RSS feeds, looking for files, user input, etc. Plus, from here, it doesn't take much more to get the interaction to flow both ways: when Arduino does something or senses something, it can get sent off to a Ruby program and from thence to files, the web etc.

If you're curious to know more about the technical details, you can check out the Ruby/serial demo page on the ComputerKraft wiki. It's got both the Ruby and C source code as well as an explanation of the hardware and links for downloading the ruby/serialport library (which does, in fact, work on OS X even though their documentation gives you little confidence that it would). Or, if Ruby's not your thing, you can check out Todbot's C code for doing this manually from the command line to accomplish something similar.

(Dancing LEDs on YouTube)


Posted by Greg at 4:47 PM | Comments (7)

October 20, 2006

learns_to use Expect for Easy Automation

One of the great hopes you might have in beginning to learn about technology and computers is that they will save you time and effort. This is such an obvious expectation that it almost goes without saying, but, in my experience, it is rarely fulfilled and really unrelated to the true joy of technological learning. That joy comes in gaining whole new abilities, not in slightly improving existing capacities. I was motivated to learn what I have about the web and programming because I wanted to publish my thoughts and my music for anyone in the world to read and hear and there was simply no other feasible way for me to do that. As my technical capacity has grown, I've come up with new ideas for things I wanted to do and make that I had never even known were possible. And now these ideas themselves drive me deeper into the technology in order to realize them.

Given this dynamic, I was a little shocked recently to come across Expect. For once, here's a command line utility that offers a staggering productivity increase without the attendant black hole of necessary technical mastery.

Expect is a tool for automating interactions with other programs. Expect scripts allow you to start up a program and then have the computer use it in your stead. In your Expect script, you write out a dialogue for the interaction, e.g. 'if the program says that, respond with this,' and then the script holds up your side of the 'conversation' with the program, providing feedback, entering inputs, making simple decisions.

Why is this useful? With Expect, you can write scripts that fire off relatively complex interactions with a single command, so you don't have to remember all the individual sub-steps. Or, even sexier, you can automate multi-stage tasks you've previously had to do by hand so that you can trigger them with cron so you never have to think about them ever again.

This may sound fuzzy and abstract so far, but Expect scripts actually turn out to be a cinch to write. As proof, I'll show you the simple script I worked up last night to automate my daily "production process" for Largehearted Goat. In my original post on the subject, I mentioned that the code behind Largehearted Goat required "just a little hand holding." Here's what was involved: (1) run the ruby script which reads the Largehearted Boy RSS feed, finds the Goats, and rewrites the html, (2) sftp into my web hosting and copy the new html file over the existing one being served up to Largehearted Goat. And here's the Expect script I worked up to get it all done (paths and passwords have been changed to protect the innocent):

#!/usr/local/bin/expect -f
spawn ruby /path/to/goat/script/goats.rb
expect eof
spawn sftp mylogin@myhost.com
expect -exact "Password:"
send "MySecretPassword\n"
expect "sftp>"
send "put /path/to/my/new/html/file/goat.html path/to/my/online/goat/directory/\n"
expect "sftp>"
send "exit\n"
expect eof

So, here's how this works. The first line is just a necessary invocation to allow the Expect utility to read a set of commands from a file. The "spawn" command tells expect to start up a process, in the case of the second line, there, I'm running my ruby script. Already here, we have a big advantage over some other shell scripting choices available out there. Step (2), which I described above, only works properly if my ruby script has already been run. Otherwise, it would send the old version of the html up to the web and www.largeheartedgoat.com wouldn't change. Expect makes it incredibly easy to wait for the completion of that script. All we have to say is "expect eof" (for End Of File). That line tells Expect to wait for control to be returned to it from the previous process that it spawned before proceeding on.
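
The "don't start step two until step one has finished" idea isn't unique to Expect. For comparison, here's a sketch of the same ordering in plain Ruby using Kernel#system, which blocks until the command exits and reports whether it succeeded. It's not Expect and it can't answer interactive prompts; the run_steps helper and the two do-nothing commands are placeholders of mine standing in for the real goat script and upload.

```ruby
require 'rbconfig'

# Run each command in order, waiting for it to finish.
# Stop and return false at the first one that fails or can't be started.
def run_steps(steps)
  steps.each do |argv|
    return false unless system(*argv)
  end
  true
end

ruby = RbConfig.ruby  # path to the ruby interpreter running this script

ok = run_steps([
  [ruby, "-e", "exit 0"],  # stand-in for the script that rebuilds the html
  [ruby, "-e", "exit 0"]   # stand-in for the upload step
])
puts ok
```

What Expect adds on top of this is the ability to hold a conversation with the second program once it's running.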

Once the ruby script is done running, it's time to go ahead and ftp the new html file into place. Since my host requires ssh for login, I've got to use SFTP (Secure File Transfer Protocol), which I invoke with the next spawn line. From here on in, all I'm really doing is alternating prompts I "expect" to see from SFTP with commands I want to "send" to it. One thing to watch out for: if one of these "expect" conditions isn't met, Expect waits for its timeout (10 seconds by default) and then carries on with the next command anyway. So if you want the script to shut down rather than run roughshod over your files, you have to handle the timeout case yourself.

So, yeah, it's pretty easy. If you can do this task once by hand using SFTP, you can write this Expect interaction no problem. The only clever thing going on is the use of the new line when submitting a command, like so:

send "MySecretPassword\n"

Think of this line break as hitting 'return' in order to actually submit the command.

Now, once your script is written, all you've got to do is make it executable by running 'chmod +x' on it and then actually call it like

$ expect my_new_script

You should see all of the normal output of your commands scroll by in the terminal. And once you've got it working, you can check out this great crontab tutorial to set it up to run automatically!

I've only barely scratched the surface here of what Expect can do. It's a real programming language, allowing branching based on the response of the program you're interacting with and a full vocabulary for logic and variables, etc. But even with just this limited Expect vocabulary, I bet you can save yourself a ton of time. Is there a simple process like this that you have to do everyday? Automate it. Is there a complicated interaction you only have to do every once in a long while whose commands you always forget and so have to spend an hour re-googling? Next time you do it, capture it in an Expect script, save it somewhere and then just run it when you need it. Could you spend a long time fiddling with all the different options, improving your Expect chops? Sure you could. But why would you? This one's easy. This one's for getting things done.


Posted by Greg at 6:00 PM | Comments (4)

September 10, 2006

learns_to Modules and Namespaces: Lessons from Wrapping the del.icio.us api

Last night, I started working on putting together a Ruby-wrapper for the del.icio.us api. I need it to execute this little idea I had recently (more about that when it's done) and I was surprised to find that there wasn't anything too useful out there -- though it's probably because the api is so easy to use you barely need a wrapper around it for most projects. There were a few libraries, but nothing really clean and complete and nothing using the new v1 of the api.

Anyway, in the course of working on the wrapper, I came across a common problem: the need for multiple namespaces. In the api, method names are not unique across objects. For example, there's a method that gets posts for a user and one that gets tags, both called "get" (api.del.icio.us/v1/posts/get and api.del.icio.us/v1/tags/get, respectively). Obviously, those urls leave no confusion as to which "get" method gets which type of object. The question is: what device in ruby should I use to capture this with equivalent clarity?

Two strategies occurred to me immediately: modules and subclassing. According to the relevant section of Programming Ruby, "modules are a way of grouping together methods, classes, and constants. . .[They] provide a namespace and prevent name clashes." Well, that sounds like exactly what I want to do. I want to group together the api methods for posts so that they don't pollute the namespace for tags. Under this design, I would have multiple modules within my main class, one with the methods for each api "object," posts, tags, bundles, and whatnot.

So, to see if this would actually work, I ginned up a simple example of using modules inside a class. This is what it looked like:

#namespaced class methods
class Test
  module Gar
    def self.to_s
      puts "gar!"
    end
  end

  module Bax
    def self.to_s
      puts "bax!"
    end
  end
end

Test::Gar.to_s
Test::Bax.to_s

If you ruby this you'll see this output:

gar!
bax!

In other words, it seems to work for class methods.

But what about instance methods? I made my toy example a little more complicated:

class Best
  attr_accessor :dog

  def initialize
    @dog = "bot!"
  end

  module Gar
    def self.set_dog
      @dog = "gar!"
    end
  end

  module Bax
    def self.set_dog
      @dog = "bax!"
    end
  end
end

t = Best.new
puts t.dog
Best::Gar.set_dog
puts t.dog
Best::Bax.set_dog
puts t.dog

Unfortunately, this doesn't seem to work. The modules can't get access to the instance variable, @dog. The output ends up looking like this:

bot!
bot!
bot!
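
The reason, as far as I can tell, is that inside "def self.set_dog" the value of self is the module itself, so the assignment creates an instance variable on the module rather than on any Best object. Here's a small sketch that checks this, trimmed down to one module and using instance_variable_get to peek at where @dog actually ended up.

```ruby
class Best
  attr_accessor :dog

  def initialize
    @dog = "bot!"
  end

  module Gar
    # self here is the module Best::Gar, not an instance of Best,
    # so @dog becomes an instance variable of the module.
    def self.set_dog
      @dog = "gar!"
    end
  end
end

t = Best.new
Best::Gar.set_dog

puts t.dog                                   # prints bot! (the instance is untouched)
puts Best::Gar.instance_variable_get(:@dog)  # prints gar! (the module got its own @dog)
```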

This means that I'm thrown back to trying to solve the problem with regular subclassing. I'll be defining a series of classes like this:

    class Relicious
      attr_accessor :username, :password
      # my main class, connects to del.icio.us, etc.
    end

    class Post < Relicious
      def get
        # call the posts/get url
      end
    end

    class Tag < Relicious
      def get
        # call the tags/get url
      end
    end

That way, each separate subclass can implement identically-named methods with no danger of namespace confusion. My initial instinct was that this pattern was slightly less elegant than what I was trying to achieve with modules because the subclasses all have to access the centralized connection methods and such in the parent class. The resulting usage code looks like this:

    post = Post.new
    post.username = "myusername"
    post.password = "mypassword"
    post.get

which is ugly (a post doesn't really have a username) and inefficient (you'd have to set the username and password attributes fresh if you called Tag.new since you'd have a new instance).

Thankfully, today Chris proposed a better solution, which, in retrospect, should have been obvious to me: wrapping up the child objects inside of accessors in the parent class and then only ever accessing them from there. This would turn the above usage code into this:

    rel = Relicious.new
    rel.username = "myusername"
    rel.password = "mypassword"
    posts = rel.posts.get

The namespace problem is solved, everything is meaningfully encapsulated, and the syntax is concise and clear. Sounds like good design. Now, all that's left is to actually implement it. . .
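One way this design might be laid out, as a rough sketch rather than the eventual implementation: the inner class names Posts and Tags and the request_url helper are my own placeholders, and nothing here talks to the network. It only builds the URL each call would hit.

```ruby
class Relicious
  attr_accessor :username, :password

  # Each accessor hands back a small helper object that remembers its parent,
  # so the credentials only ever live in one place.
  def posts
    @posts ||= Posts.new(self)
  end

  def tags
    @tags ||= Tags.new(self)
  end

  # Placeholder for the real HTTP call: just builds the URL it would request.
  def request_url(path)
    "https://api.del.icio.us/v1/#{path}"
  end

  class Posts
    def initialize(connection)
      @connection = connection
    end

    def get
      @connection.request_url("posts/get")
    end
  end

  class Tags
    def initialize(connection)
      @connection = connection
    end

    def get
      @connection.request_url("tags/get")
    end
  end
end

rel = Relicious.new
rel.username = "myusername"
rel.password = "mypassword"
puts rel.posts.get
puts rel.tags.get
```

Both helpers define a method called "get" without stepping on each other, and both reach the shared connection through the parent they were handed.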


Posted by Greg at 6:19 PM | Comments (2)

August 28, 2006

learns_to build Academic Archive::Part 2:Setting up a New Rails App and a First Iteration on the Paper Model, Featuring our First Tests

Welcome to Part 2 of learns_to build Academic Archive, where I try to blog every last detail involved in building a Ruby on Rails application for publishing and peer-editing academic papers. As requested by Benjamin in the comments on Part 1, from now on, I'll be providing a table of contents to each post. So if you're looking for some specific piece of knowledge, you can jump right into the middle to get it. If you have any other ideas on how to make this series better, I'd love to hear about them in the comments.

Contents

  1. Creating a new Rails project
  2. Designing the Paper Model
  3. Setting Up the Database and Generating the Model
  4. Validating the Presence of Papers' Titles
  5. Getting Started with Testing: Fixtures
  6. Testing the Fixtures: Our First Test and First Test Helper
  7. Running Tests: Under Rake, Under Ruby
  8. How To Write a Test: Given, When, Then
  9. Philosophy of Testing

Well, we're airborne now. I posted Part 1 just before boarding a flight for LA and we just reached our cruising altitude.

At the end of Part 1, we'd thought our way through to a good starting design for the whole app and we were ready to start writing some real code. Specifically, we wanted to start with our central object: the Paper model. But before we write even our first line, we've got to do some setup and the tiniest bit more thinking.

Creating a new Rails project

First things first: run the "rails" command to generate the spine of a new Rails application in the file system:

    gabc:~/Sites Greg$ rails archive

I ran this command from my "Sites" directory where I keep all my projects. It will generate a new folder in there called "archive" and inside it will create a whole bunch of files and folders which constitute a fresh default Rails application.

If you cd into this directory and run "rails --version" you may find that you've got an old version of the framework (mine was at 1.1.2). Rails is a relatively new framework and it's undergoing a ton of rapid development. This is good because it means that new features get added all the time which make your job easier and old bugs get fixed. To take full advantage of this situation, we want to always be running the most recent version (as I write this it's 1.1.6). Thankfully all this takes is a single command:

    gabc:~/Sites/archive Greg$ rake rails:freeze:edge

We're using Rake, the handy-dandy Ruby build utility. Rake automates common ruby programming tasks like creating, writing, and running files (especially tests). We'll be using Rake constantly in the setup and development of our app; to see all that it can do run "rake -T" and you'll see a list of all the available rake commands with their descriptions. This particular rake command goes out and grabs the current development ("edge") code for Rails from its Subversion repository and puts it in our app's vendor/rails directory, so the app runs on that copy rather than on whatever version is installed on the machine. It's a snapshot, so run it again whenever you want to pick up newer changes. When you run it, you'll probably see a bunch of subversion changes scroll down your screen as the framework gets updated.

Now, I've got to confess that I did all of this setup so far at home last night. I knew that I'd be working without internet access while I was traveling and obviously commands like "rake rails:freeze:edge" have to go out over the wire to get their job done. Also, since I was going to be traveling, I wanted to grab a local copy of the Rails documentation which I normally use online. So, if you're working with dependable web access you might skip this step, but it's nice to know how for when you need it:

    gabc:~/Sites/archive Greg$ rake doc:rails

Rake will go ahead and check to see if you've got any of the documentation, downloading it and installing it in your project's doc/api directory where you don't. It will take a good chunk of time and download a whole bunch of files.

Designing the Paper Model

Ok, we're good to go. Setup is done. We could start generating app-specific files and writing code right now if we wanted, but just the slightest bit more thinking and note-taking is probably in order first. We decided at the end of our last post that we were going to start work by building papers and then the surrounding paper-approval-category relationship. What we didn't discuss was any of the specifics of the Paper model itself. What is a "paper" really? What attributes does it have? Is that really the right name for it? During the electronic blackout period of our ascent here, I sketched some answers in my moleskine. I'll explain them now.

Oops. Speaking of electronic blackouts, I lost battery power just as I polished off that last paragraph. I spent the rest of the flight into LA napping and reading. Not altogether unpleasant. Now, I'm in the corner of an LAX gate about 100 yards from where my flight will board, hunched over the only open outlet in the vicinity, trying to catch a quick charge before my flight for NY boards in 45 minutes.

Anyway, the last question that I asked in the air over Oregon may seem kind of nit-picky, but when it comes to domain modeling, the names we choose for things turn out to be surprisingly important. They should be expressive and unambiguous. We need to be able to remember what they mean without confusion upon returning to our code after a long break. A good rule of thumb is: would this name make sense to someone who knows about the domain, but is not in any way a coder? For example, we could call our main object Article instead of Paper. Usage differs even within academia. In the humanities they tend to be papers when delivered at conferences and articles when printed in journals. Students and teachers think of them as papers. Engineers and scientists tend to lean towards papers as well -- for them "article" has a more formal ring to it. I chose paper instead of article because it has less linguistic ambiguity and talking about "an editor's articles" makes me think of parts of speech as much as written documents. You'll find as we go along that I do some hand wringing each time a new name needs to be coined. The process is even tougher when dealing with join models and other nouns that don't have a precise correlation to words in the real world (at work right now we're thinking about changing the name of a model from Batch to Batching because it really represents an event wherein some things are joined together into a batch. Both of those choices sound ugly and are confusing in different contexts).

So, what attributes does a Paper have? Here's a transcription of the sketch I made on my way in from Portland:

  • title
  • created_at
  • updated_at
  • url?
  • file_column?

The first attribute is pretty self-explanatory. The next two are time stamps; created_at tells you when the paper first entered our system and updated_at when it was last changed. These are pretty standard in database-driven web apps and if you include them on your models in a Rails app, Rails will automatically make sure that they get set in the way you'd expect.

A note here about attributes and the role of the database in a Rails app. So far, we've talked about our models in terms of the way they capture real world objects into the abstraction of our design. From another point of view, though, our models are simply representations in code of the database tables we're going to create. The database acts as persistent memory for our program. Here's how it works. At various points along the way, for example when we create a fresh object, the instance of our model will correspond exactly to the state of one row in our database. In concrete terms, if we wrote:

    thesis = Paper.create :title => "It's Not Just Academic"

then the object stored in "thesis" would correspond exactly with a row in the papers table. Each of its attribute-reader methods would return precisely the values of the corresponding columns in the database. Now, say we start changing the values of our paper's attributes like so:

    thesis.title = "It Is Just Academic"

Well now the object we have in memory, the paper we're working with in our Ruby code, has diverged from the corresponding paper that we've got saved in the database. This will remain true until we call either "save" or "reload". Calling "save" makes Rails write our version of the object to the database, updating each of the columns so they represent the current values of the attributes. Calling "reload" makes Rails revert the paper we've got in memory to the state stored in the database: attributes get reset to the values of their corresponding columns, and whatever information we'd placed into them is overwritten.
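The divergence between memory and database can be mimicked in plain Ruby. This is only a toy, not how ActiveRecord is built: the ToyPaper class and its ROWS hash are invented for illustration, with the hash playing the part of the papers table.

```ruby
# ToyPaper and ROWS are made-up stand-ins, not part of Rails.
class ToyPaper
  ROWS = {} # plays the part of the papers table: id => { column => value }

  attr_accessor :title
  attr_reader :id

  def self.create(attrs)
    paper = new(attrs)
    paper.save
    paper
  end

  def initialize(attrs = {})
    @title = attrs[:title]
  end

  # Copy the in-memory attributes into the "table".
  def save
    @id ||= ROWS.size + 1
    ROWS[@id] = { :title => @title }
    true
  end

  # Throw away in-memory changes and re-read the stored row.
  def reload
    @title = ROWS.fetch(@id)[:title]
    self
  end
end

thesis = ToyPaper.create(:title => "It's Not Just Academic")

# Change the object in memory; the stored row is not touched yet.
thesis.title = "It Is Just Academic"
stored_before_save = ToyPaper::ROWS[thesis.id][:title]

# reload discards the in-memory change.
thesis.reload
title_after_reload = thesis.title

# save pushes the in-memory value into the row.
thesis.title = "It Is Just Academic"
thesis.save
stored_after_save = ToyPaper::ROWS[thesis.id][:title]

puts stored_before_save
puts title_after_reload
puts stored_after_save
```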

The last two attributes on our Paper model, url and file_column, represent two different ideas I had for keeping track of the location of the actual HTML files that our authors upload. The first and simpler of the two (the one I'll probably start with, in other words) is url. That would just be a string that keeps track of the location in the file system to which we uploaded the HTML file. Under this system, the part of our code that accepts uploads will have to be sure to record the uploaded-file's name so that we'll know where to look for it and how to link to it. The other option "file_column" represents an option I know a little less about, the File Column Plugin. I've never actually used it myself, but I've heard tell of a Rails plugin that allows you to store uploaded files in the actual database itself, handling all of the conversion code so that you can access the file from the database just as you would any other attribute stored there. That sounds intriguing and probably has important optimization repercussions (in other words, it probably plays a big part in determining what resource the application will consume most voraciously: memory on disk, database calls, processor time, etc.). Right now, storing the url as a string seems simpler to me so I'm going to start with that while making a note that the file column plugin is something I should look into more closely later.

Setting Up the Database and Generating the Model

Now that does it for theory and it's time to start actually coding our app (finally!). Wait. Wait. I just realized we've got one more small piece of configuration business to take care of: setting up and configuring the database. This bit is easy and once you've made a few Rails apps you'll be able to do it by rote. There are a ton of different combinations of databases, database engines, operating systems, etc. out there, so I'm just going to tell you what I have to do to get setup. If you're running on a contemporary Mac with a well-configured copy of MySQL things shouldn't be too different for you. If not, Google around, there are plenty of resources out there to help you get things right. Here we go:

First I've got to create the trio of databases on which a Rails app depends: development, test, and production. I'll do this from the command line:

    gabc:~/Sites/archive Greg$ mysql -p -u root
    (type your root password)
    mysql> create database archive_development;
    mysql> create database archive_test;
    mysql> create database archive_production;
    mysql> exit

Then, I'll open up config/database.yml and add my MySQL password to each of the three entries. Now we should be totally good to go. Serious this time. Let's run the server just to make sure:

    gabc:~/Sites/archive Greg$ mongrel_rails start -d

Bringing up localhost:3000 in my browser I see: "Welcome aboard: You're riding the Rails!"

At last, it's time to get started on our Paper model. First I'll run the Rails model generator to get all of the files I'll need created and set up:

    gabc:~/Sites/archive Greg$ script/generate model Paper

This'll give us, in addition to the model itself, a unit test and fixtures that are all set up and ready to go as well as a migration for setting up the database to handle our new model.

I'll write the migration next since we've basically done all the work already when thinking about what attributes our papers need to have. Here it is (archive/db/migrate/001_create_papers.rb):

    class CreatePapers < ActiveRecord::Migration
      def self.up
        create_table :papers do |t|
          t.column :title, :string
          t.column :url, :string
          t.column :updated_at, :datetime
          t.column :created_at, :datetime
        end
      end

      def self.down
        drop_table :papers
      end
    end

The generator left me with empty self.up and self.down methods, which I've filled in to create the papers table with all the proper fields. Like I said above, the table that corresponds to our model is basically just another view on our model. When we save an individual Paper object the table will store the values that we've assigned to the object. And Rails provides us with convenient methods for reading them back out again. In a minute we'll get to using those, but first let's actually run our migration:

    gabc:~/Sites/archive Greg$ rake migrate

Now the papers table exists and has the right fields. We can even go in right away and make a paper by hand if we want via Rails' "console", a shell the framework provides for interacting directly with our data. The console is a great place to sift through your data by hand or try out expressions when you're working on writing custom methods:

    gabc:~/Sites/archive Greg$ script/console
    >> thesis = Paper.new :title => "It's Not Just Academic"
    => #<Paper:0x26b6e5c @attributes={"updated_at"=>nil, "title"=>"It's Not Just Academic", "url"=>nil, "created_at"=>nil}, @new_record=true>
    >> Paper.count
    => 0
    >> thesis.save
    => true
    >> Paper.count
    => 1
    >> thesis
    => #<Paper:0x26b6e5c @attributes={"updated_at"=>Mon Aug 21 14:47:04 EDT 2006, "title"=>"It's Not Just Academic", "url"=>nil, "id"=>1, "created_at"=>Mon Aug 21 14:47:04 EDT 2006}, @new_record=false, @errors=#<ActiveRecord::Errors:0x2637a6c @base=#<Paper:0x26b6e5c ...>, @errors={}>>
    >> thesis.title
    => "It's Not Just Academic"

If you follow along with that input, you'll see that I made a new paper with the title "It's Not Just Academic," storing it in a local variable called "thesis". Since I hadn't yet saved the new paper, there were still no papers to be found in the database. Then I did save it, which succeeded, returning true, and re-counted the papers in the database to discover that it was there now. Next, I looked at the object stored in thesis to find a paper different from the one I'd originally put there. It now had non-nil values for "created_at" and "updated_at" along with an additional instance variable by the name of @errors where Rails would store any errors that it happened upon while saving the object (you can read out the current errors on any object by saying something like this: thesis.errors.full_messages). And finally I used a method automatically added by Rails to read off the thesis's title attribute.

Validating the Presence of Papers' Titles

Ok. Now that we're past the total basics of getting our Paper model up and running, we can actually start doing something with it. What do we want the Paper model to do? Well, from when we thought about our screens earlier we know that when users upload papers they're going to be giving us two things: the title, and the HTML file. We're then going to need to store the title in the database, store the file in the filesystem, and store the file's location in the database as well, specifically in the url field we added to the papers table. It would be great if we could give the papers nice urls. For example, I'd love it if the url for my thesis could be something along the lines of: www.academicarchive.org/borenstein/art_history/its_not_just_academic.html. Now I don't want to think too hard about the "/borenstein/art_history" part right now because that's going to have to do with routing and right now I'm trying to concentrate on the Paper model. What I do know from this is that we don't want to save any papers into the database that don't have titles and we're going to want to figure out a system for making the titles our users give us safe to use as urls (there are rules about what can and can't be in a url, i.e. you can't have spaces, can't have apostrophes, they have to be under a certain length, etc.).
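To make the url-safety idea concrete, here's a quick sketch of one possible approach. The helper name url_safe_title, the 50-character cap, and the exact rules (lowercase, drop apostrophes, underscores for everything else) are all my assumptions for illustration; the real rules are still to be decided.

```ruby
# Hypothetical helper: the name, the length cap, and the rules are assumptions.
def url_safe_title(title, max_length = 50)
  slug = title.downcase
  slug = slug.gsub("'", "")            # drop apostrophes outright: "it's" becomes "its"
  slug = slug.gsub(/[^a-z0-9]+/, "_")  # any other run of unsafe characters becomes one underscore
  slug = slug[0, max_length]           # keep it under the length limit
  slug.gsub(/\A_+|_+\z/, "")           # no leading or trailing underscores
end

puts url_safe_title("It's Not Just Academic")
```

With these rules the title of my thesis comes out as its_not_just_academic, which matches the url I was hoping for above.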

I want to take the first of these first: making sure that every paper we save in the database has a title. Thankfully, Rails makes this super easy with a system called validations. In essence, validations are just methods that automatically get run at different points in an object's life cycle (when you make a new one, when it gets saved, etc.), throwing errors unless the object meets certain criteria. When our app has actual views, we can use the validation errors to let our users know that they've done something wrong through on-screen feedback. At this point though, we're just going to use it to make sure that all of our papers have titles. The validation is a one-liner add, like so (in archive/app/models/paper.rb):

    class Paper < ActiveRecord::Base
      validates_presence_of :title
    end

What does Rails' implementation of this validation actually look like in practice? Let's jump into script/console and find out:

    gabc:~/Sites/archive Greg$ script/console
    Loading development environment.
    >> thesis = Paper.new
    => #<Paper:0x2662e9c @attributes={"updated_at"=>nil, "title"=>nil, "url"=>nil, "created_at"=>nil}, @new_record=true>
    >> thesis.title
    => nil
    >> thesis.save!
    ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
            from ./script/../config/../config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
            from (irb):3

You can see that we built a new paper and didn't assign it a title. Then when we tried to save the paper, Rails raised an "ActiveRecord::RecordInvalid" error that included a message explaining its cause and a traceback showing us exactly where in the code the problem came up (we called "save!" with the exclamation mark at the end because that tells Rails to throw an error in our face if validation fails instead of quietly returning false the way plain "save" does).
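The difference between the quiet and the loud save can be imitated in a few lines of plain Ruby. To be clear, SketchPaper and SketchRecordInvalid are invented stand-ins and this is not Rails' actual code; it's just the shape of the behavior we saw in the console.

```ruby
# SketchRecordInvalid and SketchPaper are made-up stand-ins, not ActiveRecord classes.
class SketchRecordInvalid < StandardError; end

class SketchPaper
  attr_accessor :title
  attr_reader :errors

  def initialize(attrs = {})
    @title = attrs[:title]
    @errors = []
  end

  # Rebuild the error list from scratch and report whether it is empty.
  def valid?
    @errors = []
    @errors << "Title can't be blank" if title.nil? || title.strip.empty?
    @errors.empty?
  end

  # Quiet version: report failure through the return value.
  def save
    valid?
  end

  # Loud version: raise when validation fails.
  def save!
    raise SketchRecordInvalid, "Validation failed: #{errors.join(', ')}" unless valid?
    true
  end
end

blank = SketchPaper.new
quiet_result = blank.save

loud_message = begin
  blank.save!
rescue SketchRecordInvalid => e
  e.message
end

puts quiet_result.inspect
puts loud_message
```

The quiet form is handy when you plan to show the errors to a user; the loud form is handy when a failed save means something has gone wrong that you don't want to miss.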

Getting Started with Testing: Fixtures

Now that we've finally written some actual code, our next job is to make sure that code actually works as we expect it to and that means tests. Testing is a big subject, but suffice it to say here that it has two main purposes: to make sure our code does what we think it does and to make it easy for us to change our code later on (if we make a major change and all the tests still pass, that's a good sign that the rest of our code still works; if they don't, well that means we've probably got some fixing to do). (Don't worry if you're totally new to testing and the whole concept seems a little fuzzy to you. It will become clear in a minute when we actually write our first test -- tests are one of those things, like spiral staircases, that are much easier to show than to describe.)

Anyway, for our tests to be most effective, we want to cover as much of our code as possible and that means starting right away. The more untested code you write the less likely you are to ever go back and add tests and the more likely you are to end up with confusing, unmaintainable code. In fact, some people insist that you should "test first," writing tests that define the behavior you want from your code before writing your code itself. That way you don't "overcode"; you make sure not only that your code works, but that it doesn't have any undesirable side effects. We may do some test first development a little later on, but right now we're in a simple enough situation that I'm perfectly happy to start testing with a whopping one line of existing code.

What do we want to test? We want to test that our code actually does require each paper to have a title like we're trying to get it to and, further, that a paper without a title will always throw an error. So, the first thing we need is some fake papers to play around with for testing. As part of its testing suite, Rails gives us a place to create these papers: the fixtures. You can think of fixtures as just like tables in the database, only they happen to be represented in a flat file. At the start of a test run, Rails loads the data in these files into a temporary testing database so you can access it in your test methods. This makes it perfect for creating different scenarios against which to run your code and make sure that it does the right thing. In our case, we're going to want to make some papers and see if our code can tell whether or not they're valid.

Rails already created our fixture file for us when we generated the Paper model, so let's open it up and take a look (it lives at test/fixtures/papers.yml):

    # Read about fixtures at http://ar.rubyonrails.org/classes/Fixtures.html
    first:
      id: 1
    another:
      id: 2

Here's how this works: the non-indented lines are "names" by which we can refer to each entry. The other lines are pairs of column names and row values in the table. It will quickly become clear if I show you how I turned the version of my thesis we were playing with before in script/console into a fixture:

    thesis:
      id: 1
      title: "It's Not Just Academic"
      created_at: 2006-08-21 09:34:28
      updated_at: 2006-08-21 09:34:28

Pretty self-explanatory. The one gotcha is the format of the "created_at" and "updated_at" fields, which look different than what Ruby printed to the screen when we were in script/console. This is MySQL datetime format. When I can't remember how it goes, I make a new record in script/console and then just go look at my database using a GUI tool like YourSQL (especially when I'm on an airplane on the way from NY to San Francisco with no access to the web). There are a few other things that commonly go wrong when working with fixtures and I'll just point them out here, while we're on the subject: (1) the .yml format (rhymes with "camel") is super picky about white space; indentations need to be 2-spaces wide, there can only be one space between the colon and the value, etc. (2) each entry in a particular fixture file needs to have a unique id; if you accidentally re-use the same id twice in one file everything will go haywire. (3) the test database doesn't necessarily get reloaded each time you run your test, only if you run it under rake; sometimes this can get especially confusing because the fixtures that get loaded up for one test tend to stick around for the next one and so you can have tests that pass or fail depending on what order you run them in (for example a functional test that fails when you run "rake test:functionals" may pass if you run just "rake" (which runs the units first before the functionals)).
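Since a fixture file is just YAML, you can poke at one with Ruby's standard yaml library to see the structure and to catch gotcha (2) before it bites. This is only a sketch: the fixture text is inlined rather than read from test/fixtures/papers.yml, and I've left the timestamp columns out to keep the example independent of how the YAML loader treats dates.

```ruby
require "yaml"

# Inlined stand-in for a fixture file; timestamps omitted on purpose.
fixture_text = <<~FIXTURES
  thesis:
    id: 1
    title: "It's Not Just Academic"
  another:
    id: 2
FIXTURES

# Each top-level name maps to a hash of column names and values.
fixtures = YAML.load(fixture_text)

# Gotcha (2): every entry in one file needs its own id.
ids = fixtures.values.map { |row| row["id"] }
duplicate_ids = ids.select { |id| ids.count(id) > 1 }.uniq

# Entries with no title at all.
untitled = fixtures.select { |_name, row| row["title"].nil? }.keys

puts "entries: #{fixtures.keys.join(', ')}"
puts "duplicate ids: #{duplicate_ids.inspect}"
puts "missing titles: #{untitled.inspect}"
```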

If you're totally new to tests, some of that may have just seemed like gibberish. Don't worry about it. You can always reread that paragraph if you're running into mysterious errors at some future point down the line. . .

Testing the Fixtures: Our First Test and First Test Helper

I'm back in Portland now and recovered from my travels. Where were we? That's right. We've got our fixture in place so it's time to write some tests! Before we try and test our actual code, though, it's probably a good idea to make sure that our fixture itself is well-formed, or else our tests will be pretty useless. I've got a little test helper method from some earlier projects that's super helpful for this (for full disclosure, like most things it was probably actually Chris's idea). If we want a method to be available to all our tests, we just stick it in test/test_helper.rb, so that's where we'll stick the following code (there's a helpful little comment in test_helper.rb that will guide you once you're in there):

    def assert_all_valid klass
      klass.find(:all).each do |obj|
        assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
      end
    end

Let's walk through this method. First of all, it takes a class as an argument. Since "class" itself is a reserved word (a word that has special properties in Ruby and is hence unavailable as a name for a normal variable) we call it "klass". We might as well have called it "bob," but "klass" is conventional because it's easy to remember what it means. Once that's understood, there's not too much else going on here. We use Rails' "find(:all)" syntax to find all the members of our class and then we assert the validity of each particular member in turn, printing out a helpful message if the object is not valid. When defining custom test_helper methods of your own you'll save yourself a lot of headaches if you add as specific an error message as possible so that, when the test fails, it will be clear what went wrong as well as, importantly, which particular objects or attributes were involved (hence the inclusion of obj.class and obj.id in the message).

A note of syntactical explanation: Rails adds a method to our objects called "valid?" that returns true if the object passes its class's validations and false if not; "assert" is the simplest testing method, passing if its argument is true and failing if it is false. Put these two together and you've got a test that passes if and only if the object is valid.
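If you'd like to watch that logic run without a database, here's a rough imitation. Everything in it is a stand-in: StubPaper plays the model, TinyChecker plays the testing library, and the helper takes a plain array instead of calling klass.find(:all). Unlike the real "assert", this one collects failures rather than stopping at the first one.

```ruby
# StubPaper and TinyChecker are made-up stand-ins for the model and the test library.
StubPaper = Struct.new(:id, :title) do
  def valid?
    !title.nil? && !title.empty?
  end
end

class TinyChecker
  attr_reader :assertions, :failures

  def initialize
    @assertions = 0
    @failures = []
  end

  # Count every call, and remember the message when the condition is false.
  def assert(condition, message)
    @assertions += 1
    @failures << message unless condition
  end

  # Same shape as the helper above, minus the database lookup.
  def assert_all_valid(records)
    records.each do |obj|
      assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
    end
  end
end

papers = [
  StubPaper.new(1, "It's Not Just Academic"),
  StubPaper.new(2, nil)
]

checker = TinyChecker.new
checker.assert_all_valid(papers)

puts "assertions: #{checker.assertions}"
puts checker.failures
```

Note that the tally goes up once per record checked, and that the failure message names exactly which record was the problem.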

Now, let's write and run the test. In the test for our Paper model that Rails automatically stubbed out for us (test/unit/paper_test.rb), we'll replace the sample method with:

    def test_fixtures
      assert_all_valid Paper
    end

Save the file and then run the tests like so:

    gabc:~/Sites/archive Greg$ rake test:units
    (in /Users/Greg/Sites/archive)
    /opt/local/bin/ruby -Ilib:test "/opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader.rb" "test/unit/paper_test.rb"
    Loaded suite /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader
    Started
    F
    Finished in 0.186302 seconds.

      1) Failure:
    test_fixtures(PaperTest)
        [./test/unit/../test_helper.rb:30:in `assert_all_valid'
         ./test/unit/../test_helper.rb:29:in `assert_all_valid'
         ./test/unit/paper_test.rb:7:in `test_fixtures']:
    Paper with id 2 is invalid.
    <false> is not true.

    1 tests, 2 assertions, 1 failures, 0 errors
    rake aborted!
    Command failed with status (1): [/opt/local/bin/ruby -Ilib:test "/opt/local...]

What's this? Our very first test and we've already failed it! Well, thanks to the message we added to our custom assertion, it's really easy to tell what's going on: we have an invalid paper in our fixtures (when tests fail or throw errors they print out Es and Fs and then report back on the problem with a trace, showing which lines in which files got run before the problem hit; if you're trying to track down a less obvious problem than this one, that trace will be your lifeline). If we look at our paper fixtures (test/fixtures/papers.yml), we'll see that, in addition to the paper we created above, we've got the second one that Rails automatically created for us still hanging around:

    another:
      id: 2

And that paper is definitely not valid. Remember, we're validating the presence of our papers' titles and this one hasn't got one. It's only got an id. So, in order to get this test to pass, we've got to either delete this paper from our fixture or edit it so it'll be valid. Let's do the latter, like so:

    another:
      id: 2
      title: "Simulacra and Simulacrum"
      created_at: 1996-08-21 09:34:28
      updated_at: 1996-08-21 09:34:28

Now, saving the file and rerunning should result in our first clean test run:

    gabc:~/Sites/archive Greg$ rake db:test:prepare
    (in /Users/Greg/Sites/archive)
    gabc:~/Sites/archive Greg$ ruby test/unit/paper_test.rb
    Loaded suite test/unit/paper_test
    Started
    .
    Finished in 0.194362 seconds.

    1 tests, 2 assertions, 0 failures, 0 errors

This is great! After a little bit of setup, we've successfully tested the code we just wrote: our validation catches papers that don't have titles.

Looking a little closer at the output from the test run, notice that we got credit for two assertions rather than just one. That's because the count tracks calls to "assert", not calls to our helper: "assert_all_valid" ran "assert obj.valid?" once for each of the two papers in our fixtures. If we had three papers in our fixtures, we'd have been told we made three assertions, and so on.

Running Tests: Under Rake, Under Ruby

It is probably worthwhile to spend a moment here on some of the specifics involved when running tests. There are four basic ways to run tests: a "full rake", just the units (the tests that exercise our models), just the functionals (those that exercise our controllers), or individual test files one at a time. The first three we do by invoking rake ("rake", "rake test:units", and "rake test:functionals" respectively) and the last we do by just running the test file as if it was any other ruby program ("ruby test/unit/paper_test.rb", for example). When you run your tests, Rails uses a different database from the one you're developing on. If you remember some of the configuration we did above, when we set up our database.yml file, we told Rails to use a database called "archive_test" for this purpose. At the start of each run, rake clears that database and then loads it up with the data you stored in your fixtures so that you'll have a controlled environment in which to do your testing. Further, the Rails testing framework keeps the data generated in each test method from polluting your database for other methods. Each test method gets a clean start.

Besides running different sets of test files, each of the three different rakes (full, units, and functionals) does this database destroying and recreating process separately. So, if you run a full rake, your test database gets destroyed and recreated twice, once at the start of the rake when the units run and once halfway through before the functionals do. Since rake only loads up the tables that you tell it to (by including different sets of fixtures at the top of each of your test files) this ordering can mean that you can get different results from the same test! Let's say you were working on a functional test. When you run that test under rake test:functionals only the set of tables explicitly asked for in the functionals tests get loaded. Under a full rake the units run first, so by the time your tests get run, the tables created by the units will still be hanging around. If your test's passing or failing hinges on this difference, you'll see different behavior in the two situations. If you encounter this issue just make sure that each of your tests calls all of the fixtures that it needs (don't forget the ones being referenced through associations either!).

And finally, when you're just running a single test like "ruby test/unit/paper_test.rb" -- which can be a real time saver once you've got a lot of tests written and running the whole suite takes a full minute or two -- you don't have the benefit of rake's database loading at all. Your test will run with whatever state the test database was left in by your last rake. This can produce some seriously strange results that will have you chasing ghost bugs that aren't really there. To prevent that problem, simply run "rake db:test:prepare" before your test and rake will set up your test database just how you want it.

How To Write a Test: Given, When, Then

Now, while our first test definitely exercised the code we just wrote (the validation obviously got run), it plays kind of a more general role: guarding our paper fixtures from any invalid data. More to the point, if we stopped validating on the presence of a paper's title, the test would still pass (try it, go delete the whole line and then rerun your tests). Therefore, this can't quite be said to be a test on that validation as such. So, let's write one.

How, generally, do you write a test? Well, most tests have three parts: the setup that must be in place to accomplish some action, the actual code that runs the action (this is the code you're trying to test), and then some ideas about what we expect the effect of that action to be. Splitting these parts up in your mind and then addressing them one at a time usually makes it much easier to write a test. When I start my test methods, I find it helps to start by writing these parts down explicitly as comments so I can keep focused on exactly what I have to do (plus it lets me do a bunch of typing, which feels productive, without having to actually do any thinking), like so (in test/unit/paper_test.rb):

    def test_validates_presence_of_title
      #given
      #when
      #then
    end

Giving tests descriptive names is always a good idea since the whole point of them is that if you ever see them in a test run they should tell you exactly what's gone wrong. Rails will only run test methods that actually start with "test_", so a good recipe for naming tends to be appending some description of what you're testing onto there.

Back to the question of how to test our validation. Let's try to say the three parts of our test in words. Given a paper that has no title, when we try to save it, then the paper should throw an error, remain unsaved, and report itself invalid. Now, that's starting to sound like something I could write up in code. I'll give it a shot. Here's my first draft:

    def test_validates_presence_of_title
      #given
      p = Paper.new
      #when
      p.save!
      #then
      assert !p.valid?
    end

I make a new paper. Don't assign it a title. Try to save it. And then assert that it is not valid. Just like I planned. What happens when I run that test?

      1) Error:
    test_validates_presence_of_title(PaperTest):
    ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
        /Users/Greg/Sites/archive/config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
        ./test/unit/paper_test.rb:14:in `test_validates_presence_of_title'

Oops! Trying to save the paper failed, like it was supposed to, but the error that it threw prevented the rest of our test from executing. What we need to do is wrap our save call in an assertion which knows to expect the error, like so:

    def test_validates_presence_of_title
      #given
      p = Paper.new
      #when
      assert_raises(ActiveRecord::RecordInvalid){p.save!}
      #then
      assert !p.valid?
    end

This is a passing test. assert_raises takes an error type as an argument (thankfully we knew exactly what type of error to expect since we'd already seen it on the first run) and passes only if the code in its block throws that error.

Now, I'll show you just one more iteration of this test with a few more trimmings:

    def test_validates_presence_of_title
      #given
      paper_count = Paper.count
      p = Paper.new
      assert !p.title
      #when
      assert_raises(ActiveRecord::RecordInvalid){p.save!}
      #then
      assert !p.valid?
      assert_equal paper_count, Paper.count
    end

What have I added? Start with the first and last lines. One of the things we'd said we wanted to test was that the paper should remain unsaved. Well, there are two sides to that: the object's side and the model's side. We're already testing for the error thrown by the call to "save!", but now we want to test the model side, i.e. that the number of papers in the database doesn't change. To test that, we store the count of papers into a local variable (paper_count) on the first line and then compare it to a fresh count on the last line ("count" is a useful method that Rails adds to all of your model classes; it returns the same number as Model.find(:all).length, without having to load every record). As long as these two are the same, we'll know that nothing we've done has affected the count of papers in the database.

The other thing I've added is the assertion that, just after it is newly made, the paper does not have a title. While somewhat extraneous, the purpose of this assertion is to make explicit one of the assumptions in our given state: a new paper doesn't have a title. Since it's the very absence of that title that renders the paper invalid, it made sense to write an assertion verifying it before getting to the heart of the matter.
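If you'd like to poke at the given/when/then shape without a Rails app handy, here is a minimal plain-Ruby sketch of the same test. Everything in it is a stand-in I made up for illustration: this Paper class, its RecordInvalid error, and the in-memory array behind Paper.count only imitate the ActiveRecord behavior described above, and they are not the framework's actual code.

```ruby
# Stand-in for the real ActiveRecord model (assumption: it imitates only
# the behavior discussed above and nothing else).
class Paper
  class RecordInvalid < StandardError; end

  @saved = [] # plays the role of the papers table

  class << self
    attr_reader :saved

    def count
      saved.length
    end
  end

  attr_accessor :title

  def initialize(attrs = {})
    @title = attrs[:title]
  end

  # Imitates validates_presence_of :title
  def valid?
    !title.nil? && !title.strip.empty?
  end

  # Like save!, raises on an invalid record instead of returning false
  def save!
    raise RecordInvalid, "Validation failed: Title can't be blank" unless valid?
    self.class.saved << self
    true
  end
end

# given
paper_count = Paper.count
p = Paper.new
title_missing = p.title.nil?

# when
raised = begin
  p.save!
  false
rescue Paper::RecordInvalid
  true
end

# then
still_invalid   = !p.valid?
count_unchanged = (paper_count == Paper.count)

puts [title_missing, raised, still_invalid, count_unchanged].all? ? "pass" : "fail"
```

The begin/rescue in the "when" step does by hand roughly what assert_raises does for us in the real test: it lets the expected error happen without stopping the checks that come after it.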

Philosophy of Testing

Is this overkill? This particular example is obviously somewhat contrived. I probably wouldn't be this thorough in testing such a simple situation if I wasn't trying to demonstrate the ins and outs of my thought process while writing tests. But what should our "philosophy of testing" be? Is it possible to have too many tests? What should be the thrust of the tests that we do write?

Like so many other things, answers to these questions are partially a matter of taste and partially a matter of responding to the particular situation you find yourself in, both of which are hard to learn through any method besides experience (I work all day with coders who are better at them than I), but I think I can lay down a few guidelines that shape my thinking.

Let's start with some don'ts:

  • Don't test something that's part of the framework or a third-party library. If you don't trust other people's code enough to use it without redundant testing, you should probably just avoid using it altogether. Plus, this is just unnecessary extra work when the whole point of using libraries and frameworks is to avoid duplicating effort that other people have already put in. (To a certain extent we're breaking this rule in our test above, but not too badly. The key difference is that we're testing whether we've successfully used the framework to enforce a business logic rule (that papers must have titles) rather than whether or not the framework's code for enforcing that rule works in the first place.)
  • Don't let your tests lock down the specifics of your code too much. When I first got into the swing of writing tests, I got hooked on assertions. I wanted to run up the score, to see more dots zoom across my screen. And so for a while, I picked up the bad habit of writing assertions on everything I knew to be true in my code: the exact wording of error messages, the exact values of a bunch of attributes in the fixtures, etc. This turned out to be a bad idea because it made my tests incredibly fragile. Anytime I'd twiddle around with my fixtures at all (say, to fix a typo), my tests would break. My tests were making more work for me when they were supposed to make my life easier. Which brings me to. . .
The dos:
  • Do write tests that ensure outcomes. Our goal with writing tests is to leverage a specific situation we've thought of (and, often, captured in the fixtures) into a general structure that will make sure that our code will act right in all situations. For example, in testing our validation above, we could have written something like this:

        def test_validates_presence_of_title
          #given
          p = Paper.new
          assert !p.valid?
          #when
          p.title = "My title"
          #then
          assert p.valid?
        end

    On the surface, this test seems a lot like the one we wrote above. It asserts that a paper without a title is not valid, adds a title to the paper, and then asserts that the paper is valid. What it doesn't do is engage with the more general purpose of our validation: preventing papers that lack titles from getting saved to the database. It also has some specifics hard coded into it: the choice of "My title" as a title. While that seems fine right now, what if we made a change later on that, say, required all of our titles to be formatted in Unicode for internationalization? Then this test would start to fail even though it was unrelated to the new code we were trying to write. It would become yet another spot in our code we had to change to add a new feature or to alter our design.
  • Do write tests first to specify behavior. Oftentimes tests are just a better medium in which to think about the design for your program than the program itself. Writing a test lets you think precisely about what you want your code to do without worrying about how it's going to have to get it done. For example, take the goal I mentioned of having pretty urls for our papers (getting the url for my thesis to end with "its_not_just_academic.html"). Well, I still don't have a clear plan for how to accomplish that goal, but I know how to write a test on it:

        def test_paper_url
          #given
          p = Paper.new :title => "It's Not Just Academic"
          #when
          p.save!
          #then
          assert_equal "its_not_just_academic.html", p.url
        end

    Right now, running this test will result in a failure:

          1) Failure:
        test_paper_url(PaperTest) [./test/unit/paper_test.rb:28]:
        <"its_not_just_academic.html"> expected but was
        <nil>.

    But now I've got the beginning of a kind of objective standard against which I can write my system for generating papers' urls from their titles. If this test (and presumably some others) passed then I would be done. Working this way lets me focus on making the individual parts of my code work without having to constantly be trying to remember what the point of all of this code was. The tests keep track of the larger context so I don't have to.
If all of these testing ideas seem a bit hypothetical to you right now, don't worry about it. Hopefully you'll be seeing them all in practice a lot as we continue work.
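To make the pretty-url goal from the last bullet a little more concrete, here is one possible way the title-to-url conversion could go. Treat it as a plain-Ruby sketch rather than a plan I've committed to: url_for_title is a hypothetical helper name, and the rules it applies (drop apostrophes, turn any other run of non-alphanumeric characters into an underscore) are assumptions reverse-engineered from the single example in the test.

```ruby
# Hypothetical helper: turns a paper title into a file-style url.
def url_for_title(title)
  slug = title.downcase
              .delete("'")              # "it's" becomes "its"
              .gsub(/[^a-z0-9]+/, "_")  # spaces and other punctuation become "_"
              .gsub(/\A_+|_+\z/, "")    # trim leading and trailing underscores
  "#{slug}.html"
end

puts url_for_title("It's Not Just Academic") # prints its_not_just_academic.html
```

A real version would also have to decide what happens when two papers share a title, which is exactly the kind of question another test could pin down.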

Speaking of which, now that we've got a basic Paper model, it's time for us to write some real screens with the forms and views that our users will interact with to upload their own papers. So, stay tuned until next time when we'll: gabc:~/Sites/archive Greg$ script/generate controller papers

Posted by Greg at 1:46 PM | Comments (5)

August 18, 2006

learns_to build Academic Archive::Part 1:Introduction, Concept, and Design

At this year's FOSCON, Amy Hoy issued a clarion call to the elite Ruby hackers in the room: help the newbies! With the spectacular recent growth of Ruby and, especially, Rails there's a great and growing need for educational resources and infrastructure to help newcomers get acclimatized.

Since then, I've had a few ideas that might help. The first I blogged a while ago: The How-I-Learned-Ruby Quiz. The second one is more ambitious. I'd like to introduce it here.

One of the most useful experiences for me in the process of learning Rails was working side by side with a more advanced coder on a real project all the way from the first design sketches through the deployment of a working app. While the Agile Book tries to provide a version of it, this is an experience that is almost wholly unavailable to newbies. Most beginners are stuck trying to puzzle their way through with reference books, source code, and blogged code snippets. While these are sufficient for the experienced coder simply trying to pick up the new hip language, they just don't get the job done for true programming newcomers.

So, I therefore propose to provide some simulacrum of that experience here on this blog. I've got a project I want to build in Rails and as I'm doing so, I'll try to give you a view over my shoulder. I'll write about the practicalities and the philosophy behind each step. Eventually, I hope to make the code I write available through a public repository so you can follow along and even help out (as you'll see shortly, the app is public-minded and will, eventually, be open source).

To complete this introduction, I'm going to take you through all the steps I've gone through so far. . .everything up to actually writing code. First, I'll lay out the concept for the app, summarizing its purpose and aims. Then, I'll talk about design. In many ways the hardest and most interesting part of building a web app, design is the process by which we translate the real world inhabited by the app into abstracted models and relationships we can represent in code. I'll start with the basic screens I imagined when first thinking about how to make this website and then proceed through the first two iterations of "arrows and boxes" I've come up with so far.

Concept: Academic Archive [1]

Scholars of every rank and level regularly research and write papers which never see publication. Whether written by undergraduates or tenured professors, by amateur local enthusiasts or internationally renowned experts these papers represent a great wealth of research, insight, and argument which remains inaccessible to the wider community of scholars as well as the interested public at large.

Academic Archive seeks to provide a platform for publishing these papers on the web in order to make them universally accessible. The Archive will accept submissions from anyone regardless of qualifications. The Archive's back end will allow a network of volunteers to undertake cursory screening of submitted papers for plagiarism and to ensure that they meet a basic level of quality. The Archive's public website will organize and index these papers for convenient search and browsing.

Design

I don't know about other programmers, but when I'm first brainstorming about an idea for a new web app, the part of it that I can picture in my head is the screens. I can't necessarily see the specific style of how they look, but I can kind of get a sense of what different roles they'll have to play. It's like imagining your dream house. You might not know what color you'll paint it when it's done, but you know you want a hot tub, a racquetball court, a formal dining room, etc.

Anyway, here are the screens I first pictured for Academic Archive:

Author Upload Page
This will be where the whole process starts. Users will come here to upload their papers so obviously it will need a file upload form. Most users probably have their paper in Microsoft Word format, so we'll need to make some decisions about how to prepare papers for the web. In the long run we'd like to be able to accept actual Word files and to process them into HTML ourselves. This would make things easiest for the users and allow us to ensure the best markup for our articles. In a first iteration, though, our goal is to eliminate or put off as much complexity as we possibly can, so we'll probably only accept papers already formatted as HTML (so this page will probably also have some instructions on how to convert Word files).
Editor Approval Page
If you're one of our volunteer editors, this will be your home base. You can see a queue of articles awaiting your approval. From here, you can read each of the articles and approve or reject them for publication.
Index of Papers
This is really a whole section of the site dedicated to browsing through lists of the published papers and searching for information contained in them. It's got a front page with either the most recent papers or some other selections. If you're a visitor, it takes you from arriving at the site all the way up to the point of clicking on a paper to read it. I haven't given this section too much thought since it is the least specific to the particular problem we're trying to address. Lots of other sites on the web present browsing and searching interfaces and, at least to start out with, I'll probably steal one of those that I think is good.
Individual Paper View
This part couldn't be simpler. The papers come in as HTML and all we've got to do is remember where we stored those files and point the readers at them. Additionally, we may want to provide some location for people to discuss each paper, but again that's not within the scope of what we're doing right now. We're just trying to find the simplest site we can build that will solve the problem as we set it out in the Concept. Other feature ideas are great and we'll try to keep them in mind as they come up so we don't make any design decisions that rule them out, but right now they are a dangerous distraction from getting the app onto a solid foundation so we'll put them aside.

Now that we've done some basic thinking about the types of things our app needs to do, it's finally time to start thinking about how it's going to do them. That means it's time to "model our domain". Domain modeling is an incredibly deep subject and there are an endless number of books on it. In fact, I'm reading the one I hear is the best right now. In a nutshell, domain modeling is the process of building up in code a representation of the parts of the real world pertinent to your problem. The idea is to install in your program abstractions of the people and things you're working with (in our case authors, editors, papers, etc.) and to tie them together into the proper relationships. It's kind of a hard process to get a grasp on, but it will quickly become much clearer if we start work on our specific case.

I made my first design sketch at a pub while waiting for a friend to perform. The Concept was brand new and I was psyched about it. I was drinking a beer. Here's what I sketched in my moleskine:

Academic Archive first design sketch

The first thing I drew, and the part I was most confident about, was the author-authorship-paper trio of models. I was confident about this idea because I stole it. (A word about the notation here: a word represents a model, simultaneously a particular type of thing or person we're trying to represent as well as an actual class in our code. The lines represent relationships between them, the stars a "many" relationship on one side, about which more in a minute.)

These three models are trying to represent the idea that an author "owns" a paper. That is, that an author has_many papers and a paper belongs_to one author. (When I mean these relationships in their technical Rails/relational database sense, I'll write them with underscores like this, as they'll appear when they are actual Rails methods. One of the things that's so nice about Rails is the way its natural syntax lets you kind of slide gradually into it from natural language.)

So where does authorship come in? Authorship is called a "join model" because it mediates the connection between authors and papers. Instead of asserting that an author has_and_belongs_to_many papers, we'll say that an author has_many authorships and has_many papers through authorships. Join models are helpful in a number of ways. How? Well, there were a couple of things wrong with the author-paper relationship we set out above. First of all, what if a paper has more than one author? In order to model this we've got to say that a paper has_and_belongs_to_many authors and vice versa, which, in the code, means adding a lot of difficulty to the average case just to handle some complexity which shows up only in exceptional cases (papers with multiple authors). Never a good idea. Secondly, a join model lets us assign attributes to the relationship between the two things that it joins. So, with the authorship model, we could capture the idea that two authors don't have the same status on a paper, i.e. that one is a research assistant or something. That would be a very hard situation to handle with a normal has_and_belongs_to_many relationship.
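Here's a rough plain-Ruby sketch of what the join model buys us. It deliberately uses bare Structs instead of real ActiveRecord classes so it can run anywhere; in the app itself these lookups would come from has_many declarations rather than hand-written methods. The role attribute, the sample names, and the two helper methods are all made up for illustration.

```ruby
# Plain-Ruby stand-ins; in Rails these would be ActiveRecord models.
Author     = Struct.new(:name)
Paper      = Struct.new(:title)
# The join model: one record per (author, paper) pair, plus a
# hypothetical role attribute that lives on the relationship itself.
Authorship = Struct.new(:author, :paper, :role)

# Roughly what "has_many papers through authorships" would give an author.
def papers_for(author, authorships)
  authorships.select { |a| a.author.equal?(author) }.map(&:paper)
end

# And the same thing seen from the paper's side.
def authors_for(paper, authorships)
  authorships.select { |a| a.paper.equal?(paper) }.map(&:author)
end

lead      = Author.new("Lead Author")
assistant = Author.new("Research Assistant")
paper     = Paper.new("A Co-written Paper")

authorships = [
  Authorship.new(lead,      paper, "primary"),
  Authorship.new(assistant, paper, "assistant")
]

puts authors_for(paper, authorships).map(&:name).join(", ")
authorships.each { |a| puts "#{a.author.name}: #{a.role}" }
```

Notice that a paper with one author and a paper with five look exactly the same to this code, and that the role never has to be squeezed onto either the author or the paper.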

What else is going on in this sketch? Well, down the right we've got some lines connecting author to person and from thence to editor with the inscription "STI?" nearby. The idea there was the following: we've got authors and we've got editors. But really, both of those are just different types of people. Single Table Inheritance (or STI) is a pattern that allows you to capture multiple roles that might be played by a certain type of entity while retaining the attributes that are always common across those roles. A common example might be people in a company: the same person could simultaneously be a manager, an employee, and a member of a committee. No matter what role they played they would still probably have a name, contact info, etc., so by keeping them in a common table a lot of work could be saved. I won't go into too much depth on STI. When you get to the next design iteration you'll see why I decided to abandon it (or at least postpone thinking about it until later).
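For the curious, the core trick of STI can be imitated in a few lines of plain Ruby: every row lives in one shared table, and a type column says which class the row should become when it's loaded (Rails uses a column named type for this). The rows, the load_person helper, and the example values below are all hypothetical; this is a sketch of the pattern, not of how ActiveRecord implements it.

```ruby
# One shared "people" table. The :type value decides the class of each row.
PEOPLE_ROWS = [
  { :type => "Author", :name => "Example Author", :email => "author@example.com" },
  { :type => "Editor", :name => "Example Editor", :email => "editor@example.com" }
]

# Attributes common to every role live on the parent class.
class Person
  attr_reader :name, :email

  def initialize(row)
    @name  = row[:name]
    @email = row[:email]
  end
end

class Author < Person; end
class Editor < Person; end

# Hypothetical loader: looks up the class named in the row and builds it.
def load_person(row)
  Object.const_get(row[:type]).new(row)
end

PEOPLE_ROWS.each do |row|
  person = load_person(row)
  puts "#{person.class}: #{person.name} <#{person.email}>"
end
```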

How far along was I after making this sketch? I had a pretty good list of the nouns (author, paper, editor, category) but not a very clear sense of their relationships. I knew the authorship join was a good idea because of having seen other smarter people model that exact same situation before. But I didn't really know how papers got into categories or what relationship editors had with them. You can see me brainstorming some ideas for how to solve these problems in the notes at the bottom of the page. I was trying to figure out how papers get assigned to editors for approval. I came up with the vague notion that papers get assigned to editors through categories, writing that papers "can be approved or not, etc. in many different categories." Though only beginning to come into focus, this idea turned out to be the key to unraveling the whole problem. But to get there, I needed help. So I brought in Chris.

On a lunch break from work one day, sitting at an outside table on Morrison between 10th and 11th eating mezzas at a Lebanese restaurant, I pulled out my notebook and started telling Chris about my design for Academic Archive. Very quickly he asked a number of highly clarifying questions and helped me tease out a much more robust design. Here's the sketch I made that day:

Academic Archive second design sketch (with Chris)

The author-authorship-paper relationship is there, but now there are a couple of whole new concepts on the board: approval and editorship. The main idea here is for how categorization would work. In plain language the idea is that a paper could be submitted for approval in any number of different categories. It would then gain membership in each category by gaining approval from each category's editor. So, for example, I might submit my thesis, It's Not Just Academic: The Academy of Motion Picture Arts and Sciences, in both Art History and Film Studies. It would then be subject to approval by two different editors and could end up published in both categories, one, or neither depending on what each of the editors thought of it.

How does the new modeling capture this concept of the paper-category-editor relationship? It does so with two overlapping join model relationships. First we added approval, which stands between papers and categories. A paper has_many approvals and has_many categories through approvals (and likewise vice versa for categories). Like in the example of authorship, the presence of this join model gives us an opportunity to hang attributes on the relationship. Here, we'll likely want to keep track of which editor issued the approval and when it took place. Actually, if you look at the diagram, that attribute will take the form of a full-on relationship: an approval belongs_to an editor. And, in fact, approvals will join papers all the way to editors as well as to categories. This will make it a breeze to figure out all the papers an editor has approved. And, thinking about it a little more, the approval life cycle will likely be the spot where most of the action on the Editor Approval Page will take place. Not to get too deep into implementation details, but I can imagine a scenario where creating a new approval assigns a paper to its editor who then marks it as approved or rejected. We'll have to think this through more precisely at a later point, but it's probably good news that this structure seems so rich even at this early stage.

The one other thing to note before we move on to editorship is the fact that editors have a relationship to papers that is separate from categories. This seems like a good thing since it's easy to imagine a situation where category editors come and go over time. Keeping those relationships separate will mean that we can keep an accurate record of which editor actually approved a paper for a category rather than only knowing the current editor of the paper's category. Without this separation it would be really easy to lose track of the simple factoid: who approved this paper?
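Here's a small plain-Ruby sketch of how an approval record could carry all of that. As before, these are Structs standing in for future ActiveRecord models; the status and decided_at fields, the helper method names, and the made-up outcome (approved in one category, rejected in the other) are my assumptions for the sake of the example.

```ruby
Paper    = Struct.new(:title)
Category = Struct.new(:name)
Editor   = Struct.new(:name)
# The join model. The editor, status, and timestamp hang on the approval,
# so they survive even if the category later gets a different editor.
Approval = Struct.new(:paper, :category, :editor, :status, :decided_at)

# Categories in which a paper has actually been approved.
def categories_published_in(paper, approvals)
  approvals.select { |a| a.paper.equal?(paper) && a.status == :approved }
           .map(&:category)
end

# Every paper a given editor has approved, in any category.
def papers_approved_by(editor, approvals)
  approvals.select { |a| a.editor.equal?(editor) && a.status == :approved }
           .map(&:paper)
           .uniq
end

thesis      = Paper.new("It's Not Just Academic")
art_history = Category.new("Art History")
film        = Category.new("Film Studies")
art_editor  = Editor.new("Art History editor")
film_editor = Editor.new("Film Studies editor")

approvals = [
  Approval.new(thesis, art_history, art_editor,  :approved, Time.now),
  Approval.new(thesis, film,        film_editor, :rejected, Time.now)
]

puts categories_published_in(thesis, approvals).map(&:name).inspect
puts papers_approved_by(art_editor, approvals).map(&:title).inspect
```

Because the editor is recorded on the approval rather than looked up through the category, "who approved this paper?" stays answerable no matter who edits the category today.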

Our second join model, editorship, looks a lot like authorship. It's how editors gain the ability to approve papers for categories. It will be really easy to list the categories for which an editor has approval power -- handy when you're trying to build the Editor Approval Page.

What outstanding questions does this design leave us with? Well, beyond editor and author there's no larger sense of a person or a user. Like we thought of the first time through, at some point we're going to have to provide a common ancestor for editors and authors. It will be the place we'll stick the user's personal details as well as their authentication information. That stuff is easy to leave out for the moment since it's both totally unrelated to the specifics of our domain and easy to bolt on later with a third party plugin like acts_as_authenticated. More substantially, we'll be using the idea of a user to make sure that an editor isn't assigned to approve a paper she authored. That's an important rule to capture and I'm pretty sure our design makes it possible, but in the name of limiting complexity, I'm going to stick a pin in this issue and come back to it later once things are more real.
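Just to convince myself that the rule is expressible once a shared notion of a user exists, here's a tiny plain-Ruby sketch. User and may_approve? are hypothetical names, and the check assumes we can get at the list of users who authored a paper; how that list is fetched in the real app is one of the things I'm putting off.

```ruby
# Hypothetical common ancestor for authors and editors.
User = Struct.new(:name)

# True when the prospective editor is not one of the paper's authors.
def may_approve?(editor, paper_authors)
  paper_authors.none? { |author| author.equal?(editor) }
end

writer   = User.new("writer")
reviewer = User.new("reviewer")

puts may_approve?(reviewer, [writer]) # someone else's paper: true
puts may_approve?(writer,   [writer]) # her own paper: false
```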

The other big issue we have is that there's no place to do admin type activities: how does an editor gain permission to add an editorship in a new category? Who is allowed to create categories? Again, we're aware that at some point we'll probably need an admin model which is related to our concept of users in the same way that our author and editor models are. Again, this will probably all be done with Single Table Inheritance. And again, we're going to put it off for a little while.

Well, it's starting to look like we have a pretty good feeling for how to build this app. Enough to get started anyway. While these unknowns we just reviewed might be disconcerting, in my experience they're pretty par for the course. What we need to do to clarify them is to actually build some part of the app for real. If we do that right, it will provide a concrete basis for our thought process about these continuing questions and may even point us in the right direction for a solution.

What part of the app should we build first? When I look at the diagram, I want to start with the paper-approval-category relationship and, specifically with papers. That's the one zone that involves only objects that are unique to our domain; there are no accounts or plugins or anything else external involved at all. Plus, the heart of this app is taking uploaded papers and putting them on the web. If we get that right everything else should fall into place. Or at least, that's what we hope.

  1. This idea is, and the resulting project will be, a collaboration with Jem, my cousin and one of my partners in MFDZ.

Posted by Greg at 9:30 AM | Comments (4)

July 28, 2006

The How-I-Learned-Ruby Quiz

This week, the circus is in town. Well, not quite, but it feels like it. Really, it's just OSCON, O'Reilly's preposterously big, expensive, and prestigious annual open source convention.

One neat side effect of this fact is that yesterday was FOSCON, a free informal one day unconference of Ruby enthusiasts held at Free Geek in parallel to the main conference. FOSCON's organizers took advantage of all the impressive Rubyists in town to put together quite the compelling lineup of talks: an impressive real-time demo of Distributed Ruby, a dandy little talk about how to use scripts, shortcuts, aliases, and Rake tasks to make your Ruby programming more fun and lots more. Jim Weirich, the inventor of Rake, was there for godsakes -- it is quite odd to suddenly meet, in person, someone who built something that's totally indispensable to you, something you use 200 times a day.

But, it was the one totally untechnical talk that's stayed stuck in my head: Amy Hoy's presentation, billed as "The growing nature of the Ruby community & embracing us 'right-brainers' with the minimum amount of trollishness/help vampirism/etc". The gist of it was that Rails' explosive popularity in the last two years has drastically diminished the quality of help available to both Rails and Ruby newbies. Old hands and experts who were formerly willing to help newcomers have been overwhelmed by a swarm of freshly arrived unsophisticates and buried by the deadening repetition of basic questions. Amy's talk featured a clarion call to the elite geeks in the room to do something about this problem: to create central repositories of educational resources, to improve their own writing abilities, and, above all, to retain, and even nurture, their empathy for the less knowledgeable.

In the Q&A, Chris made a great suggestion. Why not gather together stories from experienced Rails and Ruby coders about what things were like for them before they were so good? Such a collection would serve as great encouragement to newbies, seeing that even the exalted experts had to start out somewhere and it would give them a variety of options for paths they could follow to get moving towards greater expertise themselves.

So, to spur the creation of this collection, I give you the How-I-Learned-Ruby Quiz. In the spirit of The chain-letter of musical love and a thousand LiveJournal quizzes, the idea is simple. You answer a few questions about how you came to learn Ruby/Rails and then you pass the baton to some other people who fill out the quiz themselves. They fill it out on their own blogs and get even more people to do the same. And so on. It's easy. I'll show you.

The How-I-Learned-Ruby Quiz

What was your technical background before you started learning Ruby/Rails?

Pretty strong HTML and CSS, beginner-level PHP, no real programming fundamentals.

How long ago did you start?

About ten months ago. Fall of 2005.

What were the two most useful resources to you in the learning process (not counting the Agile Book or the Pickaxe Book, which we'll assume everyone knows)?

Chris Pine's Learn To Program Ruby was the biggest single factor in my managing to finally really understand object oriented programming and the spirit of Ruby as a whole -- that was the moment when I really began to shoot up the learning curve on Rails. Being so lucky as to work all day right next to someone who is actually really good at this stuff also made a huge difference. The value of having a mentor every step along the way can really never be overestimated.

Tell us the story of how you came to learn Rails:

I learned Ruby because of music.

Four years ago, I graduated college as an art major and set off to start a band. Right at the end of college I'd met Jim Griffin, a visionary Geffen Records executive who'd been involved with the online distribution of music since 1994. Griffin conveyed the horror of the major labels and convinced me of the importance of the web for an independent band.

The eventual result was a company: MusicForDozens.com. We manufactured (burned, printed, glued) and sold CDs to order for any musician who wanted. My much more technical cousin (an American history professor) built the site using FileMaker. I watched. It took about eight months. I learned HTML.

In the following couple of years, I made and maintained my band's website but still spent relatively little time on the web. I hadn't yet caught the bug. In December 2004, I came across a post called Five Mistakes Band and Label Sites Make. It turned out to be on a blog called 43Folders, through which I rapidly discovered Quicksilver, del.icio.us, and, more importantly, a sense of playfulness and excitement about the real world usefulness of computers (well, Macs) and of the web generally.

Around this time, a music friend of mine, Chris, approached me about the idea of making Music For Dozens better, of rebuilding it so that anyone who wanted could upload their music and sell it, without contracts or sending CD masters through the mail. He started writing an mp3 uploading and selling web app in php. I watched. Gradually, I learned some php. I learned programming basics: variable assignment, control flow, etc. I learned php's syntax and how to set up a database using MySQL. I used subversion for the first time.

I only ever made it so far with php. I could never fully internalize object orientation and so things of great importance, like clearly following the application's structure and totally understanding its model of the domain, always remained outside my grasp.

A while after we launched MFDZ 2.0, Rails came along. You can guess much of the rest of the story. I'm now on my second professional Rails contract. I quit my day job in January.

Three Ruby bloggers to whom you're passing the baton:

Chris Anderson, Peat Bakke, and, of course, Amy Hoy.


Posted by Greg at 2:34 AM | Comments (3)

April 6, 2006

learns_to divide by zero

Is it just me, or is it fun when you get to break basic rules you learned in school? And more fun the younger you learned them? Not sharing, not looking both ways before crossing the street, not eating cookies and ice cream in bed before going to sleep without brushing your teeth, and so on.

Last week, at work, I got to divide by zero.

Here's the setup. We wanted to make sure that some value was positive, but we couldn't use a normal greater-than-zero comparison because the Rails situation demanded either a range or an array (it was a validates_inclusion_of :some_value, :in => range, if you're curious).

Anyway, there's the problem: how do you get a range that includes all numbers from zero on up? In mathematical terms, what I wanted was: [0,∞). So, I needed infinity. After staring blankly for awhile, it occurred to me: divide by zero! That would certainly produce something odd, possibly even infinity. I tried it:
    >> 1/0
    ZeroDivisionError: divided by 0
        from (irb):16:in `/'
        from (irb):16

Ow, snap! Ruby's not having any of that. But don't fear, not all options are exhausted. There are different kinds of one, different kinds of zero. And it turns out, that's the key. For some reason, Ruby won't even think about dividing an integer by zero, but give it a float, and it gets creative:
    >> 1.0/0
    => Infinity

Infinity! What's this? It looks a lot like a constant, but no:

    >> (1.0/0).class
    => Float

It's a float, and floats work fine as the bounds of a range. And now we've got it:

    >> range = 0..1.0/0
    => 0..Infinity
    >> range.include?(1000000000000000000 * 100000000000000000)
    => true
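
For anyone trying this today, here's a minimal sketch of the same trick as a reusable check. It assumes Ruby 1.9.2 or later, where the value produced by 1.0/0 is also available as the constant Float::INFINITY; the NON_NEGATIVE constant and the non_negative? helper are names I made up for illustration.

```ruby
# Assumes Ruby 1.9.2+, where Float::INFINITY names the value 1.0 / 0 produces.
# NON_NEGATIVE and non_negative? are hypothetical names for this sketch.
NON_NEGATIVE = (0..Float::INFINITY)

def non_negative?(value)
  NON_NEGATIVE.include?(value)
end

puts 1.0 / 0 == Float::INFINITY   # float division by zero gives Infinity
puts non_negative?(10**35)        # a very large integer is still in range
puts non_negative?(-1)            # a negative number is not

begin
  1 / 0                           # integer division by zero still raises
rescue ZeroDivisionError => e
  puts "Integer division: #{e.message}"
end
```

A range like this is the kind of thing you could hand to validates_inclusion_of as its :in option, as described above.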

Nifty? Subversive? Dangerous? Maybe. Fun? I think so.


Posted by Greg at 12:19 AM | Comments (4)

April 3, 2006

learns_to: A None-Too-Brief Introduction to Partials

One of the worst parts about writing websites by hand is the repetition. If you've ever tried it, you've probably run into a problem like this: You've got a set of links at the top (and bottom) of each page that act as persistent navigation. So, they should be the same everywhere. But you also want them to help indicate where the reader is currently. So, they need to be different on each page. To illustrate, here's the header from the redesign of the At Dusk website I'm currently working on:

    <p>
      <a href="blog.html">blog</a>
      <a href="bio.html">bio</a>
      <a href="music.html" class="selected">music</a>
      <a href="live.html">live</a>
      <a href="press.html">press</a>
      <a href="contact.html">contact</a>
    </p>

This is the header for the "music" page. It is identical to that which appears on all the other pages except for the "selected" class on the link to "music.html". In the CSS, I use that class to add a red underline to the link. And, on each different page, the "selected" class appears on the appropriate link, letting the user know where they are.

With hand-written HTML, this quickly becomes a problem. Every time I start a new page, I've got to not only copy in the navigation, but edit it so that the "selected" class is in the right place, and also go back to all of the pages I've already made and edit their navigation to include a link to the new page. After doing this a couple of times, there's little chance that I won't forget to add a link or change a class somewhere.

Now, this seems like exactly the kind of problem that Rails exists to solve. It would be crazy to make it so easy to do really hard tasks like dynamically accessing a database and then end up with a bunch of repeated hand-coded out-of-sync HTML fragments everywhere. Rails enthusiasts treat eliminating repetition like a religious calling so the framework missing an opportunity like this would certainly send them off on a jihad.

And so, of course Rails has a system to handle the repeated parts of your website, whether they be plain HTML or the most complex ruby allowed in view logic. The system uses files called Partials. Partials look just like other views except that their file names start with an underscore.

So, how do we use partials to fix the problem of the context-dependent header? Well, first let's make certain assumptions about how this project would be structured in Rails (crazily enough, it's not. I'm actually hand-coding the new At Dusk site as a temporary measure until we find some Rails hosting). Let's say that each of these pages is a view within a single controller called 'info' except the blog, which is the index action of its own controller called 'blog' (this isn't necessarily one hundred percent realistic, but it's enough to let us explore the issues at hand).

Let's make our partial. As the header, it should probably go in the 'layouts' folder, so create a new file there called '_header.rhtml'. In application.rhtml itself, we'll add the following line at the top where we want the header to display (the 'layouts/' prefix tells Rails to look in the layouts folder rather than in the current controller's view folder):

    <%= render :partial => 'layouts/header' %>

Now, in _header.rhtml, we can whip out some Ruby logic to deal with the problem of placing the "selected" class on the correct link, like so:

    <a href="blog.html" <%= "class='selected'" if controller.controller_name == 'blog' %>>blog</a>

This is pretty intuitive. We're telling Rails to add the "selected" class only if the name of the current controller is 'blog'. We can get even more specific by adding the action_name into the mix:

    <a href="bio.html" <%= "class='selected'" if controller.controller_name == 'info' && controller.action_name == 'bio' %>>bio</a>

We could go even a step further and specify the link in proper Rails format (in the above examples, the links would have to be absolute):

    <% if controller.controller_name == 'info' && controller.action_name == 'bio' %>
      <%= link_to 'bio', { :controller => 'info', :action => 'bio' }, :class => 'selected' %>
    <% else %>
      <%= link_to 'bio', :controller => 'info', :action => 'bio' %>
    <% end %>

The braces in the first link_to matter: they keep :class in the HTML options instead of letting it end up as a URL parameter. Now, this seems a little more complicated than the plain HTML version, but a dense, contained chunk of complexity is more easily maintained than a sprawling sea of simplicity.
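
To see the underlying idea without any Rails at all, here's a rough plain-Ruby sketch of generating the whole navigation from one list. NAV_PAGES and nav_links are hypothetical names of mine, not anything Rails provides, and the page list is just the one from the At Dusk header above.

```ruby
# Plain-Ruby sketch (no Rails): build the navigation from one list of pages.
# NAV_PAGES and nav_links are hypothetical names used only for illustration.
NAV_PAGES = %w[blog bio music live press contact]

def nav_links(current_page)
  links = NAV_PAGES.map do |page|
    # Only the link for the current page gets the "selected" class.
    css = page == current_page ? ' class="selected"' : ''
    %(<a href="#{page}.html"#{css}>#{page}</a>)
  end
  "<p>#{links.join(' ')}</p>"
end

puts nav_links('music')
```

Adding a new page now means adding one word to the list, and the "selected" class can never be on the wrong link.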

And, once we make the jump to start using partials, we can take advantage of another powerful feature: the ability to dynamically pass objects into them from the parent views. Normal views have available to them exactly the data made accessible by their controllers. But partials can be called from multiple different controllers. Hence, they get their data by having it passed into them by the view that calls them.

Let's look at an example. One of the things that I'm going to want to do on the At Dusk website is display At Dusk songs. I'm going to want to do this repeatedly and from many different places, so it probably makes sense to create a partial for it in a publicly accessible place. How about a folder in the views directory called 'shared'? Now what are we going to call our partial? Well, there will probably turn out to be lots of different formats in which we'd want to display tracks: as part of a list, in detail including lyrics, etc. We'll start with the simplest one, the list display. So we'll call our partial _track_in_list.rhtml (remember the underscore is how Rails knows it's a partial) since it's going to display a single track as part of a list.

So, say we're on a page about a particular album and we want to display a list of the tracks on that album, how would we do it? (All through here I'm going to assume that there's a database and Rails models and controllers, and all the workings set up in the standard way.)

    <p>Here are the tracks on our most recent album:</p>
    <ul>
      <%= render :partial => 'shared/track_in_list', :collection => @album.tracks %>
    </ul>

First off, note how we've given the relative location of the partial, 'shared/track_in_list', and haven't used the underscore when referring to the file. Second, we've added something new, the collection. The collection is just the group of objects for which we want to display our partial. In this case, we're going to display it once for each track belonging to our album.

Now it's time to write the track_in_list partial itself. Here comes the only tricky bit. When you render a partial from a parent view it defines a local variable in the child AND NAMES IT AFTER THE CHILD'S FILE NAME. I put that last part in all caps because it is both the thing it took me the longest to internalize and the key to the whole business. In our example, this means that, in the track_in_list partial, we'll have a local variable available to us called track_in_list that stores the current track being rendered.

So for example in _track_in_list.rhtml, we could say:

    <li><%= track_in_list.title %></li>

and it would act as expected. But there are two problems with this. First, I find this code confusing. What's a track_in_list? If I don't immediately know off the top of my head all of the places from which this partial is being called, I might have a hard time remembering what actual object it is that we're rendering. Also, this code is useless anywhere outside this file. If I wanted to move this list element into another partial, say _track_detail.rhtml, I'd have to change the variable name to track_detail. That doesn't sound too bad when you're talking about moving one line, but it really doesn't scale.

Therefore, I'll add one more line to the top of this partial:

    <% track = track_in_list %>
    <li><%= track.title %></li>

All I'm doing is storing the object that Rails automatically passed in (in this case, one of the tracks from @album.tracks) into a new local variable that has a sensible name, "track". Once I've done that, I can just refer to track throughout the rest of the partial to access the current object. This will prevent a lot of pain if I decide that I want to rename the partial. Say that we wanted to be able to display movies as well in this list format, so we change the file name of the partial to _media_in_list.rhtml. All I have to do is change this one line at the top and the rest of my view logic will still work.
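
If the naming rule still feels slippery, it can be simulated outside Rails with Ruby's standard ERB library. What follows is only a toy: render_collection is a hypothetical stand-in for render :partial with :collection, not how Rails actually does it, and Track and TRACK_IN_LIST are made-up stand-ins for a model and for the contents of _track_in_list.rhtml. It assumes Ruby 2.1 or later for Binding#local_variable_set.

```ruby
require 'erb'

# Hypothetical stand-ins for a Rails model and the _track_in_list.rhtml file.
Track = Struct.new(:title)
TRACK_IN_LIST = '<% track = track_in_list %><li><%= track.title %></li>'

# Toy imitation of render :partial => ..., :collection => ...
# (not Rails' implementation). The template is rendered once per item,
# and each item is exposed as a local variable named after the partial.
def render_collection(partial_name, template, collection)
  collection.map do |item|
    scope = binding
    scope.local_variable_set(partial_name.to_sym, item)
    ERB.new(template).result(scope)
  end.join("\n")
end

tracks = [Track.new('First Song'), Track.new('Second Song')]
puts render_collection('track_in_list', TRACK_IN_LIST, tracks)
```

Rename the partial (the first argument) without touching the template and the render blows up with an undefined local variable, which is exactly why the one-line alias at the top is worth having.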

So that was a (none-too-brief it turns out) introduction to partials in Rails. If nothing else, hopefully this tutorial will at least make it easier for you to find your way through other people's Rails apps by following the locations of nested partials. As I said above, the key thing to remember is that in each partial there's a local variable named after the partial's file name that holds whatever was passed in. If you can't think of anything to use partials for, go read someone else's more complicated code. You'll soon see that the more comfortable you get with this little bit of complexity, the sooner you can make your whole life in Rails that much simpler.


Posted by Greg at 12:54 PM | Comments (1)

March 8, 2006

learns_to use migrations

As I mentioned earlier, I was recently hired to do actual paid work as a Rails developer (as unbelievable as that may seem to those of you who've actually met me). I'm beginning to realize that, in addition to this stupefying fact in itself, I also won the employer lottery. My bosses are three guys. They're nice, highly communicative, and -- best of all -- extraordinarily responsive to my input at every level from implementation style all the way up to application design and even business model planning. I can't say too much about the specifics of the job (they actually had us sign an NDA, how 1999!), but I can say that the guys are highly experienced Java programmers who have more experience with web development and startups than I probably ever will (check out Jonathan's resume. . .and he's just one third).

This project is their first Rails app so part of what they want is, naturally enough, for me to share whatever inside dope I may have on Rails' workings. I decided that the best way to do this would be for me to write a series of learns_to posts here on the topics I decide to cover and then follow up in private with specifics that relate to what we're building. That way, some poor Googling sap potentially gets some help on his questions (I was that poor sap not so long ago, so I know how much help the well-written random blog post can be) and the guys get some customized documentation to bring themselves, and anyone else they hire in the future, up to speed double quick. So, without further ado, let's learn Migrations!

With Migrations, Rails provides a way for you to abstract your database schema in code. In simple terms, a migration is a chunk of Ruby that represents a set of changes that you want to make to the structure of your database -- which sounds complicated and confusing, but becomes suddenly clear if we look at an example. Say, we're creating a simple blog application. We've got users and posts each of which have a couple of attributes. Here's the migration we'd write to get our database set up.

    class OurFirstMigration < ActiveRecord::Migration
      def self.up
        create_table :users do |t|
          t.column :name, :string
          t.column :updated_at, :datetime
          t.column :created_at, :datetime
        end
        create_table :posts do |t|
          t.column :subject, :string
          t.column :body, :text
          t.column :updated_at, :datetime
          t.column :created_at, :datetime
        end
      end

      def self.down
        drop_table :users
        drop_table :posts
      end
    end

Now, this is pretty human readable. Like every migration, ours has two methods, self.up and self.down, the first of which gets run when we're moving up through this particular migration making these changes for the first time, the second when we're going the other way and undoing them (more on the rough-and-tumble details of actually running migrations a little later). The first is obvious: in self.up we create two tables, called "users" and "posts" respectively, each of which has a handful of columns that store various predictable attributes of users and posts. The second method, self.down, rolls back our changes, destroying the two tables we just created. Now, that doesn't seem especially useful when we're just at the point of creating our database in the first place. On the other hand if we were in our seventh or eighth iteration and had made radical (and potentially boneheaded) changes to our schema this method would come in incredibly handy. If we decided that we needed to undo our recent changes (to revert to an earlier database structure), we could do so without nuking the rest of our database schema or any of our data. You can think of these methods as built-in version control for your database.

So, you say: What exactly are the practical advantages of using migrations to manage changes to my database? I already keep my SQL schema under version control. That's how they do it in the Agile Book. What do you know that DHH and Dave Thomas don't? Well, the reason DHH and DT didn't mention migrations in Agile is that, despite their other many achievements, neither of them has (yet) invented a time machine. Migrations are maybe the most exciting of a spate of great additions to Rails that first appeared in the run-up to version 1.0[*]. And their advantage is that they bring all the Rails virtues to database management, making it easy to learn, powerful to use, and beautiful to behold.

First of all, with migrations you'll manage your database schema in Ruby just like the rest of your project; you don't have to jump into SQL (or learn it in the first place) every time you want to make a change. Beyond just the convenience for you, staying within Ruby also means that Rails is able to offer the same two great virtues when you're managing your database that it does everywhere else: abstraction and convenience. Migrations don't care what kind of database you're using. Rails will translate your migration into the right kind of instructions for any of the supported database formats (MySQL, PostgreSQL, SQLite, SQL Server, and Oracle). That means you can develop on MySQL or SQLite (or develop on one and collaborate with someone using the other) and then deploy on Postgres, all without a second thought. Abstracting out the specifics of talking to the database will mean fewer configuration headaches and, therefore, more flexibility to change your system around. And we've already seen the benefit of having access to Rails convenience methods in the concision and clarity of our sample migration. Methods like create_table, drop_table, add_column, and remove_column (see the Active Record Migration documentation for a complete list) make managing our schema as simple and painless as it could reasonably be.

Another almost unreasonably cool side effect of migrations being first class Ruby citizens is that you can use them to populate your tables as you're creating them. Let's say that we wanted to organize the users of our blog application into groups (i.e. Group would have_many :users and User would belong_to :group). We could write a migration that simultaneously creates the groups table, adds the foreign key to the users table, and fills in the necessary data to assign all of our existing users to a group like so:

    def self.up
      create_table :groups do |t|
        t.column :name, :string
      end
      add_column :users, :group_id, :integer
      User.find(:all).each do |user|
        user.group_id = 1
        user.save
      end
    end

Rails is also smart enough to add an id to each table without having to be told (you actually have to tell it not to by saying something like "create_table :groups, :id => false do |t|. . ." if you want it not to create an id column, as you would if you were creating a join table).

So far so good. Let's talk implementation. Let's say you've just started creating your blog software. You've run the Rails command to create your project, created your production, development, and test databases, and edited database.yml so that Rails knows how to talk to them. Now it's time to write your migration. First, we've got to create the file. Rails, of course, gives us a command for this:

    > script/generate migration OurFirstMigration

You can give your migration any camel-case name you'd like, Rails doesn't care. In our example, this command would create a file called 001_our_first_migration.rb in /db/migrate. Open it up and you'll find a blank migration with empty self.up and self.down methods. Populate them with your migration methods in the style of our first code example above. Once you've done that, all that's left is to run the migration:

    > rake migrate

That command will bring you up through each of your existing migrations in turn so that your database jibes with the most advanced state of your schema. We've only written one migration so far, but look at your database. It's got tables and columns, created just like we specified, oh my! But wait, that's not all, rake migrate is also smart enough to take a version number. If you want to, say, migrate from the fifteenth iteration of your database schema back down to the seventh, all you've got to do is:

    > rake migrate VERSION=7

Clearly, depending on the changes your migrations make to the schema some of your data can get lost in the shuffle. Rails isn't actually magic. If you drop your posts table, the data in it gets dropped, too. So, think carefully when you design the granularity of your migrations. If you lump a bunch of schema changes together then it's going to hurt all the more to roll them back when you've got real data. Conversely, while you're developing feel free to migrate up and down willy nilly anytime you've accumulated too much nonsense data in your tables.
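
To make the up-and-down business concrete, here's a toy model of what migrating to a version means. To be loud about it: this is not how Rails implements rake migrate. ToyMigration, ToyMigrator, and TOY_MIGRATIONS are hypothetical names, and the "database" is just an array of table names.

```ruby
# Toy model of versioned migrations (NOT Rails' implementation).
# Each migration has a version number plus an "up" and a "down" step.
ToyMigration = Struct.new(:version, :up, :down)

class ToyMigrator
  attr_reader :current_version, :tables

  def initialize(migrations)
    @migrations = migrations.sort_by(&:version)
    @current_version = 0
    @tables = []   # stands in for the real database schema
  end

  # With no argument, go to the newest version (like `rake migrate`).
  # With a target, move up or down to it (like `rake migrate VERSION=n`).
  def migrate(target = @migrations.last.version)
    if target >= @current_version
      pending = @migrations.select { |m| m.version > @current_version && m.version <= target }
      pending.each { |m| m.up.call(@tables) }
    else
      applied = @migrations.select { |m| m.version > target && m.version <= @current_version }
      applied.reverse_each { |m| m.down.call(@tables) }   # undo newest first
    end
    @current_version = target
  end
end

TOY_MIGRATIONS = [
  ToyMigration.new(1, ->(t) { t << :users },  ->(t) { t.delete(:users) }),
  ToyMigration.new(2, ->(t) { t << :posts },  ->(t) { t.delete(:posts) }),
  ToyMigration.new(3, ->(t) { t << :groups }, ->(t) { t.delete(:groups) })
]

migrator = ToyMigrator.new(TOY_MIGRATIONS)
migrator.migrate       # run every "up" in order
puts migrator.tables.inspect
migrator.migrate(1)    # run the "down" steps for versions 3 and 2
puts migrator.tables.inspect
```

Notice that going down runs the newest migration's down step first, which is why each self.down only has to undo its own self.up.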

There's probably plenty more to be said about migrations, but this should be enough to get you started. And, of course there's always more reading you can do. For example, there's a pretty good discussion on the Rails wiki under the heading Understanding Migrations and when in doubt you can always RTFM. So, go on. What are you waiting for? Rake migrate!

[*]My copy of the Agile book is dated August 2005, and while the Loud Thinking post announcing migrations went up on July 6th, I'm sure that was after the book had gone to press. Stupid dead tree-based publishing!


Posted by Greg at 2:22 AM | Comments (5)

March 1, 2006

learns_to use the ternary operator to make concise views

Rails' templating system has lots of advantages. It's simple, lightweight, and intuitive. And since it's still just Ruby, you can refer to things exactly the same way as you would anywhere else in your application. The intuition you've built up working in your models and controllers and script/console (you do use script/console constantly while you're developing, don't you? If not, you're missing out on about half the fun of Rails) about how to call up your user's name will still hold: it's still just @user.name.

The one downside I've found is that the syntax for view logic can get kind of verbose, especially when you're doing repetitive things like displaying defaults for each piece of data you've got that's not set, like so:

    <h3>Name:</h3>
    <% if @user.name %>
      <%= @user.name %>
    <% else %>
      <em>[no name set]</em>
    <% end %>

If you have to do that four or five times on a page, then all of a sudden you've got a template that is getting long and hard to follow.

Here's a quick little trick to handle that kind of situation much more concisely and clearly. It takes advantage of the fact that Ruby almost always provides multiple different ways to accomplish the same thing using different syntaxes. In this case, we'll use a very dense C-style if-then syntax that Ruby supports, which uses a ternary operator (the section on this syntax is well down that page, if you search for "c-style" it should pop right up). It looks like this:

    condition ? if_true : if_false

or, in the case of our view logic from above:

    <h3>Name:</h3>
    <%= @user.name ? @user.name : "<em>[no name set]</em>" %>

How does this work? Ruby evaluates the expression to the left of the question mark. If that expression is true then it returns the expression between the question mark and the colon; if the expression is false, it returns what comes after the colon. Simple. And suddenly we've lost a lot of lines and a lot of potential places for confusion from our views.

Like with anything else, there are a couple of small gotchas to look out for. First of all, you've got to make sure that the attribute you're evaluating on (in our case @user.name) will return either nil or false if it's not set. If it has any kind of default value (even an empty string, since Ruby treats everything except nil and false as true), the condition will pass and your conditional expression will do exactly the opposite of what you were expecting. The other danger is forgetting to use proper parentheses if you have any kind of compound statement as the conditional or as either of the outcomes. Without them, things can get confusing in a hurry.
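
Here's a small plain-Ruby sketch of both gotchas. The helper names and the PLACEHOLDER constant are mine, invented for illustration, and treating an empty string as "not set" is an assumption about what you'd want, not anything Rails does for you.

```ruby
# PLACEHOLDER and both helpers are hypothetical names for this sketch.
PLACEHOLDER = '<em>[no name set]</em>'

# The ternary from above: falls back only when name is nil or false.
def name_or_placeholder(name)
  name ? name : PLACEHOLDER
end

# A stricter variant (my assumption about the desired behavior):
# an empty string also counts as "not set". The parentheses make
# the compound condition unambiguous.
def present_name_or_placeholder(name)
  (name.nil? || name.empty?) ? PLACEHOLDER : name
end

puts name_or_placeholder(nil)           # falls back to the placeholder
puts name_or_placeholder('').inspect    # empty string is truthy, so no fallback
puts present_name_or_placeholder('')    # stricter check falls back
puts present_name_or_placeholder('Greg')
```

If your database column defaults to an empty string rather than NULL, the first version will happily print nothing at all, which is the "exactly the opposite" case described above.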

This may seem like a small change, but it will make things more pleasant every time you come back to your views in the future and have to figure out from scratch what's going on in them. It's also just the Rails thing to do: less code where possible, and code that feels 'beautiful' to you where it's not.


Posted by Greg at 2:53 PM | Comments (1)

February 15, 2006

learns_to use acts_as_versioned

Lately, I've been learning Ruby on Rails. In layman's terms, Ruby on Rails is a system for making dynamic websites like Flickr, Friendster, and yes, Music For Dozens. To drastically oversimplify, some of the smart people have gone ahead and handled all of the common tasks like logging on, handling user input from forms, remembering data and displaying it when appropriate, etc. The idea is to make it easier to create a dynamic website by letting you focus just on the unique thing that you want your site to do and not all the boring and difficult infrastructure all such sites need.

Chris and I are writing a super-secret exciting new MFDZ project in Rails and, eventually, we'll be rewriting MFDZ itself in it. In the process, I've come across some opportunities to actually get paid for working on other people's Rails projects, so I've been devoting some semi-serious time to improving my skills. Predictably, I've encountered a few beginner-type problems to which I couldn't find great solutions online. Some of these I've been able to work through with the online help of some of the many strong Rails programmers in Portland and a few others I managed to solve with just my own wits. I figured it would be the neighborly thing to do to write down some of what I've learned here for the sake of the next poor soul who would otherwise Google-up empty on the same problem.

Thus, I introduce learns_to, a new category of post here at IDFDZ where I'll try to document a little bit of what I've learned about Rails as clearly and concretely as I can. If you're one of the people who emails me about my blog becoming too boring and "technical" or, more specifically, you're just not interested in Rails, I urge you to move on now. But, if you're me from earlier this week -- wondering how you'll ever get acts_as_versioned to work with your project when you barely know the difference between a plugin and a gem -- then hopefully you've come to just the right place.

What it is

Acts_as_versioned is a chunk of Rails code meant to help you keep track of, well, progressive versions of whatever it is your application keeps track of. The quintessential example is the way a wiki keeps track of edits made to a page's information. In fact, acts_as_versioned is great for giving wiki-like revisible user-edited content to any part of your Rails app.

Where to get it

As I alluded to in the introduction, there are two different ways to get your hands on acts_as_versioned: as a plugin or as a gem. RubyGems is Ruby's built-in packaging system. It has a number of conveniences including easy command line installation and updating. These characteristics make it definitely the best way to get Rails itself and to keep your local copy fresh. However, in the context of an add-on like acts_as_versioned, I would recommend using plugins whenever you have the opportunity. Plugins have the great advantage that they are automatically included by Rails itself when you start the server. You don't have to write any fancy setup code. That means not having to venture into environment.rb, or any other exotic places that can be scary for Rails newbies like me.

So, go ahead and download acts_as_versioned now (that link is to a .tgz, just what you'll need on the mac). Unzip it and move it into the /vendor/plugins folder in your Rails app and we'll get started using it.

Database setup

Before we get too far into the workings here, I should point out the documentation. It is not especially human readable, but it is a definitive reference for this stuff (most of what I'm about to say, I figured out by staring at the documentation for the Class Methods for a long time).

Anyway, the first thing we've got to do is get our system set up to use acts_as_versioned. This means database setup and some small changes to our models. Let's say we're doing a music app. We've got artists and we're representing them in an 'artists' table and a corresponding Artist class. Our artists have names and bios and, of course, ids. That means that our migration for creating our table looks like this (pre-versioning):

    create_table :artists do |t|
      t.column :name, :string
      t.column :bio, :text
    end

(If you're not familiar with migrations, they're the system Rails provides for representing your database structure and, especially, changes you make to it in code. They are super convenient. The Understanding Migrations tutorial on the Rails wiki is a great place to get started with them and, if you have any further questions, the Rails migration documentation is comprehensive. The sooner you start using migrations, the sooner you'll fall in love with Rails. Note: When you use migrations to set up your database, Rails adds ids automatically wherever they belong, which is why I didn't specify one here.)

Now, acts_as_versioned is going to mirror our artist data into a parallel table called 'artist_versions'. Each row of that table will represent a subsequent state, or version, of each of our artists. So we need to create the artist_versions table with a version column, a foreign key to tell it which artist it's keeping track of (in this case we'll call that one 'artist_id'), and a column for each bit of data we want to version from our original model; for now, let's just do bio. All this adds up to a migration that looks like this:

    Artist.create_versioned_table do |t|
      t.column :version, :integer
      t.column :artist_id, :integer
      t.column :bio, :text
    end

So the obvious thing to point out here is that we're using a new method "create_versioned_table" and that it's a method on the Artist model itself. In order for this to work, then, we're going to have to tell our Artist model that something's going on with versions. It's super easy; just one line: 'acts_as_versioned' within the Artist class. I like to keep it up near the relation declarations at the top so I don't lose track of it. Once we add that, the plugin does the rest of the work, adding a whole boatload of methods to our model including this migration method, create_versioned_table (and the converse we'll use in the down part of our migration: "Artist.drop_versioned_table"). Again, if you're doing this by hand, remember that our migration automatically adds an id to our table.

Assuming we've combined these two bits together into a migration and run it, we'll now have our database properly in place and we can start using all the methods the acts_as_versioned plugin has added to our Artist model.

Using the methods provided by acts_as_versioned

As you'll soon learn if you spend some time with the documentation, acts_as_versioned provides methods to do most things you can think of with it. I'm only going to go into detail here on the two that I think are most useful: revert_to, for rolling back to previous versions of your model and find_versions which is great for displaying old states of your data.

Let's start with find_versions. If @artist is an instance variable containing a particular artist then in our view we can do something like:

    <% for version in @artist.find_versions.reverse %>
      Version <%= version.version %> <br />
      <%= link_to '(revert to this version)', :action => 'revert_to_version', :version_id => version.id, :artist_id => @artist %>
    <% end %>

This view code iterates over all the saved versions of our artist (starting with the most recent and heading backwards) displaying the version number and then providing a link to revert to that version. Other similar methods will let you find specific versions that meet given criteria or to get at just the particular attributes that you're keeping versioned.

Now, let's look closer at that link_to call. It's calling a custom controller action called 'revert_to_version', passing in the id of the version we want to revert to and the id of our artist. We want this link to revert the artist we've got stored in @artist to the version whose id we're passing in. The controller code necessary to do this will use the revert_to method provided by acts_as_versioned, like so:

    def revert_to_version
      @artist = Artist.find( params[:artist_id] )
      @artist.revert_to! params[:version_id]
      redirect_to :action => 'show', :id => @artist
    end

All we're doing here is grabbing hold of our artist instance using the normal class 'find' method. Then we just call revert_to! (we use the conventional exclamation mark syntax to save as well as reverting) with the version_id as an argument and the old version of the artist's bio will now be saved in the right place in the artists table. One nice thing about using acts_as_versioned in this way is that it is non-destructive; all the more recent versions since the one to which we just reverted are still saved in our artist_versions table and we can always un-revert to them (if that's not too confusing).
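
If the non-destructive part is hard to picture, here's a toy model of the idea in plain Ruby. It is emphatically not the plugin's code: ToyVersionedArtist is a hypothetical class, and the hash inside it merely stands in for the artist_versions table.

```ruby
# Toy model of non-destructive versioning (NOT acts_as_versioned itself).
class ToyVersionedArtist
  attr_reader :name, :bio, :version

  def initialize(name, bio)
    @name = name
    @history = {}    # version number => bio; stands in for artist_versions
    @version = 0
    self.bio = bio
  end

  # Every change to the bio is recorded as a new version.
  def bio=(new_bio)
    @version = (@history.keys.max || 0) + 1
    @bio = new_bio
    @history[@version] = new_bio
  end

  def versions
    @history.keys.sort
  end

  # Restore an old bio without deleting any newer versions.
  def revert_to!(number)
    return false unless @history.key?(number)
    @bio = @history[number]
    @version = number
    true
  end
end

artist = ToyVersionedArtist.new('At Dusk', 'First draft of the bio.')
artist.bio = 'Second draft of the bio.'
artist.revert_to!(1)
puts artist.bio                # the first draft is back
puts artist.versions.inspect   # version 2 is still on file
artist.revert_to!(2)           # the "un-revert"
puts artist.bio
```

Because reverting only copies an old row back into place, nothing in the history is ever thrown away.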

And that is pretty much an introduction to acts_as_versioned. I've just scratched the surface of the subtle things you can do with it, especially when it comes to setting conditions for saving new versions. I've said it twice before, but a third time couldn't hurt: read the documentation. There's lots to learn.

Many thanks to Rick Olson, the author of acts_as_versioned, both for his great plugin and his rapid, clear, and helpful support when I was first trying to use it myself.


Posted by Greg at 12:36 AM | Comments (16)