August 28, 2006

learns_to build Academic Archive::Part 2:Setting up a New Rails App and a First Iteration on the Paper Model, Featuring our First Tests

Welcome to Part 2 of learns_to build Academic Archive, where I try to blog every last detail involved in building a Ruby on Rails application for publishing and peer-editing academic papers. As requested by Benjamin in the comments on Part 1, from now on, I'll be providing a table of contents to each post. So if you're looking for some specific piece of knowledge, you can jump right into the middle to get it. If you have any other ideas on how to make this series better, I'd love to hear about them in the comments.

Contents

  1. Creating a new Rails project
  2. Designing the Paper Model
  3. Setting Up the Database and Generating the Model
  4. Validating the Presence of Papers' Titles
  5. Getting Started with Testing: Fixtures
  6. Testing the Fixtures: Our First Test and First Test Helper
  7. Running Tests: Under Rake, Under Ruby
  8. How To Write a Test: Given, When, Then
  9. Philosophy of Testing

Well, we're airborne now. I posted Part 1 just before boarding a flight for LA and we just reached our cruising altitude.

At the end of Part 1, we'd thought our way through to a good starting design for the whole app and we were ready to start writing some real code. Specifically, we wanted to start with our central object: the Paper model. But before we write even our first line, we've got to do some setup and the tiniest bit more thinking.

Creating a new Rails project

First things first: run the "rails" command to generate the spine of a new Rails application in the file system:

    gabc:~/Sites Greg$ rails archive

I ran this command from my "Sites" directory where I keep all my projects. It will generate a new folder in there called "archive" and inside it will create a whole bunch of files and folders which constitute a fresh default Rails application.

If you cd into this directory and run "rails --version" you may find that you've got an old version of the framework (mine was at 1.1.2). Rails is a relatively new framework and it's undergoing a ton of rapid development. This is good because it means that new features get added all the time which make your job easier and old bugs get fixed. To take full advantage of this situation, we want to be running the most recent code (as I write this the latest release is 1.1.6). Thankfully all this takes is a single command:

    gabc:~/Sites/archive Greg$ rake rails:freeze:edge

We're using Rake, the handy-dandy Ruby build utility. Rake automates common Ruby programming tasks like creating, writing, and running files (especially tests). We'll be using Rake constantly in the setup and development of our app; to see all that it can do, run "rake -T" and you'll see a list of all the available rake commands with their descriptions. This particular rake command fetches the latest Rails code from its Subversion repository and freezes it into our app's vendor/rails directory, so the app runs against that copy rather than the installed gem. Rerun it whenever you want to pick up newer changes. When you run it, you'll probably see a bunch of subversion changes scroll down your screen as the framework gets downloaded.

Now, I've got to confess that I did all of this setup so far at home last night. I knew that I'd be working without internet access while I was traveling and obviously commands like "rake rails:freeze:edge" have to go out over the wire to get their job done. Also, since I was going to be traveling, I wanted to grab a local copy of the Rails documentation which I normally use online. So, if you're working with dependable web access you might skip this step, but it's nice to know how for when you need it:

    gabc:~/Sites/archive Greg$ rake doc:rails

Rake will go ahead and check to see if you've got any of the documentation, downloading it and installing it in your project's doc/api directory where you don't. It will take a good chunk of time and download a whole bunch of files.

Designing the Paper Model

Ok, we're good to go. Setup is done. We could start generating app-specific files and writing code right now if we wanted, but just the slightest bit more thinking and note-taking is probably in order first. We decided at the end of our last post that we were going to start work by building papers and then the surrounding paper-approval-category relationship. What we didn't discuss was any of the specifics of the Paper model itself. What is a "paper" really? What attributes does it have? Is that really the right name for it? During the electronic blackout period of our ascent here, I sketched some answers in my moleskine. I'll explain them now.

Oops. Speaking of electronic blackouts, I lost battery power just as I polished off that last paragraph. I spent the rest of the flight into LA napping and reading. Not altogether unpleasant. Now, I'm in the corner of an LAX gate about 100 yards from where my flight will board, hunched over the only open outlet in the vicinity, trying to catch a quick charge before my flight for NY boards in 45 minutes.

Anyway, the last question that I asked in the air over Oregon may seem kind of nit-picky, but when it comes to domain modeling, the names we choose for things turn out to be surprisingly important. They should be expressive and unambiguous. We need to be able to remember what they mean without confusion upon returning to our code after a long break. A good rule of thumb is: would this name make sense to someone who knows about the domain, but is not in any way a coder? For example, we could call our main object Article instead of Paper. Usage differs even within academia. In the humanities they tend to be papers when delivered at conferences and articles when printed in journals. Students and teachers think of them as papers. Engineers and scientists tend to lean towards papers as well -- for them "article" has a more formal ring to it. I chose paper instead of article because it has less linguistic ambiguity and talking about "an editor's articles" makes me think of parts of speech as much as written documents. You'll find as we go along that I do some hand wringing each time a new name needs to be coined. The process is even tougher when dealing with join models and other nouns that don't have a precise correlation to words in the real world (at work right now we're thinking about changing the name of a model from Batch to Batching because it really represents an event wherein some things are joined together into a batch. Both of those choices sound ugly and are confusing in different contexts).

So, what attributes does a Paper have? Here's a transcription of the sketch I made on my way in from Portland:

  • title
  • created_at
  • updated_at
  • url?
  • file_column?

The first attribute is pretty self-explanatory. The next two are time stamps; created_at tells you when the paper first entered our system and updated_at when it was last changed. These are pretty standard in database-driven web apps and if you include them on your models in a Rails app, Rails will automatically make sure that they get set in the way you'd expect.

A note here about attributes and the role of the database in a Rails app. So far, we've talked about our models in terms of the way they capture real world objects into the abstraction of our design. From another point of view, though, our models are simply representations in code of the database tables we're going to create. The database acts as persistent memory for our program. Here's how it works. At various points along the way, for example when we create a fresh object, the instance of our model will correspond exactly to the state of one row in our database. In concrete terms, if we wrote:

    thesis = Paper.create :title => "It's Not Just Academic"

Then the object stored in "thesis" would correspond exactly with a row in the papers table. Each of its attribute-reader methods would return precisely the values of the corresponding columns in the database. Now, say we start changing the values of our paper's attributes like so:

    thesis.title = "It Is Just Academic"

Well now the object we have in memory, the paper we're working with in our Ruby code, has diverged from the corresponding paper that we've got saved in the database. This will remain true until we call either "save" or "reload". Calling "save" makes Rails write our version of the object to the database, updating each of the columns so they represent the current values of the attributes. Calling "reload" makes Rails revert the paper we've got in memory to the state stored in the database: attributes get reset to the values of their corresponding columns, and whatever information we'd placed into them is overwritten.
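To make the in-memory versus in-database distinction concrete, here's a minimal plain-Ruby sketch. It does not use ActiveRecord at all: TABLE and SketchPaper are hypothetical stand-ins (a hash playing the papers table and a tiny class playing the model) that only imitate how "save" and "reload" behave.

```ruby
# Hypothetical stand-in for the papers table: id => row (a hash of columns).
TABLE = {}

# A toy model, NOT ActiveRecord; it only imitates save and reload.
class SketchPaper
  attr_accessor :title
  attr_reader :id

  def initialize(title)
    @title = title
  end

  # Copy the in-memory attributes into the "row".
  def save
    @id ||= TABLE.size + 1
    TABLE[@id] = { :title => @title }
    true
  end

  # Throw away in-memory changes and re-read the "row".
  def reload
    @title = TABLE[@id][:title]
    self
  end
end

thesis = SketchPaper.new("It's Not Just Academic")
thesis.save

thesis.title = "It Is Just Academic"   # memory and table now disagree
puts TABLE[thesis.id][:title]          # the table still holds the old title

thesis.reload                          # memory goes back to the stored title
puts thesis.title

thesis.title = "It Is Just Academic"
thesis.save                            # now the table catches up
puts TABLE[thesis.id][:title]
```

The real thing talks to MySQL instead of a hash, of course, but the rhythm is the same: assignments only touch the object in memory until one of those two calls settles the disagreement.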

The last two attributes on our Paper model, url and file_column, represent two different ideas I had for keeping track of the location of the actual HTML files that our authors upload. The first and simpler of the two (the one I'll probably start with, in other words) is url. That would just be a string that keeps track of the location in the file system to which we uploaded the HTML file. Under this system, the part of our code that accepts uploads will have to be sure to record the uploaded-file's name so that we'll know where to look for it and how to link to it. The other option "file_column" represents an option I know a little less about, the File Column Plugin. I've never actually used it myself, but I've heard tell of a Rails plugin that allows you to store uploaded files in the actual database itself, handling all of the conversion code so that you can access the file from the database just as you would any other attribute stored there. That sounds intriguing and probably has important optimization repercussions (in other words, it probably plays a big part in determining what resource the application will consume most voraciously: memory on disk, database calls, processor time, etc.). Right now, storing the url as a string seems simpler to me so I'm going to start with that while making a note that the file column plugin is something I should look into more closely later.

Setting Up the Database and Generating the Model

Now that does it for theory and it's time to start actually coding our app (finally!). Wait. Wait. I just realized we've got one more small piece of configuration business to take care of: setting up and configuring the database. This bit is easy and once you've made a few Rails apps you'll be able to do it by rote. There are a ton of different combinations of databases, database engines, operating systems, etc. out there, so I'm just going to tell you what I have to do to get set up. If you're running on a contemporary Mac with a well-configured copy of MySQL things shouldn't be too different for you. If not, Google around, there are plenty of resources out there to help you get things right. Here we go:

First I've got to create the trio of databases on which a Rails app depends: development, test, and production. I'll do this from the command line:

    gabc:~/Sites/archive Greg$ mysql -p -u root
    (type your root password)
    mysql> create database archive_development;
    mysql> create database archive_test;
    mysql> create database archive_production;
    mysql> exit

Then, I'll open up config/database.yml and add my MySQL password to each of the three entries. Now we should be totally good to go. Serious this time. Let's run the server just to make sure:

    gabc:~/Sites/archive Greg$ mongrel_rails start -d

Bringing up localhost:3000 in my browser I see: "Welcome aboard: You're riding the Rails!"

At last, it's time to get started on our Paper model. First I'll run the Rails model generator to get all of the files I'll need created and set up:

    gabc:~/Sites/archive Greg$ script/generate model Paper

This'll give us, in addition to the model itself, a unit test and fixtures that are all set up and ready to go as well as a migration for setting up the database to handle our new model.

I'll write the migration next since we've basically done all the work already when thinking about what attributes our papers need to have. Here it is (archive/db/migrate/001_create_papers.rb):

    class CreatePapers < ActiveRecord::Migration
      def self.up
        create_table :papers do |t|
          t.column :title, :string
          t.column :url, :string
          t.column :updated_at, :datetime
          t.column :created_at, :datetime
        end
      end

      def self.down
        drop_table :papers
      end
    end

The generator left me with empty self.up and self.down methods, which I've filled in to create the papers table with all the proper fields. Like I said above, the table that corresponds to our model is basically just another view on our model. When we save an individual Paper object the table will store the values that we've assigned to the object. And Rails provides us with convenient methods for reading them back out again. In a minute we'll get to using those, but first let's actually run our migration:

    gabc:~/Sites/archive Greg$ rake migrate

Now the papers table exists and has the right fields. We can even go in right away and make a paper by hand if we want via Rails' "console", a shell the framework provides for interacting directly with our data. The console is a great place to sift through your data by hand or try out expressions when you're working on writing custom methods:

    gabc:~/Sites/archive Greg$ script/console
    >> thesis = Paper.new :title => "It's Not Just Academic"
    => #<Paper:0x26b6e5c @attributes={"updated_at"=>nil, "title"=>"It's Not Just Academic", "url"=>nil, "created_at"=>nil}, @new_record=true>
    >> Paper.count
    => 0
    >> thesis.save
    => true
    >> Paper.count
    => 1
    >> thesis
    => #<Paper:0x26b6e5c @attributes={"updated_at"=>Mon Aug 21 14:47:04 EDT 2006, "title"=>"It's Not Just Academic", "url"=>nil, "id"=>1, "created_at"=>Mon Aug 21 14:47:04 EDT 2006}, @new_record=false, @errors=#<ActiveRecord::Errors:0x2637a6c @base=#<Paper:0x26b6e5c ...>, @errors={}>>
    >> thesis.title
    => "It's Not Just Academic"

If you follow along with that input, you'll see that I made a new paper with the title "It's Not Just Academic," storing it in a local variable called "thesis". Since I hadn't yet saved the new paper, there were still no papers to be found in the database. Then I did save it, which succeeded, returning true, and re-counted the papers in the database to discover that it was there now. Next, I looked at the object stored in thesis to find a paper different from the one I'd originally put there. It now had an id and non-nil values for "created_at" and "updated_at", along with an additional instance variable by the name of @errors where Rails would store any errors that it happened upon while saving the object (you can read out the current errors on any object by saying something like this: thesis.errors.full_messages). And finally I used a method automatically added by Rails to read off the thesis's title attribute.

Validating the Presence of Papers' Titles

Ok. Now that we're past the total basics of getting our Paper model up and running, we can actually start doing something with it. What do we want the Paper model to do? Well, from when we thought about our screens earlier we know that when users upload papers they're going to be giving us two things: the title, and the HTML file. We're then going to need to store the title in the database, store the file in the filesystem, and store the file's location in the database as well, specifically in the url field we added to the papers table. It would be great if we could give the papers nice urls. For example, I'd love it if the url for my thesis could be something along the lines of: www.academicarchive.org/borenstein/art_history/its_not_just_academic.html. Now I don't want to think too hard about the "/borenstein/art_history" part right now because that's going to have to do with routing and right now I'm trying to concentrate on the Paper model. What I do know from this is that we don't want to save any papers into the database that don't have titles and we're going to want to figure out a system for making the titles our users give us safe to use as urls (there are rules about what can and can't be in a url, i.e. you can't have spaces, can't have apostrophes, they have to be under a certain length, etc.).
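Just to get a feel for that second problem, here's a rough plain-Ruby sketch of a title-to-url cleaner. Everything in it is an assumption on my part: the name "slugify", the choice of underscores, and the 50-character limit are all hypothetical, and it isn't something Rails hands you.

```ruby
# A hypothetical helper, not part of Rails: turn a title into a url-safe name.
# The default max_length of 50 is an arbitrary choice for this sketch.
def slugify(title, max_length = 50)
  slug = title.downcase
  slug = slug.gsub("'", "")             # drop apostrophes: "it's" -> "its"
  slug = slug.gsub(/[^a-z0-9]+/, "_")   # any run of other characters becomes one underscore
  slug = slug.gsub(/\A_+|_+\z/, "")     # trim underscores from both ends
  slug[0, max_length].sub(/_+\z/, "")   # enforce the length, then re-trim the tail
end

puts slugify("It's Not Just Academic")   # its_not_just_academic
```

This version quietly throws away accented letters and anything else outside a-z and 0-9, which is probably too blunt for real academic titles, but it shows the shape of the job.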

I want to take the first of these first: making sure that every paper we save in the database has a title. Thankfully, Rails makes this super easy with a system called validations. In essence, validations are just methods that automatically get run at different points in an object's life cycle (when you make a new one, when it gets saved, etc.), throwing errors unless the object meets certain criteria. When our app has actual views, we can use the validation errors to let our users know that they've done something wrong through on-screen feedback. At this point though, we're just going to use it to make sure that all of our papers have titles. The validation is a one-liner add, like so (in archive/app/models/paper.rb):

    class Paper < ActiveRecord::Base
      validates_presence_of :title
    end

What does Rails' implementation of this validation actually look like in practice? Let's jump into script/console and find out:

    gabc:~/Sites/archive Greg$ script/console
    Loading development environment.
    >> thesis = Paper.new
    => #<Paper:0x2662e9c @attributes={"updated_at"=>nil, "title"=>nil, "url"=>nil, "created_at"=>nil}, @new_record=true>
    >> thesis.title
    => nil
    >> thesis.save!
    ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
            from ./script/../config/../config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
            from (irb):3

You can see that we built a new paper and didn't assign it a title. Then when we tried to save the paper, Rails raised an "ActiveRecord::RecordInvalid" error that included a message explaining its cause and a traceback showing us exactly where in the code the problem came up (we called "save!" with the exclamation mark at the end because that tells Rails to throw an error in our face if one comes up instead of quietly returning false).
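If the difference between "save" and "save!" is still fuzzy, this plain-Ruby sketch imitates it. It is not how ActiveRecord is actually written: ValidatedPaper and RecordInvalidSketch are made-up names, and the single hard-coded check stands in for validates_presence_of :title.

```ruby
# Toy stand-ins, NOT ActiveRecord: one class imitating a presence validation.
class RecordInvalidSketch < StandardError; end

class ValidatedPaper
  attr_accessor :title
  attr_reader :errors

  def initialize(title = nil)
    @title = title
    @errors = []
  end

  # Re-run the check and report whether the object passed.
  def valid?
    @errors = []
    @errors << "Title can't be blank" if title.nil? || title.strip.empty?
    @errors.empty?
  end

  # Quiet version: just returns false when validation fails.
  def save
    valid?
  end

  # Loud version: raises when validation fails.
  def save!
    raise RecordInvalidSketch, "Validation failed: #{errors.join(', ')}" unless save
    true
  end
end

untitled = ValidatedPaper.new
puts untitled.save.inspect      # false
puts untitled.errors.inspect    # ["Title can't be blank"]

begin
  untitled.save!
rescue RecordInvalidSketch => e
  puts e.message                # Validation failed: Title can't be blank
end
```

The quiet version is what you'd want in a controller, where a false return can send the user back to the form; the loud version is handy in the console and in tests, where you'd rather hear about the problem immediately.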

Getting Started with Testing: Fixtures

Now that we've finally written some actual code, our next job is to make sure that code actually works as we expect it to and that means tests. Testing is a big subject, but suffice it to say here that it has two main purposes: to make sure our code does what we think it does and to make it easy for us to change our code later on (if we make a major change and all the tests still pass, that's a good sign that the rest of our code still works; if they don't, well that means we've probably got some fixing to do). (Don't worry if you're totally new to testing and the whole concept seems a little fuzzy to you. It will become clear in a minute when we actually write our first test -- tests are one of those things, like spiral staircases, that are much easier to show than to describe.)

Anyway, for our tests to be most effective, we want to cover as much of our code as possible and that means starting right away. The more untested code you write the less likely you are to ever go back and add tests and the more likely you are to end up with confusing, unmaintainable code. In fact, some people insist that you should "test first," writing tests that define the behavior you want from your code before writing your code itself. That way you don't "overcode"; you make sure not only that your code works, but that it doesn't have any undesirable side effects. We may do some test first development a little later on, but right now we're in a simple enough situation that I'm perfectly happy to start testing with a whopping one line of existing code.

What do we want to test? We want to test that our code actually does require each paper to have a title like we're trying to get it to and, further, that a paper without a title will always throw an error. So, the first thing we need is some fake papers to play around with for testing. As part of its testing suite, Rails gives us a place to create these papers: the fixtures. You can think of fixtures as just like tables in the database, only they happen to be represented in a flat file. At the start of a test run, Rails loads the data in these files into a temporary testing database so you can access it in your test methods. This makes it perfect for creating different scenarios against which to run your code and make sure that it does the right thing. In our case, we're going to want to make some papers and see if our code can tell whether or not they're valid.

Rails already created our fixture file for us when we generated the Paper model, so let's open it up and take a look (it lives at test/fixtures/papers.yml):

    # Read about fixtures at http://ar.rubyonrails.org/classes/Fixtures.html
    first:
      id: 1
    another:
      id: 2

Here's how this works: the non-indented lines are "names" by which we can refer to each entry. The other lines are pairs of column names and row values in the table. It will quickly become clear if I show you how I turned the version of my thesis we were playing with before in script/console into a fixture:

    thesis:
      id: 1
      title: "It's Not Just Academic"
      created_at: 2006-08-21 09:34:28
      updated_at: 2006-08-21 09:34:28

Pretty self-explanatory. The one gotcha is the format of the "created_at" and "updated_at" fields, which look different than what Ruby printed to the screen when we were in script/console. This is MySQL datetime format. When I can't remember how it goes, I make a new record in script/console and then just go look at my database using a GUI tool like YourSQL (especially when I'm on an airplane on the way from NY to San Francisco with no access to the web). There are a few other things that commonly go wrong when working with fixtures and I'll just point them out here, while we're on the subject:

  1. The .yml format (YAML, rhymes with "camel") is super picky about white space: indent with spaces, never tabs, keep the indentation consistent (two spaces is the convention), and always put a space between the colon and the value.
  2. Each entry in a particular fixture file needs to have a unique id; if you accidentally re-use the same id twice in one file everything will go haywire.
  3. The test database doesn't necessarily get reloaded each time you run your test, only if you run it under rake. Sometimes this can get especially confusing because the fixtures that get loaded up for one test tend to stick around for the next one, and so you can have tests that pass or fail depending on what order you run them in (for example a functional test that fails when you run "rake test:functionals" may pass if you run just "rake", which runs the units first before the functionals).
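The re-used id gotcha is easy to check for mechanically. Here's a small sketch using Ruby's standard YAML library; the fixture text is inlined as a string for the example, and "duplicate_fixture_ids" is a hypothetical helper of my own, not something Rails provides.

```ruby
require "yaml"

# A hypothetical fixture file's contents, inlined as a string for the sketch.
# The second entry deliberately re-uses id 1.
FIXTURE_TEXT = <<YAML_END
thesis:
  id: 1
  title: "It's Not Just Academic"
another:
  id: 1
  title: "Oops, same id"
YAML_END

# Return the ids that appear more than once in a parsed fixture hash.
def duplicate_fixture_ids(fixtures)
  ids = fixtures.values.map { |row| row["id"] }
  ids.select { |id| ids.count(id) > 1 }.uniq
end

fixtures = YAML.load(FIXTURE_TEXT)
puts fixtures.keys.inspect                     # the fixture names
puts duplicate_fixture_ids(fixtures).inspect   # ids used more than once
```

Parsing the file this way also doubles as a quick white space check: if the indentation is off, YAML.load will usually complain or hand back a structure that looks nothing like what you meant.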

If you're totally new to tests, some of that may have just seemed like gibberish. Don't worry about it. You can always reread that paragraph if you're running into mysterious errors at some future point down the line...

Testing the Fixtures: Our First Test and First Test Helper

I'm back in Portland now and recovered from my travels. Where were we? That's right. We've got our fixture in place so it's time to write some tests! Before we try and test our actual code, though, it's probably a good idea to make sure that our fixture itself is well-formed, or else our tests will be pretty useless. I've got a little test helper method from some earlier projects that's super helpful for this (for full disclosure, like most things it was probably actually Chris's idea). If we want a method to be available to all our tests, we just stick it in test/test_helper.rb, so that's where we'll stick the following code (there's a helpful little comment in test_helper.rb that will guide you once you're in there):

    def assert_all_valid klass
      klass.find(:all).each do |obj|
        assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
      end
    end

Let's walk through this method. First of all, it takes a class as an argument. Since "class" itself is a reserved word (a word that has special properties in Ruby and is hence unavailable as a name for a normal variable) we call it "klass". We might as well have called it "bob," but "klass" is conventional because it's easy to remember what it means. Once that's understood, there's not too much else going on here. We use Rails' "find(:all)" syntax to find all the members of our class and then we assert the validity of each particular member in turn, printing out a helpful message if the object is not valid. When defining custom test_helper methods of your own you'll save yourself a lot of headaches if you make the error message as specific as possible so that, when the test fails, it will be clear what went wrong as well as, importantly, which particular objects or attributes were involved (hence the inclusion of obj.class and obj.id in the message).

A note of syntactical explanation: Rails adds a method to our objects called "valid?" that returns true if the object passes its class's validations and false if not; "assert" is the simplest testing method, passing if its argument is true and failing if it is false. Put these two together and you've got a test that passes if and only if the object is valid.
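If you'd like to see that combination without any of the Rails machinery around it, here's a plain-Ruby sketch. All of the names are stand-ins I made up for illustration: sketch_assert plays the role of "assert", FakePaper plays a model with a presence validation on its title, and sketch_assert_all_valid has the same shape as the helper above but walks a plain array instead of calling klass.find(:all).

```ruby
# Toy stand-ins, NOT Test::Unit or ActiveRecord.
class AssertionFailedSketch < StandardError; end

# Pass quietly when the condition is true, fail loudly with the message otherwise.
def sketch_assert(condition, message = "assertion failed")
  raise AssertionFailedSketch, message unless condition
  true
end

# A struct playing the part of a model that requires a title.
FakePaper = Struct.new(:id, :title) do
  def valid?
    !title.nil? && !title.empty?
  end
end

# Assert the validity of each object in turn, naming the offender on failure.
def sketch_assert_all_valid(objects)
  objects.each do |obj|
    sketch_assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
  end
end

papers = [FakePaper.new(1, "It's Not Just Academic"), FakePaper.new(2, nil)]

begin
  sketch_assert_all_valid(papers)
rescue AssertionFailedSketch => e
  puts e.message   # FakePaper with id 2 is invalid
end
```

Notice how much work the message does: without the class and id in there, all you'd know is that something, somewhere, wasn't valid.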

Now, let's write and run the test. In the test for our Paper model that Rails automatically stubbed out for us (test/unit/paper_test.rb), we'll replace the sample method with:

    def test_fixtures
      assert_all_valid Paper
    end

Save the file and then run the tests like so:

    gabc:~/Sites/archive Greg$ rake test:units
    (in /Users/Greg/Sites/archive)
    /opt/local/bin/ruby -Ilib:test "/opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader.rb" "test/unit/paper_test.rb"
    Loaded suite /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader
    Started
    F
    Finished in 0.186302 seconds.

      1) Failure:
    test_fixtures(PaperTest)
        [./test/unit/../test_helper.rb:30:in `assert_all_valid'
         ./test/unit/../test_helper.rb:29:in `assert_all_valid'
         ./test/unit/paper_test.rb:7:in `test_fixtures']:
    Paper with id 2 is invalid.
    <false> is not true.

    1 tests, 2 assertions, 1 failures, 0 errors
    rake aborted!
    Command failed with status (1): [/opt/local/bin/ruby -Ilib:test "/opt/local...]

What's this? Our very first test and we've already failed it! Well, thanks to the message we added to our custom assertion, it's really easy to tell what's going on: we have an invalid paper in our fixtures (when tests fail or throw errors they print out Es and Fs and then report back on the problem with a trace, showing which lines in which files got run before the problem hit; if you're trying to track down a less obvious problem than this one, that trace will be your lifeline). If we look at our paper fixtures (test/fixtures/papers.yml), we'll see that, in addition to the paper we created above, we've got the second one that Rails automatically created for us still hanging around:

    another:
      id: 2

And that paper is definitely not valid. Remember, we're validating the presence of our papers' titles and this one hasn't got one. It's only got an id. So, in order to get this test to pass, we've got to either delete this paper from our fixture or edit it so it'll be valid. Let's do the latter, like so:

    another:
      id: 2
      title: "Simulacra and Simulacrum"
      created_at: 1996-08-21 09:34:28
      updated_at: 1996-08-21 09:34:28

Now, saving the file and rerunning should result in our first clean test run:

    gabc:~/Sites/archive Greg$ rake db:test:prepare
    (in /Users/Greg/Sites/archive)
    gabc:~/Sites/archive Greg$ ruby test/unit/paper_test.rb
    Loaded suite test/unit/paper_test
    Started
    .
    Finished in 0.194362 seconds.

    1 tests, 2 assertions, 0 failures, 0 errors

This is great! After a little bit of setup, we've successfully tested the code we just wrote: our validation catches papers that don't have titles.

Looking a little closer at the output from the test run, notice that we got credit for two assertions rather than just one, even though our test method makes a single call to "assert_all_valid". That's because the count tracks the calls to "assert obj.valid?" inside the helper, and the helper makes one of those for each paper in our fixtures. We have two papers, so we get two assertions. If we had three papers in our fixtures, the report would show three assertions, and so on.

Running Tests: Under Rake, Under Ruby

It is probably worthwhile to spend a moment here on some of the specifics involved when running tests. There are four basic ways to run tests: a "full rake", just the units (the tests that exercise our models), just the functionals (those that exercise our controllers), or individual test files one at a time. The first three we do by invoking rake ("rake", "rake test:units", and "rake test:functionals" respectively) and the last we do by just running the test file as if it was any other ruby program ("ruby test/unit/paper_test.rb", for example). When you run your tests, Rails uses a different database from the one you're developing on. If you remember some of the configuration we did above, when we set up our database.yml file, we told Rails to use a database called "archive_test" for this purpose. At the start of each run, rake clears that database and then loads it up with the data you stored in your fixtures so that you'll have a controlled environment in which to do your testing. Further, the Rails testing framework keeps the data generated in each test method from polluting your database for other methods. Each test method gets a clean start.

Besides running different sets of test files, each of the three different rakes (full, units, and functionals) does this database destroying and recreating process separately. So, if you run a full rake, your test database gets destroyed and recreated twice, once at the start of the rake when the units run and once halfway through before the functionals do. Since rake only loads up the tables that you tell it to (by including different sets of fixtures at the top of each of your test files) this ordering can mean that you can get different results from the same test! Let's say you were working on a functional test. When you run that test under rake test:functionals only the set of tables explicitly asked for in the functionals tests get loaded. Under a full rake the units run first, so by the time your tests get run, the tables created by the units will still be hanging around. If your test's passing or failing hinges on this difference, you'll see different behavior in the two situations. If you encounter this issue just make sure that each of your tests calls all of the fixtures that it needs (don't forget the ones being referenced through associations either!).

And finally when you're just running a single test like "ruby test/unit/paper_test.rb" -- which can be a real time saver once you've got a lot of tests written and running the whole suite takes a full minute or two -- you don't have the benefit of rake's database loading at all. Your test will run with whatever state the test database was left in by your last rake. This can produce some seriously strange results that will have you chasing ghost bugs that aren't really there. To prevent that problem, simply run "rake db:test:prepare" before your test and rake will set up your test database just how you want it.

How To Write a Test: Given, When, Then

Now, while our first test definitely exercised the code we just wrote (the validation obviously got run), it plays kind of a more general role: guarding our paper fixtures from any invalid data. More to the point, if we stopped validating on the presence of a paper's title, the test would still pass (try it, go delete the whole line and then rerun your tests). Therefore, this can't quite be said to be a test on that validation as such. So, let's write one.

How, generally, do you write a test? Well, most tests have three parts: the setup that must be in place to accomplish some action, the actual code that runs the action (this is the code you're trying to test), and then some ideas about what we expect the effect of that action to be. Splitting these parts up in your mind and then addressing them one at a time usually makes it much easier to write a test. When I start my test methods, I find it helps to start by writing these parts down explicitly as comments so I can keep focused on exactly what I have to do (plus it lets me do a bunch of typing, which feels productive, without having to actually do any thinking), like so (in test/unit/paper_test.rb):

    def test_validates_presence_of_title
      #given

      #when

      #then

    end

Giving tests descriptive names is always a good idea since the whole point of them is that if you ever see them in a test run they should tell you exactly what's gone wrong. Rails will only run test methods that actually start with "test_", so a good recipe for naming tends to be appending some description of what you're testing onto there.

Back to the question of how to test our validation. Let's try to say the three parts of our test in words. Given a paper that has no title, when we try to save it, then the paper should throw an error, remain unsaved, and report itself invalid. Now, that's starting to sound like something I could write up in code. I'll give it a shot. Here's my first draft:

    def test_validates_presence_of_title
      #given
      p = Paper.new

      #when
      p.save!

      #then
      assert !p.valid?
    end

I make a new paper. Don't assign it a title. Try to save it. And then assert that it is not valid. Just like I planned. What happens when I run that test?

    1) Error:
    test_validates_presence_of_title(PaperTest):
    ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
        /Users/Greg/Sites/archive/config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
        ./test/unit/paper_test.rb:14:in `test_validates_presence_of_title'

Oops! Trying to save the paper failed, like it was supposed to, but the error that it threw prevented the rest of our test from executing. What we need to do is wrap our save call in an assertion which knows to expect the error, like so:

    def test_validates_presence_of_title
      #given
      p = Paper.new

      #when
      assert_raises(ActiveRecord::RecordInvalid){p.save!}

      #then
      assert !p.valid?
    end

This is a passing test. The assert_raises method takes an error type as an argument (thankfully we knew exactly what type of error to expect since we'd already seen it on the first run) and passes only if the code in its block throws that error.
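To get a feel for what an assertion like assert_raises is doing for us, here's a minimal plain-Ruby sketch of the idea. This is not the Test::Unit implementation: expect_raises, FakeRecordInvalid, and FakePaper are all hypothetical stand-ins I made up so the snippet can run without Rails or a database.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordInvalid.
class FakeRecordInvalid < StandardError; end

# Hypothetical stand-in for the Paper model: valid only when it has a title.
class FakePaper
  attr_accessor :title

  def valid?
    !title.nil? && !title.strip.empty?
  end

  # Like save!, this raises instead of returning false when validation fails.
  def save!
    raise FakeRecordInvalid, "Validation failed: Title can't be blank" unless valid?
    true
  end
end

# Runs the block and returns the error if it is of the expected class.
# Raises if the block finishes without raising anything. Errors of any
# other class are not rescued, so they propagate to the caller.
def expect_raises(error_class)
  yield
rescue error_class => e
  e
else
  raise "expected #{error_class} but nothing was raised"
end

p = FakePaper.new                                     # given
error = expect_raises(FakeRecordInvalid) { p.save! }  # when
puts error.message                                    # then
puts p.valid?
```

The point of the sketch is just the shape: the expected error gets caught so the rest of the test can keep going, while a save that unexpectedly succeeds becomes a failure of its own.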

Now, I'll show you just one more iteration of this test with a few more trimmings:

    def test_validates_presence_of_title
      #given
      paper_count = Paper.count
      p = Paper.new
      assert !p.title

      #when
      assert_raises(ActiveRecord::RecordInvalid){p.save!}

      #then
      assert !p.valid?
      assert_equal paper_count, Paper.count
    end

What have I added? Start with the first and last lines. One of the things we'd said we wanted to test was that the paper should remain unsaved. Well, there are two sides to that: the object's side and the model's side. We're already testing for the error thrown by the call to "save!", but now we want to test the model side, i.e. that the number of papers in the database doesn't change. To test that, we store the count of papers into a local variable (paper_count) on the first line and then compare it to a fresh count on the last line ("count" is a useful method that Rails adds to all of your model classes; it returns the same number as Model.find(:all).length, but gets it from a SQL COUNT query rather than by loading every record). As long as these two are the same, we'll know that nothing we've done has affected the count of papers in the database.

The other thing I've added is the assertion that, just after it is newly made, the paper does not have a title. While somewhat extraneous, the purpose of this assertion is to make explicit one of the assumptions in our given state: a new paper doesn't have a title. Since it's the very absence of that title that renders the paper invalid, it made sense to write an assertion verifying it before getting to the heart of the matter.

Philosophy of Testing

Is this overkill? This particular example is obviously somewhat contrived. I probably wouldn't be this thorough in testing such a simple situation if I wasn't trying to demonstrate the ins and outs of my thought process while writing tests. But what should our "philosophy of testing" be? Is it possible to have too many tests? What should be the thrust of the tests that we do write?

Like so many other things, answers to these questions are partially a matter of taste and partially a matter of responding to the particular situation you find yourself in, both of which are things that are hard to learn through any other method besides experience (I work all day with coders who are better at them than I), but I think I can lay down a few guidelines that help guide my thinking.

Let's start with some don'ts:

  • Don't test something that's part of the framework or a third-party library. If you don't trust other people's code enough to use it without redundant testing, you should probably just avoid using it altogether. Plus, this is just unnecessary extra work when the whole point of using libraries and frameworks is to avoid duplicating effort that other people have already put in. (To a certain extent we're breaking this rule in our test above, but not too badly. The key difference is that we're testing whether we've successfully used the framework to enforce a business logic rule (that papers must have titles) rather than whether or not the framework's code for enforcing that rule works in the first place.)
  • Don't let your tests lock down the specifics of your code too much. When I first got into the swing of writing tests, I got hooked on assertions. I wanted to run up the score, to see more dots zoom across my screen. And so for a while, I picked up the bad habit of writing assertions on everything I knew to be true in my code: the exact wording of error messages, the exact values of a bunch of attributes in the fixtures, etc. This turned out to be a bad idea because it made my tests incredibly fragile. Anytime I'd twiddle around with my fixtures at all (say, to fix a typo), my tests would break. My tests were making more work for me when they were supposed to make my life easier. Which brings me to. . .
The dos:
  • Do write tests that ensure outcomes. Our goal with writing tests is to leverage a specific situation we've thought of (and, often, captured in the fixtures) into a general structure that will make sure that our code will act right in all situations. For example, in testing our validation above, we could have written something like this:

      def test_validates_presence_of_title
        #given
        p = Paper.new
        assert !p.valid?

        #when
        p.title = "My title"

        #then
        assert p.valid?
      end

    On the surface, this test seems a lot like the one we wrote above. It asserts that a paper without a title is not valid, adds a title to the paper, and then asserts that paper is valid. What it doesn't do is engage with the more general purpose of our validation: preventing papers that lack titles from getting saved to the database. It also has some specifics hard coded into it: the choice of "My title" as a title. While that seems fine right now, what if we made a change later on that, say, required all of our titles to be formatted in Unicode for internationalization? Then this test would start to fail even though it was unrelated to the new code we were trying to write. It would become yet another spot in our code we had to change to add a new feature or to alter our design.
  • Do write tests first to specify behavior. Oftentimes tests are just a better medium in which to think about the design for your program than the program itself. Writing a test lets you think precisely about what you want your code to do without worrying about how it's going to have to get it done. For example, take the goal I mentioned of having pretty urls for our papers (getting the url for my thesis to end with "its_not_just_academic.html"). Well, I still don't have a clear plan for how to accomplish that goal, but I know how to write a test on it:

      def test_paper_url
        #given
        p = Paper.new :title => "It's Not Just Academic"

        #when
        p.save!

        #then
        assert_equal "its_not_just_academic.html", p.url
      end

    Right now, running this test will result in a failure:

      1) Failure:
      test_paper_url(PaperTest) [./test/unit/paper_test.rb:28]:
      <"its_not_just_academic.html"> expected but was
      <nil>.

    But now I've got the beginning of a kind of objective standard against which I can write my system for generating papers' urls from their titles. If this test (and presumably some others) passed then I would be done. Working this way lets me focus on making the individual parts of my code work without having to constantly be trying to remember what the point of all of this code was. The tests keep track of the larger context so I don't have to.
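Just to show how a test like test_paper_url gives you something concrete to aim at, here's one possible way the title-to-url step could go. This is only a sketch and not a decision about how the app will actually do it: title_to_url is a hypothetical helper name, it lives outside any model, and it only handles plain ASCII titles.

```ruby
# Hypothetical helper: turns a paper title into a URL-friendly file name.
def title_to_url(title)
  slug = title.downcase
  slug = slug.delete("'")               # "it's" becomes "its"
  slug = slug.gsub(/[^a-z0-9]+/, "_")   # each run of other characters becomes one underscore
  slug = slug.gsub(/\A_+|_+\z/, "")     # trim underscores left at either end
  "#{slug}.html"
end

puts title_to_url("It's Not Just Academic")
```

Notice that the test didn't have to change at all to accommodate this; any other approach that produces the same string would satisfy it just as well. Accented or non-Latin characters would be turned into underscores by this version, which is exactly the kind of thing a second test would flush out.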
If all of these testing ideas seem a bit hypothetical to you right now, don't worry about it. Hopefully you'll be seeing them all in practice a lot as we continue work.

Speaking of which, now that we've got a basic Paper model, it's time for us to write some real screens with the forms and views that our users will interact with to upload their own papers. So, stay tuned until next time when we'll: gabc:~/Sites/archive Greg$ script/generate controller papers

Posted by Greg at 1:46 PM | Comments (5)

August 18, 2006

learns_to build Academic Archive::Part 1:Introduction, Concept, and Design

At this year's FOSCON, Amy Hoy issued a clarion call to the elite Ruby hackers in the room: help the newbies! With the spectacular recent growth of Ruby and, especially, Rails there's a great and growing need for educational resources and infrastructure to help newcomers get acclimatized.

Since then, I've had a few ideas that might help. The first I blogged a while ago: The How-I-Learned-Ruby Quiz. The second one is more ambitious. I'd like to introduce it here.

One of the most useful experiences for me in the process of learning Rails was working side by side with a more advanced coder on a real project all the way from the first design sketches through the deployment of a working app. While the Agile Book tries to provide a version of it, this is an experience that is almost wholly unavailable to newbies. Most beginners are stuck trying to puzzle their way through with reference books, source code, and blogged code snippets. While these are sufficient for the experienced coder simply trying to pick up the new hip language, they just don't get the job done for true programming newcomers.

So, I therefore propose to provide some simulacrum of that experience here on this blog. I've got a project I want to build in Rails and as I'm doing so, I'll try to give you a view over my shoulder. I'll write about the practicalities and the philosophy behind each step. Eventually, I hope to make the code I write available through a public repository so you can follow along and even help out (as you'll see shortly, the app is public-minded and will, eventually, be open source).

To complete this introduction, I'm going to take you through all the steps I've gone through so far. . .everything up to actually writing code. First, I'll lay out the concept for the app, summarizing its purpose and aims. Then, I'll talk about design. In many ways the hardest and most interesting part of building a web app, design is the process by which we translate the real world inhabited by the app into abstracted models and relationships we can represent in code. I'll start with the basic screens I imagined when first thinking about how to make this website and then proceed through the first two iterations of "arrows and boxes" I've come up with so far.

Concept: Academic Archive [1]

Scholars of every rank and level regularly research and write papers which never see publication. Whether written by undergraduates or tenured professors, by amateur local enthusiasts or internationally renowned experts, these papers represent a great wealth of research, insight, and argument which remains inaccessible to the wider community of scholars as well as the interested public at large.

Academic Archive seeks to provide a platform for publishing these papers on the web in order to make them universally accessible. The Archive will accept submissions from anyone regardless of qualifications. The Archive's back end will allow a network of volunteers to undertake cursory screening of submitted papers for plagiarism and to ensure that they meet a basic level of quality. The Archive's public website will organize and index these papers for convenient search and browsing.

Design

I don't know about other programmers, but when I'm first brainstorming about an idea for a new web app, the part of it that I can picture in my head is the screens. I can't necessarily see the specific style of how they look, but I can kind of get a sense of what different roles they'll have to play. It's like imagining your dream house. You might not know what color you'll paint it when it's done, but you know you want a hot tub, a racquetball court, a formal dining room, etc.

Anyway, here are the screens I first pictured for Academic Archive:

Author Upload Page
This will be where the whole process starts. Users will come here to upload their papers so obviously it will need a file upload form. Most users probably have their paper in Microsoft Word format, so we'll need to make some decisions about how to prepare papers for the web. In the long run we'd like to be able to accept actual Word files and to process them into HTML ourselves. This would make things easiest for the users and allow us to ensure the best markup for our articles. In a first iteration, though, our goal is to eliminate or put off as much complexity as we possibly can, so we'll probably only accept papers already formatted as HTML (so this page will probably also have some instructions on how to convert Word files).
Editor Approval Page
If you're one of our volunteer editors, this will be your home base. You can see a queue of articles awaiting your approval. From here, you can read each of the articles and approve or reject them for publication.
Index of Papers
This is really a whole section of the site dedicated to browsing through lists of the published papers and searching for information contained in them. It's got a front page with either the most recent papers or some other selections. If you're a visitor, it takes you from arriving at the site all the way up to the point of clicking on a paper to read it. I haven't given this section too much thought since it is the least specific to the particular problem we're trying to address. Lots of other sites on the web present browsing and searching interfaces and, at least to start out with, I'll probably steal one of those that I think is good.
Individual Paper View
This part couldn't be simpler. The papers come in as HTML and all we've got to do is remember where we stored those files and point the readers at them. Additionally, we may want to provide some location for people to discuss each paper, but again that's not within the scope of what we're doing right now. We're just trying to find the simplest site we can build that will solve the problem as we set it out in the Concept. Other feature ideas are great and we'll try to keep them in mind as they come up so we don't make any design decisions that rule them out, but right now they are a dangerous distraction from getting the app onto a solid foundation so we'll put them aside.

Now that we've done some basic thinking about the types of things our app needs to do, it's finally time to start thinking about how it's going to do them. That means it's time to "model our domain". Domain modeling is an incredibly deep subject and there are an endless number of books on it. In fact, I'm reading the one I hear is the best right now. In a nutshell, domain modeling is the process of building up in code a representation of the parts of the real world pertinent to your problem. The idea is to install in your program abstractions of the people and things you're working with (in our case authors, editors, papers, etc.) and to tie them together into the proper relationships. It's kind of a hard process to get a grasp on, but it will quickly become much clearer if we start work on our specific case.

I made my first design sketch at a pub while waiting for a friend to perform. The Concept was brand new and I was psyched about it. I was drinking a beer. Here's what I sketched in my moleskine:

Academic Archive first design sketch

The first thing I drew, and the part I was most confident about was the author-authorship-paper trio of models. I was confident about this idea because I stole it. (A word about the notation here: a word represents a model, simultaneously a particular type of thing or person we're trying to represent as well as an actual class in our code. The lines represent relationships between them, the stars a "many" relationship on one side about which more in a minute.)

These three models are trying to represent the idea that an author "owns" a paper. That is, that an author has_many papers and a paper belongs_to one author (when I mean these relationships in their technical Rails/relational database sense, I'll write them with underscores like this, as they'll appear when they are actual Rails methods. One of the things that's so nice about Rails is the way its natural syntax lets you kind of slide gradually into code from natural language).

So where does authorship come in? Authorship is called a "join model" because it mediates the connection between authors and papers. Instead of asserting that an author has_and_belongs_to_many papers, we'll say that an author has_many authorships and has_many papers through authorship. Join models are helpful in a number of ways. How? Well, there were a couple of things wrong with the author-paper relationship we set out above. First of all, what if a paper has more than one author? In order to model this we've got to say that a paper has_and_belongs_to_many authors and vice versa, which, in the code, means adding a lot of difficulty to the average case just to handle some complexity which shows up on exceptional cases (papers with multiple authors). Never a good idea. Secondly, a join model lets us assign attributes to the relationship between the two things that it joins. So, with the authorship model, we could capture the idea that two authors don't have the same status on a paper, i.e. that one is a research assistant or something. That would be a very hard situation to handle with a normal has_and_belongs_to_many relationship.
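Here's a rough plain-Ruby imitation of what the authorship join buys us. It isn't Rails code: the structs, the AUTHORSHIPS array standing in for a table, and the papers_for and authors_for helpers are all hypothetical names for illustration. In the real app the lookups would come from declarations along the lines of has_many :papers, :through => :authorships.

```ruby
# Plain-Ruby imitation of the author-authorship-paper trio.
Author = Struct.new(:name)
Paper = Struct.new(:title)
# The join model: one entry per (author, paper) pair, plus an attribute of its own.
Authorship = Struct.new(:author, :paper, :role)

# Stand-in for the authorships table.
AUTHORSHIPS = []

# Roughly what "has_many papers through authorships" would give an author.
def papers_for(author)
  AUTHORSHIPS.select { |a| a.author.equal?(author) }.map(&:paper)
end

# And the reverse direction for a paper.
def authors_for(paper)
  AUTHORSHIPS.select { |a| a.paper.equal?(paper) }.map(&:author)
end

ada = Author.new("Ada")
ben = Author.new("Ben")
sample = Paper.new("Sample Paper")

# Two authors on one paper, each with a different status.
AUTHORSHIPS << Authorship.new(ada, sample, "primary author")
AUTHORSHIPS << Authorship.new(ben, sample, "research assistant")

puts authors_for(sample).map(&:name).join(", ")
puts AUTHORSHIPS.map { |a| "#{a.author.name}: #{a.role}" }
```

The role lives on the authorship itself, which is the whole trick: neither the author nor the paper has to know anything about it.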

What else is going on in this sketch? Well, down the right we've got some lines connecting author to person and from thence to editor with the inscription "STI?" nearby. The idea there was the following: we've got authors and we've got editors. But really, both of those are just different types of people. Single Table Inheritance (or STI) is a pattern that allows you to capture multiple roles that might be played by a certain type of entity while retaining the attributes that are always common across those roles. A common example might be people in a company: the same person could simultaneously be a manager, an employee, and a member of a committee. No matter what role they played they would still probably have a name, contact info, etc. so keeping them in a common table a lot of work could be saved. I won't go into too much depth on STI. When you get to the next design iteration you'll see why I decided to abandon it (or at least postpone thinking about it until later).
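To make the STI idea a little more concrete, here's a plain-Ruby imitation of it. Again, this is a sketch and not Rails code: PEOPLE_ROWS is a hypothetical stand-in for a single "people" table, and instantiate is a made-up helper. The one detail borrowed from Rails is that each row carries a "type" column naming the class it should become.

```ruby
# Shared attributes live on the common parent.
class Person
  attr_reader :name, :email

  def initialize(name, email)
    @name = name
    @email = email
  end

  def role
    self.class.name.downcase
  end
end

class Author < Person; end
class Editor < Person; end

# Stand-in for one "people" table: shared columns plus a "type" column
# naming the class each row should become.
PEOPLE_ROWS = [
  { "type" => "Author", "name" => "Ada", "email" => "ada@example.com" },
  { "type" => "Editor", "name" => "Ben", "email" => "ben@example.com" }
]

# Builds the right subclass for a row, based on its "type" value.
def instantiate(row)
  klass = Object.const_get(row["type"])
  klass.new(row["name"], row["email"])
end

people = PEOPLE_ROWS.map { |row| instantiate(row) }
people.each { |person| puts "#{person.name} is an #{person.role}" }
```

Every row has a name and an email no matter which role it plays, so those only need to be defined once.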

How far along was I after making this sketch? I had a pretty good list of the nouns (author, paper, editor, category) but not a very clear sense of their relationships. I knew the authorship join was a good idea because of having seen other smarter people model that exact same situation before. But I didn't really know how papers got into categories or what relationship editors had with them. You can see me brainstorming some ideas for how to solve these problems in the notes at the bottom of the page. I was trying to figure out how papers get assigned to editors for approval. I came up with the vague notion that papers get assigned to editors through categories, writing that papers "can be approved or not, etc. in many different categories." Though only beginning to come into focus, this idea turned out to be the key to unraveling the whole problem. But to get there, I needed help. So I brought in Chris.

On a lunch break from work one day, sitting at an outside table on Morrison between 10th and 11th eating mezzas at a Lebanese restaurant, I pulled out my notebook and started telling Chris about my design for Academic Archive. Very quickly he asked a number of highly clarifying questions and helped me tease out a much more robust design. Here's the sketch I made that day:

Academic Archive second design sketch (with Chris)

The author-authorship-paper relationship is there, but now there are a couple of whole new concepts on the board: approval and editorship. The main idea here is for how categorization would work. In plain language the idea is that a paper could be submitted for approval in any number of different categories. It would then gain membership in each category by gaining approval from each category's editor. So, for example, I might submit my thesis, It's Not Just Academic: The Academy of Motion Picture Arts and Sciences, in both Art History and Film Studies. It would then be subject to approval by two different editors and could end up published in both categories, one, or neither depending on what each of the editors thought of it.

How does the new modeling capture this concept of the paper-category-editor relationship? It does so with two overlapping join model relationships. First we added approval, which stands between papers and categories. A paper has_many approvals and has_many categories through approval (and likewise vice versa for categories). Like in the example of authorship, the presence of this join model gives us an opportunity to hang attributes on the relationship. Here, we'll likely want to keep track of which editor issued the approval and when it took place. Actually, if you look at the diagram that attribute will take the form of a full on relationship. An approval will belong_to an editor. And, in fact, approvals will join papers all the way to editors as well as to categories. This will make it a breeze to figure out all the papers an editor has approved. And, thinking about it a little more, the approval life cycle will likely be the spot where most of the action on the Editor Approval Page will take place. Not to get too deep into implementation details, but I can imagine a scenario where creating a new approval assigns a paper to its editor who then marks it as approved or rejected. We'll have to think this through more precisely at a later point, but it's probably good news that this structure seems so rich even at this early stage.

The one other thing to note before we move on to editorship is the fact that editors have a relationship to papers that is separate from categories. This seems like a good thing since it's easy to imagine a situation where category editors come and go over time. Keeping those relationships separate will mean that we can keep an accurate record of which editor actually approved a paper for a category rather than only knowing the current editor of the paper's category. Without this separation it would be really easy to lose track of the simple factoid: who approved this paper?
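Here's a small plain-Ruby sketch of the approval idea, including the "who approved this paper?" point. Everything in it is hypothetical: the structs, the submit and published_categories helpers, the status strings, and the choice to assign a new approval to the category's current editor are just one way the life cycle might go, not a settled design.

```ruby
Paper = Struct.new(:title)
Editor = Struct.new(:name)
Category = Struct.new(:name, :current_editor)
# Join model between paper and category that also remembers the deciding editor.
Approval = Struct.new(:paper, :category, :editor, :status)

# Hypothetical life cycle: submitting creates a pending approval assigned
# to whoever edits the category at that moment.
def submit(paper, category)
  Approval.new(paper, category, category.current_editor, "pending")
end

# Categories in which the paper counts as published.
def published_categories(paper, approvals)
  approvals.select { |a| a.paper.equal?(paper) && a.status == "approved" }
           .map(&:category)
end

ann = Editor.new("Ann")
bob = Editor.new("Bob")
art_history = Category.new("Art History", ann)
film_studies = Category.new("Film Studies", bob)
thesis = Paper.new("Sample Thesis")

approvals = [submit(thesis, art_history), submit(thesis, film_studies)]
approvals[0].status = "approved"
approvals[1].status = "rejected"

# The category changes hands later; the approval still names Ann.
art_history.current_editor = bob

puts published_categories(thesis, approvals).map(&:name).join(", ")
puts approvals[0].editor.name
```

Because the approval holds its own reference to an editor, swapping the category's editor afterwards doesn't rewrite history.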

Our second join model, editorship, looks a lot like authorship. It's how editors gain the ability to approve papers for categories. It will be really easy to list the categories for which an editor has approval power -- handy when you're trying to build the Editor Approval Page.

What outstanding questions does this design leave us with? Well, beyond editor and author there's no larger sense of a person or a user. Like we thought of the first time through, at some point we're going to have to provide a common ancestor for editors and authors. It will be the place we'll stick the user's personal details as well as their authentication information. That stuff is easy to leave out for the moment since it's both totally unrelated to the specifics of our domain and easy to bolt on later with a third party plugin like acts_as_authenticated. More substantially, we'll be using the idea of a user to make sure that an editor isn't assigned to approve a paper she authored. That's an important rule to capture and I'm pretty sure our design makes it possible, but in the name of limiting complexity, I'm going to stick a pin in this issue and come back to it later once things are more real.

The other big issue we have is that there's no place to do admin type activities: how does an editor gain permission to add an editorship in a new category? Who is allowed to create categories? Again, we're aware that at some point we'll probably need an admin model which is related to our concept of users in the same way that our author and editor models are. Again, this will probably all be done with Single Table Inheritance. And again, we're going to put it off for a little while.

Well, it's starting to look like we have a pretty good feeling for how to build this app. Enough to get started anyway. While these unknowns we just reviewed might be disconcerting, in my experience they're pretty par for the course. What we need to do to clarify them is to actually build some part of the app for real. If we do that right, it will provide a concrete basis for our thought process about these continuing questions and may even point us in the right direction for a solution.

What part of the app should we build first? When I look at the diagram, I want to start with the paper-approval-category relationship and, specifically with papers. That's the one zone that involves only objects that are unique to our domain; there are no accounts or plugins or anything else external involved at all. Plus, the heart of this app is taking uploaded papers and putting them on the web. If we get that right everything else should fall into place. Or at least, that's what we hope.

  1. This idea is, and the resulting project will be, a collaboration with Jem, my cousin and one of my partners in MFDZ.

Posted by Greg at 9:30 AM | Comments (4)

August 12, 2006

When Founders Die

What will happen when the visionary founders of today's most important technology companies die or retire?

While the question may seem especially timely with all of the hullabaloo around Bill Gates' recent retirement announcement, what I'm really interested in are the companies dedicated to the public good -- Wikipedia, Linux, Craigslist, etc. Each of these entities was created by a single charismatic figure who held the goal of making some big positive difference in the world. Each depends on the coordination of a large community of volunteers for its continued functioning. And each is, in turn, depended on by millions of people around the world.

What would happen to each of these companies without their founders? What plan do they have in place to ensure their continuity? How realistic does that plan seem and how well does it line up with their values?

Let's start with Linux. The volunteer programmers that make up the Linux project are organized into a rigid hierarchy based on the different technical divisions of the software and culminating in Linus Torvalds, the project's Benevolent Dictator for Life. Torvalds has sole commit access to the Linux kernel trunk. In other words, he has final and absolute say over all changes made to the heart of the operating system. In practice, of course, Torvalds relies heavily on the project's core committers, technical experts deeply versed in the arcane corners of the code, whose changes may not be fully comprehensible even to him because of their obscurity. He doesn't read every printer driver and graphics card interface. While his technical decisions are occasionally questioned -- a few groups and individuals with ideas for radically different directions for the software have even split off from the main effort from time to time and there have been bitter fights over process -- no disagreement has ever been so heinous as to actually fracture the project at the cost of overall productivity and compatibility. The UNIX wars seem gone for good.

The obvious downside to the Benevolent Dictator for Life model is the high level of dependency it creates on the Dictator. Or, as one of Torvalds' chief lieutenants Andrew Morton phrased the problem in a lecture on IT Conversations : what would happen if Linus got hit by a bus?

Basically, the answer is that Morton, and others in the Linux leadership, would act as stewards in the short run while a "high commission" was convened to select a new dictator to take the reins.

This strategy seems to pose a couple of problems. Obviously, opposing candidates for Linux Dictator would represent, to a certain extent, different visions for the technical future of the project. Maybe one leader would emphasize the move towards the desktop and slicker GUI tools while another would be associated with attempts to optimize Linux for its ubiquitous life on the server. At this point the broad adoption of Linux in so many different environments -- exactly what makes its continued healthy development so important -- means that it is also servant to many different masters, including a large number of highly powerful corporate interests with big stakes in its direction (IBM, anyone?). What if the high commission reached a stalemate or if its choice was unpopular with a large percentage of the Linux constituency?

If history teaches us anything, it's that transference of dictatorial power means major trauma for most societies. With changes of kings, whole nations have changed religion, gained allies, begun and ended wars. Since dictators' policies are generally not so subject to the will of the people (usually a popular dictator is just about as effective as an unpopular one) societies governed by them tend to do all their changing at once between regimes. This makes for long periods of relative stability (or stagnation) punctuated by brief bursts of violence and chaos.

Now, while I'm not predicting that Linux won't outlive Linus -- it is too important to too many for that to be likely -- I would argue that its social structure almost guarantees it a major shake up somewhere along the line.

What about Wikipedia and its founder, Jimmy Wales? In a recent talk at the Long Now Foundation, called Vision: Wikipedia and the Future of Free Culture, Wales explained the intricate collaborative processes involved in maintaining Wikipedia as well as the important role his dictatorial power has sometimes played in maintaining them -- for example, by promising to ban a large group of skinheads who were plotting to take over the site to twist it to their own message. Wales is skittish about the personal exercise of this kind of power and has gradually worked to wrap it in institutions that involve representative, democratically chosen groups of Wikipedians in making all important decisions.

Basically, Wales is gradually transforming Wikipedia from a benevolent dictatorship into a constitutional monarchy, with the eventual goal of making of himself a figurehead. And he's said as much, here quoted from his own Wikipedia page: "I'm more like the Queen of England -- my power is decreasing over time. Soon, I'll just wave at parades." In contrast to the Benevolent Dictator for Life model, Constitutional Monarchy is extremely messy. It means a contradiction between formal principles and practical reality: a sovereign monarchy overseeing a country that's actually governed by a democratic parliament.

In a way there's an even more interesting parallel between Wikipedia and Great Britain: their reliance on common law, a process whereby rather than intentionally writing down and formalizing them all at once in some variety of constitutional process, a society allows its rules to accrue gradually over time through the settlement of disputes by courts or magistrates. Both Wikipedia and Great Britain base the resolution of new questions on past consensus, which is, in turn, built on a core set of community values: the Magna Carta for Britain, for Wikipedia the Neutral Point of View.

So, what future does this structure bespeak for Wikipedia without Jimmy Wales? Well, constitutional monarchies aren't completely immune to the inheritance problem we discussed in the context of Linux. The death of the monarch will still likely mean significant instability, though in the best case scenario it could mean largely a cultural shakeup rather than a practical one. If Wales manages to make Wikipedia's emergent governing structure independent of him, he will still retain significant importance as a cheerleader and public image tender -- he's the voice we hear defending the site every time a new moral panic about its accuracy sweeps through the mainstream media. And this is not an insignificant role since the morale of Wikipedia's many editors and maintainers is so important to the site's quality. As far as I know, Wales hasn't named a successor or put in place a system for picking one in an emergency. He should, no matter how he sees his importance in Wikipedia's day to day governance.

And, finally, what of Craigslist and its founder Craig Newmark? The organizational structure Newmark created may be the simplest, the most robust, and the most traditional. Craigslist is the town square: a place for merchants and shoppers to find each other, a forum for announcements and advertisements to find eyeballs, and a venue for barkers, yarn-spinners, and soap box-standers of all stripes to harangue passers-by. Like town squares, Craigslist's structure is pretty much the same everywhere, but its content is hyper-local, mostly arranging transactions that take place in person: apartment rentals, job interviews, dates, etc.

Because Craigslist's social function is so time-tested, it requires little evangelism or politicking of Craig. He needn't explain to anyone the site's raison d'etre (it's obvious at first glance) or defend it from critics (it doesn't really have any). Instead, Craig is a little like an ascetic saint leading a simple life of sacrifice to set an example. He does customer support. He's like the public-minded senior citizen (maybe an Elk) who organizes efforts to pick up litter and paint park benches.

In many ways, Craigslist seems the most sustainable of these three organizations. While Craig's passing would be mourned, little to none of the site's operation is dependent on him. Someone would still be required to make large-scale decisions like which new cities to cover and how to pay growing hosting bills, but Craig has traditionally made those decisions simply by responding to demand. He puts up new sites for cities when enough people ask for them, and he figured out how to pay the hosting bill by polling his users about what policy they wanted to see. A new leader could likely step into Craig's shoes with little fanfare or difficulty. There's always a town square whether or not the city father who first helped erect it is still around.

To conclude, I thought it would be interesting to set these three strategies for longevity off against the dominant one to which they present alternatives: the publicly owned for-profit company. When I started writing this post, I intuitively included Google in the list of companies I planned to discuss. It took me a minute to figure out why it was the odd man out: while Google's mission seems broadly in the public interest, as a publicly held company its objective is by definition solely to make money for its stockholders.

And there lies the rub for using the publicly-traded company as a strategy for longevity. Over time, and especially in the absence of the charismatic founders, the markets will wear away any of the company's objectives which are not explicitly about profit making. After all, that's its job. While Google's one hundred billion dollars of market cap value do a pretty good job ensuring its existence for at least the next dozen or so generations, the ideas and principles that differentiate it from all the other companies in the Dow are necessarily on a timeline for destruction, or at least homogenization.

And as strategies for longevity go that doesn't seem like the greatest tradeoff.


Posted by Greg at 11:36 AM | Comments (6)

August 2, 2006

Blogging the Sh*t Out of PDX Pop Now! 2006

One of the many things that keep me from writing more around here is the work I do on PDX Pop Now!, a free all-ages three-day Portland music festival I helped found with a group of other Portland musicians and music fans three years ago.

That's why it was particularly satisfying this year that Mikey decided to add PDX Pop to UrHo's Blogging the Sh*t Out of. . . effort. BTSOO is this cool project where Mikey gets a couple of people to attend a large Portland event and then write about as much of it as is humanly possible. They started with this year's PDX Film Fest and PDX Pop was the second victim.

At first, we had these lofty ideas of giving all the bands and volunteers logins to write posts and of setting up a public blogging station where random people from the crowd could do the same, but there turned out, unsurprisingly, to be more pressing practical things to do to, you know, put on the actual festival. Plus, when it comes right down to it, who's dumb enough to want to be running back and forth to a laptop blogging twice an hour for all of a three-day music festival?

Well, apparently, I am. I wrote 19 posts covering 44 of the 48 bands that played over the course of the weekend -- plus I cajoled a volunteer into writing one additional post about a band I didn't get to see -- making for more blogging than I've done here so far this year put together.

So, what did I take away from this binge blogging? Well, there are all my new musical discoveries: Alela Diane, Evolutionary Jass Band, Please Step Out of the Vehicle, Thanksgiving, etc. Also, I think something about having to blog at festival speed was good for me. Trying to get down my ideas about bands in the short breaks between sets, in a chaotic environment, didn't leave any room for second thoughts or hesitation. While I like the longer, more analytical pieces I write here, I could definitely stand to add this new ability as well.

Anyway, with no further delay, I present Blogging the Sh*t Out of. . .PDX Pop Now! 2006:

Friday

Saturday

Sunday


Posted by Greg at 6:55 PM | Comments (3)