Making Things See: A Book for O’Reilly about the Kinect

It’s been quiet around here for a while, but not for me out in the real world. Since my last post, I’ve finished my master’s thesis and graduated from ITP. Over the next couple of weeks I plan to catch up on documenting many of the projects from the last few months of the school year, so keep watch here for more posts soon.

My time at ITP was an amazing two years that I can’t believe is already over. I spent so long looking forward to the program and planning for it that, in the end, it seemed to go by in a whirlwind. I met some of the most brilliant, creative, and kind people I’ve ever come across, and I hope to be working and collaborating with many of them for years to come.

So, what comes next? I have two announcements to share about my future plans.

First off, I’m excited to announce that I’ve been contracted by O’Reilly to write a book about the Microsoft Kinect, tentatively titled Making Things See: Computer Vision with the Microsoft Kinect, Processing, and Arduino. My goal is to introduce readers to working with the Kinect’s depth camera and skeleton tracking abilities in their own projects, and to put those abilities in the wider context of gestural interfaces, 3D scanning, computer animation, and robotics.

I believe the Kinect is just the first in a generation of new computer vision technologies that will enable a vast number of programmers and creative technologists to build all kinds of amazing new applications and hacks. The key to building these applications will not simply be grokking the particular workings of the Kinect and its associated libraries as they happen to exist right now, but gaining a richer understanding of the wider areas of study that these new abilities unlock. Access to depth images and user skeleton data opens a door into a vast territory that researchers and computer scientists have been exploring for some time. My hope is to translate some of what they’ve learned into a map that students, creative coders, and other newcomers like myself can use to make really cool stuff.

Further, since the particular details of working with the Kinect are rapidly evolving with developments in the associated open source and commercial communities — and since the Kinect is unlikely to remain the only widely available depth camera for long — providing some broader context is the only way to ensure that the book stays relevant beyond the current micro-moment in the technological development cycle.

Making Things See will use the Processing creative coding environment for all of its examples. It’s targeted at people with some familiarity with Processing or equivalent programming experience (i.e. people who’ve read Casey Reas and Ben Fry’s Getting Started with Processing, Dan Shiffman’s excellent Learning Processing, or something similar). I’ll start with the absolute basics of accessing the Kinect and work my way up to relatively sophisticated projects drawn from the application areas above: user-mimicking robotic arms, gestural interfaces that automate occupational therapy exercises, digitally fabricated prints from 3D scans, and motion capture puppets that render beautiful 3D animations.
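
To give a flavor of where the book will begin, here’s roughly what the simplest possible depth-camera example looks like in Processing, using the SimpleOpenNI library I mention in the comments below. Treat it as a sketch of the idea rather than final book code; the library’s API is a moving target and details may well change by publication.

    // Display the Kinect's depth image in Processing via SimpleOpenNI.
    import SimpleOpenNI.*;

    SimpleOpenNI kinect;

    void setup() {
      size(640, 480);
      kinect = new SimpleOpenNI(this);
      kinect.enableDepth();  // turn on the depth camera
    }

    void draw() {
      kinect.update();                   // grab the latest frame
      image(kinect.depthImage(), 0, 0);  // draw the depth map as a grayscale image
    }

That’s the whole loop: ask the library for a frame, draw it, repeat. Everything else builds on top of that depth data.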

The book is tentatively scheduled to come out in physical form this holiday season, with a digital Early Release this summer as soon as I have the first major section completed. If you have any requests for areas you’d like to see covered, or any other constructive comments, I’d love to hear them in the comments. You can also follow me on Twitter (@atduskgreg) for updates and announcements as things progress.

I’m also proud to announce that I’ve been selected as one of ITP’s Resident Researchers for the 2011-2012 school year. This means that I’ll be helping teach students as well as working on my own projects with support from the school. I’m especially excited to continue the transformation of ITP’s web curriculum that I helped kick off this year. It looks like ITP will be using Ruby and Sinatra in nearly all of its web programming classes come the fall, so I’ll be doing all I can to use my experience in this area to support those efforts. I’ll be working closely with Clay Shirky, who heads up the web applications effort in the department, as well as fellow resident Rune Madsen. In addition to the transition to Ruby, we’ll be working to make it ever easier to build web-connected physical devices with the Arduino. With all the excitement around the Android ADK and the use of the iPhone headphone jack to connect arbitrary hardware, it’s going to be a very exciting year for integrating physical computing projects with mobile devices and, hence, for networked objects in general. Clay, Rune, and I will be working with Tom Igoe and the rest of the ITP faculty to make it as easy as possible for students to dive into this area.

I’m sad my time as a student at ITP has come to an end, but I’m incredibly excited about these (and other still developing) prospects.


6 Responses to Making Things See: A Book for O’Reilly about the Kinect

  1. Oliver Rokison says:

    I’d really like to see some 3D scanning applications, especially if they can then link to a 3D printer.

  2. greg says:

    Definitely planning to cover that. My outline includes an entire chapter dedicated to taking a scan, outputting it in a printable format, processing it with other applications, and preparing it for fabrication using a variety of printing techniques.

  3. ChuckEye says:

    I did a residency with the Texas Learning & Computation Center at the University of Houston last year, and the end product of my work there was a Kinect-based theremin-like instrument, using Processing for gesture tracking and passing MIDI notes out to Max/MSP for sound-making. There was definitely a lot more I could have done with the project, because the possibilities are so wide-open.

    Which Processing library are you using for Kinect work? I’d done all my stuff with Paul King’s library, but wanted to give Shiffman’s a look at some point. (His is cross-platform, yes? King’s was Mac-only.)

  4. greg says:

    That sounds like a cool project. Did you use skeleton tracking or just closest-point in the depth image?

    I’m using SimpleOpenNI: http://code.google.com/p/simple-openni It’s cross-platform, comes with a double-click installer for OpenNI and NITE, and supports most of the functionality exposed in OpenNI, including full access to the skeleton data. I used Shiffman’s library when I was first getting started with the Kinect last October/November, but for the book I need something that supports the skeleton data. Since Shiffman’s library is based on libfreenect, which doesn’t have a skeleton implementation, there’s no way for him to add it without duplicating a huge amount of the work already done in SimpleOpenNI.
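
    To give a sense of what the skeleton API looks like, here’s the basic hand-tracking pattern, roughly as it works in the current SimpleOpenNI release (the calibration callback names have changed between versions, so check the library’s bundled examples against this):

        import SimpleOpenNI.*;

        SimpleOpenNI kinect;

        void setup() {
          size(640, 480);
          kinect = new SimpleOpenNI(this);
          kinect.enableDepth();
          kinect.enableUser(SimpleOpenNI.SKEL_PROFILE_ALL);  // turn on skeleton tracking
        }

        void draw() {
          kinect.update();
          image(kinect.depthImage(), 0, 0);

          IntVector userList = new IntVector();
          kinect.getUsers(userList);
          if (userList.size() > 0) {
            int userId = userList.get(0);
            if (kinect.isTrackingSkeleton(userId)) {
              // Get the right hand's 3D position and project it to screen space.
              PVector hand = new PVector();
              kinect.getJointPositionSkeleton(userId, SimpleOpenNI.SKEL_RIGHT_HAND, hand);
              PVector projected = new PVector();
              kinect.convertRealWorldToProjective(hand, projected);
              fill(255, 0, 0);
              ellipse(projected.x, projected.y, 20, 20);
            }
          }
        }

        // OpenNI's calibration ritual: the user strikes the "Psi" pose,
        // we calibrate, then skeleton tracking starts.
        void onNewUser(int userId) {
          kinect.startPoseDetection("Psi", userId);
        }

        void onStartPose(String pose, int userId) {
          kinect.stopPoseDetection(userId);
          kinect.requestCalibrationSkeleton(userId, true);
        }

        void onEndCalibration(int userId, boolean successful) {
          if (successful) {
            kinect.startTrackingSkeleton(userId);
          } else {
            kinect.startPoseDetection("Psi", userId);
          }
        }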

    Shiffman is actually a professor of mine (I guess now technically one of my bosses, since I graduated and am now working for the department as a resident researcher). He and I have talked some about merging his library with SimpleOpenNI into some kind of mega Processing Kinect library that could use either libfreenect or OpenNI, with an API that’s possibly more accessible than the raw OpenNI API exposed by SimpleOpenNI. That work is only in the very early brainstorming/chatting stages now, though.

  5. ChuckEye says:

    I just used Casey’s “brightest pixel” algorithm (brute force) on the depth channel. I mapped the X coordinate to panning, the Y to pitch (on a fixed pentatonic scale), and the brightness to volume.
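
    In Processing terms the scan was just a brute-force nested loop over the depth pixels, something like the sketch below (shown with SimpleOpenNI’s depthMap() for illustration; I actually used Paul King’s library, but any raw depth array works the same way):

        import SimpleOpenNI.*;

        SimpleOpenNI kinect;

        void setup() {
          size(640, 480);
          kinect = new SimpleOpenNI(this);
          kinect.enableDepth();
        }

        void draw() {
          kinect.update();
          int[] depth = kinect.depthMap();  // per-pixel distance in millimeters

          // Scan every pixel for the closest one -- the "brightest"
          // point in the depth image.
          int closestValue = 8000;  // farther than the Kinect can see
          int closestX = 0;
          int closestY = 0;
          for (int y = 0; y < 480; y++) {
            for (int x = 0; x < 640; x++) {
              int d = depth[x + y * 640];
              if (d > 0 && d < closestValue) {  // 0 means no reading
                closestValue = d;
                closestX = x;
                closestY = y;
              }
            }
          }

          image(kinect.depthImage(), 0, 0);
          fill(255, 0, 0);
          ellipse(closestX, closestY, 20, 20);

          // The musical mappings, before sending them out as MIDI:
          float pan    = map(closestX, 0, 640, -1, 1);        // X -> stereo pan
          int   degree = int(map(closestY, 0, 480, 0, 4));    // Y -> pentatonic degree
          float volume = map(closestValue, 500, 4000, 1, 0);  // closer -> louder
        }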

    I also used an OpenGL 3D routine to generate a stereo 3D visualization that I’ve scaled for both a 4K stereo projection theater and a 60″ 3D TV.

    I’m interested in some of the skeletal stuff, so I’ll give OpenNI a look.

  6. Allison Eve Zell says:

    Wow, Greg. I’ve just come around to reading this… congrats! So looking forward to having you around during my thesis year. See you in a few.
