On Being Seen by the Machine

The light edged into view through my tightly shut eyelids. Waves of stippled white pulsed down towards a corona of laser pink before spilling into a liquid bokeh of soft hexagons. I tried to relax my eyelids against the involuntary movements of my pupils, briefly seeing strands of eyelash catch the light when my eyes threatened to open. I could feel the light on my skin as heat even though I knew the LEDs were dead cool. It swept across the center of my vision to settle somewhere low and right, crystalline geometries floating where my sight settled back to black.

The intensity wavered momentarily. I felt the invisible presence of the tech moving soundlessly somewhere in front of me. I listened to the chorus of humming coming from the printer room, a cacophony of mistuned exhaust fans and above it someone whistling semi-tunelessly.

How long had the scan been underway? The tech took seemingly endless pauses, saying nothing, the light of the scanner off or at least off my face. Just as I’d gird myself to open my eyes, the light would flare up and I’d squeeze them shut feeling the dots on my face shift subtly, their edges peeling away from the peaks of folding flesh. I straightened my back and willed my facial muscles to stay still. The flickering of the light made my eyes jitter involuntarily beneath their lids in a parody of REM sleep.

The tech had explained that the scanner could find the calibration dots, knew where they should be. So even if I shifted position it would be able to reassemble my face. If I moved too much, he said, the scanner would fail to put my face back together, leaving it a blend between the two expressions. I fought the urge to grimace, arch my eyebrow, or feel the itch tickling the side of my mouth.

I listened to the muffled sound of the shop manager berating a junior employee about links on a website. The light shut off again with finality. I relaxed and felt the inside of my face unstick from my skull as the muscles slackened.

I opened my eyes.

The tech was standing with his back to me, hunched over a laptop suspended on a chest-high shelf, his broad back a vivid swath of red across my vision as I blinked my eyes clear. His short sleeves revealed faded tattoos on his right wrist as it hovered over the computer.

“It will take a few minutes to process,” he said. “I wasn’t set up for you and this is the first time we’ve tried the full resolution.”

I stood up and peered past his shoulder at the screen. Thin strokes outlined a black cube over a pale blue background. The cube enclosed what looked like a picture of my face in three-quarter profile. My skin was rubbery and dead, shiny with flat reflections from a light that seemed to press in from all sides. It was mottled with what looked like lesions where the system had failed to fully fold the tracking dots back into the surrounding texture. A death mask.

The tech tilted my head down, bringing my forehead into closer view. At the hairline my forehead dissolved into a skerry of varicolored slashes, the lines of data from the laser forming a riotous coastline against the calm blue below. The tech toggled the texture on and off. Without it, my head was reduced to a dull gray, revealing the raw geometry. He rotated and probed as he ran various operations designed to close the thousands of small holes in my neck, whose lower reaches sprouted enough extending tendrils to make the whole assemblage resemble the lost android head of Philip K. Dick. My cheeks revealed a distorting ripple that the tech smoothed with a virtual brush.

Finishing the cleanup operations, the tech demonstrated how his new software could shell my head — expanding it from a volumeless surface into a thin solid. With some of the most fragile fringes pared away, this could even be printed in powder or resin by one of the humming behemoths in the next room. Samples of prints littered the shelves around the office: intricate assemblages in honey-colored translucent resin, a planetary gear and a working wrench in a strangely heavy dark gray plastic, and a few small heads and insects in the gleaming white of the powder printer.

Imagining my shelled face printed in this last material completed its imaginary transformation into a death mask. I could see the object like something out of Greek ritual, the third theatrical mask after comedy and tragedy: mundanity.

After a few more minutes at the controls, which for the first time in the session he punctuated with polite chatter about the sudden storm outside, the tech finished his work. He handed me back my key drive with the results of the scan. The full geometry both cooked and raw. The colored texture file which un-peeled the sides of my face to lay flat, like the skin of a Yakuza gangster mounted to display its tattoos in the Tokyo University museum.

I took the drive, thanked him, and turned to leave. A few steps on, I stopped at the door and looked back, suddenly confused, possessed by a strong sensation that I was somehow shoplifting, that I’d forgotten to pay some unasked cost. The tech disappeared around a sterile mounting table and back into his office, out of view. I turned and continued out and down towards the street. I pushed through a throng of students and out towards the revolving doors. I felt light, sure I’d lost something, but clueless as to what.

Note: this brief story attempts to describe an experience I had earlier this week having my face scanned with the new ZScanner 700 CX 24-bit color laser scanner at NYU’s Advanced Media Studio. Below I present a number of pictures captured in the course of that experience. While I think these images have a powerful visual impact, part of what I wanted to capture in the story was the profound non-visual nature of the experience. It was an experience not of seeing, but of being seen. Hence my decision to present these images at the bottom of this post rather than interspersed throughout. There’s an increasing body of writing that looks at these New Aesthetic or Robot-Readable World images in order to analyze their visual qualities in an attempt to understand how these new seeing technologies picture us. Here instead I wanted to capture an intimate experience of being the conscious subject of this new form of vision — in this case even when our eyes are literally closed in the process. I found the experience simultaneously meditative, reflective, and mildly alienating. I hope this piece of writing communicates some sense of that. Enjoy the pictures. I do.

Head scan with photo texture

IMG_1057

IMG_1054

IMG_1053

IMG_1044

Posted in Art | Leave a comment

Presenting at MakerFaire NYC This Weekend

MakerFaire NYC 2011 is this weekend. It’s a breathtakingly big event that’s equal parts science fair, renn fayre, and trade show from the near future. In a recent interview with Dale Dougherty, Anil Dash spoke eloquently about the positive political meaning embodied in the “Maker Movement”, how it focuses us away from the conflict that’s so common in our culture and towards the shared desire to figure out “what our country’s going to be when we grow up”.

From that sublime sentiment to the gloriously geeky opportunity to get my hands on the new Makerbot MK7 Stepstruder and the brand new 1.0 version of the Arduino IDE, which I’ve seen Massimo, David, and Tom working on frantically around ITP this week, I’m very proud to be participating in the Faire.

I’ll be giving two talks at MakerFaire this weekend. I’ll be presenting the Kinect Abnormal Movement Assessment System at the Health 2.0 tent. I’ll explain how the project came about, do my best to describe some of the science behind how it might be able to help, and announce some progress and plans for the near future (we have an intern!). This session will be Saturday morning at 11am.

I’ll also be teaching a tutorial session at the ITP Cafe later in the day, starting at 4:30pm. I’ll cover an introduction to using the Kinect for skeleton tracking in Processing and give a little background about how it works. It’s a mini preview of some of the topics in my book.

Whether you’re geeking out or glorying in democratic optimism, I’ll hope to see you there!

Posted in Opinion | Leave a comment

Back to Work No Matter What: 10 Things I’ve Learned While Writing a Technical Book for O’Reilly

I’m rapidly approaching the midway point in writing my book. Writing a book is hard. I love to write and am excited about the topic. Some days I wake excited and can barely wait to get to work. I reach my target word count without feeling the effort. But other days it’s a battle to even get started and every paragraph requires a conscious act of will to not stop and check twitter or go for a walk outside. And either way when the day is done the next one still starts from zero with 1500 words to write and none written.

Somewhere in the last month I hit a stride that has given me the beginnings of a sense of confidence that I will be able to finish on time and with a text that I am proud of. I’m currently preparing for the digital Early Release of the book, which should happen by the end of the month, a big landmark that I find both exciting and terrifying. I thought I’d mark the occasion by writing down a little bit of what I’ve learned about the process of writing.

I make no claim that these ten tips will apply to anyone else, but identifying them and trying to stick by them has helped me. And obviously my tips here are somewhat tied in with writing the kind of technical book that I’m working on and would be much less relevant for a novel or other more creative project.

  1. Write every day. It gets easier and it makes the spreadsheet happy. (I’ve been using a spreadsheet to track my progress and project my completion date based on work done so far.)
  2. Every day starts as pulling teeth and then goes downhill after 500 words or so. Each 500 words is easier than the last.
  3. Outlining is easier than writing. If you’re stuck, outline what comes next.
  4. Writing code is easier than outlining. If you don’t know the structure, write the code.
  5. Making illustrations is easier than writing code. If you don’t know what code to write, make illustrations or screen caps from existing code.
  6. Don’t start from a dead stop. Read, edit, and refine the previous few paragraphs to get a running start.
  7. If you’re writing sucky sentences, keep going; you can fix them later. Also, they’ll get better as you warm up.
  8. When in doubt, make sentences shorter. They will be easier to write and read.
  9. Reading good writers makes me write better. This includes writers in radically different genres from my own (DFW) and similar ones (Shiffman).
  10. Give yourself regular positive feedback. I count words as I go to see how much I’ve accomplished.

A note of thanks: throughout this process I’ve found the Back to Work podcast with Merlin Mann and Dan Benjamin to be…I want to say “inspiring”, but that’s exactly the wrong word. What I’ve found useful about the show is how it knocks down the process of working towards your goals from the pedestal of inspiration to the ground level of actually working every day, going from having dreams of writing a book to being a guy who types in a text file five hours a day no matter what. I especially recommend Episode 21: Assistant to the Regional Monkey and the recent Episode 23: Failure is ALWAYS an Option. The first of those does a great job talking about how every day you have to start from scratch, forgiving yourself when you miss a day and not getting too full of yourself when you have a solid week of productivity. The second one speaks eloquently of the dangers of taking on a big project (like writing a book) as a “side project”. Dan and Merlin talked about the danger of not fully committing to a project like this. For my part I found these two topics to be closely related. I’ve found that a big part of being fully committed to the project is to forgive myself for failures — days I don’t write at all, days I don’t write as much as I want, sections of the book I don’t write as well as I know I could. The commitment has to be a commitment to keep going despite these failures along the way.

And I’m sure I’ll have plenty more of those failures in the second half of writing this book. But I will write it regardless.

Posted in kinect | Leave a comment

Physical GIF Launches on Kickstarter

I’m proud to announce the launch of Physical GIF on Kickstarter. Physical GIF is a collaboration with Scott Wayne Indiana to turn animated GIFs into table top toys. We use a laser cutter and a strobe light to produce a kind of zoetrope from each animated GIF so you can watch it on your coffee table. Here’s our Kickstarter video which explains the whole process and shows you what they look like in action:

For our Kickstarter campaign we have four main pledge levels. At $50 you get a Physical GIF along with everything you need to play it at home: the strobe, the plastic GIF disc and frames, and the hardware. You can choose from three designs that Scott created, BMX Biker:

Elephant-Rabbit Costume Party:

and New York Fourth of July:

For a $100 pledge, we’ll send you a kit with all three of these Physical GIFs.

We’ve also recruited four amazing animated GIF artists to design special limited edition Physical GIFs: Ryder Ripps, Nullsleep, Sara Ludy, and Sterling Crispin. More info about these artists is on our project page. At $250, you can reserve one of the Physical GIFs from any of these artists. We’re going to be working with them to explore materials and techniques for turning their designs into Physical GIFs. We’re hoping that they explore some of the limitations and possibilities of this new medium. Each of the Physical GIFs they produce will come in a limited numbered edition with documentation from the artist.

And at the top pledge level, we’ll work with you directly to manufacture your own custom Physical GIF from your design. We’ve only made five of this reward available because we want to be able to spend as much time as it takes working with you to turn your animated GIF ideas into physical reality.

We’re incredibly excited about this project and can’t wait to see how people react to it. Head over to Kickstarter right now to give us some help: Physical GIF on Kickstarter. Thanks!

Posted in Art, Business, Physical GIF | Leave a comment

Two Kinect talks: Open Source Bridge and ITP Camp

In the last couple of weeks, I’ve given a couple of public presentations about the Kinect. This post will be a collection of relevant links, media, and follow-up to those talks. The first talk, last week, was in Portland, Oregon at Open Source Bridge. It was a collaboration with Devin Chalmers, my longtime co-conspirator. We designed our talk to be as much like a circus as possible. We titled it Control Emacs with Your Beard: the All-Singing All-Dancing Intro to Hacking the Kinect.

Control Emacs with Your Beard: the All-Singing All-Dancing Intro to Hacking the Kinect

Devin demonstrates controlling Emacs with his “beard”.

Our first demo was, as promised in our talk title, an app that let you control Emacs with your “beard”. This app included the ability to launch Emacs by putting on a fake beard, to generate all kinds of very impressive-looking C code by waving your hands in front of you (demonstrated above), and to quit Emacs by removing your fake beard. Our second app sent your browser tabs to the gladiator arena. It let you spare or execute (close) each one by giving a Caesar-esque thumbs up or thumbs down gesture. To get you in the mood for killing, it also played a clip from Gladiator each time you executed a tab.

Both of these apps used the Java Robot library to issue key strokes and fire off terminal commands. It’s an incredibly helpful library for controlling any GUI app on your machine. All our code (and Keynote) is available here: github/osb-kinect. Anyone working on assistive tech (or other kinds of alternative input to the computer) with gestural interfaces should get to know Robot well.
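For anyone who wants to try this kind of control themselves, here’s a minimal Java sketch of the Robot technique (an illustration only, not our actual demo code): it waits a few seconds so you can focus another window, types a couple of characters, and then fires a modifier-key shortcut.

```java
import java.awt.Robot;
import java.awt.event.KeyEvent;

public class RobotSketch {
    public static void main(String[] args) throws Exception {
        Robot robot = new Robot();

        // Give yourself a moment to focus the target application (a terminal, Emacs, etc.).
        robot.delay(3000);

        // Type "hi" by pressing and releasing each key.
        robot.keyPress(KeyEvent.VK_H);
        robot.keyRelease(KeyEvent.VK_H);
        robot.keyPress(KeyEvent.VK_I);
        robot.keyRelease(KeyEvent.VK_I);

        // Fire a shortcut (here Ctrl+S) by holding the modifier down around the key press.
        robot.keyPress(KeyEvent.VK_CONTROL);
        robot.keyPress(KeyEvent.VK_S);
        robot.keyRelease(KeyEvent.VK_S);
        robot.keyRelease(KeyEvent.VK_CONTROL);
    }
}
```

In the talk demos, calls like these ran in response to Kinect gesture events rather than from main, but the keystroke mechanics are the same.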

In addition to these live demos, we also covered other things you can do with the Kinect like 3D printing. I passed around the Makerbot-printed head of Kevin Kelly that I made at FOO camp:

Kevin Kelly with 3D printed head

Kevin Kelly with a tiny 3D printed version of his own head.

We also showed Nicholas Burrus’s Kinect RGB Demo app which does all kinds of neat things like scene reconstruction:

Control Emacs with Your Beard: the All-Singing All-Dancing Intro to Hacking the Kinect

Me making absurd gestures in front of a reconstructed image of the room

Tonight I taught a class at ITP Camp about building gestural interfaces with the Kinect in Processing. It had some overlap with the Open Source Bridge talk. In addition to telling the story of the Kinect’s evolution, I showed some of the details of working with Simple OpenNI’s skeleton API. I wrote two apps based on measuring the distance between the user’s hands. The first one simply displayed the distance between the hands in pixels on the screen. The second one used that distance to scale an image up and down and the location of one of the hands to position that image: a typical Minority Report-style interaction.

The key point was: all you really need to make something interactive in a way that the user can viscerally understand is a single number that tightly corresponds to what you’re doing as the user. With just that ITP-types can make all kinds of cool interactive apps.
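To make that concrete, here is a stripped-down sketch along the lines of the first demo (a sketch only, assuming the SimpleOpenNI API roughly as it stood at the time; the enableUser signature varies between versions, and the user-detection and calibration callbacks that actually start skeleton tracking are omitted for brevity):

```java
import SimpleOpenNI.*;

SimpleOpenNI context;
int userId = 1;  // simplified: real sketches get this from the new-user/calibration callbacks

void setup() {
  size(640, 480);
  context = new SimpleOpenNI(this);
  context.enableDepth();
  // Turn on skeleton tracking (older SimpleOpenNI versions take a profile constant here).
  context.enableUser(SimpleOpenNI.SKEL_PROFILE_ALL);
}

void draw() {
  context.update();
  image(context.depthImage(), 0, 0);

  if (context.isTrackingSkeleton(userId)) {
    PVector leftHand = new PVector();
    PVector rightHand = new PVector();
    context.getJointPositionSkeleton(userId, SimpleOpenNI.SKEL_LEFT_HAND, leftHand);
    context.getJointPositionSkeleton(userId, SimpleOpenNI.SKEL_RIGHT_HAND, rightHand);

    // The single number that drives the whole interaction:
    // the real-world distance between the two hands, in millimeters.
    float handDistance = leftHand.dist(rightHand);

    fill(255, 0, 0);
    textSize(32);
    text(int(handDistance) + " mm", 20, 40);
  }
}
```

The second demo simply maps that same number onto an image’s scale and uses one hand’s position to place the image on screen.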

The class was full of clever people who asked all kinds of interesting questions and had interesting ideas for ways to apply this stuff. I came away with a bunch of ideas for the book, which is helpful because I’m going to be starting the skeleton tracking chapter soon.

Of course, all of the code for this project is online in the ITP Camp Kinect repo on Github. That repo includes all of the code I showed as well as a copy of my Keynote presentation.

Posted in kinect | Leave a comment

Into The Matrix: Proposal for a Platform Studies Approach to OpenGL

In the last few years, new media professors Ian Bogost (Georgia Tech) and Nick Montfort (MIT) have set out to advance a new approach to the study of computing. Bogost and Montfort call this approach Platform Studies:

“Platform Studies investigates the relationships between the hardware and software design of computing systems and the creative works produced on those systems.”

The goal of Platform Studies is to close the distance between the thirty thousand foot view of cultural studies and the ant’s eye view of much existing computer history. Scholars from a cultural studies background tend to stay remote from the technical details of computing systems while much computer history tends to get lost in those details, missing the wider interpretative opportunities.

Bogost and Montfort want to launch an approach that’s based “in being technically rigorous and in deeply investigating computing systems in their interactions with creativity, expression, and culture.” They demonstrated this approach themselves with the kickoff book in the Platform Studies series for MIT Press: Racing the Beam: The Atari Video Computer System. That book starts by introducing the hardware design of the Atari and how it evolved in relationship to the available options at the time. They then construct a comprehensive description of the affordances that this system provided to game designers. The rest of the book is a history of the VCS platform told through a series of close analyses of games and how their creators co-evolved the games’ cultural footprints with their understanding of how to work with, around, and through the Atari’s technical affordances.

Bogost and Montfort have put out the call for additional books in the Platform Studies series. Their topic wish list includes a wide variety of platforms from Unix to the Game Boy to the iPhone. In this post, I would like to propose an addition to this list: OpenGL. In addition to arguing for OpenGL as an important candidate for inclusion in the series, I would also like to present a sketch for what a Platform Studies approach to OpenGL might look like.

According to Wikipedia, OpenGL “is a standard specification defining a cross-language, cross-platform API for writing applications that produce 2D and 3D computer graphics.” This dry description belies the fact that OpenGL has been at the center of the evolution of computer graphics for more than 20 years. It has been the venue for a series of negotiations that have redefined visuality for the digital age.

In the introduction to his seminal study, Techniques of the Observer: On Vision and Modernity in the 19th Century, Jonathan Crary describes the introduction of computer graphics as “a transformation in the nature of visuality probably more profound than the break that separates medieval imagery from Renaissance perspective”. Crary’s study itself tells the story of the transformation of vision enacted by 19th century visual technology and practices. However, he recognized that, as he was writing in the early 1990s, yet another equally significant remodeling of vision was underway towards the “fabricated visual spaces” of computer graphics. Crary described this change as “a sweeping reconfiguration of relations between an observing subject and modes of representation that effectively nullifies most of the culturally established meanings of the term observer and representation.”

I propose that the framework Crary laid out in his analysis of the emergence of modern visual culture can act as a guide in understanding this more recent digital turn. In this proposal, I will summarize Crary’s analysis of the emergence of modern visual culture and try to posit an analogous description of the contemporary digital visual regime of which OpenGL is the foundation. In doing so, I will constantly seek to point out how such a description could be supported by close analysis of OpenGL as a computing platform and to answer the two core questions that Crary poses of any transformation of vision: “What forms or modes are being left behind?” and “What are the elements of continuity that link contemporary imagery with older organizations of the visual?” Due to the nature of OpenGL, this analysis will constantly take technical, visual, and social forms.

As a platform, OpenGL has played stage to two stories that are quintessential to the development of much 21st century computing. It has been the site of a process of industry standardization and it represents an attempt to model the real world in a computational environment. Under close scrutiny, both of these stories reveal themselves to be tales of negotiation between multiple parties and along multiple axes. These stories are enacted on top of OpenGL as what Crary calls the “social surface” that drives changes in vision:

“Whether perception or vision actually change is irrelevant, for they have no autonomous history. What changes are the plural forces and rules composing the field in which perception occurs. And what determines vision at any given historical moment is not some deep structure, economic base, or world view, but rather the functioning of a collective assemblage of disparate parts on a single social surface.”

As the Wikipedia entry emphasized, OpenGL is a platform for industry standardization. It arose out of the late 80s and early 90s when a series of competing companies (notably Silicon Graphics, Sun Microsystems, Hewlett-Packard, and IBM) each brought incompatible 3D hardware systems to market. Each of these systems was accompanied by its own disparate graphics programming API that took advantage of that hardware’s particular capabilities. Out of a series of competitive stratagems and developments, OpenGL emerged as a standard, backed by Silicon Graphics, the market leader.

The history of its creation and governance was a process of negotiating both these market convolutions and the increasing interdependence of these graphics programming APIs with the hardware on which they executed. An understanding of the forces at play in this history is necessary to comprehend the current compromises represented by OpenGL today and how they shape the contemporary hardware and software industries. Further, OpenGL is not a static, complete system, but rather one undergoing continuous development and evolution. A comprehensive account of this history would represent the backstory that shapes these developments and help the reader understand the tensions and politics that structure the current discourse about how OpenGL should change in the future, a topic I will return to at the end of this proposal.

The OpenGL software API co-evolved with the specialized graphics hardware that computer vendors introduced to execute it efficiently. These Graphical Processing Units (GPUs) were added to computers to make common graphical programming tasks faster as part of the competition between hardware vendors. In the process, the vendors built assumptions and concepts from OpenGL into these specialized graphics cards in order to improve the performance of OpenGL-based applications on their systems. And, simultaneously, the constraints and affordances of this new graphics hardware influenced the development of new OpenGL APIs and software capabilities. Through this process, the GPU evolved to be highly distinct from the existing Central Processing Units (CPUs) on which all modern computing had previously taken place. The GPU became highly tailored to the parallel processing of large matrices of floating point numbers. This is the fundamental computing technique underlying high-level GPU features such as texture mapping, rendering, and coordinate transformations. As GPUs became more performant and added more features, they became more and more important to OpenGL programming, and the boundary where execution moves between the CPU and the GPU became one of the central features of the OpenGL programming model.

OpenGL is a kind of pidgin language built up between programmers and the computer. It negotiates between the programmers’ mental model of physical space and visuality and the data structures and functional operations which the graphics hardware is tuned to work with. In the course of its evolution it has shaped and transformed both sides of this negotiation. I have pointed to some ways in which computer hardware evolved in the course of OpenGL’s development, but what about the other side of the negotiation? What about cultural representations of space and visuality? In order to answer these questions I need to both articulate the regime of space and vision embedded in OpenGL’s programming model and also to situate that regime in a historical context, to contrast it with earlier modes of visuality. In order to achieve these goals, I’ll begin by summarizing Crary’s account of the emergence of modern visual culture in the 19th century. I believe this account will both provide historical background as well as a vocabulary for describing the OpenGL vision regime itself.

In Techniques of the Observer, Crary describes the transition between the Renaissance regime of vision and the modern one by contrasting the camera obscura with the stereograph. In the Renaissance, Crary argues, the camera obscura was both an actual technical apparatus and a model for “how observation leads to truthful inferences about the world”. By entering into its “chamber”, the camera obscura allowed a viewer to separate himself from the world and view it objectively and completely. But, simultaneously, the flat image formed by the camera obscura was finite and comprehensible. This relation was made possible by the Renaissance regime of “geometrical optics”, where space obeyed well-known rigid rules. By employing these rules, the camera obscura could become, in Crary’s words, an “objective ground of visual truth”, a canvas on which perfect images of the world would necessarily form in obeisance to the universal rules of geometry.

In contrast to this Renaissance mode of vision, the stereograph represented a radically different modern visuality. Unlike the camera obscura’s “geometrical optics”, the stereograph and its fellow 19th century optical devices were designed to take advantage of the “physiological optics” of the human eye and vision system. Instead of situating their image objectively in a rule-based world, they constructed illusions using eccentricities of the human sensorium itself. Techniques like persistence of vision and stereography manipulate the biology of the human perception system to create an image that only exists within the individual viewer’s eye. For Crary, this change moves visuality from the “objective ground” of the camera obscura to possess a new “mobility and exchangeability” within the 19th century individual. Being located within the body, this regime also made vision regulatable and governable by the manipulation and control of that body, and Crary spends a significant portion of Techniques of the Observer teasing out the political implications of this change.

But what of the contemporary digital mode of vision? If interactive computer graphics built with OpenGL are the contemporary equivalent of the Renaissance camera obscura or 19th century stereography, what mode of vision do they embody?

OpenGL enacts a simulation of the rational Renaissance perspective within the virtual environment of the computer. The process of producing an image with OpenGL involves generating a mathematical description of the full three dimensional world that you want to depict and then rendering that world into a single image. OpenGL contains within itself the camera obscura, its image, and the world outside its walls. OpenGL programmers begin by describing objects in the world using geometric terms such as points and shapes in space. They then apply transformations and scaling to this geometry in absolute and relative spatial coordinates. They proceed to annotate these shapes with color, texture, and lighting information. They describe the position of a virtual camera within the three dimensional scene to capture it into a two dimensional image. And finally they animate all of these properties and make them responsive to user interaction.
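To make that workflow concrete, here is a toy sketch of those steps written in Processing using its 3D renderer (OpenGL-backed in recent Processing releases, so this illustrates the programming model rather than the raw OpenGL C API): describe a piece of geometry, transform it, light it, frame it with a virtual camera, and animate it.

```java
// Toy illustration of the workflow described above, not production OpenGL code.
float angle = 0;

void setup() {
  size(640, 480, P3D);
}

void draw() {
  background(20);

  // Place a virtual camera in the scene: eye position, look-at point, up vector.
  camera(0, -200, 400, 0, 0, 0, 0, 1, 0);

  // Annotate the scene with lighting information.
  ambientLight(40, 40, 40);
  directionalLight(255, 255, 255, -0.5, 0.5, -1);

  // Describe an object in geometric terms and position it with transformations.
  pushMatrix();
  translate(0, 0, 0);
  rotateY(angle);
  scale(1.5);
  fill(200, 120, 50);
  box(100);  // a cube described as geometry in model space
  popMatrix();

  // Animate by changing the transformation each frame.
  angle += 0.01;
}
```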

To extend Crary’s history, where the camera obscura embodied a “geometric optics” and the stereograph a “physiological optics”, OpenGL employs a “symbolic optics”. It produces a rule-based simulation of the Renaissance geometric world, but leaves that simulation inside the virtual realm of the computer, keeping it as matrices of vertices on the GPU rather than presuming it to be the world itself. OpenGL acknowledges its system is a simulation, but we undergo a process of “suspension of simulation” to operate within its rules (both as programmers and as users of games, etc. built on the system). According to Crary, modern vision “encompasses an autonomous perception severed from any system”. OpenGL embodies the Renaissance system and imbues it with new authority. It builds this system’s metaphors and logics into its frameworks. We agree to this suspension because the system enforces the rules of a Renaissance camera obscura-style objective world, but one that is fungible and controllable.

The Matrix is the perfect metaphor for this “symbolic optics”. In addition to being a popular metaphor of a reconfigurable reality that exists virtually within a computer, the matrix is actually the core symbolic representation within OpenGL. OpenGL transmutes our description of objects and their properties into a series of matrices whose values can then be manipulated according to the rules of the simulation. Since OpenGL’s programming model embeds the “geometric optics” of the Renaissance within it, this simulation is not infinitely fungible. It possesses a grain that points towards a set of “realistic” representational results, and attempting to go against that grain requires working outside the system’s assumptions. However, the recent history of OpenGL has seen an evolution towards making the system itself programmable, loosening these restrictions by giving programmers the ability to reprogram parts of its default pipeline themselves in the form of “shaders”. I’ll return to this topic in more detail at the end of this proposal.
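For readers who have never looked under that hood, here is the kind of arithmetic being described, written out by hand in plain Java rather than taken from any OpenGL API: a vertex in homogeneous coordinates multiplied by a 4x4 translation matrix. Translation, rotation, scaling, and perspective projection are all expressed as 4x4 matrices applied in exactly this way.

```java
public class MatrixSketch {
    // Multiply a 4x4 matrix (row-major) by a homogeneous column vector.
    static float[] transform(float[][] m, float[] v) {
        float[] out = new float[4];
        for (int row = 0; row < 4; row++) {
            for (int col = 0; col < 4; col++) {
                out[row] += m[row][col] * v[col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A vertex at (1, 2, 3), with w = 1 for homogeneous coordinates.
        float[] vertex = {1, 2, 3, 1};

        // A translation by (10, 0, -5) encoded as a 4x4 matrix.
        float[][] translate = {
            {1, 0, 0, 10},
            {0, 1, 0,  0},
            {0, 0, 1, -5},
            {0, 0, 0,  1}
        };

        float[] moved = transform(translate, vertex);
        // Prints 11.0 2.0 -2.0 1.0: the same point, shifted by the matrix.
        System.out.println(moved[0] + " " + moved[1] + " " + moved[2] + " " + moved[3]);
    }
}
```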

To illustrate these “symbolic optics”, I would conduct a close analysis of various components of the OpenGL programming model in order to examine how they embed Renaissance-style “geometric optics” within OpenGL’s “fabricated visual spaces”. For example, OpenGL’s lighting model, with its distinction between ambient, diffuse, and specular forms of light and material properties, would bear close analysis. Similarly, I’d look closely at OpenGL’s various mechanisms for representing perspective, from the depth buffer to its various blending modes and fog implementation. Both of these topics, light and distance, have a rich literature in the history of visuality that would make for a powerful launching point for this analysis of OpenGL.

To conclude this proposal, I want to discuss two topics that look forward to how OpenGL will change in the future both in terms of its ever-widening cultural application and the immediate roadmap for the evolution of the core platform.

Recently, Matt Jones of British design firm Berg London and James Bridle of the Really Interesting Group have been tracking an aesthetic movement that they’ve been struggling to describe. In his post introducing the idea, The New Aesthetic, Bridle describes this as “a new aesthetic of the future” based on seeing “the technologies we actually have with a new wonder”. In his piece, Sensor-Vernacular, Jones describes it as “an aesthetic born of the grain of seeing/computation. Of computer-vision, of 3d-printing; of optimised, algorithmic sensor sweeps and compression artefacts. Of LIDAR and laser-speckle. Of the gaze of another nature on ours.”

What both Jones and Bridle are describing is the introduction of a “photographic” trace of the non-digital world into the matrix space of computer graphics. Where previously the geometry represented by OpenGL’s “symbolic optics” was entirely specified by designers and programmers working within its explicit affordances, the invention of 3D scanners and sensors allows for the introduction of geometry that is derived “directly” from the world. The result is imagery that feels made up of OpenGL’s symbols (they are clearly textured three dimensional meshes with lighting) but in a configuration different from what human authors have previously made with these symbols. However, these images also feel dramatically distinct from traditional photographic representation, as the translation to OpenGL’s symbolic optics is not transparent, but instead reconfigures the image along lines recognizable from games, simulations, special effects, and the other cultural objects previously produced on the OpenGL platform. The “photography effect” that witnessed the transition from the Renaissance mode of vision to the modern becomes a “Kinect effect”.

A full-length platform studies account of OpenGL should include analyses of some of these Sensor-Vernacular images. A particularly good candidate subject for this would be Robert Hodgin’s Body Dysmorphic Disorder, a realtime software video project that used the Kinect’s depth image to distort the artist’s own body. Hodgin has discussed the technical implementation of the project in depth and has even put much of the source code for the project online.

Finally, I want to discuss the most recent set of changes to OpenGL as a platform in order to position them within the framework I’ve established here and sketch some ideas of what issues might be in play as they develop.

Much of the OpenGL system as I have referred to it here assumes the use of the “fixed-function pipeline”. The fixed-function pipeline represents the default way in which OpenGL transforms user-specified three dimensional geometry into pixel-based two dimensional images. Until recently, in fact, the fixed-function pipeline was the only rendering route available within OpenGL. However, around 2004, with the introduction of the OpenGL 2.0 specification, OpenGL began to make parts of the rendering pipeline itself programmable. Instead of simply abiding by the logic of simulation embedded in the fixed-function pipeline, programmers began to be able to write special programs, called “shaders”, that manipulated the GPU directly. These programs provided major performance improvements, dramatically widened the range of visual effects that could be achieved, and placed programmers in more direct contact with the highly parallel matrix-oriented architecture of the GPU.

Since their introduction, shaders have gradually transitioned from the edge of the OpenGL universe to its center. New types of shaders, such as geometry and tessellation shaders, have been added that allow programmers to manipulate not just superficial features of the image’s final appearance but to control how the system generates the geometry itself. Further, in the most recent versions of the OpenGL standard (versions 4.0 and 4.1), the procedural, non-shader approach has been removed entirely.

How will this change alter OpenGL’s “symbolic optics”? Will the move towards shaders remove the limits of the fixed-function pipeline that enforced OpenGL’s rule-based simulation logic, or will that logic be re-inscribed in this new programming model? Either way, how will the move to shaders alter the affordances and restrictions of the OpenGL platform?

To answer these questions, a platform studies approach to OpenGL would have to include an analysis of the shader programming model: how it provides different aesthetic opportunities than the procedural model, and how those differences have shaped the work made with OpenGL as well as the programming culture around it. Further, this analysis, which began with a discussion of standards when looking at the emergence of OpenGL, would have to return to that topic when looking at the platform’s present prospects and conditions in order to explain how the shader model became central to the OpenGL spec and what that means for the future of the platform as a whole.

That concludes my proposal for a platform studies approach to OpenGL. I’d be curious to hear from people more experienced in both OpenGL and Platform Studies as to what they think of this approach. And if anyone wants to collaborate in taking on this project, I’d be glad to discuss it.

Posted in kinect | Leave a comment

All Watched Over: On FOO, Cybernetics, and Big Data

Last weekend I had the privilege to attend FOO Camp. FOO is a loosely structured conference organized by O’Reilly, my publisher. At FOO, O’Reilly brings together a couple hundred people they think have an interesting perspective on contemporary goings on in the world of technology. Having all these people together gives O’Reilly a chance to take the pulse of the contemporary tech world and to create relationships with some of the people shaping that world. Being asked to attend is a privilege both because the invitation means that O’Reilly thinks you’re interesting enough to contribute and also because actually attending means you get to share this perspective — you get your own chance to see what the big trends are across the industry.

From my perspective, the central theme of FOO this year was: big data will save us. There were a bunch of participant-organized sessions about working with data (Big Data Best Practices, The Unreasonable Effectiveness of Data, Towards A Non-Creepy People Database, Data-Driven Parenting, even). One of the sub-themes was using data for social and political good. There were a number of participants from Code For America and “…for America!” became a kind of running gag, a suffix you could append to any technology or idea to make it fit the theme of the conference. Further, a striking number of the participants I met worked with data at various web companies big and small, including Google and LinkedIn (interestingly, no one I met, at least, was there from Twitter). “Data scientist” was a relatively common job title.

Overall, there seemed to be a pervasive worldview that, if stated reductively, might be expressed thusly: Now, with so much of human behavior taking place over the web, mobile devices, and through other information-producing systems, we are collecting so much data that the only rational way of approaching most decision-making is through rigorous data analysis. And through the kind of thorough data analysis made possible by our new massive cloud computing resources we can finally break through the inherent irrationalities and subjectivities built into our individual observations, mental models, worldviews, and ideologies and into a new more objective data-driven representation of the world that can improve and rationalize our decision making.

I’m intentionally stating this idea more strongly and starkly than any individual FOO participant would ever have done in an actual session. These are incredibly smart people who live in the midst of the subtle distinctions and limitations that come up in practice when working on these kinds of problems in real life. By stating the underlying worldview this way, I’m not trying to create a straw man, but just to demonstrate the striking irony of these ideas emerging as dominant in this particular community and in this particular setting. The more I saw this “big data will save us” theme emerge, the more jarring it felt in contrast with the structure of FOO itself and, in many ways, O’Reilly’s philosophy as a company.

O’Reilly’s company slogan is: “changing the world by spreading the knowledge of innovators”. And FOO Camp itself is a perfect example of the company’s approach to achieving this goal. O’Reilly operates through a kind of personal networked social intelligence. They identify early adopters, create relationships with them, introduce them to each other, find out what they’re working on and what they’re interested in, and then use that knowledge to make informed guesses about where the tech world is going. Nearly all of these activities happen in person. All the books they publish, conferences they put on, and blogs they run are an epiphenomenon of this underlying process of personal relationship-building and hunch creation.

The key thing about this process is how human it is. O’Reilly’s process relies almost exclusively on human traits that aren’t represented in data or reproduced in a model: the trust between two peers that allows them to talk about a crazy idea that one of them is thinking about taking on outside of work, the ability to tell who’s highly respected in a field by the tone of voice people use when mentioning a name, the gut instinct of an experienced industry visionary for what will happen next, etc.

So, one question, then, is what would O’Reilly look like if they reinvented themselves as a Big Data company? Given all the resources of Big Data and the computation to crunch it how would you detect and spread the knowledge of innovators? How would you use data to attack the problem of identifying, tracking, predicting and collaborating with the early adopters and big thinkers that drive technological change?

I can think of a couple of notable attempts to do just this, but neither of them are exactly Big Data’s biggest triumphs. At the height of the blog era Technorati and other aggregators tried to automate the processes of bringing together this kind of knowledge by tracking blogs. And today Twitter Trends (along with a half-jillion Twitter analytics startups) does something similar. But neither of these seems to be any real threat to O’Reilly.

But that doesn’t mean that there isn’t a good idea out there somewhere to do just that. And if someone came up with a data-driven way to automate and beat O’Reilly’s human-centric process it seems like there’d be quite a lot of money in it — O’Reilly’s estate in Sebastopol is really quite nice.

All Watched Over by Machines of Loving Grace

There were two sessions at FOO that addressed this contrast between Big Data and Personal Knowledge head-on, attempting to put them into historical and theoretical context. The first one, organized by Matt Jones from Berg London, was a screening of All Watched Over by Machines of Loving Grace by Adam Curtis. Specifically, we watched Episode Two of this BBC documentary series, The Use and Abuse of Vegetational Concepts.

All Watched Over by Machines of Loving Grace is a three-part documentary film/polemical essay about the relationship between humans and computers. Episode Two looks at the history of cybernetics, how it arose out of developments in computer science and ecosystems theory, and how it came to shape much contemporary thinking about computers, the web, and web-mediated culture.

Cybernetics is the discipline of modeling the world as a series of interlocking regulatory systems. It was developed by Norbert Wiener, Jay Forrester, and other engineers building massively complicated undertakings around the time of World War Two, such as the SAGE anti-aircraft radar system. These systems were so complex that their designers had to move beyond the simple determinative rules normally enforced by computer code. Instead, they developed and embedded into their projects sophisticated models of the world. In order to do this, they developed a symbolic language for describing how various parts of the world interact as components in a system, influencing and altering each other in the process in order to produce various steady states.

This approach to design was known as cybernetics or systems theory. It gave birth to much of the last half century of ideas in computer and software design from object-oriented programming to HTTP. However, the cyberneticists didn’t stop with designing computer systems. They applied their systems modeling approach to everything from architecture to ecosystems to global economics.

Jay Forrester went on from his work on SAGE to produce World3, a comprehensive computer simulation of the entire economy and environment of the earth. In the 70s, World3 predicted economic and societal collapse for the world from pollution and overcrowding and became the basis for the controversial Club of Rome study, The Limits to Growth. The World3 model included representations of people, technology, government, economic resources, nature, and, most importantly, all the interconnections between them. It attempted to model the entire world just as the SAGE software had attempted to model the relationship between the radar tracks of enemy aircraft and the anti-aircraft guns under its control.

Jay Forrester with his model for the entire global economic and environmental system.

One of the keys to supporting these models was collecting a huge amount of data. For example ecologist George Van Dyne attempted to build a comprehensive computer model of the Colorado grasslands ecosystem. He hired graduate students to watch and record every bite of food taken by every antelope on the slopes. He even went so far as to cut holes in the side of living bison so his staff could look inside the bison’s stomach to examine their daily diet.

Colorado bison with an open API.

As a polemic, Curtis’s film does more than present this history in a neutral manner. He constructs a critique of cybernetics. He argues that this emphasis on building ever-more accurate models of the world — and, especially, automating their results through the supposedly objective computer — represses any idea of individual agency to change the system while simultaneously causing us to project a false agency onto the system itself. In other words, Curtis focuses on cybernetics’ conservative political repercussions. In his account, this faith in the technologically augmented system model becomes a reason to defend the status quo.

In some ways, showing this film at FOO was an epic act of trolling on Matt Jones’ part. Cybernetics was the dominant philosophy of the 60s and 70s techno-counterculture within which O’Reilly arose. And much of how O’Reilly thinks and talks shows this influence clearly. Showing this film at an elite pow-wow at O’Reilly’s Sebastopol headquarters is a bit like screening Michael Moore’s Roger and Me at the annual GM shareholders meeting.

Curtis’s critique of cybernetics also served as an implicit critique of the “big data will save us” thematics of this year’s FOO. After watching the film, it was hard not to think of the data scientists as similar to those guys out in Colorado elbow-deep in bison trying to make their models add up.

This critique became even more explicit at the second Big Data vs. Personal Knowledge session of FOO: “Welcome to the Anthropocene”. Hosted by Matt Jones (again), Ben Cerveny (from Bloom), and Matt Biddulph (lately of Nokia), this fun, sprawling session was much harder to pigeonhole than the All Watched Over screening (especially because of its deceptive title, which, as far as I could tell, had only the most oblique relationship to the subject matter).

The subject matter was the coming return of predictive interfaces. The most famous (and despised) predictive interface is probably Clippy:

Clippy was a Microsoft Office feature that attempted to predict the user’s intention based on his preceding actions in order to jump in and help. The panel imagined a new Augmented Reality version of Clippy, “Reality Clippy”, that would use all of the available online data about your preferences and past actions (all your Foursquare checkins, Yelp and Amazon reviews, credit card purchases, Tweets, etc.) to suggest next actions you could take while moving about the city. As Cal Henderson reductio ad absurdum-ed it at one point: maybe we could get the interface down to an iPhone app that would superimpose a bright white line over the camera’s view of the surrounding street just telling us where to walk and what to do and buy all day long. Wouldn’t that be a bit of a relief?

Ben Cerveny described the instinct towards wanting to build these kinds of interfaces, and generally to so thoroughly augment our decision making with data, as a desire to “live in the model”. In other words, this new belief in Big Data takes the cybernetic vision one step further. Rather than simply building a comprehensive model of the world inside the computer so that we can make predictions and plans about it, the next step this time around is to actually try to live our lives inside this model, to superimpose it on the world in front of our eyes and place it as an intermediary in all of our online social interactions.

On my way home from FOO I sat staring out the car window, all of these impressions, ideas, and seeming contradictions bouncing around in my head. And then something occurred to me. O’Reilly’s human-centered approach is still a kind of systems thinking. O’Reilly is still building a model of what the geek world is working on. They’re just doing it through the social relationships that their employees form with other geeks. The “data” they gather is stored in their employees’ heads and hearts and in those of the wider community of geeks they bring to events like FOO. Instead of trying to live in the model, O’Reilly tries to live in the community.

As Matt Jones said in a kind of punchline to the Anthropocene session: “The map-reduce is not the territory.” But the community just might be.

Posted in Opinion | 11 Comments

On the Future and Poetry of the Calibration Pose

Interesting stuff here from Tom Armitage on a subject that’s been much on my mind lately: Waving at the Machines.

How does a robot-readable world change human behaviour?

[…]

How long before, rather than waving, or shaking hands, we greet each other with a calibration pose?

Which may sound absurd, but consider a business meeting of the future:

I go to your office to meet you. I enter the boardroom, greet you with the T-shaped pose: as well as saying hello to you, I’m saying hello to the various depth-cameras on the ceiling that’ll track me in 3D space. That lets me control my Powerpoint 2014 presentation on your computer/projector with motion and gesture controls. It probably also lets one of your corporate psychologists watch my body language as we discuss deals, watching for nerves, tension. It might also take a 3D recording of me to play back to colleagues unable to make the meeting. Your calibration pose isn’t strictly necessary for the machine – you’ve probably identified yourself to it before I arrive – so it just serves as formal politeness for me.

Nice little piece of gesture recognition sci-fi/design fiction here looking at how the knowledge that we’re surrounded by depth sensors and pose recognition systems may alter human behavior and custom.

I’m fascinated by (and deeply share) people’s fixation on the calibration pose. It comes up over and over again as people have their first exposure to the Kinect.

The use of this particular pose to calibrate gesture recognition systems seems to have originated in security procedure where it’s known as the “submission pose”, but in the academic computer science literature it tends to get referred to by the much drier “Psi pose”.

On the one hand this calibration pose is comforting because it represents a definable moment of interaction with the sensor system. Instead of simply being tracked invisibly it gives us the illusion that our submission to that kind of tracking must be conscious — that if we don’t assume the calibration pose then we can’t be tracked.

On the other hand, we find the pose disturbing because it brings the Kinect’s military and security heritage to the surface. The only other times we stand in the submissive pose are while we’re passing through security checkpoints at airports or the like or, even more vividly, when we’re being held at gunpoint. Intellectually we may know that the core technology of the Kinect came from military and security research funding in the last decade’s war on terror. When the Kinect first launched, Matt Webb captured this reality vividly in a tweet:

“WW2 and ballistics gave us digital computers. Cold War decentralisation gave us the Internet. Terrorism and mass surveillance: Kinect.”

However, it’s one thing to know abstractly about this intellectual provenance and it’s another thing to have to undergo a physical activity whose origins are so obviously in violent dominance rituals every time we want to play a game or develop a new clever hack.

I think that it’s the simultaneous co-existence of these two feelings, the oscillation between them, that makes the existence of the calibration pose so fascinating for people. We can’t quite keep them in our minds at the same time. In the world we know, they should be parts of two very different spheres; hence their simultaneous co-existence must be a sign of some significant change in the world, a tickle telling us our model of things needs updating.

Technically speaking, the necessity of the pose is already rapidly fading. It turns out that pose tracking software can record a data sample for a single person and then use that to obviate the need for future subjects to actually perform the pose themselves. This works so long as those people are of relatively similar body types to the person who performed the original calibration.

I wonder if the use of the calibration pose will fade to the point where it becomes retro, included only by nostalgic programmers who want to create that old 11-bit flavor of early depth cameras in their apps. Will we eventually learn to accommodate ourselves to a world where we’re invisibly tracked and take it for granted? Will the pose fall away in favor of new metaphors and protocols that are native to the new interface world slowly coming into existence?

Or, conversely, maybe we’ll keep calibration around because of its value as a social signifier like the host in Armitage’s story who goes through the calibration pose as part of a greeting ritual even though it’s not necessary for tracking. Will it sink into custom and protocol because of its semantic value in a way that preserves it even after it loses its technical utility?

Either way, it’s a post-Kinect world from here on in.

Posted in kinect | Leave a comment

Making Things See: A Book for O’Reilly about the Kinect

It’s been quiet around here for a while, but not for me out in the real world. Since my last post here, I’ve finished my master’s thesis and graduated from ITP. Over the next couple of weeks I plan to catch up on documenting many of the projects from the last few months of the school year, so keep watch here for more posts soon.

My time at ITP was an amazing two years that I can’t believe is already over. I spent so long looking forward to the program and planning for it that, in the end, it seemed to go by in a whirlwind. I met some of the most brilliant, creative, and kind people I’ve ever come across and I hope to be working and collaborating with many of them for years to come.

So, what comes next? I have two announcements to share about my future plans.

First off, I’m excited to announce that I’ve been contracted by O’Reilly to write a book about the Microsoft Kinect. The book is tentatively titled Making Things See: Computer Vision with the Microsoft Kinect, Processing, and Arduino. My goal is to introduce users to working with the Kinect’s depth camera and skeleton tracking abilities in their own projects and also to put those abilities in the wider context of the fields of gestural interfaces, 3D scanning, computer animation, and robotics. It is my belief that the Kinect is just the first in a generation of new computer vision technologies that are going to enable a vast number of programmers and creative technologists to build all kinds of amazing new applications and hacks. The key to building these applications will not simply be grokking the particular workings of the Kinect and its associated libraries as they happen to exist right now, but gaining a richer understanding of the wider areas of study that these new abilities unlock. Gaining the ability to access depth images and user skeleton data unlocks a door that opens into a vast territory that researchers and computer scientists have been exploring for some time now. My hope is to translate some of what those researchers have learned into a map that students, creative coders, and other newcomers like myself can use to make really cool stuff in this area.

Further, since the particular details of working with the Kinect are rapidly evolving with developments in the associated open source and commercial communities — and since the Kinect is unlikely to remain the only widely available depth camera for long — providing some broader context is the only way to ensure that the book stays relevant beyond the current micro-moment in the technological development cycle.

Making Things See is going to use the Processing creative coding environment for all of its examples. It will be targeted at people with some familiarity with Processing or equivalent programming experience (i.e., people who’ve read Casey Reas and Ben Fry’s Getting Started with Processing, Dan Shiffman’s excellent Learning Processing, or something similar). I’ll start with the absolute basics of accessing the Kinect and work my way up to showing you how to build relatively sophisticated projects drawn from those application areas: user-mimicking robotic arms, gestural interfaces that automate occupational therapy exercises, digitally fabricated prints from 3D scans, and motion capture puppets that render beautiful 3D animations.

The book is tentatively scheduled to come out in physical form this holiday season, with a digital Early Release this summer as soon as I have the first major section completed. If you have any requests for areas you’d like to see covered or any other constructive comments, I’d love to hear them in the comments. You can also follow me on Twitter (@atduskgreg) for updates and announcements as things progress.

I’m also proud to announce that I’ve been selected as one of ITP’s Resident Researchers for the 2011-2012 school year. This means that I’ll be helping teach students as well as working on my own projects with support from the school. I’m especially excited to continue the transformation of ITP’s web curriculum that I helped kick off this year. It looks like ITP is going to be using Ruby and Sinatra in nearly all of its web programming classes come the fall, so I’ll be doing all I can to use my experience in this area to support those efforts. I’ll be working closely with Clay Shirky, who heads up the web applications effort in the department, as well as fellow resident Rune Madsen. In addition to the transition to Ruby, we’ll be working to make it ever easier to build web-connected physical devices with the Arduino. With all the activity around the Android ADK and the use of the iPhone headphone jack to connect arbitrary hardware, it’s going to be a very exciting year for integrating physical computing projects with mobile devices and, hence, for networked objects in general. Clay, Rune, and I will be working with Tom Igoe and the rest of the ITP faculty to make it as easy as possible for students to dive into this area.

I’m sad my time as a student at ITP has come to an end, but I’m incredibly excited about these (and other still developing) prospects.

Why The Pro Apps Matter: Final Cut Pro and Apple’s Market Focus

On the most recent episode of The Talk Show, John Gruber and Dan Benjamin discussed the release of Final Cut Pro X. Above all, they were asking: why is Apple in this market? Why does Apple spend money producing professional-grade video editing software, undertaking extensive rewrites of apps like Final Cut that make up a tiny percentage of their total revenue, when they are, fundamentally, a consumer hardware company?

I think the answer to this question can be traced back to the 1997 Boston Macworld Expo. That event marked Steve Jobs’ first public presentation on behalf of Apple since returning to the company after a 12-year absence. In fact, Jobs announced that he was officially joining the company’s board as part of the keynote.

The 1997 keynote is best remembered for the boos Steve received upon announcing a partnership with Microsoft that would guarantee the presence of Office on the Mac. However, its real meaning — and the answer to the question of Final Cut Pro — lies in the strategy that Jobs articulated for Apple’s recovery. In describing that strategy, Jobs spelled out how Apple imagines the core market for the Mac, the role the pro apps play in serving that market, and how Apple could expand from that market out to wider competitiveness.

In retrospect, nearly a decade and a half later, this keynote sketches the map that Apple has followed to its current explosive success.

The relevant portion of the keynote starts at 18:50 in the video when Jobs begins his description of Apple’s “market focus”.

“The question is: where is Apple relevant? Where is Apple still the dominant player? Which market segments? And there are two. The first one I call ‘creative content’: publishing, design, prepress, etc. It’s creative professionals using computers. And Apple is still the dominant market leader for creative professionals by far.”

Jobs goes on to cite statistics: the Mac represents “80% of all computers used in advertising, graphic design, prepress and printing” and “64% of Internet websites are created on Macintosh”. However, before Steve’s return, Apple had been failing in this market:

“We haven’t been doing a good enough job here. As an example, something like 10-15% of Mac sales can be traced back to people using Adobe Photoshop as their power app. When was the last time you saw Adobe and Apple co-marketing Photoshop? When was the last time we went to Adobe and said, ‘How do we make a computer that will run Photoshop faster?’ These things haven’t been as cohesive as they could’ve been and I think we’re going to start proactively focusing much more on how we do these things.”

The core of Apple’s Mac business in 1997, according to Steve Jobs, was creative professionals, people who make media on the computer as part of their living. While the decade since the advent of the iPod has seen the Mac become an ever-dwindling percentage of Apple’s total business, I think this is still the soul of the market for Macs: people who use their computers to make things. During that time those things have evolved from print designs to websites and iOS apps, and, as multimedia has become ever easier to work with, they’ve grown to include music and movies as well.

Now, there’s an obvious objection to this as an argument for continuing to make Final Cut Pro. Apple has another app that allows creative types to make video content: iMovie. If the logic for going after movie makers and musicians is the same as that for serving designers and publishers, then why isn’t iMovie (and GarageBand, for that matter) good enough? After all, Apple has ceded the market in professional print and web design tools to Adobe and others. Why continue to make Final Cut Pro and Logic, Apple’s professional-grade movie and music creation tools?

I think the answer has to do with aspiration.

While there are many more creative professionals doing visual design than making music or movies, music and movies are the media with the strongest imaginative and emotional hold on all creatives. How many designers do you know who aren’t serious music fans or even amateur musicians? How many don’t love movies and even fantasize about making one themselves someday? Even Gruber and Benjamin devote a major segment of each episode of The Talk Show to their obsession with the Bond movies.

While very few of these designers may ever actually record an album or make a movie, they want to know that they could if they decided to. Their self-identity, and especially the part of it that aligns with Apple, includes a sense of these things as live possibilities. If they knew they’d have to switch away from the Mac to get serious about these dreams, that would have a dramatic impact on their relationship with Apple.

Further, many of these creatives are, to a greater or lesser degree, not yet fully professional. They might be corporate code monkeys who play in a band in the evening or contract designers working on their own screenplay. As such, they are obsessed with what professionals in their industry of aspiration do, and they take their cues on tools from them. So a feature about Walter Murch editing the new Coppola film in Final Cut Pro, for example, has a big impact.

Accessible tools clearly intended for amateurs like GarageBand and iMovie don’t satisfy this aspirational fantasy. You don’t want to imagine cutting your first indie feature in the same tool your uncle uses to edit home movies of his kids.

Unlike in graphic design, where the Adobe suite is the de facto industry standard, there aren’t obvious replacements for professional editing and recording on the Mac in the absence of Final Cut Pro and Logic. Adobe treats Premiere as a relatively low priority; Avid and Pro Tools are orders of magnitude more expensive and involve complex hardware integration, pulling them outside the realm of an aspirational purchase.

In order to keep the dreams of creative professionals alive, Apple has to keep making Final Cut Pro and Logic.

Creative tools + Education = Media Players

But what about that second market Steve mentioned back in 1997?

“The second market, one that’s very close to my heart, is education. Apple put the first computers in education. Apple did it again with the Macintosh. And Apple is still the dominant leader in education. Now I’m going to ask you a question: who is the largest education company in the world? The answer is Apple. Apple is the single largest education supplier in the world.”

As with creative professionals, Jobs cited impressive stats: “60% of all computers in education are Apples” and “64% of all computers teachers use” are Apples.

But what does education have to do with Apple’s strategy in the last 15 years? Apple’s big successes have been in media; what does education have to do with that? The answer is: the youth market.

Before the rise of mobile devices and the major fall in computer prices since the 90s, how many young people owned their own computer before college? Educational institutions, either in the classroom or the dorm room, are the site of most kids’ first computer experience. By committing to making great computers for education, Apple was really committing to making computers that kids, and especially teens and college kids, liked.

One of the main things that teens and college kids like to do with their computers is listen to music and watch movies. The iPod arose out of the attention Apple paid to this market. Apple knew what young people wanted to do with technology, it knew exactly how to sell it to them, and it had a brand they respected and understood, all because of its long-standing involvement in the educational market.

And Apple had deep technical expertise in media because of their work on creative tools. They had QuickTime and extensive knowledge of media decoding; they were deeply involved in the definition of codecs. Out of the combination of this media expertise with their knowledge of the youth market came the iPod and everything it has yielded.
