A Tour of the Arduino Internals: How does Hello World actually work?

Posted on May 18, 2009 by greg

Everyone who’s ever played with an Arduino has seen the following sketch:

void setup(){
  pinMode(13, OUTPUT);
}

void loop(){
  digitalWrite(13, HIGH);
  delay(500);
  digitalWrite(13, LOW);
  delay(500);
}

It’s the classic physical computing ‘hello world’: blink the onboard LED on and off every 500ms. But how does it really work? How does this code actually translate into the Arduino’s AVR microcontroller sending the right electrical signals to turn the LED on and off?

This question turns out to be a great entryway into learning about the architecture of the AVR chip as well as the Arduino software itself. In this post, I’ll explain how the AVR actually controls the pins on the Arduino and then walk through the implementation of each of the Arduino functions used for hello world (pinMode, digitalWrite, and delay) to show how the Arduino software uses the AVR’s capabilities to implement them.

AVRchitecture: Registers, PORTs, PINs, and DDRs

Like every processor, the AVR microcontroller is organized as a series of registers. Like tiny digital cubby holes, registers are little slots that the AVR can stick data into or grab data from.

Different registers can be used for different tasks. Some are meant for temporarily stashing data you’re in the midst of working with, just like you’d put your iPod and headphones down on a shelf so you could have your hands free to tie your shoes. Other registers are actually connected to peripheral devices (like the EEPROM, serial port, and analog and digital pins), and reading and writing from those registers is how you communicate with those devices, just like dropping an envelope full of cash in your landlord’s mail slot to pay your rent.

Each pin on the Arduino is hooked up to three registers: a Data Direction Register (DDR), a PORT register, and a PIN register. The AVR uses all three registers in concert to control each pin. The DDR is how the AVR configures the pin for either input or output (this is the heart of the Arduino’s pinMode function). If the pin is set up for output then the AVR uses the PORT register to send values to it (think digitalWrite); if it’s set up for input, the AVR uses the PIN register to read its value (digitalRead).

Each register on the AVR is 8 bits wide, which means that each cubby hole has eight individual sub-slots which can each store either a 1 or a 0. On the DDR, PORT, and PIN registers each of these sub-slots corresponds to exactly one of the Arduino’s physical pins. Hence, in order to have enough slots to control all of the Arduino’s pins, the AVR has several sets of these registers, each labeled with a letter, as in: DDRB, PORTB, PINB, DDRC, PORTC, PINC, etc. (The ATmega168 has three sets: B, C, and D. Bigger chips like the ATmega1280 have more.)

In our hello world sketch, the LED is plugged into pin 13 which happens to be connected to the B set of registers. Since we’re only outputting to our pin and not reading from it, blinking the LED involves the use of only two registers: DDRB and PORTB. First, we’ll have to turn on the bit in DDRB corresponding to pin 13 to configure it for output. Then we’ll send alternating 1s and 0s to the correct bit in PORTB to turn pin 13 on and off.
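To make that concrete, here is a minimal sketch in plain C of the two register operations just described. It is not the Arduino source: fake_ddr and fake_port are ordinary variables standing in for DDRB and PORTB so the code can run on a desktop, and configure_output and write_bit are names made up for illustration. The bit index is a parameter because which bit belongs to pin 13 depends on the board.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the real hardware registers so this compiles on a desktop. */
uint8_t fake_ddr;   /* plays the role of DDRB  */
uint8_t fake_port;  /* plays the role of PORTB */

/* Configure one bit of the direction register for output. */
void configure_output(uint8_t bit_index)
{
    assert(bit_index < 8);
    fake_ddr |= (uint8_t)(1u << bit_index);
}

/* Drive one bit of the port register high (1) or low (0). */
void write_bit(uint8_t bit_index, int high)
{
    assert(bit_index < 8);
    if (high)
        fake_port |= (uint8_t)(1u << bit_index);
    else
        fake_port &= (uint8_t)~(1u << bit_index);
}
```

Blinking then amounts to calling configure_output once and alternating write_bit(bit, 1) and write_bit(bit, 0) with a pause in between.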

Halt and Ponder the Magic of Electricity!

Before we move on to the Arduino software itself, it’s worth stopping to ponder this last step in a little more depth because it is truly a magical moment. Up until this last step of outputting to PORTB, we’ve been thinking of the bits in the AVR’s registers as having logical meaning. That is, the bits represent data, whether settings for our pins (as in the DDR) or other data we’re using for calculations (as in the working registers where you leave stuff while you tie your shoes).

However, since this is an electrical digital computer, whatever the bits might represent, they are stored and transmitted as electrical signals. In other words, when we talk about individual bits in a given register being set to 1 or 0 what we really mean is that there either is or isn’t electricity flowing through them. And so, when we send a 1 to the bit in PORTB corresponding to pin 13, we are in fact sending not just a logical value, but electricity as well, 5 volts of the stuff, enough to light up the attached LED.

This moment represents the heart of physical computing: electrical circuits representing logical values and conducting digital computation suddenly transmute into a change in the actual world. This is where the magic happens.

Arduino Software: pinMode()

Now that we’ve seen how the AVR chip controls the Arduino’s pins, let’s take a look at how the Arduino library tells the AVR itself what to do. To do that, I’ll walk you through the implementation of each of the functions used in our hello world example above to see how they work.

We’ll start by diving into the implementation of pinMode defined in arduino-0015/hardware/cores/arduino/wiring_digital.c thusly:

void pinMode(uint8_t pin, uint8_t mode)
{
  uint8_t bit = digitalPinToBitMask(pin);
  uint8_t port = digitalPinToPort(pin);
  volatile uint8_t *reg;
  
  if (port == NOT_A_PIN) return;
  
  // JWS: can I let the optimizer do this?
  reg = portModeRegister(port);
  
  if (mode == INPUT) *reg &= ~bit;
  else *reg |= bit;
}

We already know the overall goal of this code. It takes as input an integer representing a pin on the Arduino and another integer representing the mode into which we want to set the pin (in practice, we tend to pass the constants INPUT and OUTPUT, but if we look in arduino-0015/hardware/cores/arduino/wiring.h (lines 38 and 39) we can see that these constants are defined to be 0 and 1 respectively). From our discussion of the AVR architecture we know that in order to configure a pin, we’ll want to set the corresponding bit on the right DDR. So, this code will need to do two things: it’ll need to figure out which bit on which DDR corresponds to the given pin and it’ll need to set that bit high or low as appropriate given the desired mode.

Let’s walk through this code line by line to see how it accomplishes these things.

The first thing pinMode does is call the function digitalPinToBitMask, which is defined as a macro in arduino-0015/hardware/cores/arduino/pins_arduino.h:

#define digitalPinToBitMask(P) ( pgm_read_byte( digital_pin_to_bit_mask_PGM + (P) ) )

A macro is a kind of magic inline function that gets expanded at the time your program is compiled rather than when it’s already running on the AVR. Even though they can make things harder to understand, macros are used widely throughout the Arduino source (and even more so in the avr-gcc code it depends on) because of the extremely limited program space available on the AVR and to help with performance.

(Note: while unpacking this first function call, I’m going to be extremely thorough, following all sub-routine calls back through to their definition in both the Arduino source and its dependencies. This may make the flow of things hard to follow, but I want to give you a taste for how this stuff is really implemented all the way to the metal and also, hopefully, provide a little bit of a map that will help with your own future explorations into other parts of the internals. For subsequent steps, I’ll stay at more of a summary level of abstraction, trusting that you can follow this example to dive into the actual source if it suits your fancy.)

Now, let’s unpack this macro. It is going to produce a special byte that will be set to 0 at seven of its eight bits and 1 at only the bit corresponding to our pin’s location in DDRB. This byte is called a “bit mask” because when you combine it with another byte using the bitwise OR operator, it only sets the one slot where your mask had a 1, leaving all other slots at their pre-existing values. Once we’ve got that mask, our pinMode function will use it to make sure we only set the mode on the desired pin.
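Here is a small sketch of what a bit mask buys you, written as ordinary C that runs anywhere. The helper names set_bits and clear_bits are mine, not part of the Arduino library; the asserts in mask_example spell out the results you should expect.

```c
#include <assert.h>
#include <stdint.h>

/* Turn on the bits selected by mask; every other bit keeps its value. */
uint8_t set_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value | mask);
}

/* Turn off the bits selected by mask; every other bit keeps its value. */
uint8_t clear_bits(uint8_t value, uint8_t mask)
{
    return (uint8_t)(value & (uint8_t)~mask);
}

/* Worked example: the asserts document the expected results. */
void mask_example(void)
{
    assert(set_bits(0x05, 0x80) == 0x85);   /* 0b00000101 -> 0b10000101 */
    assert(clear_bits(0x85, 0x80) == 0x05); /* 0b10000101 -> 0b00000101 */
}
```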

So, how do we construct the bit mask we’re going to use to make sure we only turn on pin 13? Let’s proceed through the macro in execution order starting at the center of the nested parentheses with: digital_pin_to_bit_mask_PGM + (P). Even though it may look like a plain addition, this is actually the address calculation behind an array lookup: adding P to the array’s name gives the location of element P. For pin 13, that is the 14th element in an array called digital_pin_to_bit_mask_PGM (C arrays start counting their indices at 0). That array is defined in arduino-0015/hardware/cores/arduino/pins_arduino.c like so:

const uint8_t PROGMEM digital_pin_to_bit_mask_PGM[] = {
  // PIN IN PORT		
  // -------------------------------------------		
  // [...]
  _BV( 4 )	, // PB 4 ** 10 ** PWM10	
  _BV( 5 )	, // PB 5 ** 11 ** PWM11	
  _BV( 6 )	, // PB 6 ** 12 ** PWM12	
  _BV( 7 )	, // PB 7 ** 13 ** PWM13	
  _BV( 1 )	, // PJ 1 ** 14 ** USART3_TX	
  //[...]
};

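I’ve elided some array elements for the sake of brevity, but the numbering system is well indicated by the comments; the 14th element in this array, the one corresponding to pin 13, is the element that reads: _BV( 7 ). (This excerpt comes from the table for the ATmega1280 on the Arduino Mega, as names like PJ 1 and USART3_TX suggest. On ATmega168 boards pin 13 sits on bit 5 of the same port, so the entry there is _BV( 5 ); the rest of this walkthrough sticks with the 7 shown here.)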
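If the "addition that is really an array lookup" idea feels slippery, this desktop-only sketch may help. Everything in it is a stand-in: mask_table is a made-up four-entry table, and read_byte_at plays the role that pgm_read_byte plays on the AVR (on a desktop there is only one kind of memory, so it just dereferences the pointer).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical four-entry table; the values are illustrative only. */
static const uint8_t mask_table[] = { 0x10, 0x20, 0x40, 0x80 };

/* Stand-in for pgm_read_byte on a desktop: read the byte stored at an address. */
uint8_t read_byte_at(const uint8_t *address)
{
    return *address;
}

/* Same shape as digitalPinToBitMask: compute an address, then read from it. */
uint8_t lookup_mask(uint8_t index)
{
    assert(index < sizeof mask_table / sizeof mask_table[0]);
    return read_byte_at(mask_table + index);
}
```

Note that mask_table + index is an address, not a mask; only the read turns it into the byte stored there, which is exactly what mask_table[index] would give you.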

What, for goodness’ sake, is _BV( 7 )? It’s another macro. But this time, instead of being defined in the Arduino library itself, it’s defined in one of its dependencies, avr-libc (the C library that ships alongside avr-gcc), specifically arduino-0015/hardware/tools/avr/avr/include/avr/sfr_defs.h:

#define _BV(bit) (1 << (bit))

_BV returns the result of shifting the number 1 the given places to the left. In other words, _BV takes a byte that looks like this: 0b00000001 and moves that sole 1 to the left by the number of places indicated, in our specific case seven, to return: 0b10000000. (This 0b-a-bunch-of-zeros-and-ones notation is meant to indicate a single byte with 8 bits set to the values given to the right of the 'b'.)
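A quick way to convince yourself of this is to try the shift on a desktop. The sketch below copies the one-line definition under a different name (BIT_VALUE is my name, chosen so it can't clash with the real _BV) and checks a few values.

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as avr-libc's _BV, renamed to avoid clashing with it. */
#define BIT_VALUE(bit) (1 << (bit))

void bit_value_example(void)
{
    assert(BIT_VALUE(0) == 0x01); /* 0b00000001 */
    assert(BIT_VALUE(4) == 0x10); /* 0b00010000 */
    assert(BIT_VALUE(7) == 0x80); /* 0b10000000 */
}
```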

Jumping back to our macro, we can now see that digital_pin_to_bit_mask_PGM + (P) gives the address of an array element whose value is a byte with the leftmost bit set to 1 and all the other bits set to zero: 0b10000000. The macro then takes this address and passes it to pgm_read_byte, another macro defined inside avr-libc, this time in arduino-0015/hardware/tools/avr/avr/include/avr/pgmspace.h:

#define pgm_read_byte(address_short)    pgm_read_byte_near(address_short)

Understanding pgm_read_byte is going to be the most challenging part of this whole exercise. In order to explain it clearly, I need to first give you a little more background on the architecture of the AVR.

Programs, Data, and Harvard vs. von Neumann Architectures

In normal processors (like the one in the computer you're probably using to read this post) program instructions and data are both stored side-by-side in the same kind of memory. In other words, your computer has one big bank of memory (you probably remember seeing it in the spec when you bought your computer: 1GB or 2GB of RAM, most likely) and it stores both the programs you run (for example, the Arduino IDE or iTunes) and the data you run them on (your mp3s) all in that same bank of memory. This design is called a von Neumann architecture and a lot of the more advanced things your computer can do come from the interchangeability it provides between programs and data.

The AVR does not have a von Neumann architecture. Instead, it's built with an alternative design called the Harvard Architecture. In this design, programs and data live separately in different chunks of memory. Program instructions live in Program Memory and the data these instructions operate on live in Data Memory. The advantage of this divide is that it makes the AVR both dramatically simpler and more reliable: once loaded onto the chip, programs can't be changed by the data on which they operate. The disadvantage is that programs become more rigid, losing the ability to change and transform themselves as they encounter new data in the way allowed by the von Neumann architecture. For embedded devices expected to operate with high reliability for long periods of time doing relatively simple tasks, this is a great trade.

What does all of this have to do with understanding pgm_read_byte? Well, if you look a little more closely at the digital_pin_to_bit_mask_PGM array we visited earlier, you'll uncover the answer.

const uint8_t PROGMEM digital_pin_to_bit_mask_PGM[] = {
  // PIN IN PORT		
  // -------------------------------------------		
  // [...]
  _BV( 4 )	, // PB 4 ** 10 ** PWM10	
  _BV( 5 )	, // PB 5 ** 11 ** PWM11	
  _BV( 6 )	, // PB 6 ** 12 ** PWM12	
  _BV( 7 )	, // PB 7 ** 13 ** PWM13	
  _BV( 1 )	, // PJ 1 ** 14 ** USART3_TX	
  //[...]
};

Notice the declaration of this array. It's declared as being "const uint8_t PROGMEM". The first two parts are familiar: "const" means that it won't change in the course of running the program and "uint8_t" means that it will be an array of unsigned 8-bit integers. But what's "PROGMEM"? Reading the avr-libc documentation on the subject reveals that PROGMEM is a macro that allows you to store data in Program Memory.

But wait! Didn't I just spend two paragraphs telling you all about how the AVR's Harvard Architecture kept a rigid separation between Program and Data Memory in order to ensure simplicity and reliability? I did. But here's the thing: on most AVR processors, the amount of Data Memory is minuscule compared to the Program Memory. For example, the ATMega168 that comes standard on Arduinos these days has 16K of Program Memory and only 1024 bytes of Data Memory (albeit supplemented by an additional 0.5K of on-chip EEPROM). If the core Arduino and avr-libc code went around filling up these paltry 1024 bytes with the data structures they need to do their job, pretty soon there'd be none left for your code to use. Hence, avr-libc provides the PROGMEM macro so they can store data in Program Memory and pgm_read_byte (and some sibling functions for other data types) to pull it back out again.
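Here is a hedged sketch of the same store-then-read-back pattern applied to a table of your own. The squares table and the square_of function are invented for illustration. The #else branch is a desktop stand-in I've added so the example also compiles off the chip; on a real AVR build the first branch pulls in the genuine PROGMEM and pgm_read_byte.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* real PROGMEM and pgm_read_byte */
#else
/* Desktop stand-ins: there is only one memory, so these do nothing special. */
#define PROGMEM
#define pgm_read_byte(address) (*(const uint8_t *)(address))
#endif

/* Hypothetical lookup table kept out of Data Memory on the AVR. */
static const uint8_t PROGMEM squares[] = { 0, 1, 4, 9, 16, 25 };

/* Fetch one entry: compute its address, then read it back out. */
uint8_t square_of(uint8_t n)
{
    assert(n < sizeof squares / sizeof squares[0]);
    return pgm_read_byte(squares + n);
}
```

The thing to notice is that on the AVR you can't just write squares[n]; the data lives in the other memory, so every read has to go through pgm_read_byte or one of its siblings.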

How is pgm_read_byte actually implemented, then? If we follow the chain of macros begun in the define I showed earlier, we discover that the heart of pgm_read_byte is a macro that uses inline assembly code to read the byte stored at the address we calculated above (the byte whose value is 0b10000000) out of the AVR's Program Memory ('lpm' is the assembly instruction for Load Program Memory) (arduino-0015/hardware/tools/avr/avr/include/avr/pgmspace.h line 298):

#define __LPM_enhanced__(addr)  \
(__extension__({                \
    uint16_t __addr16 = (uint16_t)(addr); \
    uint8_t __result;           \
    __asm__                     \
    (                           \
        "lpm %0, Z" "\n\t"      \
        : "=r" (__result)       \
        : "z" (__addr16)        \
    );                          \
    __result;                   \
}))

Don't worry too much if you don't understand every single character of this macro. It's the deepest darkest part of the internals we're going to look at and as long as you followed the core of the Program Memory v. Data Memory story, you'll know what you need to follow along as we proceed.

Back to pinMode

We're finally at the point where we can pop back up the stack and continue plugging through the rest of pinMode! As I promised earlier, now that we've gotten a flavor for how this stuff works all the way down to the metal, I'm going to keep things at a higher, more comprehensible level from here on in.

So, in case your memory doesn't stretch back more than a couple thousand words, here was the full implementation of pinMode:

void pinMode(uint8_t pin, uint8_t mode)
{
  uint8_t bit = digitalPinToBitMask(pin);
  uint8_t port = digitalPinToPort(pin);
  volatile uint8_t *reg;
  
  if (port == NOT_A_PIN) return;
  
  // JWS: can I let the optimizer do this?
  reg = portModeRegister(port);
  
  if (mode == INPUT) *reg &= ~bit;
  else *reg |= bit;
}

We've seen how that first line works. "bit" is now set to be a bit mask that selects just the bit on any given register corresponding to Arduino pin 13. Now the next thing we've got to do is find the PORT register our pin lives on. Remember, to configure a pin for output, we need to set the correct bit in the correct Data Direction Register and there's a different DDR for each PORT, e.g. DDRA for PORTA and DDRB for PORTB, etc. Therefore, before we can figure out which DDR we need to work with to configure our pin for output we need to figure out which PORT it's on.

The second line of this function calls digitalPinToPort. This is a macro just like digitalPinToBitMask, but instead of looking up a bit mask, it looks up a byte identifying the correct PORT for the given pin. In our case this will be the value that stands for PORTB since that's where pin 13 lives.

The next line of the function declares a pointer called "reg" that we're going to aim at the DDR once we find it. But first, what if the pin number that got passed in was crazy? What if some nutso user tried to configure pin 97 for input? pinMode protects against that by checking the return value of the digitalPinToPort macro against the NOT_A_PIN constant (defined to be 0 in arduino-0015/hardware/cores/arduino/pins_arduino.h, if you're curious). If digitalPinToPort returned 0, meaning it failed to find a PORT corresponding to the given pin, then pinMode just returns without going any further. No DDRs are altered.

However, if the given pin matched a real PORT, then we go ahead and call another macro, portModeRegister, to look up its corresponding DDR. Again, since our pin is on PORTB, the result from this will be a pointer holding the address of DDRB. (As for the comment there from "JWS", I'm not actually sure how avr-gcc's optimizer could possibly know to pull the correct DDR address out for you; this is one piece of this function I don't fully understand.)

Now we've got all the data that we need in order to go ahead and actually set the right bit on our DDR. Obviously, we'll have to do something different depending on whether we're configuring our pin for input or output: we'll have to set the bit high if we're doing output and low if we're doing input. The last two lines of this function use C's bitwise operators to do just that.

The first case (if mode == INPUT) is when we're configuring the pin for input (remember, INPUT == 0 which, in the DDR, means input). Working from right to left, the first thing we do is take the inverse of the bit mask. So, in our example for pin 13 where we had 0b10000000, the inverse comes out to 0b01111111. Since reg was set to be the address of our DDR, dereferencing that pointer gives us the actual value of the DDR. The &= operator sets the DDR to be the result of a bitwise AND between the current value of the DDR and the inverse of our bit mask. In other words, it will leave the DDR the same for every bit set to one in our inverse bit mask and it will turn the DDR off at any bit set to zero. The result? The most significant bit, the one in DDRB corresponding to pin 13, is turned off and hence pin 13 is configured for input.

If mode was set to OUTPUT (i.e. 1), we do a different operation. We don't invert our bit mask, so it remains 0b10000000. But then, instead of &= we combine it with the DDR via |=, bitwise OR. As opposed to AND, this operation leaves the DDR alone for any bit where our mask is set to 0 and turns it on anywhere the mask is set to one. Since our mask is set to be 1 at only the bit representing pin 13, the corresponding bit in DDRB will be turned on and pin 13 will be configured for output.
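To see the whole flow in one place, here is a desktop simulation of pinMode's logic. Treat it as a sketch under loud assumptions: the board is a made-up one with four pins on two ports, the sim_ names and lookup tables are mine, and the DDRs are plain array entries rather than hardware registers. It also adds a range check before touching the tables, which the macros shown above don't do (they index the table directly).

```c
#include <assert.h>
#include <stdint.h>

enum { SIM_INPUT = 0, SIM_OUTPUT = 1, SIM_NOT_A_PIN = 0 };

/* Hypothetical two-port board: port 1 and port 2, each with a fake DDR. */
static volatile uint8_t sim_ddr[3];                        /* index 0 unused */
static const uint8_t sim_pin_to_port[] = { 1, 1, 2, 2 };   /* pin -> port id */
static const uint8_t sim_pin_to_mask[] = { 0x01, 0x02, 0x01, 0x80 };
#define SIM_PIN_COUNT (sizeof sim_pin_to_port / sizeof sim_pin_to_port[0])

void sim_pin_mode(uint8_t pin, uint8_t mode)
{
    /* Added for safety in this sketch: reject out-of-range pins first. */
    uint8_t port = pin < SIM_PIN_COUNT ? sim_pin_to_port[pin] : SIM_NOT_A_PIN;
    if (port == SIM_NOT_A_PIN) return;

    uint8_t bit = sim_pin_to_mask[pin];
    volatile uint8_t *reg = &sim_ddr[port];

    if (mode == SIM_INPUT) *reg &= (uint8_t)~bit;   /* clear just our bit */
    else *reg |= bit;                               /* set just our bit   */
}

void sim_pin_mode_example(void)
{
    sim_pin_mode(3, SIM_OUTPUT);
    assert(sim_ddr[2] == 0x80);
    sim_pin_mode(3, SIM_INPUT);
    assert(sim_ddr[2] == 0x00);
}
```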

Whew. We've made it all the way through pinMode! Now we only have two more Arduino library functions we need to understand before we'll have seen all of what makes hello world tick. Thankfully these other two functions reuse concepts we've already learned from our deep dive into pinMode so the rest of this will go rather rapidly, especially our next function: digitalWrite.

Arduino Software: digitalWrite()

One of the most commonly used functions in the Arduino library, digitalWrite turns a given digital pin on or off, i.e. it makes the Arduino send 5V or 0V out to the pin. Translating that into the register vocabulary we learned above: digitalWrite turns on or off the one bit on one PORT register corresponding to the desired pin. In the case of pin 13, we know that will be PORTB. Let's take a look at the implementation of digitalWrite (arduino-0015/hardware/cores/arduino/wiring_digital.c):

void digitalWrite(uint8_t pin, uint8_t val)
{
  uint8_t timer = digitalPinToTimer(pin);
  uint8_t bit = digitalPinToBitMask(pin);
  uint8_t port = digitalPinToPort(pin);
  volatile uint8_t *out;
  
  if (port == NOT_A_PIN) return;
  
  // If the pin that support PWM output, we need to turn it off
  // before doing a digital write.
  if (timer != NOT_ON_TIMER) turnOffPWM(timer);
  
  out = portOutputRegister(port);
  
  if (val == LOW) *out &= ~bit;
  else *out |= bit;
}

Most of this is exactly parallel to pinMode. We use digitalPinToBitMask again to find the bit corresponding to our target pin (remember that will return 0b10000000) and digitalPinToPort to find the correct port value (PORTB), again just like last time. This time, instead of using the port value to look up the DDR, we use it to get the address of the PORT register itself. And then, once we've got those things, we use the same bit operations to turn the right bit in the PORT register on or off, sending 5V or 0V out to our pin.

In addition to all of this parallel code, though, there's another strain running through this function that seems to have to do with something called timers. In addition to calling digitalPinToBitMask and digitalPinToPort, we also invoke another macro called digitalPinToTimer. And a few lines later we do some logic to figure out if we should call the function turnOffPWM with that timer as an argument. What's going on here?

As you know, many pins on the Arduino can be used for more than plain digital input and output. In addition to this digitalWrite function which can send only 5V or 0V, the Arduino library provides a parallel function, analogWrite, that can send a range of values on some of the pins. You've probably used that function to dim LEDs or control the speed of motors.

Now that we've seen how the AVR chip is actually connected to the output pins via registers, you might find this ability to output analog values somewhat surprising. After all, at a hardware level we only have the ability to turn each bit in each register on and off. We didn't mention any tiny dials on any of those registers that can be set to a range of values. How does the Arduino use this purely digital hardware to achieve analog output?

The answer is a trick you may have heard of before called Pulse Width Modulation. Basically, the scheme works like this: say you want to output an analog value of 50% using a pin that can only be on or off. In order to output that analog value, you'll have to use the only other variable axis available to you: time. If you turned your digital pin on and off very rapidly, keeping it on the same amount of time it was off over a given period then someone on the other end could sample the signal you were sending out, discover the percentage of on time you were demonstrating and interpret that as an analog value. If you then varied the ratio of on-time to off-time you could express different analog values, communicating any percentage you chose.
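The idea is easy to model in a few lines of ordinary C. This is a simplified software model of my own, not how the AVR's timer hardware is actually configured: time is chopped into periods of 256 ticks, and the pin is held high for the first "duty" ticks of each period. The function names are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Within one 256-tick period, the pin is high for the first `duty` ticks. */
int pwm_pin_state(uint8_t duty, uint8_t tick)
{
    return tick < duty;
}

/* Count how many of the 256 ticks in one period the pin spends high. */
unsigned ticks_high_per_period(uint8_t duty)
{
    unsigned high = 0;
    for (unsigned tick = 0; tick < 256; tick++)
        high += (unsigned)pwm_pin_state(duty, (uint8_t)tick);
    return high;
}

void pwm_example(void)
{
    assert(ticks_high_per_period(128) == 128); /* on half the time      */
    assert(ticks_high_per_period(64) == 64);   /* on a quarter of it    */
    assert(ticks_high_per_period(0) == 0);     /* always off            */
}
```

Whoever is watching the pin sees only 1s and 0s; the "analog" value lives entirely in the fraction of ticks spent high.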

But doesn't this scheme require the part plugged into our flickering signal to be smart, to sample the signal and interpret it as an analog value? How can this work with a dumb little LED in the dimmer example so commonly used to demonstrate analogWrite? Thankfully, our eyes are a great example of a device capable of converting a flickering digital signal into a smooth analog one. Using analogWrite to fade the intensity of an LED works exactly this way: it flickers the LED on and off so fast that we perceive it as a smooth fade. The same thing goes for using analogWrite to drive a speaker to produce different tones. (Going the other direction, analogRead does not rely on this trick; it uses a dedicated analog-to-digital converter built into the AVR.)

But what does all of this have to do with understanding these last few lines of digitalWrite? Well, the timer we saw that code referring to is exactly how the Arduino keeps track of time in order to accomplish Pulse Width Modulation for analogWrite.

The timer is a built-in feature of the AVR. At a basic level, you can think of it as a simple number that the AVR increases by one at a steady rate tied to its clock: once per clock cycle, or once every so many clock cycles if the timer has been configured with a "prescale factor". Depending on the chip's clock speed and that prescale factor, this incrementing can happen anywhere from a few thousand to millions of times per second (the ATMega168, the typical Arduino chip, is usually clocked at either 8MHz or 16MHz, i.e. 8 or 16 million ticks per second).
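As a rough mental model, here is a tiny simulated timer in plain C. The struct and function names are invented, and the real hardware does all of this without any code running; the sketch just shows an 8-bit count that wraps around and a divider in front of it (set prescale to 1 for a timer that counts every clock cycle).

```c
#include <assert.h>
#include <stdint.h>

struct sim_timer {
    uint8_t  count;          /* 8-bit counter: wraps from 255 back to 0 */
    unsigned prescale;       /* clock cycles per counter increment      */
    unsigned cycles;         /* clock cycles seen since last increment  */
    unsigned long overflows; /* how many times the counter has wrapped  */
};

/* Advance the simulated timer by one clock cycle. */
void sim_timer_clock(struct sim_timer *t)
{
    if (++t->cycles < t->prescale) return;
    t->cycles = 0;
    if (++t->count == 0) t->overflows++;   /* wrapped around */
}

void sim_timer_example(void)
{
    struct sim_timer t = { 0, 64, 0, 0 };
    for (unsigned long i = 0; i < 64UL * 256UL; i++)
        sim_timer_clock(&t);
    /* 64 cycles per tick times 256 ticks = exactly one wrap-around */
    assert(t.count == 0 && t.overflows == 1);
}
```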

If we were looking at the implementation of analogWrite (something I'd like to do in a future post), this timer would come into play significantly. However, since we're only looking at digitalWrite in this example, all we have to do is make sure that the timer doesn't get in our way. Since the Arduino library lets us use some of the AVR's pins for either digital or PWM output, there might be a timer set up to drive the digital pin we're trying to operate. Hence the two timer-related lines in this function, which find the associated timer and turn off its PWM output on that pin.

And that's digitalWrite. In plain English: find the correct bit and PORT register for the pin we're trying to turn on; make sure there's no timer attached to that pin; send a byte to the PORT register that leaves all the other bits the same and turns on or off the bit corresponding to our pin: 5V go out to an LED or solenoid or whatever and things blink or move in the real world.

Arduino Software: delay()

We've very nearly accomplished our goal. We know exactly how the Arduino software uses the AVR's built-in capabilities to do two of the three essential tasks for 'hello world': we've configured pins and we've turned them on and off. All that we've got left now is delay. Without delay, no matter how elegantly we turned the LED on and off, it would just appear solidly dark or lit, rather than blinking in such a satisfactory manner. Let's take a look at the implementation (arduino-0015/hardware/cores/arduino/wiring.c):

void delay(unsigned long ms)
{
  unsigned long start = millis();
	
  while (millis() - start <= ms)
    ;
}

This seems exceedingly straightforward. We call a function, millis, which returns the "current time" in milliseconds and then we loop repeatedly, checking the amount of time that has passed (the difference between the new "current time" and start, the time when we first checked it) until it is greater than the amount of time we were trying to delay. (It's interesting to note that this means delay only guarantees that it waits at least the given number of milliseconds, rather than a precise duration.)
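You can watch that loop behave on a desktop with a fake clock. In this sketch fake_millis and fake_delay are stand-ins I made up: the fake clock simply advances by one millisecond every time it's asked, and I use uint32_t because that matches the 32-bit unsigned long on the AVR. The second half of the example shows a nice property of doing the subtraction in unsigned arithmetic: the elapsed time still comes out right when the clock wraps around to zero.

```c
#include <assert.h>
#include <stdint.h>

/* Fake clock so the sketch runs on a desktop: each call advances 1 ms. */
static uint32_t fake_now_ms;

uint32_t fake_millis(void)
{
    return fake_now_ms++;
}

/* Same loop shape as delay(); returns the elapsed time seen on exit. */
uint32_t fake_delay(uint32_t ms)
{
    uint32_t start = fake_millis();
    uint32_t elapsed;
    while ((elapsed = fake_millis() - start) <= ms)
        ;
    return elapsed;
}

void fake_delay_example(void)
{
    fake_now_ms = 0;
    assert(fake_delay(5) == 6);    /* loop exits once elapsed exceeds 5 */
    fake_now_ms = 0xFFFFFFFEu;     /* clock is about to wrap around     */
    assert(fake_delay(5) == 6);    /* unsigned subtraction still works  */
}
```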

In light of our previous discussion about timers, though, we can see that something subtle must be happening inside of that millis function. It is somehow managing to convert the movement of the AVR's clock (those 8 or 16 million ticks per second) into milliseconds of real world time. In order to explain how millis accomplishes this trick, we have to introduce our last new concept of this post: interrupts.

Interrupts

Every example of functionality we've examined so far has found the Arduino actively taking action: twiddling register bits to configure pins, sending around electrical signals to light up pins, etc. But sometimes instead we want the Arduino to respond dynamically to sudden changes in its environment. To accomplish this, calling normal functions in our regular code won't be sufficient. We need a mechanism that lets us declare what kind of changes we want to respond to and then indicate what should happen when those changes come. Thankfully, the AVR has just such a mechanism: interrupts.

Imagine interrupts thusly: some kind of sudden event takes place, for example a new serial message arrives; the AVR sends out a signal announcing the event; if any part of our code cares to respond to that particular kind of event, it declares itself as a "handler" for that signal; if there's a matching handler for the given interrupt signal, the AVR stops whatever it was doing, temporarily putting aside whatever work was already underway, and the chunk of code declared to handle the present signal gets run; finally, after our code is done handling the interrupt, the AVR picks back up on the original work that had been interrupted and continues on where it left off.

The whole process is very similar to what might happen if I threw a football at you while you were in the midst of tying your shoes. Your dangerous-flying-object handler would trigger; you'd drop your laces on the spot and either catch or deflect the incoming football; then, once the danger was past, you'd pick your lacing back up where you'd left it with the possible side effect of being marginally angrier or more paranoid.

On the AVR, the interrupts are actual hardware components that can be triggered by various different events including the completion of an analog-to-digital conversion, the availability of a signal on the SPI or UART serial communication lines, or one of the timers reaching various given values.

Aha! This last interrupt trigger points us strongly in the direction for how interrupts will be relevant for the delay method we're in the process of examining. A timer-based interrupt is essential for the process of converting clock ticks into real world milliseconds. Here's why.

The AVR is an 8-bit architecture, and Timer 0, the timer the Arduino library uses for keeping time, is only an 8-bit counter, so it can hold values from 0 to 255. As we mentioned above, a typical Arduino normally runs at a clock rate of 16MHz, and the Arduino library configures this timer to tick once every 64 clock cycles. At that rate, the timer steps through all 256 of its values in 64 × 256 / 16,000,000 = 0.001024 seconds before wrapping back around to zero. In other words, we run out of space for counting ticks after about 1 millisecond. Obviously, this is woefully inadequate for even the brief delay of 500ms we need in this sketch let alone the much longer ones we might want for other applications.

So, what's the solution? An interrupt, by jove! Whenever the timer reaches its maximum value and wraps around, the AVR triggers an interrupt and we get a chance to bring another variable into play. That second variable, rather than simply counting ticks, instead counts how often our timer has overflowed. In other words, every increment of that variable represents roughly 1ms of elapsed time. And if we use an unsigned long for it (four bytes, for a range of 0 to 4,294,967,295), we'll suddenly have bootstrapped ourselves up to a whopping 50 days or so that we can count off before overflowing. A dramatic improvement. And it's easy to imagine how we could repeat the whole process again to deal with even longer durations.
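The arithmetic behind those numbers is worth checking for yourself. This sketch assumes a prescale factor of 64 and a 256-value counter (the figures quoted in the Arduino source comment for Timer 0); the function names are mine, and the asserts record the results.

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles between overflows: prescale factor times counter size. */
uint32_t cycles_per_overflow(uint32_t prescale, uint32_t counter_size)
{
    return prescale * counter_size;
}

/* Microseconds between overflows for a clock speed given in MHz. */
uint32_t overflow_period_us(uint32_t clock_mhz, uint32_t prescale,
                            uint32_t counter_size)
{
    assert(clock_mhz > 0);
    return cycles_per_overflow(prescale, counter_size) / clock_mhz;
}

void overflow_period_example(void)
{
    assert(cycles_per_overflow(64, 256) == 16384);
    assert(overflow_period_us(16, 64, 256) == 1024); /* 1.024 ms at 16 MHz */
    assert(overflow_period_us(8, 64, 256) == 2048);  /* 2.048 ms at 8 MHz  */
    /* A 32-bit count of milliseconds lasts 49 whole days (about 49.7). */
    assert(UINT32_MAX / 86400000u == 49);
}
```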

In fact, the Arduino code for handling the interrupt generated by timer overflows does exactly that (arduino-0015/hardware/cores/arduino/wiring.c):

volatile unsigned long timer0_overflow_count = 0;
volatile unsigned long timer0_clock_cycles = 0;
volatile unsigned long timer0_millis = 0;

SIGNAL(TIMER0_OVF_vect)
{
  timer0_overflow_count++;
  // timer 0 prescale factor is 64 and the timer overflows at 256
  timer0_clock_cycles += 64UL * 256UL;
  while (timer0_clock_cycles > clockCyclesPerMicrosecond() * 1000UL) {
    timer0_clock_cycles -= clockCyclesPerMicrosecond() * 1000UL;
    timer0_millis++;
  }
}

This code is, more or less, an implementation of exactly the scheme we just described. SIGNAL() is avr-libc's macro for defining interrupt handlers. The argument that gets passed to it, "TIMER0_OVF_vect", stands for Timer 0 Overflow Vector, which is just a very precise way of indicating the exact interrupt that fires when Timer 0 overflows (there are several timers on the AVR that get used for different purposes).

So, this handler will trigger whenever Timer 0 overflows, just as I described above. And what does it do at that point? First, it increments timer0_overflow_count, which is an unsigned long keeping track of how often this timer has overflowed, just like we expected. Then, it does some other complicated-looking stuff with timer0_clock_cycles, clockCyclesPerMicrosecond(), and some numbers ending in "UL" (a suffix that simply tells the compiler to treat the number as an unsigned long). Basically, what's happening here is scaling. Each overflow represents 64 * 256 = 16,384 clock cycles, which at 16MHz is 1.024ms rather than a nice round millisecond, and the relationship between clock cycles and real world time depends on the clock speed of the particular board (remember, the ATMega168 that I'm treating as typical is actually in the middle of the family in terms of capacity; supported chips range from the ATMega8 all the way up to the ATMega1280 on the new Arduino Megas, and an unsigned long is 32 bits on all of them). So the handler adds those 16,384 cycles to timer0_clock_cycles and, for every full millisecond's worth of cycles it finds there (clockCyclesPerMicrosecond() * 1000), subtracts that amount and increments timer0_millis at the heart of the while loop. The leftover cycles stay in timer0_clock_cycles and carry over to the next overflow, so the extra 0.024ms per overflow eventually adds up to an extra millisecond instead of being lost.
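To watch that carry-over happen, here's a rough desktop simulation of the handler's bookkeeping. It is only a sketch: simulate_overflows and the two constants are hypothetical names of mine, and I've hard-coded a 16MHz clock (16,000 cycles per millisecond) where the real code asks clockCyclesPerMicrosecond().

```c
#include <stdint.h>

#define CYCLES_PER_MS 16000UL               /* assumes a 16MHz clock */
#define CYCLES_PER_OVERFLOW (64UL * 256UL)  /* prescaler * timer size */

/* Desktop stand-in for the handler: runs `overflows` simulated Timer 0
   overflows and returns the resulting millisecond count. */
uint32_t simulate_overflows(uint32_t overflows)
{
    uint32_t clock_cycles = 0;
    uint32_t millis_count = 0;
    uint32_t i;

    for (i = 0; i < overflows; i++) {
        clock_cycles += CYCLES_PER_OVERFLOW;      /* 16,384 cycles per overflow */
        while (clock_cycles > CYCLES_PER_MS) {    /* same test as the handler */
            clock_cycles -= CYCLES_PER_MS;        /* keep the leftover cycles */
            millis_count++;
        }
    }
    return millis_count;
}
```

After 41 simulated overflows the count is 41, but after 42 it jumps to 43: the 384 leftover cycles from each overflow have piled up until they're worth a whole extra millisecond.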

To summarize, this interrupt handler gets triggered every time Timer 0's 8-bit counter overflows and then it does the proper math to keep a series of unsigned longs that represent real clock time properly updated. Now that we've got that down, let's jump back into the implementation of millis to see how it pulls this back out to provide the easy-to-use value that made the implementation of delay that we saw above so clear (arduino-0015/hardware/cores/arduino/wiring.c):

unsigned long millis()
{
  unsigned long m;
  uint8_t oldSREG = SREG;
  
  // disable interrupts while we read timer0_millis or we might get an
  // inconsistent value (e.g. in the middle of the timer0_millis++)
  cli();
  m = timer0_millis;
  SREG = oldSREG;
  
  return m;
}

First off, millis declares an unsigned long called "m" that it will use to hold the result. Next, something funky happens. We store the value of something called "SREG" into a variable called "oldSREG". SREG stands for Status Register. The Status Register is a special register the AVR uses for holding the results of various different operations on the chip.

Remember that an interrupt breaks right into the flow of the executing program. Any code that deals with interrupts has a special responsibility to put things back just how it found them after it finishes its work, otherwise pre-existing parts of the program that depended on those values would be totally hosed, having had the rug pulled out from under them in mid-operation. In order to fulfill this responsibility, millis saves the contents of the Status Register before it starts doing anything funny. That way, upon completion, it can put things back just how it found them and the rest of the program will resume smoothly, none the wiser.

Having saved out the Status Register, millis goes ahead and calls cli(), "clear interrupts", a function that suspends the handling of interrupts. Since millis is going to read timer0_millis, a value we've just seen being updated in an interrupt handler, it needs to make sure that it has exclusive access to that variable, that it doesn't get grabbed by the interrupt handler itself right in the middle of things. This matters because the AVR is an 8-bit chip: reading the four bytes of timer0_millis takes several instructions, so an interrupt landing partway through could leave us holding some bytes of the old value and some of the new one. If we don't disable them, interrupt handlers can fire at any time, even while we're right in the midst of working with values that they modify!
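Here's a small desktop C sketch of what such an inconsistent read could look like. Everything in it is hypothetical and for illustration only: the counter is stored as four separate bytes to mimic how an 8-bit CPU handles a 32-bit value, I'm assuming the low byte gets read first, and the "interrupt" is just a function call wedged into the middle of the read.

```c
#include <stdint.h>

/* Hypothetical stand-in for timer0_millis, kept as four separate bytes
   (least significant first) because an 8-bit CPU moves one byte at a time. */
static uint8_t counter_bytes[4];

/* Store a 32-bit value into the four bytes. */
void counter_store(uint32_t value)
{
    int i;
    for (i = 0; i < 4; i++) {
        counter_bytes[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Reassemble the four bytes into a 32-bit value. */
uint32_t counter_load(void)
{
    uint32_t value = 0;
    int i;
    for (i = 0; i < 4; i++) {
        value |= (uint32_t)counter_bytes[i] << (8 * i);
    }
    return value;
}

/* Read the counter one byte at a time, starting from `start`. If
   interrupt_mid_read is nonzero, a pretend interrupt handler increments
   the counter after the first byte has already been copied. */
uint32_t read_counter(uint32_t start, int interrupt_mid_read)
{
    uint32_t result;

    counter_store(start);
    result = counter_bytes[0];                 /* low byte read first */
    if (interrupt_mid_read) {
        counter_store(counter_load() + 1);     /* the "interrupt" fires here */
    }
    result |= (uint32_t)counter_bytes[1] << 8;
    result |= (uint32_t)counter_bytes[2] << 16;
    result |= (uint32_t)counter_bytes[3] << 24;
    return result;
}
```

Starting from 255, an uninterrupted read returns 255. With the pretend interrupt in the middle, the read returns 511: the low byte of the old value (0xFF) glued onto the upper bytes of the new one (0x0100), a number the counter never actually held.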

Now, we've read out timer0_millis which has the "current time" so all we've got to do is restore the Status Register to its value from before we started monkeying with the interrupts and return the value we read out. But wait! We disabled the interrupts, but never re-enabled them. How come this code doesn't break all subsequent Arduino functionality that depends on them? The answer is that we did actually re-enable the interrupts when we restored the Status Register. One of the bits in the Status Register is the interrupt enable, which is set to allow interrupts to take place. Since that bit was set when we first arrived in this function, restoring the SREG to the value at which we originally found it has the very important side effect of re-enabling the interrupts and leaving everything hunky-dory.
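The save-and-restore trick can be tried out on a desktop with a pretend Status Register. This is a sketch under stated assumptions, not the real thing: sim_sreg, sim_cli, and the other names are hypothetical stand-ins of mine for SREG and cli(), and the only fact borrowed from the hardware is that the global interrupt enable flag is bit 7 of SREG.

```c
#include <stdint.h>

#define SIM_I_BIT 0x80   /* global interrupt enable is bit 7 of SREG on the AVR */

static uint8_t sim_sreg;              /* hypothetical stand-in for SREG */
static uint32_t sim_millis = 1234;    /* hypothetical stand-in for timer0_millis */

/* Pretend cli(): clear the interrupt enable bit, leave the rest alone. */
void sim_cli(void)
{
    sim_sreg &= (uint8_t)~SIM_I_BIT;
}

/* Same shape as millis(): save, disable, read, restore. */
uint32_t sim_read_millis(void)
{
    uint8_t old_sreg = sim_sreg;
    uint32_t m;

    sim_cli();
    m = sim_millis;
    sim_sreg = old_sreg;   /* puts the interrupt bit back as it was found */
    return m;
}

/* Returns 1 if interrupts are enabled after the read, 0 if not, for a
   given state on entry. */
int interrupts_enabled_after_read(int enabled_on_entry)
{
    sim_sreg = enabled_on_entry ? SIM_I_BIT : 0;
    (void)sim_read_millis();
    return (sim_sreg & SIM_I_BIT) != 0;
}
```

If interrupts were on when we arrived, they're on again afterwards; if the caller had already turned them off, they stay off. That second case is why restoring the saved value is safer than unconditionally switching interrupts back on at the end.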

Summary and Conclusion

We're done! We've now seen how every single step of the physical computing hello world on the Arduino actually works. Let's summarize it really quickly to make sure we've got everything straight. First, we configure pin 13 for output by setting a bit in Data Direction Register B. Then we turn pin 13 on by setting its bit in PORTB. Then we delay 500 milliseconds by marking the current value of timer0, converted into milliseconds, and letting the counter be incremented repeatedly as the clock ticks and overflowed into ever larger holders until the new value of timer0 represents an elapsed time of 500ms. Next we set the bit for pin 13 in PORTB low to turn the LED off. We do the delay dance again and we're home.

All in all, the process isn't that complicated, especially once you've gotten the hang of the basic architecture of the AVR and how to navigate the Arduino and avr-gcc source code. Granted, it took me about a year to grok both of those things, but that's why I've made it a point to set things down here for other people who don't want to spend quite the full 365.

I'm planning on making this post the first of a series. Next up, I'll tackle the analog functionality (analog-to-digital conversion and PWM output) by diving into analogRead and analogWrite. After that I might try some of the more advanced topics like serial communication and maybe the bootloader. If you've got any suggestions or requests for topics, I'd love to hear them.

And finally, if you made it this far, definitely drop me a line to let me know if all of this made sense to you and if you found any mistakes or bugs in my write-up.

This entry was posted in Opinion.

33 Responses to A Tour of the Arduino Internals: How does Hello World actually work?

Jonathan Oxer says:

May 18, 2009 at 10:04 pm

Great explanation! It’s long, but it has to be given the level of detail you’re covering and your conversational writing style makes it very easy to read. Please do more posts like this: I get the feeling it’s going to become a frequently referenced resource.

Andy Gelme (geekscape) says:

May 18, 2009 at 10:22 pm

Good to see such a detailed description.
Looking forward to more !
Some corrections …
In the section “Interrupts”, you say … “unsigned long” which stretches over two bytes for a total of 16-bits, has a range of 0 to 65,536.
However, an unsigned long is 32-bits (four bytes).
http://www.arduino.cc/en/Reference/UnsignedLong
To determine the interrupt timing, you’ll need to take into account the prescaler value of 64 and the timer overflows at 256 … and you’ll find that the timer overflow interrupt occurs every 16,000,000 / 64 / 256 = 976.5625 times per second (or once every 1.024 milliseconds).
So, “timer0_overflow_count” increments every 1.024 milliseconds (not 5ms). Given that “timer0_overflow_count” is a 32-bit number, it can handle up-to approximately 48.54 days (not 262.144 seconds).
The rest of “SIGNAL(TIMER0_OVF_vect)” deals with turning 1.024 millisecond “overflow counts” into more useful 1.000 millisecond increments stored in “timer0_millis”.

Greg Borenstein says:

May 19, 2009 at 10:49 am

Thanks for the correction, Andy! I’ve fixed it in the post.

Anon. says:

May 19, 2009 at 6:23 pm

“This byte is called a “bit mask” because when you combine it with another byte using the logical OR operator, it turns off all the bits in the other byte except the one slot where your mask had a 1.”
I think you mean logical AND? OR would turn on all the bits in the other byte + the bit in your bitmask.

Josiah says:

May 19, 2009 at 8:10 pm

A printable version of this would be great. When I go to print this in Firefox on Mac, it cuts off the majority of it. Maybe just a print stylesheet would do the trick or something? (I’ll just cut & paste for now.) Thanks for writing this up.

Aviv Ben-Yosef says:

May 20, 2009 at 3:35 am

Really a great read!
Just a small note, I felt some parts were too detailed (I don’t know if there are people that program Arduinos and don’t know a byte is 8 bits long etc.)

TonyD says:

May 20, 2009 at 4:41 am

Thanks for the really detailed article. I agree with what Josiah says: a printable version would be really useful for offline reference.

Greg Borenstein says:

May 20, 2009 at 8:54 am

@anon: Actually, the bit mask is designed to leave all but one of the bits in the target register unaltered when it’s combined with that register with OR. For bits set to zero, OR results in the value in the bit from the target register; for bits set to high, it sets them high. It’s a way to turn on a selected bit without messing with the other ones. I’ve clarified that in the post.
@josiah and @tonyd: A printed version is an interesting thought. Maybe once I’ve written a couple more posts in the series, I’ll throw the whole thing up on Lulu for people to download and/or buy. Thanks for the suggestion!
@Avi: It’s meant to be that way! I know that for more advanced readers some pieces (like the explanation of bytes vs. bits) might be remedial, but I guarantee that there are people who use the Arduino who don’t know that. One of the great things about the Arduino is that it’s accessible to artists, designers, and other people without any programming background whatsoever. These types of people could get a lot done without knowing how many bits are in a byte: LED blinking, reading buttons, serial communication, etc. None of these things require doing any bit math. I think part of what makes it hard for non-technical people to follow their curiosity to learn more about how things like Arduino actually work is that technically-inclined people tend to take low level stuff like that for granted and not explain it. This was exactly the problem I was trying to avoid in this post.

Josiah says:

May 21, 2009 at 6:04 pm

I just mean something like a movable type plugin that would allow me to print this page out without the comment form and other things that aren’t relevant when the same information is on paper, but a book format may also be handy.

finsprings says:

May 22, 2009 at 9:57 am

Great article!
One minor point: the digitalRead() explanation refers to “(if val == LOW)” which is part of digitalWrite(); I think it means to refer to “if (mode == INPUT)” instead.

finsprings says:

May 22, 2009 at 9:04 pm

Er, pinMode() I mean, not digitalRead() 🙂

Jarav says:

May 23, 2009 at 4:13 am

I have recently got myself an arduino. I have no knowledge of electronics or assembly language and this article has been a great learning experience.
I have a small confusion. In your explanation of the pinMode function( for pin 13), the variable ‘bit’ is the byte that results from pgm_read_byte(0b10000000). That is, according to your explanation, the byte at the location 0b10000000. And yet you seem to indicate( when you explain setting the DDR to high or low ) that ‘bit’ is 0b10000000 itself. In fact since all you need is a bit mask why call pgm_read_byte ? Why not stop with the _BV macro which itself returns a bitmask?

Michael Schieben says:

October 22, 2009 at 4:05 am

I like how you use metaphors. Thank you so much for this very well written description.

David Durman says:

November 9, 2009 at 1:14 pm

This article is exactly what I was looking for. Clear explanation. Hope to see another post about this topic. Thanks!

Brian says:

November 15, 2009 at 11:01 am

What a great post! Really informative. I haven’t found much on the web that takes this format of taking something simple, and drilling down all the way to the core, and then unwinding to completion. Awesome.
My only disappointment is that there aren’t more posts on the Arduino. After seeing the teaser about getting into serial and other stuff I searched high and low on the site, but I couldn’t find any more stuff. I *know* the audience is there, so please! More posts! Thanks a lot!

Greg Borenstein says:

November 15, 2009 at 5:27 pm

Brian — Thanks for your compliments about this post. As far as the series goes, I’m still working on it. Learning enough to write a comprehensive post like this takes time! I spent more than a year figuring this stuff out including auditing a semester long 400 level CS class. I’m hoping to have the next post (on analog operations and/or timers and interrupts) done before the end of the year. I appreciate your patience and hope you tune in to read that one when it’s ready.

Brian says:

November 15, 2009 at 8:29 pm

Great to hear that it’s still in the works! I’ll definitely check your site often. I’ve been enjoying some of your other stuff as well, so thanks again for the great content!

Lex Talionis says:

June 15, 2010 at 12:54 am

Amazing post! I just spent about 2hrs trying to find out what _BV() actually does. When I stumbled on this post I was just stunned by the depth and breadth of knowledge.

jonah says:

August 17, 2010 at 1:55 am

Thank you, thank you, thank you so very much.
This is an exceptionally clear and thorough article. I think many of us also appreciate your comment above concerning the need for accessibility in technical tutorials such as these (I myself am an architecture student with no CS background). Much respect.

Gandhi says:

August 1, 2011 at 8:24 am

Wow.,
Really nice article.,

Jonas says:

September 2, 2011 at 9:14 pm

none of the code listings (except the very first one) are visible (firefox 6, chrome 13).
really nice article!, would be nice if you made it working again.

mitch says:

February 2, 2012 at 5:24 am

Just working my way through this very cool article. As the previous poster mentioned, all the code snippets after the first one seem to have disappeared. Would be great if they were inlined again, but for now, you can find them here:

http://code.google.com/p/arduino/source/browse/trunk/hardware/arduino/cores/arduino/?r=1088

- mitch says:
  
  February 2, 2012 at 5:27 am
  
  Wow – crazy. I hit “post comment”, and all the code snippets magically returned. Either the author simultaneously updated the article, or there’s something funky in the blog code. Anyway, posting seems to have fixed it for me.
  
greg says:

February 2, 2012 at 5:47 am

I fixed the code snippets earlier this evening. Sorry it was broken so long everyone! It was hard to fix!

Lutieri says:

June 2, 2012 at 10:00 pm

very nice article. neat and clear explanations. looking forward to more!

Martijn says:

June 29, 2012 at 6:52 am

Code snippets seem to have disappeared again. So sad.
But what a great article! I’m keeping the link on my site for reference.
The only thing that still mystifies me, however, is the mechanism by which the right bit in a PORT register switches on a voltage. In other words: how is this “software-controlled relay” implemented? I miss this last step in this “all the way to the metal” story.

Wengu Zhoudong says:

June 30, 2012 at 10:03 am

Thank you for the interesting write-up. I am wondering if you could provide the explanation for VB. As you probably know, VB is used much more than the ATMEL software, and I think it could be possible to make a VB example instead of the ATMEL one.

Thank you!

MurphySquint says:

December 29, 2012 at 1:20 am

Your knowledge level is just what I have been looking for. I have posted the following on several sites but so far no luck getting an explanation to my question.
How do I use an Arduino to switch another IC’s pin (reset) to ground?
I am trying to turn an IC’s reset pin to ground with an Arduino. It is a 5 LED array that has an IC controlling several flashing modes. I need to reset the IC to change modes and I want to use an Arduino Nano. I have seen an explanation setting a digital pin to input and then making it low and then setting the pin to output but I don’t understand the process. Supposedly it simulates a momentary push button switch using an Arduino.
Here’s the code that I have been given but I totally don’t understand how it works.
pinMode(X, INPUT);
….
digitalWrite(X, LOW);
pinMode(X, OUTPUT);
delay(100);
pinMode(X, INPUT);
I guess I need a very thorough explanation for someone from the slide rule era.

Thank you in advance.

fabelizer says:

April 13, 2013 at 12:44 am

Very nicely done! It took a while to find it, but a great article!
@MurphySquint: X is a place holder for the Arduino pin number you will connect to your reset pin.
digitalWrite(X, LOW); tells the Arduino to send a low to the pin, putting your IC into reset as soon as…
pinMode(X, OUTPUT); makes the pin an output.
delay(100); tells the Arduino to stay like that for 100 milliseconds.
pinMode(X, INPUT); tells the Arduino to now make that pin high impedance so it is neither high nor low.
For this to work, you need a pull-up resistor from your reset pin and Arduino pin to +5V. A resistor in the range of 4.7K to 10K ohms would work well. The code assumes your IC reset pin is active low. In other words, the IC works normally when the reset pin is high (5V), and the IC is in reset when the reset pin is low (0V).
-fab

Ake says:

May 14, 2013 at 9:27 am

Thank you!

There are not many people in this world who understand this, and even fewer of those who understand it share it with other people.

Thank you.

Jeff says:

June 17, 2013 at 3:42 am

hmm, code snippets seem to be missing…

micah epps says:

September 11, 2013 at 7:59 pm

I don’t usually bother leaving a comment while trolling for info on the net, but WOW. I do like this level of detail. Thanks for sharing !!

Bryan Taylor says:

September 21, 2013 at 5:08 pm

Hey Greg. Great post. I’m often left wondering when (reading through someone else’s arduino sketch) what these little secretive code snippets with bits, bytes and masks mean. You’ve gone a long way to explaining how the pieces of the puzzle fit in this post. Your writing style is also easy to read (though I sometimes find myself wishing for a picture or a diagram…).

If you ever get the time (I know that the prescaler of life takes a lot of the clock cycles away…) I’d really love to see posts that explain more on how to follow higher level functions and macros back to their roots in the header files, and also how to look at a pinout diagram and begin to use it with bits and bytes. There’s a lot more I’d like to know now that I’m thinking about it, but I’m sure I’ve already hit my quota.

Thanks again. Being a nuts and bolts guy, I like the inner details of how things work.
