Jump to content

Duff's Device




What every embedded programmer should know about … Duff's Device

"A trick used three times becomes a standard technique"
(George Polya)

C is a peculiar language and knowing the internal details of how it works can be very powerful. At the same time, when we say things like that of course we first think of the Story of Mel which we wrote about before, so be careful not to write unmaintainable code!

Probably the most famous "trick" in C is something called  Duff's Device. It was discovered in 1983 by Duff while working at Lucasfilm (since they were working on Indiana Jones and the Temple of Doom around that time I am just going to believe that the booby traps were the inspiration for this one).

The purpose of this "trick" is to unroll a loop in C, to get the performance increase of having the loop unrolled even if the number of times the loop runs is not always the same.  We all know unrolling a loop which runs a fixed number of times is trivially easy, but doing it for an arbitrary number of cycles is quite a challenge.

It relies on two facts about C. Firstly that case labels fall through if there is no break, and secondly that C allows safely jumping into the middle of a loop. Duff originally had this problem when he had to simply copy integers. This is very similar to the problem we often have in embedded programs that need to copy data, or do anything mundane repeatedly in a loop.

Duff's problem was sending multiple src bytes to the same destination, similar to sending a string on a UART where to points to the send register location (of course his machine did not require waiting between bytes! 

send (uint8_t* to, uint8_t* from, uint16_t count)
    do {                          /* count > 0 assumed */
        *to = *from++;
    } while(--count > 0);

The problem with this code is that the post-increment together with the assignment is a fairly cheap operation, but the decrement of the count, comparison to zero together with the jump is more than twice that expense, and doing this all once for every byte means we are spending a lot more time on the loop than copying the data!

Duff realized that he could change the time spent by any ratio he desired which would allow him to make a tradeoff at will between code space and speed. Of course the mileage you get will vary based on what you are doing in the loop and the capability of the processor to do the +7 and /8 math, but if you keep to powers of 2 and the divide becomes a simple shift, or even a swap and a mask like you have for 16, this can yield impressive results. 

The solution looks like this:

send (uint8_t* to, uint8_t* from, uint16_t count)
    register uint16_t n = (count + 7) / 8;
    switch (count % 8) {
       case 0: do { *to = *from++;
       case 7:      *to = *from++;
       case 6:      *to = *from++;
       case 5:      *to = *from++;
       case 4:      *to = *from++;
       case 3:      *to = *from++;
       case 2:      *to = *from++;
       case 1:      *to = *from++;
        } while (--n > 0);

Just to test that it works, let's go through 2 examples.

If we pass in a count of 1 the following happens

  1. the variable n will become (1+7)/8 = 1
  2. the switch will jumpt to (1%8) = "case 1"
  3. n-- becomes 0  and the while loop terminates immediately and we are done

If we pass in a count of 12 the following happens:

  1. the variable n will become (12+7)/8 = 2
  2. the switch will jumpt to (12%8) = "case 4"
  3. The unrolled loop will execute and fall through 4 case statements
  4. n-- becomes 1  and the while loop jumps to "case 0"
  5. The loop executes 8 more times
  6. n-- becomes 0  and the while loop terminates and we are done.

As you can see the loop was executed 4 + 8  = 12 times but the loop variable (n) was only decremented twice and the jump happend 2 times instead of 12 for around 80% saving in loop costs.

This concept is easily expanded to something like memcpy which increments both the from and to pointers.

As always there is a link at the end with a zipped MPLAB-X project you can use with the simulator to play with Duff's Device!


Wikipedia page on Duff's Device 
Duff's device entry in The Jargon File 

Source Code





Recommended Comments

There are no comments to display.

Add a comment...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Create New...