Beginners always have a hard time understanding when to use a header file, what goes in the .c file and what goes in the .h file. I think the root cause of the confusion is really a lack of information about how the compilation process of a C program actually works. C education tends to focus on the semantics of loops and pointers far too quickly and completely skips over linkage and how C code is compiled, so today I want to shine a light on this a bit.

C code is compiled as a number of "translation units" (I like to think of them as modules) which are in the end linked together to form a complete program. In casual usage a "translation unit" is often referred to as a "compilation unit". The process of compilation is quite nicely described in the XC8 User's Guide in section 4.3, so we will look at some of the diagrams from that section shortly.

Compilation Steps

Before we get into more detail I want to step back slightly and quote the C99 standard on the basic idea here. Section 5.1.1.1 of the C99 standard describes this process.

A C (or C++ for that matter) compiler will process your code as shown to the right (pictures from the XC8 User's Guide). The C files are compiled completely independently of each other into a set of object files. After this step is completed, the object files are linked together, like a set of Lego blocks, into the final program during the second stage. It is possible for two entirely different programs to share some of the same object files this way. Re-usable object files that perform common functions are often bundled together into an archive of object files, referred to as a "library" by the standard, which is often "zipped" together into a single file called .lib or .a.

This sequence is pretty much standard for all C compilers. Looking at XC8 in particular there is a little more detail that applies only to this compiler. The PIC16 also poses some challenges for compilers, as the architecture has banked memory.
This means that moving code around from one location to another may not only change the addresses of objects (which is quite standard) but may also require some extra instructions (bank switch instructions) to be added, depending on where the code is ultimately placed in memory. We will not get any deeper into the details here, but I want to point out the most important aspects.

Some useful tips:

Most compilers will have an option to emit and keep the output from the pre-processor so that you can look at it. When your #define's and macros are getting the better of you, this is an excellent debugging tool. With the latest version of XC8 the option to keep these files is "-save-temps", which can be passed to the linker as an additional argument. (They will end up in a folder called "build", and the .pre files in the diagram may have the extension ".i" depending on the processor you are on.)

During the linking step all the objects (translation units) which will be combined to create the final program are linked together to produce an executable. This process decides where each variable and function will be placed, and all symbolic references are replaced with actual addresses. This process is sometimes referred to as allocation, re-allocation or fix-up. At this step it is possible to supply most linkers with a linker file or "linker script" which guides the linker about which memory locations you want it to use.

Although the C standard does not specify the format of the object files, some common formats do exist. Most compilers used to use the COFF format (Common Object File Format), which typically produces files with the extension .o or .obj. Another popular format favored by many compilers today is ELF (Executable and Linkable Format).

The most important thing to take away from all that is that your C file and the H files it includes will be combined into a single translation unit which will be processed by the compiler.
The compiler literally just pastes your include file into the translation unit at the place you include it. There is no magic here.

So why do I need an H file at all then? As noted in the standard, different translation units communicate with each other by calling functions with external linkage or by manipulating objects with external linkage. When two translation units communicate in this way it is very important that they both have the exact same definition for the objects they are exchanging. If we just had the definitions in C files without any headers, then the definitions of everything that is shared would have to be very carefully re-typed in each file, and all of these copies would later have to be maintained. It really is that simple: the only reason we have header files is to save us from maintaining shared code in multiple places.

As you should have noticed by now, the descriptions of translation units sound very similar to "libraries" or "modules" of your program, and this is precisely what they are. Each translation unit is an independent module which may or may not depend on one or more other modules, and you can use these concepts to split your programs into more manageable and re-usable modules. This is the divide and conquer strategy. In this scheme the sole purpose of header files is to be used by more than one translation unit. They represent a definition of the interface that two modules in your program can use to communicate with each other, and they save you from typing it all multiple times.

Let's look at a simple example of how a header file named module.h may be processed when it is included into your C code.

```c
// This is module.h - the header file
// This file defines the interface specification for this module.
// It contains all declarations of functions and/or variables with external linkage.
// The purpose of this file is to provide other translation units with the names of the
// objects that this translation unit provides, so that they can be used to communicate
// with this translation unit.

// This declaration promises the compiler that somewhere there will be a function
// called "myFunction" which the linker will be able to resolve
void myFunction(void);

// This declaration promises the compiler that somewhere there will be a variable
// called "i" which the linker will be able to resolve
extern int i;
```

And its corresponding C source file, module.c:

```c
// This is module.c - the C file for my module
// This file contains the implementation of my module.
// It is typical for a module to include its own interface; this makes it easier to
// implement by ensuring the interface and implementation are identical
#include "module.h"

// Defining a variable like this will allocate storage for it.
// In C, variables with file scope have external linkage by default
// (this is NOT true for C++, where this would have internal linkage)
int i = 42;

// Functions have external linkage by default; it is not necessary to say
// "extern void myFunction" as the "extern" is implied
void myFunction(void)
{
    // some code here
}
```

As described above, the pre-processor will convert this into a file for the compiler to process which looks like this:

```c
// This is module.c - the C file for my module
// This file contains the implementation of my module.
// It is typical for a module to include its own interface; this makes it easier to
// implement by ensuring the interface and implementation are identical

// This is module.h - the header file
// This file defines the interface specification for this module.
// It contains all declarations of functions and/or variables with external linkage.
// The purpose of this file is to provide other translation units with the names of the
// objects that this translation unit provides, so that they can be used to communicate
// with this translation unit.

// This declaration promises the compiler that somewhere there will be a function
// called "myFunction" which the linker will be able to resolve
void myFunction(void);

// This declaration promises the compiler that somewhere there will be a variable
// called "i" which the linker will be able to resolve
extern int i;

// Defining a variable like this will allocate storage for it.
// In C, variables with file scope have external linkage by default
// (this is NOT true for C++, where this would have internal linkage)
int i = 42;

// Functions have external linkage by default; it is not necessary to say
// "extern void myFunction" as the "extern" is implied
void myFunction(void)
{
    // some code here
}

void main(void)
{
    myFunction();
}
```

Now you will notice that with the inclusion of the header like this, some things like "int i" end up occurring in the file twice. This can be very confusing when we try to establish exactly which of these statements are just declarations of the names of variables and which ones actually allocate memory. If a symbol like "int i" is declared more than once in the file, how do we ensure that memory is not allocated more than once, especially if "int i" occurs in the global file scope of more than one translation unit?

In order to make sense of this we can go over how the compiler processes the combined translation unit from top to bottom for our simple example. When the compiler processes this file it first finds a declaration of a function without an implementation/definition. This tells the compiler to only declare this name in what is commonly referred to as its "dictionary".
Once the name is established it is possible for the implementation to safely refer to this name. Such a declaration of a function without an implementation is called a function prototype.

The next code line contains a declaration of an integer called "i" with external linkage (we will get to linkage in the next section). This is a declaration as opposed to a definition, as it does not have an initializer (an assignment with an initial value). This declaration places "i" in the dictionary but does not allocate storage for the variable. It also marks the object as having external linkage. When two compilation units declare the same object with external linkage the compiler will know that they are linked (refer to the same thing), and it will only allocate space for it once, so that both translation units end up manipulating the same variable!

Later on the compiler finds "int i = 42". This is a definition of the same symbol "i", and this time it also supplies an initializer, which tells the compiler to set this variable to 42 before main is run. As this is a definition, this is the statement that causes memory to be allocated for the variable. If you try to have two definitions for the same object (even in two separate translation units) the compiler will report an error alerting you that the object was defined more than once. It will say "duplicate symbol" or "multiple definition for object" or something along these lines (error messages are not specified by the standard, so they differ between compilers).

Next we encounter the implementation/definition of the function myFunction. Lastly we encounter the implementation/definition of main, which is traditionally the entry point of the application.

I encourage you to cut and paste that snippet above into an empty project and compile it to assure yourself that this works fine.
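The whole module.h/module.c arrangement above, including the duplicate-definition error, can be exercised end to end. This sketch uses gcc on a desktop (XC8 behaves the same way in principle; the error wording will differ):

```shell
cd "$(mktemp -d)"

cat > module.h <<'EOF'
void myFunction(void);
extern int i;
EOF

cat > module.c <<'EOF'
#include "module.h"
int i = 42;                // definition: storage is allocated here, once
void myFunction(void) { i = i + 1; }
EOF

cat > main.c <<'EOF'
#include <stdio.h>
#include "module.h"
int main(void)
{
    myFunction();          // both translation units manipulate the SAME "i"
    printf("%d\n", i);
    return 0;
}
EOF

gcc -c module.c && gcc -c main.c
gcc main.o module.o -o prog
out="$(./prog)"            # main sees the increment performed in module.c

# A second initialized definition of "i" triggers the duplicate-symbol error
cat > dup.c <<'EOF'
int i = 7;                 // second definition of the same external symbol
EOF
gcc -c dup.c
if gcc main.o module.o dup.o -o prog2 2>/dev/null; then dup=linked; else dup=error; fi
```

The program prints 43, proving that the extern declaration in the header and the definition in module.c refer to one object, while the attempt to link dup.o fails exactly as described above.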
After that I want you to paste in the following examples so we can better understand the mechanics here, and prove that I am not smoking something!

```c
// Some test code to show why include files actually end up working
int j;
int j;
int j = 1;

void main(void)
{
    // Empty main
}
```

You can compile this and note that there will be no error (a project with this code is attached for your convenience). This is because "int j;" is what is called a "tentative definition". What this means is that we are stating that there is a definition for this variable in this translation unit. If the end of the translation unit is reached and no definition has been provided (there was no definition with an initializer), then the compiler must behave as if there was a definition with an initializer of 0 at the end of the translation unit. You can have as many tentative definitions as you want for the same object, even in the same compilation unit, as long as their types are all the same. The third line is the only definition of "j", which also triggers the allocation of storage for the variable. If this line is removed, storage will be allocated at link time as if there was a definition with an initializer of 0 at the end of the file, provided no definition can be found in any of the translation units being linked.

Now change the code to look as follows:

```c
// Some test code to show why include files actually end up working
int j;
int j = 1;
int j = 1;

void main(void)
{
    // Empty main
}
```

This will result in an error message: since more than one initializer is provided, we have multiple definitions for the object in the same translation unit, which is not allowed. This is because a declaration with an initializer is a definition for a variable, and a definition allocates storage for the variable. We can only allocate storage for a variable once. On Clang the error looks only slightly different.

Now let's try something else. Change it to look like this.
This time the variable is an auto variable with block scope, so it has no linkage (this is also called a local variable). In this case we are not allowed to declare the same variable more than once, because there is no good reason (as there is with header files) to do this, and if it happens it is most likely a mistake, so the compiler will not allow it.

```c
// Some test code to show why include files actually end up working
void main(void)
{
    int j;
    int j;
    int j = 1;
}
```

The error produced on XC8 will appear even if I have only two j's with no initializer.

An important note here: auto variables (local variables) will NOT be automatically initialized to 0 like variables at file scope will be. This means that if you do not supply an initializer, the variable can and will likely have any random value.

Linkage

We spoke about linkage quite a bit, so let's also make this clear. The C standard covers linkage in section 6.2.2.

For any identifier with file scope, linkage is external by default. External linkage means that all variables with this identical name in all translation units will be linked together to point to the same object. Variables with file scope which have a "static" storage-class specifier have internal linkage. This means that all references WITHIN THIS TRANSLATION UNIT to the same name will be linked to refer to the same object, but objects in other translation units with the same symbol name will NOT be linked to this one. Local variables (variables with block or function scope) automatically have no linkage; this means they will never be linked, which means declaring the same symbol twice causes an error (as they cannot be linked). An example of this was shown in the last sample block of the previous section. Note that adding "extern" in front of a local variable will give it external linkage, which means that it will be linked to any global variables elsewhere in the program.
I have made a little example project to play with which demonstrates this behavior (perhaps to your surprise!). If two translation units both contain tentative definitions for the same symbol with external linkage, linkers will commonly allocate the object only once and both will refer to the same object. (Two definitions with initializers, as we saw earlier, are an error, since each would allocate storage.) There is a nice example, as always, in the C99 standard:

```c
int i1 = 1;        // definition, external linkage
static int i2 = 2; // definition, internal linkage
extern int i3 = 3; // definition, external linkage
int i4;            // tentative definition, external linkage
static int i5;     // tentative definition, internal linkage

int i1;            // valid tentative definition, refers to previous
int i2;            // 6.2.2 renders undefined, linkage disagreement
int i3;            // valid tentative definition, refers to previous
int i4;            // valid tentative definition, refers to previous
int i5;            // 6.2.2 renders undefined, linkage disagreement

extern int i1;     // refers to previous, whose linkage is external
extern int i2;     // refers to previous, whose linkage is internal
extern int i3;     // refers to previous, whose linkage is external
extern int i4;     // refers to previous, whose linkage is external
extern int i5;     // refers to previous, whose linkage is internal

// I had to add the missing one
extern int i6;     // valid declaration only, whose linkage is external. No storage is allocated.
```

Note that if we have a declaration only, such as i6 above, in a compilation unit, the unit will compile without allocating any storage for the object. At link time the linker will attempt to locate the definition of the object that allocates its storage; if none is found in any compilation unit of the program you will get an error, something like "reference to undefined symbol i6".

Looking over those examples you will note that the storage-class specifier "extern" should not be confused with the linkage of the variable.
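The i6 situation above is easy to reproduce: the compiler happily accepts the declaration, and only the linker complains. A sketch with gcc (error wording varies by toolchain):

```shell
cd "$(mktemp -d)"

cat > main.c <<'EOF'
extern int i6;            // declaration only: no storage allocated here
int main(void) { return i6; }
EOF

gcc -c main.c -o main.o   # compiles fine: the compiler trusts the declaration

if gcc main.o -o prog 2>/dev/null; then
    link=ok               # would only happen if some object file defined i6
else
    link=undefined        # the linker cannot find a definition of i6
fi
```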
It is very possible for a variable declared with the "extern" storage-class specifier to have internal linkage, as indicated by the examples from the standard for "i2" and "i5". To see if you understand "extern", take a look at this example. What happens when you have one file (e.g. main.c) which declares a local variable as extern like this? [First try to predict, then test it and see for yourself.]

```c
#include <stdio.h>

void testFunction(void);

int i = 1;

void main(void)
{
    extern int i;
    testFunction();
    printf("%d\r\n", i);
}
```

And in a different file (e.g. module.c) place the following:

```c
int i;

void testFunction(void)
{
    i = 5;
}
```

You should be able to tell whether this will compile or not, and if not what error it would give, or whether it will compile just fine and print 5.

Also try the following:
- What happens when you remove the "extern" storage-class specifier?
- What happens when instead you remove the entire line "extern int i;" from function main (no externs in either file)? Is that what you expected?
- What happens when you move the initializer from file scope (just leaving "int i;") to the function-scope declaration inside main (so you have "extern int i = 1;" inside the main function)?
- What happens when you add "extern" to the file-scope declaration (replace "int i = 1;" with "extern int i = 1;")?

In Closing

When you are breaking your C code into independent "translation units" or "compilation units", keep in mind that the entire header file is being pasted into your C file whenever you use #include. Keeping this in mind can help you resolve all kinds of mysterious bugs. Make sure you understand when variables have external storage class and when they have external linkage. Remember that if two modules declare file-scope variables with external linkage and the same name, they will end up being the same variable, so two libraries each using "temp" is a bad idea, as these will end up overwriting each other and causing hard-to-locate bugs.
Answers Cheat Sheet:

The code listed compiles fine and prints "5".

Additional exercises:
- When you remove the extern from the first line in function main, it prints some random value (on my machine 287088694). This is because the local variable has no linkage to the other variable called i.
- When you instead remove the entire first line from the main function, it compiles and prints "5" as before.
- Having "extern int i = 1;" inside of function main does not compile at all, complaining that "an extern variable cannot have an initializer".
- Having "extern int i = 1;" in file scope is allowed though! This compiles just fine, although Clang will give a warning, as long as only one definition exists. If you now also add an initializer to the file-scope "int i" in module.c it will not compile any more.
Structures in the C Programming Language

Structures in C are one of the most misunderstood concepts. We see a lot of questions about the use of structs, often simply about the syntax and portability. I want to explore both of these, look at some best-practice use of structures, and cover some lesser-known facts in this post. Covering it all will be pretty long, so I will start off with the basics, the syntax and some examples, then move on to some more advanced stuff. If you are an expert who came here for more advanced material, please jump ahead using the links supplied.

Throughout I will refer often to the C99 ANSI C standard, which can be downloaded from the link in the references. If you are not using a C99 compiler some things like designated initializers may not be available. I will try to point out where something is not available in older compilers that only support C89 (also known as C90). C99 is supported in XC8 from v2.0 onwards.

Advanced topics handled lower down:
- Scope
- Designated Initializers
- Declaring Volatile and Const
- Bit-Fields
- Padding and Packing of structs and Alignment
- Deep and Shallow copy of structures
- Comparing Structs

Basics

A structure is a compound type in C known as an "aggregate type". Structures allow us to use sets of variables together like a single aggregate object. This allows us to pass groups of variables into functions, assign groups of variables to a destination location as a single statement, and so forth. Structures are also very useful when serializing or de-serializing data over communication ports. If you are receiving a complex packet of data it is often possible to define a structure specifying the layout of the variables, e.g. the IP protocol header, which allows more natural access to the members of the structure.
Lastly, structures can be used to create register maps, where a structure is aligned with CPU registers in such a way that you can access the registers through the corresponding structure members.

The C language has only two aggregate types, namely structures and arrays. A union is notably not considered an aggregate type, as it can only have one member object at a time (overlapping objects are not counted separately). [Section 6.2.5 "Types" of C99]

Syntax

The basic syntax for defining a structure follows this pattern:

```c
struct [structure tag] {
    member definition;
    member definition;
    ...
} [one or more structure variables];
```

As indicated by the square brackets, both the structure tag (or name) and the structure variables are optional. This means that I can define a structure without giving it a name. You can also just define the layout of a structure without allocating any space for it at the same time.

What is important to note here is that if you are going to use a structure type throughout your code, the structure should be defined in a header file and the structure definition should then NOT include any variable definitions. If you do include the structure variable definition part in your header file, this will result in a different variable with an identical name being created every time the header file is included! This kind of mistake is often masked by the fact that the compiler will co-locate these variables, but this kind of behavior can cause really hard-to-find bugs in your code, so never do that! Declare the layout of your structures in a header file and then create the instances of your variables in the C file they belong to. Use extern declarations if you want a variable to be accessible from multiple C files, as usual.

Let's look at some examples.

Example 1 - Declare an anonymous structure (no tag name) containing 2 integers, and create one instance of it. Storage is allocated for this one instance.
```c
struct {
    int i;
    int j;
} myVariableName;
```

This structure type does not have a name, so it is an anonymous struct, but we can access the members via the variable name which is supplied. The structure type may not have a name, but the variable does. When you declare a struct like this it is not possible to declare a function which will accept this type of structure by name.

Example 2 - Declare a type of structure which we will use later in our code. Do not allocate any space for it.

```c
struct myStruct {
    int i;
    int j;
};
```

If we declare a structure like this we can create instances, or define variables, of the struct type at a later stage as follows. (According to the standard: "A declaration specifies the interpretation and attributes of a set of identifiers. A definition of an identifier is a declaration for that identifier that causes storage to be reserved for that object" - 6.7)

```c
struct myStruct myVariable1;
struct myStruct myVariable2;
```

Example 3 - Declare a type of structure and define a type for this struct.

```c
typedef struct myStruct {
    int i;
    int j;
} myStruct_t;  // Not to be confused with a variable declaration -
               // typedef changes the syntax here: myStruct_t is part of the typedef, NOT the struct definition!

// This is of course equivalent to
struct myStruct {
    int i;
    int j;
};  // Now if you placed a name here it would declare a variable

typedef struct myStruct myStruct_t;
```

The distinction here is a constant source of confusion for developers, and this is one of many reasons why using typedef with structs is NOT ADVISED. I have added in the references a link to some archived conversations which appeared on Usenet back in 2002. In these messages Linus Torvalds explains much better than I can why it is generally a very bad idea to use typedef with every struct you declare, as has become the norm for so many programmers today. Don't be like them!
In short, typedef is used to achieve type abstraction in C. This means that the owner of a library can at a later time change the underlying type without telling users about it, and everything will still work the same way. But if you are not using the typedef exactly for this purpose, you end up abstracting, or hiding, something very important about the type. If you create a structure it is almost always better for the consumer to know that they are dealing with a structure: it is not safe to do comparisons like == on a struct, and it is not always safe to copy a struct using = due to deep-copy problems (described later). By letting the users of your structs know explicitly that they are using structs, you will help them avoid a lot of really hard-to-track-down bugs. Listen to the experts!

This all means that the BEST PRACTICE way to use structs is as follows.

Example 4 - How to declare a structure, instantiate a variable of this type, and pass it into a function. This is the BEST PRACTICE way.

```c
struct point {  // Declare a cartesian point data type
    int x;
    int y;
};

void pointProcessor(struct point p)  // Declare a function which takes a struct point as parameter, by value
{
    int temp = p.x;
    ...  // and the rest
}

void main(void)
{
    // local variables
    struct point myPoint = {3, 2};  // Allocate a point variable and initialize it at declaration

    pointProcessor(myPoint);
}
```

As you can see, we declare the struct and it is clear that we are defining a new structure which represents a point. Because we are using the structure correctly it is not necessary to call this point_struct or point_t, because when we use the structure later it will be accompanied by the struct keyword, which makes its nature perfectly clear every time it is used.
When we use the struct as a parameter to a function we explicitly state that this is a struct being passed. This acts as a caution to developers that deep/shallow copies may be a problem here and need to be considered when modifying or copying the struct. We also state this explicitly when a variable is declared, because the point where we allocate storage is the best time to consider structure members that are arrays or pointers to characters or something similar, which we will discuss later under deep/shallow copies and also comparisons and assignments.

Note that this example passes the structure to the function "by value", which means that a copy of the entire structure is made on the parameter stack and this copy is passed into the function. Changing the parameter inside the function will not affect the variable you are passing in; you will be changing only the temporary copy.

Example 5 - HOW NOT TO DO IT! You will see lots of examples on the web doing it this way. It is not best practice; please do not do it this way!

```c
// This is an example of how NOT to do it
// This does the same as example 4 above, but doing it this way abstracts the type in a bad way
// This is what Linus Torvalds warns us against!
typedef struct point_tag {  // Declare a cartesian point data type
    int x;
    int y;
} point_t;

void pointProcessor(point_t p)
{
    int temp = p.x;
    ...  // and the rest
}

void main(void)
{
    // local variables
    point_t myPoint = {3, 2};  // Allocate a point variable and initialize it at declaration

    pointProcessor(myPoint);
}
```

Of course, now the tag name of the struct has no purpose, as the only thing we ever use it for is to declare yet another type with another name. This is a source of endless confusion to new C programmers, as you can imagine! The mistake here is that the typedef is used to hide the nature of the variable.

Initializers

As you saw above, it is possible to assign initial values to the members of a struct at the time of definition of your variable.
There are some interesting rules related to initializer lists which are worth pointing out. The standard requires that initializers be applied in the order that they are supplied, and that all members for which no initializer is supplied shall be initialized to 0. This applies to all aggregate types. This is all covered in section 6.7.8 of the standard. I will show a couple of examples to clear up common misconceptions here; descriptions are all in the comments.

```c
struct point {
    int x;
    int y;
};

void function(void)
{
    int myArray1[5];          // This array has random values because there is no initializer
    int myArray2[5] = { 0 };  // Has all its members initialized to 0
    int myArray3[5] = { 5 };  // Has its first element initialized to 5, all other elements to 0
    int myArray4[5] = { };    // Has all its members initialized to 0 (empty braces are a common
                              // compiler extension; strict C99 requires at least one initializer)

    struct point p1;          // x and y both have indeterminate (random) values
    struct point p2 = {1, 2}; // x = 1 and y = 2
    struct point p3 = { 1 };  // x = 1 and y = 0

    // Code follows here
}
```

These rules about initializers are important when you decide in which order to declare the members of your structures. We saw a great example of how user interfaces can be simplified by placing members that should be initialized to 0 at the end of the list of structure members when we looked at the examples of how to use RTCOUNTER in another blog post.

More details on initializers, such as designated initializers and variable-length arrays, which were introduced in C99, are discussed in the advanced section below.

Assignment

Structures can be assigned to a target variable just the same as any other variable. The result is the same as if you used the assignment operator on each member of the structure individually. In fact, one of the enhancements of the "Enhanced Midrange" core in all PIC16F1xxx devices is the capability to do shallow copies of structures faster through specialized instructions!
```c
struct point {  // Declare a cartesian point data type
    int x;
    int y;
};

void main(void)
{
    struct point p1 = {4, 2};  // p1 initialized through an initializer-list
    struct point p2 = p1;      // p2 is initialized through assignment
    // At this point p2.x is equal to p1.x, and p2.y is equal to p1.y

    struct point p3;
    p3 = p2;                   // And now all three points have the same value
}
```

Be careful though: if your structure contains external references such as pointers, you can get into trouble, as explained later under Deep and Shallow copy of structures.

Basic Limitations

Before we move on to advanced topics: as you may have suspected, there are some limitations on how much of each thing you can have in C. The C standard calls these limits translation limits. They are a requirement of the C standard, specifying the minimum capabilities a compiler must have in order to call itself compliant with the standard. This ensures that your code will compile on all compliant compilers as long as you do not exceed these limits.

The translation limits applicable to structures are:
- External identifiers must use at most 31 significant characters. This means structure names or members of structures should not exceed 31 unique characters.
- At most 1023 members in a struct or union.
- At most 63 levels of nested structure or union definitions in a single struct-declaration-list.

Advanced Topics

Scope

Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in the type specifier that declares the tag. When you use typedefs, however, the type name only has scope after the type declaration is complete. This makes it tricky to define a structure which refers to itself when you use typedefs to define the type, something which is important if you want to construct something like a linked list. I regularly see people tripping themselves up with this because they are using the BAD way of using typedefs. Just one more reason not to do that! Here is an example.
// Perfectly fine declaration which compiles, as myList has scope inside the curly braces
struct myList {
    struct myList* next;
};

// This DOES NOT COMPILE!
// The reason is that myList_t only has scope after the closing curly brace where the type name is supplied.
typedef struct myList {
    myList_t* next;
} myList_t;

As you can see above, we can easily make a member of the structure a pointer to the structure itself when we stay away from typedefs, but how do you handle the more complex case of two separate structures referring to each other? In order to solve that one we have to make use of incomplete struct types. Below is an example of how this looks in practice.

struct a;   // Incomplete declaration of a
struct b;   // Incomplete declaration of b

struct a {  // Completing the declaration of a with a member pointing to the still incomplete b
    struct b * myB;
};

struct b {  // Completing the declaration of b with a member pointing to the now complete a
    struct a * myA;
};

This is an interesting example from the standard on how scope is resolved.

Designated Initializers (introduced in C99)

Example 4 above used initializer lists to initialize the members of our structure, but we were only able to omit members at the end, which limited us quite severely. If we could omit any member from the list, or rather include members by designation, we could supply the initializers we need and let the rest be set safely to 0. This was introduced in C99. This addition had an even bigger impact on unions, however. There is a rule which states that an initializer list shall be applied solely to the first member of a union. It is easy to see why this was necessary: since the structs which make up a union need not have the same number of members, it would be impossible to apply a list of constants to an arbitrary member of the union. In many cases this means that designated initializers are the only way that unions can be initialized consistently. Examples with structs.
struct multi {
    int x;
    int y;
    int a;
    int b;
};

struct multi myVar = {.a = 5, .b = 3};  // Initialize the struct to { 0, 0, 5, 3 }

Examples with a union.

struct point {
    int x;
    int y;
};

struct circle {
    struct point center;
    int radius;
};

struct line {
    struct point start;
    struct point end;
};

union shape {
    struct circle mCircle;
    struct line mLine;
};

void main(void)
{
    volatile union shape shape1 = {.mLine = {{1,2}, {3,4}}};  // Initialize the union using the line member
    volatile union shape shape2 = {.mCircle = {{1,2}, 10}};   // Initialize the union using the circle member
    ...
}

This type of initialization of a union, using the second member of the union, was not possible before C99. This also means that if you are trying to port C99 code to a C89 compiler it will require you to write initializer functions which are functionally different, and your port may end up not working as expected. Initializers with designations can be combined with compound literals. Structure objects created using compound literals can be passed to functions without depending on member order. Here is an example.

struct point {
    int x;
    int y;
};

// Passing 2 anonymous structs into a function without declaring local variables
drawline( (struct point){.x=1, .y=1},
          (struct point){.y=3, .x=4});

Volatile and Const Structure Declarations

When declaring structures it is often necessary for us to make the structure volatile; this is especially important if you are going to overlay the structure onto registers (a register map) of the microprocessor. It is important to understand what happens to the members of the structure in terms of volatility depending on how we declare it. This is best explained using the examples from the C99 standard.
struct s {  // Struct declaration
    int i;
    const int ci;
};

// Definitions
struct s s;
const struct s cs;
volatile struct s vs;

// The various members have the types:
s.i    // int
s.ci   // const int
cs.i   // const int
cs.ci  // const int
vs.i   // volatile int
vs.ci  // volatile const int

Bit Fields

It is possible to include in the declaration of a structure how many bits each member should occupy. This is known as "bit fields". It can be tricky to write portable code using bit-fields if you are not aware of their limitations. Firstly, the standard states that "A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type." Further to this it also states that "As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned." This effectively means that unless you use _Bool or unsigned int your structure is not guaranteed to be portable to other compilers or platforms. The recommended way to declare portable and robust bit-fields is as follows.

struct bitFields {
    unsigned enable : 1;
    unsigned count : 3;
    unsigned mode : 4;
};

When you use any of the members in an expression they will be promoted to a full-sized unsigned int during the expression evaluation. When assigning back to the members, values will be truncated to the allocated size. It is possible to use anonymous bit-fields to pad out your structure, so you do not need to use dummy names in a struct if you build a register map with some unimplemented bits. That would look like this:

struct bitFields {
    unsigned int enable : 1;
    unsigned : 3;
    unsigned int mode : 4;
};

This declares a type which is at least 8 bits in size and has 3 padding bits between the members "enable" and "mode".
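The truncation behaviour on assignment (unsigned bit-fields wrap modulo 2^width) can be demonstrated with a short sketch; the helper function here is mine, purely for illustration:

```c
struct bitFields {
    unsigned enable : 1;
    unsigned count  : 3;
    unsigned mode   : 4;
};

/* Assign v to the 3-bit count field and return what was actually stored.
 * Values that do not fit are reduced modulo 8 (2^3) on assignment. */
static unsigned set_count(struct bitFields *bf, unsigned v)
{
    bf->count = v;
    return bf->count;
}
```

For example, set_count(&bf, 9) stores 1, because 9 (binary 1001) loses its top bit in the 3-bit field; reading the field back in an expression promotes it to a full unsigned int again.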
The caveat here is that the standard does not specify how the bits have to be packed into the structure, and different systems do in fact pack bits in different orders (e.g. some pack from the LSB while others pack from the MSB first). This means that you should not rely on specific bits in your struct being in specific locations. All you can rely on is that in 2 structs of the same type the bits will be packed in corresponding locations. When you are dealing with communication systems and sending structures containing bit-fields over the wire, you may get a nasty surprise if bits are in a different order on the receiver side. And this also brings us to the next possible inconsistency - packing. This means that for all the syntactic sugar offered by bit-fields, it is still more portable to use shifting and masking. By doing so you can select exactly where each bit will be packed, and on most compilers this will result in the same amount of code as using bit-fields.

Padding, Packing and Alignment

This is going to be less applicable on a PIC16, but if you write portable code or work with larger processors this becomes very important. Typically padding will happen when you declare a structure that has members which are smaller than the fastest addressable unit of the processor. The standard allows the compiler to place padding, or unused space, in between your structure members to give you the fastest access in exchange for using more RAM. This is called "alignment". In embedded applications RAM is usually in short supply, so this is an important consideration. You will see e.g. on a 32-bit processor that the sizes of structures increment in multiples of 4. The following example shows the definition of some structures and their sizes on a 32-bit processor (my i7 in this case running macOS - and yes, it is a 64-bit machine, but I am compiling for 32-bit here).
// This struct will likely result in sizeof(struct iAmPadded) == 12
struct iAmPadded {
    char c;
    int i;
    char c2;
};

// This struct results in sizeof(struct iAmPadded) == 8 (with GCC on my i7 Mac),
// or it could be 12, depending on the compiler used.
struct iAmPadded {
    char c;
    char c2;
    int i;
};

Many compilers/linkers will have settings with regard to "packing", which can be set globally or per structure. Packing will instruct the compiler to avoid padding in between the members of a structure if possible and can save a lot of memory. It is also critical to understand packing and padding if you are making register overlays or constructing packets to be sent over communication ports. If you are using GCC, packing is going to look like this:

// This struct on gcc on a 32-bit machine has sizeof(struct iAmPadded) == 6
struct __attribute__((__packed__)) iAmPadded {
    char c;
    int i;
    char c2;
};

// OR this has the same effect for GCC
#pragma pack(1)
struct iAmPadded {
    char c;
    int i;
    char c2;
};

If you are writing code on e.g. an AVR which uses GCC and you want to use the same library on your PIC32 or your Cortex-M0 32-bitter, then you can instruct the compiler to pack your structures like this and save loads of RAM. Note that taking the address of structure members may result in problems on architectures which are not byte-addressable, such as a SPARC. Also, it is not allowed to take the address of a bit-field inside a structure. One last note on the use of the sizeof operator: when applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding.

Deep and Shallow Copy

Another one of those areas where we see countless bugs. Making structures with standard integer and float types does not suffer from this problem, but when you start using pointers in your structures this can turn into a problem real fast.
Generally it is perfectly fine to create copies of structures by passing them into functions or using the assignment operator "=". Example:

struct point {
    int a;
    int b;
};

void function(void)
{
    struct point point1 = {1,2};
    struct point point2;
    point2 = point1;  // This will copy all the members of point1 into point2
}

Similarly, when we call a function and pass in a struct, a copy of the structure is made onto the parameter stack in the same way. When the structure contains a pointer, however, we must be careful, because the process will copy the address stored in the pointer but not the data which the pointer is pointing to. When this happens you end up with 2 structures containing pointers pointing to the same data, which can cause some very strange behavior and hard-to-track-down bugs. Such a copy, where only the pointers are copied, is called a "shallow copy" of the structure. The alternative is to allocate memory for the members being pointed to by the structure and create what is called a "deep copy" of the structure, which is the safe way to do it. We probably see this with strings more often than with any other type of pointer, e.g.

struct person {
    char* firstName;
    char* lastName;
};

// Function to read a person's name from the serial port
void getPerson(struct person* p);

void f(void)
{
    struct person myClient = {"John", "Doe"};  // The structure now points to the constant strings

    // Read the person data
    getPerson(&myClient);
}

// The intention of this function is to read 2 strings and assign the names of the struct person
void getPerson(struct person* p)
{
    char first[32];
    char last[32];
    Uart1_Read(first, 32);
    Uart1_Read(last, 32);
    p->firstName = first;  // BUG: stores pointers to local buffers
    p->lastName = last;
}

The problem with this code is that it will very likely pass most tests you throw at it, but it is tragically broken.
The 2 buffers, first and last, are allocated on the stack, and when the function returns that memory is freed but still contains the data you received. The memory will remain intact only until another function is called AND that function allocates auto variables on the stack. This means that at some later stage the structure will become invalid and you will not be able to understand how; and if you call the function twice you will later find that both variables you passed in contain the same names. Always double-check and be mindful of where the pointers are pointing and what the lifetime of the allocated memory is. Be particularly careful with memory on the stack, which is always short-lived. For a deep copy you would have to allocate new memory for the members of the structure that are pointers and copy the data from the source structure to the destination structure manually. Be particularly careful when structures are passed into a function by value, as this makes a copy of the structure which points to the same data; in this case, if you re-allocate the pointers you are updating the copy and not the source structure! For this reason it is best to always pass structures by reference (the function should take a pointer to a structure) and not by value. Besides, if data is worth placing in a structure it is probably going to be bigger than a single pointer, and passing the structure by reference would probably be much more efficient!

Comparing Structs

Although it is possible to assign structs using "=", it is NOT possible to compare structs using "==". The most common solution people go for is to use memcmp with sizeof(struct) to try to do the comparison. This is however not a safe way to compare structures and can lead to really hard-to-track-down bugs!
The problem is that structures can have padding as described above, and when structures are copied or initialized there is no obligation on the compiler to set the values of the locations that are just padding, so it is not safe to use memcmp to compare structures. Even if you use packing, the structure may still have trailing padding after the data to meet alignment requirements. The only time using memcmp is going to be safe is if you used memset or calloc to clear out all of the memory yourself, but always be mindful of this caveat. The safe alternative is to compare the structures member by member.

Conclusion

Structs are an important part of the C language and a powerful feature, but it is important that you ensure you fully understand all the intricacies involved in structs. There is, as always, a lot of bad advice and bad code out there in the wild wild west known as the internet, so be careful when you find code in the wild, and just don't use typedef on your structs!

References

As always the Wikipedia page is a good resource
Link to a PDF of the committee draft which was later approved as the C99 standard
Linus Torvalds explains why you should not use typedef everywhere for structs
Good write-up on packing of structures
Bit Fields discussed on Hackaday
  3. What every embedded programmer should know about ADC measurement, accuracy and sources of error

ADCs you encounter will typically be specified as 8, 10 or 12-bit. This is however rarely the accuracy that you should expect from your ADC. It seems counter-intuitive at first, but once you understand what goes on under the hood this will be much clearer. What I am going to do today is take a simple evaluation board for the PIC18F47K40 (MPLAB® Xpress PIC18F47K40 Evaluation Board) and determine empirically (through experiments and actual measurements) just how accurate you should expect ADC measurements to be. Feel free to skip ahead to a specific section if you already know the basics. Here is a summary of what we will cover, with links for the cheaters:

Units of Measurement of Errors
Measurement Setup
Sources and Magnitude of Errors
  Voltage Reference
  Noise
  Offset
  Gain Error
  Missing Codes and DNL
  Integral Nonlinearity (INL)
  Sampling Error
Adding up Errors
Vendor Comparison
Final Notes

Units of Measurement of Errors

When we talk about ADCs you will often see the term LSB used. This term refers to the voltage represented by the least significant bit of the ADC, so in reality it is the voltage for which you should read 1 on the ADC. This is a convenient measure for ADCs since the reference voltage is often not fixed, and the size of 1 LSB in volts will depend on what you have the reference at, while most errors caused by the transfer function will scale with the reference. For a 10-bit ADC with 3.3V of range, one LSB will be 3.3/(2^10) = 3.3/1024 = 3.22mV. An error of 1% on a 10-bit converter would represent 1% * 1024 = 10.24x the size of one LSB, so we will refer to this as 10 LSB of error, which means our measurement could be off by 32.2mV, or ten times the size of 1 LSB. When I have 10 LSB of error I really should be rounding my results to the nearest 10 LSB, since the least significant bits of my measurement will be corrupted by this error.
10 LSB will take 3.32 bits to represent. This means that my lowest 3 bits are possibly incorrect and I can only be confident in the values represented by the 7 most significant bits of my result. This means that the effective number of bits (ENOB) for my system is only 7, even though my ADC is taking a 10-bit measurement. The lower 3 bits are affected by the measurement error and cannot be relied upon, so they should be discarded if I am trying to make an accurate absolute voltage measurement. We can always work out exactly how many bits of accuracy we are losing, or to how many bits we need to round, using the calculation:

log(#LSB error) / log(2)

Note that this calculation will give us fractional numbers of bits. If we have 10 LSB of error, the error does not quite affect a full 4 bits (that happens only at 16 LSB), but we cannot say it removes only 3 bits, because that already happened at 8 LSB, so it is somewhere in between. In order to compare errors meaningfully we will work with fractions of bits in these cases, so 10 LSB of error reduces our accuracy by 3.32 bits. This is especially useful when errors are additive, because we can add up all the fractional bits of error to get the total error to the nearest bit. At this point I would like to encourage you to take your oscilloscope and try to measure how much noise you can detect on your lines. You will probably be surprised that most desk oscilloscopes can only measure signals down to 20mV, which means that 1 LSB on a 10-bit ADC with a 3.3V reference will be close to 10x smaller than the smallest signal your digital scope can measure! If you can see noise on the scope (which you probably can) then it is probably at least 20mV, or 10 LSB of error. It turns out that our intuition about how accurate an ADC should be, as well as how accurately our scope can measure, is seldom correct...

Measurement Setup

I am using my trusty Saleae Logic Pro 8 today.
It has a 12-bit ADC, measures +-10V on the analog channel, and is calibrated to be accurate to between 9 and 10 ENOB of absolute accuracy. This means that 1 LSB of error will be roughly 4.8mV, which for my 2V system with a 10-bit ADC is already the size of 2 LSB of measurement error. When I ground the Saleae input and take a measurement we can see how much noise to expect on the input during our measurements. As you will see later, we actually want 2-3 LSB of noise so that we can improve accuracy by digital filtering; if you do not have enough noise this is not possible, so this looks really good. Using the software to find the maximum variation, you can see that I have about 15.64mV of noise on my line. Since the range is +-10V this is only 15.6/20000 = 0.08% of error, but for my target 2V range this is going to be 15.6/2048*1024 = 8 LSB of error on my measurement equipment to start with! For an experiment we are going to need an analog voltage source to measure using the ADC. It so happens that this device has a DAC, so why not just use that! You would think that this was a no-brainer, but it turns out, as always, that it is not quite as simple as that! What I will do first is set the DAC and ADC to use the same reference (this has the added benefit that Vref inaccuracy will be cancelled out, nice!). We expect that if we set the DAC to give us 1.024V (50% of full range) and we then measure this using the 10-bit ADC, we would measure half of the ADC range, or 512, right? For the test I made a simple program that measures the ADC every 1 second and prints the result to the UART. Well, here is the result of the measurement (to the right). Not what you expected?! Not only are the first two readings appallingly bad, but the average seems to be 717, which is a full 40% more than we expect! How is this possible? Well, this is how. Not only is the ADC inaccurate here, but the DAC even more so!
The DAC is only 5 bits and it is specified to be accurate to 5 LSB. That is already a full 320mV of error, but it is still not nearly enough to explain why we are measuring 717/1024*2.048 = 1.434V instead of 1.024V... So what is really going on here? To see, I connected my trusty Saleae and changed the application to cycle the DAC through all 32 values, 1s per value, and make a plot for us to look at. On the Saleae we see this. It turns out that the DAC is such a weak source that anything you connect to its output (even ADC input leakage, or simply an I/O pin with nothing connected to it!) will load down the DAC and skew the output! This has actually been the cause of consternation for many a soul (see e.g. this post on the Microchip forum). Wow, so that makes sense, but is there anything we can do about it? On this device, unfortunately, there is not much we can do. There are devices with on-board op-amps you can use to buffer the DAC output, like the 16F170x family, but this device does not have op-amps so we are out of luck! I will blog all about DACs, and about the reasons for this shape, on another occasion; this blog is about the ADC after all! So all I am going to do is adjust the DAC setting to give us about the voltage we need, by measuring the result with the Saleae, and call it a day. It turns out I needed to subtract 6 from the output setting to get close. We now see a measurement of 520, and this is what we see while taking measurements with the Saleae. 10.37mV of noise on just about 1V and we are in business!

Sources and Magnitude of Error

When measurement errors are uncorrelated it means that they will all add up to form the total worst-case error. For example, if I have 2 LSB of noise and I also have 2 LSB of reference error, the reading can be 2 LSB removed from the correct value as a result of the reference, and an additional 2 LSB as a result of the noise, to give 4 LSB of total error.
This means that these two types of errors are not correlated and the bits contributed by each to the total error are additive. At this point I want to mention that customers often come to me demanding a 16-bit ADC because they feel that the 10-bit one they have is not adequate for their application. They can seldom explain to me why they need 31uV of accuracy, or what advanced layout techniques they are applying to keep the noise levels to even remotely this range, and most of the time the real problem turns out to be that their 10-bit ADC is really being used so badly that they are hardly getting 5 bits of accuracy from the converter. I also often see calculations which effectively discard the lower 4 bits of the ADC measurement, leaving you with only 8 bits of effective measurement, so if you do that, getting more bits in the ADC is obviously only going to buy you disappointment! That said, let's look at all of the most significant sources of error one by one in more detail. There are quite a few, so we will give them headings and numbers.

1. Voltage Reference

To get us started, let's look at the voltage reference and see how many LSB it contributes to our measurement error. If you are using a 1% reference, then please do not insist that you need a 16-bit or even a 12-bit ADC, because your reference alone is contributing errors into the 7th most significant bit and 8 bits is all you are going to get anyway! The datasheet for our evaluation board chip (PIC18F47K40) shows that the voltage reference will be accurate to +-4% when we set it to 2.048V like we did. People are always surprised when they realize how many LSB they are losing due to the voltage reference!
Using that we would now have to be very careful with the layout to not introduce noise at the reference pin, and also take care of any signals coupling into this pin. Even then the reference will likely be something like a TL431 which will only be accurate to 1% which will be 10 LSB reducing our 10-bit ADC to less than 8 ENOB. We must note that reference errors are not equally distributed. At the maximum scale 1% represents 10LSB of error, but at the lower end of the scale 1% will represent only 1% of 1LSB. Since we are looking for the worst-case error we have to work with 10LSB due to the 1% error over the full ADC range. In your application you may be able to adjust the contribution of this error down to better represent the range you are expecting to measure. For example - at mid range, where our test signal is, the reference error will only contribute 5LSB of error with a 1% reference, or 25LSB for our 4% internal reference. The reference error is something which we could calibrate out if we knew what the error was and many manufacturers discard it stating simply that you should calibrate it out. Sadly these references tend to drift over time, temperature and supply voltages, so you usually cannot just calibrate it in the factory and compensate for the error in software and forget it. To revisit our 16-bit ADC scenario, if I want to measure accurately to 31uV (16 bits on a 2V reference) that reference would have to be accurate to 31uV/2V = 0.0015%. Let's look on Digikey for a price on a Voltage reference with the best specs we can find. Best candidate I can find is this one at $128.31 a piece, and even that gives me only 0.1% with up to 0.6ppm/C drift. This means from 0 to 100C I will have 0.006% of temp drift (2LSB) on top of the 0.1% tolerance (which is another 33LSB). 
Now, to be fair, if I am building a control system I am more interested in perturbations from a setpoint, and a 16-bit ADC may be valuable even if my reference is off, because I am not trying to take an absolute measurement. But still, maintaining noise levels below 30uV is more of a challenge than it sounds, especially if I am driving some power stage which adds noise to the equation. This is of course the difference between accuracy and resolution. Accuracy gives me the minimum absolute error, while resolution gives me the smallest relative unit of measure.

2. Noise

Noise is of course the problem we all expect. It can often be a pretty large contributor to your measurement errors, and digital circuits are known for producing lots of noise that will couple into your inputs, but as we will see, noise is not all bad and can be essential if you want to improve the results through digital post-processing. We have seen that every 2mV of noise will add 1 LSB to the error on our system, as we have a 2V reference and 1024 steps of measurement. As you have now seen, this 2mV is probably much smaller than we can measure with a typical oscilloscope, so we cannot be sure how much noise we really have if we simply look at it on our scope. For most systems the recommendation is to place the microcontroller in its lowest-power sleep mode and avoid toggling any output pins during the sampling of the ADC measurement, to get the measurement with the lowest noise level. A simple experiment will show how much noise we could be coupling into the measurement when an adjacent pin is being toggled. I updated our program from before to simply toggle the pin next to the ADC input constantly, and measured with the Saleae to see what the effect is. On the left is the signal zoomed out and on the right is one of the transitions zoomed in so you can get a better look.
That glitch on the measurement line is 150mV, or 75 LSB of noise, due to an adjacent pin toggling, and the dev board I have does not even have long traces, which would have made this much worse! It seems like a good idea to filter all this noise using analog low-pass filters such as filter capacitors, but this is not always wise. We can make small amounts of noise work to our advantage, as long as it is white noise which is uncorrelated with our signal and other errors. When we do post-processing, like taking multiple samples and averaging the result, we can potentially increase the overall accuracy of our measurement. Using this technique it is possible to increase the ENOB (effective number of bits) of your measurements by simply taking more samples and averaging them. Without getting too deep into the math: if you oversample a signal by a factor of N you will improve the SNR by a factor of sqrt(N), which means oversampling 256 times and taking the average will result in a 16x increase in SNR, which represents an additional 4 bits of resolution for the ADC. Of course this is where having uncorrelated white noise of at least +-1 LSB is important. If you have no noise on your signal you would likely just sample the same value 256 times and the average would not add any improvement to the resolution. If you had white noise added to the signal, however, you would sample a variety of values with the average lying somewhere in between the LSBs you can measure, and the ratio of the differences would represent the value of the signal more accurately. For a detailed discussion on this topic you can take a look at this application note by Silicon Labs: https://www.silabs.com/documents/public/application-notes/an118.pdf

3. Offset

The internal circuitry in the ADC will add some offset error into the conversion. This error will move all measurements either up or down by an equal amount.
The offset is a critical parameter for an ADC and should be specified in the datasheet for your device. For the PIC18F47K40 the error due to offset is specified as 2 LSB. Of course, if we knew what the offset was we could easily subtract it from the results, so many specifications will exclude the offset error and claim that you can easily "calibrate out" the offset. This may be possible, even easy to do, but if you do not write the code for it and do the actual math you will have to include the offset error in your accuracy calculations, and measuring the current offset can be a real challenge in a real-world system which is not located on your laboratory bench. If you do decide to measure the offset on the factory floor and calibrate it out in software, you need to be careful to use an accurate reference, avoid noise and other sources of error, and make sure that the offset remains constant over the operating range of voltage and temperature and does not drift over time. If any of these assumptions does not hold, your calibration will be met with limited success. Offset is often hard to calibrate out since many ADCs are not accurate close to the extremes (at Vref or 0V). If they were, you could take a measurement with the input on Vref+ and on Vref- and determine the offset, but we knew it was never going to be that easy! The offset will also differ from device to device, so it is not possible to calibrate it out with fixed values in your code; you will have to actively measure this on every device in the factory and adjust as the offset changes. Some manufacturers will actually calibrate out the offset of an ADC for you during their manufacturing process. If this is the case you will probably see a small offset error of +-1 LSB, which means that it is calibrated to be within this range.
On our device the datasheet specifies a typical offset error of 0.5 LSB with a max error of 2 LSB, so this device is factory calibrated to remove the offset error, but even after this we should still expect up to 2 LSB of drift in the offset around the calibrated value.

4. Gain Error

Similar to the offset, the internal transfer function of the ADC is designed to be as close as possible to ideal, but there is always some error. Gain error will cause the slope of the transfer function to be changed. Depending on the offset, this can cause an error which is at its maximum either at the top or bottom end of the measurement scale, as shown in the figure below. Like the offset, it is also possible to calibrate out the gain error, as long as we have enough reference points to use for the calibration. If the transfer function is perfectly linear this would mean we would only require 2 measurement points. For our device the datasheet spec is typically 0.2 LSB of gain error with a max error of 1.5 LSB. This means that we cannot gain much from attempting to calibrate out the gain on this one. For other manufacturers you can easily find gain and offset errors in the tens of LSB, which makes calibration and compensation for gain and offset worth the effort. The PIC18F47K40 is not only compensated for drift with temperature but also individually calibrated in the factory, so any additional calibration measurements will be at best accurate to 1 LSB, and the device is already specified to typically have less than this error, so calibration will probably gain us nothing.

5. Missing Codes and DNL

We expect that every time the output code increments by 1 the input voltage has increased by exactly 1 LSB. For an ADC the DNL error is a measure of how close to this ideal we are in reality. It represents the largest single-step error that exists over the entire range of the ADC.
If the DNL is stated as 0.5 LSB, this means that it can take anything from 0.5 LSB to 1.5 LSB of input voltage change to get the output code to increment by 1. When the DNL is more than 1 LSB, it means that we can move the input voltage by 2 LSB and only get a single count out of the converter. When this happens it is possible that the next code is squeezed down to 0 LSB, which can cause the converter to skip that code entirely as shown below. Most converters will specify that the result increases monotonically with the voltage and that there are no missing codes as you scan through the range, but you still have to be careful, because this is under ideal conditions; when you add in the other errors it is possible that some codes get skipped. So when you are comparing the output of the converter, never check for a specific conversion value. Always look for a value in a range around the limit you are checking.

6. Integral Nonlinearity - INL

INL is another of the critical parameters for all ADCs and will be stated in your datasheet if the ADC is any good. For our example the INL is specified as 3.5 LSB. The term INL refers to the integral of the differential nonlinearity. In effect it represents the maximum deviation from the ideal transfer function of the ADC, as shown in the picture below. The yellow line represents the ideal transfer function while the blue line represents the actual transfer function. As you can see, the INL is defined as the size of the maximum error through the range of the ADC. Since the INL can occur at any location along the curve, it is not possible to calibrate it out. It is also uncorrelated with the other errors we have examined. We just have to live with this one!

7. Sampling error

A SAR ADC has a sampling capacitor which holds the voltage we are converting during the conversion cycle. 
We must take care when we take a sample to allow enough time for this sampling capacitor to charge to the level of accuracy we want to see in our conversion. Effectively we end up with a circuit that has some series impedance through which the sampling capacitor is charged. The simplified circuit for the PIC18F47K40 looks as follows (from the datasheet). As you can see, the series impedance (Rs) together with the sampling switch and passgate impedance (RIC + Rss) forms a low-pass RC filter charging Chold. A detailed calculation of the sampling time required to be within 1 LSB of the desired sampling value is shown in the ADC section of the device datasheet. If we leave too little time for the sample to be acquired, this will directly result in a measurement error. In our case this means that if we have 10K Rs and we wait 462us after the sampling mux turns to the input we are measuring, the capacitor will be charged to within 0.5 LSB of our target voltage.

The ADC on the PIC18F47K40 has a built-in circuit that can keep the sampling switch closed for us for a number of Tadc periods. This can be set by adjusting the ADACQ register or by using the API generated by MCC. That first inaccurate result we saw in the conversion was a direct result of the channel not being given enough time to charge the sampling cap, since the acquisition time was set to the default value of 0. Of course, since we are not switching channels, the capacitor is closer to the correct value when we take subsequent samples, so the error seems to go away over time! I have seen customers just throw away the first ADC sample as inaccurate, but if you do not understand why, you can easily get yourself into a lot of trouble when you need to switch channels! We can re-do the measurement and this time use an acquisition time of 4 x Tadc = 6.8us. This is the result. 
NOTE: There is another errata on this device: after setting the ADGO bit you have to wait at least 1 instruction cycle before reading it to see if the conversion is complete. At first I was just doing what the datasheet suggested: set ADGO and then wait with while(ADGO); for the conversion to complete. Due to this errata, however, the ADGO bit will still read 0 the first time you read it, and you will think the conversion is done while it has not even started, resulting in an ADC reading of 0! After adding the required NOP() to the generated MCC code as follows, the incorrect first reading is gone:

adc_result_t ADCC_GetSingleConversion(adcc_channel_t channel, uint8_t acquisitionDelay)
{
    // Turn on the ADC module
    ADCON0bits.ADON = 1;

    // Select the A/D channel
    ADPCH = channel;

    // Set the acquisition delay
    ADACQ = acquisitionDelay;

    // Disable the continuous mode
    ADCON0bits.ADCONT = 0;

    // Start the conversion
    ADCON0bits.ADGO = 1;

    NOP(); // NOP workaround for ADGO silicon issue

    // Wait for the conversion to finish
    while (ADCON0bits.ADGO)
    {
    }

    // Conversion finished, return the result
    return (adc_result_t)(((adc_result_t)ADRESH << 8) + ADRESL);
}

Uncorrelated errors

I will leave the full analysis up to the reader, but all of these errors are uncorrelated and thus additive, so for our case the worst case error occurs when all of these errors align: the offset is in the same direction as the gain error, as the noise, as the INL error, etc. Of course, when we test on the bench it is unlikely that we will encounter a situation where all of these are 100% aligned, but if we have manufactured thousands of units in the field running for years it is definitely going to happen, and much more often than you would like, so we have no choice but to design for the worst-case error we are likely to see in the wild. 
For our example the different sources of error add up as follows:

Voltage Reference = 4% [41 LSB]
Noise [8 LSB]
Offset [2.5 LSB]
Gain [1.5 LSB]
INL [3.5 LSB]

That is a total of 56.5 LSB of potential absolute error in measurement. This reduces our effective number of bits by log(56.5)/log(2) = 5.8 bits, which means that our 10-bit result can have absolute errors running into the 6th bit, giving us only 4 ENB (effective number of bits) when we are looking for absolute accuracy. We can improve this to 26.5 LSB by using a 1% off-chip reference, which will make the ENB = 5 bits.

If we look at the measurement we get using the Saleae, we measure 0.99V on the line, which should result in 0.99V/2.045V * 1024 = 495, but our measurement is in fact 520, which is off by 25 LSB. So as we can see, our 1-board sample does not hit the worst case error at the center of the sampling range here, but our error extended at least into the 5th bit of the result, as our 25 LSB error requires more than 4 bits to represent. Nevertheless 25 LSB is quite a bit better than the worst-case value of 56.5 LSB which we calculated, so this particular sample is not doing too badly! I am going to get my hands on a hair dryer in the week and take some measurements at an elevated temperature, and then I will come back and update this for your reading pleasure 🙂

Comparison

I recently compared some ADCs from different vendors. I was actually looking more at the other features, but since I was busy with this I also noted down the specs. Not all of the datasheets were perfectly clear, so do reach out to me if I made a mistake somewhere, but this was how they matched up in terms of ADC performance. As far as I could find them I used the worst case specifications and not the typical ones. Some manufacturers only specify typical results, so this comparison is probably not fair to those who publish better specifications with more information. Let me know in the comments how you feel about this. 
I will go over the numbers again and maybe come back and update all of these to typical values for a fairer comparison if someone asks me for this ...

Device            | Xilinx XC7Z010 | Microchip PIC32MZ EF | TI CC3220SF | Espressif ESP32 | ST Micro STM32L475 | Renesas R65N V2
INL               | 2              | 3                    | 2.5         | 12              | 2.5                | 3
DNL               | 1              | 1                    | 4           | 7               | 1.5                | 2
Offset            | 8              | 2                    | 6(1)        | 25(1)           | 2.5                | 3.5
Gain              | 0.5            | 8                    | 82(1)       | ?(2)            | 4.5                | 3.5
Total (INL+Offset+Gain) | 10.5     | 13                   | 90.5        | 37+             | 9.5                | 10

I noted that many of these manufacturers specify their ADC at only one temperature point (25C), so you probably have to dig a little deeper to ensure that the specs will not vary greatly over temperature.

(1) These values were specified in the datasheet as an absolute voltage and I converted them to LSB for the max range and best resolution of the ADC. Specifically, for the TI device the offset was specified as 2mV and the gain error as 20mV on a 1.4V range, and for the ESP32 the offset is specified as 60mV but for a wider voltage range of 2.45V.

(2) For the ESP32 I was not able to determine the gain error clearly from the datasheet.

Final Notes

We can conclude a couple of very important points from this.

If the datasheet claims a 12-bit ADC we should not expect 12 bits of accuracy. First we need to calculate what to expect from our entire system, and we should expect the reference to add the most to our error. All 12-bit converters are not equal, so when comparing devices do not just look at how many bits the converters provide; also compare their performance! The same system can yield between 5 and 10 bits of accuracy depending on the specs of the converter, so do not be fooled! Many of the vendors specified their ADC at only one very specific temperature and reference voltage at maximum; take care not to be fooled by this - shall we call it "creative" - specmanship, and be sure to compare apples with apples when looking for absolute accuracy.

Source Code

For those who have this board or device I attach the test code I used for download here : ADC_47K40.zip
4. A colleague of mine recommended this little book to me sometime last year. I have been referring to it so often that I think we should add it to our reading list for embedded software engineers. The book is called "Don't Make Me Think" by Steve Krug. The one I linked below is the "Revisited" version, which is the updated version. This book explains the essence of good user interface design, but why would I recommend it to embedded software engineers? After all, embedded devices seldom have rich graphical GUIs, and this book seems to be about building websites? It turns out that all the principles that make a website easy to read, that make for an awesome website in other words, apply almost verbatim to writing readable/maintainable code! You see, code is written for humans to read and maintain, not for machines (machines prefer to read assembly or machine code in binary after all!). The principles explained in this book, when applied to your software, will make it a pleasure to read and effortless to maintain, because it will clearly communicate its message without the unnecessary clutter and noise that we usually find in source code. You will learn that people who are maintaining and extending your code will not be reasoning as much as they will be satisficing (yes, that is a real word!). This forms the basis of what Bob Martin calls "Viscosity" in your code (read about it in his excellent paper entitled Design Principles and Design Patterns). The idea of Viscosity is that developers will satisfice when maintaining or extending the code, which results in the easiest way to do things being followed most often; so if the easiest thing is the correct thing, the code will not rot over time. On the other hand, if doing the "right" thing is hard, people will bypass the design with ugly hacks and the code will become a tangled mess fairly quickly. But I digress; this book will help you understand the underlying reasons for this and a host of other problems. 
This also made me think of some excellent videos I keep sending to people: this excellent talk by Chandler Carruth, which explains that, just like Krug explains in this little book, programmers do not actually read code, they scan it, which is why consistency of form is so important (coding standards). Also this great talk by Kevlin Henney, which explains concepts like signal-to-noise ratio and other details about style in your code (including how to write code with formatting which is refactoring-immune - hint: you should not be using tabs - because of course only a moron would use tabs). Remember, your code is the user interface to your program for the maintainers of the code, who it was written for in the first place. Let's make sure they understand what the hell it is you were doing before they break your code! For the lazy - here is an Amazon share link to the book, click it, buy it right now! https://amzn.to/2ZEoO4O
5. I came across this one yet again today, so I wanted to record it here where people can find it and where I can point to it without looking up all the details in the standard over and over again. I know pointers are hard enough to grok as it is, but it seems that const-qualified pointers and pointers to const-qualified types confuse the hell out of everybody. Here is a bit of wisdom from the C standard. As they say - if all else fails, read the manual! BTW, the same applies to all qualifiers, so this counts for volatile as well. I see the mistake too often where people are trying to make a pointer which is being changed from an ISR e.g., and they will use something like:

volatile List_t * ListHead;

This usually does not have the intended meaning, of course. The volatile qualifier applies to the List_t and not to the pointer. So this is in fact not a volatile pointer to a List_t. It is instead a non-volatile pointer to a "volatile List_t". In simple terms, it is the variable at the address being pointed to which is volatile, not the address itself (the pointer). To make the pointer itself volatile, that is a pointer which is changed from another context such as an ISR, you need to declare it like this:

List_t * volatile ListHead;

Of course, if both the pointer and the thing it is pointing to are volatile, we do it like this:

volatile List_t * volatile ListHead;

There is another example in section 6.2.5 of the standard.
6. I realize now that I have a new pet peeve: the widespread and blind inclusion, by Lemmings calling themselves embedded C programmers, of extern "C" in C header files everywhere. To add insult to injury, programmers (if I can call them that) who commit this atrocity tend to gratuitously leave a comment in the code claiming that this somehow makes the code "compatible" with C++ (whatever that is supposed to mean - IT DOES NOT!!!). It usually looks something like this (taken from MCC-produced code - uart.h):

#ifdef __cplusplus  // Provide C++ Compatibility

extern "C" {

#endif

So why does this annoy me so much? (I mean besides the point that I am unaware of any C++ compiler for the PIC16, for which I generated this code - which in itself is a hint that this has NEVER been tested on any C++ compiler!)

First and foremost - adding something to your code which claims that it will "Provide C++ Compatibility" is like entering into a contract with the user that they can now safely expect the code to work with C++. Well, I have bad news for you, buddy - it takes a LOT more than wrapping everything in extern "C" to make your code "compatible with C++"!!! If you are going to include that in your code, you had better know what you are doing and your code had better work (as in actually being tested) with a real C++ compiler. If it does not, I am going to call out the madness you perpetrated by adding something which only has value when you mix C and C++, while your code cannot be used with C++ at all in the first place - because you are probably doing a host of things which are incompatible with C++! In short, as they say - your header file comment is writing cheques your programming skills cannot cash.

Usually a conversation about this starts with me asking the perpetrator to explain what "that thing" actually does, and when you actually need it to use the code with a C++ compiler, so let's start there. 
I can count on one hand the people who have been able to explain to me correctly how this works.

Extern "C" - What it does and what it is meant for

This construct is used to tell a C++ compiler that it should set a property called language linkage, which affects mangling of names as well as possible calling conventions. It does not guarantee that you can link to C code compiled with a different compiler. This is a good point to quote the C++14 specification, section 7.5 on Language linkage. That said, if you are e.g. using GCC for your C as well as your C++ compiler, you will have a very good chance of being lucky enough that the implementation-defined linkage will be compatible!

A C++ compiler will use "name mangling" to ensure that symbols with identical names can be distinguished from each other, by passing additional semantic information to the linker. This becomes important when you use e.g. function overloading, or namespaces. This mangling of names is a feature of C++ (which allows duplicate names across the program) but does NOT exist in C (which does not allow duplicate symbols to be linked at all). When you place an extern "C" wrapper around the declaration of a symbol, you are telling the C++ compiler to disable this name mangling feature and also to alter its calling convention when calling a function to be more likely to link with C code, although calling conventions are a topic all of their own. As you may have deduced by now, there is a reason that this is not called "extern C++"! This is required to make your C++ calls match the names and conventions in the C object files, which do not contain mangled names. If you ARE going to place a comment next to your faux pas, at least get it right and say "For compatibility with C naming and calling conventions" instead of claiming incorrectly that this is required for C++ compatibility somehow! 
There is one very specific use case, one which we used to encounter all the time in what would now be called "the olden days" of Windows 3.1, when everything was shared through dynamic link libraries (.dll files). It turns out that in order to use a dll which was written in C++ from your C program, you had to wrap the publicly exported function declarations in extern "C". Yes, that is correct: this is used to make C++ code compatible with C, NOT THE OTHER WAY AROUND!

So when do I really need this then?

Let's take a small step back. If you want to use your C code with a C++ compiler, you can simply compile it all with the C++ compiler! The compiler will resolve the names just fine after mangling them all, and since you are properly using namespaces with your C++ code, this will reduce the chances of name collisions and you will be all the happier for doing the right thing. No need for disabling name mangling there. If you do not need this to get your code to compile with a C++ compiler, then when DO you need it? Ahhh, now you are beginning to see why this has become a pet peeve of mine ... but the plot still has some thickening to do before we are done...

Pretty much the only case where it makes sense to do this is when you are going to compile some of your code with a C compiler, then compile some other code with a C++ compiler, and in the end take the object files from the two compilers and link them together, probably using the C++ linker. Of course, when you feed your .c files and your .cpp files to a typical C++ compiler, it will actually run a C compiler on the .c files and the C++ compiler on the .cpp files by default, and mangling will be an issue, and you will exclaim "you see! He was wrong!" - but not so fast ... it is simple enough to tell the compiler to compile all files with the C++ compiler, and this remains the best way to use C libraries in source form with your C++ code. 
If you are going to compile the C code into a binary library and link it in (which sometimes happens with commercial 3rd party libraries - like .dll's - DUH!), then there is a case for you to do this, but most likely this does not apply to you at all, as you will have the source available all the time and you are working on an embedded system where you are making a single binary. To help ease the world of hurt you are probably in for, you should read up on how to tell your compiler to use a particular calling convention which has a chance of being compatible with the linker. If you are using GCC you can start here.

To be clear: if you add extern "C" to your C code and then compile it into an object file to be linked with your C++ program, the extern "C" qualifier is entirely ignored by the C compiler. Yes, that is right: all the way to producing the object file this has no effect whatsoever. It is only when you are producing calls to the C code from the C++ code that the C++ code is altered to match the C naming and calling conventions, so this means that the C++ code is altered to be compatible with C.

In the end

There is a hell of a lot of things you will need to consider if you want to mix C and C++. I promise that I will write another blog on that next time around if I get anybody asking for it in the comments. Adding extern "C" to your code is NOT required to make it magically "compatible with C++". In order to do that you need to heed with extreme caution the long list of incompatibilities between C and C++, and you will probably have to be much more specific than just stating C and C++ - probably something like C89 and C++11, if I had to wager a guess at what will be most relevant. And remember the reasons why it is almost always just plain stupid to use extern "C" in your C headers. It does not do what you think it does. It especially does not do what your comment claims it does - it does not provide C++ compatibility! 
Don't let your comments write cheques your code cannot cash!

1. Before you even think of adding that, make sure you KNOW AND FOLLOW ALL the C++ compatibility rules first.
2. For heaven's sake, TEST your code with a C++ compiler!
3. If you want to use a "C" library with your C++ code, simply compile all the code with your C++ compiler. QED - no mess, no fuss!
4. If it is not possible to do 3 above, then compile the C code normally (without that nonsense) and place extern "C" around the #include of the C library only (example below). After all, this is for providing C linkage compatibility to C++-compiled code!
5. If you are producing a binary library using C to be used/linked with a C++ compiler, then please save us all and just compile the library with both C and C++ and provide 2 binaries!
6. If all of the above fail, because you really just hit that one-in-a-million case where you think you need this, then for Pete's sake educate yourself before you attempt something that hard; hopefully in the process you will realize that it is just a bad idea after all!

Now please just STOP IT! I feel very much like pulling a Jeff Atwood and saying, after all that, only a moron would use extern "C" in their C headers (of course he was talking about using tabs).

Orunmila

Oh - I almost forgot - the reasonable way of using extern "C" looks like this:

#include <string>  // Yes <string>, not <string.h> - this is C++ !
#include <cstdio>  // Use the C++ versions of the standard C headers

// Include C libraries
extern "C" {
#include "clib1.h"
#include "clib2.h"
}

using namespace std; // Because the cuticles on my typing finger hurt if I have to type out "std::" all the time!

// This is a proper C++ class because I am too patrician to use "C" like that peasant library writer!
class myClass {
private:
    int myPrivateInt;
    ...
    ... 
Appeals to Authority Dan Saks CppCon talk entitled “extern c: Talking to C Programmers about C++” A very good answer on Stackoverflow Some decent answers on this Stackoverflow question Fairly OK post on geeksforgeeks The dynamic loading use case with examples Note: That first one is a bit of a red herring as it does not explain extern C at all - nevertheless it is a great talk which all embedded programmers can benefit from 🙂
7. Whenever I start a new project I always start off reaching for a simple while(1) "superloop" architecture (https://en.wikibooks.org/wiki/Embedded_Systems/Super_Loop_Architecture). This works well for the basics, but more often than not I quickly come up short and end up looking to employ a timer to get some kind of scheduling going. MCC makes this pretty easy and convenient to set up. It contains a library called "Foundation Services" which has 2 different timer implementations, TIMEOUT and RTCOUNTER. These two library modules have pretty much the same interface, but they are implemented very differently under the hood. For my little "Operating System" I am going to prefer the RTCOUNTER version, as keeping time accurately is more important to me than latency. The TIMEOUT module is capable of providing low-latency reaction times whenever a timer expires, by adjusting the timer overflow point so that an interrupt will occur right when the next timer expires. This allows you to use the ISR to call the action you want directly and immediately from the interrupt context. Nice as that may be in some cases, it always increases the complexity, the code size and the overall cost of the system. In our case RTCOUNTER is more than good enough, so we will stick with that.

RTCOUNTER Operation

First, a little bit more about RTCOUNTER. In short, RTCOUNTER keeps track of a list of running timers. Whenever you call the "check" function it will compare the expiry time of the next timer and call the task function for that timer if it has expired. It achieves this by using a single hardware timer operating in "Free Running" mode. This means the hardware timer is never "re-loaded" by the code; it simply overflows back to its starting value naturally, and every time this happens the module counts another overflow in an overflow counter. The count of the timer is made up of a combination of the actual hardware timer and the overflow counter. 
By "hiding" or abstracting the real size of the hardware timer like this, the module can easily be switched over to use any of the PIC timers, regardless of whether they count up or down or how many bits they implement in hardware.

Mode                     | 32-bit timer value (high part)  | (low part)
General                  | (32-x) bits of g_rtcounterH     | x-bit hardware timer
Using TMR0 in 8-bit mode | 24 bits of g_rtcounterH         | TMR0 (8-bit)
Using TMR1               | 16 bits of g_rtcounterH         | TMR1H (8-bit), TMR1L (8-bit)

RTCOUNTER is compatible with all the PIC timers, and you can switch it to a different timer later without modifying your application code, which is nice. Since all that happens when the timer overflows is updating the counter, the timer ISR is as short and simple as possible:

// We only increment the overflow counter and clear the flag on every interrupt
void rtcount_isr(void)
{
    g_rtcounterH++;
    PIR4bits.TMR1IF = 0;
}

When we run the "check" function it will construct the 32-bit time value by combining the hardware timer and the overflow counter (g_rtcounterH). It will then compare this value to the expiry time of the next timer in the list to expire. By keeping the list of timers sorted by expiry time, it saves time during the checking (which happens often) by doing the sorting work during creation of the timer (which happens infrequently).

How to use it

Using it is fairly straightforward:
1. Create a callback function which returns the "re-schedule" time for the timer.
2. Allocate memory for your timer/task and tie it to your callback function.
3. Create the timer (which starts it), specifying how long before it will first expire.
4. Regularly call the check function to check if the next timer has expired, and call its callback if it has. 
In C the whole program may look something like this example:

#include "mcc_generated_files/mcc.h"

int32_t ledFlasher(void* p);

rtcountStruct_t myTimer = {ledFlasher};

void main(void)
{
    SYSTEM_Initialize();
    INTERRUPT_GlobalInterruptEnable();
    INTERRUPT_PeripheralInterruptEnable();

    rtcount_create(&myTimer, 1000); // Create a new timer using the memory at &myTimer

    // This is my main Scheduler or OS loop, all tasks are executed from here from now on
    while (1)
    {
        // Check if the next timer has expired, call its callback from here if it did
        rtcount_callNextCallback();
    }
}

int32_t ledFlasher(void* p)
{
    LATAbits.RA0 ^= 1; // Toggle our pin

    return 1000; // When we are done we want to restart this timer 1000 ticks later, return 0 to stop
}

Example with Multiple Timers/Tasks

Ok, admittedly, blinking an LED with a timer is not rocket science and not really impressive, so let's step it up and show how we can use this concept to write an application which is more event-driven than imperative.

NOTE: If you have not seen it yet, I recommend reading Martin Fowler's article on Event Sourcing, and how this design pattern reduces the probability of errors in your system, on his website here.

By splitting our program into tasks (or modules) which each perform a specific action and work independently of other tasks, we can generate code modules which are completely independent and re-usable quite easily. Independent and re-usable (or mobile, as Uncle Bob says) does not only mean that the code is maintainable; it also means that we can test and debug each task by itself, and if we do this well it will make the code much less fragile. Code is "fragile" when you are fixing something in one place and something seemingly unrelated breaks elsewhere ... that will be much less likely to happen.

For my example I am going to construct some typical tasks which need to be done in an embedded system. To accomplish this we will:
- Create a task function for each of these by creating a timer for it. 
- Control/set the amount of CPU time afforded to each task by controlling how often the timer times out.
- Communicate between tasks only through a small number of shared variables (this is best done using Message Queues - we will post about those in a later blog some time).

Let's go ahead and construct our system. Here is the big picture view.

This system has 6 tasks being managed by the Scheduler/OS for us:
1. Sampling the ADC to check the battery level. This has to happen every 5 seconds.
2. Process keys; we are looking at a button which needs to be de-bounced (100 ms).
3. Process the serial port for any incoming messages. The port is on interrupt, baud is 9600. Our buffer is 16 bytes, so we want to check it every 10 ms to ensure we do not lose data.
4. Update the system LCD. We only update the LCD when the data has changed; we want to check for a change every 100 ms.
5. Update LEDs. We want this to happen every 500 ms.
6. Drive outputs. Based on our secret sauce we will decide when to toggle some pins; we do this every 1 s.

These tasks will work together, or co-operate, by keeping to the promise never to run for a long time (let's agree 10 ms is a long time; tasks taking longer than that need to be broken into smaller steps). This arrangement is called Co-operative Multitasking. This is a well-known mechanism of multi-tasking on a microcontroller, and has been used in systems like "Windows 3.1" and "Windows 95" as well as "Classic Mac OS" in the past.

By using the Scheduler and the event-driven paradigm here, we can implement and test each of these subsystems independently. Even when we have it all put together, we can easily replace one of these subsystems with a "Test" version of it and use that to generate test conditions for us, to ensure everything will work correctly under typical operating conditions. We can "disable" any part of the system by simply commenting out the "create" function for that timer and it will not run. 
We can also adjust how often things happen, or adjust priorities, by modifying the task time values. As before, we first allocate some memory to store all of our tasks. We will initialize each task with a pointer to the callback function used to perform this task, as before. The main program ends up looking something like this:

void main(void)
{
    SYSTEM_Initialize();
    INTERRUPT_GlobalInterruptEnable();
    INTERRUPT_PeripheralInterruptEnable();

    rtcount_create(&adcTaskTimer, 5000);
    rtcount_create(&keyTaskTimer, 100);
    rtcount_create(&serialTaskTimer, 10);
    rtcount_create(&lcdTaskTimer, 100);
    rtcount_create(&ledTaskTimer, 500);
    rtcount_create(&outputTaskTimer, 1000);

    // This is my main Scheduler or OS loop, all tasks are executed from events
    while (1)
    {
        // Check if the next timer has expired, call its callback from here if it did
        rtcount_callNextCallback();
    }
}

As always, the full project including the task functions and the timer variable declarations can be downloaded. The skeleton of this program, which does initialize the other peripherals but runs only the timers, compiles to just 703 words of code, or 8.6% on this device, and it runs all 6 program tasks using a single hardware timer.
  8. They Write the Right Stuff - what we can learn from the way the Space Shuttle software was developed

As a youngster I watched the movie "The Right Stuff" about the Mercury Seven. The film deified the astronaut in my mind as some kind of superhero. Many years later I read and absolutely loved an article written by Charles Fishman in Fast Company in 1996, entitled "They Write the Right Stuff", which did something similar for the developers who wrote the code that put those guys up there.

Read the article here: https://www.fastcompany.com/28121/they-write-right-stuff

I find myself going back to this article time and again when confronted by the question of how many bugs are acceptable in our code, or even just of what is possible in terms of quality. The article explores the code which ran the Space Shuttle and the processes followed by the team responsible for that code. This team is one of very few certified to CMM level 5, and the quality of the code is achieved not through intelligence or heroic effort, but by a culture of quality ingrained in the processes they have created.

Here are the important numbers for the Space Shuttle code:

Lines of code: 420,000
Known bugs per version: 1
Total development cost: $200,000,000
Cost per line of code (1975 value): $476
Cost per line of code (2019 inflation-adjusted value): $2,223
Development team size: ±100
Average productivity (lines/developer/workday): 8.04

The moral of the story is that the quality everyone desires is indeed achievable, but we tend to severely underestimate the cost that goes along with this level of quality. I have used the numbers from this article countless times in arguments about the trade-off between quality and cost on my projects. The most important lesson of this case study is that quality depends primarily on the development processes the team follows and the culture of quality to which they adhere. 
As Fishman points out, in software development projects most people focus on attempting to test quality into the software, and of course, as Steve McConnell pointed out, this is "like weighing yourself more often" in an attempt to lose weight. Another lesson is that the process inherently accepts that people will make mistakes. Quality is not ensured by people trying really hard to avoid mistakes; instead, the process accepts that mistakes will be made and builds the detection and correction of those mistakes into the process itself. This means that if something slips through, it is inappropriate to blame any individual; the only sensible thing to do is to acknowledge a gap in the process. So when a bug does make it out into the wild, the team focuses on how to fix the process instead of trying to lay blame on the person who was responsible for the mistake.

Overall this article is a great story which is chock-full of lessons to learn and pass on.

References

The full article is archived here: https://www.fastcompany.com/28121/they-write-right-stuff
NASA has a great piece published on the history of the software: https://history.nasa.gov/computers/Ch4-5.html
A fantastic read: the interview with Tony Macina published in Communications of the ACM in 1984 - http://klabs.org/DEI/Processor/shuttle/shuttle_primary_computer_system.pdf
  9. Real Programmers don't use Pascal I mentioned this one briefly before in the story of Mel, but this series would not be complete without this one. Real Programmers Don't Use PASCAL Ed Post Graphic Software Systems P.O. Box 673 25117 S.W. Parkway Wilsonville, OR 97070 Copyright (c) 1982 (decvax | ucbvax | cbosg | pur-ee | lbl-unix)!teklabs!ogcvax!gss1144!evp Back in the good old days -- the "Golden Era" of computers, it was easy to separate the men from the boys (sometimes called "Real Men" and "Quiche Eaters" in the literature). During this period, the Real Men were the ones that understood computer programming, and the Quiche Eaters were the ones that didn't. A real computer programmer said things like "DO 10 I=1,10" and "ABEND" (they actually talked in capital letters, you understand), and the rest of the world said things like "computers are too complicated for me" and "I can't relate to computers -- they're so impersonal". (A previous work [1] points out that Real Men don't "relate" to anything, and aren't afraid of being impersonal.) But, as usual, times change. We are faced today with a world in which little old ladies can get computerized microwave ovens, 12 year old kids can blow Real Men out of the water playing Asteroids and Pac-Man, and anyone can buy and even understand their very own Personal Computer. The Real Programmer is in danger of becoming extinct, of being replaced by high-school students with TRASH-80s! There is a clear need to point out the differences between the typical high-school junior Pac-Man player and a Real Programmer. Understanding these differences will give these kids something to aspire to -- a role model, a Father Figure. It will also help employers of Real Programmers to realize why it would be a mistake to replace the Real Programmers on their staff with 12 year old Pac-Man players (at a considerable salary savings). 
LANGUAGES The easiest way to tell a Real Programmer from the crowd is by the programming language he (or she) uses. Real Programmers use FORTRAN. Quiche Eaters use PASCAL. Nicklaus Wirth, the designer of PASCAL, was once asked, "How do you pronounce your name?". He replied "You can either call me by name, pronouncing it 'Veert', or call me by value, 'Worth'." One can tell immediately from this comment that Nicklaus Wirth is a Quiche Eater. The only parameter passing mechanism endorsed by Real Programmers is call-by-value-return, as implemented in the IBM/370 FORTRAN G and H compilers. Real programmers don't need abstract concepts to get their jobs done: they are perfectly happy with a keypunch, a FORTRAN IV compiler, and a beer. Real Programmers do List Processing in FORTRAN. Real Programmers do String Manipulation in FORTRAN. Real Programmers do Accounting (if they do it at all) in FORTRAN. Real Programmers do Artificial Intelligence programs in FORTRAN. If you can't do it in FORTRAN, do it in assembly language. If you can't do it in assembly language, it isn't worth doing. STRUCTURED PROGRAMMING Computer science academicians have gotten into the "structured programming" rut over the past several years. They claim that programs are more easily understood if the programmer uses some special language constructs and techniques. They don't all agree on exactly which constructs, of course, and the examples they use to show their particular point of view invariably fit on a single page of some obscure journal or another -- clearly not enough of an example to convince anyone. When I got out of school, I thought I was the best programmer in the world. I could write an unbeatable tic-tac-toe program, use five different computer languages, and create 1000 line programs that WORKED. (Really!) Then I got out into the Real World. My first task in the Real World was to read and understand a 200,000 line FORTRAN program, then speed it up by a factor of two. 
Any Real Programmer will tell you that all the Structured Coding in the world won't help you solve a problem like that -- it takes actual talent. Some quick observations on Real Programmers and Structured Programming: Real Programmers aren't afraid to use GOTOs. Real Programmers can write five page long DO loops without getting confused. Real Programmers enjoy Arithmetic IF statements because they make the code more interesting. Real Programmers write self-modifying code, especially if it saves them 20 nanoseconds in the middle of a tight loop. Real Programmers don't need comments: the code is obvious. Since FORTRAN doesn't have a structured IF, REPEAT ... UNTIL, or CASE statement, Real Programmers don't have to worry about not using them. Besides, they can be simulated when necessary using assigned GOTOs. Data structures have also gotten a lot of press lately. Abstract Data Types, Structures, Pointers, Lists, and Strings have become popular in certain circles. Wirth (the above-mentioned Quiche Eater) actually wrote an entire book [2] contending that you could write a program based on data structures, instead of the other way around. As all Real Programmers know, the only useful data structure is the array. Strings, lists, structures, sets -- these are all special cases of arrays and can be treated that way just as easily without messing up your programming language with all sorts of complications. The worst thing about fancy data types is that you have to declare them, and Real Programming Languages, as we all know, have implicit typing based on the first letter of the (six character) variable name. OPERATING SYSTEMS What kind of operating system is used by a Real Programmer? CP/M? God forbid -- CP/M, after all, is basically a toy operating system. Even little old ladies and grade school students can understand and use CP/M. 
Unix is a lot more complicated of course -- the typical Unix hacker never can remember what the PRINT command is called this week -- but when it gets right down to it, Unix is a glorified video game. People don't do Serious Work on Unix systems: they send jokes around the world on USENET and write adventure games and research papers. No, your Real Programmer uses OS/370. A good programmer can find and understand the description of the IJK305I error he just got in his JCL manual. A great programmer can write JCL without referring to the manual at all. A truly outstanding programmer can find bugs buried in a 6 megabyte core dump without using a hex calculator. (I have actually seen this done.) OS/370 is a truly remarkable operating system. It's possible to destroy days of work with a single misplaced space, so alertness in the programming staff is encouraged. The best way to approach the system is through a keypunch. Some people claim there is a Time Sharing system that runs on OS/370, but after careful study I have come to the conclusion that they are mistaken. PROGRAMMING TOOLS What kind of tools does a Real Programmer use? In theory, a Real Programmer could run his programs by keying them into the front panel of the computer. Back in the days when computers had front panels, this was actually done occasionally. Your typical Real Programmer knew the entire bootstrap loader by memory in hex, and toggled it in whenever it got destroyed by his program. (Back then, memory was memory -- it didn't go away when the power went off. Today, memory either forgets things when you don't want it to, or remembers things long after they're better forgotten.) Legend has it that Seymour Cray, inventor of the Cray I supercomputer and most of Control Data's computers, actually toggled the first operating system for the CDC7600 in on the front panel from memory when it was first powered on. Seymour, needless to say, is a Real Programmer. 
One of my favorite Real Programmers was a systems programmer for Texas Instruments. One day, he got a long distance call from a user whose system had crashed in the middle of some important work. Jim was able to repair the damage over the phone, getting the user to toggle in disk I/O instructions at the front panel, repairing system tables in hex, reading register contents back over the phone. The moral of this story: while a Real Programmer usually includes a keypunch and lineprinter in his toolkit, he can get along with just a front panel and a telephone in emergencies. In some companies, text editing no longer consists of ten engineers standing in line to use an 029 keypunch. In fact, the building I work in doesn't contain a single keypunch. The Real Programmer in this situation has to do his work with a text editor program. Most systems supply several text editors to select from, and the Real Programmer must be careful to pick one that reflects his personal style. Many people believe that the best text editors in the world were written at Xerox Palo Alto Research Center for use on their Alto and Dorado computers [3]. Unfortunately, no Real Programmer would ever use a computer whose operating system is called SmallTalk, and would certainly not talk to the computer with a mouse. Some of the concepts in these Xerox editors have been incorporated into editors running on more reasonably named operating systems. EMACS and VI are probably the most well known of this class of editors. The problem with these editors is that Real Programmers consider "what you see is what you get" to be just as bad a concept in text editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous. TECO, to be precise. It has been observed that a TECO command sequence more closely resembles transmission line noise than readable text [4]. 
One of the more entertaining games to play with TECO is to type your name in as a command line and try to guess what it does. Just about any possible typing error while talking with TECO will probably destroy your program, or even worse -- introduce subtle and mysterious bugs in a once working subroutine. For this reason, Real Programmers are reluctant to actually edit a program that is close to working. They find it much easier to just patch the binary object code directly, using a wonderful program called SUPERZAP (or its equivalent on non-IBM machines). This works so well that many working programs on IBM systems bear no relation to the original FORTRAN code. In many cases, the original source code is no longer available. When it comes time to fix a program like this, no manager would even think of sending anything less than a Real Programmer to do the job -- no Quiche Eating structured programmer would even know where to start. This is called "job security". Some programming tools NOT used by Real Programmers: FORTRAN preprocessors like MORTRAN and RATFOR. The Cuisinarts of programming -- great for making Quiche. See comments above on structured programming. Source language debuggers. Real Programmers can read core dumps. Compilers with array bounds checking. They stifle creativity, destroy most of the interesting uses for EQUIVALENCE, and make it impossible to modify the operating system code with negative subscripts. Worst of all, bounds checking is inefficient. Source code maintainance systems. A Real Programmer keeps his code locked up in a card file, because it implies that its owner cannot leave his important programs unguarded [5]. THE REAL PROGRAMMER AT WORK Where does the typical Real Programmer work? What kind of programs are worthy of the efforts of so talented an individual? You can be sure that no real Programmer would be caught dead writing accounts-receivable programs in COBOL, or sorting mailing lists for People magazine. 
A Real Programmer wants tasks of earth-shaking importance (literally!): Real Programmers work for Los Alamos National Laboratory, writing atomic bomb simulations to run on Cray I supercomputers. Real Programmers work for the National Security Agency, decoding Russian transmissions. It was largely due to the efforts of thousands of Real Programmers working for NASA that our boys got to the moon and back before the cosmonauts. Real Programmers are at work for Boeing designing the operating systems for cruise missiles. Some of the most awesome Real Programmers of all work at the Jet Propulsion Laboratory in California. Many of them know the entire operating system of the Pioneer and Voyager spacecraft by heart. With a combination of large ground-based FORTRAN programs and small spacecraft-based assembly language programs, they can do incredible feats of navigation and improvisation, such as hitting ten-kilometer wide windows at Saturn after six years in space, and repairing or bypassing damaged sensor platforms, radios, and batteries. Allegedly, one Real Programmer managed to tuck a pattern-matching program into a few hundred bytes of unused memory in a Voyager spacecraft that searched for, located, and photographed a new moon of Jupiter. One plan for the upcoming Galileo spacecraft mission is to use a gravity assist trajectory past Mars on the way to Jupiter. This trajectory passes within 80 +/- 3 kilometers of the surface of Mars. Nobody is going to trust a PASCAL program (or PASCAL programmer) for navigation to these tolerances. As you can tell, many of the world's Real Programmers work for the U.S. Government, mainly the Defense Department. This is as it should be. Recently, however, a black cloud has formed on the Real Programmer horizon. 
It seems that some highly placed Quiche Eaters at the Defense Department decided that all Defense programs should be written in some grand unified language called "ADA" (registered trademark, DoD). For a while, it seemed that ADA was destined to become a language that went against all the precepts of Real Programming -- a language with structure, a language with data types, strong typing, and semicolons. In short, a language designed to cripple the creativity of the typical Real Programmer. Fortunately, the language adopted by DoD has enough interesting features to make it approachable: it's incredibly complex, includes methods for messing with the operating system and rearranging memory, and Edsgar Dijkstra doesn't like it [6]. (Dijkstra, as I'm sure you know, was the author of "GoTos Considered Harmful" -- a landmark work in programming methodology, applauded by Pascal Programmers and Quiche Eaters alike.) Besides, the determined Real Programmer can write FORTRAN programs in any language. The Real Programmer might compromise his principles and work on something slightly more trivial than the destruction of life as we know it, providing there's enough money in it. There are several Real Programmers building video games at Atari, for example. (But not playing them. A Real Programmer knows how to beat the machine every time: no challange in that.) Everyone working at LucasFilm is a Real Programmer. (It would be crazy to turn down the money of 50 million Star Wars fans.) The proportion of Real Programmers in Computer Graphics is somewhat lower than the norm, mostly because nobody has found a use for Computer Graphics yet. On the other hand, all Computer Graphics is done in FORTRAN, so there are a fair number of people doing Graphics in order to avoid having to write COBOL programs. THE REAL PROGRAMMER AT PLAY Generally, the Real Programmer plays the same way he works -- with computers. 
He is constantly amazed that his employer actually pays him to do what he would be doing for fun anyway, although he is careful not to express this opinion out loud. Occasionally, the Real Programmer does step out of the office for a breath of fresh air and a beer or two. Some tips on recognizing real programmers away from the computer room: At a party, the Real Programmers are the ones in the corner talking about operating system security and how to get around it. At a football game, the Real Programmer is the one comparing the plays against his simulations printed on 11 by 14 fanfold paper. At the beach, the Real Programmer is the one drawing flowcharts in the sand. A Real Programmer goes to a disco to watch the light show. At a funeral, the Real Programmer is the one saying "Poor George. And he almost had the sort routine working before the coronary." In a grocery store, the Real Programmer is the one who insists on running the cans past the laser checkout scanner himself, because he never could trust keypunch operators to get it right the first time. THE REAL PROGRAMMER'S NATURAL HABITAT What sort of environment does the Real Programmer function best in? This is an important question for the managers of Real Programmers. Considering the amount of money it costs to keep one on the staff, it's best to put him (or her) in an environment where he can get his work done. The typical Real Programmer lives in front of a computer terminal. Surrounding this terminal are: Listings of all programs the Real Programmer has ever worked on, piled in roughly chronological order on every flat surface in the office. Some half-dozen or so partly filled cups of cold coffee. Occasionally, there will be cigarette butts floating in the coffee. In some cases, the cups will contain Orange Crush. Unless he is very good, there will be copies of the OS JCL manual and the Principles of Operation open to some particularly interesting pages. 
Taped to the wall is a line-printer Snoopy calender for the year 1969. Strewn about the floor are several wrappers for peanut butter filled cheese bars (the type that are made stale at the bakery so they can't get any worse while waiting in the vending machine). Hiding in the top left-hand drawer of the desk is a stash of double stuff Oreos for special occasions. Underneath the Oreos is a flow-charting template, left there by the previous occupant of the office. (Real Programmers write programs, not documentation. Leave that to the maintainence people.) The Real Programmer is capable of working 30, 40, even 50 hours at a stretch, under intense pressure. In fact, he prefers it that way. Bad response time doesn't bother the Real Programmer -- it gives him a chance to catch a little sleep between compiles. If there is not enough schedule pressure on the Real Programmer, he tends to make things more challenging by working on some small but interesting part of the problem for the first nine weeks, then finishing the rest in the last week, in two or three 50-hour marathons. This not only inpresses his manager, who was despairing of ever getting the project done on time, but creates a convenient excuse for not doing the documentation. In general: No Real Programmer works 9 to 5. (Unless it's 9 in the evening to 5 in the morning.) Real Programmers don't wear neckties. Real Programmers don't wear high heeled shoes. Real Programmers arrive at work in time for lunch. [9] A Real Programmer might or might not know his wife's name. He does, however, know the entire ASCII (or EBCDIC) code table. Real Programmers don't know how to cook. Grocery stores aren't often open at 3 a.m., so they survive on Twinkies and coffee. THE FUTURE What of the future? It is a matter of some concern to Real Programmers that the latest generation of computer programmers are not being brought up with the same outlook on life as their elders. Many of them have never seen a computer with a front panel. 
Hardly anyone graduating from school these days can do hex arithmetic without a calculator. College graduates these days are soft -- protected from the realities of programming by source level debuggers, text editors that count parentheses, and user friendly operating systems. Worst of all, some of these alleged computer scientists manage to get degrees without ever learning FORTRAN! Are we destined to become an industry of Unix hackers and Pascal programmers? On the contrary. From my experience, I can only report that the future is bright for Real Programmers everywhere. Neither OS/370 nor FORTRAN show any signs of dying out, despite all the efforts of Pascal programmers the world over. Even more subtle tricks, like adding structured coding constructs to FORTRAN have failed. Oh sure, some computer vendors have come out with FORTRAN 77 compilers, but every one of them has a way of converting itself back into a FORTRAN 66 compiler at the drop of an option card -- to compile DO loops like God meant them to be. Even Unix might not be as bad on Real Programmers as it once was. The latest release of Unix has the potential of an operating system worthy of any Real Programmer. It has two different and subtly incompatible user interfaces, an arcane and complicated terminal driver, virtual memory. If you ignore the fact that it's structured, even C programming can be appreciated by the Real Programmer: after all, there's no type checking, variable names are seven (ten? eight?) characters long, and the added bonus of the Pointer data type is thrown in. It's like having the best parts of FORTRAN and assembly language in one place. (Not to mention some of the more creative uses for #define.) No, the future isn't all that bad. Why, in the past few years, the popular press has even commented on the bright new crop of computer nerds and hackers ([7] and [8]) leaving places like Stanford and M.I.T. for the Real World. 
From all evidence, the spirit of Real Programming lives on in these young men and women. As long as there are ill-defined goals, bizarre bugs, and unrealistic schedules, there will be Real Programmers willing to jump in and Solve The Problem, saving the documentation for later. Long live FORTRAN! ACKNOWLEGEMENT I would like to thank Jan E., Dave S., Rich G., Rich E. for their help in characterizing the Real Programmer, Heather B. for the illustration, Kathy E. for putting up with it, and atd!avsdS:mark for the initial inspriration. REFERENCES [1] Feirstein, B., Real Men Don't Eat Quiche, New York, Pocket Books, 1982. [2] Wirth, N., Algorithms + Datastructures = Programs, Prentice Hall, 1976. [3] Xerox PARC editors . . . [4] Finseth, C., Theory and Practice of Text Editors - or - a Cookbook for an EMACS, B.S. Thesis, MIT/LCS/TM-165, Massachusetts Institute of Technology, May 1980. [5] Weinberg, G., The Psychology of Computer Programming, New York, Van Nostrabd Reinhold, 1971, page 110. [6] Dijkstra, E., On the GREEN Language Submitted to the DoD, Sigplan notices, Volume 3, Number 10, October 1978. [7] Rose, Frank, Joy of Hacking, Science 82, Volume 3, Number 9, November 1982, pages 58 - 66. [8] The Hacker Papers, Psychology Today, August 1980. [9] Datamation, July, 1983, pp. 263-265.
  10. Software Engineering is complex. The essence of Fred Brooks's "No Silver Bullet" was that software takes a long time to develop because it takes humans a long time to deal with this complexity. Today we so often run into the situation where someone publishes a clever idea or solution, and others enthusiastically implement it in their project, only to be disappointed when it does not give them the expected benefit. The things that come to mind first are "modular code", Design Patterns and "Object Oriented". More often than not the root cause is a lack of deeper understanding of the essence of the solution, or of the original problem. This mistake of ritualistically copying what the "textbook" says has become so systemic in our industry that we needed a name for it, and that name is "Cargo Cult Programming".

The Wikipedia entry has a pretty neat description: "Cargo cult programming is a style of computer programming characterized by the ritual inclusion of code or program structures that serve no real purpose. Cargo cult programming is typically symptomatic of a programmer not understanding either a bug they were attempting to solve or the apparent solution (compare shotgun debugging, deep magic)."

In short, it is a reminder that you should not ritualistically follow any solution without truly understanding its essence. What are the advantages and disadvantages, what are the trade-offs involved, and why does it apply in your situation? For example, I recently had a debate with some colleagues about what constitutes "Industrial Strength Code", and the claim was made that "Industrial Quality Code" is code that uses state machines and in which all function calls are non-blocking. To me this sounds like a textbook case of cargo-culting the rituals while hoping for the quality to follow. It is, after all, possible to produce the highest quality code using any programming paradigm, provided it is applied at an appropriate place and time. 
But I digress... In order to appreciate the term, you really have to read the story of the Cargo Cult!

(This image was found at http://cargocultsoa11.files.wordpress.com/2010/09/cargo-cult2.jpg and was taken by Steve Axford.)

The Cargo Cult

“Reporter Paul Raffaele: "John [Frum] promised you much cargo more than 60 years ago, and none has come. … Why do you still believe in him?"
Chief Isaac Wan: "You Christians have been waiting 2,000 years for Jesus to return to earth and you haven’t given up hope."”

The earliest known cargo cult was the Tuka Movement in Fiji from 1885. During World War II, the Allies set up many temporary military bases in the Pacific, introducing isolated peoples to Western manufactured goods, or "cargo". While military personnel were stationed there, many islanders noticed these newcomers engaging in ritualized behaviors, like marching around with rifles on their shoulders in formation. After the Allies left, the source of cargo was removed and the people were left nearly as isolated as before. In their desire to keep receiving goods, various peoples throughout the Pacific introduced new religious rituals mimicking what they had seen the strangers do.

Melanesia

In one instance well studied by anthropologists, the Tanna Islanders of what is now Vanuatu interpreted the US military drill as religious ritual, leading them to conclude that these behaviors brought cargo to the islands. Hoping that the cargo would return if they duplicated these behaviors, they continued to maintain airstrips and replaced their facilities using native materials. These included remarkably detailed full-size replicas of airplanes made of wood, bark, and vines, a hut-like radio shack complete with headphones made of coconut halves, and attempts at recreating military uniforms and flags. 
Many Melanesians believed that Western manufactured goods were created by ancestral spirits, but that the occupiers had unfairly gained control of them (as the occupiers in question had no visible means of producing said goods themselves). The islanders expected that a messianic Western figure, John Frum, would return to deliver the cargo. No one knows who Frum is, nor is there physical evidence that he existed, but the islanders continue to ceremoniously honor him. After the war the US Navy attempted to talk the people out of it, but by that point it was too late and the religious movement had taken hold. The people of Tanna have subsequently been waiting over sixty years for the cargo to return. Then again, as mentioned in the quote above, Christians have been waiting more than two thousand years for their guy to come back. Modern cargo cult believers do exist, although most see John Frum and the like merely as manifestations of the same divinity worshiped in other parts of the world, and treat the trappings of the belief as a worship service rather than a magical collection of talismans.

(The full story at https://rationalwiki.org/wiki/Cargo_cult is reproduced here under CC-BY-SA.)

More about cargo cults here on Wikipedia: https://en.wikipedia.org/wiki/Cargo_cult
  11. What every embedded programmer should know about … Duff's Device

"A trick used three times becomes a standard technique" (George Polya)

C is a peculiar language, and knowing the internal details of how it works can be very powerful. At the same time, when we say things like that we of course first think of the Story of Mel, which we wrote about before, so be careful not to write unmaintainable code!

Probably the most famous "trick" in C is something called Duff's Device. It was discovered in 1983 by Tom Duff while he was working at Lucasfilm (since they were working on Indiana Jones and the Temple of Doom around that time, I am just going to believe that the booby traps were the inspiration for this one). The purpose of this "trick" is to unroll a loop in C, to get the performance increase of having the loop unrolled even when the number of times the loop runs is not always the same. We all know that unrolling a loop which runs a fixed number of times is trivially easy, but doing it for an arbitrary number of iterations is quite a challenge.

The device relies on two facts about C: firstly, that case labels fall through if there is no break, and secondly, that C allows safely jumping into the middle of a loop. Duff originally hit this problem when he had to simply copy integers. It is very similar to the problem we often have in embedded programs that need to copy data, or do anything mundane repeatedly in a loop. Duff's problem was sending multiple source bytes to the same destination, similar to sending a string on a UART, where "to" points to the send register location (of course his machine did not require waiting between bytes!). 
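Before we get to Duff's actual code, the first of those two C properties can be seen in isolation. Here is a minimal, hypothetical sketch (not Duff's code; the function name is made up): execution enters the switch at the matching label and, with no break statements, falls through every case below it.

```c
#include <stdint.h>

/* Hypothetical demo of case fall-through: entering at "case start"
   executes that case AND every case below it, since there are no
   break statements.                                                */
uint16_t fallthrough_demo(uint16_t start)
{
    uint16_t steps = 0;
    switch (start)        /* jump to the matching label...          */
    {
        case 3: steps++;  /* ...then fall through all the rest      */
        case 2: steps++;
        case 1: steps++;
    }                     /* start outside 1..3 matches no case     */
    return steps;
}
```

Calling fallthrough_demo(3) returns 3, while fallthrough_demo(1) returns 1 - exactly the "jump in part-way and finish the rest" behavior the device exploits.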
send (uint8_t* to, uint8_t* from, uint16_t count)
{
    do {                      /* count > 0 assumed */
        *to = *from++;
    } while (--count > 0);
}

The problem with this code is that the post-increment together with the assignment is a fairly cheap operation, but the decrement of the count and the comparison to zero, together with the jump, cost more than twice as much - and doing all of that once for every byte means we are spending a lot more time on the loop than on copying the data! Duff realized that he could change this ratio as desired, which would allow him to make a tradeoff at will between code space and speed. Of course the mileage you get will vary based on what you are doing in the loop and the capability of the processor to do the +7 and /8 math, but if you keep to powers of 2 the divide becomes a simple shift - or even a swap and a mask like you have for 16 - and this can yield impressive results. The solution looks like this:

send (uint8_t* to, uint8_t* from, uint16_t count)
{
    register uint16_t n = (count + 7) / 8;
    switch (count % 8)
    {
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
    }
}

Just to test that it works, let's go through 2 examples.

If we pass in a count of 1 the following happens:
- the variable n becomes (1+7)/8 = 1
- the switch jumps to (1%8) = "case 1"
- --n becomes 0, so the while loop terminates immediately and we are done

If we pass in a count of 12 the following happens:
- the variable n becomes (12+7)/8 = 2
- the switch jumps to (12%8) = "case 4"
- the unrolled loop executes, falling through 4 case statements
- --n becomes 1 and the while loop jumps back to "case 0"
- the loop executes 8 more times
- --n becomes 0, so the while loop terminates and we are done
As you can see, the copy was executed 4 + 8 = 12 times, but the loop-control overhead (decrement, compare and jump) ran only 2 times instead of 12, for around an 80% saving in loop costs. This concept is easily expanded to something like memcpy, which increments both the from and the to pointers. As always there is a link at the end with a zipped MPLAB-X project you can use with the simulator to play with Duff's Device! References Wikipedia page on Duff's Device Duff's device entry in The Jargon File Source Code
  12. Version 1.0.0

    96 downloads

    This zip file contains the project implementing Duff's Device using XC8 and a PIC16F18875, which goes with the Duff's Device blog. The project can be run entirely in the simulator in MPLAB-X, so there is no need to have the device to play with the code! See the blog for more details.
  13. Version 1.0.0

    116 downloads

    This file is the project which accompanies the blog post - A brief introduction to the pitfalls of Floating Point for programmers. This zip file contains an MPLAB-X project which is designed to run on a PIC16F18875, but is best used running in the simulator. The project should be configured for running on the simulator already. It was built using XC8 v2.05 and MPLAB-X v5.10. See the blog for more details.
  14. What every embedded programmer should know about … Floating Point Numbers "Obvious is the most dangerous word in mathematics." E. T. Bell "Floating Point Numbers" is one of those areas that seem obvious at first, but do not be fooled! Perhaps because when we learn algebra and arithmetic we are never bound in our precision quite the way a computer is, and when we encounter computers we are so in awe of how "smart" they are, it becomes hard to fathom how limited floating point numbers on computers truly are. The first time I encountered something like this it had me completely baffled:

void example1(void)
{
    volatile double dval1;
    volatile bool result;

    dval1 = 0.6;
    dval1 += 0.1;

    if (dval1 == (0.6 + 0.1))
        result = true;   // Even though these are "obviously" the same
    else
        result = false;  // This line is in fact what happens!
}

In this example we can see first hand one of the effects of floating point rounding errors. We are adding 0.1 to 0.6 in two slightly different ways, and the computer is telling us that the results are not the same! But how is this possible? Surely something must be broken here? Rest assured, there is nothing broken. This is simply how floating point numbers work, and as a programmer it is very important to know and understand why! In this particular case the compiler adds up 0.6 and 0.1 at compile time in a different way than the microcontroller does at runtime: 0.7 represented as a binary fraction does not round to the same value as 0.6 + 0.1. This is merely scratching the surface; there is an impressive number of pitfalls when using floating point numbers in computer programs, resulting in incorrect logic, inaccurate calculations, broken control systems and unexpected division-by-0 errors. Let's take a more detailed look at these problems, starting at the beginning. Binary Fractions For decades now computers have represented floating point numbers predominantly in a consistent way (IEEE 754-1985).
[See this fascinating paper on its history by one of its architects, Prof W. Kahan] In order to understand the IEEE 754 representation better, we first have to understand how fractional numbers are represented in binary. We are all familiar with the way binary numbers work. It should come as no surprise that going right of the point works just the same as with decimal numbers: you keep on dividing by 2, so in binary 0.1 = 1/2 and 0.01 = 1/4 and so on. Let's look at an example. The representation of 10.5625 in binary is 1010.1001. ( online calculator if you want to play )

Position:  8's  4's  2's  1's  .  1/2's  1/4's  1/8's  1/16's
Value:      1    0    1    0   .    1      0      0      1

To get the answer we add up all of the digits, so we have 1 x 8 + 1 x 2 + 1 x 1/2 + 1 x 1/16 = 8 + 2 + 0.5 + 0.0625 = 10.5625. That seems easy enough, but there are some peculiarities which are not obvious. One of these is that, just as 1/3 is a repeating fraction in decimal, an entirely different set of numbers ends up repeating in binary: 1/10, which is not a repeating decimal, is in fact a repeating binary fraction. The representation of the decimal number 0.1 in binary turns out to be 0.00011 0011 0011 0011 0011 0011 … When you do math with a repeating fraction like 1/3 or 1/6 or even 1/11 you know to take care, as the effects of rounding will impact your accuracy, but it feels counter-intuitive to apply the same caution to numbers like 0.1 or 0.6 or 0.7, which we have been trusting for accuracy all of our lives. IEEE 754 Standard Representation Next we need to take a closer look at the IEEE 754 standard so that we can understand what level of accuracy is reasonable to expect.

32-bit IEEE 754 floating point number:
| Sign (1 bit) | Exponent (8 bits) | Significand 1.xxxxxxxx (23 bits stored) |

The basic representation is fairly easy to understand.
(for more detail follow the links from the Wikipedia page ) Standard (or single precision) floating point numbers are represented using 32 bits:

- The number is represented as a 24-bit binary significand scaled by 2 raised to an exponent stored in 8 bits.
- The 24-bit significand is represented as a binary fraction with an implied leading 1, meaning we really only store 23 bits, since the MSB is always 1.
- 1 bit, the MSB of the word, is the sign bit.
- Some bit patterns have special meanings; e.g. since there is always an implied leading 1 in the significand, representing exactly 0 requires a special rule. There are a number of special bit patterns in the standard, e.g. Infinity, NaN and subnormal numbers.

NOTE: As you can imagine, all these special bit patterns add quite a lot of overhead to the code dealing with floating point numbers. The Solaris manual has a very good description of the special cases.

The decimal number 0.25 is, for example, represented using the IEEE 754 floating point standard as follows. Firstly, the sign bit is 0 as the number is positive. Next up is the significand. We want to represent ¼ - in binary that would be 0.01, but the rule is that we must have a leading 1 in front of the point, so we shift the point by 2 positions to give us 1.0 * 2^-2 = 0.25. (This rule ensures that we never have 2 different representations of the same number.) The exponent is constructed by taking the value and adding a bias of 127 to it. This has some nice side effects: we can represent anything from -127 to +128 without sacrificing another sign bit for the exponent, and we can also sort numbers by simply treating the entire word as a signed integer. In the example we need 2 to the power of (-2), so we take 127 + (-2), which gives 125.
In IEEE 754 that would look as follows:

| Sign (0) | Exponent (125) | Significand (1.0)            |
|    0     |   0111 1101    | 000 0000 0000 0000 0000 0000 |

If you would like to see this all done by hand, have a look at this nice youtube video which explains how it works in more detail with an example - Video Link

NOTE: You can check these numbers yourself using the attached project and the MPLAB-X debugger to inspect them. (If you right-click in the MPLAB-X watch window you can change the representation of the numbers to binary to see the individual bits.)

Next we will look at what happens with a number like 0.1, which we now know has a repeating binary fraction representation. Remember that 0.1 in binary is 0.00011001100110011..., which is encoded as 1.100110011001100… x 2^(-4) (because the rule demands a leading 1 in the fraction). Doing the same as above leads to:

| Sign (0) | Exponent (123) | Significand (1.100110011...) |
|    0     |   0111 1011    | 100 1100 1100 1100 1100 1101 |

The only tricky bit is the last digit: we are required to round to the nearest 23 binary digits, and since the next digit would have been a 1 we must round up, so the last stored digit becomes a 1. For more about binary fractional rounding see this post on the angular blog. This rounding process is one of the primary causes of inaccuracy when we work with floating point numbers! In this case 0.1 becomes exactly 0.100000001490116119384765625, which is very close to 0.1 but not quite exact. To make things worse, the IDE tries to be helpful by rounding the displayed number to 8 decimal digits, so looking at the number in the watch window will show you exactly 0.1! Precision Before we look at how things can break, let's first get the concept of precision under our belts. If we have 24 bits to represent a binary significand, that equates to 7.22472 decimal digits of precision (since log10(2^24) = 7.22472).
This means that the number is by design only accurate to the 7th digit; sometimes it can be accurate to the 8th digit, but further digits should be expected to be approximations only. The smallest normal 32-bit floating point number (by absolute value) is 1.0 x 2^(-126) = 1.17549435E-38, represented as follows:

| Sign (0) | Exponent (1) | Significand (1.0)            |
|    0     |  0000 0001   | 000 0000 0000 0000 0000 0000 |

NOTE: Numbers with exponent 0 represent "subnormal" numbers, which can be even smaller but are treated slightly differently; this is out of scope for this discussion. (See https://en.wikipedia.org/wiki/Denormal_number for more info.)

The absolute precision of floating point numbers varies wildly over their range, as a consequence of the point floating along with the 7 most significant digits. The largest numbers, in the range of 10^38, still have only 7 accurate digits, which means the distance between 2 subsequent representable numbers up there is on the order of 10^31 - so a single rounding error can put us off by an astronomically large amount! To put that in context, if we used 32-bit floating point numbers to store bank balances they would have only 7 digits of accuracy, which means any balance of $100,000.00 or more will not be 100% correct, and the bigger the balance gets the larger the error becomes. As the numbers get smaller the error floats down with the point, so for the smallest numbers the error shrinks to a minuscule 10^-45. These increments also do not match the increments we would have for decimal digits, which means converting between the two is imprecise and involves choosing the closest representation instead of the exact number (known as rounding). The following figure shows a decimal and a binary representation of numbers on the same scale. Note that the binary intervals divide only by powers of 2 while the decimal ones divide by 10's, which means the lines do not align (follow the blue line down, e.g.).
These number lines also show that there are lots of values in-between which cannot be represented, and we have to round them to the nearest representable number. Implications and Errors We now know that floating point numbers are imprecise, and we have already seen one example of how rounding of these numbers can cause exact comparisons to fail unexpectedly. This makes up our first problem.

Example PROBLEM 1 - Comparison of floating point numbers

The first typical problem we will look at is the example that we started with.

PROBLEM 1: Floating point numbers are imprecise, so we cannot reliably check them for equality.

SOLUTION 1: Never do exact comparisons for equality on floating point numbers; use inequalities like greater than or smaller than instead, and when doing this be sensitive to the fact that imprecision and rounding can cause unexpected behavior at the boundaries. If you are really forced to look for a specific value, consider looking for a small range, e.g. check if Temp > 99.0 AND Temp < 101.0 if you are trying to find 100.

void example1(void)
{
    // If you REALLY HAVE to compare these then do something like this.
    // The range you pick needs to be appropriate to cover the maximum
    // possible rounding error for the calculation.
    // If we added up only 1 pair the most this error can be is 7 decimal
    // digits of accuracy, so these values are appropriate
    if ((dval1 > 0.6999999) && (dval1 < 0.7000001))
    {
        result = true;   // This time we get the truth!
    }
    else
    {
        result = false;  // And this does NOT happen ...
    }
}

The attached source code project contains more examples of this.

Example PROBLEM 2 - Addition of numbers of different scale

The next problem happens often when we implement something like a PID controller, where we have to make thousands of additions over a time period, each intended to move the control variable by a small increment.
Because of the limitation of 7.22 digits of precision, we can run into trouble when our sampling rate is high and the value of the integral term requires more than 7.22 digits of precision. Example 2 shows this case nicely.

void example2(void)
{
    volatile double PI = 1E-5;   // the integral term
    volatile double controlValue = 300;

    controlValue += 1 * PI;
    controlValue += 1 * PI;
    controlValue += 1 * PI;
}

This failure case happens quite easily when we have e.g. a PWM duty cycle we are controlling, the current value is 300, and the sampling rate is very high, in this case 100kHz. We want to add a small increment to the controlValue at every sample, and we want these increments to add up to 1, so that the controlValue becomes 301 after 1 second of control. But since 1E-5 is 8 decimal places down from the value 300, the entire increment we are trying to add gets rounded away. If you run this code in the simulator you will see that controlValue remains the same no matter how many times you add this small value to it. When you change the controlValue to 30 it works, though: the first additions register, then precision is lost again as the value grows back towards 300. This means you can test code like this as much as you like and it will look like it is working, yet it can still fail in production! An example of where this happened for real, and people actually died, is the case of the Patriot missile software bug - yes, this really happened! (The full story here and more about it here )

PROBLEM 2: When doing addition or subtraction, rounding can cause systems to fail.

SOLUTION 2: Always take care that 7 digits of precision will be enough to do your math; if not, make sure you store the integral or similar part in a separate variable to compensate for the precision mismatch.
Imprecision and rounding of interim values can also cause all kinds of other problems, like giving different answers depending on the order in which numbers are added together, or interim values underflowing to 0 and incorrectly causing division by zero during multiplication or division. There is a nice example of this when calculating quadratic roots in the attached source code project.

Example PROBLEM 3 - Getting consistent absolute precision across the range

Our third problem is very typical in financial systems where, no matter how large the balance gets, the books must always balance to 1 cent. The floating point number system ensures that we always get consistent precision as a ratio or percentage of the number we deal with - for single precision numbers that is 7.22 digits, or roughly 0.00001 % relative accuracy - but the absolute precision gets worse as the numbers get bigger. If we want to be accurate to 1 mm or 1 cm or 1 cent this system will not work, but there are ways to get consistent absolute precision, and even to avoid the speed penalty of dealing with floating point numbers entirely!

PROBLEM 3: The precision of the smallest increment has to remain constant for my application, but floating point numbers break things due to lack of precision when my numbers get large.

SOLUTION 3: In cases like this floating point numbers are just not appropriate to store the number. Use a scaled integer to represent your number instead. For money this is as simple as storing cents as an integer instead of dollars. For distances it may mean storing millimeters as an integer instead of storing meters as a floating point, or for time storing milliseconds as an integer instead of seconds as a floating point.

Conclusion It is quite natural to think that how floating point numbers work is obvious, but as we have seen it is anything but.
This blog covered only the basics of how floating point numbers are handled on computers, why they are imprecise, and the 3 most common types of problems programmers have to deal with, along with typical solutions to these problems. There are a lot more things that can go wrong during calculations when numbers get close to zero (subnormals), floating point underflows to 0, you have to deal with infinity, etc. A lot of these are nicely described by David Goldberg in this paper, but they are much less likely to affect you unless you are doing serious number crunching. Please do read that paper if you ever have to build something using floating point calculations where accuracy is critical, like a Patriot missile system! References Why we needed the IEEE 754 standard by Prof W. Kahan An online Converter between Binary and Decimal The WikiPedia page on the standard Great description in the Solaris Manual Reproduction of the David Goldberg paper Youtube video of how to convert a decimal floating point number to IEEE754 format How rounding of binary fractions work About subnormal or denormal numbers Details on the Patriot Missile disaster - Link 1 Details on the Patriot Missile disaster - Link 2 Online IEEE754 calculator so you can check your compilers and hardware Source Code