Blog Entries posted by Orunmila

  1. Orunmila
    All too often I see programmers stumped trying to lay out the folder structure for their embedded C project.
    My best advice is that folder structure is not the real problem; it is just one symptom of dependency problems. If we fix the underlying dependencies, a pragmatic folder structure for your project will probably be obvious because the design is sound.
    In this blog I am going to first look briefly at Modularity in general, and then explore some program folder structures I see often, examining if and why they smell.
    On Modularity in general
    Writing modular code is not nearly as easy as it sounds.
    Trying it out for real we quickly discover that simply distributing bits of code across a number of files does not solve many of our problems. This is because modularity is about Architecture and Design and, as such, there is a lot more to it. To determine if we did a good job we need to first ask WHY. WHY exactly do we desire the code to be modular - or, to be more specific, what exactly are we trying to achieve by making the code modular?
    A lot can be said about modularity but to me, my goals are usually as follows:
    - Reduce working set complexity through divide and conquer.
    - Avoid duplication by re-using code in multiple projects (mobility).
    - Adam Smith-like division of labor. When code is broken down into team-sized modules we can construct and maintain it more efficiently. Teams can have areas of specialization and everyone does not have to understand the entire problem in order to contribute.
    In engineering, functional decomposition is a process of breaking a complex system into a number of smaller subsystems with clear distinguishable functions (responsibilities). The purpose of doing this is to apply the divide and conquer strategy to a complex problem. This is also often called Separation of Concerns.
    If we keep that in mind we can test for modularity during code review by using a couple of simple core concepts.
    - Separation: Is the boundary of every module clearly distinguishable? This requires every module to be in a single file, or else - if it spans multiple files - a single folder which encapsulates the contents of the module into a single entity.
    - Independent and Interchangeable: This implies that we can also use the module in another program with ease, something Robert C Martin calls Mobility. A good test is to imagine how you would manage the code using version control systems if the module you are evaluating had to reside in a different repository, have its own version number and its own independent documentation.
    - Individually testable: If a module is truly independent it can be used by itself in a test program without bringing a string of other modules along. Testing of the module should follow the Open-Closed principle, which means that we can create our tests without modifying the module itself in any way.
    - Reduction in working set complexity: If the division is not making the code easier to understand it is not effective. This means that modules should perform abstraction - hiding as much of the complexity as possible inside the module and exposing a simplified interface one layer of abstraction above the module's function.
    Software Architecture is in the end all about Abstraction and Encapsulation, which means that making your code modular is all about Architecture.
    By dividing your project into a number of smaller, more manageable problems, you can solve each of these individually. We should be able to give each of these to a different autonomous team that has its own release schedule, its own code repository and its own version number.
    Exploring some program file structures
    Now that we have established some ground rules for testing for modularity, let's look at some examples and see if we can figure out which ones are no good and which ones can work based on what we discussed above.
    Example 1: The Monolith

    I would hope that we can all agree that this fails the modularity test on all counts.
    If you have a single file like this there really is only one way to re-use any of the code, and that is to copy and paste it into your other project. For a couple of lines this could still work, but normally we want to avoid duplicating code in multiple projects as this means we have to maintain it in multiple places and if a bug was found in one copy there would be no way to tell how many times the code has been copied and where else we would have to go fix things.
    I think what contributes to the problem here is that little example projects or demo projects (think about that hello world application) often use this minimalistic structure in the interest of simplifying it down to the bare minimum. This makes sense if we want to really focus on a very specific concept as an example, but it sets a very poor example of how real projects should be structured.
    Example 2: The includible main

    In this project, main.c grew to the point where the decision was made to split it into multiple files, but the code was never redesigned, so the modules still have dependencies back to main. That is usually when we see questions like this on Stack Overflow.
    Of course main.c cannot call into module.c without including module.h, and the module is really the only candidate for including main.h, which means that you have what we call a circular dependency. This mutual dependency indicates that we do not actually have 2 modules at all. Instead, we have one module which has been packaged into 2 different files.

    Your program should depend on the modules it uses; it does not make sense for any of these modules to have a reverse dependency back to your program, and as such it does not make any sense to have something like main.h. Just place anything you are tempted to put in main.h at the top of main.c instead!
    If you do have definitions or types that you think can be used by more than one module then make this into a proper module, give it a proper name and let anything which uses this include this module as a proper dependency.
    Always remember that header files are the public interfaces into your C translation unit. Any good Object Oriented programming book will advise you to make as little as possible public in your class. You should never expose the insides of your module publicly if it does not form part of the public interface for the class. If your definitions, types or declarations are intended for internal use only they should not be in your public header file; placing them at the top of your C file is most likely best.
    A good example is device configuration bits. I like to place my configuration bit definitions in a file by itself called device_config.h, which contains only configuration bit settings for my project. This module is only used by main, but it is not called main.h. Instead, it has a single responsibility which is easy to deduce from the name of the file. To keep it single responsibility I will never put other things like global defines or types in this file. It is only for setting up the processor config bits and if I do another project where the settings should be the same (e.g. the bootloader for the project) then it is easy for me to re-use this single file.
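    As an illustration, a minimal device_config.h might look something like the sketch below. The exact configuration pragma names and values are device specific, so treat these particular settings as placeholder assumptions rather than recommendations.

    // device_config.h - processor configuration bits only (sketch; the settings shown are examples, not recommendations)
    #ifndef DEVICE_CONFIG_H
    #define DEVICE_CONFIG_H

    // XC8-style configuration pragmas - the available bits depend on the specific device
    #pragma config WDTE = OFF     // Watchdog Timer disabled
    #pragma config LVP = ON       // Low-voltage programming enabled
    #pragma config PWRTE = OFF    // Power-up Timer disabled

    #endif // DEVICE_CONFIG_H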
    In a typical project, you will want to have an application that depends on a number of libraries, something like this. Importantly we can describe the program as an application that uses WiFi, TempSensors, and TLS. There should not be any direct dependencies between modules. Any dependencies between modules should be classified as configuration which is injected by the application, and the code that ties all of this together should be part of the application, not the modules. It is important that we adhere to the Open-Closed principle here. We cannot inject dependencies by modifying the code in the libraries/modules that we use, it has to be done by changing the application. The moment we change the libraries to do this we have changed the library in an application-specific way and we will pay the price for that when we try to re-use it.
     
    It is always critical that the dependencies here run only in one direction, and that you can find all the code that makes up each module on your diagram in a single file or in a folder by itself to enable you to deal with the module as a whole.
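    To make the idea of injecting dependencies from the application concrete, here is a minimal sketch. The module and function names (tempsensor_init, i2c_write and so on) are hypothetical; the only point is that the application, not the library, does the wiring.

    /* tempsensor.h (library) - the module declares what it needs instead of including a specific bus driver */
    typedef int (*bus_write_fn)(unsigned char address, const unsigned char *data, int length);
    void tempsensor_init(bus_write_fn write);   /* the dependency is injected by the caller */

    /* main.c (application) - the application owns the wiring between modules */
    #include "i2c/i2c.h"                  /* hypothetical I2C driver module */
    #include "tempsensor/tempsensor.h"

    int main(void)
    {
        tempsensor_init(i2c_write);       /* inject the I2C driver into the sensor library */
        /* ... rest of the application ... */
        return 0;
    }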
    Example 3: The Aggregate or generic header file
    Projects often use an aggregate header file called something like "includes.h". This quickly leads to the pattern where every module depends on every other module, which is also known as Spaghetti Code. It becomes obvious if you look at the include graph or when you try to re-use a module from your project by itself, e.g. for a test. When any header file is changed you now have to re-test every module.
    This fails the test of having clearly distinguishable boundaries and clear and obvious dependencies between modules.
    In MCC there is a good (or should I say bad?) example of such an aggregate header file called mcc.h. I created a minimal project using MCC for the PIC16F18877 and only added the Accel3 click to the project as a working example for this case.
    The include graph generated using Doxygen looks as follows.
     

    There is no indication from this graph that the Accelerometer is the one using the I2C driver, and although main never calls to I2C itself it does look like that dependency exists. The noble intention here is of course to define a single external interface for MCC generated code, but it ends up tying all of the MCC code together into a single monolithic thing. This means my application does not depend on the Accelerometer, it now depends instead on a single monolithic thing called "everything inside of MCC", and as MCC grows this will become more and more painful to manage.
    If you remove the aggregate header then main no longer includes everything and the kitchen sink, and the include graph reduces to something much more useful as follows:

    This works better because now the abstractions are being used to simplify things effectively, and the dependency of the sensor on I2C is hidden from the application level. This means we could change the sensor from I2C to SPI without having any impact on the next layer up.
    Another version of this anti-pattern is called "One big Header File", where instead of making one header that includes all the others, we just place all the contents of all those headers into a single global file. This file is often called "common.h" or "defs.h" or "global.h" or something similar. Ward Cunningham has a good comprehensive list of the problems caused by this practice on his wiki.
    Example 4: The shared include folder

    This is a great example of Cargo Culting something that sometimes works in a library, and applying it everywhere without understanding the consequences. The mistake here is to divide the project into sources and headers instead of dividing it into modules. Sources and headers are hopefully not the division that comes to mind when I ask you to divide code into modules!
    In the context of a library, where the intention is very much to have an external interface (include) separated from its internal implementation (src), this segregation can make sense, but your program is not a library. 
    When you look at this structure you should ask how would this work in the following typical scenarios:
    - What if one of the libraries grows enough that we need to split it into multiple files? How will you then know which headers and sources belong to which library?
    - What if two libraries end up with identically named files? Typical examples of collisions are types.h, config.h, hal.h, callbacks.h or interface.h.
    - If I have to update a library to a later version, how will I know which files to replace if they are all mixed into the same folder?
    - How do I know which files are part of my project, and as such should be maintained locally, as opposed to which files are part of a library and should be maintained at the library project location which is used in many projects?
    This structure is bad because it breaks the core architectural principles of cohesion and encapsulation, which dictate that we keep related things together and encapsulate logical or functional groupings into clearly identifiable entities.
    If you do not get this right it leads to library files being copied into every project, and that means multiple copies of the same file in revision control. You also end up with files that have nothing to do with each other grouped together in the same folder.
    Example 5: A better way
    On the other hand, if you focus on cohesion and encapsulation you should end up with something more like this
     
    I am not saying this is the one true way to structure your project, but with this arrangement we can get the libraries from revision control and simply replace an entire folder when we do. It is also obvious which files are part of each library, and which ones belong to my project. We can see at a glance that this project has its own code and depends on 3 libraries. The structure embodies information about the project which helps us manage it, and this information is not duplicated, so we are not required to keep data from different places in sync.
    We can now include these libraries into this, or any other project, by simply telling Git to fetch the desired version of each of these folders from its own repository. This makes it easy to update the version of any particular library, and name collisions between libraries are no longer an issue.
    Additionally, as the library grows it will be easy to distinguish in my code which library I have a dependency on, and exactly which types.h file I am referring to when I refer to the header files as follows.
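    For example (with hypothetical library and file names), the include directives might read as follows, so that the owning library is part of the path:

    #include "wifi/include/types.h"          // the types.h that belongs to the WiFi library
    #include "tempsensors/include/types.h"   // the types.h that belongs to the TempSensors library
    #include "app_config.h"                  // a file that is part of this project itself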

    Conclusion
    Many different project directory structures could work for your project. We are in no way saying that this is "the one true structure". What we are saying is that when the time comes to commit your project to a structure, do remember the pros and cons of each of these examples we discussed. That way you will at least know the coming consequences of your decisions before you are committed to them.
    Robert C. Martin, aka Uncle Bob, wrote a great article back in 2000 describing the SOLID architectural principles. SOLID is focused on managing dependencies between software modules. Following these principles will help create an architecture that manages the dependencies between modules well.
    A SOLID design will naturally translate into a manageable folder structure for your embedded C project. 
     
     
  2. Orunmila
    I2C is such a widely used standard, yet it has caused me endless pain and suffering. In principle I2C is supposed to be simple and robust, a mechanism for "Inter-Integrated Circuit" communication. 
    I am hoping that this summary of the battle scars I have picked up from using I2C might just save you some time and suffering. I have found that despite a couple of typical initial hiccups it is generally not that hard to get I2C communication going, but making it robust and reliable can prove to be quite a challenge.
    Problem #1 - Address Specification
    I2C data is not represented as a free-running bit-stream, but rather as a specific packet format in which framing (start and stop conditions) encapsulates an address followed by a sequence of 8-bit bytes, each followed by an ACK or NAK bit.

    The first byte is supposed to be the address, but right off the bat you have to deal with the first special case. How to combine this 7-bit address with the R/W bit always causes confusion.
    There is no consistency in datasheets of I2C slave devices for specifying the device address, and even worse most vendors fail to specify which approach they use, leaving users to figure it out through trial and error. This has become bad enough that I would not recommend trying to implement I2C without an oscilloscope in hand to resolve these kinds of guessing games.
    Let's say the 7-bit device address was 0x76 (like the ever-popular Bosch Sensortec BME280).
    Sometimes this will be specified simply as 0x76, but the API in the software library, in order to save the work of shifting this value by 1 and masking in the R/W bit, will often require you to pass in 0xEC as the address (0x76 left-shifted by one).
    Sometimes the vendor will specify 0xEC as the "write" address and 0xED as the "read" address.

    To add insult to injury your bus analyzer or Saleae will typically show the first 8 bits as a hex value, so you will never see the actual 7-bit address as a hex number on the screen, leaving you to do the bit twiddling in your head while trying to make sense of the traces.
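    A short sketch of the arithmetic (using the BME280 address from above) shows how the same device ends up being described by three different numbers:

    unsigned char addr7      = 0x76;              // 7-bit slave address as given in the datasheet
    unsigned char addr_write = (0x76 << 1) | 0;   // 0xEC - the 8-bit "write address" (R/W bit = 0)
    unsigned char addr_read  = (0x76 << 1) | 1;   // 0xED - the 8-bit "read address"  (R/W bit = 1)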
    Problem #2 - Multiple Addresses
    To add to the confusion from above many devices (like the BME280) have the ability to present on more than one address, so the datasheet will specify that (in the case of the BME280) if you pull down the unused SDO pin on the device its address will be 0x76, but if you pull the pin up it will be 0x77.
    I have seen many users leave this "unused" pin floating in their layouts, causing the device to schizophrenically switch between the 2 addresses at runtime and its behavior to look erratic. This also, of course, doubles the number of possible addresses the device may end up responding to, and the specification of exactly 2 addresses fools a lot of people into thinking that the vendor is actually specifying a read and write address as described above. This all adds to the guessing game of what the actual device address may be.
    To add to the confusion most devices have internal registers and these also have their own addresses, so it is very easy to get confused about what should go in the address byte. It is not the register address, it is the slave address; the register address goes in the data byte of the "write" that you need to perform before the "read" in order to read a register at a specific address on the slave. Ok, if that is not confusing to you I salute you, sir!
    Problem #3 - 10-bit address mode
     
    As if there was not enough address confusion already, the limitation of only 127 possible device addresses led to the inclusion of an extension called 10-bit addressing.
    A 10-bit address transaction actually starts with a pre-defined 5-bit pattern, followed by the 2 most significant bits of the 10-bit address and then the R/W bit. This first byte is ACK-ed by all devices on the bus that use 10-bit addressing and match those 2 MSBs, after which the remaining 8 bits of the address are sent, followed by the real/full address ACK.
    So once again there is no standard way to represent the 10-bit address. Let's say the device has 10-bit address 0x123, how would this be specified now? The vendor could say 0x123 (and only 10 of the 12 bits implied are the 10-bit address), or they could include the prefix and specify it as 0xF223. Of course that number contains the R/W bit in the middle somewhere, so they may specify a "read" and a "write" address as 0xF223 and 0xF323, or they could right-shift the high-byte to show it as a normal 7-bit address, removing the R/W bit, and say it is 0x7123.
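    As a sketch of how the first two bytes work out for the 10-bit address 0x123 (write direction):

    unsigned short addr10      = 0x123;                          // 10-bit slave address
    unsigned char  first_byte  = 0xF0 | ((addr10 >> 7) & 0x06);  // 11110 + 2 MSBs + R/W=0 -> 0xF2 (0xF3 for a read)
    unsigned char  second_byte = addr10 & 0xFF;                  // remaining 8 address bits -> 0x23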
    I think you get the picture here, lots of room for confusion and we have not even received our first ACK yet!
    Problem #4 - Resetting during Debugging
    Since I2C is essentially transaction/packet based and it does not include timeouts in the specification (SMBUS does of course, but most slave sensors conform to I2C only) there is a real chance that you are going to reset your host processor (or bus master) in the middle of such a transaction. This happens as easily as re-programming the processor during development (which you will likely be doing a lot).
    The problem that tends to catch everybody at some point is that a hardware reset of your host processor is entirely invisible to the slave device which does not lose power when you toggle the master device's reset pin! The result is that the slave thinks that it is in the middle of an I2C transaction and awaits the expected number of master clock pulses to complete the current transaction, but the master thinks that it should be creating a start condition on the bus. This often leads to the slave holding the data line low and the master unable to generate a start condition on the bus.
    When this happens you will lose the ability to communicate with the I2C sensor/slave and start debugging your code to find out what has broken. In reality, there is nothing wrong with your code and simply removing and re-applying the power to the entire board will cause both the master and slave to be reset, leaving you able to communicate again.
    Of course, re-applying the power typically causes the device to start running, and if you want to debug you will have to attach the debugger which may very well leave you in a locked-up state once again.
    The only way around this is to keep your oscilloscope or Saleae connected all of the time, and whenever the behavior seems strange, stare very carefully at what is happening on the data line: is the address going out, is the start condition recognized, and is the slave responding as it should? If not, you are stuck and need to reset the slave device somehow.
    Problem #5 - Stuck I2C bus
    The situation described in #4 above is often referred to as a "stuck bus" condition. I have tried various strategies in the past to robustly recover from such a stuck bus condition programmatically, but they all come with a number of compromises. 
    Firstly slave devices are essentially allowed to clock-stretch indefinitely, and if a slave device state machine goes bonkers it is possible that a single slave device can hold the entire bus hostage indefinitely and the only thing you can possibly do is remove the power from all slave devices. This is not a very common failure mode but it is definitely possible and needs addressing for robust or critical systems.
    Often getting the bus "unstuck" is as simple as providing the slave device enough clocks to convince it that the last transaction is complete. Some slaves behave well and after clocking them 8 times and providing a NAK they will abort their current transaction.  I have seen slaves, especially I2C memories, where you have to supply more than 8 clocks to be certain that the transaction terminates, e.g. 32 clocks. I have also seen specialized slave devices that will ignore your NAK's and insist on sending even more data e.g. 128 or more bits before giving up on an interrupted transaction.
    The nasty part about getting an I2C bus "unstuck" is that you usually cannot use the I2C peripheral itself to perform this service. This means that typically you will need to disable the peripheral, change the pins to GPIO mode and bit-bang the clocks you need out of the port, after which you need to re-initialize the I2C peripheral and try the next transaction; if this fails then rinse and repeat the process until you succeed. This, of course, is expensive in terms of code space, especially on small 8-bit implementations.
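    The recovery sequence usually looks something like the sketch below. The GPIO and I2C functions (and pin names) are hypothetical placeholders for whatever your hardware abstraction provides; the point is the sequence of events.

    // Attempt to free a stuck I2C bus by bit-banging clock pulses (sketch - all function and pin names are hypothetical)
    void i2c_bus_recover(void)
    {
        i2c_peripheral_disable();            // release the pins from the I2C peripheral
        gpio_make_open_drain(SCL_PIN);
        gpio_make_open_drain(SDA_PIN);
        gpio_write(SDA_PIN, 1);              // release SDA so the slave can drive it

        for (int i = 0; (i < 32) && (gpio_read(SDA_PIN) == 0); i++)
        {
            gpio_write(SCL_PIN, 0);          // clock the slave until it releases the data line
            delay_us(5);                     // roughly 100 kHz
            gpio_write(SCL_PIN, 1);
            delay_us(5);
        }

        gpio_write(SDA_PIN, 0);              // generate a STOP: SDA goes low-to-high while SCL is high
        delay_us(5);
        gpio_write(SDA_PIN, 1);
        delay_us(5);

        i2c_peripheral_enable();             // hand the pins back to the peripheral and retry the transaction
    }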
    Problem #6 - Required Repeated Start conditions
    The R/W bit comes back to haunt us for this one. The presence of this bit implies that all transactions on I2C should be uni-directional, that is they must either read or write, but in practice, things are not that simple. Typically a sensor or memory will have a number of register locations inside of the device and you will have to "write" to the device to specify which location you wish to address, followed by "reading" from the device to get the data. The problem with a bus is that something may interrupt you between these two operations that form one larger transaction. 
    In order to overcome this limitation, I2C allows you to concatenate 2 I2C operations into a single transaction by omitting the stop condition between them. So you can do the write operation and instead of completing it with a stop condition on the bus you can follow with a second start condition and the latter half of the operation, terminating the whole thing with a stop condition only when you are done.
    This is called a "repeated start" condition and looks as follows (from the BME280 datasheet).

    It can often be quite a challenge to generate such a repeated start condition: many I2C drivers require you to specify read/write, a pointer and a number of bytes, and do not give you the option to omit the stop condition. Many slave devices will reset their state machines at a stop condition, so without a repeated start it is not possible to communicate with these devices.
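    In driver terms a register read with a repeated start typically looks something like this sketch; the i2c_* functions are hypothetical stand-ins for whatever your driver provides.

    // Read one register from a slave: a "write" of the register address, a repeated start, then the "read" (sketch)
    unsigned char i2c_read_register(unsigned char slave7, unsigned char reg)
    {
        unsigned char value;

        i2c_start();
        i2c_write_byte((slave7 << 1) | 0);   // slave address + write
        i2c_write_byte(reg);                 // the register address goes in the data byte
        i2c_repeated_start();                // no STOP here, or the slave may reset its state machine
        i2c_write_byte((slave7 << 1) | 1);   // slave address again, now + read
        value = i2c_read_byte(0);            // read one byte and respond with NAK (it is the last byte)
        i2c_stop();

        return value;
    }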
    Of course, I should also mention that the requirement to send the slave address twice for these transactions significantly reduces the throughput you can get through the bus.
    Problem #7 - What is Ack and Nak supposed to be?
    This brings us to the next problem. It is quite clear that the Address is ack-ed by the slave, but when you are reading data what is the exact semantics of the ack/nak? The BME280 datasheet is a bit unique in that it clearly distinguishes in that figure in #6 above whether the ack should be generated by the master or the slave (ACKS vs ACKM), but from the specification, it is not immediately clear. If I read data from a slave, who is supposed to provide the ack at the end of the data? Is this the master or the slave? 
    What would be the purpose of the master providing an ACK to the slave for data? Clearly the master is alive, as it is generating clocks, and the slave may be sending all 1's, which means it does not touch the bus at all. So what is the slave supposed to do if the master responds to a byte with a NAK? And how would a slave decide whether it should NAK your data when there is no checksum or CRC by which to determine whether it is correct? None of this is clearly specified anywhere.
    To add confusion I have seen people spend countless hours looking for the bug in their BME280 code which causes the last data byte to get a NAK! When you look on the bus analyzer or Oscilloscope you will be told that every byte was followed by an ACK except for the last one, where you will see a NAK. Most people interpret this NAK to be an indication that something is wrong, but no, look carefully at the image from the datasheet in section #6 above! Each byte received by the master is followed by an ACKM (ack-ed by the master) EXCEPT for the last byte, in which case the master will not ACK it, causing a NAK to precede the stop condition!
    To make this even harder, most I2C hardware peripherals will not allow you fine-grained control of whether the master will ACK or NAK. Very often the peripheral will just blithely ack every byte that it reads from the slave regardless.
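    Where the hardware does give you that control, the rule boils down to: the master ACKs every byte it reads except the last one. A short sketch, again with hypothetical driver functions:

    // Master reads 'length' bytes: ACK everything except the final byte, which gets a NAK (sketch)
    void i2c_read_block(unsigned char *buffer, int length)
    {
        for (int i = 0; i < length; i++)
        {
            int is_last = (i == (length - 1));
            buffer[i] = i2c_read_byte(is_last ? 0 : 1);   // 1 = send ACK, 0 = send NAK
        }
    }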
    Problem #8 - Pull-up resistors and bus capacitance
    The I2C bus is designed to be driven only through open-drain connections pulling the bus down; it is pulled up by a pair of pull-up resistors (one on the clock line and one on the data line). I have seen many a young engineer struggle with unreliable I2C communication due to missing or incorrect pull-up resistors. Yes, it is possible to actually communicate even without the resistors due to parasitic pull-ups, whose effective resistance is much larger than required, meaning the bus is pulled up weakly enough to get it high-ish, which under some conditions can provoke an ack from a slave device.
    There is no clear specification of what the size of these pull-up resistors should be, and for good reason, but this causes a lot of uncertainty. The I2C specification does specify that the maximum bus capacitance should be 400pF. This is a pretty tricky requirement to meet if you have a large PCB with a number of devices on the bus and it is often overlooked, so it is typical to encounter boards where the capacitance is exceeding the official specification.
    In the end, the pull-up needs to be strong enough (that is small enough) to pull the bus to Vdd fast enough to communicate at the required bus speed (typically 100kHz or 400kHz). The higher the bus capacitance is the stronger you will have to pull up in order to bring the bus to Vdd in time. If you look at the Oscilloscope you will see that the bus goes low fairly quickly (pulled down strongly to ground) but goes up fairly slowly, something like this:
    As you can see in the trace to the right there are a number of things to consider. If your pull-ups are too large you will get a lot of interference as indicated by the red arrows in the trace where the clock line is coupling through onto the data line which has too high an impedance. This can often be alleviated with good layout techniques, but if you see this on your scope consider lowering the pull-up value to hold the bus more steady.
    If the rise times through the pull-up are too slow for the bus speed you are using you will have to either work on reducing the capacitance on the bus or pulling up harder through a smaller resistor. Of course, you cannot just tie the bus to Vdd in the extreme as it still needs to be pulled to 0 by the master and slaves.
    As a last consideration, the smaller the resistor is you use the more power you will consume while driving the bus.
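    To put rough numbers on this: for an RC charging curve the 30%-to-70% rise time is t = R x C x ln(0.7/0.3), or about 0.847 x R x C, so the largest usable pull-up for a given bus capacitance is roughly Rp(max) = t_rise / (0.847 x Cb). A back-of-the-envelope sketch:

    // Rough maximum pull-up resistance for a given allowed rise time and bus capacitance (sketch)
    // t_rise in seconds, cb in farads, result in ohms
    double i2c_max_pullup(double t_rise, double cb)
    {
        return t_rise / (0.847 * cb);
    }

    // Example: 400 pF bus with a 300 ns rise-time budget (400 kHz fast mode)  -> about 885 ohms
    // Example: 400 pF bus with a 1000 ns rise-time budget (100 kHz standard)  -> about 2.9 kilo-ohms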
    Problem #9 - Multi-master
    I have been asked many times to implement multi-master I2C. There are a large number of complications when you need multiple masters on an I2C bus and this should only be attempted by true experts. Arbitrating the bus when multiple masters are pulling it down simultaneously presents a number of race conditions that are extremely hard to robustly deal with.
    I would like to just point out one case here as illustration and leave it at that. 
    Typical schemes for multi-master will require the master to monitor the bus while it is emitting the address byte. When the master is not pulling the bus low, but it reads low this is an indication that another master is trying to emit an address at the same time and you as a master should then abort your transaction immediately, yielding the bus to the other master. A problem arises though when both masters are trying to read from the same slave device. When this happens it is possible that both addresses match exactly and that the 2 masters start their transactions in close proximity. 
    Due to the clock skew between the masters it is possible that, even though they are trying to read from different control registers on the slave, the slave will match only one of the 2 masters, while both masters think that they have the bus and that the slave is responding to their request.
    When this happens the one master will end up receiving incorrect data from the wrong address. Consider e.g. a BME280 where you may get the pressure reading instead of humidity, causing you to react incorrectly.
    Like I said there are many obscure ways multi-master can fail you, so beware when you go there.
    Problem #10 - Clock Stretching
    In the standard, slaves are allowed to stretch the clock by driving the clock line low after the master releases it. Clock-stretching slaves are a common cause of I2C busses becoming stuck, as the standard does not provide for timeouts. This is something where SMBUS has provided a large improvement over the basic I2C standard, although there can still be ambiguity around how long you really have to wait to ensure that all slaves have timed out. The idea with SMBUS is that you can safely mix it with non-SMBUS slaves, but this one aspect makes doing so unreliable.
    In critical systems you will, as a result, very often see each I2C slave device connected via its own set of pins, using I2C as a point-to-point communications channel instead of a bus, in order to isolate failure conditions to a single sensor.

     Problem #11 - SMBUS voltage levels
    In I2C the logic levels depend on the bus voltage: above 70% of the bus voltage is a 1 and below 30% is a 0. The problems here are numerous, resulting in different devices seeing a 0 or 1 at different levels. SMBUS devices do not use this mechanism but instead specify fixed thresholds at 0.8 V and 2.1 V. These levels are often not supported by the microcontroller you are using, leaving some room for misinterpretation, especially if you add the effects of bus capacitance and the pull-up resistors to the signal integrity.
    For more information about SMBUS and where it differs from the standard I2C specification take a look at this WikiPedia page.
    Problem #12 - NAK Polling
    NAK polling often comes into play when you are trying to read from or write to an I2C memory and the device is busy. These memory devices will use the NAK mechanism to signal the master that it has to wait and retry the operation in a short while.
    The problem here is that many hardware I2C peripherals simply ignore ACKs and NAKs altogether, or do not give you the required hooks to respond to these. Many vendors try to accelerate I2C operations by letting you pre-load a transaction for sending to the slave and doing all of the transmission in hardware using a state machine, but these implementations rarely have accommodations for retrying the byte if the slave were to NAK it.
    NAK-polling also makes it very hard to use DMA for speeding up I2C transmissions as once again you need to make a decision based on the Ack/Nak after every byte, and the hooks to make these decisions typically require an interrupt or callback at the end of every byte which causes huge overhead.
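    With a driver that does report the ACK/NAK result, acknowledge polling for an EEPROM write looks roughly like this sketch (the driver functions and their return convention are assumptions):

    // Wait for an EEPROM to finish its internal write cycle by polling until it ACKs its address (sketch)
    int eeprom_wait_ready(unsigned char slave7, int max_tries)
    {
        for (int i = 0; i < max_tries; i++)
        {
            i2c_start();
            int acked = i2c_write_byte((slave7 << 1) | 0);   // assumed to return 1 on ACK, 0 on NAK
            i2c_stop();

            if (acked)
                return 1;    // the device is ready for the next operation
        }
        return 0;            // the device never became ready - treat as an error
    }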
    Problem #13 - Bus Speeds
    When starting to bring up an I2C bus I often see engineers starting with one sensor and working their way through them one by one. This can lead to a common problem where the first sensor is capable of high-speed transmission, e.g. 1 MHz, but it takes only one sensor on the bus that is limited to 100 kHz to cause all kinds of intermittent failures.
    When you have more than 1 slave on the same bus, make sure that the bus is running at a speed that all the slaves can handle. This means that when you bring up the bus it is always a good idea to start things out at 100 kHz and only increase the speed once you have communication established with all the slaves on the bus.
    The more slaves you have on the bus the more likely you will be to have an increased bus capacitance and signal integrity problems.
    In conclusion
    I2C is quite a widely used standard. When I am given the choice between using I2C or something like SPI for communication with sensor devices I tend to prefer SPI for a number of reasons. It is possible to go much faster using SPI as the bus is driven hard to both 0 and 1, and the problems outlined above inevitably raise the complexity of I2C significantly and present a number of challenges to achieving a robust system. To be clear I am not saying I2C is not a robust protocol, just that it takes some real skill to use it in a truly robust way, and other protocols like SPI do not require the same effort to achieve a robust solution.
    So like the Plain White T's say, hate is a strong word, but I really really don't like I2C ...
    I am sure there are more ways I2C has bitten people, please share your additional problems in the comments below, or if you are struggling with I2C right now, feel free to post your problem and we will try to help you debug it!
     
     
  3. Orunmila
    Structures in the C Programming Language
    Structures in C are one of the most misunderstood concepts. We see a lot of questions about the use of structs, often simply about the syntax and portability. I want to explore both of these and look at some best practice use of structures in this post, as well as some lesser known facts. Covering it all will be pretty long so I will start off with the basics, the syntax and some examples, then I will move on to some more advanced stuff. If you are an expert who came here for some more advanced material please jump ahead using the links supplied.
    Throughout I will refer to the C99 ANSI C standard often, which can be downloaded from the link in the references. If you are not using a C99 compiler some things like designated initializers may not be available. I will try to point out where something is not available in older compilers that only support C89 (also known as C90). C99 is supported in XC8 from v2.0 onwards.
    Advanced topics handled lower down
    - Scope
    - Designated Initializers
    - Declaring Volatile and Const
    - Bit-Fields
    - Padding and Packing of structs and Alignment
    - Deep and Shallow copy of structures
    - Comparing Structs
    Basics
    A structure is a compound type in C which is known as an "aggregate type". Structures allow us to use sets of variables together like a single aggregate object. This allows us to pass groups of variables into functions, assign groups of variables to a destination location as a single statement, and so forth.
    Structures are also very useful when serializing or de-serializing data over communication ports. If you are receiving a complex packet of data it is often possible to define a structure specifying the layout of the variables e.g. the IP protocol header structure, which allows more natural access to the members of the structure.
    Lastly structures can be used to create register maps, where a structure is aligned with CPU registers in such a way that you can access the registers through the corresponding structure members.
    The C language has only 2 aggregate types, namely structures and arrays. A union is notably not considered an aggregate type as it can only have one member object (overlapping objects are not counted separately). [Section 6.2.5 "Types" of C99]
    Syntax
    The basic syntax for defining a structure follows this pattern.
    struct [structure tag] {
       member definition;
       member definition;
       ...
    } [one or more structure variables];

    As indicated by the square brackets both the structure tag (or name) and the structure variables are optional. This means that I can define a structure without giving it a name. You can also just define the layout of a structure without allocating any space for it at the same time.
    What is important to note here is that if you are going to use a structure type throughout your code the structure should be defined in a header file and the structure definition should then NOT include any variable definitions. If you do include the structure variable definition part in your header file this will result in a different variable with an identical name being created every time the header file is included! This kind of mistake is often masked by the fact that the compiler will co-locate these variables, but this kind of behavior can cause really hard to find bugs in your code, so never do that!
    Declare the layout of your structures in a header file and then create the instances of your variables in the C file they belong to. Use extern definitions if you want a variable to be accessible from multiple C files as usual.
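    A minimal sketch of that arrangement (the file and variable names are just illustrative):

    /* point.h - declares the layout only, no variable definitions here */
    struct point {
        int x;
        int y;
    };
    extern struct point gOrigin;     /* declaration: visible to every file that includes point.h */

    /* point.c - defines (allocates) the variable in exactly one C file */
    #include "point.h"
    struct point gOrigin = {0, 0};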
    Let's look at some examples.
    Example 1 - Declare an anonymous structure (no tag name) containing 2 integers, and create one instance of it. This means we allocate storage space in RAM for one instance of this structure (on the stack if it is declared inside a function).
    struct {
       int i;
       int j;
    } myVariableName;

    This structure type does not have a name, so it is an anonymous struct, but we can access the variables via the variable name which is supplied. The structure type may not have a name but the variable does. When you declare a struct like this it is not possible to declare a function which will accept this type of structure by name.
    Example 2 - Declare a type of structure which we will use later in our code. Do not allocate any space for it.
    struct myStruct {
       int i;
       int j;
    };

    If we declare a structure like this we can create instances or define variables of the struct type at a later stage as follows. (According to the standard "A declaration specifies the interpretation and attributes of a set of identifiers. A definition of an identifier is a declaration for that identifier that causes storage to be reserved for that object" - 6.7)
    struct myStruct myVariable1;
    struct myStruct myVariable2;

    Example 3 - Declare a type of structure and define a type for this struct.
    typedef struct myStruct {
       int i;
       int j;
    } myStruct_t;   // Not to be confused with a variable declaration!
                    // typedef changes the syntax here - myStruct_t is part of the typedef, NOT the struct definition.

    // This is of course equivalent to
    struct myStruct {
       int i;
       int j;
    };   // Now if you placed a name here it would allocate a variable

    typedef struct myStruct myStruct_t;

    The distinction here is a constant source of confusion for developers, and this is one of many reasons why using typedef with structs is NOT ADVISED. I have added in the references a link to some archived conversations which appeared on usenet back in 2002. In these messages Linus Torvalds explains much better than I can why it is generally a very bad idea to use typedef with every struct you declare, as has become the norm for so many programmers today. Don't be like them!
    In short, typedef is used to achieve type abstraction in C; this means that the owner of a library can at a later time change the underlying type without telling users about it, and everything will still work the same way. But if you are not using the typedef exactly for this purpose you end up abstracting, or hiding, something very important about the type. If you create a structure it is almost always better for the consumer to know that they are dealing with a structure, and as such it is not safe to do comparisons like == on the struct, and it is also not safe to copy the struct using = due to deep copy problems (described later on). By letting the users of your structs know explicitly that they are using structs, you will help them avoid a lot of really hard to track down bugs in the future. Listen to the experts!
    This all means that the BEST PRACTICE way to use structs is as follows,
    Example 4 - How to declare a structure, instantiate a variable of this type and pass it into a function. This is the BEST PRACTICE way.
    struct point {   // Declare a cartesian point data type
        int x;
        int y;
    };

    void pointProcessor(struct point p)   // Declare a function which takes struct point as parameter by value
    {
        int temp = p.x;
        ...   // and the rest
    }

    void main(void)
    {
        // local variables
        struct point myPoint = {3,2};   // Allocate a point variable and initialize it at declaration.

        pointProcessor(myPoint);
    }

    As you can see we declare the struct and it is clear that we are defining a new structure which represents a point. Because we are using the structure correctly it is not necessary to call this point_struct or point_t because when we use the structure later it will be accompanied by the struct keyword which will make its nature perfectly clear every time it is used.
    When we use the struct as a parameter to a function we explicitly state that this is a struct being passed, this acts as a caution to the developers who see this that deep/shallow copies may be a problem here and need to be considered when modifying the struct or copying it. We also explicitly state this when a variable is declared, because when we allocate storage is the best time to consider structure members that are arrays or pointers to characters or something similar which we will discuss later under deep/shallow copies and also comparisons and assignments.
    Note that this example passes the structure to the function "By Value", which means that a copy of the entire structure is made on the parameter stack and this copy is passed into the function. Changing the parameter inside of the function will therefore not affect the variable you are passing in; you will be changing only the temporary copy.
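    If the function is meant to modify the caller's structure, or the structure is large and copying it is wasteful, pass a pointer to it instead. A small sketch of that alternative, re-using the struct point from Example 4:

    void pointMover(struct point * p)   // Takes the point by pointer - no copy is made
    {
        p->x += 1;                      // modifies the caller's variable directly
    }

    void main(void)
    {
        struct point myPoint = {3, 2};
        pointMover(&myPoint);           // myPoint.x is now 4
    }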
    Example 5 - HOW NOT TO DO IT! You will see lots of examples on the web to do it this way, it is not best practice, please do not do it this way!
    // This is an example of how NOT to do it
    // This does the same as example 4 above, but doing it this way abstracts the type in a bad way
    // This is what Linus Torvalds warns us against!
    typedef struct point_tag {   // Declare a cartesian point data type
        int x;
        int y;
    } point_t;

    void pointProcessor(point_t p)
    {
        int temp = p.x;
        ...   // and the rest
    }

    void main(void)
    {
        // local variables
        point_t myPoint = {3,2};   // Allocate a point variable and initialize it at declaration.

        pointProcessor(myPoint);
    }

    Of course now the tag name of the struct has no purpose as the only thing we ever use it for is to declare yet another type with another name; this is a source of endless confusion to new C programmers as you can imagine! The mistake here is that the typedef is used to hide the nature of the variable.
    Initializers
    As you saw above it is possible to assign initial values to the members of a struct at the time of definition of your variable. There are some interesting rules related to initializer lists which are worth pointing out. The standard requires that initializers be applied in the order that they are supplied, and that all members for which no initializer is supplied shall be initialized to 0. This applies to all aggregate types. This is all covered in the standard section 6.7.8.
    I will show a couple of examples to clear up common misconceptions here. Descriptions are all in the comments.
    struct point {
        int x;
        int y;
    };

    void function(void)
    {
        int myArray1[5];           // This array has random values because there is no initializer
        int myArray2[5] = { 0 };   // Has all its members initialized to 0
        int myArray3[5] = { 5 };   // Has first element initialized to 5, all other elements to 0
        int myArray4[5] = { };     // Empty initializer list (a compiler extension before C23) - all members initialized to 0

        struct point p1;           // x and y both indeterminate (random) values
        struct point p2 = {1, 2};  // x = 1 and y = 2
        struct point p3 = { 1 };   // x = 1 and y = 0

        // Code follows here
    }

    These rules about initializers are important when you decide in which order to declare the members of your structures. We saw a great example of how user interfaces can be simplified by placing members to be initialized to 0 at the end of the list of structure members when we looked at the examples of how to use RTCOUNTER in another blog post.
    More details on Initializers such as designated initializers and variable length arrays, which were introduced in C99, are discussed in the advanced section below.
    Assignment
    Structures can be assigned to a target variable just the same as any other variable. The result is the same as if you used the assignment operator on each member of the structure individually. In fact one of the enhancements of the "Enhanced Midrange" core in all PIC16F1xxx devices is the capability to do shallow copies of structures faster through specialized instructions!
    struct point {   // Declare a cartesian point data type
        int x;
        int y;
    };

    void main(void)
    {
        struct point p1 = {4,2};   // p1 initialized through an initializer-list
        struct point p2 = p1;      // p2 is initialized through assignment
        // At this point p2.x is equal to p1.x and so is p2.y equal to p1.y

        struct point p3;
        p3 = p2;                   // And now all three points have the same value
    }

    Be careful though, if your structure contains external references such as pointers you can get into trouble as explained later under Deep and Shallow copy of structures.
    Basic Limitations
    Before we move on to advanced topics we should mention some limitations. As you may have suspected, there are limits to how much of each thing you can have in C. The C standard calls these limits Translation Limits. They are a requirement of the C standard specifying the minimum capabilities a compiler must have in order to call itself compliant with the standard. This ensures that your code will compile on all compliant compilers as long as you do not exceed these limits.
    The Translation Limits applicable to structures  are:
    - External identifiers must use at most 31 significant characters. This means structure names or members of structures should not exceed 31 unique characters.
    - At most 1023 members in a struct or union.
    - At most 63 levels of nested structure or union definitions in a single struct-declaration-list.
    Advanced Topics
    Scope
    Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. When you use typedef's however the type name only has scope after the type declaration is complete. This makes it tricky to define a structure which refers to itself when you use typedef's to define the type, something which is important to do if you want to construct something like a linked list.
    I regularly see people tripping themselves up with this because they are using the BAD way of using typedef's. Just one more reason not to do that!
    Here is an example.
    // Perfectly fine declaration which compiles as myList has scope inside the curly braces
    struct myList {
        struct myList* next;
    };

    // This DOES NOT COMPILE !
    // The reason is that myList_t only has scope after the curly brace when the type name is supplied.
    typedef struct myList {
        myList_t* next;
    } myList_t;

    As you can see above we can easily refer a member of the structure to a pointer of the structure itself when you stay away from typedef's, but how do you handle the more complex case of two separate structures referring to each other?
    In order to solve that one we have to make use of incomplete struct types.
    Below is an example of how this looks in practice.
    struct a;   // Incomplete declaration of a
    struct b;   // Incomplete declaration of b

    struct a {   // Completing the declaration of a with member pointing to still incomplete b
        struct b * myB;
    };

    struct b {   // Completing the declaration of b with member pointing to now complete a
        struct a * myA;
    };

    This is an interesting example from the standard on how scope is resolved.
    Designated Initializers (introduced in C99)
    Example 4 above used initializer-lists to initialize the members of our structure, but we were only able to omit members at the end, which limited us quite severely. If we could omit any member from the list, or rather include members by designation, we could supply the initializers we need and let the rest be set safely to 0. This was introduced in C99.
    This addition had a bigger impact on unions however. There is a rule for unions which states that initializer-lists shall be applied solely to the first member of the union. It is easy to see why this was necessary: since the structs which comprise a union do not all have the same members, it would be impossible to apply a list of constants to an arbitrary member of the union. In many cases this means that designated initializers are the only way that unions can be initialized consistently.
    Examples with structs.
    struct multi {
        int x;
        int y;
        int a;
        int b;
    };

    struct multi myVar = {.a = 5, .b = 3};   // Initialize the struct to { 0, 0, 5, 3 }

    Examples with a Union.
    struct point {
        int x;
        int y;
    };

    struct circle {
        struct point center;
        int radius;
    };

    struct line {
        struct point start;
        struct point end;
    };

    union shape {
        struct circle mCircle;
        struct line mLine;
    };

    void main(void)
    {
        volatile union shape shape1 = {.mLine = {{1,2}, {3,4}}};   // Initialize the union using the line member
        volatile union shape shape2 = {.mCircle = {{1,2}, 10}};    // Initialize the union using the circle member
        ...
    }

    This type of initialization of a union using the second member of the union was not possible before C99, which also means that if you are trying to port C99 code to a C89 compiler this will require you to write initializer functions which are functionally different, and your port may end up not working as expected.
    Initializers with designations can be combined with compound literals. Structure objects created using compound literals can be passed to functions without depending on member order. Here is an example.
    struct point {
        int x;
        int y;
    };

    // Passing 2 anonymous structs into a function without declaring local variables
    drawline( (struct point){.x=1, .y=1},
              (struct point){.y=3, .x=4});

    Volatile and Const Structure declarations
    When declaring structures it is often necessary for us to make the structure volatile, this is especially important if you are going to overlay the structure onto registers (a register map) of the microprocessor. It is important to understand what happens to the members of the structure in terms of volatility depending on how we declare it.
    This is best explained using the examples from the C99 standard.
    struct s {          // Struct declaration
        int i;
        const int ci;
    };

    // Definitions
    struct s s;
    const struct s cs;
    volatile struct s vs;

    // The various members have the types:
    s.i     // int
    s.ci    // const int
    cs.i    // const int
    cs.ci   // const int
    vs.i    // volatile int
    vs.ci   // volatile const int

    Bit Fields
    It is possible to include in the declaration of a structure how many bits each member should occupy. This is known as "Bit Fields". It can be tricky to write portable code using bit-fields if you are not aware of their limitations.
    Firstly the standard states that "A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type." Further to this it also states that "As specified in 6.7.2 above, if the actual type specifier used is int or a typedef-name defined as int, then it is implementation-defined whether the bit-field is signed or unsigned."
    This means effectively that unless you use _Bool or unsigned int your structure is not guaranteed to be portable to other compilers or platforms. The recommended way to declare portable and robust bitfields is as follows.
    struct bitFields {
        unsigned enable : 1;
        unsigned count  : 3;
        unsigned mode   : 4;
    };

    When you use any of the members in an expression they will be promoted to a full sized unsigned int during the expression evaluation. When assigning back to the members values will be truncated to the allocated size.
    It is possible to use anonymous bitfields to pad out your structure so you do not need to use dummy names in a struct if you build a register map with some unimplemented bits. That would look like this:
    struct bitFields {
        unsigned int enable : 1;
        unsigned            : 3;
        unsigned int mode   : 4;
    };

    This declares a structure which is at least 8 bits in size and has 3 padding bits between the members "enable" and "mode".
    The caveat here is that the standard does not specify how the bits have to be packed into the structure, and different systems do in fact pack bits in different orders (e.g. some may pack from the LSB while others pack from the MSB first). This means that you should not rely on specific bits in your struct ending up in specific positions. All you can rely on is that in 2 structs of the same type the bits will be packed in corresponding locations. When you are dealing with communication systems and sending structures containing bitfields over the wire you may get a nasty surprise if bits are in a different order on the receiver side. And this also brings us to the next possible inconsistency - packing.
    This means that for all the syntactic sugar offered by bitfields, it is still more portable to use shifting and masking. By doing so you can select exactly where each bit will be packed, and on most compilers this will result in the same amount of code as using bitfields.
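    For comparison, the same register layout handled with explicit shifting and masking might look like the sketch below, where the code itself pins down exactly where each field lives:

    #define CTRL_ENABLE_POS  0u
    #define CTRL_ENABLE_MSK  (0x01u << CTRL_ENABLE_POS)
    #define CTRL_MODE_POS    4u
    #define CTRL_MODE_MSK    (0x0Fu << CTRL_MODE_POS)

    void configure(void)
    {
        unsigned char ctrl = 0;

        ctrl |= CTRL_ENABLE_MSK;                                                  // set the enable bit
        ctrl = (ctrl & ~CTRL_MODE_MSK) | ((3u << CTRL_MODE_POS) & CTRL_MODE_MSK); // set mode = 3

        unsigned char mode = (ctrl & CTRL_MODE_MSK) >> CTRL_MODE_POS;             // read mode back out
    }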
    Padding, Packing and Alignment
    This is going to be less applicable on a PIC16, but if you write portable code or work with larger processors this becomes very important.
    Typically padding will happen when you declare a structure that has members which are smaller than the fastest addressable unit of the processor. The standard allows the compiler to place padding, or unused space, in between your structure members to give you the fastest access in exchange for using more RAM. This is called "Alignment". On embedded applications RAM is usually in short supply so this is an important consideration.
    You will see e.g. on a 32-bit processor that the size of structures will increment in multiples of 4. 
    The following example shows the definition of some structures and their sizes on a 32-bit processor (my i7 in this case running macOS). And yes it is a 64 bit machine but I am compiling for 32-bit here.
    // This struct will likely result in sizeof(struct iAmPadded) == 12
    struct iAmPadded {
        char c;
        int i;
        char c2;
    };

    // Re-ordering the members results in sizeof(struct iAmReordered) == 8 (on GCC on my i7 Mac),
    // or it could be 12 depending on the compiler used.
    struct iAmReordered {
        char c;
        char c2;
        int i;
    };

    Many compilers/linkers will have settings with regards to "Packing" which can either be set globally or per structure. Packing will instruct the compiler to avoid padding in between the members of a structure if possible and can save a lot of memory. It is also critical to understand packing and padding if you are making register overlays or constructing packets to be sent over communication ports.
    If you are using GCC packing is going to look like this:
// This struct on gcc on a 32-bit machine has sizeof(struct iAmPadded) == 6
struct __attribute__((__packed__)) iAmPadded {
    char c;
    int i;
    char c2;
};

// OR this has the same effect for GCC
#pragma pack(1)
struct iAmPadded {
    char c;
    int i;
    char c2;
};
If you are writing code on e.g. an AVR which uses GCC and you want to use the same library on your PIC32 or your Cortex-M0 32-bitter then you can instruct the compiler to pack your structures like this and save loads of RAM.
Note that taking the address of structure members may result in problems on architectures which are not byte-addressable such as a SPARC. Also it is not allowed to take the address of a bitfield inside of a structure.
    One last note on the use of the sizeof operator.  When applied to an operand that has structure or union type, the result is the total number of bytes in such an object, including internal and trailing padding.
    Deep and Shallow copy
    Another one of those areas where we see countless bugs. Making structures with standard integer and float types does not suffer from this problem, but when you start using pointers in your structures this can turn into a problem real fast.
Generally it is perfectly fine to create copies of structures by passing them into functions or using the assignment operator "=".
    Example
struct point {
    int a;
    int b;
};

void function(void)
{
    struct point point1 = {1,2};
    struct point point2;

    point2 = point1; // This will copy all the members of point1 into point2
}
Similarly when we call a function and pass in a struct, a copy of the structure will be made onto the parameter stack in the same way. When the structure however contains a pointer we must be careful because the process will copy the address stored in the pointer but not the data which the pointer is pointing to. When this happens you end up with 2 structures containing pointers pointing to the same data, which can cause some very strange behavior and hard to track down bugs. Such a copy, where only the pointers are copied, is called a "shallow copy" of the structure. The alternative is to allocate memory for the members being pointed to by the structure and create what is called a "deep copy" of the structure, which is the safe way to do it. We probably see this with strings more often than with any other type of pointer, e.g.
struct person {
    char* firstName;
    char* lastName;
};

// Function to read a person's name from the serial port
void getPerson(struct person* p);

void f(void)
{
    struct person myClient = {"John", "Doe"}; // The structure now points to the constant strings

    // Read the person data
    getPerson(&myClient);
}

// The intention of this function is to read 2 strings and assign the names of the struct person
void getPerson(struct person* p)
{
    char first[32];
    char last[32];
    Uart1_Read(first, 32);
    Uart1_Read(last, 32);
    p->firstName = first;
    p->lastName = last;
}
The problem with this code is that it is easy for it to look like it works. It will very likely pass most tests you throw at it, but it is tragically broken. The 2 buffers, first and last, are allocated on the stack and when the function returns that memory is freed, but it still contains the data you received. Until another function is called AND that function allocates auto variables on the stack the memory will remain intact. This means that at some later stage the structure will become invalid and you will not be able to understand how, and if you call the function twice you will later find that both variables you passed in contain the same names.
    Always double check and be mindful where the pointers are pointing and what the lifetime of the memory allocated is. Be particularly careful with memory on the stack which is always short-lived.
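One way to avoid the dangling pointers in the example above is to make the structure own its storage, so nothing is left pointing at a dead stack frame. A minimal sketch, keeping the buffer size and the Uart1_Read call from the example (its exact prototype is assumed here):

#include <string.h>

#define NAME_LEN 32

// Assumed prototype for the read function used in the example above
void Uart1_Read(char* buffer, int maxLen);

// The structure owns its storage instead of pointing at someone else's
struct person
{
    char firstName[NAME_LEN];
    char lastName[NAME_LEN];
};

void getPerson(struct person* p)
{
    char first[NAME_LEN];
    char last[NAME_LEN];

    Uart1_Read(first, NAME_LEN);
    Uart1_Read(last, NAME_LEN);

    // Copy the received strings into memory that outlives this function
    strncpy(p->firstName, first, NAME_LEN - 1);
    p->firstName[NAME_LEN - 1] = '\0';
    strncpy(p->lastName, last, NAME_LEN - 1);
    p->lastName[NAME_LEN - 1] = '\0';
}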
    For a deep copy you would have to allocate new memory for the members of the structure that are pointers and copy the data from the source structure to the destination structure manually. Be particularly careful when structures are passed into a function by value as this makes a copy of the structure which points to the same data, so in this case if you re-allocate the pointers you are updating the copy and not the source structure! For this reason it is best to always pass structures by reference (function should take a pointer to a structure) and not by value. Besides if data is worth placing in a structure it is probably going to be bigger than a single pointer and passing the structure by reference would probably be much more efficient!
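As a minimal sketch of a deep copy for the char* based person structure, assuming dynamic allocation is acceptable on your target (on small embedded systems it often is not, in which case fixed-size arrays as shown earlier are the better option):

#include <stdlib.h>
#include <string.h>

struct person
{
    char* firstName;
    char* lastName;
};

// Deep copy: allocate new storage and copy the pointed-to data.
// Returns 0 on success, -1 on allocation failure. The caller owns
// (and must eventually free) the copies.
int person_deepCopy(struct person* dst, const struct person* src)
{
    dst->firstName = malloc(strlen(src->firstName) + 1);
    dst->lastName  = malloc(strlen(src->lastName) + 1);

    if ((dst->firstName == NULL) || (dst->lastName == NULL))
    {
        free(dst->firstName);
        free(dst->lastName);
        return -1;
    }

    strcpy(dst->firstName, src->firstName);
    strcpy(dst->lastName, src->lastName);
    return 0;
}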
    Comparing Structs
Although it is possible to assign structs using "=" it is NOT possible to compare structs using "==". The most common solution people go for is to use memcmp with sizeof(struct) to try and do the comparison. This is however not a safe way to compare structures and can lead to really hard to track down bugs!
    The problem is that structures can have padding as described above, and when structures are copied or initialized there is no obligation on the compiler to set the values of the locations that are just padding, so it is not safe to use memcmp to compare structures. Even if you use packing the structure may still have trailing padding after the data to meet alignment requirements. The only time using memcmp is going to be safe is if you used memset or calloc to clear out all of the memory yourself, but always be mindful of this caveat.
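The safe alternative is a member-wise comparison. A minimal sketch using the point structure from earlier:

#include <stdbool.h>

struct point
{
    int a;
    int b;
};

// Compare field by field - the padding bytes are never looked at
bool point_equal(const struct point* p1, const struct point* p2)
{
    return (p1->a == p2->a) && (p1->b == p2->b);
}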
    Conclusion
Structs are an important part of the C language and a powerful feature, but it is important that you ensure you fully understand all the intricacies involved in structs. There is as always a lot of bad advice and bad code out there in the wild wild west known as the internet, so be careful when you find code in the wild, and just don't use typedef on your structs!
    References
As always the Wikipedia page is a good resource
Link to a PDF of the committee draft which was later approved as the C99 standard
    Linus Torvalds explains why you should not use typedef everywhere for structs
    Good write-up on packing of structures
    Bit Fields discussed on Hackaday
     
     
  4. Orunmila
    I recently got my hands on a brand new PIC18F47K40 Xpress board (I ordered one after we ran into the Errata bug here a couple of weeks ago).
I wanted to start off with a simple "Hello World" app which would use the built-in CDC serial port, which is great for doing printf debugging and interacting with the board in general since it has no LEDs and no display to let me know that anything is working, but immediately I got stuck. Combing through the user's guide I could not find any mention of the CDC interface or which pins to configure to make it work, so I stared at the schematic and identified a handful of candidates which I could try as the correct pins.
    Eventually I figured it out, the TX pin to send data to the PC is RB6 and the RX pin which will receive data from the PC is RB7, so if you are setting up the UART using MCC it should look like this:

     
I created a very simple little terminal application to test the board with and thought this may be something useful for others who start with this board, especially since the standard MCC UART code falls afoul of the dreaded Errata we discussed here before, which caught me out again even on this simple little application.
    So remember to set the linker up to implement the workaround for the Errata like so ->
    The little program simply waits for a keypress (so you have time to set up your terminal application) and then (at 115200 BAUD) sends the welcome message "Hello World". After this the program will echo every character you type, but it will enclose it in a message so that you are sure you are not just fooled by the local echo in your terminal program!
     
     
     
     
     
     
    Here is the whole program, the serial port and other setup is all generated using MCC.
void main(void)
{
    // Initialize the device
    SYSTEM_Initialize();

    // Wait for a keypress before we start
    EUSART1_Read();

    // Say Hello
    printf("Hello World\r\n");

    while (1)
    {
        printf("Received: '%c' \r\n", EUSART1_Read());
    }
}
The full example project can be downloaded from here ADC_47K40.zip
Note: I have set the project up to automatically copy the hex file onto the board, programming it, when I build the code. This is however set up for my Mac; if you are on a PC or Linux you will probably see an error when building that it could not copy the file. If you want to set this up for your platform the instructions are in the user's guide for the board.
    I include a screenshot of the page here for convenience.

     
     
  5. Orunmila
    One feature of MPLAB-X which is not nearly used enough is the Simulator. It is actually very powerful and useful for testing and debugging code which will run on your board and interact with the outside world via external stimuli. There is more information on the developer help site here http://microchipdeveloper.com/mplabx:scl
    This is how you can use the IDE to debug your serial command line interface code using stimulus files and the UART OUTPUT window.
    The first trick is to enable the UART output window in the IDE. We are going to start by creating a new project using MCC which drives the UART on the PIC16F18875. Settings should look as follows:

Then go to the project settings and enable the UART output window in the simulator 

    We can now update our main code to do something simple to test the UART output.
    #include "mcc_generated_files/mcc.h" uint8_t buffer[64]; void main(void) { // initialize the device SYSTEM_Initialize(); INTERRUPT_GlobalInterruptEnable(); INTERRUPT_PeripheralInterruptEnable(); printf("Hello World\r\n"); If you run this code in the simulator it will open an additional output window which will show the output from the UART as follows

    Now the next trick is to use a stimulus file to send your test data into the device via the UART to allow you to test your serial processing code.
    First we will make a test file, there is more information on the format of these files on the developer help site, but it is fairly easy to understand just by looking at the example.
// Start off by sending 0x00
00
// First string, send the bytes H E L L O (no \0 will be sent)
"HELLO"
// Wait for 100 ms before sending the next bit
wait 100 ms
// Second string
" WORLD"
// Wait for 100 ms before sending the next bit
wait 100 ms
// Send an entirely binary packet
90 03 00 14 55 12
We now simply save this text file so that we can import it into the stimulus system. We will just call it data.txt for now.
    Next we have to set up the IDE to send the data in this file. We are going to start by opening the Stimulus editor, this is under the menu "Window | Simulator | Stimulus" in MPLAB-X.
    Navigate to the tab labeled "Register Injection" and do the following steps:
To add the file we will enter a new row here. Just type in the empty boxes (or you can use the icons on the left to add a new row).
Provide any label (we are using TEST).
Select the UART1 Receive Register under Reg/Var - it is called RC1REG.
Choose "Message" for the trigger.
Click on the filename block to select the file we made above.
Set "Wrap" to no - this means it will not repeat.
Format should show Pkt and be read-only.
The stimulus is now set up and ready for a test drive.

    As a test we are just going to update the program to echo back what it receives and we will be able to see the output in the Simulator Output window as the stimulus is being sent in. Here is the test code:
#include <stdint.h>
#include "mcc_generated_files/mcc.h"

void main(void)
{
    // initialize the device
    SYSTEM_Initialize();
    INTERRUPT_GlobalInterruptEnable();
    INTERRUPT_PeripheralInterruptEnable();

    printf("Hello World\r\n");

    while (1)
    {
        EUSART_Write(EUSART_Read());
    }
}
If you now run the simulator the UART output will just show "Hello World" as the start message. When you hit the green run button on the stimulus it will send the data in the file to the program. Hitting the run button a second time will remove the stimulus and if you hit it one more time it will apply the stimulus a second time.
    That is a very basic introduction of how to use a stimulus which feeds data from a text file into the UART for testing purposes.
    It is a small step from here to feed data into a variable or into a different peripheral, we will not go into that right here, but know that it is easy to also do this. 
    When you combine this feature with running MDB from the command line and using a test framework such as Unity http://www.throwtheswitch.org/unity/ you should have all the tools to create automated tests for your serial port code.
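As a rough sketch of what such a Unity test could look like - parseCommand() and CMD_OK are made-up names standing in for your own serial command handler, and the real work is feeding the stimulus data through the simulator into it:

#include "unity.h"

// Assumed application code under test - substitute your own parser
extern int parseCommand(const char* line);
#define CMD_OK 0

void setUp(void)    { }
void tearDown(void) { }

static void test_helloCommandIsAccepted(void)
{
    TEST_ASSERT_EQUAL_INT(CMD_OK, parseCommand("HELLO"));
}

int main(void)
{
    UNITY_BEGIN();
    RUN_TEST(test_helloCommandIsAccepted);
    return UNITY_END();
}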
    As always the project with the code we used is attached.
    SimulatorUART.zip
     
  6. Orunmila
    What every embedded programmer should know about ADC measurement, accuracy and sources of error
ADC's you encounter will typically be specified as 8, 10 or 12-bit. This is however rarely the accuracy that you should expect from your ADC. It seems counter-intuitive at first, but once you understand what goes on under the hood this will be much clearer.
    What I am going to do today is take a simple evaluation board for the PIC18F47K40 (MPLAB® Xpress PIC18F47K40 Evaluation Board ) and determine empirically (through experiments and actual measurements) just how accurate you should expect ADC measurements to be.
    Feel free to skip ahead to a specific section if you already know the basics. Here is a summary of what we will cover with links for the cheaters.
Units of Measurement of Errors
Measurement Setup
Sources and Magnitude of Errors
Voltage Reference
Noise
Offset
Gain Error
Missing Codes and DNL
Integral Nonlinearity (INL)
Sampling Error
Adding up Errors
Vendor Comparison
Final Notes
Units of Measurement of Errors
    When we talk about ADC's you will often see the term LSB used. This term refers to the voltage represented by the least significant bit of the ADC, so in reality it is the voltage for which you should read 1 on the ADC. This is a convenient measure for ADC's since the reference voltage is often not fixed and the size of 1 LSB in volts will depend on what you have the reference at, while most errors caused by the transfer function will scale with the reference.
    For a 10-bit ADC with 3v3 of range one LSB will be 3.3/(2^10) = 3.3/1024 = 3.22mV.
    An error of 1% on a 10-bit converter would represent 1%*1024 = 10.24x the size of one LSB, so we will refer to this as 10LSB of error, which means our measurement could be off by 32.2mV or ten times the size of 1 LSB.
    When I have 10 LSB in error I really should be rounding my results to the nearest 10 LSB, since the least significant bits of my measurement will be corrupted by this error. 10LSB will take 3.32 bits to represent. This means that my lowest 3 bits are possibly incorrect and I can only be confident in the values represented by the 7 most significant bits of my result. This means that the effective number of bits (ENOB) for my system is only 7, even though my ADC is taking a 10-bit measurement. The lower 3 bits are affected by the measurement error and cannot be relied upon so they should be discarded if I am trying to make an absolute voltage measurement accurately.
    We can always work out  exactly how many bits of accuracy we are losing, or to how many bits we need to round to, using the calculation:
bits lost = log(#LSB error) / log(2)
Note that this calculation will give us fractional numbers of bits. If we have 10LSB of error the error does not quite affect a full 4 bits (that only happens at 16LSB), but we cannot say it removes only 3 bits, because that already happened at 8LSB, so this is somewhere in between. In order to compare errors meaningfully we will work with fractions of bits in these cases, so 10LSB of error reduces our accuracy by 3.32 bits. This is especially useful when errors are additive because we can add up all the fractional LSB of errors to get the total error to the nearest bit.
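The same calculation as a quick sketch in C, handy for sanity-checking an error budget on your desk (it uses math.h, so this is meant for the host rather than the target):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double errorLsb = 10.0;                     // total error in LSB
    double bitsLost = log(errorLsb) / log(2.0); // = log2(errorLsb)
    double enob     = 10.0 - bitsLost;          // for a 10-bit converter

    printf("%.1f LSB of error costs %.2f bits, leaving %.2f effective bits\n",
           errorLsb, bitsLost, enob);
    return 0;
}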
At this point I would like to encourage you to take your oscilloscope and try to measure how much noise you can detect on your lines. You will probably be surprised that most desk oscilloscopes can only measure signals down to 20mV, which means that 1LSB on a 10-bit ADC with a 3V3 reference will be close to 10x smaller than the smallest signal your digital scope can measure!
    If you can see noise on the scope (which you probably can) then that means it is probably at least 20mV or 10LSB of error. 
    It turns out that our intuition about how accurate an ADC should be, as well as how accurate our scope can measure is seldom correct ...
    Measurement Setup

    I am using my trusty Saleae Logic Pro 8 today. It has a 12-bit ADC and measures +-10V on the analog channel and is calibrated to be accurate to between 9 and 10 ENOB of absolute accuracy. 
    This means that 1LSB of error will be roughly 4.8mV, which for my 2V system with a 10-bit ADC is already the size of 2LSB of measurement error.
    When I ground the Saleae input and take a measurement we can see how much noise to expect on the input during our measurements. As you will see later we actually want to see 2-3LSB of noise so that we can improve accuracy by digital filtering, if you do not have enough noise this is not possible, so this looks really good.
    Using the software to find the maximum variation for me you can see that I have about 15.64mV of noise on my line. Since the range is +-10V this is only 15.6/20000 = 0.08% of error,  but this is going to be, for my target 2V range, 15.6/2048*1024 = 8LSB of error to start with on my measurement equipment!

    For an experiment we are going to need an analog voltage source to measure using the ADC. It so happens that this device has a DAC, so why not just use that! You would think that this was a no-brainer, but it turns out, as always, that it is not quite as simple as that would seem!
What I will do first is set the DAC and ADC to use the same reference (this has the added benefit that Vref inaccuracy will be cancelled out, nice!). We expect that if we set the DAC to give us 1.024V (50% of full range) and we then measure this using the 10-bit ADC, that we would measure half of the ADC range, or 512 right?
    For the test I made a simple program that will just measure the ADC every 1 second and print the result to the UART. Well here is the result of the measurement (to the right). Not what you expected ?!  Not only are the 1st two readings appallingly bad, but the average seems to be 717, which is a full 40% more than we expect! 
    How is this possible?
    Well this is how.
    Not only is the ADC inaccurate here, but the DAC even more so! The DAC is only 5 bits and it is specified to be accurate to 5LSB. That is already a full 320mV of error, but that is still not nearly enough to explain why we are measuring 717/1024*2.048 = 1.434V instead of 1.024V... So what is really going on here?
To see I connected my trusty Saleae and changed the application to cycle the DAC through all 32 values, 1s per value, and make a plot for us to look at.  On the Saleae we see this.

It turns out that the DAC is such a weak source that anything you connect to its output (like even an ADC input leakage or simply an I/O pin with nothing connected to it!) will load down the DAC and skew the output! This has been the cause of consternation for many a soul actually (see e.g. this post on the Microchip forum)
    Wow, so that makes sense, but is there anything we can do about this? On this device unfortunately there is not much we can do. There are devices with on-board op-amps you can use to buffer the DAC output like the 16F170x family, but this device does not have op-amps so we are out of luck! I will blog all about DAC's and about what the reasons for this shape is on another occasion, this blog is about ADC after all! So all I am going to do is adjust the DAC setting to give us about the voltage we need by measuring the result using the Saleae and call it a day. Turns out I needed to subtract 6 from the output setting to get close.
    We now see a measurement of 520 and this is what we see while taking measurements with the Saleae. 10.37mV of noise on just about 1V and we are in business!

    Sources and Magnitude of Error
When measurement errors are uncorrelated it means that they will all add up to form the total worst case error. For example if I have 2LSB of noise, and I also have 2LSB of reference error, this means the reading can be 2LSB removed from the correct value as a result of the reference, and an additional 2LSB as a result of the noise, to give 4LSB of total error. This means that these two types of errors are not correlated and the bits contributed by each to the total error are additive.
    At this point I want to mention that customers often come to me demanding a 16bit-ADC because they feel that the 10-bit one they have is not adequate for their application. They can seldom explain to me why they need 31uV of accuracy or what advanced layout techniques they are applying to keep the noise levels to even remotely this range, and most of the time the real problem turns out to be that their 10bit-ADC is really being used so badly that they are hardly getting 5 bits of accuracy from the converter. I also often see calculations which effectively discard the lower 4 bits of the ADC measurement, leaving you with only 8-bits of effective measurement, so if you do that getting more bits in the ADC is obviously only going to buy you only disappointment!
That said, let's look at all of the most significant sources of error one by one in more detail. There are quite a few so we will give them headings and numbers.
    1. Voltage Reference
To get us started, let's look at the voltage reference and see how many LSB this contributes to our measurement error. If you are using a 1% reference, then please do not insist that you need a 16-bit or even a 12-bit ADC, because your reference alone is contributing errors into the 7th most significant bit and 8 bits is all you are going to get anyway! 

The datasheet for our evaluation board chip (PIC18F47K40) shows that the voltage reference will be accurate to +-4% when we set it to 2.048V like we did. People are always surprised when they realize how many LSB they are losing due to the voltage reference! 
4% of 1024 = 40.96, which means that the reference alone can contribute up to 41 LSB of error to our system! Using an expensive off-chip reference also complicates things for us. Using that we would now have to be very careful with the layout to not introduce noise at the reference pin, and also take care of any signals coupling into this pin. Even then the reference will likely be something like a TL431 which will only be accurate to 1%, which will be 10 LSB, reducing our 10-bit ADC to less than 8 ENOB.
We must note that reference errors are not equally distributed. At the maximum scale 1% represents 10LSB of error, but at the lower end of the scale 1% will represent only 1% of 1LSB. Since we are looking for the worst-case error we have to work with 10LSB due to the 1% error over the full ADC range. In your application you may be able to adjust the contribution of this error down to better represent the range you are expecting to measure. For example - at mid range, where our test signal is, the reference error will only contribute 5LSB of error with a 1% reference, or about 20LSB for our 4% internal reference.
    The reference error is something which we could calibrate out if we knew what the error was and many manufacturers discard it stating simply that you should calibrate it out. Sadly these references tend to drift over time, temperature and supply voltages, so you usually cannot just calibrate it in the factory and compensate for the error in software and forget it. 
    To revisit our 16-bit ADC scenario, if I want to measure accurately to 31uV (16 bits on a 2V reference) that reference would have to be accurate to 31uV/2V = 0.0015%. Let's look on Digikey for a price on a Voltage reference with the best specs we can find.
    Best candidate I can find is this one at $128.31 a piece, and even that gives me only 0.1% with up to 0.6ppm/C drift. This means from 0 to 100C I will have 0.006% of temp drift (2LSB) on top of the 0.1% tolerance (which is another 33LSB).

    Now to be fair if I am building a control system I am more interested in perturbations from a setpoint and a 16-bit ADC may be valuable even if my reference is off, because I am not trying to take an absolute measurement, but still maintaining noise levels below 30uV is more of a challenge than it sounds, especially if I am driving some power which adds noise to the equation. This is of course the difference between accuracy and resolution. Accuracy gives me the minimum absolute error while resolution gives me the smallest relative unit of measure.
    2. Noise
    Noise is of course the problem we all expect. It can often be a pretty large contributor to your measurement errors, and digital circuits are known for producing lots of noise that will couple into your inputs, but as we will see noise is not all bad and can be essential if you want to improve the results through digital post processing.
    We have seen that every 2mV of noise will add 1LSB to the error on our system as we have a 2V reference, and 1024 steps of measurement. As you have now seen this 2mV is probably much smaller than we can measure with a typical oscilloscope, so we cannot be sure how much noise we really have if we simply look at it on our scope.
    For most systems the recommendation would be to place the microcontroller in lowest power sleep mode and avoid toggling any output pins during the sampling of the ADC measurement to get the measurement with the lowest noise level. 
    A simple experiment will show how much noise we could be coupling into the measurement when an adjacent pin is being toggled. 
    I updated our program from before to simply toggle the pin next to the ADC input constantly and measured with the Saleae to see what the effect is. On the left is the signal zoomed out and on the right is one of the transitions zoomed in so you can get a better look. That glitch on the measurement line is 150mV or 75 LSB of noise due to an adjacent pin toggling, and the dev board I have does not even have long traces which would have made this much worse!

It seems like a good idea to filter all this noise using analog low-pass filters like filter capacitors, but this is not always wise. We can make small amounts of noise work to our advantage, as long as it is white noise which is uncorrelated with our signal and other errors. When we do post-processing like taking multiple samples and averaging the result we can potentially increase the overall accuracy of our measurement. Using this technique it is possible to increase the ENOB (effective number of bits) of your measurements by simply taking more samples and averaging them. 
    Without getting too deep into the math there, if you oversample a signal by a factor of N you will improve the SNR by a factor of sqrt(N), which means oversampling 256 times and taking the average will result in an increase of 16x the SNR, which represents an additional 4 bits of resolution of the ADC. 
Of course this is where having uncorrelated white noise of at least +-1LSB is important. If you have no noise on your signal you would likely just sample the same value 256 times and the average would not add any improvement to the resolution. If you had white noise added to the signal however you would sample a variety of values with the average lying somewhere in between the LSB you can measure, and the ratio of difference would represent the value of the signal more accurately. For a detailed discussion on this topic you can take a look at this application note by Silicon Labs https://www.silabs.com/documents/public/application-notes/an118.pdf 
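A minimal oversampling sketch along those lines. ADC_Read() is an assumed name for whatever returns one raw 10-bit sample on your system (with MCC-generated code it would be something like ADCC_GetSingleConversion()):

#include <stdint.h>

// Assumed: returns one raw 10-bit ADC sample
extern uint16_t ADC_Read(void);

// Oversample by 256: sqrt(256) = 16x SNR improvement, worth roughly
// 4 extra bits provided the noise is white and at least +-1 LSB.
uint16_t ADC_ReadOversampled(void)
{
    uint32_t sum = 0;

    for (uint16_t i = 0; i < 256; i++)
    {
        sum += ADC_Read();
    }

    // Divide by 16 rather than 256 to keep the 4 extra bits,
    // giving a 14-bit result from the 10-bit converter
    return (uint16_t)(sum >> 4);
}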
    3. Offset
    The internal circuitry in the ADC will cause some offset error added into the conversion. This error will move all measurements either up or down by an equal amount. The Offset is a critical parameter for an ADC and should be specified in the datasheet for your device.
    For the PIC18F47K40 device the error due to offset is specified as 2 LSB. Of course if we know what the offset is we could easily subtract this from the results, so many specifications will exclude the offset error and claim that you could easily "calibrate out" the offset. This may be possible, even easy to do, but if you do not write the code for it and do the actual math you will have to include the offset error in your accuracy calculations, and measuring what the current offset is can be a real challenge in a real-world system which is not located on your laboratory bench.
If you do decide to measure the offset on the factory floor and calibrate it out using software you need to be careful to use an accurate reference, avoid noise and other sources of error, and make sure that the offset will remain constant over the operating range of voltage and temperature and also that it will not drift over time. If any of these assumptions do not hold your calibration will be met with limited success.
    Offset is often hard to calibrate out since many ADC's are not accurate close to the extremes (at Vref or 0V). If they were you could take a measurement with the input on Vref+ and on Vref- and determine the offset, but we knew it was never going to be this easy!
    The offset will also be different from device to device, so it is not possible to calibrate this out with fixed values in your code, you will have to actively measure this on every device in the factory and adjust as the offset changes.
    Some manufacturers will actually calibrate out the offset on an ADC for you during their manufacturing process. If this is the case you will probably see a small offset error of +-1 LSB which means that it is calibrated to be within this range.
    On our device the datasheet specifies a typical offset error of 0.5 LSB with a max error of 2 LSB, so this device is factory calibrated to remove the offset error, but even after this we should still expect up to 2 LSB of drift in the offset around the calibrated value.
    4. Gain Error
    Similar to the offset the internal transfer function of the ADC is designed to be as close as possible to ideal but there is always some error. Gain error will cause the slope of the transfer function to be changed. Depending on the offset this can cause an error which is at maximum either at the top or bottom end of the measurement scale as shown in the figure below.

    Like the offset it is also possible to calibrate out the gain error, as long as we have enough reference points to use for the calibration. If the transfer function is perfectly linear this would mean we would only require 2 measurement points.
    For our device the datasheet spec is typically 0.2LSB of gain error with a max error of 1.5LSB. This means that we cannot gain much from attempting to calibrate out the gain on this one. For other manufacturers you can easily find gain and offset errors in the tens of LSB, which makes calibration and compensation for the gain and offset worth the effort. The PIC18F47K40 is not only compensated for drift with temperature but also individually calibrated in the factory, so it seems that any additional calibration measurements will be at most accurate to 1LSB and the device is already specified to typically have less than this error, so calibration will probably gain us nothing.
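If you do go the calibration route on a part where it is worth it, the correction itself is only a few lines. A minimal sketch assuming a two-point factory calibration has already produced an offset and a measured-versus-ideal span (the constants below are placeholders, not real calibration data):

#include <stdint.h>

// Placeholder calibration constants from a two-point factory measurement
static const int32_t cal_offset  = 3;     // measured offset in LSB
static const int32_t cal_spanMea = 1021;  // measured span between the two cal points, in LSB
static const int32_t cal_spanIdl = 1024;  // ideal span between the same two points, in LSB

// Apply offset and gain correction to a raw 10-bit reading
uint16_t ADC_Correct(uint16_t raw)
{
    int32_t corrected = ((int32_t)raw - cal_offset) * cal_spanIdl / cal_spanMea;

    if (corrected < 0)    { corrected = 0; }
    if (corrected > 1023) { corrected = 1023; }
    return (uint16_t)corrected;
}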
    5. Missing Codes and DNL
    We expect that every time the code increments by 1LSB that the input voltage has increased by exactly 1LSB in size. For an ADC the DNL error is a measure of how close to this ideal we are in reality. It represents the largest single step error that exists for the entire range of the ADC.
    If the DNL is stated at 0.5LSB this means that it can take anything from 0.5LSB to 1.5LSB of input voltage change to get the output code to increment by 1. When the DNL is more than 1LSB it means that we can move the input voltage by 2LSB and only get a single count of the converter. When this happens it is possible that it causes the next bit to be squeezed down to 0LSB, which can cause the converter to skip that code entirely as shown below.

    Most converters will specify that the result will monotonically increase as the voltage increases and that it will have no missing codes as you scan through the range, but you still have to be careful, because this is under ideal conditions and when you add in the other errors it is possible that some codes get skipped, so when you are comparing the output of the converter never check for a specific conversion value. Always look for a value in a range around the limit you are checking.
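In code that means testing against a window rather than a single code. A minimal sketch, with the threshold and tolerance picked purely for the example:

#include <stdbool.h>
#include <stdint.h>

#define THRESHOLD  512u   // example limit we are checking against
#define TOLERANCE  4u     // example tolerance band in LSB

// Never write: if (adcResult == THRESHOLD) - a skipped code will break it.
// Check a band around the limit instead.
bool ADC_AtThreshold(uint16_t adcResult)
{
    return (adcResult >= (THRESHOLD - TOLERANCE)) &&
           (adcResult <= (THRESHOLD + TOLERANCE));
}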
    6. Integral Nonlinearity - INL
    INL is another of the critical parameters for all ADC's and will be stated in your datasheet if the ADC is any good. For our example the INL is specified as 3.5LSB.
The term INL refers to the integral of the differential nonlinearity. In effect it represents what the maximum deviation from the ideal transfer function of the ADC will be as shown in the picture below. The yellow line represents the ideal transfer function while the blue line represents the actual transfer function. As you can see the INL is defined as the size of the maximum error through the range of the ADC.

Since the INL can happen at any location along the curve it is not possible to calibrate this out. It is also uncorrelated with the other errors we have examined. We just have to live with this one!
    7. Sampling error
    A SAR ADC will consist of a sampling capacitor which holds the voltage we are converting during the conversion cycle. We must take care when we take a sample that we allow enough time for this sampling capacitor to charge to the level of accuracy we want to see in our conversion. Effectively we end up with a circuit that has some serial impedance through which the sampling capacitor is charged. The simplified circuit for the PIC18F47K40 looks as follows (from the datasheet).

As you can see the series impedance (Rs) together with the sampling switch and passgate impedance (RIC + Rss) will form a low-pass RC filter to charge Chold. A detailed calculation of the sampling time required to be within 1LSB of the desired sampling value is shown in the ADC section of the device datasheet.
    If we leave too little time for the sample to be acquired this will directly result in a measurement error.
In our case this means that if we have 10K Rs and we wait for 462us after the sampling mux turns to the input we are measuring, the capacitor will be charged to within 0.5LSB of our target voltage. The ADC on the PIC18F47K40 has a built-in circuit that can keep the sampling switch closed for us for a number of Tadc periods. This can be set by adjusting the register ADACQ or using the provided API generated by MCC to achieve this. That first inaccurate result we saw in the conversion was a direct result of the channel not being given enough time to charge the sampling cap since the acquisition time was set to the default value of 0. 
Of course since we are not switching channels the capacitor is closer to the correct value when we take subsequent samples, so the error seems to be going away over time! I have seen customers just throw away the first ADC sample as inaccurate, but if you do not understand why, you can easily get yourself into a lot of trouble when you need to switch channels!
    We can re-do the measurement and this time use an acquisition time of 4xTadc = 6.8us. This is the result. 
    NOTE : There is another Errata on this device that you have to wait at least 1 instruction cycle before reading the ADGO bit to see if the conversion is complete after setting the ADGO bit. At first I was just doing what the datasheet suggested, set ADGO and then wait while(ADGO); for the conversion to complete. Due to this errata however the ADGO bit will still read 0 the first time you read it and you will think the conversion is done while it has not started, resulting in an ADC reading of 0 !
    After adding the required NOP() to the generated MCC code as follows the incorrect first reading is gone:
adc_result_t ADCC_GetSingleConversion(adcc_channel_t channel, uint8_t acquisitionDelay)
{
    // Turn on the ADC module
    ADCON0bits.ADON = 1;

    // select the A/D channel
    ADPCH = channel;

    // Set the Acquisition Delay
    ADACQ = acquisitionDelay;

    // Disable the continuous mode.
    ADCON0bits.ADCONT = 0;

    // Start the conversion
    ADCON0bits.ADGO = 1;
    NOP();    // NOP workaround for ADGO silicon issue

    // Wait for the conversion to finish
    while (ADCON0bits.ADGO)
    {
    }

    // Conversion finished, return the result
    return (adc_result_t)(((adc_result_t)ADRESH << 8) + ADRESL);
}

    Uncorrelated errors
I will leave the full analysis up to the reader, but all of these errors are uncorrelated and thus additive, so for our case the worst case error will be when all of these errors align: the offset is in the same direction as the gain error, as the noise, as the INL error, etc.
Of course when we test on the bench it is unlikely that we will encounter a situation where all of these are 100% aligned, but if we have manufactured thousands of units in the field running for years it is definitely going to happen, and much more often than you would like, so we have no choice but to design for the worst-case error we are likely to see in the wild.
For our example the different sources of error add up as follows:
    Voltage Reference = 4% [41 LSB]
    Noise [8 LSB]
    Offset [2.5 LSB]
    Gain [1.5 LSB]
    INL [3.5 LSB]
For a total of 56.5 LSB of potential absolute error in measurement. This reduces our effective number of bits by log(56.5)/log(2) = 5.8 bits, which means that our 10-bit result can have absolute errors running into the 6th bit, giving us only about 4 ENOB (effective number of bits) when we are looking for absolute accuracy.
We can improve this to 26.5 LSB by using a 1% off-chip reference, which will make the ENOB = 5 bits. 
    If we look at the measurement we get using the Saleae we measure 0.99V on the line which should result in 0.99V/2.045V *1024 = 495 but our measurement is in fact 520, which is off by 25LSB. So as we can see our 1-board sample does not hit the worst case error at the center of the sampling range here, but our error extended at least into the 5th bit of the result as our 25LSB error requires more than 4 bits to represent. Nevertheless 25LSB is quite a bit better than the worst-case value of 56.5 LSB of error which we calculated, so this particular sample is not doing too badly!

    I am going to get my hands on a hair dryer in the week and take some measurements at an elevated temperature and then I will come back and update this for your reading pleasure 🙂
    Comparison
    I recently compared some ADC's from different vendors. I was actually looking more at the other features but since I was busy with this I also noted down the specs. Not all of the datasheets were perfectly clear so do reach out to me if I made a mistake somewhere, but this was how they matched up in terms of ADC performance.
As far as I could find them I used the worst case specifications and not the typical ones. Some manufacturers only specify typical results, so this comparison is probably not fair to those who publish more complete worst-case information. Let me know in the comments how you feel about this. I will go over the numbers again and maybe come update all of these to typical values for a more fair comparison if someone asks me for this ...
Manufacturer                     Xilinx      Microchip     Texas Instruments   Espressif   ST Micro     Renesas
Device                           XC7Z010     PIC32MZ EF    CC3220SF            ESP32       STM32L475    R65N V2
INL                              2           3             2.5                 12          2.5          3
DNL                              1           1             4                   7           1.5          2
Offset                           8           2             6(1)                25(1)       2.5          3.5
Gain                             0.5         8             82(1)               ?(2)        4.5          3.5
Total Error (INL+Offset+Gain)    10.5        13            90.5                37+         9.5          10

All figures are in LSB.
I noted that many of these manufacturers specify their ADC only at one temperature point (25C) so you probably have to dig a little deeper to ensure that the specs will not vary greatly over temperature.
(1) These settings were specified in the datasheet as an absolute voltage and I converted them to LSB for max range and best resolution of the ADC. Specifically for the TI device the offset was specified as 2mV and gain error as 20mV on a 1.4V range, and for the ESP32 the offset is specified as 60mV but for a wider voltage range of 2.45V.
(2) For the ESP32 I was not able to determine the gain error clearly from the datasheet.
    Final Notes
We can conclude a couple of very important points from this.
If the datasheet claims a 12-bit ADC we should not expect 12 bits of accuracy. First we need to calculate what to expect from our entire system, and we should expect the reference to add the most to our error.
All 12-bit converters are not equal, so when comparing devices do not just look at how many bits the converters provide, also compare their performance! The same system can yield between 5 and 10 bits of accuracy depending on the specs of the converter, so do not be fooled!
Many of the vendors specified their ADC only at a very specific temperature and reference voltage at maximum. Take care not to be fooled by this - shall we call it "creative" - specmanship and be sure to compare apples with apples when looking for absolute accuracy.
Source Code
    For those who have this board or device I attach the test code I used for download here : ADC_47K40.zip
  7. Orunmila
    Comparing raw pin toggling speed AVR ATmega4808 vs PIC 16F15376
    Toggling a pin is such a basic thing. After all, we all start with that Blinky program when we bring up a new board.

    It is actually a pretty effective way to compare raw processing speed between any two microcontrollers. In this, our first Toe-to-toe showdown, we will be comparing how fast these cores can toggle a pin using just a while loop and a classic XOR toggle.
    First, let's take a look at the 2 boards we used to compare these cores. These were selected solely because I had them both lying on my desk at the time. Since we are not doing anything more than toggling a pin we just needed an 8-bit AVR core and an 8-bit PIC16F1 core on any device to compare. I do like these two development boards though, so here are the details if you want to repeat this experiment.
    In the blue corner, we have the AVR, represented by the ATmega4808, 
    sporting an AtMega core (AVRxt in the instruction manual)

    Clocking at a maximum of 20MHz.

    We used the AVR-IOT WG Development Board, part number AC164160.
    This board can be obtained for $29 here:
    https://www.microchip.com/Developmenttools/ProductDetails/AC164160
    Compiler: XC8 v2.05 (Free)
    In the red corner, we have the PIC, represented by the 16F15376,
    sporting a PIC16F1 Enhanced Midrange core.

    Clocking at a maximum of 32MHz.

    We used the MPLAB® Xpress PIC16F15376 Evaluation Board, part number DM164143.
    This board can be obtained at $12 here:
    https://www.microchip.com/developmenttools/ProductDetails/DM164143 
    Compiler: XC8 v2.05 (Free)
    Results
    This is what we measured. All the details around the methodology we used and an analysis of the code follows below and attached you will find all the source code we used if you want to try this at home. The numbers in the graph are pin toggling frequency in kHz after it has been normalized to a 1MHz CPU clock speed. 
     
    How we did it (and some details about the cores)
    Doing objective comparisons between 2 very different cores is always hard. 
    We wanted to make sure that we do an objective comparison between the cores which you can use to make informed decisions on your project. In order to do this, we had to deal with the fact that the maximum clock speed of these devices is not the same and also that the fundamental architecture of these two cores is very different.
    In principle, the AVR is a Load-store Architecture machine with a 1 stage pipeline. This basically means that all ALU operations have to be performed between CPU registers and the RAM is used to load from and store results to. The PIC, on the other hand, uses a Register Memory Architecture, which means in short that some ALU operations can be performed on RAM locations directly and that the machine has a much smaller set of registers.
    On the PIC all instructions are 1 word in length, which is 14-bits wide, while the data bus is 8-bits in size and all results will be a maximum of 8-bits in size. On the AVR instructions can be 16-bit or 32-bit wide which results in different execution times depending on the instruction.
    Both processors have a 1 stage pipeline, which means that the next instruction is fetched while the current one is being executed. This means branching causes an incorrect fetch and results in a penalty of one instruction cycle. One major difference is that the AVR, due to its Load-store Architecture, is capable of completing the instruction within as little as just one clock cycle. When instructions need to use the data bus they can take up to 5 clock cycles to execute. Since the PIC has to transfer data over the bus it takes multiple cycles to execute an instruction. In keeping with the RISC paradigm of highly regular instruction pipeline flow, all instructions on the PIC take 4 clock cycles to execute.
    All of this just makes it tricky and technical to compare performance between these processors. What we decided to do is rather take typical tasks we need the CPU to perform which occurs regularly in real programs and simply measure how fast each CPU can perform these tasks. This should allow you to work backwards from what your application will be doing during maximum throughput pressure on the CPU and figure out which CPU will perform the best for your specific problem.
    Round 1: Basic test
    For the first test, we used virtually the same code on both processors. Since both of these are supported by MCC it was really easy to get going.
We created a blank project for the target CPU
Fired up MCC
Adjusted the clock speed to the maximum possible
Clicked in the pin manager to make a single pin on PORTC an output
Hit generate code
After this all we added was the following simple while loop:
PIC:
while (1)
{
    LATC ^= 0xFF;
}

AVR:
while (1)
{
    PORTC.OUT ^= 0xFF;
}
The resulting code produced by the free compilers (XC8 v2.05 in both cases) was as follows. Interestingly enough both loops occupy the same amount of program memory (6 words, which is 6 instructions on the PIC and 4 instructions on the AVR since LDS and STS are 2-word instructions), including the loop jump. This is especially interesting as it will show how the execution of a same-length loop compares on each of these processors. You will notice that without optimization there is some room for improvement, but since this is how people will evaluate the cores at first glance we wanted to go with this. 
PIC:
Address   Hex     Instruction
07B3      30FF    MOVLW 0xFF
07B4      00F0    MOVWF __pcstackCOMMON
07B5      0870    MOVF __pcstackCOMMON, W
07B6      0140    MOVLB 0x0
07B7      069A    XORWF LATC, F
07B8      2FB3    GOTO 0x7B3

AVR:
Address   Hex     Instruction
017D      9180    LDS R24, 0x00
017E      0444
017F      9580    COM R24
0180      9380    STS 0x00, R24
0181      0444
0182      CFFA    RJMP 0x17D

We used my Saleae logic analyzer to capture the signal and measure the timing on both devices. Since the Saleae is thresholding the digital signal and the rise and fall times are not always identical you will notice a little skew in the measurements. We did run everything 512x slower to confirm that this was entirely measurement error, so it is correct to round all times to multiples of the CPU clock in all cases here.
Analysis
    For the PIC
    The clock speed was 32MHz. We know that the PIC takes 4 clock cycles to execute one instruction, which gives us an expected instruction rate of one instruction every 125ns. Rounding for measurement errors we see that the PIC has equal low and high times of 875ns. That is 7 instruction cycles for each loop iteration.
    To verify if this makes sense we can look at the ASM. We see 6 instructions, the last of which is a GOTO, which we know will take 2 instruction cycles to execute. Using that fact we can verify that the loop repeats every 7 instruction cycles as expected (7 x 125ns = 875ns.)
    For the AVR
    The clock speed was 20MHz. We know that the AVR takes 1 clock cycle per instruction, which gives us an expected instruction rate of one instruction every 50ns.
    Rounding for measurement errors we see that the AVR has equal low and high times of 400ns. That is 8 instruction cycles for each loop iteration.
To verify if this makes sense we again look at the ASM. We see 4 instructions, the last of which is an RJMP, which we know will take 2 instruction cycles to execute. We also see one LDS which takes 3 cycles because it is accessing SRAM, one STS instruction which takes 2 cycles, and a complement instruction which takes 1 more. Using those facts we can verify that the loop should repeat every 8 instruction cycles as expected (8 x 50ns = 400ns.)
    Comparison
    Since the 2 processors are not running at the same clock speed we need to do some math to get a fair comparison. We think 2 particular approaches would be reasonable.
1. Compare the raw fastest speed the CPU can do; this gives a fair benchmark where CPU's with higher clock speeds get an advantage.
2. Normalize the results to a common clock speed; this gives us a fair comparison of capability at the same clock speed.
In the numbers below we used both methods for comparison.
    The numbers
                     AVR              PIC              Notes
Clock Speed          20MHz            32MHz
Loop Speed           400ns            875ns
Maximum Speed        2.5MHz           1.142MHz         Loop speed as a toggle frequency
Normalized Speed     125kHz           35.7kHz          Loop frequency normalized to a 1MHz CPU clock
ASM Instructions     4                6
Loop Code Size       12 bytes         6 words          Due to the nuances here we compared this 3 ways
                     12 bytes         10.5 bytes
                     4 instructions   6 instructions
Total Code Size      786 bytes        101 words
                                      176.75 bytes
    Round 2: Expert Optimized test
    For the second round, we tried to hand-optimize the code to squeeze out the best possible performance from each processor. After all, we do not want to just compare how well the compilers are optimizing, we want to see what is the absolute best the raw CPU's can achieve. You will notice that although optimization doubled our performance, it made little difference to the relative performance between the two processors.
For the PIC we wrote to LATC to ensure we are in the right bank, and pre-set the W register; this means the loop reduces to just a XORWF and a GOTO. For the AVR we changed the code to use the Toggle register instead of doing an XOR of the OUT register for the port.
    The optimized code looked as follows.
PIC:
LATC = 0xFF;
asm ("MOVLW 0xFF");
while (1)
{
    asm ("XORWF LATC, F");
}

AVR:
asm ("LDI R30,0x40");
asm ("LDI R31,0x04");
asm ("SER R24");
while (1)
{
    asm ("STD Z+7,R24");
}
The resulting ASM code after these changes now looked as follows. Note we did not include the instructions outside of the loop here as we are really just looking at the loop execution.
PIC:
Address   Hex     Instruction
07C1      069A    XORWF LATC, F
07C2      2FC1    GOTO 0x7C1

AVR:
Address   Hex     Instruction
0180      8387    STD Z+7,R24
0181      CFFE    RJMP 0x180

Here are the actual measurements:
Analysis
    For the PIC we do not see how we could improve on this as the loop has to be a GOTO which takes 2 cycles and 1 instruction is the least amount of work we could possibly do in the loop so we are pretty confident that this is the best we can do, and when measuring we see 3 instruction cycles which we think is the limit here. 
Note: N9WXU did suggest that we could fill all the memory with XOR instructions and let it loop around forever and in doing so save the GOTO, but we would still have to set W to FF every second instruction to have consistent timing, so this would still be 2 instructions per "loop", although it would use all the FLASH and execute in 250ns. Novel as this idea was, it means you can do nothing else, so we dismissed it as not representative.
    For the AVR we think we are also at the limits here. The toggle register lets us toggle the pin in 1 clock cycle which cannot be beaten, and the RJMP unavoidably adds 2 more. We measure 3 cycles for this.
                     AVR              PIC              Notes
Clock Speed          20MHz            32MHz
Loop Speed           150ns            375ns
Maximum Speed        6.667MHz         2.667MHz         Loop speed as a toggle frequency
Normalized Speed     333.3kHz         83.3kHz          Loop frequency normalized to a 1MHz CPU clock
ASM Instructions     2                2
Loop Code Size       4 bytes          2 words
                     4 bytes          3.5 bytes

At this point, we can do a raw comparison of absolute toggle frequency performance after the hand optimization. Comparing this way gives the PIC the advantage of running at 32MHz while the AVR is limited to 20MHz. Interestingly the PIC gains a little as expected, but the overall picture does not change much.
     
    The source code can be downloaded here:
    PIC MPLAB-X Project file  MicroforumToggleTestPic16f1.zip
    AVR MPLAB-X Project file  MicroforumToggleTest4808.zip
    What next?
    For our next installment, we have a number of options. We could add more cores/processors to this test of ours, or we can take a different task and cycle through the candidates on that. We could also vary the tools by using different compilers and see how they stack up against each other and across the architectures. Since our benchmarks will all be based on real-world tasks it should not matter HOW the CPU is performing the task or HOW we created the code, the comparison will simply be how well the job gets done.

    Please do post any ideas or requests in the comments and we will see if we can either improve this one or oblige with another Toe-to-toe comparison.
     
    Updates:
    This post was updated to use the 1 cycle STD instruction instead of the 2 cycle STS instruction for the hand-optimized AVR version in round 2
     
  8. Orunmila
    Comments
    I was musing over a piece of code this week trying to figure out why it was doing something that seemed to not make any sense at first glance. The comments in this part of the code were of absolutely no help, they were simply describing what the code was doing.
    Something like this:
// Add 5 to i
i += 5;

// Send the packet
sendPacket(&packet);

// Wait on the semaphore
sem_wait(&sem);

// Increment thread count
Threadcount++;
These comments just added to the noise in the file, made the code not fit on one page, made it harder to read, and did not tell me anything that the code was not already telling me. What was missing was what I was grappling with. Why was it done this way, why not any other way?
    I asked a colleague and to my frustration his answer was that he remembered that there was some discussion about this part of the code and that it was done this way for a very good reason! My first response was of course "well why is that not in the comments!?" 
I remember having conversations about comments being a code smell many times in the past. There is an excellent talk by Kevlin Henney about this on YouTube. Just like all other code smells, comments are not universally bad, but whenever I see a comment in a piece of code my spider sense starts tingling and I immediately look a bit deeper to try and understand why comments were actually needed here. Is there not a more elegant way to do this which would not require comments to explain, where reading the code would make what it is doing obvious?
    WHAT vs. WHY Comments
We all agree that good code is code which is properly documented, referring to the right amount of comments, but there is a terrible trap here that programmers seem to fall into all of the time. Instead of documenting WHY they are doing things a particular way, they put in the documentation WHAT the code is doing. As Henney explains, English, or whatever written language for that matter, is not nearly as precise as the programming language itself. The code is the best description of what the code is doing, and we hope that someone trying to maintain the code is proficient in the language it is written in, so why all of the WHAT comments?
I quite like this Codemanship video, which shows how comments can be a code smell, and how we can use the comments to refactor our code to be more self-explanatory. The key insight here is that if you have to add a comment to a line or a couple of lines of code, you can probably refactor that code into a function which has the comment as its name. And if a comment sits on a line which only calls a function, the function is probably not named well enough to be obvious; consider taking the comment and using it as the name of the function instead.
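To make that concrete, here is a minimal made-up C sketch of that refactoring (the names are purely illustrative, not from any real code):

// Before: the comment is doing the naming work
// discard everything still waiting in the receive queue
while (bytesInBuffer > 0)
{
    (void)dequeueByte();
    bytesInBuffer--;
}

// After: the function name does the work, no comment needed at the call site
void discardReceiveQueue(void)
{
    while (bytesInBuffer > 0)
    {
        (void)dequeueByte();
        bytesInBuffer--;
    }
}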
    This blog has a number of great examples of how NOT to comment your code, and comical as the examples are the scary part is how often I actually see these kinds of comments in production code! It has a good example of a "WHY" comment as follows.
/* don't use the global isFinite() because it returns true for null values */
Number.isFinite(value)

So what are we to do, how do we know if comments are good or bad?
I would suggest the golden rule must be to test your comment by asking whether it is explaining WHY the code is done this way or whether it is stating WHAT the code is doing. If you are stating WHAT the code is doing then consider why you think the comment is necessary in the first place. First, consider deleting the comment altogether; the code is already explaining what is being done, after all. Next, try to rename things, refactor it into a well-named method, or fix the problem in some other way. If the comment is adding context, explaining WHY it was done this way, what else was considered, and what the trade-offs were that led to it being done this way, then it is probably a good comment.
    Quite often we try more than one approach when designing and implementing a piece of code, weighing various metrics/properties of the code to settle finally on the preferred solution. The biggest mistake we make is not to capture any of this in the documentation of the code. This leads to newcomers re-doing all your analysis work, often re-writing the code before realizing something you learned when you wrote it the first time.
    When you comment your code you should be capturing that kind of context. You should be documenting what was going on in your head when you were writing the code. Nobody should ever read a piece of your code and ask out loud "what were they thinking when they did this?". What you were thinking should be there in plain sight, documented in the comments. 
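As an illustration, here is a made-up C fragment (not from the code I was reviewing) with the kind of context-capturing comment I would like to have found:

/*
 * Why a busy-wait here instead of blocking on the semaphore:
 * the ISR almost always signals within about 10us, and on this part
 * an RTOS context switch costs more than that. We tried a blocking
 * wait first and it doubled the packet latency.
 */
while (!dataReady)
{
    /* spin - see rationale above */
}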
    Conclusion
If you find that you need to track down the right person to maintain any piece of code in your system because "he knows what is going on in that code", or even worse "he is the only one that knows", this should be an indication that the documentation is incomplete, and more often than not you will find that the comments in this code are explaining WHAT it is doing instead of the WHYs.
    When you comment your code avoid at all costs explaining WHAT the code is doing. Always test your comments against the golden rule of comments, and if it is explaining what is happening then delete that comment! Only keep the WHY comments and make sure they are complete.
    And make especially sure that you document the things you considered and concluded would be the wrong thing to do in this piece of code and WHY that is the case.
     
     
  9. Orunmila
    The story of Mel Kaye is perhaps my favorite bit of Lore. Mel is often referred to as a "Real Programmer", of course a parody of the famous essay Real Men Don't Eat Quiche by Bruce Feirstein which led to the publication of a precious tongue-in-cheek book on stereotypes about masculinity, which in turn led Ed Post to speculate about the properties of the "Real Programmer" in his essay Real Programmers Don't use Pascal, but I digress.
The story itself is famous enough to have its own Wikipedia page. Even though it was written 36 years ago, back in 1983, it still resonates well. The story shows us, by the mistakes of others, that doing something in an impressively complex way may very well demonstrate impressive skill and truly awe onlookers, but it is really not the best way to do things in the real world where our code needs to be maintained later.
A good friend once told me that genius is like a fertile field: however great its potential to grow fantastic crops, it is also capable of growing the most impressive weeds if left untended, as Mel has shown us first hand here.
    Without further ado, here it is for your reading pleasure:
    The story of Mel
    A recent article devoted to the *macho* side of programming
         made the bald and unvarnished statement:
                    Real Programmers write in Fortran.
         Maybe they do now,
         in this decadent era of
         Lite beer, hand calculators and "user-friendly" software
         but back in the Good Old Days,
         when the term "software" sounded funny
         and Real Computers were made out of drums and vacuum tubes,
         Real Programmers wrote in machine code.
         Not Fortran. Not RATFOR. Not, even, assembly language.
         Machine Code.
         Raw, unadorned, inscrutable hexadecimal numbers.
         Directly.
         Lest a whole new generation of programmers
         grow up in ignorance of this glorious past,
         I feel duty-bound to describe,
         as best I can through the generation gap,
         how a Real Programmer wrote code.
         I'll call him Mel,
         because that was his name.
         I first met Mel when I went to work for Royal McBee Computer Corp.,
         a now-defunct subsidiary of the typewriter company.
         The firm manufactured the LGP-30,
         a small, cheap (by the standards of the day)
         drum-memory computer,
         and had just started to manufacture
         the RPC-4000, a much-improved,
         bigger, better, faster -- drum-memory computer.
         Cores cost too much,
         and weren't here to stay, anyway.
         (That's why you haven't heard of the company, or the computer.)
         I had been hired to write a Fortran compiler
         for this new marvel and Mel was my guide to its wonders.
         Mel didn't approve of compilers.
         "If a program can't rewrite its own code,"
         he asked, "what good is it?"
         Mel had written,
         in hexadecimal,
         the most popular computer program the company owned.
         It ran on the LGP-30
         and played blackjack with potential customers
         at computer shows.
         Its effect was always dramatic.
         The LGP-30 booth was packed at every show,
         and the IBM salesmen stood around
         talking to each other.
         Whether or not this actually sold computers
         was a question we never discussed.
         Mel's job was to re-write
         the blackjack program for the RPC-4000.
         (Port?  What does that mean?)
         The new computer had a one-plus-one
         addressing scheme,
         in which each machine instruction,
         in addition to the operation code
         and the address of the needed operand,
         had a second address that indicated where, on the revolving drum,
         the next instruction was located.
         In modern parlance,
         every single instruction was followed by a GO TO!
         Put *that* in Pascal's pipe and smoke it.
         Mel loved the RPC-4000
         because he could optimize his code:
         that is, locate instructions on the drum
         so that just as one finished its job,
         the next would be just arriving at the "read head"
         and available for immediate execution.
         There was a program to do that job,
         an "optimizing assembler",
         but Mel refused to use it.
         "You never know where its going to put things",
         he explained, "so you'd have to use separate constants".
         It was a long time before I understood that remark.
         Since Mel knew the numerical value
         of every operation code,
         and assigned his own drum addresses,
         every instruction he wrote could also be considered
         a numerical constant.
         He could pick up an earlier "add" instruction, say,
         and multiply by it,
         if it had the right numeric value.
         His code was not easy for someone else to modify.
         I compared Mel's hand-optimized programs
         with the same code massaged by the optimizing assembler program,
         and Mel's always ran faster.
         That was because the "top-down" method of program design
         hadn't been invented yet,
         and Mel wouldn't have used it anyway.
         He wrote the innermost parts of his program loops first,
         so they would get first choice
         of the optimum address locations on the drum.
         The optimizing assembler wasn't smart enough to do it that way.
         Mel never wrote time-delay loops, either,
         even when the balky Flexowriter
         required a delay between output characters to work right.
         He just located instructions on the drum
         so each successive one was just *past* the read head
         when it was needed;
         the drum had to execute another complete revolution
         to find the next instruction.
         He coined an unforgettable term for this procedure.
         Although "optimum" is an absolute term,
         like "unique", it became common verbal practice
         to make it relative:
         "not quite optimum" or "less optimum"
         or "not very optimum".
         Mel called the maximum time-delay locations
         the "most pessimum".
         After he finished the blackjack program
         and got it to run,
         ("Even the initializer is optimized",
         he said proudly)
         he got a Change Request from the sales department.
         The program used an elegant (optimized)
         random number generator
         to shuffle the "cards" and deal from the "deck",
         and some of the salesmen felt it was too fair,
         since sometimes the customers lost.
         They wanted Mel to modify the program
         so, at the setting of a sense switch on the console,
         they could change the odds and let the customer win.
         Mel balked.
         He felt this was patently dishonest,
         which it was,
         and that it impinged on his personal integrity as a programmer,
         which it did,
         so he refused to do it.
         The Head Salesman talked to Mel,
         as did the Big Boss and, at the boss's urging,
         a few Fellow Programmers.
         Mel finally gave in and wrote the code,
         but he got the test backwards,
         and, when the sense switch was turned on,
         the program would cheat, winning every time.
         Mel was delighted with this,
         claiming his subconscious was uncontrollably ethical,
         and adamantly refused to fix it.
         After Mel had left the company for greener pa$ture$,
         the Big Boss asked me to look at the code
         and see if I could find the test and reverse it.
         Somewhat reluctantly, I agreed to look.
         Tracking Mel's code was a real adventure.
         I have often felt that programming is an art form,
         whose real value can only be appreciated
         by another versed in the same arcane art;
         there are lovely gems and brilliant coups
         hidden from human view and admiration, sometimes forever,
         by the very nature of the process.
         You can learn a lot about an individual
         just by reading through his code,
         even in hexadecimal.
         Mel was, I think, an unsung genius.
         Perhaps my greatest shock came
         when I found an innocent loop that had no test in it.
         No test. *None*.
         Common sense said it had to be a closed loop,
         where the program would circle, forever, endlessly.
         Program control passed right through it, however,
         and safely out the other side.
         It took me two weeks to figure it out.
         The RPC-4000 computer had a really modern facility
         called an index register.
         It allowed the programmer to write a program loop
         that used an indexed instruction inside;
         each time through,
         the number in the index register
         was added to the address of that instruction,
         so it would refer
         to the next datum in a series.
         He had only to increment the index register
         each time through.
         Mel never used it.
         Instead, he would pull the instruction into a machine register,
         add one to its address,
         and store it back.
         He would then execute the modified instruction
         right from the register.
         The loop was written so this additional execution time
         was taken into account --
         just as this instruction finished,
         the next one was right under the drum's read head,
         ready to go.
         But the loop had no test in it.
         The vital clue came when I noticed
         the index register bit,
         the bit that lay between the address
         and the operation code in the instruction word,
         was turned on--
         yet Mel never used the index register,
         leaving it zero all the time.
         When the light went on it nearly blinded me.
         He had located the data he was working on
         near the top of memory --
         the largest locations the instructions could address --
         so, after the last datum was handled,
         incrementing the instruction address
         would make it overflow.
         The carry would add one to the
         operation code, changing it to the next one in the instruction set:
         a jump instruction.
         Sure enough, the next program instruction was
         in address location zero,
         and the program went happily on its way.
         I haven't kept in touch with Mel,
         so I don't know if he ever gave in to the flood of
         change that has washed over programming techniques
         since those long-gone days.
         I like to think he didn't.
         In any event,
         I was impressed enough that I quit looking for the
         offending test,
         telling the Big Boss I couldn't find it.
         He didn't seem surprised.
         When I left the company,
         the blackjack program would still cheat
         if you turned on the right sense switch,
         and I think that's how it should be.
         I didn't feel comfortable
         hacking up the code of a Real Programmer."

             -- Source: usenet: utastro!nather, May 21, 1983.
     
  10. Orunmila
There is one problem we keep seeing bend people's minds over and over again, and that is race conditions under concurrency.
    Race conditions are hard to find because small changes in the code which subtly modifies the timing can make the bug disappear entirely. Adding any debugging code such as a printf can cause the bug to mysteriously disappear. We spoke about this kind of "Heisenbug" before in the lore blog.
In order for there to be a concurrency bug you have to either have multiple threads of execution (as you would have when using an RTOS) or, as we see more often in embedded systems, you are using interrupts. In this context an interrupt behaves the same as a parallel execution thread, and we have to take special care whenever both contexts of execution will be accessing the same variable.
Usually this cannot be avoided; we e.g. receive data in the ISR and want to process it in the main loop. Whenever we have this kind of interaction, where shared data will be accessed by multiple contexts, we need to serialize access to the data to ensure that the two contexts will safely navigate it. You will see that a good design will minimise the amount of data shared between the two contexts and carefully manage this interaction.
    Simple race condition example
    The idea here is simply that we receive bytes in the ISR and process them in the main loop. When we receive a byte we increase a counter and when we process one we reduce the counter. When the counter is at 0 this means that we have no bytes left to process. 
uint8_t bytesInBuffer = 0;

void interrupt serialPortInterrupt(void)
{
    enqueueByte(RCREG);  // Add received byte to buffer
    bytesInBuffer++;
}

void main(void)
{
    while(1)
    {
        if ( bytesInBuffer > 0 )
        {
            uint8_t newByte = dequeueByte();
            bytesInBuffer--;
            processByte(newByte);
        }
    }
}

The problem is that both the interrupt and the mainline code access the same memory here, and these interactions last for multiple instruction cycles. When an operation completes in a single cycle we call it "atomic", which means that it is not interruptable. There are a number of ways that operations which seem to be atomic in C can take multiple machine cycles to complete.
    Some examples:
- Mathematical operations - these often use an accumulator or work register.
- Assignment. If I have e.g. a 32-bit variable on an 8-bit processor, it can take 8 cycles or more to do x = y.
- Most pointer operations (indirect access). [This one can be particularly nasty btw.]
- Arrays [yes, if you do i[12] that actually involves a multiply!]

In fact this happens at two places here, the counter as well as the queue containing the bytes, but we will focus only on the first case for now, the counter "bytesInBuffer".
    Consider what would happen if the code "bytesInBuffer++" compiled to something like this:
MOVFW   bytesInBuffer
ADDLW   1
MOVWF   bytesInBuffer

Similarly the code to reduce the variable could look like this:
MOVFW   bytesInBuffer
SUBLW   1
MOVWF   bytesInBuffer

The race happens once the main execution thread is trying to decrement the variable. Once the value has been copied to the work register the race is on. If the mainline code can complete the calculation before an interrupt happens everything will work fine, but this takes 2 more instructions. If an interrupt happens after the first instruction or after the 2nd instruction the data will be corrupted.
    Lets look at the case where there is 1 byte in the buffer and the main code starts processing this byte, but before the processing is complete another byte arrives. Logically we would expect the counter to be incremented by one in the interrupt and decremented by 1 in the mainline code, so the end result should be bytesInBuffer = 1, and the newly received byte should be ready for processing.
    The execution will proceed something like this - we will simplify to ignore clearing and checking of interrupt flags etc. (W is the work register, I_W is the interrupt context W register):
// State before: bytesInBuffer = 1. Mainline is reducing, next byte arrives during the operation
// Mainline                Interrupt                 // state
   ...                                               [bytesInBuffer = 1]
   MOVFW bytesInBuffer                               [bytesInBuffer = 1, W=1]
   SUBLW 1                                           [bytesInBuffer = 1, W=0]
                           MOVFW bytesInBuffer       [bytesInBuffer = 1, W=0, I_W=1]
                           ADDLW 1                   [bytesInBuffer = 1, W=0, I_W=2]
                           MOVWF bytesInBuffer       [bytesInBuffer = 2, W=0, I_W=2]
   MOVWF bytesInBuffer                               [bytesInBuffer = 0, W=0, I_W=2]
   ...                                               [bytesInBuffer = 0]

As you can see we end up with bytesInBuffer = 0 instead of 1, and the newly received byte is never processed.
    This typically leads to a bug report saying that the serial port is either losing bytes randomly or double-receiving bytes e.g.
           UART (or code) having different behaviors for different baudrates - PIC32MX - https://www.microchip.com/forums/m1097686.aspx#1097742
    Deadlock Empire
    When I get a new junior programmer in my team I always make sure they understand this concept really well by laying down a challenge. They have to defeat "The Deadlock Empire" and bring me proof that they have won the BossFight at the end.
    Deadlock Empire is a fun website which uses a series of examples to show just how hard proper serialization can be, and how even when you try to serialze access using mutexes and/or semaphores you can still end up with problems. 
You can try it out for yourself - the link is https://deadlockempire.github.io
    Even if you are a master of concurrency I bet you will still learn something new in the process!
    Serialization
When we have to share data between 2 or more contexts it is critical that we properly serialize access to it. By this we mean that the one context should get to complete its operation on the data before the 2nd context can access it. We want to ensure that access to the variables happens in series and not in parallel, to avoid all of these problems.
The simplest way to do this (and also the least desirable) is to simply disable interrupts while accessing the variable in the mainline code. That means that the mainline code will never be interrupted in the middle of an operation on the shared data, and so we will be safe.
    Something like this:
// State before: bytesInBuffer = 1. Mainline is reducing, next byte arrives during the operation
// Mainline                            Interrupt                 // state
   ...                                                           [bytesInBuffer = 1]
   BCF GIE  /* disable interrupts */                             [bytesInBuffer = 1]
   MOVFW bytesInBuffer                                           [bytesInBuffer = 1, W=1]
   SUBLW 1                                                       [bytesInBuffer = 1, W=0]
   MOVWF bytesInBuffer                                           [bytesInBuffer = 0, W=0]
   BSF GIE  /* enable interrupts */                              [bytesInBuffer = 0, W=0]
                                       MOVFW bytesInBuffer       [bytesInBuffer = 0, W=0, I_W=0]
                                       ADDLW 1                   [bytesInBuffer = 0, W=0, I_W=1]
                                       MOVWF bytesInBuffer       [bytesInBuffer = 1, W=0, I_W=1]
   ...                                                           [bytesInBuffer = 1]

And the newly received byte is ready to process and our counter is correct.
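In C this looks something like the sketch below. It assumes an 8-bit PIC where the global interrupt enable bit is INTCONbits.GIE; the register name will differ on other devices, so treat this purely as an illustration of the pattern.

void main(void)
{
    while (1)
    {
        INTCONbits.GIE = 0;          // enter critical section: the ISR cannot fire here
        if (bytesInBuffer > 0)
        {
            uint8_t newByte = dequeueByte();
            bytesInBuffer--;         // safe: this read-modify-write cannot be interrupted
            INTCONbits.GIE = 1;      // leave the critical section as soon as possible
            processByte(newByte);    // do the slow work with interrupts enabled again
        }
        else
        {
            INTCONbits.GIE = 1;
        }
    }
}

Note how the critical section is kept as short as possible; processByte() is only called after interrupts have been re-enabled.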
I am not going to go into fancy serialization mechanisms such as mutexes and semaphores here. If you are using an RTOS or are working on a fancy advanced processor then please do read up on the abilities of the processor and the mechanisms provided by the OS to make operations atomic, as well as on critical sections, semaphores, mutexes and concurrent access in general.
    Concurrency Checklist
    We always want to explicitly go over a checklist when we are programming in any concurrent environment to make sure that we are guarding all concurrent access of shared data. My personal flow goes something like this:
1. Do you ever have more than one context of execution?
   - Interrupts
   - Threads
   - Peripherals changing data (we see problems with reading/writing 16-bit timers on 8-bit PICs ALL OF THE TIME! - see the sketch after this list)
2. List out all data that is accessed in more than one context.
   - Look deep, sometimes it is subtle, e.g. accessing an array index may be calling multiply internally.
   - Don't miss registers, especially 16-bit values like Timers or ADC results.
3. Is it possible to reduce the amount of data that is shared?
4. Take a look at the ASM generated by the compiler to make 100% sure you understand what is happening with your data. Operations that look atomic in C are often not atomic in machine instructions.
5. Explicitly design the mechanism for serializing between the two contexts and make sure it is safe under all conditions. Serialization almost always causes one thread to be blocked; this will either slow down processing by blocking the other thread, or increase latency and jitter in the case of interrupts.

We only looked into the counter variable above. Of course the queue holding the data is also shared, and in my experience I see many more issues with queues being corrupted due to concurrent access than I see with simple counters, so do be careful with all shared data.
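Here is the 16-bit timer read sketch I hinted at in the checklist. It assumes Timer1 on an 8-bit PIC being read while it is left running; the idea is simply to re-read the high byte until it is stable, so that the low byte rolling over between the two byte reads cannot hand us a mixed value.

// Illustrative only: read a free-running 16-bit Timer1 from an 8-bit device.
// If TMR1L rolls over into TMR1H between the two reads, the second read of
// TMR1H will differ from the first and we simply try again.
uint16_t readTimer1(void)
{
    uint8_t hi, lo;

    do
    {
        hi = TMR1H;
        lo = TMR1L;
    } while (hi != TMR1H);

    return ((uint16_t)hi << 8) | lo;
}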
    One last example
    I mentioned 16-bit timers a couple of times, I have to show my favourite example of this, which happens when people write to the timer register without stopping the timer. 
// Innocent enough looking code to update the Timer1 register
void updateTimer1(uint16_t value)
{
    TMR1 = value;
}

// This is more common, same problem
void updateTimer1_v2(uint16_t value)
{
    TMR1H = value >> 8;
    TMR1L = value & 0xFF;
}

With the code above the compiler is generally smart enough not to do actual right shifts or masking, realizing that you are working with the lower and upper byte, and this code compiles in both cases to something looking like this:
MOVFW   value+1   // High byte
MOVWF   TMR1H
MOVFW   value     // Low byte
MOVWF   TMR1L

And people always forget that when the timer is still running this can have disastrous results, like when the timer increments in the middle of the update as follows:
// We called updateTimer1(0xFFF0) - expecting a period of 16 cycles to the next overflow
// We show the register value to the right for just one typical error case
// The timer ticks on each instruction
   ...                [TMR1 = 0x00FD]
   MOVFW value+1      [TMR1 = 0x00FE]
   MOVWF TMR1H        [TMR1 = 0xFFFF]
   MOVFW value        [TMR1 = 0x0000]   // the timer overflows here, in the middle of the update
   MOVWF TMR1L        [TMR1 = 0x00F0]
   ...                [TMR1 = 0x00F1]
   ...                [TMR1 = 0x00F2]

This case is always hard to debug because you can only catch it failing when the update overlaps with the low byte overflowing into the high byte, which happens only 1 in 256 timer cycles, and since the update takes 2 cycles we have only a 0.7% chance of catching it in the act.
    As you can see, depending on whether you clear the interrupt flag after this operation or not you can end up with either getting an immediate interrupt due to the overflow in the middle of the operation, or a period which vastly exceeds what you are expecting (by 65280 cycles!).
    Of course if the low byte did not overflow in the middle of this the high byte is unaffected and we get exactly the expected behavior.
    When this happens we always see issues reported on the forums that sound something like "My timer is not working correctly. Every once in a while I get a really short or really long period"
    When I see these issues posted in future I will try to remember to come update this page with links just for fun 🙂
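For completeness, the simple way out of this race is to not write to a running timer at all. Here is a minimal sketch (assuming Timer1 with a TMR1ON control bit in T1CON, as on many 8-bit PICs; adjust the names for your device):

// Stop the timer, load both bytes, then start it again. The few cycles lost
// while the timer is stopped can be compensated for in 'value' if the period
// has to be exact.
void updateTimer1Safe(uint16_t value)
{
    T1CONbits.TMR1ON = 0;   // stop Timer1 so it cannot tick between the two writes
    TMR1H = value >> 8;
    TMR1L = value & 0xFF;
    T1CONbits.TMR1ON = 1;   // start it again
}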
     
     
  11. Orunmila
    I realize now that I have a new pet peeve.
    The widespread and blind inclusion, by Lemmings calling themselves embedded C programmers, of  extern "C" in C header files everywhere.
    To add insult to injury, programmers (if I can call them that) who commit this atrocity tend to gratuitously leave a comment in the code claiming that this somehow makes the code "compatible" with C++ (whatever that is supposed to mean - IT DOES NOT !!!).
    It usually looks something like this (taken from MCC produced code - uart.h)
#ifdef __cplusplus  // Provide C++ Compatibility
extern "C" {
#endif

So why does this annoy me so much? (I mean besides the point that I am unaware of any C++ compiler for the PIC16, which I generated this code for - which in itself is a hint that this has NEVER been tested on any C++ compiler!)
    First and foremost - adding something to your code which claims that it will "Provide C++ Compatibility" is like entering into a contract with the user that they can now safely expect the code to work with C++.  Well I have bad news for you buddy - it takes a LOT more than wrapping everything in extern "C" to make your code "Compatible with C++" !!!  
    If you are going to include that in your code you better know what you are doing and your code better work (as in actually being tested) with a real C++ compiler. If it does not I am going to call out the madness you perpetrated by adding something which only has value when you mix C and C++, but your code cannot be used with C++ at all in the first place - because you are probably doing a host of things which are incompatible with C++ !
    In short, as they say -  your header file comment is writing cheques your programming skills cannot cash.
Usually a conversation about this starts with me asking the perpetrator to explain what "that thing" actually does, and when they think it is needed in order to use the code with a C++ compiler, so let's start there. I can count on one hand the people who have been able to explain this to me correctly.
    Extern "C" - What it does and what it is meant for
This construct is used to tell a C++ compiler that it should set a property called language linkage, which affects mangling of names as well as possible calling conventions. It does not guarantee that you can link to C code compiled with a different compiler.
    This is a good point to quote the C++14 specification, section 7.5 on Language linkage 
    That said, if you are e.g. using GCC for your C as well as your C++ compiler you will have a very good chance of being lucky enough that the implementation-defined linkage will be compatible!
A C++ compiler will use "name mangling" to ensure that symbols with identical names can be distinguished from each other, by passing additional semantic information to the linker. This becomes important when you use e.g. function overloading or namespaces. This mangling of names is a feature of C++ (which allows duplicate names across the program) but does NOT exist in C (which does not allow duplicate symbols to be linked at all).
When you place an extern "C" wrapper around the declaration of a symbol you are telling the C++ compiler to disable this name mangling feature, and also to alter its calling convention when calling a function to be more likely to link with C code, although calling conventions are a topic all on their own.
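As a small illustration (the mangled name shown is what GCC following the Itanium C++ ABI would typically produce; other compilers mangle differently):

// Compiled as C++:
int readSensor(void);              // C++ linkage: the symbol is mangled, e.g. _Z10readSensorv
extern "C" int readSensor_c(void); // C linkage: the symbol is plain "readSensor_c",
                                   // which is exactly what a C object file would contain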
As you may have deduced by now, there is a reason that this is not called "extern C++"! This is required to make your C++ calls match the names and conventions in the C object files, which do not contain mangled names. If you ARE going to place a comment next to your faux pas, at least get it right and say "For compatibility with C naming and calling conventions" instead of claiming incorrectly that this is required for C++ compatibility somehow!
    There is one very specific use case, one which we used to encounter all the time in what would now be called "the olden days" of Windows 3.1 when everything was shared through dynamic link libraries (.dll files). It turns out that in order to use a dll which was written using C++ from your C program you had to wrap the publicly exported function declarations in extern "C". 
    Yes that is correct, this is used to make C++ code compatible with C, NOT THE OTHER WAY AROUND!
    So when do I really need this then?
    Let's take a small step back. If you want to use your C code with a C++ compiler, you can simply compile it all with the C++ compiler! The compiler will resolve the names just fine after mangling them all, and since you are properly using name spaces with your C++ code this will reduce the chances of name collisions and you will be all the happier for doing the right thing. No need for disabling name mangling there. 
    If you do not need this to get your code to compile with a C++ compiler, then when DO you need it?
Ahhh, now you are beginning to see why this has become a pet peeve of mine ... but the plot still has some thickening to do before we are done...
    Pretty much the only case where it makes sense to do this is when you are going to compile some of your code with a C compiler, then compile some other code with a C++ compiler, and in the end take the object files from the two compilers and link them together, probably using the C++ linker. Of course when you feed your .c files and your .cpp files to a typical C++ compiler it will actually run a C compiler on the .c files and the C++ compiler on the .cpp files by default, and mangling will be an issue and you will exclaim "you see! He was wrong!", but not that fast ... it is simple enough to tell the compiler to compile all files with the C++ compiler, and this remains the best way to use C libraries in source form with your C++ code.
    If you are going to compile the C code into a binary library and link it in (which sometimes happens with commercial 3rd party libraries - like .dll's - DUH ! ) then there is a case for you to do this, but most likely this does not apply to you at all as you will have the source available all the time and you are working on an embedded system where you are making a single binary.
To help ease the world of hurt you are probably in for, you should read up on how to tell your compiler to use a particular calling convention which has a chance of being link-compatible. If you are using GCC you can start here.
To be clear, if you add extern "C" to your C code and then compile it into an object file to be linked with your C++ program, the extern "C" qualifier is entirely ignored by the C compiler. Yes, that is right: all the way to producing the object file this has no effect whatsoever. It is only when you are producing calls to the C code from the C++ code that the C++ side is altered to match the C naming and calling conventions, so this means that the C++ code is altered to be compatible with C.
    In the end
    There is a hell of a lot of things you will need to consider if you want to mix C and C++. I promise that I will write another blog on that next time around if I get anybody asking for it in the comments.
Adding extern "C" to your code is NOT required to make it magically "compatible with C++". In order to do that you need to heed with extreme caution the long list of incompatibilities between C and C++, and you will probably have to be much more specific than just stating C and C++. Probably something like C89 and C++11 if I had to wager a guess at what will be most relevant.
    And remember the reasons why it is almost always just plain stupid to use extern "C" in your C headers.
1. It does not do what you think it does.
2. It especially does not do what your comment claims it does - it does not provide C++ compatibility!
3. Don't let your comments write cheques your code cannot cash! Before you even think of adding that, make sure you KNOW AND FOLLOW ALL the C++ compatibility rules first.
4. For heaven's sake TEST your code with a C++ compiler!
5. If you want to use a "C" library with your C++ code, simply compile all the code with your C++ compiler. QED - no mess no fuss!
6. If it is not possible to do 5 above, then compile the C code normally (without that nonsense) and place extern "C" around the #include of the C library only! (example below). After all, this is for providing C linkage compatibility to C++ compiled code!
7. If you are producing a binary library using C to be used/linked with a C++ compiler, then please save us all and just compile the library with both C and C++ and provide 2 binaries!
8. If all of the above fail, because you really just hit that one-in-a-million case where you think you need this, then for Pete's sake educate yourself before you attempt something that hard; hopefully in the process you will realize that it is just a bad idea after all!

Now please just STOP IT!
    I feel very much like pulling a Jeff Atwood and saying after all that only a moron would use extern "C" in their C headers (of course he was talking about using tabs).
    Orunmila
     
    Oh - I almost forgot - the reasonable way of using extern "C" looks like this:
#include <string>   // Yes <string>, not <string.h> - this is C++ !
#include <cstdio>   // Include the C standard libraries the C++ way

// Include plain C libraries with C linkage
extern "C" {
#include "clib1.h"
#include "clib2.h"
}

using namespace std; // Because the cuticles on my typing finger hurt if I have to type out "std::" all the time!

// This is a proper C++ class because I am too patrician to use "C" like that peasant library writer!
class myClass
{
private:
    int myPrivateInt;
    ...
    ...
    Appeals to Authority
    Dan Saks CppCon talk entitled “extern c: Talking to C Programmers about C++” 
    A very good answer on Stackoverflow
    Some decent answers on this Stackoverflow question
    Fairly OK post on geeksforgeeks
    The dynamic loading use case with examples 
    Note: That first one is a bit of a red herring as it does not explain extern C at all - nevertheless it is a great talk which all embedded programmers can benefit from 🙂 
  12. Orunmila
    There is this beautiful story about testing.
I think the moral of the story is that if you do something really well, and people get recognition for doing it, it can become an inspiration to excel. The credit here should not just go to The Black Team, but also to the management that allowed this to develop into such a "thing".
    When I was establishing a testing team at my previous employer I started off by sending this story to each of the hand-picked members. They felt like programming was much more exotic and interesting than testing and nobody wanted to be on the testing team. To my delight the team took the story to heart and in their endeavors to improve testing more and more ended up trying out all kinds of exotic languages like Golang scripting, which they would never have done as embedded C programmers alone. 
    I am tremendously proud of what that testing team of mine has become (you know who you are!), and perhaps this story can be an inspiration to more teams out there.
    The Story
    From : http://www.penzba.co.uk/GreybeardStories/TheBlackTeam.html
    Sometimes because of the nature of my work I get to hear stories from a greybeard with a past that is, well, "interesting." And again, because of the nature of my work, or more accurately, because of the nature of their work, the stories can't be verified.
    That's just life.
    Sometimes the stories can't even be told. But some, after time, can. I'm starting to write them up, and this is one of them.
    Once upon a time most hackers knew about the infamous IBM "Black Team". In case you don't, I suggest you go read about them first, before you carry on here:  
    The Black Team (That should open in a new window or tab - just close it when you're done.)
    So let me tell you a story involving a member of the Black Team. This has come to me 13th hand, so I can't vouch for its veracity, but I can believe it. Oh yes, I can believe it.
    There was a programmer at IBM who was terribly excited to get the job writing the driver/interface software for a brand spanking new tape-drive. This was one of the fascinating machines - cabinet sized - that you might see in movies from the 60s, 70s and beyond. Reel to reel tape, running forward, then backwards, stopping, starting, no apparent reason for the direction, speed or duration. Almost hypnotic. Maybe I should write a simulator as a screen-saver.
    Hmm ...
    Anyway, our programmer was very excited by the idea of writing the software for this new piece of hardware, but also a little anxious. After all, his work would attract the attention of the Black Team, something no programmer ever really wanted.
    So he decided to thwart them. He decided to write his code perfectly. He decided to prove the code, formally. He decided to write code that was so clean, so neat, so perfect, that nothing wrong could be found with it.
    There are two ways to write code: write code so simple there are obviously no bugs in it, or write code so complex that there are no obvious bugs in it.  -- C.A.R. Hoare He worked really hard at getting this code perfect, and the time came for a member of the Black Team to inspect his code and test his software, they found no bugs.
    None!
    And they were annoyed.
    They came back. At first in twos and threes, but finally the entire team descended on him and his code to find the bugs they simply knew must be there, all to no avail. So our programmer was happy, content, confident, and above all, smug.
    So the day came when the hardware was to be unveiled and demonstrated to the world. This sort of event was always a big one. "New Hardware" was a big deal, and duly trumpeted.
    Then, at the last minute before the demonstration began, a member of the Black Team hurried up to the console and began frantically typing in commands. Our programmer was confident - he knew the code was perfect. It was proven and tested.
    Nothing could go wrong.
    He wasn't even perturbed when the tape started spinning at full speed and running right to the end. It stopped before the end, as he knew it would. It was safe, it was working, it was perfect.
    The tape then started to rewind at full speed, but that wasn't a problem either. Again, it stopped just short of the end. No problem.
    Again and again the tape ran all the way to the end, then all the way to the start. Again and again it stopped within tolerances. Our programmer smirked, knowing the Black Team were beaten.
    And then he saw it. The cabinet had started to build up a gentle, rocking motion. And it was growing. What could be happening?
    The Black Team had found the fundamental frequency at which the cabinet would rock, and had programmed the tape to resonate. In mounting horror, matching the mounting amplitude, the programmer watch as the cabinet at first subtly, then clearly, and finally unmistakably began rocking ever more violently, until finally, inevitably, it fell over.
    In front of the World's press.
    History doesn't relate what happened to the programmer, or to the product. Despite the tale being utterly believable, I've been able to find no record of it. The greybeard who told me the story has moved on to a place where I can no longer ask questions, and so I'm left with what might just be a tall tale.
    Surely there would survive some report of this event, some record, somewhere.
    Do you have a copy?
    [you can read Colin Wright's blog here https://www.solipsys.co.uk/new/ColinsBlog.html?ColinWright ]
    8<-------8<-------8<-------8<-------8<-------8<-------8<-------8<-------8<-------
    The epilog to that background story linked there sums it up best:
    Epilog
    Readers not familiar with the software industry might not grasp the full significance of what the Black Team accomplished within IBM, but the real lesson to be learned from the story has nothing to do with software.
    A group of slightly above-average people assigned to do what many considered an unglamorous and thankless task not only achieved success beyond anyone's wildest expectations, but undoubtedly had a great time doing it and wound up becoming legends in their field.
As I read through the end-of-year lists of all the problems the computer industry and the world as a whole is facing, I just can't seem to bring myself to view them with the gravity the authors seem to have intended. After all, even the worst of our problems seem solvable by a few like-minded people with a bit of dedication.
     
     
     
     
  13. Orunmila
    Melvin Conway quipped the phrase back in 1967 that "organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations."
Over the decades this old adage has proven to be quite accurate, and it has become known as "Conway's Law". Researchers from MIT and Harvard have since shown that there is strong evidence for this correlation; they called it "The Mirroring Hypothesis".
When you read "The Mythical Man-Month" by Fred Brooks you see that we already knew back in the seventies that there is no silver bullet when it comes to Software Engineering, and that the reason for this is essentially the complexity of software and how we deal with it. It turns out that adding more people to a software project increases the number of people we need to communicate with and the number of people who need to understand it. When we just make one big team where everyone has to communicate with everyone, the code tends to reflect this structure. As we can see, the more people we add to a team the more the structure quickly starts to resemble something we all know all too well!

When we follow the age-old technique of divide and conquer, making small Agile teams that each work on a part of the code which is their single responsibility, it turns out that we end up getting encapsulation and modularity, with dependencies managed between the modules.
    No wonder the world is embracing agile everywhere nowadays!
You can of course do your own research on this; here are some org charts of some well-known companies out there you can use to check the hypothesis for yourself!

     
  14. Orunmila
    Exactly how long is a nanosecond?
    This Lore blog is all about standing on the shoulders of giants.
    Back in February 1944 IBM shipped the Harvard Mark 1 to Harvard University. It looked like this:

The Mark I was a remarkable machine at the time: it could perform addition in 1 cycle (which took roughly 0.3 seconds) and multiplication in 20 cycles, or 6 seconds. Calculating sin(x) would run up to 60 seconds (1 minute).
The team that ran this electromechanical computer had on it a remarkable young giant, Grace Brewster Murray Hopper, who went on to become a Rear Admiral in the US Navy before her retirement 43-odd years later. During her career as one of the first and finest computer scientists she was involved in a myriad of innovations.
    In 1949 she proposed the concept of creating a human readable language made up entirely of English words to program a computer. Her idea was readily dismissed because "computers do not understand English". It was only 3 years later that her idea gained traction when she published a paper entitled "The education of a computer" on the subject. She called this a "compiler". 
    In 1959 she was part of the team commissioned to develop a new programming language for business use. It was called COBOL.
    But enough of the introductions already! If we had to go over everything Grace Hopper accomplished we would be here all day - here is the Wikipedia page on Grace Hopper for the uninitiated who do not know who this remarkable woman was.
    I have never seen a better explanation of orders of magnitude and scale than the representation Grace Hopper was known for explaining how long a nanosecond really is. Here is a video from the archives of her explaining this brilliantly in layman's terms in just under 2 minutes.
     
    I also love the interview Letterman did with her back in 1986 where she explained this briefly on national television. For television she stepped up her game a notch and took the nanosecond to the next level, also explaining through an analogy how long a Pico-second is!

    This 10 minute interview really goes a long way in capturing the essence of who Grace Hopper really was, a remarkable pioneer in Engineering indeed!
     
     
  15. Orunmila
    Whenever I start a new project I always start off reaching for a simple while(1) "superloop" architecture https://en.wikibooks.org/wiki/Embedded_Systems/Super_Loop_Architecture . This works well for doing the basics but more often than not I quickly end up short and looking to employ a timer to get some kind of scheduling going.
    MCC makes this pretty easy and convenient to set up. It contains a library called "Foundation Services" which has 2 different timer implementations, TIMEOUT and RTCOUNTER. These two library modules have pretty much the same interface, but they are implemented very differently under the hood. For my little "Operating System" I am going to prefer the RTCOUNTER version as keeping time accurately is more important to me than latency.
    The TIMEOUT module is capable of providing low latency reaction times whenever a timer expires by adjusting the timer overflow point so that an interrupt will occur right when the next timer expires. This allows you to use the ISR to call the action you want to happen directly and immediately from the interrupt context.
    Nice as that may be in some cases, it always increases the complexity, the code size and the overall cost of the system. In our case RTCOUNTER is more than good enough so we will stick with that.
    RTCOUNTER Operation
    First a little bit more about RTCOUNTER.
    In short RTCOUNTER keeps track of a list of running timers. Whenever you call the "check" function it will compare the expiry time of the next timer and call the task function for that timer if it has expired.
It achieves this by using a single hardware timer which operates in "Free Running" mode. This means the hardware timer will never be "re-loaded" by the code; it will simply overflow back to its starting value naturally, and every time this happens the module will count another overflow in an overflow counter. The count of the timer is made up by a combination of the actual hardware timer and the overflow counter. By "hiding" or abstracting the real size of the hardware timer like this, the module can easily be switched over to use any of the PIC timers, regardless of whether they count up or down or how many bits they implement in hardware.
Mode                       32-bit Timer value
General                    (32-x) bits of g_rtcounterH    x-bit Hardware Timer
Using TMR0 in 8-bit mode   24 bits of g_rtcounterH        TMR0 (8-bit)
Using TMR1                 16 bits of g_rtcounterH        TMR1H (8-bit)  TMR1L (8-bit)

RTCOUNTER is compatible with all the PIC timers, and you can switch it to a different timer later without modifying your application code, which is nice. Since all that happens when the timer overflows is updating the counter, the Timer ISR is as short and simple as possible.
// We only increment the overflow counter and clear the flag on every interrupt
void rtcount_isr(void)
{
    g_rtcounterH++;
    PIR4bits.TMR1IF = 0;
}

When we run the "check" function it will construct the 32-bit time value by combining the hardware timer and the overflow counter (g_rtcounterH). It will then compare this value to the expiry time of the next timer in the list to expire. By keeping the list of timers sorted by expiry time it saves time during the checking (which happens often) by doing the sorting work during creation of the timer (which happens infrequently).
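To make that concrete, combining the two counters could look something like the sketch below. The function name and the retry loop are my own illustration of the idea (the library does this internally); the retry guards against the overflow counter changing while we are reading the hardware timer.

// Illustrative only: assemble a 32-bit tick count from the overflow counter
// and the 16-bit Timer1 registers.
uint32_t rtcount_getTicks(void)
{
    uint32_t high;
    uint16_t hw;

    do
    {
        high = g_rtcounterH;                     // snapshot the overflow counter
        hw   = ((uint16_t)TMR1H << 8) | TMR1L;   // read the hardware timer
    } while (high != g_rtcounterH);              // retry if an overflow slipped in

    return (high << 16) | hw;
}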
    How to use it
Using it is fairly straightforward.
1. Create a callback function which returns the "re-schedule" time for the timer.
2. Allocate memory for your timer/task and tie it to your callback function.
3. Create the timer (which starts it), specifying how long before it will first expire.
4. Regularly call the check function to check if the next timer has expired, and call its callback if it has.

In C the whole program may look something like this example:
#include "mcc_generated_files/mcc.h"

int32_t ledFlasher(void* p);

rtcountStruct_t myTimer = {ledFlasher};

void main(void)
{
    SYSTEM_Initialize();

    INTERRUPT_GlobalInterruptEnable();
    INTERRUPT_PeripheralInterruptEnable();

    rtcount_create(&myTimer, 1000); // Create a new timer using the memory at &myTimer

    // This is my main Scheduler or OS loop, all tasks are executed from here from now on
    while (1)
    {
        // Check if the next timer has expired, call its callback from here if it did
        rtcount_callNextCallback();
    }
}

int32_t ledFlasher(void* p)
{
    LATAbits.RA0 ^= 1; // Toggle our pin

    return 1000; // When we are done we want to restart this timer 1000 ticks later, return 0 to stop
}
    Example with Multiple Timers/Tasks
    Ok, admittedly blinking an LED with a timer is not rocket science and not really impressive, so let's step it up and show how we can use this concept to write an application which is more event-driven than imperative. 
    NOTE: If you have not seen it yet I recommend reading Martin Fowler's article on Event Sourcing and how this design pattern reduces the probability of errors in your system on his website here.
    By splitting our program into tasks (or modules) which each perform a specific action and works independently of other tasks, we can generate code modules which are completely independent and re-usable quite easily. Independent and re-usable (or mobile as Uncle Bob says) does not only mean that the code is maintainable, it also means that we can test and debug each task by itself, and if we do this well it will make the code much less fragile. Code is "fragile" when you are fixing something in one place and something seemingly unrelated breaks elsewhere ... that will be much less likely to happen.
    For my example I am going to construct some typical tasks which need to be done in an embedded system. To accomplish this we will
- Create a task function for each of these by creating a timer for it.
- Control/set the amount of CPU time afforded to each task by controlling how often the timer times out.
- Communicate between tasks only through a small number of shared variables (this is best done using Message Queues - we will post about those in a later blog some time).

Let's go ahead and construct our system. Here is the big picture view:

    This system has 6 tasks being managed by the Scheduler/OS for us.
1. Sampling the ADC to check the battery level. This has to happen every 5 seconds.
2. Process keys. We are looking at a button which needs to be de-bounced (100 ms).
3. Process the serial port for any incoming messages. The port is on interrupt, baud is 9600. Our buffer is 16 bytes so we want to check it every 10ms to ensure we do not lose data.
4. Update the system LCD. We only update the LCD when the data has changed; we want to check for a change every 100ms.
5. Update LEDs. We want this to happen every 500ms.
6. Drive Outputs. Based on our secret sauce we will decide when to toggle some pins; we do this every 1s.

These tasks will work together, or co-operate, by keeping to the promise never to run for a long time (let's agree 10ms is a long time; tasks taking longer than that need to be broken into smaller steps). This arrangement is called Co-operative Multitasking. This is a well-known mechanism of multi-tasking on a microcontroller, and has been used in systems like Windows 3.1 and Windows 95 as well as Classic Mac OS in the past.
    By using the Scheduler and event driven paradigm here we can implement and test each of these subsystems independently. Even when we have it all put together we can easily replace one of these subsystems with a "Test" version of it and use that to generate test conditions for us to ensure everything will work correctly under typical operation conditions. We can "disable" any part of the system by simply commenting out the "create" function for that timer and it will not run. We can also adjust how often things happen or adjust priorities by modifying the task time values.
As before we first allocate some memory to store all of our tasks. We will initialize each task with a pointer to the callback function used to perform this task as before. The main program ends up looking something like this.
void main(void)
{
    SYSTEM_Initialize();

    INTERRUPT_GlobalInterruptEnable();
    INTERRUPT_PeripheralInterruptEnable();

    rtcount_create(&adcTaskTimer, 5000);
    rtcount_create(&keyTaskTimer, 100);
    rtcount_create(&serialTaskTimer, 10);
    rtcount_create(&lcdTaskTimer, 100);
    rtcount_create(&ledTaskTimer, 500);
    rtcount_create(&outputTaskTimer, 1000);

    // This is my main Scheduler or OS loop, all tasks are executed from events
    while (1)
    {
        // Check if the next timer has expired, call its callback from here if it did
        rtcount_callNextCallback();
    }
}
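As an illustration of what one of those task functions might look like (this is my own sketch; the real task functions are in the downloadable project below), here is a possible key-scanning task that checks a button and asks to be re-scheduled every 100ms. The pin name and the shared keyPressedEvent flag are assumptions made up for the example.

// Hypothetical key task: runs every 100ms, detects a press on RB0 (active low)
// and records it in a flag shared with the other tasks.
int32_t keyTask(void* p)
{
    static uint8_t lastState = 1;       // idle high
    uint8_t state = PORTBbits.RB0;

    if ((lastState == 1) && (state == 0))
    {
        keyPressedEvent = 1;            // consumed by the LCD/output tasks
    }
    lastState = state;

    return 100;                         // re-schedule this task 100 ticks from now
}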
As always the full project, including the task functions and the timer variable declarations, can be downloaded. The skeleton of this program, which does initialize the other peripherals but only runs the timers, compiles to only 703 words of code or 8.6% on this device, and it runs all 6 program tasks using a single hardware timer.
     
  16. Orunmila
    The basic idea behind this principle is that taking measurements influence the thing you are measuring.
In microcontrollers we all get introduced to this idea at some point when we are trying to figure out if the oscillator is running, and measuring it with a scope probe we realize - of course only after a couple of hours of struggling - that the 10pF load the scope probe is adding to the pin is actually causing the oscillator, which was working just fine, to stop dead.
    In software we have similar problems. First we find out we cannot set breakpoints because the optimizer has mixed the instructions so much that they no longer correlate to the C instructions. We are advised to disable optimization, so that we can set a breakpoint to inspect what is going on, only to discover that when optimizations are disabled the problem is gone. We think we are smart and add printf's to the code, side-stepping the optimization problem entirely, only to end up with working code once again, and the moment we remove the prints it breaks! It is almost like the bug disappears the moment you start looking at it ...
    These "Heisenbugs" are usually an indication that there is some kind of concurrency problem or race condition, where the timing of the code is a critical part of the problem being exposed.
    The oldest reference to a "Heisenbug" is a 1983 paper in the ACM.
    As I often do, here is the link to Heisenbug on Wikipedia.
  17. Orunmila
    The King's Toaster
    Anonymous
    Once upon a time, in a kingdom not far from here, a king summoned two of his advisors for a test. He showed them both a shiny metal box with two slots in the top, a control knob and a lever.
    "What do you think this is?"
    One advisor, an engineer, answered first. "It is a toaster," he said.
    The king asked, "How would you design an embedded computer for it?"
    The engineer replied, "Using a four-bit microcontroller, I would write a simple program that reads the darkness knob and quantizes its position to one of 16 shades of darkness, from snow white to coal black. The program would use that darkness level as the index to a 16-element table of initial timer values. Then it would turn on the heating elements and start the timer with the initial value selected from the table. At the end of the time delay, it would turn off the heat and pop up the toast. Come back next week, and I'll show you a working prototype."
    The second advisor, a computer scientist, immediately recognized the danger of such short-sighted thinking. He said, "Toasters don't just turn bread into toast, they are also used to warm frozen waffles. What you see before you is really a breakfast food cooker. As the subjects of your kingdom become more sophisticated, they will demand more capabilities. They will need a breakfast food cooker that can also cook sausage, fry bacon, and make scrambled eggs. A toaster that only makes toast will soon be obsolete. If we don't look to the future, we will have to completely redesign the toaster in just a few years.
    With this in mind, we can formulate a more intelligent solution to the problem. First, create a class of breakfast foods. Specialize this class into subclasses: grains, pork and poultry. The specialization process should be repeated with grains divided into toast, muffins, pancakes and waffles; pork divided into sausage, links and bacon; and poultry divided into scrambled eggs, hard-boiled eggs, poached eggs, fried eggs, and various omelet classes.
    The ham and cheese omelet class is worth special attention because it must inherit characteristics from the pork, dairy and poultry classes. Thus, we see that the problem cannot be properly solved without multiple inheritance. At run time, the program must create the proper object and send a message to the object that says, 'Cook yourself'. The semantics of this message depend, of course, on the kind of object, so they have a different meaning to a piece of toast than to scrambled eggs.
    Reviewing the process so far, we see that the analysis phase has revealed that the primary requirement is to cook any kind of breakfast food. In the design phase, we have discovered some derived requirements. Specifically, we need an object-oriented language with multiple inheritance. Of course, users don't want the eggs to get cold while the bacon is frying, so concurrent processing is required, too.
    We must not forget the user interface. The lever that lowers the food lacks versatility and the darkness knob is confusing. Users won't buy the product unless it has a user-friendly, graphical interface.
    When the breakfast cooker is plugged in, users should see a cowboy boot on the screen. Users click on it and the message 'Booting UNIX v. 8.3' appears on the screen. (UNIX 8.3 should be out by the time the product gets to the market.) Users can pull down a menu and click on the foods they want to cook.
    Having made the wise decision of specifying the software first in the design phase, all that remains is to pick an adequate hardware platform for the implementation phase. An Intel 80386 with 8MB of memory, a 30MB hard disk and a VGA monitor should be sufficient. If you select a multitasking, object oriented language that supports multiple inheritance and has a built-in GUI, writing the program will be a snap. (Imagine the difficulty we would have had if we had foolishly allowed a hardware-first design strategy to lock us into a four-bit microcontroller!)."
    The king had the computer scientist thrown in the moat, and they all lived happily ever after.
  18. Orunmila
    Epigrams on Programming
    Alan J. Perlis
    Yale University
    This text has been published in SIGPLAN Notices Vol. 17, No. 9, September 1982, pages 7 - 13. 
    The phenomena surrounding computers are diverse and yield a surprisingly rich base for launching metaphors at individual and group activities. Conversely, classical human endeavors provide an inexhaustible source of metaphor for those of us who are in labor within computation. Such relationships between society and device are not new, but the incredible growth of the computer's influence (both real and implied) lends this symbiotic dependency a vitality like a gangly youth growing out of his clothes within an endless puberty.
    The epigrams that follow attempt to capture some of the dimensions of this traffic in imagery that sharpens, focuses, clarifies, enlarges and beclouds our view of this most remarkable of all mans' artifacts, the computer.
    One man's constant is another man's variable. Functions delay binding: data structures induce binding. Moral: Structure data late in the programming process. Syntactic sugar causes cancer of the semi-colons. Every program is a part of some other program and rarely fits. If a program manipulates a large amount of data, it does so in a small number of ways. Symmetry is a complexity reducing concept (co-routines include sub-routines); seek it everywhere. It is easier to write an incorrect program than understand a correct one. A programming language is low level when its programs require attention to the irrelevant. It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures. Get into a rut early: Do the same processes the same way. Accumulate idioms. Standardize. The only difference (!) between Shakespeare and you was the size of his idiom list - not the size of his vocabulary. If you have a procedure with 10 parameters, you probably missed some. Recursion is the root of computation since it trades description for time. If two people write exactly the same program, each should be put in micro-code and then they certainly won't be the same. In the long run every program becomes rococo - then rubble. Everything should be built top-down, except the first time. Every program has (at least) two purposes: the one for which it was written and another for which it wasn't. If a listener nods his head when you're explaining your program, wake him up. A program without a loop and a structured variable isn't worth writing. A language that doesn't affect the way you think about programming, is not worth knowing. Wherever there is modularity there is the potential for misunderstanding: Hiding information implies a need to check communication. Optimization hinders evolution. A good system can't have a weak command language. To understand a program you must become both the machine and the program. Perhaps if we wrote programs from childhood on, as adults we'd be able to read them. One can only display complex information in the mind. Like seeing, movement or flow or alteration of view is more important than the static picture, no matter how lovely. There will always be things we wish to say in our programs that in all known languages can only be said poorly. Once you understand how to write a program get someone else to write it. Around computers it is difficult to find the correct unit of time to measure progress. Some cathedrals took a century to complete. Can you imagine the grandeur and scope of a program that would take as long? For systems, the analogue of a face-lift is to add to the control graph an edge that creates a cycle, not just an additional node. In programming, everything we do is a special case of something more general - and often we know it too quickly. Simplicity does not precede complexity, but follows it. Programmers are not to be measured by their ingenuity and their logic but by the completeness of their case analysis. The 11th commandment was "Thou Shalt Compute" or "Thou Shalt Not Compute" - I forget which. The string is a stark data structure and everywhere it is passed there is much duplication of process. It is a perfect vehicle for hiding information. Everyone can be taught to sculpt: Michelangelo would have had to be taught how not to. So it is with the great programmers. 
The use of a program to prove the 4-color theorem will not change mathematics - it merely demonstrates that the theorem, a challenge for a century, is probably not important to mathematics. The most important computer is the one that rages in our skulls and ever seeks that satisfactory external emulator. The standardization of real computers would be a disaster - and so it probably won't happen. Structured Programming supports the law of the excluded muddle. Re graphics: A picture is worth 10K words - but only those to describe the picture. Hardly any sets of 10K words can be adequately described with pictures. There are two ways to write error-free programs; only the third one works. Some programming languages manage to absorb change, but withstand progress. You can measure a programmer's perspective by noting his attitude on the continuing vitality of FORTRAN. In software systems it is often the early bird that makes the worm. Sometimes I think the only universal in the computing field is the fetch-execute-cycle. The goal of computation is the emulation of our synthetic abilities, not the understanding of our analytic ones. Like punning, programming is a play on words. As Will Rogers would have said, "There is no such thing as a free variable." The best book on programming for the layman is "Alice in Wonderland"; but that's because it's the best book on anything for the layman. Giving up on assembly language was the apple in our Garden of Eden: Languages whose use squanders machine cycles are sinful. The LISP machine now permits LISP programmers to abandon bra and fig-leaf. When we understand knowledge-based systems, it will be as before - except our finger-tips will have been singed. Bringing computers into the home won't change either one, but may revitalize the corner saloon. Systems have sub-systems and sub-systems have sub-systems and so on ad infinitum - which is why we're always starting over. So many good ideas are never heard from again once they embark in a voyage on the semantic gulf. Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy. A LISP programmer knows the value of everything, but the cost of nothing. Software is under a constant tension. Being symbolic it is arbitrarily perfectible; but also it is arbitrarily changeable. It is easier to change the specification to fit the program than vice versa. Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. In English every word can be verbed. Would that it were so in our programming languages. Dana Scott is the Church of the Lattice-Way Saints. In programming, as in everything else, to be in error is to be reborn. In computing, invariants are ephemeral. When we write programs that "learn", it turns out we do and they don't. Often it is means that justify ends: Goals advance technique and technique survives even when goal structures crumble. Make no mistake about it: Computers process numbers - not symbols. We measure our understanding (and control) by the extent to which we can arithmetize an activity. Making something variable is easy. Controlling duration of constancy is the trick. Think of all the psychic energy expended in seeking a fundamental distinction between "algorithm" and "program". If we believe in data structures, we must believe in independent (hence simultaneous) processing. For why else would we collect items within a structure? Why do we tolerate languages that give us the one without the other? 
In a 5 year period we get one superb programming language. Only we can't control when the 5 year period will begin. Over the centuries the Indians developed sign language for communicating phenomena of interest. Programmers from different tribes (FORTRAN, LISP, ALGOL, SNOBOL, etc.) could use one that doesn't require them to carry a blackboard on their ponies. Documentation is like term insurance: It satisfies because almost no one who subscribes to it depends on its benefits. An adequate bootstrap is a contradiction in terms. It is not a language's weaknesses but its strengths that control the gradient of its change: Alas, a language never escapes its embryonic sac. It is possible that software is not like anything else, that it is meant to be discarded: that the whole point is to always see it as soap bubble? Because of its vitality, the computing field is always in desperate need of new cliches: Banality soothes our nerves. It is the user who should parameterize procedures, not their creators. The cybernetic exchange between man, computer and algorithm is like a game of musical chairs: The frantic search for balance always leaves one of the three standing ill at ease. If your computer speaks English it was probably made in Japan. A year spent in artificial intelligence is enough to make one believe in God. Prolonged contact with the computer turns mathematicians into clerks and vice versa. In computing, turning the obvious into the useful is a living definition of the word "frustration". We are on the verge: Today our program proved Fermat's next-to-last theorem! What is the difference between a Turing machine and the modern computer? It's the same as that between Hillary's ascent of Everest and the establishment of a Hilton hotel on its peak. Motto for a research laboratory: What we work on today, others will first think of tomorrow. Though the Chinese should adore APL, it's FORTRAN they put their money on. We kid ourselves if we think that the ratio of procedure to data in an active data-base system can be made arbitrarily small or even kept small. We have the mini and the micro computer. In what semantic niche would the pico computer fall? It is not the computer's fault that Maxwell's equations are not adequate to design the electric motor. One does not learn computing by using a hand calculator, but one can forget arithmetic. Computation has made the tree flower. The computer reminds one of Lon Chaney - it is the machine of a thousand faces. The computer is the ultimate polluter. Its feces are indistinguishable from the food it produces. When someone says "I want a programming language in which I need only say what I wish done," give him a lollipop. Interfaces keep things tidy, but don't accelerate growth: Functions do. Don't have good ideas if you aren't willing to be responsible for them. Computers don't introduce order anywhere as much as they expose opportunities. When a professor insists computer science is X but not Y, have compassion for his graduate students. In computing, the mean time to failure keeps getting shorter. In man-machine symbiosis, it is man who must adjust: The machines can't. We will never run out of things to program as long as there is a single program around. Dealing with failure is easy: Work hard to improve. Success is also easy to handle: You've solved the wrong problem. Work hard to improve. One can't proceed from the informal to the formal by formal means. Purely applicative languages are poorly applicable. 
The proof of a system's value is its existence. You can't communicate complexity, only an awareness of it. It's difficult to extract sense from strings, but they're the only communication coin we can count on. The debate rages on: Is PL/I Bactrian or Dromedary? Whenever two programmers meet to criticize their programs, both are silent. Think of it! With VLSI we can pack 100 ENIACs in 1 sq.cm. Editing is a rewording activity. Why did the Roman Empire collapse? What is the Latin for office automation? Computer Science is embarrassed by the computer. The only constructive theory connecting neuroscience and psychology will arise from the study of software. Within a computer natural language is unnatural. Most people find the concept of programming obvious, but the doing impossible. You think you know when you learn, are more sure when you can write, even more when you can teach, but certain when you can program. It goes against the grain of modern education to teach children to program. What fun is there in making plans, acquiring discipline in organizing thoughts, devoting attention to detail and learning to be self-critical? If you can imagine a society in which the computer-robot is the only menial, you can imagine anything. Programming is an unnatural act. Adapting old programs to fit new machines usually means adapting new machines to behave like old ones. In seeking the unattainable, simplicity only gets in the way. If there are epigrams, there must be meta-epigrams.
    Epigrams are interfaces across which appreciation and insight flow. Epigrams parameterize auras. Epigrams are macros, since they are executed at read time. Epigrams crystallize incongruities. Epigrams retrieve deep semantics from a data base that is all procedure. Epigrams scorn detail and make a point: They are a superb high-level documentation. Epigrams are more like vitamins than protein. Epigrams have extremely low entropy. The last epigram? Neither eat nor drink them, snuff epigrams.
  19. Orunmila
    If you are going to be writing any code you can probably use all the help you can get, and in that line you better be aware of the "Ballmer Peak".
    Legend has it that drinking alcohol impairs your ability to write code, BUT there is a curious peak somewhere in the vicinity of a 0.14 BAC where programmers attain almost super-human programming skill. The XKCD below tries to explain the finer nuances.
But seriously, some studies suggest there is a grain of truth to this, in the sense that a small increase in BAC may lower the inhibitions which stifle creativity.
It is believed the name really comes from the Balmer series (with one L) - a series of peaks in the emission spectrum of hydrogen described in 1885 by Johann Balmer - but since this is about programming it soon got converted to Ballmer, after the former CEO of Microsoft, who did tend to behave like he was on such a peak on stage from time to time (see https://www.youtube.com/watch?v=I14b-C67EXY).
    Either way, we are just going to believe that this one is true - confirmation bias and all, really - just Google it ! 🙂
     

  20. Orunmila
They Write the Right Stuff - what we can learn from the way the Space Shuttle software was developed
As a youngster I watched the movie "The Right Stuff" about the Mercury Seven. The film deified the image of an astronaut in my mind as some kind of superhero. Many years later I read and absolutely loved an article written by Charles Fishman in Fast Company in 1996, entitled "They Write the Right Stuff", which did something similar for the developers who wrote the code that put those guys up there.
    Read the article here : https://www.fastcompany.com/28121/they-write-right-stuff
    I find myself going back to this article time and again when confronted by the question about how many bugs is acceptable in our code, or even just on what is possible in terms of quality. The article explores the code which ran the space shuttle and the processes the team responsible for this code follows.
    This team is one of very few teams certified to CMM level 5, and the quality of the code is achieved not through intelligence or heroic effort, but by a culture of quality engrained in the processes they have created.
    Here are the important numbers for the Space Shuttle code:
Lines of code: 420,000
Known bugs per version: 1
Total development cost: $200,000,000 ($200M)
Cost per line of code (1975 value): $476.00
Cost per line of code (2019 inflation-adjusted value): $2,223.00
Development team (count): ±100
Average productivity (lines/developer/workday): 8.04
The moral of the story here is that the quality everyone desires is indeed achievable, but we tend to severely underestimate the cost which would go along with this level of quality. I have used the numbers from this article countless times in arguments about the trade-off between quality and cost on my projects.
    The most important lesson to learn from this case study is that quality seems to depend primarily on the development processes the team follows and the culture of quality which they adhere to.
    As Fishman points out, in software development projects most people focus on attempting to test quality into the software, and of course as Steve McConnell pointed out this is "like weighing yourself more often" in an attempt to lose weight. 
Another lesson to be learned is that the process inherently accepts the fact that people will make mistakes. Quality is not ensured by people trying really hard to avoid mistakes; instead, the process accepts that mistakes will be made and builds the detection and correction of these mistakes into the process itself. This means that when something slips through, it is inappropriate to blame any individual for the mistake - the only sensible thing to do is to acknowledge a gap in the process. When a bug does make it out into the wild, the team focuses on how to fix the process instead of trying to lay blame on the person who was responsible for the mistake.
Overall, this article is a great story, chock-full of lessons to learn and pass on.
    References
    The full article is archived here : https://www.fastcompany.com/28121/they-write-right-stuff
    NASA has a great piece published on the history of the software https://history.nasa.gov/computers/Ch4-5.html
    Fantastic read- the interview with Tony Macina published in Communications of the ACM in 1984 -  http://klabs.org/DEI/Processor/shuttle/shuttle_primary_computer_system.pdf
     
     
     
  21. Orunmila
Some reading is just compulsory in Computer Science, like Shakespeare is to English; you will get no admiration from your peers if it comes out that you have never heard of the Bastard Operator From Hell.
There is a whole collection of BOFH stories online here http://bofh.bjash.com/index.html#Original and of course a Wikipedia page https://en.wikipedia.org/wiki/Bastard_Operator_From_Hell. The stories were originally posted on Usenet around 1992 by Simon Travaglia.
    I would recommend you start at #1 here http://bofh.bjash.com/bofh/bofh1.html
For those who need some motivation to click, here is a short excerpt ...
     
  22. Orunmila
    Real Programmers don't use Pascal
    I mentioned this one briefly before in the story of Mel, but this series would not be complete without this one.
     
    Real Programmers Don't Use PASCAL
    Ed Post
    Graphic Software Systems
    P.O. Box 673
    25117 S.W. Parkway
    Wilsonville, OR 97070
    Copyright (c) 1982
    (decvax | ucbvax | cbosg | pur-ee | lbl-unix)!teklabs!ogcvax!gss1144!evp
    Back in the good old days -- the "Golden Era" of computers, it was easy to separate the men from the boys (sometimes called "Real Men" and "Quiche Eaters" in the literature). During this period, the Real Men were the ones that understood computer programming, and the Quiche Eaters were the ones that didn't. A real computer programmer said things like "DO 10 I=1,10" and "ABEND" (they actually talked in capital letters, you understand), and the rest of the world said things like "computers are too complicated for me" and "I can't relate to computers -- they're so impersonal". (A previous work [1] points out that Real Men don't "relate" to anything, and aren't afraid of being impersonal.)
    But, as usual, times change. We are faced today with a world in which little old ladies can get computerized microwave ovens, 12 year old kids can blow Real Men out of the water playing Asteroids and Pac-Man, and anyone can buy and even understand their very own Personal Computer. The Real Programmer is in danger of becoming extinct, of being replaced by high-school students with TRASH-80s!
    There is a clear need to point out the differences between the typical high-school junior Pac-Man player and a Real Programmer. Understanding these differences will give these kids something to aspire to -- a role model, a Father Figure. It will also help employers of Real Programmers to realize why it would be a mistake to replace the Real Programmers on their staff with 12 year old Pac-Man players (at a considerable salary savings).
     
    LANGUAGES
    The easiest way to tell a Real Programmer from the crowd is by the programming language he (or she) uses. Real Programmers use FORTRAN. Quiche Eaters use PASCAL. Nicklaus Wirth, the designer of PASCAL, was once asked, "How do you pronounce your name?". He replied "You can either call me by name, pronouncing it 'Veert', or call me by value, 'Worth'." One can tell immediately from this comment that Nicklaus Wirth is a Quiche Eater. The only parameter passing mechanism endorsed by Real Programmers is call-by-value-return, as implemented in the IBM/370 FORTRAN G and H compilers. Real programmers don't need abstract concepts to get their jobs done: they are perfectly happy with a keypunch, a FORTRAN IV compiler, and a beer.
     
    Real Programmers do List Processing in FORTRAN. Real Programmers do String Manipulation in FORTRAN. Real Programmers do Accounting (if they do it at all) in FORTRAN. Real Programmers do Artificial Intelligence programs in FORTRAN. If you can't do it in FORTRAN, do it in assembly language. If you can't do it in assembly language, it isn't worth doing.
     
    STRUCTURED PROGRAMMING
    Computer science academicians have gotten into the "structured programming" rut over the past several years. They claim that programs are more easily understood if the programmer uses some special language constructs and techniques. They don't all agree on exactly which constructs, of course, and the examples they use to show their particular point of view invariably fit on a single page of some obscure journal or another -- clearly not enough of an example to convince anyone. When I got out of school, I thought I was the best programmer in the world. I could write an unbeatable tic-tac-toe program, use five different computer languages, and create 1000 line programs that WORKED. (Really!) Then I got out into the Real World. My first task in the Real World was to read and understand a 200,000 line FORTRAN program, then speed it up by a factor of two. Any Real Programmer will tell you that all the Structured Coding in the world won't help you solve a problem like that -- it takes actual talent. Some quick observations on Real Programmers and Structured Programming:
     
    Real Programmers aren't afraid to use GOTOs. Real Programmers can write five page long DO loops without getting confused. Real Programmers enjoy Arithmetic IF statements because they make the code more interesting. Real Programmers write self-modifying code, especially if it saves them 20 nanoseconds in the middle of a tight loop. Programmers don't need comments: the code is obvious. Since FORTRAN doesn't have a structured IF, REPEAT ... UNTIL, or CASE statement, Real Programmers don't have to worry about not using them. Besides, they can be simulated when necessary using assigned GOTOs. Data structures have also gotten a lot of press lately. Abstract Data Types, Structures, Pointers, Lists, and Strings have become popular in certain circles. Wirth (the above-mentioned Quiche Eater) actually wrote an entire book [2] contending that you could write a program based on data structures, instead of the other way around. As all Real Programmers know, the only useful data structure is the array. Strings, lists, structures, sets -- these are all special cases of arrays and and can be treated that way just as easily without messing up your programing language with all sorts of complications. The worst thing about fancy data types is that you have to declare them, and Real Programming Languages, as we all know, have implicit typing based on the first letter of the (six character) variable name.
     
    OPERATING SYSTEMS
    What kind of operating system is used by a Real Programmer? CP/M? God forbid -- CP/M, after all, is basically a toy operating system. Even little old ladies and grade school students can understand and use CP/M.
    Unix is a lot more complicated of course -- the typical Unix hacker never can remember what the PRINT command is called this week -- but when it gets right down to it, Unix is a glorified video game. People don't do Serious Work on Unix systems: they send jokes around the world on USENET and write adventure games and research papers.
    No, your Real Programmer uses OS/370. A good programmer can find and understand the description of the IJK305I error he just got in his JCL manual. A great programmer can write JCL without referring to the manual at all. A truly outstanding programmer can find bugs buried in a 6 megabyte core dump without using a hex calculator. (I have actually seen this done.)
    OS/370 is a truly remarkable operating system. It's possible to destroy days of work with a single misplaced space, so alertness in the programming staff is encouraged. The best way to approach the system is through a keypunch. Some people claim there is a Time Sharing system that runs on OS/370, but after careful study I have come to the conclusion that they are mistaken.
     
    PROGRAMMING TOOLS
    What kind of tools does a Real Programmer use? In theory, a Real Programmer could run his programs by keying them into the front panel of the computer. Back in the days when computers had front panels, this was actually done occasionally. Your typical Real Programmer knew the entire bootstrap loader by memory in hex, and toggled it in whenever it got destroyed by his program. (Back then, memory was memory -- it didn't go away when the power went off. Today, memory either forgets things when you don't want it to, or remembers things long after they're better forgotten.) Legend has it that Seymour Cray, inventor of the Cray I supercomputer and most of Control Data's computers, actually toggled the first operating system for the CDC7600 in on the front panel from memory when it was first powered on. Seymour, needless to say, is a Real Programmer.
    One of my favorite Real Programmers was a systems programmer for Texas Instruments. One day, he got a long distance call from a user whose system had crashed in the middle of some important work. Jim was able to repair the damage over the phone, getting the user to toggle in disk I/O instructions at the front panel, repairing system tables in hex, reading register contents back over the phone. The moral of this story: while a Real Programmer usually includes a keypunch and lineprinter in his toolkit, he can get along with just a front panel and a telephone in emergencies.
    In some companies, text editing no longer consists of ten engineers standing in line to use an 029 keypunch. In fact, the building I work in doesn't contain a single keypunch. The Real Programmer in this situation has to do his work with a text editor program. Most systems supply several text editors to select from, and the Real Programmer must be careful to pick one that reflects his personal style. Many people believe that the best text editors in the world were written at Xerox Palo Alto Research Center for use on their Alto and Dorado computers [3]. Unfortunately, no Real Programmer would ever use a computer whose operating system is called SmallTalk, and would certainly not talk to the computer with a mouse.
    Some of the concepts in these Xerox editors have been incorporated into editors running on more reasonably named operating systems. EMACS and VI are probably the most well known of this class of editors. The problem with these editors is that Real Programmers consider "what you see is what you get" to be just as bad a concept in text editors as it is in women. No, the Real Programmer wants a "you asked for it, you got it" text editor -- complicated, cryptic, powerful, unforgiving, dangerous. TECO, to be precise.
    It has been observed that a TECO command sequence more closely resembles transmission line noise than readable text [4]. One of the more entertaining games to play with TECO is to type your name in as a command line and try to guess what it does. Just about any possible typing error while talking with TECO will probably destroy your program, or even worse -- introduce subtle and mysterious bugs in a once working subroutine.
    For this reason, Real Programmers are reluctant to actually edit a program that is close to working. They find it much easier to just patch the binary object code directly, using a wonderful program called SUPERZAP (or its equivalent on non-IBM machines). This works so well that many working programs on IBM systems bear no relation to the original FORTRAN code. In many cases, the original source code is no longer available. When it comes time to fix a program like this, no manager would even think of sending anything less than a Real Programmer to do the job -- no Quiche Eating structured programmer would even know where to start. This is called "job security".
    Some programming tools NOT used by Real Programmers:
     
    FORTRAN preprocessors like MORTRAN and RATFOR. The Cuisinarts of programming -- great for making Quiche. See comments above on structured programming. Source language debuggers. Real Programmers can read core dumps. Compilers with array bounds checking. They stifle creativity, destroy most of the interesting uses for EQUIVALENCE, and make it impossible to modify the operating system code with negative subscripts. Worst of all, bounds checking is inefficient. Source code maintainance systems. A Real Programmer keeps his code locked up in a card file, because it implies that its owner cannot leave his important programs unguarded [5].  
    THE REAL PROGRAMMER AT WORK
    Where does the typical Real Programmer work? What kind of programs are worthy of the efforts of so talented an individual? You can be sure that no real Programmer would be caught dead writing accounts-receivable programs in COBOL, or sorting mailing lists for People magazine. A Real Programmer wants tasks of earth-shaking importance (literally!):
     
    Real Programmers work for Los Alamos National Laboratory, writing atomic bomb simulations to run on Cray I supercomputers. Real Programmers work for the National Security Agency, decoding Russian transmissions. It was largely due to the efforts of thousands of Real Programmers working for NASA that our boys got to the moon and back before the cosmonauts. The computers in the Space Shuttle were programmed by Real Programmers. Programmers are at work for Boeing designing the operating systems for cruise missiles. Some of the most awesome Real Programmers of all work at the Jet Propulsion Laboratory in California. Many of them know the entire operating system of the Pioneer and Voyager spacecraft by heart. With a combination of large ground-based FORTRAN programs and small spacecraft-based assembly language programs, they can to do incredible feats of navigation and improvisation, such as hitting ten-kilometer wide windows at Saturn after six years in space, and repairing or bypassing damaged sensor platforms, radios, and batteries. Allegedly, one Real Programmer managed to tuck a pattern-matching program into a few hundred bytes of unused memory in a Voyager spacecraft that searched for, located, and photographed a new moon of Jupiter.
    One plan for the upcoming Galileo spacecraft mission is to use a gravity assist trajectory past Mars on the way to Jupiter. This trajectory passes within 80 +/- 3 kilometers of the surface of Mars. Nobody is going to trust a PASCAL program (or PASCAL programmer) for navigation to these tolerances.
    As you can tell, many of the world's Real Programmers work for the U.S. Government, mainly the Defense Department. This is as it should be. Recently, however, a black cloud has formed on the Real Programmer horizon.
    It seems that some highly placed Quiche Eaters at the Defense Department decided that all Defense programs should be written in some grand unified language called "ADA" (registered trademark, DoD). For a while, it seemed that ADA was destined to become a language that went against all the precepts of Real Programming -- a language with structure, a language with data types, strong typing, and semicolons. In short, a language designed to cripple the creativity of the typical Real Programmer. Fortunately, the language adopted by DoD has enough interesting features to make it approachable: it's incredibly complex, includes methods for messing with the operating system and rearranging memory, and Edsgar Dijkstra doesn't like it [6]. (Dijkstra, as I'm sure you know, was the author of "GoTos Considered Harmful" -- a landmark work in programming methodology, applauded by Pascal Programmers and Quiche Eaters alike.) Besides, the determined Real Programmer can write FORTRAN programs in any language.
    The real programmer might compromise his principles and work on something slightly more trivial than the destruction of life as we know it, providing there's enough money in it. There are several Real Programmers building video games at Atari, for example. (But not playing them. A Real Programmer knows how to beat the machine every time: no challange in that.) Everyone working at LucasFilm is a Real Programmer. (It would be crazy to turn down the money of 50 million Star Wars fans.) The proportion of Real Programmers in Computer Graphics is somewhat lower than the norm, mostly because nobody has found a use for Computer Graphics yet. On the other hand, all Computer Graphics is done in FORTRAN, so there are a fair number people doing Graphics in order to avoid having to write COBOL programs.
     
    THE REAL PROGRAMMER AT PLAY
    Generally, the Real Programmer plays the same way he works -- with computers. He is constantly amazed that his employer actually pays him to do what he would be doing for fun anyway, although he is careful not to express this opinion out loud. Occasionally, the Real Programmer does step out of the office for a breath of fresh air and a beer or two. Some tips on recognizing real programmers away from the computer room:
     
    At a party, the Real Programmers are the ones in the corner talking about operating system security and how to get around it. At a football game, the Real Programmer is the one comparing the plays against his simulations printed on 11 by 14 fanfold paper. At the beach, the Real Programmer is the one drawing flowcharts in the sand. A Real Programmer goes to a disco to watch the light show. At a funeral, the Real Programmer is the one saying "Poor George. And he almost had the sort routine working before the coronary." In a grocery store, the Real Programmer is the one who insists on running the cans past the laser checkout scanner himself, because he never could trust keypunch operators to get it right the first time.  
    THE REAL PROGRAMMER'S NATURAL HABITAT
    What sort of environment does the Real Programmer function best in? This is an important question for the managers of Real Programmers. Considering the amount of money it costs to keep one on the staff, it's best to put him (or her) in an environment where he can get his work done.
    The typical Real Programmer lives in front of a computer terminal. Surrounding this terminal are:
     
    Listings of all programs the Real Programmer has ever worked on, piled in roughly chronological order on every flat surface in the office. Some half-dozen or so partly filled cups of cold coffee. Occasionally, there will be cigarette butts floating in the coffee. In some cases, the cups will contain Orange Crush. Unless he is very good, there will be copies of the OS JCL manual and the Principles of Operation open to some particularly interesting pages. Taped to the wall is a line-printer Snoopy calender for the year 1969. Strewn about the floor are several wrappers for peanut butter filled cheese bars (the type that are made stale at the bakery so they can't get any worse while waiting in the vending machine). Hiding in the top left-hand drawer of the desk is a stash of double stuff Oreos for special occasions. Underneath the Oreos is a flow-charting template, left there by the previous occupant of the office. (Real Programmers write programs, not documentation. Leave that to the maintainence people.) The Real Programmer is capable of working 30, 40, even 50 hours at a stretch, under intense pressure. In fact, he prefers it that way. Bad response time doesn't bother the Real Programmer -- it gives him a chance to catch a little sleep between compiles. If there is not enough schedule pressure on the Real Programmer, he tends to make things more challenging by working on some small but interesting part of the problem for the first nine weeks, then finishing the rest in the last week, in two or three 50-hour marathons. This not only inpresses his manager, who was despairing of ever getting the project done on time, but creates a convenient excuse for not doing the documentation. In general:
     
    No Real Programmer works 9 to 5. (Unless it's 9 in the evening to 5 in the morning.) Real Programmers don't wear neckties. Real Programmers don't wear high heeled shoes. Real Programmers arrive at work in time for lunch. [9] A Real Programmer might or might not know his wife's name. He does, however, know the entire ASCII (or EBCDIC) code table. Real Programmers don't know how to cook. Grocery stores aren't often open at 3 a.m., so they survive on Twinkies and coffee.  
    THE FUTURE
    What of the future? It is a matter of some concern to Real Programmers that the latest generation of computer programmers are not being brought up with the same outlook on life as their elders. Many of them have never seen a computer with a front panel. Hardly anyone graduating from school these days can do hex arithmetic without a calculator. College graduates these days are soft -- protected from the realities of programming by source level debuggers, text editors that count parentheses, and user friendly operating systems. Worst of all, some of these alleged computer scientists manage to get degrees without ever learning FORTRAN! Are we destined to become an industry of Unix hackers and Pascal programmers?
    On the contrary. From my experience, I can only report that the future is bright for Real Programmers everywhere. Neither OS/370 nor FORTRAN show any signs of dying out, despite all the efforts of Pascal programmers the world over. Even more subtle tricks, like adding structured coding constructs to FORTRAN have failed. Oh sure, some computer vendors have come out with FORTRAN 77 compilers, but every one of them has a way of converting itself back into a FORTRAN 66 compiler at the drop of an option card -- to compile DO loops like God meant them to be.
    Even Unix might not be as bad on Real Programmers as it once was. The latest release of Unix has the potential of an operating system worthy of any Real Programmer. It has two different and subtly incompatible user interfaces, an arcane and complicated terminal driver, virtual memory. If you ignore the fact that it's structured, even C programming can be appreciated by the Real Programmer: after all, there's no type checking, variable names are seven (ten? eight?) characters long, and the added bonus of the Pointer data type is thrown in. It's like having the best parts of FORTRAN and assembly language in one place. (Not to mention some of the more creative uses for #define.)
    No, the future isn't all that bad. Why, in the past few years, the popular press has even commented on the bright new crop of computer nerds and hackers ([7] and [8]) leaving places like Stanford and M.I.T. for the Real World. From all evidence, the spirit of Real Programming lives on in these young men and women. As long as there are ill-defined goals, bizarre bugs, and unrealistic schedules, there will be Real Programmers willing to jump in and Solve The Problem, saving the documentation for later. Long live FORTRAN!
     
    ACKNOWLEGEMENT
    I would like to thank Jan E., Dave S., Rich G., Rich E. for their help in characterizing the Real Programmer, Heather B. for the illustration, Kathy E. for putting up with it, and atd!avsdS:mark for the initial inspriration.
     
    REFERENCES
    [1] Feirstein, B., Real Men Don't Eat Quiche, New York, Pocket Books, 1982.
    [2] Wirth, N., Algorithms + Datastructures = Programs, Prentice Hall, 1976.
    [3] Xerox PARC editors . . .
    [4] Finseth, C., Theory and Practice of Text Editors - or - a Cookbook for an EMACS, B.S. Thesis, MIT/LCS/TM-165, Massachusetts Institute of Technology, May 1980.
    [5] Weinberg, G., The Psychology of Computer Programming, New York, Van Nostrabd Reinhold, 1971, page 110.
    [6] Dijkstra, E., On the GREEN Language Submitted to the DoD, Sigplan notices, Volume 3, Number 10, October 1978.
    [7] Rose, Frank, Joy of Hacking, Science 82, Volume 3, Number 9, November 1982, pages 58 - 66.
    [8] The Hacker Papers, Psychology Today, August 1980.
    [9] Datamation, July, 1983, pp. 263-265.
  23. Orunmila
    Abandon all sanity, ye who enter here!
    (The section above is of course Dante's description of the inscription to the gates of Hell)
    Computers work in essentially the same way, executing instructions, moving data around, etc. Programming languages are mere abstractions allowing us to tell the same computer how to do the same things using different "words and methods". These abstractions provided by languages like C, C++ or even Java, GoLang or LISP were created on the back of many years of research and when we learn to program we seldom spend the time to understand WHY the language works the way that it does, focussing rather on HOW to use it.
    Some concepts are best explored by imagining what life would be like if the language we were using to program looked very different. For example every embedded programmer I ask has told me that GOTO is bad, but very seldom could someone actually explain why, and it would be even more rare to find someone who was familiar with Dijkstra's paper on the subject. (see this page for that one and many more examples). I have struggled to find a better explanation of what Dijkstra was trying to convey than the "COME FROM" statement in INTERCAL!
    The following article appeared in Computer Shopper (the British magazine of that name, not the Ziff-Davis title) around September 1992.
    Intercal -- the Language From Hell
    Where would we be without effective programming languages?
    Wait, don't turn the page; that was a rhetorical question. (The answer isn't: word-processing on slide rules.) Every once in a while it's worth thinking about the matter. Most of us (the lucky ones, that is) don't come within spitting distance of programming from one day to the next. Those who do, usually don't give any thought to them: it's time to write that function so do it, in C or Cobol or dBase whatever comes to hand. We don't waste time contemplating the deeper significance of the tools we use to do the job. But what would we do without well-designed, efficient programming languages?
    Go back to slide rules, maybe.
    A computer, in a very real sense, is its language. The processor speaks in a native tongue, its own machine code: unreadable by humans and next to impossible to program in, but nevertheless essential. It's the existence of this language which makes the computer such a general purpose tool; a hardware platform upon which we layer the abstractions, the virtual machines, of our operating systems. Which are, when you stop to think about it, simply a set of grammatically correct expressions in the computer's internal language. Take the humble 386 PC for example. Feed it one set of instructions and it pretends to be a cute little Windows machine; feed it another set, and something hairy happens -- it turns into a big, bad file server and starts demanding passwords and issuing threats.
    This flexibility is not an accident. Almost from the start, the feature which distinguished computers from their complex calculating predecessors was the fact that they were programmable -- indeed, universally programmable, capable of tackling any computationally tractable problem. Strip away the programmability of the machine and you lose the universality. Not to mention the usefulness.
    Programming languages, as any fule kno, originated in the nineteen-fifties as a partial solution to a major difficulty; machine code is difficult to write and maintain and next to impossible to move (or port) from one computer to another of a different type. Assemblers, which allowed the programmers to write machine code using relatively simple mnemonic commands (which were then "assembled" into machine code) were an early innovation, but still failed to address the portability issue. The solution that finally took off was the concept of an abstract "high level" language: and the first to achieve widespread use was Fortran.
A high level language is an artificial language with a rigidly specified grammar, which can be translated into machine code via a program called a "compiler". Statements in the high level language may correspond to individual statements, or whole series of statements, in the machine language of the computer. Indeed, the only thing that makes compilation practical is the fact that all computer language systems are effectively equivalent; any algorithm which can be expressed in one language can be solved in any other, in principle. So why are there so many languages?
    A load of old cobol'ers
    There are several answers to the question of language proliferation. Besides the facetious (some of us don't like Cobol) and the obvious (designing languages looks like a Fun Thing to a certain type of warped personality), there's the pragmatic reason. Simply put, some languages are excellent for expressing certain types of problem, but are lousy at dealing with other situations. Take Prolog, for example. Prolog is a brilliant language for resolving formal logic propositions expressable in the first order predicate calculus, but you'd need your head examining if you tried to write an operating system in it. (The Japanese MITI tried to do something similar with their Fifth Generation Project, and when was the last time you heard of them? Right.) Alternatively, take C. C is a wonderful language. It combines the flexibility and speed of raw machine code with the readability of ... er. Yes, you can re-jig a C compiler to run on anything. You can fine-tune it to produce tight, fast machine code that will execute on your toaster. Yes, you can write wonderful device drivers in it. But, again, you'd need your head examining if you set out to write a humongous database application in it. That's what Cobol is for; or SQL. So to the point of this article ... INTERCAL.
    An icky mess
    INTERCAL is a programming language which is to other languages as elephants are to deep-sea fishing -- it has nothing whatsoever to do with them. Nothing ... except that it's a programming language. And it serves as a wonderful example of the fact that, despite their theoretical, abstract interchangeability, not all languages are born equal. Here's a potted history of the offender:
    INTERCAL is short for "Computer Language With No Readily Pronouncable Acronym". It was not so much designed as perpetrated at Princeton University, on the morning of May 26th, 1972, by Donald R. Woods and James M. Lyon. They have been trying to live it down ever since. The current implementation, C-INTERCAL, was written by Eric S. Raymond (also the author of The New Hacker's Dictionary), and -- god help us -- runs on anything that supports C (and the C programming tools lex and yacc).
    INTERCAL is, in its own terms, elegant, simple and concise. It is also flexible and forgiving; if the compiler (called, appropriately enough, ick) encounters something it doesn't understand it simply ignores it and carries on. In order to insert a comment, just type in something that ick thinks is wrong; but be careful not to embed any valid INTERCAL code in your comments, or ick will get all excited and compile it. There are only two variable types: 16-bit integers, and 32-bit integers, denoted by .1 (for the 16-bit variable called "1") or :1 (for the 32-bit variable named "1"); note that .1 is not equivalent to :1 and definitely has nothing to do with 0.1 (unless it happens to be storing a value of 0.1).
    INTERCAL supports two unique bitwise operators, interleave (or mingle) and select, denoted by the "^" and "~" symbols respectively. You interleave two variables by alternating the bits in the two operands; you select two variables by taking from the first operand whichever bits correspond to 1's in the second operand, then pack these bits to the right in the result. There are also numerous unary bitwise operators, and in order to resolve matters of precedence the pedantic may use sparks (') or rabbit-ears (") to group expressions (Don't blame me for the silly names: INTERCAL has a character set which is best described as creative.) It is not surprising that these operators are unique to INTERCAL; the parlous readability of C would not be enhanced by the addition of syntax like:
PLEASE DO IGNORE .1 <-".1^C'&:51~"#V1^C!12~;&75SUB"V'V.1~
Like any other language, INTERCAL has flow-of-control statements and input and output statements. To write something or other into a variable, you need the WRITE IN list statement, where list is a string of variables and/or array elements. The format of the input data should be as numbers, the digits of which are spelt out in english in the range ZERO (or OH) to FOUR TWO NINE FOUR NINE SIX SEVEN TWO NINE FIVE. To output some information, you need the READ OUT list statement, where list again consists of variables. Numbers are printed, by default, in the form of "extended" Roman numerals (the syntax of which I will draw a merciful veil over), although the scholarly may make use of the standard library, which contains routines for formatting output in Sanskrit.
    Like FORTRAN, INTERCAL uses line numbers which are optional and follow in no particular order. Unlike FORTRAN, INTERCAL has no evil, unspeakable GOTO command, and not even an IF statement. However, you would be wrong to ascribe this to INTERCAL being designed for structured programming; it is actually because C-INTERCAL is the only known language that implements the legendary COME FROM ... control statement (originally described by R. L. Clark in "A Linguistic contribution to GOTO-less programming", Comm. ACM 27 (1984), pp. 349-350). For many years the GOTO statement -- once the primary means of controlling the sequence of execution of a program -- has been reviled as contributing to unreadable, unpredictable code that is virtually impossible to follow because it jumps about the place like a kangaroo on amphetamines. The COME FROM statement enables INTERCAL to do away with nasty GOTO's, while still preserving for the programmer that sense of warm self-esteem and achievement that comes from successfully writing a really nasty piece of self-modifying code involving computed GOTO's in FORTRAN (Or, for the terminally hip and hacker-ish, involving a triple-indirect pointer to a union of UNIX kernel data structures in C. Capisce?) Basically, the COME FROM statement specifies a line, or lines, which -- when the program executes them -- will jump to the COME FROM: effectively the opposite of GOTO.
    Because INTERCAL contains no equivalent of the NEXT statement for controlling whether or not some statement is executed, it provides a creative, endearing and unique substitute; abstention. For example, you can abstain from executing a given line of code with the ABSTAIN FROM (label) form of the command. Alternatively, and more uselessly, you can abstain from executing all statements of a specified type; for example, you can say
PLEASE ABSTAIN FROM IGNORING + FORGETTING
or
    DO ABSTAIN FROM ABSTAINING The abstention command is revoked by the REINSTATE statement. It is a Bad Idea to ABSTAIN FROM REINSTATING. It is also worth noting that the ABSTAIN syntax is rather confusing; for example, DO ABSTAIN FROM GIVING UP is not accepted, although a valid synonym for this is DON'T GIVE UP. (GIVE UP is the statement that terminates execution of a program. You are encouraged to use this statement at every opportunity. It is your only hope of escape.)
    The designers of the INTERCAL language believed that source code should be easy to understand or, failing that, friendly; in extremis, good old-fashioned politeness will do. Consequently, the syntax of the language looks a bit odd at first. After the line label (if any) there should be a statement identifier; this can be one of DO, PLEASE, or PLEASE DO. Programs which are insufficiently polite to the compiler may be rejected with the error message PROGRAMMER IS INSUFFICIENTLY POLITE; likewise, programs which spend too much time grovelling to the compiler will be terminated with extreme prejudice. The DO, PLEASE or PLEASE DO is then followed by (optionally) one of NOT, N'T, or %n, then a statement: the NOT or N'T's meaning should be self-evident, while the %n is merely the percentage probability that the following statement will be executed.
    Of course, with only two binary and three unary operators, it is rather difficult for programmers to get to grips with the language. Therefore, in a fit of quite uncharacteristic generosity, the designers have supplied -- and even partially documented -- a library of rather outre subroutines that carry out such esoteric operations as addition, subtraction, logical comparison, and generation of random numbers. The subroutines will be charmingly familiar to those PDP-11 assembly-language hackers among you who are also fluent in FORTRAN and SPITBOL.
    Why you should program in INTERCAL
    INTERCAL, despite being afflicted with these unique features (which, arguably, should remain that way) has survived for twenty years and is indeed thriving. The relatively recent C-INTERCAL dialect (which introduced the COME FROM statement, among other things) has spread it far beyond its original domain; the infamy continues, transmitted like a malign influence across the Internet. It is a matter of record that no computer language has ever succeeded in gaining widespread acceptance on the basis of elegance, comprehensibility or necessity. Look at FORTRAN, Cobol or C; all of them spread despite the fact that better alternatives were available. In fact, there are reasons to believe that INTERCAL is the language of the future.
    Firstly, INTERCAL creates jobs. Yes, it's true. In general, if a particular programming tool is unfriendly, ugly, and absolutely essential, the response of management is to throw more programmers at it instead of replacing it. INTERCAL is so monumentally difficult to use for anything sensible that it is surely destined to be the cause of full employment, massive increases in the DP budget of every company where it is used, and general prosperity for programmers.
    Secondly, once you have learned INTERCAL you will be able to win friends and influence people. As the authors point out, if you were to state that the simplest way to store a value of 65536 in a 32-bit variable is
    DO :1 <- #0$#256

    any sensible programmer would say that this was absurd. Since this is indeed the simplest way of doing this, they'd be made to look like a fool in front of their boss (who, by Murphy's law, would have turned up at just that moment). This will have a devastating effect on their ego and simultaneously make you look brilliant (until they realise that you cribbed this example from the manual, like me. Deep shame.)
    Thirdly, INTERCAL helps sell hardware. There's been a lot of fuss recently about big corporations inventing gigantic, bloated windowing and operating systems that require monstrously powerful computers to run on; some cynics suggest that this is because personal computers are now so powerful that they can already do anything any reasonable individual would want them to, and it's necessary to invent warm furry buttons that soak up millions of processor cycles in order to sell new equipment. With INTERCAL, you no longer need Windows or OS/2 to slow your 486 PC to a crawl! A Sieve of Eratosthenes benchmark (that computes all the prime numbers less than 65536), when coded in INTERCAL, clocked over seventeen hours on a SPARCStation-1. The same program, in C, took less than 0.5 seconds -- thus proving, quite clearly, that INTERCAL software is so slow that you absolutely must buy a CRAY-3 or equivalent in order to have any hope of getting any work out of it.
    Consequently, it is quite clear that INTERCAL represents a major alternative to conventional programming languages. Anyone who learns this language is going to go far -- at least as far as the nearest psychiatric institution. Rumour has it that INTERCAL was not entirely unrelated to the collapse of the Soviet Union and the success of the Apollo missions. It is even reported to improve your sex life and reverse hair loss! So we warmly advise you to take this utterly unbiased report at face value and go forth and get your friendly neighbourhood software company to switch to INTERCAL, wherever or whatever they may be using right now. You know it makes sense ...
    References
    The INTERCAL Resources Page
    INTERCAL on Wikipedia
    INTERCAL on Rosetta Code
    The INTERCAL reference manual
     
  24. Orunmila
    Every programmer should read what Edsger Dijkstra wrote as part of their education. The University of Texas hosts a comprehensive archive of Dijkstra's writings.
    Perhaps my favourite piece is what he wrote about "Users" in EWD618.
    In the context of the article, Dijkstra was calling this kind of stereotyping out.
    Read the full paper here: On Webster, users, bugs and Aristotle
     
  25. Orunmila
    What every embedded programmer should know about … Duff's Device
    "A trick used three times becomes a standard technique"
    (George Polya)
    C is a peculiar language and knowing the internal details of how it works can be very powerful. At the same time, talk like that immediately brings to mind the Story of Mel, which we wrote about before, so be careful not to write unmaintainable code!
    Probably the most famous "trick" in C is something called Duff's Device. It was discovered in 1983 by Tom Duff while he was working at Lucasfilm (since they were working on Indiana Jones and the Temple of Doom around that time I am just going to believe that the booby traps were the inspiration for this one).
    The purpose of this "trick" is to unroll a loop in C, to get the performance increase of having the loop unrolled even if the number of times the loop runs is not always the same.  We all know unrolling a loop which runs a fixed number of times is trivially easy, but doing it for an arbitrary number of cycles is quite a challenge.
    It relies on two facts about C: firstly, that case labels fall through if there is no break, and secondly, that C allows safely jumping into the middle of a loop. Duff originally hit this problem when he simply had to copy integers. This is very similar to the problem we often have in embedded programs that need to copy data, or do anything mundane repeatedly in a loop.
    Duff's problem was sending multiple source bytes to the same destination, similar to sending a string on a UART where to points to the send register location (of course his machine did not require waiting between bytes!).
    void send(uint8_t *to, uint8_t *from, uint16_t count)
    {
        do {                      /* count > 0 assumed */
            *to = *from++;
        } while (--count > 0);
    }

    The problem with this code is that the post-increment together with the assignment is a fairly cheap operation, but the decrement of the count and the comparison to zero, together with the jump, cost more than twice as much, and doing all of that once for every byte means we are spending a lot more time on the loop than on copying the data!
    Duff realized that he could change the time spent by any ratio he desired, which would allow him to make a tradeoff at will between code space and speed. Of course the mileage you get will vary based on what you are doing in the loop and the capability of the processor to do the +7 and /8 math, but if you keep to powers of 2 the divide becomes a simple shift (or even a swap and a mask, like you have for 16), and this can yield impressive results.
    The solution looks like this:
    void send(uint8_t *to, uint8_t *from, uint16_t count)
    {
        register uint16_t n = (count + 7) / 8;
        switch (count % 8) {
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
        }
    }

    Just to test that it works, let's go through 2 examples.
    If we pass in a count of 1 the following happens:
    the variable n becomes (1+7)/8 = 1
    the switch jumps to (1%8) = "case 1" and one byte is copied
    --n becomes 0 and the while loop terminates immediately, and we are done

    If we pass in a count of 12 the following happens:
    the variable n becomes (12+7)/8 = 2
    the switch jumps to (12%8) = "case 4"
    the unrolled loop executes and falls through 4 case statements
    --n becomes 1 and the while loop jumps back to "case 0"
    the loop executes 8 more times
    --n becomes 0 and the while loop terminates, and we are done

    As you can see the loop body was executed 4 + 8 = 12 times, but the loop variable (n) was only decremented twice and the jump happened 2 times instead of 12, for roughly an 80% saving in loop overhead.
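    As a side note on the +7 and /8 arithmetic: because the unroll factor is a power of two, the divide and the modulo reduce to a shift and a mask. Any reasonable compiler will do this for you, but written out explicitly it looks roughly like this (the function names are mine, purely for illustration):

    #include <stdint.h>

    /* Equivalent power-of-two forms of the loop set-up arithmetic for an
     * 8-way unrolled loop; most compilers generate exactly this anyway. */
    uint16_t loop_passes(uint16_t count) { return (uint16_t)((count + 7u) >> 3); } /* (count + 7) / 8 */
    uint16_t entry_case(uint16_t count)  { return (uint16_t)(count & 7u); }        /* count % 8 */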
    This concept is easily expanded to something like memcpy which increments both the from and to pointers.
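    For reference, here is a minimal sketch of what that memcpy-style variant could look like; the name copy8 is mine, the unroll factor is 8, and count is assumed to be greater than zero, just like in the original:

    #include <stdint.h>
    #include <stddef.h>

    /* Duff's Device applied to a memcpy-style copy: both pointers advance. */
    void copy8(uint8_t *to, const uint8_t *from, size_t count)
    {
        size_t n = (count + 7) / 8;     /* number of passes through the unrolled loop */
        switch (count % 8) {            /* jump into the middle of the first pass */
        case 0: do { *to++ = *from++;
        case 7:      *to++ = *from++;
        case 6:      *to++ = *from++;
        case 5:      *to++ = *from++;
        case 4:      *to++ = *from++;
        case 3:      *to++ = *from++;
        case 2:      *to++ = *from++;
        case 1:      *to++ = *from++;
                } while (--n > 0);
        }
    }

    On a desktop compiler the library memcpy will almost certainly beat this, but on a small microcontroller with a simple compiler the same tradeoff between code size and loop overhead can still pay off.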
    As always there is a link at the end with a zipped MPLAB-X project you can use with the simulator to play with Duff's Device!
    References
    Wikipedia page on Duff's Device 
    Duff's device entry in The Jargon File 
    Source Code
     
