Jump to content
Β 

All Activity

This stream auto-updates     

  1. Last week
  2. Problem #14: I2C bus repeaters. A lot of them use miniscule voltage shifts to determine which end is driving the bus so it can drive the signal the right way. If you get them to work reliably, go out and buy some lottery tickets before it's too late.
  3. Earlier
  4. It has been said that software is the most complicated system humanity has created and like all complicated things we manage this complexity by breaking the problem into small problems. Each small problem is then solved and each solution is assembled to create larger and larger solutions. In other fields of work, humans created standardized solutions to common problems. For example, nails and screws are common solutions to the problem of fastening wood together. Very few carpenters worry about the details of building nails and screws, they simply use them as needed. This practice of creating common solutions to typical problems is also done in software. Software Libraries can easily be used to provide drivers, and advanced functions saving a developer many hours of effort. To make a software library useful, the library developer must create an abstraction of the problem solved by the library. This abstraction must interact with the library user in a simple way and hide the specialist details of the problem. For example, if your task is to convert RGB color values into CMYK color values, you would get a library that had a function such as this one: struct cmyk { float cyan; float magenta; float yellow; float black; }; struct rgb { float red; float green; float blue; }; struct cmyk make_CMYK_from_RGB(struct rgb); This seems very simple and it would absolutely be simple to use. But, if you had to implement such a function yourself you may quickly find your self immersed in color profiles and the behavior of the human eye. All of this complexity is hidden behind a simple function. In the embedded world we often work with hardware and we are very used to silicon vendors providing hardware abstraction layers. These hardware abstraction layers are an attempt to simplify the use use of a vendors hardware and to make it more complicated to switch the hardware to a competing system. Let us go into a little more detail. Here is a typical software layer cake as drawn by a silicon vendor. Often they will provide the bottom 3 layers and even a few demo applications. The hope is you will create your application using their libraries. The benefit for you is a significant time savings (you don't need to make your nails and screws). The benefit to the silicon vendor is getting you locked into a proprietary solution. Here is a short story about the early "dark ages" of computing before PC's had reasonable device drivers (hardware abstraction). In the early days of PC gaming all PC games run in MSDOS. This was to improve game performance be removing any software that was not specifically required. The only sound these early PC had was a simple buzzer so a large number of companies developed a spectacular number of sound cards. There were so many sound cards that PC games could not keep up adding sound card support. Each game had a setup menu where the sound card was selected along with its I/O memory, IRQ, and other esoteric parameters. We had to write the HW configuration down on a cheat sheet and each time we upgraded we had to juggle the physical configuration of our PC (with jumpers) so everything ran without conflict. Eventually, the sound blaster card became the "standard" card and all other vendors either designed their HW to be compatible or wrote their drivers to look just like the sound blaster drivers and achieve compatibility in software. Hardware abstraction has the goal of creating a Hardware interface definition that allows the hardware to present the needed capabilities to the running application. The hardware can have many additional features and capabilities but these are not important to the application so they are not part of the interface. So abstraction provides a simplification by hiding the stuff the application does not care about. The simplification comes from focusing just on the features the application does care about. So if the silicon vendors are providing these abstractions, life can be only good!... We should look a little more closely. Silicon is pretty cheap to make but expensive to design. So each micro controller has a large number of features on each peripheral in the hopes that it will find a home in a large number of applications. Many of these features are mutually exclusive such as synchronous vs asynchronous mode in the EUSART on a PIC micro controller. These features are all well documented in the data sheets but at some point it was decided across the entire industry that if there were functions tied to each feature they would be easier to use. Here is an example from MCC's MSSP driver in SPI mode: void SPI2_ClearWriteCollisionStatus(void) { SSP2CON1bits.WCOL = 0; } Now it may be useful to have a more readable name for the WCOL flag and perhaps ClearWriteCollisionStatus does make the code easier to use. The argument is that making this function call will be more intuitive than clearing the WCOL bit. As you go through many of the HAL layers you find hundreds of examples of very simple functions setting or clearing a few bits. In a few cases you will find an example where all the functions are working together to create a specific abstraction. Most cases, you simply find the HW flags hidden behind more verbosely named functions. Simply renaming the bits is NOT a hardware abstraction. In fact, if the C compiler does not automatically inline these functions they are simply creating overhead. Sadly there is another problem in this mess. The data sheets are very precisely written documents that accurately describe the features of the hardware. Typically these datasheets are written with references to registers and bits. If the vendor provides a comprehensive function interface to the hardware, the data sheet will need to be recreated with function calls and function use examples rather than the bits and registers. In my opinion the term HAL (Hardware Abstraction Layer) has been hijacked to represent a function call interface to all the peripheral functions. What about Board Support Package (BSP)? Generally the BSP is inserted in the layer cake to provide a place for all the code that enables the vendor demo code to run on the "HAL". Arguably, the BSP is what the purist would call the HAL. Enough of the ranting....How does this topic affect you the hapless developer who is likely using vendor code. Silicon Vendors will continue to provide HAL's to interface the hardware, Middleware to provide customers with high function libraries and board support packages to link everything to their proprietary demo boards. As customers, we can evaluate their offering on their systems but we can expect to write our own BSP's to get the rest of their software running on our final product hardware. Software Vendors will continue to provide advanced libraries, RTOS's and other forms of middleware for us to buy to speed our development. The ability to adapt this software to our systems largely depends upon how well the software vendor defines the expected interfaces that we need to provide. Sometimes these vendors can be contracted to get their software running on our hardware and and get us going. FW engineers will continue to spend a significant part of the the project nudging all these pieces into one cohesive system so we can layer our secret sauce on top. One parting comment. Software layers are invented to allow large systems to take advantage of the single responsibility principle. This is great, but if you use too many layers you end up with a new problem called Lasagna code. If you use too few layers you end up with Spaghetti code. One day I would love to know why Italian food is used to name two of the big software smells. Good Luck
  5. This happens from time to time, I have also had periods where nobody could post anything. If that happens please do report it here, it may just help some poor soul who is desperately looking for help πŸ™‚
  6. so maybe it is just slow.................. The webpage at https://www.microchip.com/WebResource.axd?d=s7-zxpMGrfzFqvNoaI5niI88ie3FQikPEzifp9mbqcCCHkN6bVuhHFXDt5AwXEeSdy-rc6bi250kqB7UpkayZtTH4PI1wR37d59huswsxA-21q1hAju4GpYquocl5j0SE3aFtFc2yhhPu09xC3nHWviD5yZKNkkmIzv5gnwfrw7aisbg0&t=636904521560000000 might be temporarily down or it may have moved permanently to a new web address. ERR_NAME_RESOLUTION_FAILED
  7. This morning I find Microchip.com is down 07:15 PDT ( GMT - 8 )
  8. Something that comes up all the time, PWM resolution. Engineers are often disappointed when they find out that the achievable resolution of the PWM is not nearly what they expected from the headline claims made in the microcontroller datasheet. Here is just one example of a typically perplexed customer. Most of the time this is not due to dishonest advertising but rather a easily overlooked property of how PWM's work, so lets clear that up so that we can avoid the disappointment. A conventional PWM will let you set the period and the duty cycle something like the image to the right shows. In the picture Tsys represents the clock of the PWM, and Tpwm shows the period of the PWM. In hte example the perod (Tpwm) of the PWM is 4x the system clock. Additionally you can set the duty cycle register, which will let you choose for how many of the Tsys clocks that fit into one Tpwm the output should remain high. In the example the duty cycle is set to 3. It is typical for a microcontroller datasheet to advertise that it can accomodate a duty cycle of 12-bits or 16-bits or something in that order. Of course this is an achievable number of clocks for the PWM to remain high, but the range of the number is always going to be limited by the period of the PWM. That means that if we select the period to be 2^16 = 65536 clocks, then we will also be able to control the duty cycle up to 16 bits. It is easy to make the mistake of believing that you will get 16 bits of resolution over the achievable range, but this is very rarely the case. Let's look at some real numbers as an example using the PIC16F1778. The first page from the datasheet can be seen to the right. It advertises here that the PWM on this device is 16-bit. Importantly it also shows that the timer capability is limited to 16-bit. Looking at the PWM's on this device we will try to see what is the highest frequency (lowest period) at which we can get 16-bits of PWM resolution. The fastest clock this PWM can use as it's time base is the system clock, which is limited to 32MHz on this device. That means in terms of Figure 1 above that Tsys would be the period of one clock at 32MHz = 31.25ns. If we want to achieve the full resolution of the PWM we have to run the timer at it's 16-bit limit, which means that the PWM frequency will be 32MHz / 2^16 = 488Hz ! So if you need the PWM frequency to be anything more than that you will have to compromise on the resolution in order to achieve a faster switching frequency. Typically engineers will try to run at switching frequencies above 20kHz because this is roughly the audible range of the human ear. If you switch at a lower frequency people will hear a hum or a high pitched tone which can be very irritating. So let's say we compromise to the lowest limit here and try to run the PWM at 20kHz, how much of the PWM resolution will we be giving up by using a higher frequency? The easiest way to calculate this is to simply realize that one clock is 31.25ns and the resolution of the PWM will be limited to how many times 31.25ns fits into the period of the PWM. At 488Hz the period is 1/488Hz = 2ms and we can calculate that 2ms/31.25ns = 65536. We can determine how many bits are required to represent that by taking log(65536)/log(2) = 16-bits of resolution. This would mean that to get the number of usable options for duty cycle at 20kHz we need to calculate (1/20kHz)/31.25ns = 1600. So with 1600 divisions the resolution of the PWM is reduced to log(1600)/log(2) = 10.64bits, which means that we achieve slightly better than 10-bits of resolution. This is the point where people are usually unhappy that the advertised 16-bits of resolution has somehow evaporated and turned into only 10 bits! So the advice I have for you is this. When selecting a device where you have PWM resolution requirements you better make sure you do all the math to make sure that you can run at the resolution you need with the clocks you have available. And remember that when it comes to PWM resolution the PWM clock speed is always going to be king, so it is typically better to select the device with the higher clock speed instead of the one that claims the highest PWM resolution (at a snail's pace...) And you you feel adventurous you can always try something more exotic like using the NCO to generate a high resolution PWM at high frequencies as described in this application note
  9. I2C is such a widely used standard, yet it has caused me endless pain and suffering. In principle I2C is supposed to be simple and robust, a mechanism for "Inter-Integrated Circuit" communication. I am hoping that this summary of the battle scars I have picked up from using I2C might just save you some time and suffering. I have found that despite a couple of typical initial hick-ups it is generally not that hard to get I2C communication going, but making it robust and reliable can prove to be a quite a challenge. Problem #1 - Address Specification I2C data is not represented as a bit-stream, but rather a specific packet format with framing (start and stop conditions) preceded by an address, which encapsulates a sequence of 8-bit bytes, each followed by an ACK or NAK bit. The first byte is supposed to be the address, but right from the bat, you have to deal with the first special case. How to combine this 7-bit address with the R/W bit always causes confusion. There is no consistency in datasheets of I2C slave devices for specifying the device address, and even worse most vendors fail to specify which approach they use, leaving users to figure it out through trial and error. This has become bad enough that I would not recommend trying to implement I2C without an oscilloscope in hand to resolve these kinds of guessing games. Let's say the 7-bit device address was 0x76 (like the ever-popular Bosh Sensortech BME280). Sometimes this will be specified simply as 0x76, but the API in the software library, in order to save the work of shifting this value by 1 and masking in the R/W bit will often require you to pass in 0xEC as the address (0x76 left-shifted by one). Sometimes the vendor will specify 0xEC as the "write" address and 0xED as the "read" address. To add insult to injury your bus analyzer or Saleae will typically show the first 8-bits as a hex value so you will never see the actual 7-bit address as a hex number on the screen, leaving you to be bit twiddling in your head on a constant basis while trying to make sense of the traces. Problem #2 - Multiple Addresses To add to the confusion from above many devices (like the BME280) has the ability to present on more than one address, so the datasheet will specify that (in the case of the BME280) if you pull down the unused SDO pin on the device it's address will be 0x76, but if you pull the pin up it will be 0x77. I have seen many users leave this "unused" pin floating in their layouts, causing the device to schizophrenically switch between the 2 addresses at runtime and behavior to look erratic. This also, of course, doubles the number of possible addresses the device may end up responding to, and the specification of exactly 2 addresses fools a lot of people into thinking that the vendor is actually specifying a read and write address as described above. This all adds to the guessing game of what the actual device address may be. To add to the confusion most devices have internal registers and these also have their own addresses, so it is very easy to get confused about what should go in the address byte. It is not the register address, it is the slave address, the register address goes in the data byte of the "write" you need to use if you want to do a "read", in order to read a register from a specific address on the slave. Ok, if that is not confusing to you I salute you sir! Problem #3 - 10-bit address mode As if there was not enough address confusion already the limitation of only 127 possible device addresses lead to the inclusion of an extension called 10-bit addressing. A 10-bit address is actually a pre-defined 5-bits, followed by the 2 most significant bits of the 10-bit address, then the R/W bit, after this an Ack from all the devices on the bus using 10-bit addressing with the same 2 MSB addresses, and after this the remaining 8 bits of the address followed by the real/full address ack. So once again there is no standard way to represent the 10-bit address. Let's say the device has 10-bit address 0x123, how would this be specified now? The vendor could say 0x123 (and only 10 of the 12 bits implied are the 10-bit address), or they could include the prefix and specify it as 0xF223. Of course that number contains the R/W bit in the middle somewhere, so they may specify a "read" and a "write" address as 0xF223 and 0xF323, or they could right-shift the high-byte to show it as a normal 7-bit address, removing the R/W bit, and say it is 0x7123. I think you get the picture here, lots of room for confusion and we have not even received our first ACK yet! Problem #4 - Resetting during Debugging Since I2C is essentially transaction/packet based and it does not include timeouts in the specification (SMBUS does of course, but most slave sensors conform to I2C only) there is a real chance that you are going to reset your host processor (or bus master) in the middle of such a transaction. This happens as easily as re-programming the processor during development (which you will likely be doing a lot). The problem that tends to catch everybody at some point is that a hardware reset of your host processor is entirely invisible to the slave device which does not lose power when you toggle the master device's reset pin! The result is that the slave thinks that it is in the middle of an I2C transaction and awaits the expected number of master clock pulses to complete the current transaction, but the master thinks that it should be creating a start condition on the bus. This often leads to the slave holding the data line low and the master unable to generate a start condition on the bus. When this happens you will lose the ability to communicate with the I2C sensor/slave and start debugging your code to find out what has broken. In reality, there is nothing wrong with your code and simply removing and re-applying the power to the entire board will cause both the master and slave to be reset, leaving you able to communicate again. Of course, re-applying the power typically causes the device to start running, and if you want to debug you will have to attach the debugger which may very well leave you in a locked-up state once again. The only way around this is to use your oscilloscope or Saleae all of the time and whenever the behavior seems strange stare very carefully at what is happening with the data line, is the address going out, is the start condition recognized and is the slave responding as it should, if not you are stuck and need to reset the slave device somehow. Problem #5 - Stuck I2C bus The situation described in #4 above is often referred to as a "stuck bus" condition. I have tried various strategies in the past to robustly recover from such a stuck bus condition programmatically, but they all come with a number of compromises. Firstly slave devices are essentially allowed to clock-stretch indefinitely, and if a slave device state machine goes bonkers it is possible that a single slave device can hold the entire bus hostage indefinitely and the only thing you can possibly do is remove the power from all slave devices. This is not a very common failure mode but it is definitely possible and needs addressing for robust or critical systems. Often getting the bus "unstuck" is as simple as providing the slave device enough clocks to convince it that the last transaction is complete. Some slaves behave well and after clocking them 8 times and providing a NAK they will abort their current transaction. I have seen slaves, especially I2C memories, where you have to supply more than 8 clocks to be certain that the transaction terminates, e.g. 32 clocks. I have also seen specialized slave devices that will ignore your NAK's and insist on sending even more data e.g. 128 or more bits before giving up on an interrupted transaction. The nasty part about getting an I2C bus "unstuck" is that you usually not use the I2C peripheral itself to do this service. This means that typically you will need to disable the peripheral, change the pins to GPIO mode and bit-bang the clocks you need out of the port, after which you need to re-initialize the I2C peripheral and try the next transaction, and if this fails then rinse and repeat the process until you succeed. This, of course, is expensive in terms of code space, especially on small 8-bit implementations. Problem #6 - Required Repeated Start conditions The R/W bit comes back to haunt us for this one. The presence of this bit implies that all transactions on I2C should be uni-directional, that is they must either read or write, but in practice, things are not that simple. Typically a sensor or memory will have a number of register locations inside of the device and you will have to "write" to the device to specify which location you wish to address, followed by "reading" from the device to get the data. The problem with a bus is that something may interrupt you between these two operations that form one larger transaction. In order to overcome this limitation, I2C allows you to concatenate 2 I2C operations into a single transaction by omitting the stop condition between them. So you can do the write operation and instead of completing it with a stop condition on the bus you can follow with a second start condition and the latter half of the operation, terminating the whole thing with a stop condition only when you are done. This is called a "repeated start" condition and looks as follows (from the BME280 datasheet). It can often be quite a challenge to generate such a repeated start condition as many I2C drivers will require you to specify read/write and a pointer and number of bytes and not give you the option to omit the stop condition, and many slave devices will reset their state machines at a stop condition so without a repeated start it is not possible to communicate with these devices. Of course, I should also mention that the requirement to send the slave address twice for these transactions significantly reduces the throughput you can get through the bus. Problem #7 - What is Ack and Nak supposed to be? This brings us to the next problem. It is quite clear that the Address is ack-ed by the slave, but when you are reading data what is the exact semantics of the ack/nak? The BME280 datasheet is a bit unique in that it clearly distinguishes in that figure in #6 above whether the ack should be generated by the master or the slave (ACKS vs ACKM), but from the specification, it is not immediately clear. If I read data from a slave, who is supposed to provide the ack at the end of the data? Is this the master or the slave? What would be the purpose of the master providing an ack to the slave to data? Clearly, the master is alive as it is generating clocks, and the slave may be sending all 1's which means it does not touch the bus at all. So what is the slave supposed to do if the master makes a Nak in response to a byte? And how would a slave determine if it should Nak your data since there is no checksum or CRC on it there is no way to determine if it is correct? None of this is clearly specified anywhere. To add confusion I have seen people spend countless hours looking for the bug in their BME280 code which causes the last data byte to get a NAK! When you look on the bus analyzer or Oscilloscope you will be told that every byte was followed by an ACK except for the last one where you will see a NAK. Most people interpret this NAK to be an indication that something is wrong, but no, look carefully at the image from the datasheet in section #6 above! Each byte received by the master is followed by an ACKM (ack-ed by the master) EXCEPT for the last byte, in which case the master will not ACK it, causing a NAK to proceed the stop condition! To make this even harder, most I2C hardware peripherals will not allow you fine-grained control of whether the master will ACK or NAK. Very often the peripheral will just blithely ack every byte that it reads from the slave regardless. Problem #8 - Pull-up resistors and bus capacitance The I2C bus is designed to be driven only through open-drain connections pulling the bus down, it is pulled up by a pair of pull-up resistors (one on the clock line and one on the data line). I have seen many a young engineer struggle with unreliable I2C communication due to either the entire lack of or incorrect pull-up resistors. Yes it is possible to actually communicate even without the resistors due to parasitic pull-ups which will be much larger than required, meaning that it will pull weakly enough to get the bus high-ish and under some conditions can provoke an ack from a slave device. There is no clear specification of what the size of these pull-up resistors should be, and for good reason, but this causes a lot of uncertainty. The I2C specification does specify that the maximum bus capacitance should be 400pF. This is a pretty tricky requirement to meet if you have a large PCB with a number of devices on the bus and it is often overlooked, so it is typical to encounter boards where the capacitance is exceeding the official specification. In the end, the pull-up needs to be strong enough (that is small enough) to pull the bus to Vdd fast enough to communicate at the required bus speed (typically 100kHz or 400kHz). The higher the bus capacitance is the stronger you will have to pull up in order to bring the bus to Vdd in time. If you look at the Oscilloscope you will see that the bus goes low fairly quickly (pulled down strongly to ground) but goes up fairly slowly, something like this: As you can see in the trace to the right there are a number of things to consider. If your pull-ups are too large you will get a lot of interference as indicated by the red arrows in the trace where the clock line is coupling through onto the data line which has too high an impedance. This can often be alleviated with good layout techniques, but if you see this on your scope consider lowering the pull-up value to hold the bus more steady. If the rise times through the pull-up are too slow for the bus speed you are using you will have to either work on reducing the capacitance on the bus or pulling up harder through a smaller resistor. Of course, you cannot just tie the bus to Vdd in the extreme as it still needs to be pulled to 0 by the master and slaves. As a last consideration, the smaller the resistor is you use the more power you will consume while driving the bus. Problem #9 - Multi-master I have been asked many times to implement multi-master I2C. There are a large number of complications when you need multiple masters on an I2C bus and this should only be attempted by true experts. Arbitrating the bus when multiple masters are pulling it down simultaneously presents a number of race conditions that are extremely hard to robustly deal with. I would like to just point out one case here as illustration and leave it at that. Typical schemes for multi-master will require the master to monitor the bus while it is emitting the address byte. When the master is not pulling the bus low, but it reads low this is an indication that another master is trying to emit an address at the same time and you as a master should then abort your transaction immediately, yielding the bus to the other master. A problem arises though when both masters are trying to read from the same slave device. When this happens it is possible that both addresses match exactly and that the 2 masters start their transactions in close proximity. Due to the clock skew between the masters, it is possible that they are trying to read from different control registers on the slave, that the slave will match only one of the 2 masters, but both masters will think that they have the bus and the slave is responding to their request. When this happens the one master will end up receiving incorrect data from the wrong address. Consider e.g. a BME280 where you may get the pressure reading instead of humidity, causing you to react incorrectly. Like I said there are many obscure ways multi-master can fail you, so beware when you go there. Problem #10 - Clock Stretching In the standard slaves are allowed to stretch the clock by driving the clock line low after the master releases it. Clock stretching slaves are a common cause of I2C busses becoming stuck as the standard does not provide for timeouts. This is something where SMBUS has provided a large improvement over the basic I2C standard, although there can still be ambiguity around how long you really have to wait to ensure that all slaves have timed out, and the idea with SMBUS is that you can safely mix with non-SMBUS slaves, but this one aspect makes it unreliable to do so. In critical systems, you will as a result very often see 2 I2C slave devices connected via different sets of pins, using I2C as a point to point communications channel instead of a bus in order to isolate failure conditions to a single sensor. Problem #11 - SMBUS voltage levels In I2C logical 1 voltage levels depends on the bus voltage and are above 70% of bus voltage for a 1 and below 30% for a 0. The problems here are numerous, resulting in different devices seeing a 0 or 1 at different levels. SMBUS devices do not use this mechanism but instead specify thresholds at 0.8v and 2.1v. These levels are often not supported by the microcontroller you are using leaving some room for misinterpretation, especially if you add the effects of bus capacitance and the pull-up resistors to the signal integrity. For more information about SMBUS and where it differs from the standard I2C specification take a look at this WikiPedia page. Problem #12 - NAK Polling NAK polling often comes into play when you are trying to read from or write to an I2C memory and the device is busy. These memory devices will use the NAK mechanism to signal the master that he has to wait and retry the operation in a short while. The problem here is that many hardware I2C peripherals simply ignore acks and nak's altogether or does not give you the required hooks to respond to these. Many vendors try to accelerate I2C operations by letting you pre-load a transaction for sending to the slave and doing all of the transmission in hardware using a state machine, but these implementations rarely have accommodations for retrying the byte if the slave was to NAK it. NAK-polling also makes it very hard to use DMA for speeding up I2C transmissions as once again you need to make a decision based on the Ack/Nak after every byte, and the hooks to make these decisions typically require an interrupt or callback at the end of every byte which causes huge overhead. Problem #13 - Bus Speeds When starting to bring up an I2C bus I often see engineers starting with one sensor and working their way through them one by one. This can lead to a common problem where the first sensor is capable of high-speed transmission e.g. 1MHz, but you only need 1 sensor on the bus that is limited to 100KHz and this can cause all kinds of intermittent failures. When you have more than 1 slave on the same bus make sure that the bus is running at a speed that all the slaves can handle, this means that when you bring up the bus it is always a good idea to start things out at 100kHz and only increase the speed once you have communication established with all the slaves on the bus. The more slaves you have on the bus the more likely you will be to have an increased bus capacitance and signal integrity problems. In conclusion I2C is quite a widely used standard. When I am given the choice between using I2C or something like SPI for communication with sensor devices I tend to prefer SPI for a number of reasons. It is possible to go much faster using SPI as the bus is driven hard to both 0 and 1, the complexities of I2C and the problems outlined above inevitably raises the complexity significantly and presents a number of challenges to achieving a robust system. To be clear I am not saying I2C is not a robust protocol, just that it takes some real skill to use it in a truly robust way, and other protocols like SPI do not require the same effort to achieve a robust solution. So like the Plain White T's say, hate is a strong word, but I really really don't like I2C ... I am sure there are more ways I2C has bitten people, please share your additional problems in the comments below, or if you are struggling with I2C right now, feel free to post your problem and we will try to help you debug it!
  10. Here I am on a clear cool evening, by the fire outside with my laptop. Tonight I will talk about a new peripheral, the timer/counter. Periodically I will be interrupted to put another log on my fire, but it should not slow me down too much. Timers and counters are almost the same peripheral with only the difference of what is causing the counter to count. If the counter is incrementing from a clock source, it becomes a timer because each count registers the passage of a precise unit of time. If the counter is incrementing from an unknown signal (perhaps not even a regular signal), it is simply a counter. Clearly, the difference between these is a matter of the use-case and not a matter of design. Through there are some technical details related to clocking any peripheral from an external "unknown" clock that is not synchronized with the clocks inside the microcontroller. We will happy ignore those details because the designers have done a good job of sorting them out. Let us take a peek at a very simple timer on the PIC16F18446. Turn to page 348 of your PIC16F18446 data sheet and take a look at figure 25-1. (shown below) This is the basic anatomy of a pretty vanilla timer. Of course most timers have many more features so this article is simply an introduction. On the left side of this image there are a number of clock sources entering a symbol that represents a multiplexer. A multiplexer is simply a device that can select one input and pass it to its output. The T0CS<2:0> signal below the multiplexer is shorthand for a 3-bit signal named T0CS. The slash 3 also indicates that that is a 3-bit signal. Each of the possible 3-bit codes is inside the multiplexer next to one of the inputs. This indicates the input you will select if you apply that code on the signal T0CS. Pretty simple. The inputs are from top to bottom (ignoring the reserved ones) SOSC (Secondary Oscillator), LFINTOSC (Low Frequency Internal Oscillator), HFINTOSC (High Frequency Internal Oscillator) Fosc/4 (The CPU Instruction Clock) an input pin (T0CKI) inverted or not-inverted. Let us cover each section of this peripheral in a little more detail. Of course you should go to the data sheet to read all the information. SOSC The secondary oscillator is a second crystal oscillator on 2 I/O pins. A crystal oscillator is a type of clock that uses a piece of quartz crystal to produce a very accurate frequency. The secondary oscillator is designed to operate at 32.768kHz which by some coincidence is 2^15 counts per second. This makes keeping accurate track of seconds very easy and very low power. You could configure the hardware to wake up the CPU every second and spend most of your time in a low power sleep mode. LFINTOSC and HFINTOSC There are two internal oscillator in the PIC16F18446. The LFINTOSC is approximately 31kHz and is intended for low power low speed operation but not very accurate timing. The HFINTOSC is adjustable from 1-32MHz and is better than 5% accurate so it is often sufficient for most applications. Because these two oscillators are directly available to the timer, the CPU can be operating at a completely different frequency allowing high resolution timing of some events, while running the CPU at a completely different frequency. FOSC/4 This option is the simplest option to select because most of the math you are doing for other peripherals is already at this frequency. If you are porting software for a previous PIC16 MCU, the timer may already be assumed to be at this frequency. Due to historical reasons, a PIC16 is often clocked at 4MHz. This makes the instruction clock 1MHz and each timer tick is 1us. Having a 1us tick makes many timing calculations trivial. If you were clocking at 32MHz, each tick would be 31ns which is much smaller but does not divide as nicely into base 10 values. T0CKI This option allows your system to measure time based upon an external clock. You might connect the timing wheel of an engine to this input pin and compute the RPM with a separate timer. Prescaler After the input multiplexer, there is an input pre-scaler. The goal of the pre-scaler is to reduce the input clock frequency to a slower frequency that may be more suitable for the application. The most prescallers are implemented as a chain of 'T' flip-flops. A T flip-flop simply changes its output (high to low or low to high) on each rising edge of an input signal. That makes a T Flip-Flop a divide by 2 circuit for a clock. If you have a chain of these and you use a multiplexer to decide which T flip flop to listen to, you get a very simple divider that can divide by some power of 2. i.e. 2, 4, 8, 16... with each frequency 1/2 of the previous one. Synchronizer The synchronizer ensures that input pulses that are NOT sourced by an oscillator related to FOSC are synchronized to FOSC. This synchronization ensures reliable pulses for counting or for any peripherals that are attached to the counter. However, synchronization requires the FOSC/4 clock source to be operating and that condition is not true in when the CPU is saving power in sleep. If you are building an alarm clock that must run on a tiny battery, you will want the timer to operate while the CPU is sleeping and to produce a wakeup interrupt at appropriate intervals. To do this, you disable synchronization. Once the CPU has been awakened, it is a good idea to activate synchronization or to avoid interacting with the counter while it is running. TMR0 Body The TMR0 body used to be a simple counter, but in more recent years it has gained 2 options. Either, the timer can be a 16-bit counter, or it can be an 8-bit counter with an 8-bit compare. The 8-bit compare allows the timer to be reset to zero on any 8-bit value. The 16-bit counter allows it to count for a longer period of time before an overflow. The output from the TMR0 body depends upon the module. In the 8-bit compare mode, the output will be set each time there is a compare match. In the 16-bit mode, the output will be set each time the counter rolls from 0xFFFF to 0x0000. Output The output from the core can be directed to other peripherals such as the CLC's, it can also be sent through a postscaler for further division and then create an interrupt or toggle an output on an I/O pin. The postscaler is different than the prescaler because it is not limited to powers of two. It is a counting divider and it can divide by any value between 1 and 16. We shall use that feature in the example. Using the Timer Timers can be used for a great number of things but one common thing is to produce a precise timing interval that does not depend upon your code. For instance, 2 lessons ago, we generated a PWM signal. The one way to do this was to set and clear the GPIO pin every so many instruction cycles. Unfortunately, as we added code to execute the PWM would get slower and slower. Additionally, it could get less reliable because the program could take different paths through the code. Using the PWM peripheral was the perfect solution, but another solution would be to use a timer. For instance, you could configure the timer to set the output after an interval. After that interval had elapsed, you could set a different interval to clear the output. By switching back and forth between the set interval and the clear interval, you would get a PWM output. Still more work than the PWM peripheral, but MUCH better than the pure software approach. For this exercise we will use the timer to force our main loop to execute at a fixed time interval. We will instrument this loop and show that even as we add work to the loop, it still executes at the same time interval. This type of structure is called an exec loop and it is often used in embedded programming because it ensures that all the timing operations can be simple software counting in multiples of the loop period. And here is the program. void main(void) { TRISAbits.TRISA2 = 0; // Configure the TRISA2 as an output (the LED) T0CON1bits.ASYNC = 0; // Make sure the timer is synchronized T0CON1bits.CKPS = 5; // Configure the prescaler to divide by 32 T0CON1bits.CS = 2; // use the FOSC/4 clock for the input // the TMR0 clock should now be 250kHz TMR0H = 250; // Set the counter to reset to 0 when it reaches 250 (1ms) TMR0L = 0; // Clear the counter T0CON0bits.T0OUTPS = 9; // Configure the postscaler to divide by 10 T0CON0bits.T0EN = 1; // turn the timer on // the timer output should be every 10ms while(1) { LATAbits.LATA2 = 0; // Turn on the LED... this allows us to measure CPU time __delay_ms(5); // do some work... could be anything. LATAbits.LATA2 = 1; // Turn off the LED... Any extra time will be while the LED is off. while(! PIR0bits.TMR0IF ); // burn off the unused CPU time. This time remaining could be used as a CPU load indicator. PIR0bits.TMR0IF = 0; // clear the overflow flag so we can detect the next interval. } } I chose to use a delay macro to represent doing useful work. In a "real" application, this area would be filled with all the various functions that need to be executed every 10 milliseconds. If you needed something run every 20 milliseconds you would execute that function every other time. In this way, many different rates can be easily accommodated so long as the total execution time does not exceed 10 milliseconds because that will stretch a executive cycle into the next interval and break the regular timing. Special Considerations One interesting effect in timers is they are often the first example of "concurrency issues" that many programmers encounter. Concurrency issues arise when two different systems access the same resource at the same time. Quite often you get unexpected results which can be seen as "random" behavior. In the code above I configured the timer in 8-bit mode and took advantage of the hardware compare feature so I never needed to look at the timer counter again. But let us imagine a slightly different scenario. Imagine that we needed to measure the lap time of a race car. When the car started the race we would start the timer. As the car crossed the start line, we would read the timer BUT WE WOULD NOT STOP IT. When the car finished the last lap, we could stop the timer and see the total time. IN this way we would have a record for every lap in the race. Simply by subtracting the time of completion for each lap, we would have each lap interval which would be valuable information for the race driver. Each time we read the timer without stopping it, we have an opportunity for a concurrency issue. For an 8-bit timer we can read the entire value with one instruction and there are no issues. However, the race is likely to last longer than we can count on 8-bits so we need a bigger timer. With a 16-bit timer we must perform 2 reads to get the entire value and now we encounter our problem. In the picture above I have shown two scenarios where TMR0 in 16-bit mode is counting 1 count per instruction cycle. This is done to demonstrate the problem. Slowing down the counting rate does not really solve the problem but it can affect the frequency of the issue. In this example the blue cell indicates the first read while the red cell indicates the second read to get all 16-bits. When the counter was 251, the reads are successful, however when the counter is 255, the actual value we will read will be 511 which is about 2x the actual value. If we reverse the read order we have the same problem. One solution is to read the high, then the low and finally, read the high a second time. With these three data points and some math, it is possible to reconstruct the exact value at the time of the first read. Another solution is in hardware. In the data sheet we see that there is some additional circuitry surrounding TMR0H. With this circuitry, the TMR0 will automatically read TMR0H from the counter into a holding register when TMR0L is read. So if you read TMR0L first and then TMR0H you will NEVER have the issue. Now consider the following line of C. timerValue = TMR0 It is not clear from just this line of code which byte of TMR0 is read first. If it is the low byte this line is finished and perfect. However, if it is the high byte, then we still have a problem. One way to be perfectly clear in the code is the following: timerValue = TMR0L; timerValue |= TMR0H << 8; This code is guaranteed to read the registers in the correct order and should be no less efficient. The left shift by 8 will probably not happen explicitly because the compiler is smart enough to simply read the value of TMR0H and store it in the high byte of timerValue. These concurrency issues can appear in many areas of computer programming. If your program is using interrupts then it is possible to see variables partially updated when an interrupt occurs causing the same concurrency issues. Some bigger computers use real-time operating systems to provide multi-tasking. Sharing variables between the tasks is another opportunity for concurrency issues. There are many solutions, for now just be aware that these exist and they will affect your future code. Timer 0 is probably the easiest timer on a PICmicrocontroller. It has always been very basic and its simplicity makes it the best timer to play with as you learn how they work. Once you feel you have mastered timer 0, spend some time with timer 1 and see what additional features it has. Once again, the project is in the attached files. Good Luck. exercise_6.zip
  11. I decided to write this up as bootloaders have pretty much become ubiquitous for 32-bit projects, yet I was unable to find any good information on the web about how to use linker scripts with XC32 and MPLAB-X. When you need to control where the linker will place what part of your code, you need to create a linker script which will instruct the linker where to place each section of the program. Before we get started you should download the MPLAB XC32 C/C++ Linker and Utilities Users Guide. There is also some useful information in the MPLAB XC32 C/C++ Compiler User’s Guide for PIC32M MCUs, the appropriate version for your compiler should be in the XC32 installation folder under "docs". This business of linker scripts is quite different from processor to processor. I have recently been working quite a bit with the PIC32MZ2048EFM100, so I will target this to this device using the latest XC32 V2.15. This post will focus on what you need to to do to get the tools to use your linker script. Since XC32 is basically a variant of the GNU C compiler you can find a lot of information on the web about how to write linker scripts, here is a couple. http://www.scoberlin.de/content/media/http/informatik/gcc_docs/ld_3.html https://sourceware.org/binutils/docs-2.17/ld/Scripts.html#Scripts Adding a linker script The default linker script for the PIC32MZ2048EFM100 can be found in the compiler folder at /xc32/v2.15/pic32mx/lib/proc/32MZ2048EFM100/p32MZ2048EFM100.ld. If you need a starting point that would be a good place. For MPLAB-X and XC32 the extention of the linker script does not have any meaning. The linker script itself is not compiled, it is passed into the linker at the final step of building your program. The command line should look something like this for a simple program: "/Applications/microchip/xc32/v2.15/bin/xc32-gcc" -mprocessor=32MZ2048EFM100 -o dist/default/production/mine.X.production.elf build/default/production/main.o -DXPRJ_default=default -legacy-libc -Wl,--defsym=__MPLAB_BUILD=1,--script="myscript.ld",--no-code-in-dinit,--no-dinit-in-serial-mem,-Map="dist/default/production/mine.X.production.map",--memorysummary,dist/default/production/memoryfile.xml" The linker script should be listed on the command line as "--script="name" When you create a new project MPLAB will create a couple of "Logical Fodlers" for you. These folders are not actual folders on your file system, but files in these are sometimes treated differently, and Linker Files is a particular case of this. My best advice is not to ever rename of in any other way mess with these folders created for you by MPLAB. If you did edit configurations.xml or renamed any of these I suggest you just create new project file as there are so many ways this could go wrong fixing it will probably take you longer than just re-creating it. I have seen cases where it all looks 100% but the IDE simply does not use the linker script, just ignoring it. The normal way to add files to a MPLAB-X project is to right-click on the Logical folder you wanted the file to appear in and select which kind of file under the "New" menu. In this menu files that you use often are shown as a shortcut, to see the entire list of possible files you need to select "Other..." at the bottom of the list. Unfortunatley Microchip has not placed "Linker Script" in this list, so there is no way to discover using the IDE how to add a linker script. When it all goes according to plan (the happy path) you can simply right-click on "Linker Files" and add your script. This is also what the manual says to do of course. When you have added the file it should look like this (pay careful attention to the icon of the linker script file, it should NOT have a source code icon. It should just be a white block like this, and if this is the case the program should compile just fine using the linker script, you can confirm that the script is being passed in by inspecting the linker command line. Adding a linker script - Problems - when it all goes wrong! I noticed in the IDE that the icon for the script was actually that of a .C source file. When this happens something has gone very wrong, and the compiler will attempt to compiler your linker script as a C source file. You will end up getting an error similar to this, stating that there is "No rule to make target": CLEAN SUCCESSFUL (total time: 51ms) make -f nbproject/Makefile-default.mk SUBPROJECTS= .build-conf make[2]: *** No rule to make target 'build/default/production/newfile.o', needed by 'dist/default/production/aaa.X.production.hex'. Stop. make[1]: Entering directory '/Users/cobusve/MPLABXProjects/aaa.X' make[2]: *** Waiting for unfinished jobs.... make -f nbproject/Makefile-default.mk dist/default/production/aaa.X.production.hex make[2]: Entering directory '/Users/cobusve/MPLABXProjects/aaa.X' make[1]: *** [.build-conf] Error 2 "/Applications/microchip/xc32/v2.15/bin/xc32-gcc" -g -x c -c -mprocessor=32MZ2048EFM100 -MMD -MF build/default/production/main.o.d -o build/default/production/main.o main.c -DXPRJ_default=default -legacy-libc make: *** [.build-impl] Error 2 make[2]: Leaving directory '/Users/cobusve/MPLABXProjects/aaa.X' nbproject/Makefile-default.mk:90: recipe for target '.build-conf' failed make[1]: Leaving directory '/Users/cobusve/MPLABXProjects/aaa.X' nbproject/Makefile-impl.mk:39: recipe for target '.build-impl' failed BUILD FAILED (exit value 2, total time: 314ms) I tried jumping through every hoop here, even did the hokey pokey but nothing would work to get the IDE to accept my linker script! I even posted a question on the forum here and got no help. At first I thought I would be clever and remove the script I just added, and just re-add it to the project, but no luck there. So now I was following the instructions exactly, my project was building without the script, I right-clicked on "Linker Files" selected "Add Existing Item" and then selected my script and once again it showed up as a source file and caused the project build to fail by trying to compile this as C code 😞 Next attempt was to remove the file, then close the IDE. Open the IDE, build the project and then after this add the existing file. Nope, still does not work 😞 I know MPLAB-X from time to time will cache information and you can get rid of this by deleting everything from the project except for your source files, Makefile and configurations.xml and project.xml. I went ahead and deleted all these files, restarted the IDE, added the file again - nope - still does not work. So much for RTFM! Eventually our of desperation I tried to rename the file before adding it back in. Even this did not work until I got lucky - I changed the extention of the file to gld (a commonly used extension for gnu linker files), and tried to re-add the file, and this eventually worked ! If you are having a hard time getting MPLAB-X to add your linker script, do not dispair. You are probably not doing anywthing wrong! The right way to add a linker script to your project is indeed to just add it to "Linker Files" as they say, sometimes you just get unlucky due to some exotic bugs in the IDE. Just remove the file from your project, change the extention to something else (it seems like you can choose anything as long as it is different) and add the file back in and it should work. If not come back here and let me know and we can figure it out together :)
  12. Ok great news I figured out what was going wrong! I was working with an old project file. The project was not using a linker script before. It turns out that MPLAB is doing all kinds of strange things in the background to figure out that it has to treat files in the Logical Folder called by "name=LinkerScript" and "displayname=Linker Files" as linker scripts instead of C files, and once it has gotten itself confused about this there is no going back without recreating the entire project file. Now since ours contained hundreds of source files we tried to avoid this but alas, turns out there is not really another way :( There is an example here https://www.microchip.com/forums/m651658.aspx on how to add the item back in. This seems to only work if you add it in AND rename the item BEFORE opening the project in MPLAB-X, if you open the project first you will be out of luck. For now you will have to do a lot of trial and error, or just re-create the project if you need to add a linker script, and even then good luck, the IDE can muck it up quite easily! I think I see a blog post coming on how to get a linker script into your MPLAB-X project. It seems to be harder than it should be! Edit: I have written up my experience in a blog entry here:
  13. That is covered!!! As explained in the article and demonstrated in the 5'th listing, printf is setup by the putch function. Of course I did not do something as mundane as Hello, world!, but I did produce floating point values for the graph.
  14. Remember, this timer counts up.and you get the interrupt when it rolls over. To interrupt at a precise interval, you must compute the number of "counts" required for that interval and then subtract from 65535 to determine the timer load value. void setTimer(unsigned int intervalCounts) { TMR1ON = 0; TMR1 = 65535 - intervalCounts; TMR1ON = 1; } By turning the timer off and then setting the counts and restoring the timer, you can be sure that you will not get unexpected behavior if the timer value is written in an unexpected order. I will cover this topic in the next blog post on timers.
  15. There is one problem we keep seeing bending people's minds over and over again and that is race conditions under concurrency. Race conditions are hard to find because small changes in the code which subtly modifies the timing can make the bug disappear entirely. Adding any debugging code such as a printf can cause the bug to mysteriously disappear. We spoke about this kind of "Heisenbug" before in the lore blog. In order for there to be a concurrency bug you have to either have multiple threads of execution (as you would have when using an RTOS) or, like we see more often in embedded systems, you are using interrupts. In this context an interrupt behaves the same as a parallel execution thread and we we have to take special care whenever both contexts of execution will be accesing the same variable. Usually this cannot be avoided, we e.g. receive data in the ISR and want to process it in the main loop. Whenever we have this kind of interaction where we need to access shared data which will be accessed by multiple contexts we need to serialize access to the data to ensure that the two contexts will safely navigate this. You will see that a good design will minimise the amount of data that will be shared between the two contexts and carefully manage this interaction. Simple race condition example The idea here is simply that we receive bytes in the ISR and process them in the main loop. When we receive a byte we increase a counter and when we process one we reduce the counter. When the counter is at 0 this means that we have no bytes left to process. uint8_t bytesInBuffer = 0; void interrupt serialPortInterrupt(void) { enqueueByte(RCREG); // Add received byte to buffer bytesInBuffer++; } void main(void) { while(1) { if ( bytesInBuffer > 0 ) { uint8_t newByte = dequeueByte(); bytesInBuffer--; processByte(newByte); } } } The problem is that both the interrupt and the mainline code access the same memory here and these interactions last for multiple instruction cycles. When an operation completes in a single cycle we call it "atomic" which means that it is not interruptable. There are a numbe of ways that instructions which seem to be atomic in C can take multiple machine cycles to complete. Some examples: Mathematical operations - these often use an accumulator or work register Assignment. If I have e.g. a 32bit variable on an 8-bit processor it can take 8 cycles or more to do x = y. Most pointer operations (indirect access). [This one can be particularly nasty btw.] Arrays [yes if you do i[12] that actually involves a multiply!] In fact this happens at two places here, the counter as well as the queue containing they bytes, but we will focus only on the first case for now, the counter "bytesInBuffer". Consider what would happen if the code "bytesInBuffer++" compiled to something like this: MOVFW bytesInBuffer ADDLW 1 MOVWF bytesInBuffer Similarly the code to reduce the variable could look like this: MOVFW bytesInBuffer SUBLW 1 MOVWF bytesInBuffer The race happens once the main execution thread is trying the decrement the variable. Once the value has been copied to the work register the race is on. If the mainline code can complete the calculation before an interrupt happens everything will work fine, but this will take 2 instructions. If an interrupt happens after the first instruction or after the 2nd instruction the data will be corrupted. Lets look at the case where there is 1 byte in the buffer and the main code starts processing this byte, but before the processing is complete another byte arrives. Logically we would expect the counter to be incremented by one in the interrupt and decremented by 1 in the mainline code, so the end result should be bytesInBuffer = 1, and the newly received byte should be ready for processing. The execution will proceed something like this - we will simplify to ignore clearing and checking of interrupt flags etc. (W is the work register, I_W is the interrupt context W register): // State before : bytesInBuffer = 1. Mainline is reducing, next byte arrives during operation Mainline Interrupt // state ... [bytesInBuffer = 1] MOVFW bytesInBuffer [bytesInBuffer = 1, W=1] SUBLW 1 [bytesInBuffer = 1, W=0] MOVFW bytesInBuffer [bytesInBuffer = 1, W=0,I_W=1] ADDLW 1 [bytesInBuffer = 1, W=0,I_W=2] MOVWF bytesInBuffer [bytesInBuffer = 2, W=0,I_W=2] MOVWF bytesInBuffer [bytesInBuffer = 0, W=0,I_W=2] ... [bytesInBuffer = 0] As you can see instead of ending up with bytesInBuffer = 0 instead of 1 and the newly received byte is never processed. This typically leads to a bug report saying that the serial port is either losing bytes randomly or double-receiving bytes e.g. UART (or code) having different behaviors for different baudrates - PIC32MX - https://www.microchip.com/forums/m1097686.aspx#1097742 Deadlock Empire When I get a new junior programmer in my team I always make sure they understand this concept really well by laying down a challenge. They have to defeat "The Deadlock Empire" and bring me proof that they have won the BossFight at the end. Deadlock Empire is a fun website which uses a series of examples to show just how hard proper serialization can be, and how even when you try to serialze access using mutexes and/or semaphores you can still end up with problems. You can try it out for youerself - the link is https://deadlockempire.github.io Even if you are a master of concurrency I bet you will still learn something new in the process! Serialization When we have to share data between 2 or more contexts it is critical that we properly serialize access to it. By this we mean that the one context should get to complete it's operation on the data before the 2nd context can access it. We want to ensure that access to the variables happen in series and not in parallel to avoid all of these problems. The simplest way to do this (and also the least desireably) is to simply disable interrupts while accessing the variable in the mainline code. That means that the mainline code will never be interrupted in the middle of an operation on the shared date and so we will be safe. Something like this: // State before : bytesInBuffer = 1. Mainline is reducing, next byte arrives during operation Mainline Interrupt // state ... [bytesInBuffer = 1] BCF GIE /* disable interrupts */ [bytesInBuffer = 1] MOVFW bytesInBuffer [bytesInBuffer = 1, W=1] SUBLW 1 [bytesInBuffer = 1, W=0] MOVWF bytesInBuffer [bytesInBuffer = 0, W=0] BSF GIE /* enable interrupts */ [bytesInBuffer = 0, W=0] MOVFW bytesInBuffer [bytesInBuffer = 0, W=0, I_W=0] ADDLW 1 [bytesInBuffer = 0, W=0, I_W=1] MOVWF bytesInBuffer [bytesInBuffer = 1, W=0, I_W=1] ... [bytesInBuffer = 1] And the newly received byte is ready to process and our counter is correct. I am not going to go into fancy serialization mechanisms such as mutexes and semaphores here. If you are using a RTOS or are working on a fancy advanced processor then please do read up on the abilities of the processor and the mecahnisms provided by the OS to make operations atomic, critical sections, semaphores, mutexes and concurrent access in general. Concurrency Checklist We always want to explicitly go over a checklist when we are programming in any concurrent environment to make sure that we are guarding all concurrent access of shared data. My personal flow goes something like this: Do you ever have more than one context of execution? Interrupts Threads Peripherals changing data (We problems with reading/writing 16-bit timers on 8-bit PIC's ALL OF THE TIME!) List out all data that is accessed in more than one context. Look deep, sometimes it is subtle e.g. accessing an array index may be calling multiply internally. Don't miss registers, especially 16-bit values like Timers or ADC results Is it possible to reduce the amount of data that is shared? Take a look at the ASM generated by the compiler to make 100% sure you understand what is happening with your data. Operations that look atomic in C are often not in machine instructions. Explicitly design the mechanism for serializing between the two contexts and make sure it is safe under all conditions. Serialization almost always causes one thread to be blocked, this will slow down either processing speed by blocking the other thread or increase latency and jitter in the case of interrupts We only looked into the counter variable above. Of course the queue holding the data is also shared, and in my experience I see much more issues with queues being corrupted due to concurrent access than I see with simple counters, so do be careful with all shared data. One last example I mentioned 16-bit timers a couple of times, I have to show my favourite example of this, which happens when people write to the timer register without stopping the timer. // Innocent enough looking code to update Timer1 register void updateTimer1(uint16_t value) { TMR1 = value; } // This is more common, same problem void updateTimer1_v2(uint16_t value) { TMR1L = value >> 8; TMR1H = value & 0xFF; } With the code above the compiler is generally smart enough not to do actual right shifts or masking, realizing that you are working with the lower and upper byte and this code compiles in both cases to something looking like this: MOVFW value+1 // High byte MOVWF TMR1H MOVFW value // Low byte MOVWF TMR1L And people always forget that when the timer is still running this can have disastrous results as follows when the timer increments in the middle of this all like this: // We called updateTimer1(0xFFF0) - expecting a period of 16 cycles to the next overflow // We show the register value to the right for just one typical error case // The timer ticks on each instruction ... [TMR1 = 0x00FD] MOVFW value+1 [TMR1 = 0x00FE] MOVWF TMR1H [TMR1 = 0xFFFF] MOVFW value [TMR1 = 0x0000] MOVWF TMR1L [TMR1 = 0x00F0] ... [TMR1 = 0x00F1] ... [TMR1 = 0x00F2] This case is always hard to debug because you can only catch it failing when the update is overlapping with the low byte overflowing into the high byte, which happens only 1 in 256 timer cycles, and since the update takes 2 cycles we have only a 0.7% chance of catching it in the act. As you can see, depending on whether you clear the interrupt flag after this operation or not you can end up with either getting an immediate interrupt due to the overflow in the middle of the operation, or a period which vastly exceeds what you are expecting (by 65280 cycles!). Of course if the low byte did not overflow in the middle of this the high byte is unaffected and we get exactly the expected behavior. When this happens we always see issues reported on the forums that sound something like "My timer is not working correctly. Every once in a while I get a really short or really long period" When I see these issues posted in future I will try to remember to come update this page with links just for fun πŸ™‚
  16. Arghhh, why! I will see if I can beg it to listen then...
  17. Use a less buggy version of MPLAB-X? Hard to choose, I know. I pulled up my old project in MPLAB-X 2.35 and it's pulling my .ld file from MPLAB's Linker Files folder.
  18. I am trying to use a linker script with MPLAB-X for my PIC32 project but for some reason the script is not being passed to the linker at all. I expected that all I had to do was add the .ld file to my project, typically by placing it in the "Linker Files" virtual folder in MPLAB-X in my project. I did this and the linker script is being ignored by the linker. This is one of those $100 questions (if you know the story of the mechanic asking $100 for knowing where to hit ...). So my question is how do I get MPLAB-X to use my linker script which I have added to the PIC32 project?
  19. Oh by the way I figured this one out, I am surprized nobody jumped on this easy question! The syntax for hexmate in this case was simply hexmate <file1> <file2> -o <outputname> All I had to do was build the project from command line, pass to hexmate the resulting hex file as well as the bootloader hex file and it spits out the combination of the 2. Hexmate is such a simple little program I felt like just writing a script to do what it is doing, but with the continuation addresses in .hex files it turns out using this versatile tool is by far the best solution!
  20. Saw this question today on the MCHP forum so I thought I would post a short example. My first advice to anybody who has this kind of problem is to generate the code with MCC and study what it is doing in terms of order of initialization and also what it is setting in the configuration bits for the device. Even if you are planning on using ASM, do that first with your settings so you can have a working example to start with! Here is a minimal example on the PIC16F18344 for TMR1 #pragma config WDTE = OFF // Watchdog Timer Enable bits (WDT disabled; SWDTEN is ignored) #include <xc.h> void main(void) { INTCONbits.PEIE = 1; // Enable peripheral interrupts TMR1IE = 1; // Enable TMR1 peripheral interrupt T1CON = 0x01; // Set the timer to prescaler 1:1 (fastest), clock source instruction clock // and pick timer2 clock in as source (not Secondary Oscillator) // we also select synchronization, no effect as we run of FOSC here. TMR1H = 0x00; TMR1L = 0x00; INTCONbits.GIE = 1; // Enable GIE last, or interrupts can happen while we are setting up! while(1); // Wait forever for interrupts return; } void __interrupt() myISR(void) { NOP(); TMR1IF = 0; // Remember to clear the IF here or we will just end up in the ISR all of the time! }
  21. Awesome! What do I have to do to get that to do printf("Hello, world") ?
  22. It has been a busy few weeks but finally I can sit down and write another installment to this series on embedded programming. Today's project will be another peripheral and a visualization tool for our development machines. We are going to learn about the UART. Then we are going to write enough code to attach the UART to a basic C printing library (printf) and finally we are going to use a program called processing to visualize data sent over the UART. This should be a fun adventure so let us get started. UART (USART?) The word UART is an acronym that stands for Universal Asynchronous Receiver Transmitter. Some UART's have an additional feature so they are Universal Synchronous Asynchronous Receiver Transmitter or USART. We are not going to bother with the synchronous part so let us focus on the asynchronous part. The purpose of the UART is to send and receive data between our micro controller and some other device. Most frequently the UART is used to provide some form of user interface or simply debugging information live on a console. The UART signaling has special features that allow it to send data at an agreed upon rate (the baud rate) and an agreed upon data format (most typically 8 data bits, no parity and 1 stop bit). I am not going to go into detail on the UART signaling but you can easily learn about it by looking in chapter 36 of the PIC16F18446 datasheet. Specifically look at figure 36-3 if you are interested. UARTs are a device that transfer bytes of data (8-bits in a byte) one bit at a time across a single wire (one wire for transmit and one for receive) and a matching UART puts the bits back into a byte. Each bit is present on the wire for a brief but adjustable amount of time. Sending the data slowly is suitable for long noisy wires while sending the data fast is good for impatient engineers. A common baud rate for user interfaces is 9600 baud which is sends 1 letter every millisecond (0.001 seconds). Many years ago, before the internet, I had a 300 baud modem for communicating with other computers. When I read my e-mail the letters would arrive slightly slower than I could read. Later, after the internet was invented and we used modems to dial into the internet, I had a 56,600 baud modem so I could view web-pages and pictures more quickly. Today I use UARTS to send text data to my laptop and with the help of the processing program we can make a nice data viewer. Enough history... let us figure out the UART. Configuring the UART The UART is a peripheral with many options so we need to configure the UART for transmitting and receiving our data correctly. To setup the UART we shall turn to page 571 of our PIC16F18446 data sheet. The section labeled 36.1.1.7 describes the settings required to transmit asynchronous data. And now for some code. Here is the outline of our program. void init_uart(void) { } int getch(void) { } void putch(char c) { } void main(void) { init_uart(); while(1) { putch(getch()+1); } } This is a common pattern for designing a program. I start by writing my "main" function simply using simple functions to do the tricky bits. That way I can think about the problem and decide how each function needs to work. What sort of parameters each function will require and what sort of values each function will return. This program will be very simple. After initializing the UART to do its work, the program will read values from the UART, add 1 to each value and send it back out the UART. Since every letter in a computer is a simple value it will return the next letter in each series that I type. If I type 'A' it will return 'B' and so on. I like this test program for UART testing because it will demonstrate the UART is working and not simply shorting the receive pin to the transmit pin. Now that we have an idea for each function. Let us write them one a time. init_uart() This function will simply write all the needed values into the special function registers to activate the UART. To do that, we shall follow the checklist in the data sheet. We need to configure both the transmit and the receive function. Step 1 is to configure the baud rate. We shall use the BRG16 because that mode has the simplest baud rate formula. Of course, it is even easier when you use the supplied tables. I always figure that if you are not cheating, you are not trying so lets just find the 32MHz column in the SYNC=0, BRGH=0, BRG16 = 1 table. I like 115.2k baud because that makes the characters send much faster. So the value we need for the SPBRG is 16. Step 2 is to configure the pins. We shall configure TX and RX. To do that, we need the schematics. The key portion of the schematics is this bit. The confusing part is, the details of the TX and RX pin. If I had a $ for every time I got these backwards, I would have retired already. Fortunately the team at Microchip who drew these schematics were very careful to label the CDC TX connecting to the UART RX and the CDC RX connecting to the UART TX. They also labeled the UART TX as RB4 and UART RX as RB6. This looks pretty simple. Now we need to steer these signals to the correct pins via the PPS peripheral. NOTE: The PPS stands for Peripheral Pin Select. This feature allows the peripherals to be steered to almost any I/O pin on the device. This makes laying out your printed circuit boards MUCH easier as you can move the signals around to make all the connections straight through. You can also steer signals to more than one output pin enabling debugging by steering signals to your LED's or a test point. After steering the functions to the correct pins, it is time to clear the ANSEL for RB6 (making the pin digital) and clear TRIS for RB4 (making the pin an output). The rest of the initialization steps involve putting the UART in the correct mode. Let us see the code. void init_uart(void) { // STEP 1 - Set the Baud Rate BAUD1CONbits.BRG16 = 1; TX1STAbits.BRGH = 0; SPBRG = 16; // STEP 2 - Configure the PPS // PPS unlock sequence // this should be in assembly because the hardware counts the instruction cycles // and will not unlock unless it is exactly right. The C language cannot promise to // make the instruction cycles exactly what is below. asm("banksel PPSLOCK"); asm("movlw 0x55"); asm("movwf PPSLOCK"); asm("movlw 0xAA"); asm("movwf PPSLOCK"); asm("bcf PPSLOCK,0"); RX1PPSbits.PORT = 1; RX1PPSbits.PIN = 6; RB4PPS = 0x0F; // Step 2.5 - Configure the pin direction and digital mode ANSELBbits.ANSB6 = 0; TRISBbits.TRISB4 = 0; // Step 3 TX1STAbits.SYNC = 0; RC1STAbits.SPEN = 1; // Step 4 TX1STAbits.TX9 = 0; // Step 5 - No needed // Step 1 from RX section (36.1.2.1) RC1STAbits.CREN = 1; // Step 6 TX1STAbits.TXEN = 1; // Step 7 - 9 are not needed because we are not using interrupts // Step 10 is in the putchar step. } To send a character we simply need to wait for room in the transmitter and then add another character. void putch(char c) { while(!TX1IF); // sit here until there is room to send a byte TX1REG = c; } And to receive a character we can just wait until a character is present and retrieve it. int getch(void) { while(!RC1IF); // sit here until there is a byte in the receiver return RC1REG; } And that completes our program. Here is the test run. Every time I typed a letter or number, the following character is echoed. Typing a few characters is interesting and all but I did promise graphing. We shall make a simple program that will stream a series of numbers one number per row. Then we will introduce processing to convert this stream into a graph. The easiest way to produce a stream of numbers is to use printf. This function will produce a formatted line of text and insert the value of variables where you want them. The syntax is as follows: value = 5; printf("This is a message with a value %d\n",value); To get access to printf you must do two things. 1) You must include the correct header file. So add the following line near the top of your program. (by the xc.h line) #include <stdio.h> 2) You must provide a function named putch that will output a character. We just did that. So here is our printing program. #include <stdio.h> #include <math.h> /// Lots of stuff we already did void main(void) { double theta = 0; init_uart(); while(1) { printf("%f\r\n",sin(theta)); theta += 0.01; if(theta >= 2 * 3.1416) theta = 0; } } Pictures or it didn't happen. And now for processing. Go to http://www.processing.org and download a suitable version for your PC. Attached is a processing sketch that will graph whatever appears in the serial port from the PIC16. And here is a picture of it working. I hope you found this exercise interesting. As always post your questions & comments below. Quite a lot was covered in this session and your questions will guide our future work. Good Luck exercise_5a.zip exercise_5.zip graph_data.pde
  23. A colleague of mine recommended this little book to me sometime last year. I have been referring to it so often now that I think we should add this to our reading list for embedded software engineers. The book is called "Don't make me think", by Steve Krug.The one I linked below is the "Revisited" version, which is the updated version. This book explains the essense of good user interface design, but why would I recommend this to embedded software engineers? After all embedded devices seldom have rich graphical GUI's and this book seems to be about building websites? It turns out that all the principles that makes a website easy to read, that makes for an awesome website in other words, apply almost verbatim to writing readable/maintainable code! You see code is written for humans to read and maintain, not for machines (machines prefer to read assembly or machine code in binary after all!). The principles explained in this book, when applied to your software will make it a pleasure to read, and effortless to maintain, because it will clearly communicate it's message without the unnecessary clutter and noise that we usually find in source code. You will learn that people who are maintaining and extending your code will not be reasoning as much as they will be satisficing (yes that is a real word !). This forms the basis of what Bob Martin calls "Viscosity" in your code. (read about it in his excellent paper entitled Design Principles and Design Patterns. The idea of Viscosity is that developers will satisfice when maintaining or extending the code, which results in the easiest way to do things being followed most often, so if the easiest thing is the correct thing the code will not rot over time, on the other hand if doing the "right" thing is hard people will bypass the design with ugly hacks and the code will become a tangled mess fairly quickly. But I digress, this book will help you understand the underlying reasons for this and a host of other problems. This also made me think of some excellent videos I keep on sending to people, this excellent talk by Chandler Carruth which explains that, just like Krug explains in this little book, programmers do not actually read code, they scan it, which is why consistency of form is so important (coding standards). Also this great talk by Kevlin Henney which explains concepts like signal to noise ratio and other details about style in your code (including how to write code with formatting which is refactoring immune - hint you should not be using tabs - because of course only a moron would use tabs) Remember, your code is the user interface to your program for maintainers of the code who it was written for in the first place. Let's make sure they understand what the hell it is you were doing before they break your code! For the lazy - here is an Amazon share link to the book, click it, buy it right now! https://amzn.to/2ZEoO4O
  24. I have a MPLAB-X project which uses a loadable (it is combining my program with a bootloader which is in another project). I need to compile this project from the command line for CI automation. For the entire build process every command executed is nicely printed in the build window, but for the loadable it claims to be using Hexmate, but the command line to execute it is not shown at all. Can anyone help me with the syntax using Hexmate to get the same behavior as adding the Loadable from MPLAB-X?
  25. Beginners always have a hard time understanding when to use a header file and what goes in the .c and what goes in the .h. I think the root cause of the confusion is really a lack of information about how the compilation process of a C program actually works. C education tends to focus on the semantics of loops and pointers way too quickly and completely skips over linkage and how C code is compiled, so today I want to shine a light on this a bit. C code is compiled as a number of "translation units" (I like to think of them as modules) which are in the end linked together to form a complete program. In casual usage a "translation unit" is often referred to as a "Compilation Unit". The process of compilation is quite nicely described in the XC8 Users Guide in section 4.3 so we will look at some of the diagrams from that section shortly. Compilation Steps Before we get into a bit more detail I want to step back slightly and quite the C99 standard on the basic idea here. Section 5.1.1.1 of the C99 standard refers to this process as follows: A C (or C++ for that matter) compiler will process your code as shown to the right (pictures from XC8 User's Guide). The C files are compiled completely independenty of each other into a set of object files. After this step is completed the object files are linked together, like a set of Lego blocks, into a final program during the second stage. It is possible for 2 entirely different programs to share some of the same obj files this way. Re-usable object files that perform common functions are often bundled together into an archive of object files referred to as a "Library" by the standard, and is often "zipped" together into a single file called .lib or .a This sequence is pretty much standard for all C compilers as depicted on the right. Looking at XC8 in particular there is a little more details that only applies to this compiler. The PIC16 also poses some challenges for compilers as the architecture has banked memory. This means that moving code around from one location to another may not only change the addreses of objects (which is quite standard) but it may also require some extra instructions (bank switch instructions) to be added depending on where the code is ultimately placed in memory. We will not get any deeper into the details here, but I want to point out the most important aspects. Some useful tips: Most compilers will have an option to emit and keep the output from the pre-processor so that you can look at it. When debugging your #define's and MACROs are getting you under this is an excellent debugging tool. With the latest version of XC8 the option to keep these files is "-save-temps" which can be passed to the linker as an additional argument. (They will end up in a folder called "build" and the .pre files in the diagram may have an extention ".i" depending on the processor you are on). During the linking step all the objects (translation units) which will be combined to create the final program will be linked together to produce an executable. This process will decide where each variable and function will be placed, and all symbolic references are replaced with actual addresses. This process is sometimes referred to as allocation, re-allocation or fix-up. At this step it is possible to supply most linkers with a linker file or "linker script" which will guide the linker about which memory locations you want it to use. Although the C standard does not specify the format of the object files, some common formats do exist. Most compilers used to use the COFF format (Common Object File Format) which typically produces files with the extension .o or .obj. Another popular format favored by many compilers today is the ELF (Executable and Linkable Format). The most important thing to take away from all that is that your C files the H files they includ will be combined into a single translation unit which will be processed by the compiler. The compiler literally just pastes you include file into the translation unit at the place you include it. There is no magic here. So why do I need an H file at all then? As noted in the standard different translation units communicate with each other through either calling functions with external linkage, or manipulating objects with external linkage. When 2 translation units communicate in this way it is very important that they both have the exact same definition for the objects they are exchanging. If we just had the definitions in C files without any headers then the definitions of everything that is shared would have to be very carefully re-typed in each file, and all of these copies would later have to be maintained. It really is that simple, the only reason we have header files is to save us from maintaining shared code in multiple places. As you should have noticed by now the descriptions of translation units sound very similar to "libraries" or "modules" of your program, and this is precisely what they are. Each translation unit is an independent module which may or may not be dependent on one or more other modules, and you can use these concepts to split your programs into more managable and re-usable modules. This is the divide and conquer strategy. In this scheme the sole purpose of header files is to be used by more than one translation unit. They represent a definition of the interface that 2 modules in your program can use to communicate with each other and saves you from typing it all multiple times. Let's look at a simple example of a how a header file named module.h may be processed when it is included into your C code. // This is module.h - the header file // This file defines the interface specification for this module. // It contains all definitions of functions and/or variables with external linkage // The purpose of this file is to provide other translation units with the names of the objects that this translation unit // provides, so that they can be used to communicate with this translation unit. // This declaration promises the compiler that somewhere there will be a function called "myFunction" which the linker will be able to resolve void myFunction(void); // This declaration promises the compiler that somewhere there will be a variable called "i" which the linker will be able to resolve extern int i; And it's corresponding C source file module.c // This is module.c - the C file for my module // This file contains the implemenation of my module // It is typical for a module to include it's own interface, this makes it easier to implement by ensuring the interface and implemenation are identical #include "module.h" // Declaring a variable like this will allocate storage for it. // In C Variables with global scope has external linkage by default (This is NOT true for C++ where this would have internal linkage) int i = 42; // Functions have external linkage by default, it is not necessary to say extern void myFunction as the "extern" is implied void myFunction(void) { ... // some code here } As described above the pre-processor will convert this into a file for the compiler to process which looks like this : // This is module.c - the C file for my module // This file contains the implemenation of my module // It is typical for a module to include it's of interface, this makes it easier to implement in many ways // This is module.h - the header file // This file defines the interface specification for this module. // It contains all definitions of functions and/or variables with external linkage // The purpose of this file is to provide other translation units with the names of the objects that this translation unit // provides, so that they can be used to communicate with this translation unit. // This declaration promises the compiler that somewhere there will be a function called "myFunction" which the linker will be able to resolve void myFunction(void); // This declaration promises the compiler that somewhere there will be a variable called "i" which the linker will be able to resolve extern int i; // Defining a variable like this will allocate storage for it. // In C Variables with file scope has external linkage by default (This is NOT true for C++ where this would have internal linkage) int i = 42; // Functions have external linkage by default, it is not necessary to say extern void myFunction as the "extern" is implied void myFunction(void) { // some code here } void main(void) { myFunction(void); } Now you will notice that with the inclusion of the header like this some things like the "int i" end up occuring in the file twice. This can be very confusing when we try and establish exactly which of these statements are just declarations of the names of variables and which ones actually allocate memory. If a symbol like "int i" is declared more than once in the file how do we ensure that memory is not allocated more than once, especially if "int i" occurs in the global file scope of more than one tranlsation unit! In order to make more sense of this we can go over how the compiler will process the combined "tranlation unit" from top to bottom for our simple example. When the compiler processes this file it first finds a declaration of a function without an implementation/definition. This tells the compiler to only declare this name in what is commonly referred to as it's "dictionary". Once the name is established it is possible for the implementation to safely refer to this name. Such a declaration of a function without an implementation is called a function prototype. The next code line contains a declaration of an integer called "i" with external linkage (we will get to linkage in the next section). This is a declaration as opposed to a definition, as it does not have any initializer (an assignment with an initial value). This declaration places "i" in the dictionary, but does not allocate storage for the variable. It also marks the object as having external linkage. When 2 compilation units declare the same object with external linkage the compiler will know that they are linked (refer to the same thing), and it will only allocate space for it once so that both translation units end up manipulating the same variable! Later on the compiler finds "int i = 42", this is a definition of the same symbol "i", this time it also supplies an initializer, which tells the compiler to set this variable to 42 before main is run. As this is a definition this is the statement that will cause memory to be allocated for the variable. If you try and have 2 definitions for the same object (even in 2 separate translation units) the compile will report an error which will alert you that the object was defined more than once. It will either say "duplicate symbol" or "multiple definition for object" or something along these lines (error messages are not specified by the standard so these messages are different on each compiler). Next we encounter the implementation/definition of the function myFunction. Lastly we encounter the implementation/definition of main, which is traditionally the entry point of the application. I encourage you to cut and paste that snippet above into an empty project and compile it to assure yourself that this works fine. After that I want you to paste these examples so we can better understand the mechanics here, and prove that I am not smoking something here! // Some test code to show why include files actually end up working int j; int j; int j = 1; void main(void) { // Empty main } You can compile this and note that there will be no error. (a project with this code is attached for your convenience). This is because int j; is what is called a "tentative definition". What this means is that we are stating that there is a definition for this variable in this translation unit. If the end of the tranlation unit is reached and no definition has been provided (there was no definition with an initializer) then the compiler must behave as if there was a definition with an initializer of "0" at the end of the translation unit. You can have as many tentative definitions as you want for the same object, even if they are in the same compilation unit, as long as their types are all the same. The third line is the only definition of "j" which also triggers the allocation of storage for the variable. If this line is removed storage will be allocated at link time as if there was a definition with initializer of 0 at the end of the file if no definitions can be found in any of the translation units being linked. Now change the code to look as follows: // Some test code to show why include files actually end up working int j; int j = 1; int j = 1; void main(void) { // Empty main } This will result in the following error message, since more than one initializer is provided we have multiple definitions for the object in the same translation unit, which is not allowable. This is because a declaration with an initializer is a definition for a variable and a definition will allocate storage for the variable. We can only allocate storage for a variable once. On CLang the error looks only slightly different: Now let's try something else. Change it to look like this. This time the variable is an auto variable, so it has internal linkage (this is also called a local variable). In this case we are not allowed to declare the same variable more than once because there is no good reason (like with header files) to do this and if this happens it would most likely be a mistake so the compiler will not allow it. // Some test code to show why include files actually end up working void main(void) { int j; int j; int j = 1; } The error produced looks as follows on XC8, and I will get this even if I have only 2 j's with no initializer: An important note here, auto variables (local variables) will NOT be automatically initialized to 0 like variables at file scope with external linkage will be. This means that if you do not supply any initializer the variable can and will likely have any random value. Linkage We spoke about linkage quite a bit, so lets also make this clear. The C Standard states in section 6.2.2: For any identifier with file scope linkage is automatically external. External linkage means that all variables with this identical name in all translation units will be linked together to point to the same object. Variables with file scope which has a "static" storage-class specifier have internal linkage. This means that all objects WITHIN THIS TRANSLATION UNIT with the same name will be linked to refer to the same object, but objects in other translation units with the same symbol name will NOT be linked to this one. Local variables (variables with block or function scope) automatically has no linkage, this means they will never be linked, which means having the same symbol twice will cause an error (as they cannot be linked). An example of this was shown in the last sample block of the previous section. Note that adding "extern" in front of a local variable will give it external linkage, which means that it will be linked to any global variables elsewhere in the program. I have made a little example project to play with which demonstrates this behavior (perhaps to your surprize!) If two tranlation units both contain definitions for the same symbol with external linkage the compiler will only define the object once and both definitions will be linked to the same definition. Since the definitions provide initial values this only works if both definitions are identical. There is a nice example, as always, in the C99 standard. int i1 = 1; // definition, external linkage static int i2 = 2; // definition, internal linkage extern int i3 = 3; // definition, external linkage int i4; // tentative definition, external linkage static int i5; // tentative definition, internal linkage int i1; // valid tentative definition, refers to previous int i2; // 6.2.2 renders undefined, linkage disagreement int i3; // valid tentative definition, refers to previous int i4; // valid tentative definition, refers to previous int i5; // 6.2.2 renders undefined, linkage disagreement extern int i1; // refers to previous, whose linkage is external extern int i2; // refers to previous, whose linkage is internal extern int i3; // refers to previous, whose linkage is external extern int i4; // refers to previous, whose linkage is external extern int i5; // refers to previous, whose linkage is internal // I had to add the missing one extern int i6; // Valid declaration only, whose linkage is external. No storage is allocated. Note that if we have a declaration only such as i6 above in a compilation unit the unit will compile without allocating any storage to the object. At link-time the linker will attempt to locate the definition of the object that allocates it's storate, if none is found in any other compilation unit for the program you will get an error, something like "reference to undefined symbol i6" Looking over those examples you will note that the storage-class specifier "extern" should not be confused with the linkage of the variable. It is very possible that a variable with external storage class can have internal linkage as indicated by the examples from the standard for "i2" and also for "i5". To see if you understand "extern" take a look at this example. What happens when you have one file (e.g. main.c) which defines a local variable as extern like this? [ First try to predict and then test it and see for yourself] #include <stdio.h> void testFunction(void); int i = 1; void main(void) { extern int i; testFunction(); printf("%d\r\n", i); } And in a different file (e.g. module.c) place the following: int i; void testFunction(void) { i = 5; } You should be able to tell if this will compile or not, and if not what error it would give, or will it compile just fine and print 5 ? Also try the following: What happens when you remove the "extern" storage-class specifier? What happens when instead you just emove the entire line "extern int i;" from function main ? (no externs in either file). Is that what you expected? What happens when you move the initializer from file-scope (just leaving the int i), to the function scope definition inside of main (when you have "extern int i = 1;" inside of the main function)? What happens when you add "extern" to the file scope declaration (replace "int i = 1;" with "extern int i = 1;") In Closing When you are breaking your C code into independent "Translation Units" or "Compilation Units", keep in mind that the entire header file is being pasted into your C file whenever you use #include. Keeping this in mind can help you resolve all kinds of mysterious bugs. Make sure you understand when variables have external storage and when they have external linkage. Remember that if 2 modules declare file scope variables with external linkage and the same name, they will end up being the same variable, so 2 libraries using "temp" is a bad idea, as these will end up overwriting each other and causing hard to locate bugs. Answers Cheat Sheet: The code listed compiles fine and prints "5". Additional exercises: When you remove the extern from the first line in function main it prints some random value (on my machine 287088694). This is because the local variable does not have linkage to the other variable called i. When you instead remove the entire first line from the main function it compiles and prints "5" like before. Having "extern int i = 1" inside of function main does not compile at all, complaining that "an extern variable cannot have an initializer" Having "extern int i = 1" in file scope is allowed though! This compiles just fine although CLang will give a warning for this one as long as only one definition exists. If you now also add an initializer to the file scope int i in module.c it will not compile any more.
  26. Hi Orunmila, Thanks for the reply! I understood your analysis and suggestions. I will do the same and comeback with results.
  1. Load more activity
Β 


×
×
  • Create New...