The PICmcu is not known for being fast or supporting large memories but it does have one feature that can significantly simplify developing your applications. That feature is the lack of a hardware call stack.
Sacrilege you say! But wait...
The lack of this important feature has caused the compiler team to develop an incredibly useful alternative that I would argue is better in nearly every way than an actual stack.
For those of you who are now wondering what I am talking about, let's take a quick diversion into stacks.
A stack is simply a data structure that arranges a number of data elements so that the last thing you inserted is the first thing you get back. Think of a stack of plates, you add plates to the top of the stack and remove them in reverse order from the top of the stack. This is important for a few reasons. First, imagine your code was interrupt by a hardware interrupt. The current address of your code is pushed onto the stack, the interrupt runs and your address is popped from the stack so the interrupt can return where it interrupt you. This is a handy feature and for "free" you can handle any number of interruptions. In fact, a function can call itself and so long as there is sufficient room on the stack, everything will be sorted out on the return.
Now, the PICmcu's have hardware stacks for the the function calls and returns. They simply don't have any hardware support for anything else. If the hardware stack is so useful for keeping track of return addresses, it would also be useful for all the parameters your function will need and the values your functions will return. This parameter stack is an important feature of most languages and most especially the C language. The AVR, ARM, 68000, IA86, Z80, VAX11, all have instruction level support for implementing a parameter stack for each function. I have written millions of lines of C code for the PIC16 so how does it do its job without this important part of the language and why do I think this missing features is such a strong strength of the CPU.
The secret to the ability of XC8 to produce reasonably efficient C code for the PIC16 and PIC18 without a stack lies in the "compiled stack" feature. This feature analyses the call tree of your program and determines what the stack would look like at any point in the program. Functions that could be in scope at the same time (consider a multiply function in "main" and the interrupt) are duplicated so there are no parameters that need to be in two places at the same time. Any recursive functions are detected and the user alerted. Finally, the complete stack is converted to absolute addresses and mapped into the physical memory of the CPU. Then all the instructions are fixed up with those absolute addresses. This big-picture view of the program also allows the compiler to move parameters around to minimize banking (banking is extra instructions required to reach addresses further than 128 bytes away) and finally, the finished program is ready to run in your application. The amazing thing is the final memory report is the complete memory requirements of the program INCLUDING THE LOCAL VARIABLES. This is a shocker. In fact, when I was looking at the Arduino forums,. I would frequently encounter users who added one more line of code and suddenly their application stopped working. They were told to go buy the bigger CPU. Imagine if your compiler could tell you if the program would fit and that would include all the stack operations. This is a game changer and I would love to see the industry apply this technology across all CPU's.
There is no reason why any CPU would not be able to operate with a compiled stack. In fact, most CPU's are capable of operating with both kinds of stacks at the same time. The biggest reason against this sort of operation is really in large project management. Consider, you develop a library for some special feature, perhaps GPS token parsing, Now you want to include this library in all of your applications. You MUST NOT recompile this code because it has passed all of your certification testing and is known good "validated" code. (remember, the compiler is permitted to produce a different binary on each run). If you cannot recompile the code, on some architectures you cannot change the addresses as there could be side effects (banking on a PIC16). If your code only relies upon a stack, then there is never any recompiling required and the linker task is dramatically simplified.
Anyway, I must go back into the FreeRTOS world and continue my quest to find the thread with the overflowing stack. Life would be simpler with static analysis tools for the stack.
Until next time.,