Static vs Dynamic Data Containers


N9WXU

Sometimes I get the sad impression that embedded FW engineers only understand one data container, the array.  The array is a powerful thing and there are many good reasons to use it, but it also has some serious problems.  For instance, a number of TCP/IP libraries use arrays to hold a static list of sockets.  When you wish to create a new socket, the library takes one of the unused sockets from the array and returns a reference (a pointer or index) so the socket can be used as a parameter for the rest of the API.  It turns out that sockets are somewhat heavy things (they use lots of memory), so you always want to optimize the system to have the smallest number of sockets necessary.  Unfortunately, you must "pick" a reasonable number of sockets early in the development process.  If you run out of sockets you must go back and recompile the library with the new socket count.  Now there is a dependency that is not obvious, fails only at run time, and ties the feature count of the product to the underlying library.  You will see bugs like, "when I am using the app I no longer get notification e-mails".
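To make the problem concrete, here is a minimal sketch of the fixed-table pattern described above.  The names (MAX_SOCKETS, socket_create and the socket structure itself) are illustrative, not taken from any particular TCP/IP library:

#include <stdbool.h>
#include <stddef.h>

#define MAX_SOCKETS 4                     // picked early and baked into the library

struct socket
{
  bool inUse;
  // buffers, state machines, etc. make this structure "heavy"
};

static struct socket socketTable[MAX_SOCKETS];

struct socket *socket_create(void)
{
  for (int i = 0; i < MAX_SOCKETS; i++)
  {
    if (!socketTable[i].inUse)
    {
      socketTable[i].inUse = true;
      return &socketTable[i];             // reference used by the rest of the API
    }
  }
  return NULL;                            // the fifth socket fails, but only at run time
}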

It turns out that this problem can be easily solved with a dynamic container, i.e. one that grows at runtime as you need it to.  A brute force method would be to rely upon the heap to reallocate the array at runtime and simply give the library a pointer to that array.  That will work, but it inserts a heavy copy operation and the library has to be paused while the old array is migrated to the new array.  I propose that you should consider a linked list.  I hear a number of concerns from other engineers whenever I make this suggestion, so hang tight for just one moment.

Concerns

  1. Allocating the memory requires the heap and my application cannot do that.
  2. Traversing the list is complicated and requires recursion.  We cannot afford the stack space.
  3. A linked list library is a lot of code to solve this problem when a simple array can manage it.
  4. The linking pointers use more memory.

If you have a new concern, post it below.  I will talk about these first.

Concern #1, Memory allocation

I would argue that a heap is NOT required for a linked list.  It is simply the way computer science often teaches the topic.

  1. Allocate a block of memory for the data.
  2. Place the data in the block of memory.  Data is often supplied as function parameters.
  3. Insert the block into the list in the correct place.

Computer science courses often teach linked lists and sorting algorithms at the same time, so this process forms a powerful association.  However, what if the process worked a little differently?

  1. Library Code -> Define a suitable data structure for the user data.  Include a pointer for the linked list.
  2. User Code -> Create a static instance of the library data structure.  Fill it with data.
  3. User Code -> Pass a reference to the data structure to the library.
  4. Library Code -> Insert the data structure into the linked list.

If you follow this pattern, the user code can have as many sockets or timers or other widgets as it has memory for.  The library will manage the list and operate on the elements.  When you delete an element you are simply telling the library to forget it; the memory is always owned by the user application.  That fixes the data count dependency of the array.
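Here is a minimal sketch of that pattern.  The names (widget, widget_register, widgetList) are illustrative, not from any specific library:

#include <stddef.h>   // for NULL

// Library code: the structure carries the user's data plus the link.
struct widget
{
  int userData;
  struct widget *next;
};

static struct widget *widgetList = NULL;   // list head owned by the library

// Library code: link in a node the caller allocated (step 4).
void widget_register(struct widget *w)
{
  w->next = widgetList;                    // insert at the head: O(1), no copying
  widgetList = w;
}

// User code: a static instance the application owns (steps 2 and 3).
static struct widget myWidget;

void app_init(void)
{
  myWidget.userData = 42;
  widget_register(&myWidget);
}

Deleting is the mirror image: the library unlinks the node and the application's memory is untouched.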

Concern #2, Traversing the list is complex and recursive.

First, recursion is always a choice.  Just avoid it if that is a rule of your system.  Every recursive algorithm can be converted to a loop.

Second, traversing the list is not much different from traversing an array.  Following the next pointer can be a little slower than incrementing an index, but the code is no more complicated.

#include <stdio.h>   // for printf

struct object_data
{
  int mass;
  struct object_data *nextObject;
};

int findTheMassOfTheObjects(struct object_data *objectList)
{
  struct object_data *thisObject = objectList;
  int totalMass = 0;

  while(thisObject)
  {
      totalMass += thisObject->mass;
      thisObject = thisObject->nextObject;
  }
  printf("The mass of all the objects is %d grams\n", totalMass);
  return totalMass;
}

So here is a quick example.  It does have the potential of walking off the end of the list and across arbitrary memory if the last object in the list does NOT point at NULL, so keeping the list properly terminated is a pitfall to watch for.

Concern #3, A linked list library is a lot of code

Yes it is.  Don't do that.  A generic library can be written and is a great academic exercise, but most of the time the additional next pointers and a few functions to insert and remove objects are sufficient.  The "library" should be a collection of code snippets that your developers can copy and paste into the code.  This provides reuse but breaks the dependency on a common library, allowing data types to change or modifications to be made.
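For example, a "remove" snippet for the object_data list above is only a few lines.  This is a sketch that assumes a singly linked, NULL terminated list, exactly as in the traversal example:

// Unlink victim from the list and return the (possibly new) head.
// Uses struct object_data from the traversal example above.
struct object_data *removeObject(struct object_data *head, struct object_data *victim)
{
  if (head == NULL || victim == NULL)
  {
    return head;
  }
  if (head == victim)
  {
    return victim->nextObject;             // victim was the first element
  }
  for (struct object_data *o = head; o->nextObject != NULL; o = o->nextObject)
  {
    if (o->nextObject == victim)
    {
      o->nextObject = victim->nextObject;  // unlink; the caller still owns the memory
      break;
    }
  }
  return head;
}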

Concern #4, A linked list will use more memory

It is true that the linked list adds a pointer element to the container data structure.  However, this additional memory is probably much smaller than the "just in case" additional memory of unused array elements.  It is probably also MUCH better than going back and recompiling an underlying library late in the program and adding a lot more memory so the last bug will not happen again.
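As a rough illustration (the sizes here are invented, not taken from any real socket implementation), the pointer overhead is small next to the cost of spare array slots:

#include <stdio.h>

struct socket_in_array { char payload[64]; };   // hypothetical 64-byte socket
struct socket_in_list  { char payload[64]; struct socket_in_list *next; };

int main(void)
{
  // An array sized "just in case" for 8 sockets versus a list of the 5 actually in use.
  printf("array of 8: %u bytes\n", (unsigned)(8 * sizeof(struct socket_in_array)));
  printf("list of 5 : %u bytes\n", (unsigned)(5 * sizeof(struct socket_in_list)));
  return 0;
}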

A little history

The linked list was invented by Allen Newell, Cliff Shaw and Herbert Simon.  These men were developing IPL (Information Processing Language) and decided that lists were the most suitable containers for IPL.  They were eventually awarded a Turing Award for making basic contributions to artificial intelligence, the psychology of human cognition and list processing.  Interestingly, IPL was developed for a computer called the JOHNNIAC, which had a grand total of 16 kbytes of RAM.  Even with only 16 KB, IPL was very successful and linked lists were determined to be the most suitable design for that problem set.  Most of our modern microcontrollers have many times that memory, yet we are falling back on arrays more and more often.

If you are going to insist on an array where a linked list is a better choice, you can rest easy knowing that CACHE memory works MUCH better with arrays, simply because all the data is guaranteed to be in close proximity, so the entire array is likely to be living in the cache.

Good Luck

P.S. - The timeout driver and the TCP library from Microchip both run on 8-bit machines with less than 4KB of RAM and they both use linked lists.  Check out the code in MCC for details.


On 9/9/2019 at 3:35 PM, N9WXU said:

Concerns

  1. Allocating the memory requires the heap and my application cannot do that.

With very good reason in embedded systems.

On 9/9/2019 at 3:35 PM, N9WXU said:

Concern #1, Memory allocation

I would argue that a heap is NOT required for a linked list.  It is simply the way computer science often teaches the topic.

  1. Allocate a block of memory for the data.

Where is the memory allocated from, if not the heap?

What happens when there is no more memory available? You will be back to bugs like, "when I am using the app I no longer get notification e-mails" or much worse.

Some RTOSs expect you to create memory pools at compile time that are handled much better than simple malloc()/free() but you are still in the same boat of having fixed, limited resources from which to allocate.

The real answer is to have proper error checking and give the user a meaningful hint so that they know why they are no longer seeing the expected behaviour.

On 10/22/2019 at 10:16 AM, crosland said:

Where is the memory allocated from, if not the heap?

What happens when there is no more memory available? You will be back to bugs like, "when I am using the app I no longer get notification e-mails" or much worse.

 

You can use statically allocated memory (like a global array of items), or you can allocate the structure on the stack by creating the variable in the function where it is being used. In the case where you allocate statically, running out of memory will cause the code not to build, which means of course that when it does build you are guaranteed that you can never fail in this way.
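A short sketch of the static option, with illustrative names (node, nodePool, MAX_NODES):

struct node
{
	int payload;
	struct node *next;
};

// Every node the application will ever need, allocated at build time.
// If this array does not fit in RAM, the linker reports the overflow
// during the build instead of the code failing in the field.
#define MAX_NODES 8
static struct node nodePool[MAX_NODES];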

Quote

Some RTOSs expect you to create memory pools at compile time that are handled much better than simple malloc()/free() but you are still in the same boat of having fixed, limited resources from which to allocate.

If you allocate from any pool (or heap) you will always have to be excessively careful, as things like fragmentation can easily catch you. FreeRTOS, e.g., has a safe heap implementation which has malloc but no free; that allows you to call malloc during init to allocate all the memory you need, and gives a deterministic failure if you do not have enough memory, which is easy to debug. If you have a free then making it safe is substantially more difficult, because you will fail at the point of maximum memory pressure and this will likely be a race condition which has a low probability of happening during your testing.
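A sketch of that init-time-only pattern, assuming the heap_1 allocator (pvPortMalloc() with no usable free); the buffer name and size are made up:

#include "FreeRTOS.h"

static void *workBuffer;

void app_init(void)
{
	// All allocation happens exactly once, at startup.
	workBuffer = pvPortMalloc(1024);
	if (workBuffer == NULL)
	{
		// Deterministic failure: the heap is simply too small.
		// Trap it here, during init, where it is easy to find and fix.
		for (;;) { }
	}
}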

The best course of action is to design the system in such a way that it does not compile when you do not have sufficient resources, that way there is no guessing game.

Quote

The real answer is to have proper error checking and give the user a meaningful hint so that they know why they are no longer seeing the expected behaviour.

Proper error checking only gets you so far. I have seen a case where the init function of the UART (which was the only user interface) failed to allocate memory (linker settings were wrong of course), but the point is that error checking would not have helped much in that case. I have also seen similar cases where there is no way to recover especially in bootloaders.


If you look at the MPLAB Code Configurator timeout driver you will find that each "timer" that you want to create has a structure similar to this:

struct timeout {
	int (*callback)(void *argument);
	struct timeout *next;
	// other housekeeping things
};

And an API that takes such a structure. 

void timeout_addTimer(struct timeout *myTimer);

int myCallback(void *argument);   // the callback is defined by the application

struct timeout myTimer = {myCallback, 0};

void main(void)
{
	timeout_addTimer(&myTimer);
}

Note: the function names are similar in purpose to MCC but not identical.

The function addTimer will accept the completely allocated timer, finish initializing the "private" data and insert this into the list of timers.  The memory is "owned" and "provided" by the user but the driver will manage it.  Of course this driver will also work with a heap if that is desired.

void main(void)
{
	struct timeout *myTimer = malloc(sizeof(struct timeout));
	if (myTimer == NULL)
	{
		return;   // handle the allocation failure
	}
	myTimer->callback = myCallback;
	timeout_addTimer(myTimer);

	// after some time has passed, and once the driver is no
	// longer holding this timer in its list
	free(myTimer);
}

In both cases the application is 100% responsible for the memory and can do the most suitable thing for the circumstances.
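Internally, an insert like that might look something like this sketch.  This is not the actual MCC implementation, just an illustration using the structure above:

#include <stddef.h>   // for NULL

static struct timeout *timerList = NULL;   // head of the driver's private list

void timeout_addTimer(struct timeout *myTimer)
{
	// finish initializing the "private" fields the user left alone
	myTimer->next = NULL;

	// append to the end of the list (a real driver might keep it sorted by expiry)
	if (timerList == NULL)
	{
		timerList = myTimer;
	}
	else
	{
		struct timeout *t = timerList;
		while (t->next != NULL)
		{
			t = t->next;
		}
		t->next = myTimer;
	}
}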

Just to be "complete" here are two other options:

The stack:

void main(void)
{
	struct timeout myTimer = {myCallback,0};
	timeout_addTimer(&myTimer);

	while(1)
	{
		// do my application
	}
}
// This is safe because myTimer will never go out of scope.  
// However, if you use this technique in a function that will exit, 
// the memory will go out of scope on the stack and there will be
// a difficult to debug crash when the timer expires.

Static

void foo(void)
{
	static struct timeout myTimer = {myCallback,0};

	timeout_addTimer(&myTimer);

}

Now the timer's storage will persist because the variable is static.  Of course you may have a new problem: addTimer could be called a second time with this timer already in the timer list.  That may break the linked list if the list is not pre-scanned for preexisting timers.
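One way to guard against that, assuming the driver keeps a simple singly linked list with a head pointer (timerList here is the hypothetical head from the sketch above):

#include <stdbool.h>   // for bool
#include <stddef.h>    // for NULL

static bool timeout_isListed(const struct timeout *myTimer)
{
	const struct timeout *t;
	for (t = timerList; t != NULL; t = t->next)
	{
		if (t == myTimer)
		{
			return true;    // already linked in, so skip the insert
		}
	}
	return false;
}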

In EVERY case I do prefer static allocation that is the responsibility of the application.  That assures the following:

  1. Code fails at compile/link time, so it is easier to debug.
  2. Memory usage appears in the data segment, so it is tracked by ALL compilers as RAM usage, giving an accurate measure of RAM.
  3. The driver is scalable because it NEVER needs internally allocated memory, nor does it need heap access.

That said, I just started using the ESP32 and its memory is fragmented.  It has one block of RAM that is used for static variables and the data segment, and a different, larger block of RAM used for the heap.  If you don't use the heap, you will have very little memory.  Good heap hygiene should be a future topic for discussion.
