Orunmila

Member
  • Content Count

    205
  • Joined

  • Last visited

  • Days Won

    24

Orunmila last won the day on February 19

Orunmila had the most liked content!

Community Reputation

46 Excellent

About Orunmila

  • Rank
    Teacher

Recent Profile Visitors

934 profile views
  1. I had a similar choice on the CO2 valve for my fishtank, but if my sw had any glitch the tank would be carbonated like soda and everything would die from the ph drop, so I decided to leave the mechanical regulator in the line for a backup if the electronics fail for some reason. You never know what happens when the power goes out or a battery dies at the wrong time...
  2. I used to do a fair amount of optimization using linear programming, it was a bit of a pain to code up so I had hoped this would make that easier ...
  3. Provided nothing went wrong in your analysis that bit being set means that somewhere something did a software reset, you just have to figure out where this happened. N9WXU gave some possible causes of the RESET instruction being used, but it could really be anywhere. I have even seen this happen with a bad function pointer jumping to a location which contained const data. As a last resort you can search through the program memory for the RESET instruction (you can do this in MPLAB using the memory views) and set a breakpoint at every location containing the reset instruction. That way you should be able to catch it in the debugger and figure out where the reset is coming from.
  4. This is really cool! Can this tool generate C code that I can use in my applications or does it just solve the problem for me?
  5. Glad you found it helpful JG2018, let me try to answer your questions. The first part is just math and the magic of logarithms. The important identity here is the change of base rule, logx(y) = log10(y)/log10(x). We know that log2(x) will give us how many bits we need to represent x, but our calculators do not do log2, they do log10, which means we have to take log10(x) / log10(2) to get the number of bits we need to represent the number. On the leakage question, the short answer is yes, because the leakage contributes to a measurement error, and we measure the ADC error as a proportion of LSB size as we described above. How the leakage contributes is a lot more involved and depends on the construction of the ADC and the methods it uses for sampling. Section 7 above shows the simplified circuit diagram of the ADC input path and you can see where Microchip places the leakage current specified in the datasheet. If you calculate the network currents when this circuit is attached to your circuit under measurement you can determine how much of an effect the leakage will have on your measurements. If the source impedance is small, the bulk of the current you feed into the pin will end up going into the sampling capacitor and the leakage will have a small effect, but of course the higher your source impedance is, the lower the current into the sampling capacitor and the more of an effect the leakage current will have. We can use the PIC18F47K40 example above since the leakage current is indicated in the simplified diagram in section 7 above. If we have 10k of source impedance and an additional 1k of internal impedance on the ADC charge path, we have at 3V approx 272uA charging the capacitor. The pin leakage is specified at max to be 125nA, which means that 125nA/272uA is the proportion of the charge current lost to our measurement. In this example it contributes 0.0459% error. For a 10-bit ADC this would mean roughly 0.47 LSB of error (calculated as 0.0459% of the full range, which is 2^10 = 1024 steps).
That explains the comment in the device datasheet that a source impedance of more than 10k will result in an additional error of more than 0.5 LSB, which means it will start contributing to your measurement error. To see how this turns into a larger error we can do the same math with a source impedance of 100k instead. This divides the current by roughly 10 and consequently increases the error roughly tenfold, to about 4.7 LSB. This means you can still take the measurements, but you will potentially have almost 5 LSB of additional error to contend with. And you cannot calibrate out the leakage, as it varies with voltage and temperature, and the spec already has it varying from 5nA to 125nA depending on these parameters and also the process variation.
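The leakage arithmetic above can be sketched in a few lines of C. This is only an illustration of the simplified model used in this answer (leakage current as a fraction of the charge current, scaled to the ADC range); the function name and parameters are made up for the example and do not come from any Microchip library.

```c
#include <assert.h>

/* Estimated error, in LSB, caused by pin leakage.
 * Simplified model from the discussion above (an assumption, not a datasheet formula):
 *   charge current = V / (source impedance + internal impedance)
 *   error fraction = leakage current / charge current
 *   error in LSB   = error fraction * 2^bits
 */
double leakage_error_lsb(double volts, double source_ohms, double internal_ohms,
                         double leakage_amps, unsigned adc_bits)
{
    assert(volts > 0.0);
    assert(source_ohms + internal_ohms > 0.0);
    assert(adc_bits < 32u);

    double charge_amps = volts / (source_ohms + internal_ohms);
    double error_fraction = leakage_amps / charge_amps;

    return error_fraction * (double)(1UL << adc_bits);
}
```

Calling leakage_error_lsb(3.0, 10e3, 1e3, 125e-9, 10) gives about 0.47 LSB. With 100k of source impedance and the 1k internal impedance kept in the sum the result is about 4.3 LSB, a little below the rounded "divide the current by 10" estimate.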
  6. Looks like it gave the segfault upon running? And yes that is what I would expect because on a PC your code is trying to write to memory which should not permit writes. On an embedded system the behavior will depend a lot on the underlying system. Some systems will actually crash in some way, others, like XC8 based PIC microcontrollers, actually copy the read-only section into RAM so the code will actually work. This is why this is so dangerous, the behavior depends on the target and the toolchain and when this is one day tried on another system it could be a real challenge to figure out the mistake because it is so easily masked.
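As a rough sketch of the safe alternative, and assuming the code in question wrote through a pointer to a string literal (I am guessing at the original snippet here), the fix is to make a writable copy in an array first. The function names below are invented for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Upper-cases the first character of a string in place.
 * The caller must pass writable memory. */
void capitalize(char *text)
{
    assert(text != NULL);
    if (text[0] >= 'a' && text[0] <= 'z') {
        text[0] = (char)(text[0] - 'a' + 'A');
    }
}

/* Returns 1 when the in-place edit worked on a writable copy. */
int capitalize_copy_demo(void)
{
    /* An array initialized from a literal is a private, writable copy. */
    char writable[] = "hello";

    capitalize(writable);

    /* By contrast, this would be undefined behavior, because the pointer
     * refers to the literal itself:
     *     char *literal = "hello";
     *     capitalize(literal);
     */
    return strcmp(writable, "Hello") == 0;
}
```

Declaring pointers to literals as const char* lets the compiler reject the dangerous call on every target, instead of leaving it to whichever toolchain happens to place literals in read-only memory.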
  7. Thanks for pointing out the mistake, we have updated the text accordingly.
  8. @zakaster - I have posted a more comprehensive answer to the blog here https://www.microforum.cc/blogs/entry/49-a-question-about-structures-strings-and-pointers/
  9. In the comments of our blog on structures in C a member asked me a specific question about what they observed. As this code is a beautiful example of a number of problems we often see, we thought it a good idea to make an entry just to discuss it, as there really is a lot going on here. We will cover the following:

  • What allocation and freeing of memory on the stack means and the lifetime of objects
  • In which direction the stack usually grows (note - the C standard does not contain the word "stack" so this is compiler-specific)
  • Another look at deep vs. shallow copies of C strings inside structures

In order to keep this all generic, I am going to be using the LLVM compiler on my Mac to do all my examples. The examples are all standard C and you can play with the code on your favorite compiler, but since the details of memory allocation are not mandated by the C standard your results may not look exactly like mine. I will, for example, show how the results I get change when I modify the optimization levels.

The Question

The OP @zakaster was asking about this. Here is the code snippet they provided:

    struct person {
        char* name;
    };

    void get_person(struct person* p) {
        char new_name[20]; // on stack, gets freed when function returned
        printf("input new name for person:");
        scanf("%s", &new_name);
        p->name = new_name;
        printf("address of new_name = %p\n", &new_name[0]);
    }

    void eg_test_copy2(void) {
        struct person p = {"alex"};
        get_person(&p);
        printf("p name = %s\n", p.name);
        char dummy[20] = { 0 };
        printf("address of dummy = %p\n", &dummy[0]);
        printf("p name = %s\n", p.name);
    }

Variable Allocation

When you declare a variable the compiler will only reserve a memory location to be used by the variable. This process will not actually clear the memory unless the variable has static storage duration. The standard states that only variables with static storage duration (in simple terms this means global variables and variables declared static) are initialized to 0 when you do not supply an initializer.
If you want any other variable to be initialized you have to supply an initializer. What actually happens before your main function starts running is that something generally referred to as "c-init" will run. This is a bit of code that will do the work needed by the C standard before your code runs, and one of the things it will do is to clear, usually using a loop, the block of memory which will contain the variables with static storage duration. Other things that may be in here are setting up interrupt vectors and other machine registers, and of course copying the initial values of global variables that do have initializers over the locations reserved for these variables.

When a variable goes "out of scope" the memory is no longer reserved. This simply means that it is free for others to use; it does not mean that the memory is cleared when it is no longer reserved. This is very important to note. This phenomenon often leads to developers testing their code while holding a pointer to memory which is no longer reserved, and the code seems to work fine until the new owner of that part of memory modifies it, then the code inexplicably breaks! No, it was actually broken all along and you just got lucky that the memory was not used at the time you were accessing this unreserved piece of memory! The classic way this manifests can be seen in our first test (test1) below.

    #include <stdio.h>

    char* get_name() {
        char new_name[20]; // on stack, gets freed when function returned
        printf("Enter Name:");
        scanf("%s", new_name);
        return new_name;
    }

    int main(void) {
        char* theName;
        theName = get_name();
        printf("\r\nThe name was : %s\r\n", theName);
        return 0;
    }

I compile and run this and get:

    > test1
    Enter Name:Orunmila

    The name was : Orunmila

Note: Let me mention here that I was using "gcc test1.c -O3" to compile that; when I use the default optimization or -O1 it prints junk instead.
When you do something which is undefined in the C standard, the behavior is not guaranteed to be the same on all machines. So I can easily be fooled into thinking this is working just fine, but it is actually very broken! On LLVM I actually get a compiler warning when I compile that, as follows:

    test1.c:7:12: warning: address of stack memory associated with local variable 'new_name' returned [-Wreturn-stack-address]
        return new_name;
               ^~~~~~~~
    1 warning generated.

Did I mention that I do love LLVM?! We can quickly see how this breaks down if we call the function more than once in a row like this (test2):

    #include <stdio.h>

    char* get_name() {
        char new_name[20]; // on stack, gets freed when function returned
        printf("Enter Name:");
        scanf("%s", new_name);
        return new_name;
    }

    int main(void) {
        char* theName;
        char* theSecondName;
        theName = get_name();
        theSecondName = get_name();
        printf("\r\nThe first name was : %s\r\n", theName);
        printf("The second name was : %s\r\n", theSecondName);
        return 0;
    }

Now we get the following obviously wrong behavior:

    Enter Name:N9WXU
    Enter Name:Orunmila

    The first name was : Orunmila
    The second name was : Orunmila

This happens because the declarations of theName and theSecondName in the code only reserve enough memory to store a pointer to a memory location. When the function returns it does not actually return the string containing the name, it only returns the address of the string, that is, the memory location which used to contain the string inside of the function get_name(). At the time when I print the name the memory is no longer reserved, but nobody else has used it since I called the function (in other words, I did not perform any other operation which makes use of the stack), so the code is still printing the name. Both name pointers are pointing to the same location in memory (which is actually just a coincidence, the compiler would have been within its rights to place the two in different locations).
If you call a function that has a local variable between fetching the names and printing them, the names will be overwritten by these variables and it will print something which looks like gibberish instead of the names I was typing. We will leave it to the reader to play with this and see how/why this breaks. I would encourage you to also add the following to the end. These print statements will clearly show you where the variables are located and why they print the same thing. You will notice that the values of both pointers are the same!

    printf("Location of theName : %p\r\n", &theName);             // This prints the location of the first pointer
    printf("Location of theSecondName : %p\r\n", &theSecondName); // This prints the location of the second pointer
    printf("Value of theName : %p\r\n", theName);                 // This prints the value of the first pointer
    printf("Value of theSecondName : %p\r\n", theSecondName);     // This prints the value of the second pointer

This all should answer the question asked, which was "I can still access the old name, is this weird?". The answer is no, this is not weird at all, but it is undefined, and if you called some other functions in between you would see the memory which used to hold the old name being overwritten in weird and wonderful ways, as expected.

How does the stack grow?

Now that we have printed out some pointers this brings us to the next question. Our OP noticed that "the address for `dummy` actually starts after the address of new_name + 4 bytes x 20". We need to be careful here: the C standard requires pointers to be byte-addressable, which means that the address being 20x4 away makes no sense by itself, and in this case it is a pure coincidence.
A couple of things should be noted here: The stack usually grows downwards in memory The size of a char[20] buffer will always be 20 and never 4x20 (specified in section 6.5.3.4 of the C99 standard) In the example question the address of new_name was at 0x61FD90, which is actually smaller than 0x61FDE0, and in other words it was placed on the stack AFTER dummy. Here is a diagram which shows a typical layout that a C compiler may choose to use. The reason there was a gap of 80 between the pointers was simply due to the way the compiler decided to place the variables on the stack. It was probably creating some extra space on the stack for passing parameters around and this just happened to be exactly 60 bytes, which resulted in a gap of 80. The C standard only defines the scope of the variables, it does not mandate how the compiler must place them in memory. This can even vary for the same compiler when you add more code as the linker may move things around and will probably change when you change the optimization settings for the compiler. I did some tests with LLVM and if I look at the addresses in the example they will differ significantly when I am using optimization O1, but when I set it to O3 the difference between the two pointers is exactly 20 bytes for the example code. Getting back to Structures and Strings Looking at the intent of the OP's code we can now get back to how structures and strings work in C. With our interface like this struct person { char* name; }; void get_person(struct person* p); What we have is a struct which very importantly does NOT contain a string, it only contains the address of a string. That person struct will reserve (typically) the 4 bytes of RAM required to store a 32-bit address which will be the location where a string exists in memory. 
If you use it like this you will find that the address of "name" is exactly the same as the address of the person struct you are passing in, so if our OP tested the following this would have been clear:

    struct person p = {"alex"};
    printf("Address of p = %p\n", &p);
    printf("Address of p.name = %p\n", &p.name);

These two addresses must be the same because name is the first (and only) member of the struct! When we want to work with a structure that contains the name of a person we have 2 choices, and they both have pros and cons.

  1. Let the struct contain a pointer and use malloc to allocate memory for the string on the heap. (not recommended for embedded projects!)
  2. Let the struct contain an array of chars that can contain the name.

For option 1 the declaration is fine, but the get_person function would have to look as follows:

    void get_person(struct person* p) {
        char* new_name = malloc(20); // on heap, so remember to check if it returns NULL !
        if (new_name == NULL) {
            return;                  // out of memory, leave p->name unchanged
        }
        printf("input new name for person:");
        scanf("%19s", new_name);     // width limit keeps the input inside the 20 bytes
        p->name = new_name;
        printf("address of new_name = %p\n", new_name);
    }

Of course, now you have to check and handle the case where we run out of memory and malloc returns NULL, we also have to be cognisant of heap fragmentation, and most importantly we now have to be very careful to ensure that the memory gets freed or we will have a memory leak! For option 2 the structure and the function have to change to something like the following:

    struct person {
        char name[20];
    };

    void get_person(struct person* p) {
        printf("input new name for person:");
        scanf("%19s", p->name);      // width limit keeps the input inside the 20 bytes
        printf("address of p->name = %p\n", p->name);
    }

Of course now we use 20 bytes of memory regardless of how long the name is, but on the upside we do not have to worry about freeing the memory; when the instance goes out of scope the compiler will take care of that for us.
Also now we can assign one person struct to another which will actually copy the entire string and we still have the option of passing it by reference by using the address of the object! Conclusion Be careful when using C strings in structures, there are a lot of ways these can get you into trouble. Memory leaks and shallow copies, where you make a copy of the pointer but not the string, are very likely to catch you sooner rather than later.
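To make the shallow vs. deep copy difference concrete, here is a minimal sketch. The struct names person_ptr and person_arr and the two helper functions are invented for this illustration; they simply mirror the two options discussed above.

```c
#include <assert.h>
#include <string.h>

struct person_ptr { const char *name; };  /* holds only the address of a string */
struct person_arr { char name[20]; };     /* holds the characters themselves    */

/* Returns 1 if a copy of the pointer-based struct sees later edits
 * to the original string, which is what a shallow copy means. */
int pointer_copy_is_shallow(void)
{
    char buffer[20] = "alex";
    struct person_ptr a = { buffer };
    struct person_ptr b = a;              /* copies the pointer only */

    assert(a.name == b.name);             /* both refer to the same bytes */
    strcpy(buffer, "bob");                /* edit the one shared string */

    return strcmp(b.name, "bob") == 0;
}

/* Returns 1 if a copy of the array-based struct keeps its own string
 * after the original is edited, which is what a deep copy means. */
int array_copy_is_deep(void)
{
    struct person_arr a = { "alex" };
    struct person_arr b = a;              /* copies all 20 bytes */

    strcpy(a.name, "bob");                /* edit the original only */

    return strcmp(b.name, "alex") == 0;
}
```

Both functions return 1: the pointer version shares a single string between the two structs, while the array version gives each struct its own copy.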
  10. Your example is actually quite complex and it contains a number of really typical and very dangerous mistakes. I think it will take up way too much space here in the comments to answer this properly so I will write another blog just about your question here and post that over the weekend when I get some time to put it together. A hint to what is wrong with the code: The buffer you are using in get_person() is allocated on the stack, but the space is made available for use after the function returns. After this point you have a struct which contains a pointer to the buffer on the stack, but the memory no longer belongs to you and when you call any other function it will get overwritten. There is no rule that the compiler must allocate variables on the stack consecutively, and actually XC8 uses a "compiled stack" which means that variables are placed permanently where they will not overlap. You can probably get behavior closer to what you expect if you change the stack model to use a software stack. The last time this happened to someone in my team they were calling get_person for the first name, and then for the second name, and after calling it twice they tried to print out both names and both the structs had the same string, if you call it many times all the people will have the last name you entered. Try this with your code: struct person p1 = {"nobody"}; struct person p2 = {"nobody"}; get_person(&p1); get_person(&p2); printf("p1 name = %s\n", p1.name); printf("p2 name = %s\n", p2.name); You will enter a new name in get_person for each one, and after that the printing should not print nobody but the 2 different names you have entered, and both names should be different. Let me know if that behaves as you expected? Read my upcoming blog on stack usage and pointers to see why 🙂
  11. Thanks for pointing that out! I looked over the code 3 times before I realized you were referring to the Table 🙂 And yes I agree there are a lot of ways to optimize what happens before the loop, we decided not to even bother with making that smaller as it did not form part of our focus here. For example I also noticed that the Start/MCC generated code for the AVR was actually quite large, more than 4x what the size was for the PIC code - in fact it took more than 800 instructions to just make a loop that toggles a pin. In ASM I can do all that in almost 100x less, but if this was going to be a reasonably fast read we had to be laser focussed. The idea in the longer run is to add a lot of small posts that focus on one specific aspect every time, and we will be sure to cover the advantages of OUT in the future sometime based on your recommendation.
  12. We just started a new blog which aims to create some benchmarks for comparing performance of small microcontrollers. This is something that we always needed to do ourselves as part of processor selection, so we decided to share our experiments and results here for everybody to use.
  13. Comparing raw pin toggling speed AVR ATmega4808 vs PIC 16F15376 Toggling a pin is such a basic thing. After all, we all start with that Blinky program when we bring up a new board. It is actually a pretty effective way to compare raw processing speed between any two microcontrollers. In this, our first Toe-to-toe showdown, we will be comparing how fast these cores can toggle a pin using just a while loop and a classic XOR toggle. First, let's take a look at the 2 boards we used to compare these cores. These were selected solely because I had them both lying on my desk at the time. Since we are not doing anything more than toggling a pin we just needed an 8-bit AVR core and an 8-bit PIC16F1 core on any device to compare. I do like these two development boards though, so here are the details if you want to repeat this experiment. In the blue corner, we have the AVR, represented by the ATmega4808, sporting an AtMega core (AVRxt in the instruction manual) Clocking at a maximum of 20MHz. We used the AVR-IOT WG Development Board, part number AC164160. This board can be obtained for $29 here: https://www.microchip.com/Developmenttools/ProductDetails/AC164160 Compiler: XC8 v2.05 (Free) In the red corner, we have the PIC, represented by the 16F15376, sporting a PIC16F1 Enhanced Midrange core. Clocking at a maximum of 32MHz. We used the MPLAB® Xpress PIC16F15376 Evaluation Board, part number DM164143. This board can be obtained at $12 here: https://www.microchip.com/developmenttools/ProductDetails/DM164143 Compiler: XC8 v2.05 (Free) Results This is what we measured. All the details around the methodology we used and an analysis of the code follows below and attached you will find all the source code we used if you want to try this at home. The numbers in the graph are pin toggling frequency in kHz after it has been normalized to a 1MHz CPU clock speed. 
How we did it (and some details about the cores) Doing objective comparisons between 2 very different cores is always hard. We wanted to make sure that we do an objective comparison between the cores which you can use to make informed decisions on your project. In order to do this, we had to deal with the fact that the maximum clock speed of these devices is not the same and also that the fundamental architecture of these two cores is very different. In principle, the AVR is a Load-store Architecture machine with a 1 stage pipeline. This basically means that all ALU operations have to be performed between CPU registers and the RAM is used to load from and store results to. The PIC, on the other hand, uses a Register Memory Architecture, which means in short that some ALU operations can be performed on RAM locations directly and that the machine has a much smaller set of registers. On the PIC all instructions are 1 word in length, which is 14-bits wide, while the data bus is 8-bits in size and all results will be a maximum of 8-bits in size. On the AVR instructions can be 16-bit or 32-bit wide which results in different execution times depending on the instruction. Both processors have a 1 stage pipeline, which means that the next instruction is fetched while the current one is being executed. This means branching causes an incorrect fetch and results in a penalty of one instruction cycle. One major difference is that the AVR, due to its Load-store Architecture, is capable of completing the instruction within as little as just one clock cycle. When instructions need to use the data bus they can take up to 5 clock cycles to execute. Since the PIC has to transfer data over the bus it takes multiple cycles to execute an instruction. In keeping with the RISC paradigm of highly regular instruction pipeline flow, all instructions on the PIC take 4 clock cycles to execute. All of this just makes it tricky and technical to compare performance between these processors. 
What we decided to do is rather take typical tasks we need the CPU to perform, which occur regularly in real programs, and simply measure how fast each CPU can perform these tasks. This should allow you to work backwards from what your application will be doing during maximum throughput pressure on the CPU and figure out which CPU will perform the best for your specific problem.

Round 1: Basic test

For the first test, we used virtually the same code on both processors. Since both of these are supported by MCC it was really easy to get going.

  1. We created a blank project for the target CPU
  2. Fired up MCC
  3. Adjusted the clock speed to the maximum possible
  4. Clicked in the pin manager to make a single pin on PORTC an output
  5. Hit generate code.

After this all we added was the following simple while loop:

PIC

    while (1) {
        LATC ^= 0xFF;
    }

AVR

    while (1) {
        PORTC.OUT ^= 0xFF;
    }

The resulting code produced by the free compilers (XC8 v2.05 in both cases) was as follows. Interestingly enough, both loops occupy the same number of program words (6 in total) including the loop jump, although the AVR needs only 4 instructions to the PIC's 6. This is especially interesting as it will show how long the execution of a same-size loop takes on each of these processors. You will notice that without optimization there is some room for improvement, but since this is how people will evaluate the cores at first glance we wanted to go with this.
PIC

    Address  Hex   Instruction
    07B3     30FF  MOVLW 0xFF
    07B4     00F0  MOVWF __pcstackCOMMON
    07B5     0870  MOVF __pcstackCOMMON, W
    07B6     0140  MOVLB 0x0
    07B7     069A  XORWF LATC, F
    07B8     2FB3  GOTO 0x7B3

AVR

    Address  Hex   Instruction
    017D     9180  LDS R24, 0x00
    017E     0444
    017F     9580  COM R24
    0180     9380  STS 0x00, R24
    0181     0444
    0182     CFFA  RJMP 0x17D

We used my Saleae logic analyzer to capture the signal and measure the timing on both devices. Since the Saleae is thresholding the digital signal and the rise and fall times are not always identical you will notice a little skew in the measurements. We did run everything 512x slower to confirm that this was entirely measurement error, so it is correct to round all times to multiples of the CPU clock in all cases here.

[Logic analyzer captures: PIC, AVR]

Analysis

For the PIC

The clock speed was 32MHz. We know that the PIC takes 4 clock cycles to execute one instruction, which gives us an expected instruction rate of one instruction every 125ns. Rounding for measurement errors we see that the PIC has equal low and high times of 875ns. That is 7 instruction cycles for each loop iteration. To verify if this makes sense we can look at the ASM. We see 6 instructions, the last of which is a GOTO, which we know will take 2 instruction cycles to execute. Using that fact we can verify that the loop repeats every 7 instruction cycles as expected (7 x 125ns = 875ns.)

For the AVR

The clock speed was 20MHz. We know that the AVR takes 1 clock cycle per instruction, which gives us an expected instruction rate of one instruction every 50ns.
Rounding for measurement errors we see that the AVR has equal low and high times of 400ns. That is 8 instruction cycles for each loop iteration. To verify if this makes sense we again look at the ASM. We see 4 instructions, the last of which is an RJMP, which we know will take 2 instruction cycles to execute. We also see an LDS which takes 3 cycles because it is accessing SRAM, an STS which takes 2 cycles, and a COM (complement) instruction which takes 1 more. Using those facts we can verify that the loop should repeat every 8 instruction cycles as expected (3 + 1 + 2 + 2 = 8, and 8 x 50ns = 400ns.)

Comparison

Since the 2 processors are not running at the same clock speed we need to do some math to get a fair comparison. We think 2 particular approaches would be reasonable.

  1. Compare the raw fastest speed the CPU can do. This gives a fair benchmark where CPUs with higher clock speeds get an advantage.
  2. Normalize the results to a common clock speed. This gives us a fair comparison of capability at the same clock speed.

In the numbers below we used both methods for comparison.

The numbers

                      AVR                       PIC                                         Notes
    Clock Speed       20MHz                     32MHz
    Loop Speed        400ns                     875ns
    Maximum Speed     2.5MHz                    1.142MHz                                    Loop speed as a toggle frequency
    Normalized Speed  125kHz                    35.7kHz                                     Loop frequency normalized to a 1MHz CPU clock
    ASM Instructions  4                         6
    Loop Code Size    12 bytes, 4 instructions  12 bytes, 6 words, 10.5 bytes, 6 instructions  Due to the nuances here we compared this 3 ways
    Total Code Size   786 bytes                 101 words, 176.75 bytes

Round 2: Expert Optimized test

For the second round, we tried to hand-optimize the code to squeeze out the best possible performance from each processor. After all, we do not want to just compare how well the compilers are optimizing, we want to see what is the absolute best the raw CPUs can achieve. You will notice that although optimization more than doubled our performance, it made little difference to the relative performance between the two processors.
For the PIC we wrote to LATC to ensure we are in the right bank, and pre-set the W register. This means the loop reduces to just an XORWF and a GOTO. For the AVR we changed the code to use the Toggle register instead of doing an XOR of the OUT register for the port. The optimized code looked as follows.

PIC

    LATC = 0xFF;
    asm ("MOVLW 0xFF");
    while (1) {
        asm ("XORWF LATC, F");
    }

AVR

    asm ("LDI R30,0x40");
    asm ("LDI R31,0x04");
    asm ("SER R24");
    while (1){
        asm ("STD Z+7,R24");
    }

The resulting ASM code after these changes now looked as follows. Note we did not include the instructions outside of the loop here as we are really just looking at the loop execution.

PIC

    Address  Hex   Instruction
    07C1     069A  XORWF LATC, F
    07C2     00F0  GOTO 0x7C1

AVR

    Address  Hex   Instruction
    0180     8387  STD Z+7,R24
    0181     CFFE  RJMP 0x180

Here are the actual measurements:

[Logic analyzer captures: PIC, AVR]

Analysis

For the PIC we do not see how we could improve on this. The loop has to end in a GOTO, which takes 2 cycles, and 1 instruction is the least amount of work we could possibly do in the loop, so we are pretty confident that this is the best we can do. When measuring we see 3 instruction cycles, which we think is the limit here. Note: N9WXU did suggest that we could fill all the memory with XOR instructions and let it loop around forever and in doing so save the GOTO, but we would still have to set W to FF every second instruction to have consistent timing, so this would still be 2 instructions per "loop" although it would use all the FLASH and execute in 250ns.
Novel as this idea was, since that means you can do nothing else we dismissed that idea as not representative. For the AVR we think we are also at the limits here. The toggle register lets us toggle the pin in 1 clock cycle which cannot be beaten, and the RJMP unavoidably adds 2 more. We measure 3 cycles for this.

                      AVR       PIC                        Notes
    Clock Speed       20MHz     32MHz
    Loop Speed        150ns     375ns
    Maximum Speed     6.667MHz  2.667MHz                   Loop speed as a toggle frequency
    Normalized Speed  333.3kHz  83.3kHz                    Loop frequency normalized to a 1MHz CPU clock
    ASM Instructions  2         2
    Loop Code Size    4 bytes   4 bytes, 2 words, 3.5 bytes

At this point, we can do a raw comparison of absolute toggle frequency performance after the hand optimization. Comparing this way gives the PIC the advantage of running at 32MHz while the AVR is limited to 20MHz. Interestingly the PIC gains a little as expected, but the overall picture does not change much.

The source code can be downloaded here:

  • PIC MPLAB-X Project file MicroforumToggleTestPic16f1.zip
  • AVR MPLAB-X Project file MicroforumToggleTest4808.zip

What next?

For our next installment, we have a number of options. We could add more cores/processors to this test of ours, or we can take a different task and cycle through the candidates on that. We could also vary the tools by using different compilers and see how they stack up against each other and across the architectures. Since our benchmarks will all be based on real-world tasks it should not matter HOW the CPU is performing the task or HOW we created the code, the comparison will simply be how well the job gets done. Please do post any ideas or requests in the comments and we will see if we can either improve this one or oblige with another Toe-to-toe comparison.

Updates: This post was updated to use the 1 cycle STD instruction instead of the 2 cycle STS instruction for the hand-optimized AVR version in round 2
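For anyone who wants to redo the numbers in the tables for their own measurements, the arithmetic can be sketched as below. The function names are made up for this example, and the model is simply the one used in the post: loop time is instruction cycles times clocks per instruction cycle, and the normalized figure divides by the clock speed in MHz.

```c
#include <assert.h>

/* Loop time in ns from a cycle count.
 * clocks_per_cycle is 4 for the PIC and 1 for the AVR in this comparison. */
double loop_time_ns(unsigned instr_cycles, unsigned clocks_per_cycle, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return (double)instr_cycles * (double)clocks_per_cycle * 1000.0 / clock_mhz;
}

/* "Maximum Speed" row: one loop pass per toggle, expressed in kHz. */
double toggle_khz(double loop_ns)
{
    assert(loop_ns > 0.0);
    return 1.0e6 / loop_ns;
}

/* "Normalized Speed" row: toggle rate scaled down to a 1MHz CPU clock. */
double normalized_khz(double loop_ns, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return toggle_khz(loop_ns) / clock_mhz;
}
```

For example, loop_time_ns(7, 4, 32.0) gives the 875ns measured for the PIC in round 1, and normalized_khz(150.0, 20.0) gives the 333.3kHz listed for the AVR in round 2.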
  14. I think with these drivers it always depends on what you need and whether the driver is a good match for your requirements. If this seems very complicated to you, it sounds like this driver is probably doing a lot more than you need, so my first question would be "what do you expect instead"? I know N9WXU has spoken to much of this already and also said it depends on whether you require a ring buffer. What I can offer you is the thinking behind this implementation.

For this driver the requirements were:
  • We have to be able to receive data on interrupt. This ensures we never miss any bytes due to data overrun (next byte received before the prior byte is read out).
  • The application does NOT want to process the data inside of the ISR, so we need to store the data temporarily.
  • We need to be able to store multiple bytes, which means we may have 7 or 8 interrupts before we process the data. This means the application can safely take much longer to process the data before we lose data.

If your serial port is running at 115200 baud, one character arrives every 87us. With a buffer of 16 bytes we only need to service the serial port at least once every 1.4ms to be sure we never lose any data. This extra time can be very important to us, and we can make the buffer bigger to get even more time.

If this matches your situation you need to do the following.

In the ISR you need to:

1. Store the data in a temp buffer

    // use this default receive interrupt handler code
    eusart2RxBuffer[eusart2RxHead++] = RCREG2;

2. Handle wrapping of the buffer when you reach the end of the array

    if(sizeof(eusart2RxBuffer) <= eusart2RxHead)
    {
        eusart2RxHead = 0;
    }

3. Keep track of how many bytes we have ready for processing

    eusart2RxCount++;

When the app processes the data you need to:

1. Wait until there is something to read, in case you try to read a byte before any data is available

    uint8_t readValue = 0;
    while(0 == eusart2RxCount)
    {
    }

2. Remove one byte from the ring buffer and again handle the wrapping at the end of the array

    readValue = eusart2RxBuffer[eusart2RxTail++];
    if(sizeof(eusart2RxBuffer) <= eusart2RxTail)
    {
        eusart2RxTail = 0;
    }

3. Reduce the number of bytes in the buffer by one (we need to disable the receive interrupt here or a collision with the ISR can happen), and then return the retrieved byte

    PIE3bits.RC2IE = 0;
    eusart2RxCount--;
    PIE3bits.RC2IE = 1;
    return readValue;

I think this is about as simple as you can make it without reducing the functionality. Of course, if you just want to read a single byte without interrupts, all you need is to return RCREG, which BTW is what you will get if you uncheck the interrupt checkbox. Also, if you do not like all of the ring buffer stuff, you can enable interrupts and replace the ISR with the same thing you get when you do not need a buffer. That is simpler but more likely to lose data.

PS. I did not describe the eusart2RxLastError line as I cannot see all the code for that here and I cannot remember the details about that one line. What it is doing is updating a global (eusart2RxLastError) to indicate if there was any error. From the looks of this code snippet that part of the code may have some bugs in it, as the last error is not updated after the byte is read out, but I may just be missing the start of the ISR ...
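To make the steps above easier to follow as a whole, here is a minimal host-side sketch of the same head/tail/count scheme. It is not the generated driver: the names (rxBuffer, rx_isr_sim, rx_try_read, rx_service_deadline_us) are hypothetical, the ISR is an ordinary function that takes the received byte in place of RCREG2, and the read returns false instead of blocking when the buffer is empty. The deadline helper assumes 10 bits per character (8N1 framing), which is what the 87us figure implies.

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_BUFFER_SIZE 16u   /* assumed size, matching the 16-byte example */

static volatile uint8_t rxBuffer[RX_BUFFER_SIZE];
static volatile uint8_t rxHead  = 0;   /* next slot the "ISR" writes */
static volatile uint8_t rxTail  = 0;   /* next slot the app reads */
static volatile uint8_t rxCount = 0;   /* bytes waiting to be read */

/* Stand-in for the receive ISR; 'received' plays the role of RCREG2.
   Like the snippet above, it does not guard against overflow. */
void rx_isr_sim(uint8_t received)
{
    rxBuffer[rxHead++] = received;
    if (RX_BUFFER_SIZE <= rxHead) {
        rxHead = 0;                    /* wrap at the end of the array */
    }
    rxCount++;
}

/* Non-blocking read: returns false when nothing is buffered. */
bool rx_try_read(uint8_t *out)
{
    if (0 == rxCount) {
        return false;
    }
    *out = rxBuffer[rxTail++];
    if (RX_BUFFER_SIZE <= rxTail) {
        rxTail = 0;                    /* wrap at the end of the array */
    }
    /* On real hardware the receive interrupt must be disabled around
       this decrement so the ISR cannot modify rxCount at the same time. */
    rxCount--;
    return true;
}

/* Longest allowed gap between servicing the buffer, in microseconds. */
double rx_service_deadline_us(uint32_t baud, uint8_t bitsPerChar,
                              uint16_t bufferBytes)
{
    return (double)bitsPerChar * 1000000.0 / (double)baud * bufferBytes;
}
```

Calling rx_service_deadline_us(115200, 10, 16) gives roughly 1389us, which is the 1.4ms figure quoted above; doubling the buffer size doubles the time the application can spend elsewhere.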