Jump to content
 

N9WXU

Member
  • Content Count

    84
  • Joined

  • Last visited

  • Days Won

    27

N9WXU last won the day on January 2

N9WXU had the most liked content!

Community Reputation

41 Excellent

1 Follower

About N9WXU

  • Rank
    Contributor

Recent Profile Visitors

644 profile views
  1. Hey, I just noticed that there are some over-clock options. Here is the result when clocked at 960MHz. I could not get it to run at 1GHz. They did warn that cooling was required.
  2. More Data! I just got a Teensy 4 and it is pretty fast. Compiling it in "fastest" and 600Mhz provides the following results. Strangely compiling it in "faster" provides the slightly better results. (6ns) This is pretty fast but I was expecting a bit more performance since it is 6x faster than the Teensy 3.2 tested before. There is undoubtedly a good reason for this performance, and I expect pin toggling to be limited by wait states in writing to the GPIO peripherals. In any case this is still a fast result.
  3. When comparing CPU's and architectures it is also a good idea to compare the frameworks and learn how the framework will affect your system. In this article I will be comparing a number of popular Arduino compatible systems to see how different "flavors" of Arduino stack up in the pin toggling test. When I started this effort, I thought it would be a straight forward demonstration of CPU efficiency, clock speed and compiler performance on the one side against the Arduino framework implementation on the other. As is often the case, if you poke deeply into even the most trivial of systems you will always find something to learn. As I look around my board stash I see that there are the following Arduino compatible development kits: Arduino Nano Every (ATMega 4809 @ 20MHz AVR Mega) Mini Nano V3.0 (ATMega 328P @ 16MHz AVR) RobotDyn SAMD21 M0-Mini (ATSAMD21G18A @ 48MHz Cortex M0+) ESP-12E NodeMCU (ESP8266 @ 80MHz Tenselica) Teensy 3.2 (MK20DX256VLH7 @ 96MHz Cortex M4) ESP32-WROOM-32 (ESP32 @ 240MHz Tenselica) And each of these kits has an available Arduino framework. Say what you will about the Arduino framework, there are some serious advantages to using it and a few surprises. For the purpose of this testing I will be running one program on every board. I will use vanilla "Arduino" code and make zero changes for each CPU. The Arduino framework is very useful for normalizing the API to the hardware in a very consistent and portable manner. This is mostly true at the low levels like timers, PWM and digital I/O, but it is very true as you move to higher layers like the String library or WiFi. Strangely, there are no promises of performance. For instance, every Arduino program has a setup() function where you put your initialization and a loop() function that is called very often. With this in mind it is easy to imagine the following implementation: extern void setup(void); extern void loop(void); void main(void) { setup(); while(1) { loop(); } } And in fact when you dig into the AVR framework you find the following code in main.cpp int main(void) { init(); initVariant(); #if defined(USBCON) USBDevice.attach(); #endif setup(); for (;;) { loop(); if (serialEventRun) serialEventRun(); } return 0; } There are a few "surprises" that really should not be surprises. First, the Arduino environment needs to be initialized (init()), then the HW variant (initVariant()), then we might be using a usb device so get USB started (USBDevice.attach()) and finally, the user setup() function. Once we start our infinite loop. Between calls to the loop function the code maintains the serial connection which could be USB. I suppose that other frameworks could implement this environment a little bit differently and there could be significant consequences to these choices. The Test For this test I am simply going to initialize 1 pin and then set it high and low. Here is the code. void setup() { pinMode(2,OUTPUT); } void loop() { digitalWrite(2,HIGH); digitalWrite(2,LOW); } I am expecting this to make a short high pulse and a slightly longer low pulse. The longer low pulse is to account for the extra overhead of looping back. This is not likely to be as fast as the pin toggles Orunmila did in the previous article but I do expect it to be about half as fast. Here are the results. The 2 red lines at the bottom are the best case optimized raw speed from Orunmila's comparison. That is a pretty interesting chart and if we simply compare the data from the ATMEGA 4809 both with ASM and Arduino code, you see a 6x difference in performance. Let us look at the details and we will summarize at the end. Nano 328P So here is the first victim. The venerable AVR AT328P running 16MHz. The high pulse is 3.186uS while the low pulse is 3.544uS making a pulse frequency of 148.2kHz. Clearly the high and low pulses are nearly the same so the extra check to handle the serial ports is not very expensive but the digitalWrite abstraction is much more expensive that I was anticipating. Nano Every The Nano Every uses the much newer ATMega 4809 at 20Mhz. The 4809 is a different variant of the AVR CPU with some additional optimizations like set and clear registers for the ports. This should be much faster. The high pulse is 1.192uS and the low pulse is 1.504uS. Again the pulses are almost the same size so the additional overhead outside of the loop function must be fairly small. Perhaps it is the same serial port test. Interestingly, one of the limiting factors of popular Arduino 3d printer controller projects such as GRBL is the pin toggle rate for driving the stepper motor pulses. A 4809 based controller could be 2x faster for the same stepper code. Sam D21 Mini M0 Now we are stepping up to an ARM Cortex M0 at 48Mhz. I actually expect this to be nearly 2x performance as the 4809 simply because the instructions required to set pins high and low should be essentially the same. Wow! I was definitely NOT expecting the timing to get worse than the 4809. The high pulse width is 1.478uS and the low pulse width is 1.916uS making the frequency 294.6kHz. Obviously toggling pins is not a great measurement of CPU performance but if you need fast pin toggling in the Arduino world, perhaps the SAMD21 is not your best choice. Teensy 3.2 This is a NXP Cortex M4 CPU at 96 MHz. This CPU is double the clock speed as the D21 and it is a M4 CPU which has lots of great features, though those features may not help toggle pins quickly. Interesting. Clearly this device is very fast as shown by the short high period of only 0.352uS. But, this framework must be doing quite a lot of work behind the scenes to justify the 2.274uS of loop delay. Looking a little more closely I see a number of board options for this hardware. First, I see that I can disable the USB. Surely the USB is supported between calls to the loop function. I also see a number of compiler optimization options. If I turn off the USB and select the "fastest" optimizations, what is the result? Teensy 3.2, No USB and Fastest optimizations Making these two changes and re-running the same C code produces this result: That is much better. It is interesting to see the compiler change is about 3x faster for this test (measured on the high pulse) and the lack of USB saves about 1uS in the loop rate. This is not a definitive test of the optimizations and probably the code grew a bit, but it is a stark reminder that optimization choices can make a big difference. ESP8266 The ESP8266 is a 32-bit Tenselica CPU. This is still a load/store architecture so its performance will largely match ARM though undoubtedly there are cases where it will be a bit different. The 8266 runs at 80Mhz so I do expect the performance to be similar to the Teensy 3.2. The wildcard is the 8266 framework is intended to support WiFI so it is running FreeRTOS and the Arduino loop is just one thread in the system. I have no idea what that will do to our pin toggle so it is time to measure. Interesting. It is actually quite slow and clearly there is quite a bit of system house-keeping happening in the main loop. The high pulse is only 0.948uS so that is very similar to Nano Every at 1/4th the clock speed. The low pulse is simply slow. This does seem to be a good device for IoT but not for pin toggling. ESP32 The ESP32 is a dual core very fast machine, but it does run the code out of a cache. This is because the code is stored in a serial memory. Of course our test is quite short so perhaps we do not need to fear the cache miss. Like the ESP8266, the Arduino framework is built upon a FreeRTOS task. But this has a second CPU and lots more clock speed so lets look at the results: Interesting, the toggle rate is about 2x the Teensy while the clock speed is about 3x. I do like how the pulses are nearly symmetrical. A quick peek at the source code for the framework shows the Arduino running as a thread but the thread updates the watchdog timer and the serial drivers on each pass through the loop. Conclusions It is very educational to make measurements instead of assumptions when evaluating an MCU for your next project. A specific CPU may have fantastic specifications and even demonstrations but it is critical to include the complete development system and code framework in your evaluation. It is a big surprise to find the 16MHz AVR328P can actually toggle a pin faster than the ESP8266 when used in a basic Arduino project. The summary graph at the top of the article is duplicated here: In this graph, the Pin Toggling Speed is actually only 1/(the high period). This was done on purpose so only the pin toggle efficiency is being compared. In the test program, the low period is where the loop() function ends and other housekeeping work can take place. If we want to compare the CPU/CODE efficiency, we should really normalize the pin toggling frequency to a common clock speed. We can always compensate for inefficiency with more clock speed. This graph is produced by dividing the frequency by the clock speed and now we can compare the relative efficiencies. That Cortex M4 and its framework in the Teensy 3.2 is quite impressive now. Clearly the ESP-32 is pretty good but using its clock speed for the win. The Mega 4809 has a reasonable framework just not enough clock speed. All that aside, the ASM versions (or even a faster framework) could seriously improve all of these numbers. The poor ESP8266 is pretty dismal. So what is happening in the digitalWrite() function that is making this performance so slow? Put another way, what am I getting in return for the low performance? There are really 3 reasons for the performance. Portability. Each device has work to adapt to the pin interface so the price of portability is runtime efficiency Framework Support. There are many functions in the framework that could be affected by the writing to the pins so the digitalWrite function must modify other functions. Application Ignorance. The framework (and this function) cannot know how the system is constructed so they must plan for the worst. Let us look at the digitalWrite for the the AVR void digitalWrite(uint8_t pin, uint8_t val) { uint8_t timer = digitalPinToTimer(pin); uint8_t bit = digitalPinToBitMask(pin); uint8_t port = digitalPinToPort(pin); volatile uint8_t *out; if (port == NOT_A_PIN) return; // If the pin that support PWM output, we need to turn it off // before doing a digital write. if (timer != NOT_ON_TIMER) turnOffPWM(timer); out = portOutputRegister(port); uint8_t oldSREG = SREG; cli(); if (val == LOW) { *out &= ~bit; } else { *out |= bit; } SREG = oldSREG; } Note the first thing is a few lookup functions to determine the timer, port and bit described by the pin number. These lookups can be quite fast but they do cost a few cycles. Next we ensure we have a valid pin and turn off any PWM that may be active on that pin. This is just safe programming and framework support. Next we figure out the output register for the update, turn off the interrupts (saving the interrupt state) set or clear the pin and restore interrupts. If we knew we were not using PWM (like this application) we could omit the turnOffPWM function. If we knew all of our pins were valid we could remove the NOT_A_PIN test. Unfortunately all of these optimizations require knowledge of the application which the framework cannot know. Clearly we need new tools to describe embedded applications. This has been a fun bit of testing. I look forward to your comments and suggestions for future toe-to-toe challenges. Good Luck and go make some measurements. PS: I realize that this pin toggling example is simplistic at best. There are some fine Arduino libraries and peripherals that could easily toggle pins much faster than the results shown here. However, this is a simple Apples to Apples test of identical code in "identical" frameworks on different CPU's so the comparisons are valid and useful. That said, if you have any suggestions feel free to enlighten us in the comments.
  4. The two functions you see are both halves of a ring buffer driver. The first function unloads the UART receive buffer and puts the bytes into the array eusart2RXBuffer. This array is indexed by eusart2RXHead. The head is always incremented and it rolls over when it reaches the maximum value. This receiving function creates a basic ring buffer insert that sacrifices error handling for speed. There are four possible errors that can occur. UART framing error. If a bad UART signal arrives the UART will abort reception with a framing error. It can be important to know if framing errors have occurred, and it is critical that the framing error bit be cleared if it gets set. The UART receiver is overrun. This happens if a third byte begins before any bytes are removed from the UART. With an ISR unloading the receiver this is generally not a real threat but if the baudrate is very high, and/or interrupts are disabled for too long, it can be a problem. The ring buffer head overwrites the tail. The oldest bytes will be lost but worse, the tail is not "pushed" ahead so the next read will return the newest data and then the oldest data. That can be a strange bug to sort out. It is better to add a check for head == tail and then increment the tail in that instance. This error is perhaps an extension of #3. The eusart2RxCount variable keeps track of the bytes in the buffer. This makes the while loop at the beginning of the read function much more efficient (probably 2 instructions on a PIC16). However if there is a head-tail collision, the the count variable will be too high which will later cause a undetected underrun in the read function. The second function is to be called from your application to retrieve the data captured by the interrupt service routine. This function will block until data is available. If you do not want to block, there are other functions that indicate the number of bytes available. The read function does have a number of lines of code, but it is a very efficient ring buffer implementation which extends the UART buffer size and helps keep UART receive performance high. That said, not all UART applications require a ring buffer. If you turn off the UART interrupts, you should get simple polling code that blocks for a character but does not add any buffers. The application interface should be identical (read) there will simply be no interrupt or buffers supporting the read function.
  5. Excellent points and I think we are almost completely in agreement. I have only four complaints with these helper macros and two of them are relatively minor. Consider the following snapshot from ATMEL Studio in a SAMD21 project the <CTRL><SPACE> pattern works great when you know the keyword. If you simply start with SERCOM, you get a number of matches that are not UART related. ALL the choices will compile. See the BAUD value placed in the CTRLA register... Some of these choices are used like macros...And others are not (CMODE) Placing an invalid value in these macros is completely reasonable and will compile. MISRA 19.7 disallows function-like macros. (though this is not really an issue because it does not apply in this situation.) So, these constructs are handy because they help prevent a certain sort of bookkeeping error related to sorting out all of these offsets. But the constructs allow range errors and semantic errors. Since we are talking about the SERCOM and I am using the SAMD21 in my example. Here is what START produces. hri_sercomusart_write_CTRLA_reg( SERCOM0, 1 << SERCOM_USART_CTRLA_DORD_Pos /* Data Order: enabled */ | 0 << SERCOM_USART_CTRLA_CMODE_Pos /* Communication Mode: disabled */ | 0 << SERCOM_USART_CTRLA_FORM_Pos /* Frame Format: 0 */ | 0 << SERCOM_USART_CTRLA_SAMPA_Pos /* Sample Adjustment: 0 */ | 0 << SERCOM_USART_CTRLA_SAMPR_Pos /* Sample Rate: 0 */ | 0 << SERCOM_USART_CTRLA_IBON_Pos /* Immediate Buffer Overflow Notification: disabled */ | 0 << SERCOM_USART_CTRLA_RUNSTDBY_Pos /* Run In Standby: disabled */ | 1 << SERCOM_USART_CTRLA_MODE_Pos); /* Operating Mode: enabled */ This is completely different than the other examples provided. This style is defined in hri_sercom_d21.h and a word to the wise, DO NOT BROWSE THIS INSIDE OF ATMEL START, This file must be huge as the page is completely frozen as it populates the source viewer. So I do like the construct, but often it does not help me very much because I must still go through the entire register and decide my values. When I string these values together any mistakes made will be sorted at once. All that aside, I don't really care much about the HW initialization. I want it to be short, to the point, and perfectly clear about what is placed in each register. In a typical project, only 1% of the code is some sort of HW work as you develop the application specific HW interfaces and bring up your board. Once these are tested, you are not likely to spend any more time on them. In a 9 month project, I expect to spend 2 weeks in the HW functions so if a magic value is clear and to the point, use it. If you want to construct your values with logical operations. Go for it.
  6. But in all of your examples you are not telling me why you are doing that bit of work. I cannot possibly determine if there is a bug if I don't know why you are configuring the SERCOM with that particular value. How about simply saying: void configureSerialForMyDataLink(void) { // datalink specifications found in specification 4.3.2 // using SERCOM0 as follows: // - Alternate Pin x,y,z // - 9600 baud // - half duplex // - SamD21 datasheet page 26 for specifics SERCOM0 = <blah blah blah>; } Now you know why. You have a function that has a clear purpose. And if the link is invalid, you can see the intent. The specifics of the bits are in the datasheet and clearly referenced. No magic here. As for the special access mode for performance... inline void SERCOM0_WRITE(uint32_t ControllOffset, uint32_t Value) { // Accessing the SERCOM via DFP offsets for high performance (* (uint32_t*) (0x42000400 + ControlOfset)) = Value; } Now a future engineer has a handy helper and the details are nicely removed. And an interested engineer can debug it because the intent is clear. Obviously you need to be a DFP expert (or have the datasheet) to understand/edit it. But no magic. But the application should NEVER use this helper. It should be buried in the HAL. The first function is much more clear for the HAL because it conveys application level intent. i.e. the Application will rarely care about the SERCOM and will always care about its DataLink. If I port the code to something without a SERCOM, the application will still need a DataLink so this function will simply be refilled with something suitable for the other CPU. The application remains unchanged.
  7. As tempting as it is to duplicate the datasheet in your code, there is much to dislike about this strategy. Single Source of Truth should be the manufacturers datasheet... Not what you copied into the code. The signal to noise ratio of the code will suck, making debugging more challenging. The register description is not always enough so this slippery slope will have you copying the entire peripheral chapter. Your future maintainer will not have your knowledge of the device so they will need the datasheet anyway.
  8. Comments describing the desired affect of the register value is ideal. So here is an example. void startMotorControlPWM(void) { // configuring TMR2 for 8-bit and 8.3khz PWM // See Requirements section 4.3.2.1 T2CON = 0xXX; // TMR2 1:1 prescaler PR2 = 0x4F; // reload at 6-bits (2 bits implied) produces 8-bit PWM CCP1CON = 0xXX; // Using CCP1 for PWM with TMR2 CCPR1L = 0xXX; // initial PWM at 25% for a slow motor rotation. } Note that magic numbers are used... but the explanations make the intent clear. And the necessary requirements are referenced for trace ability. When your future self is debugging, you can verify the intent to the requirements and then verify the magic numbers against the datasheet. Making changes during debugging would be done right here with the magic numbers and that will be clear and easy because the datasheet register maps will be handy. Once this code is working, the top level function name is easy and clear.
  9. Great points. So perhaps there are some unique considerations to the magic number story when dealing directly with HW. I too have frequently seen the comments diverge from the code and your simple "fix" definitely demonstrates that point. Switching to some sort of macro or enumerations and "building" the constant for the register is only a partial help. For example: INTCON = _INTCON_IOCIF_MASK | _INTCON_IOCIE_MASK | _ADCON0_nDONE_MASK; This example is a bit contrived because Microchip now includes the register name and the bit name and the type of constant (mask). However I still see code where various datasheet constants are used but the language provides no guarantee that the constant belongs to a specific register. Note the bug above is ORing in the ADCON0 constant with the INTCON constants. This will compile clean and might even work if the bit happens to be correct. These sorts of constructs in the HW code are really too tactical. We get caught up with the details setting some bit in a register and that should NEVER the goal of the function we are writing. void startPWM(void) { <blah blah blah> } If my function is startPWM, then it is clear what the need is. If I decided to use the CCP as that PWM, or a timer interrupt, either choice will work. Now I have the beginnings of a hardware abstraction layer. After much debate, MCC chose the following pattern: void EPWM1_Initialize (void) { // Set the PWM to the options selected in PIC10 / PIC12 / PIC16 / PIC18 MCUs // CCP1M P1A,P1C: active high; P1B,P1D: active high; DC1B 3; P1M single; CCP1CON = 0x3C; // CCP1ASE operating; PSS1BD0 low; PSS1AC0 low; CCP1AS0 disabled; ECCP1AS = 0x00; // P1RSEN automatic_restart; P1DC0 0; PWM1CON = 0x80; // STR1D P1D_to_port; STR1C P1C_to_port; STR1B P1B_to_port; STR1A P1A_to_CCP1M; STR1SYNC start_at_begin; PSTR1CON = 0x01; // CCPR1L 127; CCPR1L = 0x7F; // CCPR1H 0; CCPR1H = 0x00; // Selecting Timer2 CCPTMRS0bits.C1TSEL = 0x0; } This code has both magic numbers and an explanation. The reasons were as follows: Maximum performance. Slamming in a full value is always the fastest choice. Explain the choice with a comment. Magic values are fast and simple. The machine generated comment will be correct to the datasheet but will not convey "why". We can't know why so this is the next best thing. MCC is NOT a hardware abstraction layer but it does give a quick setup to many common functions. If you are really writing a HAL targeting your application, you will have more application defined interface functions and what happens inside the interface functions is very isolated. I would argue that if you tried to edit those functions WITHOUT the datasheet open, you must really know that part. ONe example is clearing an interrupt flag on a PIC16 is a simple as writing a 0 to the bit. Clearing the interrupt flag on an AVR requires writing a 1 to the bit. No amount of special macros and enumerations will make the code look "correct" to a both AVR freak and a PIChead. Each choice is correct for the architecture for sound technical reasons but you should not expect the HW behavior to be as obvious as which bit to select. So I would argue that even unsupported Magic Numbers in the hardware abstraction layer is perfectly acceptable. Please reference the datasheet chapter & verse so I can see what you were doing to the hardware. A bit of explanation documenting what the function is accomplishing and any clever bits in the HW you were taking advantage of would be very helpful. As a parting note, consider the CLC. There are LOTS of bits and the WHY is critical. Once you have drawn the schematic of the CLC circuit and decided how the bits are going to be set, just put the values in the right spots and don't worry about trying to create special macros. Good Luck. PS: Don't assume that I am 100% for magic numbers in the code. I just don't like avoiding them when they are the shortest, fastest code, and (once the datasheet is open) the easiest to debug.
  10. I have been told many times that we should avoid the use of magic numbers in our code. That is numbers like these: CMCON0 = 0x5d; for(x = 0; x < 29; x ++) { // do something clever with x } The rational is that we don't really know what 0x5D means when it is written to CMCON0 or why we should be stopping at 28 in the for loop. However if you made the code like this: #define CMCON0_PWM_CONFIGURATION <PWM_MAGIC_NUMBERS> // See Datasheet Chapter 3.4 on PWM configuration #define MAXIMUM_PLAYER_COUNT 29 CMCON0 = CMCON0_PWM_CONFIGURATION; for(x = 0; x < MAXIMUM_PLAYER_COUNT; x ++) { // do something with each player } Clearly this code is much more readable. However, there is a time when this practice is simply crazy. For example, If you only configure the PWM in a single place, you may not want to hunt around for the exact value placed in CMCON because you have the datasheet open in front of you so the right place for the exact value is right there on that line. Putting the value in one place to use in another place just makes more work. Or what about this example: // Found in a configuration header file #define ZERO_VALUE 0 #define ONE_VALUE 1 #define TWO_VALUE 2 #define THREE_VALUE 3 #define FOUR_VALUE 4 #define FIVE_VALUE 5 #define SIX_VALUE 6 #define SEVEN_VALUE 7 // Found later in the C code switch(device_number) { case ZERO_VALUE: <blah blah blah> break; case ONE_VALUE: <ETC> } I would LOVE a rational explanation on why doing this is little more than cargo culting a coding style. I have seen this sort of sillyness in many other places. // is this better? delay_ms(1000); // or this delay_ms(ONE_SECOND); // or perhaps this inline void delay_One_Second(void) { delay_ms(1000); } At the end of the day, I want to see numbers where they make sense. The MOST important reason for using the NO_MAGIC_NUMBER pattern is really NOT to clarify the numbers, but really to make the code DRY. For example consider this: #define MESSAGE_TIMEOUT 1000 // later delay_ms(MESSAGE_TIMEOUT); // or even better void delay_message_timeout(void); If you are going to use the MESSAGE_TIMEOUT in a large number of places and especially if you may be tweaking the value, then putting the number in one place is only sane. But the BEST way is to make the function because that is the MOST clear when you read the final code and it creates a very nice point of abstraction for any number of delay methods. The biggest argument FOR the anti-magic number pattern is really about documentation. In a simple way, the proponents of this pattern are trying to follow the "comments are a smell" pattern but usually without realizing it. In this way, a standin for the number can be used that makes the purpose clear. This is a powerful point and should be considered carefully. But, sometimes, the clearest value to put in the code is actually the MAGIC number. AND, if you don't understand that number, you have no business editing that part of the code. As always, be prudent and understand the needs of the programmers who will follow. That programmer is likely to be you. Good Luck.
  11. I am building an integrated audio interface for a Baofeng UV-5R hand-held radio. Primarily this is to get a packet radio APRS system up and running. Traditionally, this seems to be accomplished with a crazy collection of adapters and hand crafted interface cables. As I was looking for a better solution, I discovered the Teensy family of ARM microcontrollers. (actually Arduino high performance ARM's built on NXP Cortex M4's) The important part is actually the library support. Out of the box I was able to get a USB Audio device up and map the ADC and DAC to the input and output. This could all be configured with some simple "patch cord" wiring. So I created the "program" above. This combines the 2 audio channels from USB into a single DAC and also passes the data to an RMS block. This causes the audio to play on the DAC at 44.1khz. It also keeps a running RMS value available for your own code. More on this later. Next the ADC data is duplicated into both channels back through USB and on to the PC (Raspberry Pi). Again an RMS block is present here as well. Now pressing EXPORT produces this little block of "code" Which gets pasted into the Arduino IDE at the beginning of the program (before your setup & main. /* * A simple hardware test which receives audio on the A2 analog pin * and sends it to the PWM (pin 3) output and DAC (A14 pin) output. * * This example code is in the public domain. */ #include <Audio.h> #include <Wire.h> #include <SPI.h> #include <SD.h> #include <SerialFlash.h> #include <Audio.h> #include <Wire.h> #include <SPI.h> #include <SD.h> #include <SerialFlash.h> // GUItool: begin automatically generated code AudioInputUSB usb1; //xy=91,73.00001907348633 AudioInputAnalog adc1; //xy=153.00000381469727,215.00003242492676 AudioMixer4 mixer1; //xy=257.00000762939453,72.00002670288086 AudioAnalyzeRMS rms2; //xy=427.0000114440918,266.000036239624 AudioOutputUSB usb2; //xy=430.00001525878906,217.00003242492676 AudioOutputAnalog dac1; //xy=498.00009536743164,72.00002670288086 AudioAnalyzeRMS rms1; //xy=498.00001525878906,129.0000295639038 AudioConnection patchCord1(usb1, 0, mixer1, 0); AudioConnection patchCord2(usb1, 1, mixer1, 1); AudioConnection patchCord3(adc1, 0, usb2, 0); AudioConnection patchCord4(adc1, 0, usb2, 1); AudioConnection patchCord5(adc1, rms2); AudioConnection patchCord6(mixer1, dac1); AudioConnection patchCord7(mixer1, rms1); // GUItool: end automatically generated code const int LED = 13; void setup() { // Audio connections require memory to work. For more // detailed information, see the MemoryAndCpuUsage example AudioMemory(12); pinMode(LED,OUTPUT); } void loop() { // Do nothing here. The Audio flows automatically if(rms1.available()) { if(rms1.read() > 0.25) { digitalWrite(LED,HIGH); } else { digitalWrite(LED,LOW); } } // When AudioInputAnalog is running, analogRead() must NOT be used. } And Viola!. The PC sees this as an audio device and the LED blinks when the audio starts. PERFECT. Now, the LED will be replaced with the push to talk (PTT) circuit and the audio I/O will connect to the Baofeng through some filters. A single board interface to the radio from a Raspberry Pi that does not require 6 custom cables, and a 3 trips to E-BAY. Now I am waiting for my PCB's from dirtypcb.com This entire bit of work is for my radio system that is being installed on the side of my house. Here is the box: Inside is a Raspberry Pi 3, a Baofeng UV-5R for 2M work, 3 RTL dongles for receiving 1090MHz ADS-B, 978MHz ADS-B, 137MHz Satellite weather, GPS and 1 LoRaWAN 8 channel Gateway. I will write more about the configuration later if there is any interest. Good Luck.
  12. If I had a dollar for every time MPLAB puts a modal dialog box under the main window so everything is "frozen". I would retire. Put your rant below:
  13. It looks like you switched microcontrollers. I noticed the generated code was for an MSSP and now it is for an I2C peripheral. Which MCU are you using with the errors above?
  14. If you look at the MPLAB Code Configurator timeout driver you will find that each "timer" that you want to create has a structure similar to this: struct timeout { int (*callback)(void *argument); struct timeout *next; // other housekeeping things } And an API that takes such a structure. void timeout_addTimer(struct timeout *myTimer); struct timeout myTimer = {myCallback,0}; void main(void) { timeout_addTimer(&myTimer); } Note: the function names are similar in purpose to MCC but not identical. The function addTimer will accept the completely allocated timer, finish initializing the "private" data and insert this into the list of timers. The memory is "owned" and "provided" by the user but the driver will manage it. Of course this driver will also work with a heap if that is desired. void main(void) { struct timeout *myTimer = malloc(sizeof(struct timeout)); myTimer.callback = myCallback; timeout_addTimer(myTimer); // after some time has passed free(myTimer); } In both cases the application is 100% responsible for the memory and can do the most suitable thing for the circumstances. Just to be "complete" here are two other options: The stack: void main(void) { struct timeout myTimer = {myCallback,0}; timeout_addTimer(&myTimer); while(1) { // do my application } } // This is safe because myTimer will never go out of scope. // However, if you use this technique in a function that will exit, // the memory will go out of scope on the stack and there will be // a difficult to debug crash when the timer expires. Static void foo(void) { static struct timeout myTimer = {myCallback,0}; timeout_addTimer(&myTimer); } Now the timer will be in scope because the variable is static. Of course you may have a new problem that addTimer could be called with this timer already in the timer list. That may break the linked list if the list is not pre-scanned for preexisting timers. In EVERY case I do prefer static allocation that is the responsibility of the application. That assures the following: Code fails during compile/link time so it is easier to debug. Memory usage appears in the data segment so it is tracked by ALL compilers as RAM usage giving an accurate measure of RAM. The driver is scale able because it NEVER needs internally allocated memory nor does it need heap access. That said, I just started using the ESP32 and it's memory is fragmented. It has a block of RAM that is used for static variables, and the data segment. And it has a different larger block of RAM used for the heap. If you don't use the heap, you will have very little memory. Good heap hygiene should be a future topic for discussion.
  15. Sometimes I get the sad impression that embedded FW engineers only understand 1 data container, the array. The array is a powerful thing and there are many good reasons to use it but it also has some serious problems. For instance, a number of TCP/IP libraries use arrays to hold a static list of sockets. When you wish to create a new socket the library takes one of the unused sockets from the array and returns a reference (a pointer or index) so the socket can be used as a parameter for the rest of the API. It turns out that sockets are somewhat heavy things (use lots of memory) so you always want to optimize the system to have the smallest number of sockets necessary. Unfortunately, you must "pick" a reasonable number of sockets early in the development process. If you run out of sockets you must go back and recompile the library to utilize the new socket count. Now there is a dependency that is not obvious, only fails at run time and links the feature count of the product with the underlying library. You will see bugs like, "when I am using the app I no longer get notification e-mails". It turns out that this problem can be easily solved with a dynamic container. i.e. one that grows at runtime as you need it to. A brute force method would perhaps be to rely upon the heap to reallocate the array at runtime and simply give the library a pointer to an array. That will work but it inserts a heavy copy operation and the library has to be paused while the old array is migrated to the new array. I propose that you should consider a Linked List. I get a number of concerns from other engineers when I have made this suggestion so just hang tight just one moment. Concerns Allocating the memory requires the heap and my application cannot do that. Traversing the list is complicated and requires recursion. We cannot afford the stack space. A linked list library is a lot of code to solve this problem when a simple array can manage it. The linking pointers use more memory. If you have a new concern, post it below. I will talk about these first. Concern #1, Memory allocation I would argue that a heap is NOT required for a linked list. It is simply the way computer science often teaches the topic. Allocate a block of memory for the data. place the data in the block of memory. Data is often supplied as function parameters. insert the block into the list in the correct place. Computer science courses often teach linked lists and sorting algorithms at the same time so this process forms a powerful association. However, what if the process worked a little differently.j Library Code -> Define a suitable data structure for the user data. Include a pointer for the linked list. User Code -> Create a static instance of the library data structure. Fill it with data. User Code -> Pass a reference to the data structure to the library. Library Code -> insert the data structure into the linked list. If you follow this pattern, the user code can have as many sockets or timers or other widgets as it has memory for. The library will manage the list and operate on the elements. When you delete an element you are simply telling the library to forget but the memory is always owned by the user application. That fixes the data count dependency of the array. Concern #2, Traversing the list is complex and recursive. First, Recursion is always a choice. Just avoid it if that is a rule of your system. Every recursive algorithm can be converted to a loop. .Second, Traversing the list is not much different than an array. The pointer data type is larger so it does take a little longer. struct object_data { int mass; struct object_data *nextObject; }; int findTheMassOfTheObjects(struct object_data *objectList) { thisObject = objectList; while(thisObject) { totalMass += thisObject->mass; thisObject = thisObject->nextObject; } printf("The mass of all the objects is %d grams\n", totalMass); return totalMass; } So here is a quick example. It does have the potential of running across memory if the last object in the list does NOT point at NULL. So that is a potential pitfall. Concern #3, A linked list library is a lot of code Yes it is. Don't do that. A generic library can be done and is a great academic exercise but most of the time the additional next pointers and a few functions to insert and remove objects are sufficient. The "library" should be a collection of code snippets that your developers can copy and paste into the code. This will provide reuse but break the dependency on a common library allowing data types to change, or modifications to be made. Concern #4, A linked list will use more memory It is true that the linked list adds a pointer element to the container data structure. However, this additional memory is probably much smaller than the "just in case" additional memory of unused array elements. It is probably also MUCH better than going back and recompiling an underlying library late in the program and adding a lot more memory so the last bug will not happen again. A little history The linked list was invented by Allen Newell, Cliff Shaw and Herbert Simon. These men were developing IPL (Information Processing Language) and decided that lists were the most suitable solution for containers for IPL. They were eventually awarded a Turing Award for making basic contributions to AI, Psychology of Human Cognition and list processing. Interestingly IPL was developed for a computer called JOHNIAC which had a grand total of 16 kbytes of RAM. Even with only 16KB IPL was very successful and linked lists were determined to be the most suitable design for that problem set. Most of our modern microcontrollers have many times that memory and we are falling back on arrays more and more often. If you are going to insist on an array where a linked list is a better choice, you can rest easy knowing that CACHE memory works MUCH better with arrays simply because you can guarantee that all the data is in close proximity so the entire array is likely living in the cache. Good Luck P.S. - The timeout driver and the TCP library from Microchip both run on 8-bit machines with less than 4KB of RAM and they both use linked lists. Check out the code in MCC for details.
×
×
  • Create New...