
RTCOUNTER example - (solved) Caused by Silicon Errata


KM1


Hello,

I have run into a strange (to me) issue with the rtcounter module as provided by MCC and shown in the example program in the blog area of this site. I am using up-to-date versions of MPLAB X and XC8, and I am trying the example on a PIC18F47K40 Xpress board. I set an output (RA4) in main after initializing the PIC and then enter the while(1) loop, where rtcount_callNextCallback(); is called. The output then turns on and off continuously. The off duration is typically ~60 µs, and the on time can be varied by changing the timer interrupt settings (I use a 1 ms Timer0 in 16-bit mode) and the number sent to the callback. Approximately every 3 ms, the off time increases so that that one cycle has a duty cycle of approximately 50%.

Turning the output on or off in the callback routine has no effect - I initially wanted to toggle on and off at a speed visible to the eye. Commenting out the callback in the main loop stops the output from going low.

I have checked errata and used the data sheet to confirm register settings, I tried moving the current-limited LED from RA4 to RA2, and I have studied the example program looking for obvious differences that may cause this behaviour.

I would appreciate thoughts and/or suggestions.

Keith
   


  • Member

Oh my, I have run it in the simulator and this is definitely broken. Let me debug it and I will get back to you with a clear answer on how to fix this and what is going on.

EDIT: OK. RTCOUNTER is designed for the timer to represent the bottom bits of a concatenated timer value. This ONLY works if the timer runs from 0 and overflows at FFFF. This means that you cannot set the timer period to 1ms the way that you did there, you can ONLY set the timer period to MAXIMUM which means the timer will run from 0x0000 to 0xFFFF, forming the bottom 16 bits of the concatenated timer.
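A minimal host-side sketch of why the concatenation needs a free-running timer. The helper names and the reload value used in the note below are hypothetical and are not taken from the MCC sources; the sketch only models a software high word joined to a 16-bit hardware count.

```c
#include <stdint.h>

/* Hypothetical helper: join a software-maintained high word with the
 * 16-bit hardware count to form one 32-bit tick value. */
static uint32_t concat_ticks(uint16_t high_word, uint16_t hw_count)
{
    return ((uint32_t)high_word << 16) | hw_count;
}

/* Tick value seen right after the hardware timer overflows.
 * 'reload' is what the overflow handler writes back into the timer
 * (0 models a free-running timer that simply rolls over to 0x0000). */
static uint32_t ticks_after_overflow(uint16_t high_word, uint16_t reload)
{
    return concat_ticks((uint16_t)(high_word + 1u), reload);
}

/* Number of tick values that are skipped across one overflow. */
static uint32_t ticks_skipped(uint16_t high_word, uint16_t reload)
{
    uint32_t before = concat_ticks(high_word, 0xFFFFu);
    uint32_t after  = ticks_after_overflow(high_word, reload);
    return after - before - 1u;
}
```

With a free-running timer, ticks_skipped(0, 0) is 0, so the concatenated value advances by exactly one tick across the overflow. With an assumed reload of 0xFC18 it jumps forward by 0xFC18 ticks on every overflow, so deadlines computed from the concatenated value no longer correspond to elapsed time.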

I cannot test the code with TMR0 in the simulator as the simulator has a bug which does not allow TMR0 to work in 16-bit mode, and I do not have a 47K40 on me to test it on the actual hardware 😞

What you should see is this:

1. TMR0L and TMR0H should count upwards from 0x0000. 
2. When you call create timer it should take the current value of ((TMR0H << 8) + TMR0L) and add 1000 to this; that should give you something like 1222 the first cycle.
3. At this point, every time you call rtcount_callNextCallback() it should compare the current timer value obtained by ReadTimer() to 1222. It should do nothing until the timer exceeds 1222, then it should call the callback once and add 1000 to the timer's absolute timeout, which should now be 2222, and the cycle should repeat.

Please check if this is what you see on the hardware; we can fix the period once this is working correctly. This should have a period of 8 ms per tick, so 1000 ticks should take about 8 seconds ...
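The three steps above can be sketched on a host machine as follows. This is an illustration under stated assumptions, not the MCC implementation: the struct, the function names, and the choice to treat "deadline reached" as expired are all hypothetical, and counter wrap-around is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one scheduled timer. */
typedef struct {
    uint32_t absolute_timeout;  /* tick value at which the timer is due */
    uint32_t period;            /* ticks added after each expiry */
    uint32_t fired;             /* number of times the callback would have run */
} sched_timer_t;

/* Step 2: the deadline is the current tick value plus the period. */
static void sched_create(sched_timer_t *t, uint32_t now, uint32_t period)
{
    t->absolute_timeout = now + period;
    t->period = period;
    t->fired = 0;
}

/* Step 3: do nothing until the deadline is reached, then fire once and
 * move the deadline forward by one period. Returns true when it fired. */
static bool sched_poll(sched_timer_t *t, uint32_t now)
{
    if (now < t->absolute_timeout) {
        return false;
    }
    t->fired++;
    t->absolute_timeout += t->period;
    return true;
}
```

Creating the timer at tick 222 with a period of 1000 gives a first deadline of 1222. Polling before that does nothing, the first poll at or after 1222 fires once, and the deadline then becomes 2222.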

What happened in your case, in 16-bit mode with the period set to 1 ms, is that rtcount_callNextCallback() ALWAYS thought there was a timeout. It sets the timeout at something like 222, but because you reload the timer with a value much larger than this, the routine always thinks the timer has timed out and toggles the I/O pin on every pass through the while(1) loop.


  • Member

To confirm above I changed your code to use TMR0 in 8-bit mode so that I can test it in the simulator. With the prescaler I set the period to 1.024ms, so it is not quite perfect, but close enough for a test (2.4% error).
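The 2.4% figure can be checked with a few lines of integer arithmetic. The 4 µs count period below is an assumption chosen so that 256 counts give the 1.024 ms quoted above; the actual clock and prescaler settings are in the attached project, not in this sketch.

```c
#include <stdint.h>

/* Period of one timer overflow in nanoseconds. */
static uint32_t overflow_period_ns(uint32_t counts_per_overflow, uint32_t count_ns)
{
    return counts_per_overflow * count_ns;
}

/* Absolute error of 'actual' relative to 'target', in parts per thousand. */
static uint32_t error_per_mille(uint32_t actual_ns, uint32_t target_ns)
{
    uint64_t diff = (actual_ns > target_ns) ? (uint64_t)actual_ns - target_ns
                                            : (uint64_t)target_ns - actual_ns;
    return (uint32_t)(diff * 1000u / target_ns);
}
```

An 8-bit timer overflows every 256 counts, so with an assumed 4000 ns per count the period is 1,024,000 ns, which is 24 parts per thousand (2.4%) longer than the 1 ms target.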

I attach the project with these settings, this works fine in the simulator.

xpress1.zip


Hi,

I am seeing the issue when rtcount_callNextCallback(); is called in the while loop.

while (1)
{
    // Add your application code
    //IO_RA1_SetLow();
    // Check if the next timer has expired, call its callback from here if it did
    rtcount_callNextCallback();
}

I tested with the debugger as well and see that after 5 passes through rtcount_callNextCallback(); the device resets.

I wanted to attach the RA4 output signal but I have not been able to figure out how.

By default the stack overflow reset is enabled:

#pragma config STVREN = ON    // Stack Full/Underflow Reset Enable bit->Stack full/underflow will cause Reset

If I disable it then the RA4 output stays low. I have not been able to pinpoint the exact line that is causing the issue. However, I will continue working on this and post an update.

 

 

 


30 minutes ago, Orunmila said:

Hey Prasad, that should use almost no stack space? Very strange. Can you check the reset bits to see if it really is a stack overflow?

 

 

I checked the PCON0 register. There was no stack overflow/underflow, but the RMCLR bit was low, which indicates an MCLR reset. Per the data sheet the default value should be PCON0 = 0011110x; the value obtained was 0011010x (bit order: STKOVF STKUNF WDTWV RWDT RMCLR RI POR BOR).

Even with STVREN enabled, if we comment out rtcount_callNextCallback(); the device doesn't reset. We need to test the rtcount_callNextCallback(); function completely.
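A small host-side sketch of the comparison between the two PCON0 values quoted above. The bit order is the one given in the post; the helper names are hypothetical, and bit 0 is masked out because both values are quoted with an 'x' in that position.

```c
#include <stdint.h>

/* Bit names indexed by bit number (bit 0 first), per the order in the post. */
static const char *const pcon0_bit_names[8] = {
    "BOR", "POR", "RI", "RMCLR", "RWDT", "WDTWV", "STKUNF", "STKOVF"
};

/* Bit 0 is shown as 'x' (unknown) in both quoted values, so ignore it. */
#define PCON0_KNOWN_MASK 0xFEu

/* Mask of the known bits that differ between the two register values. */
static uint8_t pcon0_changed_bits(uint8_t expected, uint8_t observed)
{
    return (uint8_t)((expected ^ observed) & PCON0_KNOWN_MASK);
}

/* Highest bit number that differs, or -1 if the known bits all match. */
static int pcon0_highest_changed_bit(uint8_t expected, uint8_t observed)
{
    uint8_t diff = pcon0_changed_bits(expected, observed);
    for (int bit = 7; bit >= 0; bit--) {
        if (diff & (1u << bit)) {
            return bit;
        }
    }
    return -1;
}
```

Taking 0011110x as 0x3C and 0011010x as 0x34, the only differing bit is bit 3, and pcon0_bit_names[3] is RMCLR. Bits 7 and 6 (STKOVF and STKUNF) are clear in the observed value.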

 


Hi KM1,

If we load RTCOUNTER it loads "Timer0 without interrupt", but I see that the interrupt is enabled in the code.

Could you please give some details on why the interrupt is enabled?

Also, I see EUSART and SPI-MASTER code in it. If you could share some details on this, it may be helpful.


11 hours ago, Prasad said:

Hi KM1,

If we load RTCOUNTER it loads "Timer0 without interrupt", but I see that the interrupt is enabled in the code.

Could you please give some details on why the interrupt is enabled?

Also, I see EUSART and SPI-MASTER code in it. If you could share some details on this, it may be helpful.

I enabled the timer0 interrupt because it appeared to be a requirement to run the rtcount example. I don't understand how the callback would use the interrupt routine without my enabling the interrupts?

The EUSART is to be used for debugging, and the SPI code is to allow the use of an Ethernet interface module, which is the primary goal for this project. I have not done anything with either, as I was testing the rtcount module first. I very much like the approach used in the rtcount example.

Keith


  • Member

The full errata can be downloaded here : http://ww1.microchip.com/downloads/en/DeviceDoc/PIC18F27-47K40-Silicon-Errata-and-Data-Sheet-Clarification-80000713E.pdf

If this is the problem then this is quite a serious problem as everyone using C with those boards will be caught by this! It would mean that they ran a production batch of those boards with rev A2 silicon!



I narrowed down the issue. The controller is restarting due to a stack overflow.
 
The code at line 220 is causing the issue:
      reschedule = timer->callbackPtr(timer->payload);
 
It looks like a recursive call is causing the issue, but I was not able to understand how this part of the code works at a given time.
 
We tried adding a NULL pointer check to that line, and that worked. However, I feel this is not a proper fix for the application or the API function; it just saves the controller from restarting.
    if (timer->callbackPtr != NULL)
        reschedule = timer->callbackPtr(timer->payload);
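A slightly more defensive variant of that check, sketched for a host build. The types and names here are hypothetical stand-ins and not the MCC definitions; the idea is only that a NULL callback is counted instead of being skipped silently, so the underlying cause stays visible while the device keeps running.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: returns the reschedule value in ticks. */
typedef uint32_t (*guarded_callback_t)(void *payload);

/* Hypothetical stand-in for the timer structure. */
typedef struct {
    guarded_callback_t callbackPtr;
    void *payload;
} guarded_timer_t;

/* Counts how often a NULL callback was encountered. */
static uint16_t null_callback_count;

/* Call the callback if it is set; otherwise record the fault and
 * return 0 (assumed here to mean "do not reschedule"). */
static uint32_t guarded_invoke(const guarded_timer_t *timer)
{
    if (timer->callbackPtr == NULL) {
        null_callback_count++;
        return 0u;
    }
    return timer->callbackPtr(timer->payload);
}

/* Demo callback and two demo timers, for illustration only. */
static uint32_t demo_callback(void *payload)
{
    (void)payload;
    return 1000u;
}

static const guarded_timer_t demo_good_timer = { demo_callback, NULL };
static const guarded_timer_t demo_bad_timer  = { NULL, NULL };
```

Invoking demo_good_timer returns 1000, while invoking demo_bad_timer returns 0 and increments null_callback_count. This hides the reset but does not explain why the pointer is NULL in the first place.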
 
I have attached the snapshot of the StackOverflow Flag setting.  
 
BTW, it looks like a simulator bug that shows the nMCLR bit as low every time. I measured the voltage on nMCLR and it is not low, nor does it drop while the program executes.

line220 causing StackOV.jpg


Thank you both for your very helpful suggestions and the time you are putting in to sort out this issue. I will be out of the office today and tomorrow, but will have time at the end of the week to do some more investigating at my end.

Keith


  • Member

@Prasad, that pointer cannot be null according to the C code, it is initialized at startup. I think you just found the problem! Good detective work!

Of course if you check for null there and it is indeed null then the device will not reset but also the callback will never be called so the problem is not yet solved...

Let’s try out that errata workaround; I think that will fix it. If not, let’s keep investigating how it is possible for this pointer to be null. It is either not initialized (the errata could cause this) or it is overwritten with null at some point. Which one is it?

 


  • Member
4 hours ago, ric said:

You don't need to edit powerup.as if you tell the compiler about the " NVMREG" errata.

https://www.microchip.com/forums/m969418.aspx

 

 

Oh wow I did not realize this was implemented as an errata! That would by far be the best option! I am going to try it out and upload the project here, then KM1 should be able to test when he gets back!

 


  • Member

Here you go - I have compiled this with the +NVMREG option, and confirmed that this writes the appropriate bits before doing initialization of the function pointers through the TBLRD instruction.

KM1, I will leave this here for you to test. Uploading the project zip as well as the HEX I compiled.

I did find a host of bugs in the RTCOUNTER when you are using TIMER0 in 8-bit mode! It is clear this combination needs some work. 

1. The left shift in readTimer was still <<16 even when the timer was 8-bit, causing all kinds of havoc.
2. The global variable g_rtcounterH was defined as 16-bit, limiting us to a 24-bit range, but RTCOUNTER_CONCATENATE_TIMER_TICKS was defined specifying a 16-bit range, so the math was all wonky.
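The first bug can be illustrated with a short host-side sketch. The macro and function names are hypothetical and this is not the RTCOUNTER source; it only shows that the shift applied to the software high part must equal the width of the hardware timer.

```c
#include <stdint.h>

/* Width of the hardware timer in this sketch (8-bit mode). */
#define SKETCH_TIMER_BITS 8u

/* Correct: shift the high part by the hardware timer width. */
static uint32_t concat_8bit(uint32_t high, uint8_t hw_count)
{
    return (high << SKETCH_TIMER_BITS) | hw_count;
}

/* Buggy: shift left over from the 16-bit case, used with an 8-bit timer. */
static uint32_t concat_wrong_shift(uint32_t high, uint8_t hw_count)
{
    return (high << 16) | hw_count;
}
```

Across an overflow from (high = 0, count = 0xFF) to (high = 1, count = 0x00), concat_8bit advances by exactly 1 tick, while concat_wrong_shift jumps by 0xFF01 ticks, which makes every pending deadline appear to expire far too early.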

I set the timer to have a tick of 1 µs, which makes the math easy, and then got 1 s by setting the timer reschedule value to 1,000,000, which should make the output toggle every 1 s, or come on every 2 s ...

For the blog post we used TIMER1 and for that implementation everything seemed to be correct.

Also, I did not run it long enough to rebase (wrap), so that may also have issues. That should happen after about 4000 seconds, which is longer than it sounds ...
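The numbers above can be checked with a short sketch, assuming a 32-bit tick counter and the 1 µs tick mentioned in the post. The names are hypothetical.

```c
#include <stdint.h>

/* Tick length in microseconds, per the post. */
#define SKETCH_TICK_US 1u

/* Number of ticks in the given number of milliseconds. */
static uint32_t ticks_for_ms(uint32_t ms)
{
    return ms * (1000u / SKETCH_TICK_US);
}

/* Whole seconds until a 32-bit tick counter wraps (assumed counter width). */
static uint32_t wrap_seconds_32bit(void)
{
    return (uint32_t)((UINT64_C(1) << 32) * SKETCH_TICK_US / 1000000u);
}
```

ticks_for_ms(1000) gives the 1,000,000 ticks used for the 1 s reschedule value, and wrap_seconds_32bit() gives 4294 whole seconds, consistent with the "about 4000 seconds" estimate.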

 

 

xpress1.zip


  • Member

Good job tracking this down, folks. It sounds like the templates need a little help for the 8-bit case and probably for the 32-bit case (running this on a PIC32). Is there any news on getting the templates into a public repository (GitHub) so we can contribute fixes and help out?


Archived

This topic is now archived and is closed to further replies.

 

