Fast C parity calculations on PIC16F


ric

Question

  • Member

This is my attempt to convert the blazingly fast assembler routine for calculating parity into a C function.

The original comes from here: https://www.microchip.com/forums/m4762.aspx

Unfortunately Microchip have killed off the board where the original discussion took place at http://asp.microchip.com/webboard/wbpx.dll/~DevTools/read?21443,5

#include <xc.h>

//returns 0x00 (even parity) or 0x80 (odd parity)
unsigned char parity(volatile unsigned char dat)    //("volatile" is required because no C code reads the parameter)
{
    asm("swapf parity@dat,w");    //assume correct bank is still selected
    asm("xorwf parity@dat,w");    //W has 8 bits reduced to 4 with same parity
    asm("addlw 41h");   // bit 1 becomes B0^B1 and bit 7 becomes B6^B7
    asm("iorlw 7Ch");   // for carry propagation from bit 1 to bit 7
    asm("addlw 2");     // Done! the parity bit is bit 7 of W
    asm("andlw 80h");   // set NZ if odd parity, and leave 00 or 80 in W
    asm("return");
    return 1;           //dummy instruction to defeat "no return value" error
}

void main(void) {
    unsigned char idx=0;
    while(1)
    {
        PORTA = parity(idx);
        idx++;
    }
}

I'm not sure if there's a cleaner way to suppress the "no return value" error, without generating extra code.
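For anyone who wants to check the arithmetic off-target, the instruction sequence can be modelled in plain C on a PC. This is only a sketch under my own assumptions: `parity_model`, `parity_reference` and `count_mismatches` are names made up for illustration, and the `uint8_t` casts stand in for the 8-bit W register wrapping around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the assembler routine.
   Each statement mirrors one PIC instruction. */
static uint8_t parity_model(uint8_t dat)
{
    uint8_t w;
    w = (uint8_t)((dat << 4) | (dat >> 4)); /* swapf dat,w */
    w ^= dat;                 /* xorwf dat,w : both nibbles now hold hi^lo */
    w = (uint8_t)(w + 0x41);  /* addlw 41h   : bit 1 = n0^n1, bit 7 = n2^n3 */
    w |= 0x7C;                /* iorlw 7Ch   : bits 2..6 set so a carry can ripple */
    w = (uint8_t)(w + 2);     /* addlw 2     : a carry out of bit 1 flips bit 7 */
    w &= 0x80;                /* andlw 80h   : keep only the parity bit */
    return w;
}

/* Naive reference: XOR all eight bits together. */
static uint8_t parity_reference(uint8_t dat)
{
    uint8_t p = 0;
    for (int i = 0; i < 8; i++) {
        p ^= (uint8_t)((dat >> i) & 1u);
    }
    return p ? 0x80 : 0x00;
}

/* Number of inputs in 0..255 where the model and the reference disagree. */
static int count_mismatches(void)
{
    int bad = 0;
    for (int v = 0; v < 256; v++) {
        if (parity_model((uint8_t)v) != parity_reference((uint8_t)v)) {
            bad++;
        }
    }
    return bad;
}
```

Calling `count_mismatches()` from a small test program compares every 8-bit input against the bit-by-bit reference. It says nothing about banking or how XC8 passes the parameter, only that the arithmetic itself holds up.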

9 answers to this question

  • Member

A suggested C macro is almost as good, but it needs an extra temporary storage location for the first calculation:

#define PARITY(b)   ((((((b)^(((b)<<4)|((b)>>4)))+0x41)|0x7C)+2)&0x80)

It also generates a "warning: (752) conversion to shorter data type" on the macro invocation.
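One possible way around the warning is to wrap the same expression in a small function and cast the result explicitly. This is a sketch only: `parity_fn` and `parity_fn_check` are made-up names, and I have not checked what code XC8 generates for it or whether warning 752 goes away in every mode. The cast loses nothing, because the final `& 0x80` leaves at most bit 7 set. The bits above bit 7 that integer promotion introduces never affect bit 7, since carries only travel upward.

```c
#include <assert.h>
#include <stdint.h>

#define PARITY(b)   ((((((b)^(((b)<<4)|((b)>>4)))+0x41)|0x7C)+2)&0x80)

/* Hypothetical wrapper: narrows the promoted int result back to 8 bits. */
static uint8_t parity_fn(uint8_t b)
{
    return (uint8_t)PARITY(b);
}

/* Returns 1 if parity_fn agrees with a plain bit count for every 8-bit input. */
static int parity_fn_check(void)
{
    for (int v = 0; v < 256; v++) {
        int ones = 0;
        for (int i = 0; i < 8; i++) {
            ones += (v >> i) & 1;
        }
        if (parity_fn((uint8_t)v) != ((ones & 1) ? 0x80 : 0x00)) {
            return 0;
        }
    }
    return 1;
}
```

Because the argument is evaluated three times inside the macro, a function wrapper also avoids surprises if someone passes an expression with side effects.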


  • Member
10 minutes ago, Orunmila said:

What happens when you do "return WREG" as the last line? I think in PRO mode on XC8 that may work and you can remove the return asm line. I can check tomorrow if this works, you can probably check it right now 🙂

That was what I did first, and was horrified by the code generated by XC8 2.0 in free mode.

That was my attempt to force it to do it efficiently.

XC8 v1.34 in Pro mode only has one pointless instruction:

   309                           ;main.c: 41: return WREG;
   310  07FE  0809               	movf	9,w	;volatile
   311  07FF  0008               	return

Std and Free mode have an extra pointless instruction:

   320  07FD  0020               	movlb	0	; select bank0
   321  07FE  0809               	movf	9,w	;volatile
   322  07FF  0008               	return

and XC8 2.0 in C90 Free mode (Opt 0) is a bit ordinary:

  1218  07D4                     ;main.c: 41: return WREG;
  1219                           	movlb 0	; select bank0
  1220  07D4  0020               	movf	(9),w	;volatile
  1221  07D5  0809               	goto	l9
  1222  07D6  2FD7               	
  1223                           l594:	
  1224  07D7                     	line	42
  1225                           	
  1226                           l9:	
  1227  07D7                     	return

C90 Free / Opt 1 is the same as v1.x in Free mode:

   541                           ;main.c: 41: return WREG;
   542  07D9  0020               	movlb	0	; select bank0
   543  07DA  0809               	movf	9,w	;volatile
   544  07DB  0008               	return

C99 mode generated the same code.

(And that macro generates horrible code if you are not in Pro mode...)


  • Member

I have been playing for a while with coding functions in ASM which can be called from C to cater specifically to cases like these.

One way to get exactly the code you need in a case like this is to add a .S file to your project and write the routine there, as shown below. It will be callable as a C function with signature 4217 (passes in 1 byte in W, returns 1 byte in W).

I made one small change to what you had. You assumed the parameter was NOT passed in W, but with XC8 that is the normal way it is done, and if the parameter does arrive in W your code looks like it would fail. I just declared a local variable and moved W into it, which is what you assumed had already happened during the call.

If you want to save that extra instruction and 1 byte of ram you can remove the parameter altogether, and code the ASM code to read it from the source symbol.

#include <xc.inc>
GLOBAL _calcParity        ; make _calcParity globally accessible
SIGNAT _calcParity,4217   ; tell the linker how it should be called

GLOBAL    calcParity@tmp
    
PSECT cstackCOMMON,class=COMMON,delta=1,space=1
calcParity@tmp ds 1
    
; everything following will be placed into the mytext psect
PSECT mytext,global,class=CODE,delta=2

; our routine to calculate the parity
_calcParity:

; W is loaded by the calling function;
movwf calcParity@tmp
swapf calcParity@tmp,w   ; assume correct bank is still selected
xorwf calcParity@tmp,w   ; W has 8 bits reduced to 4 with same parity
addlw 41h                ; bit 1 becomes B0^B1 and bit 7 becomes B6^B7
iorlw 7Ch                ; for carry propagation from bit 1 to bit 7
addlw 2                  ; Done! the parity bit is bit 7 of W
andlw 80h                ; set NZ if odd parity, and leave 00 or 80 in W

; the result is already in the required location (W), so we can
; just return immediately
return

 

You can then call that from C like so :

#include <xc.h>
#include <stdint.h>

uint8_t calcParity(uint8_t dat);

void main(void) {
    volatile uint8_t c = 0;
    
    c = calcParity(0x01);
    c = calcParity(0x11);
    c = calcParity(0x03);
    c = calcParity(0x13);
}


Not sure if this helps you.

We have started a conversation with the compiler team to see what they think about optimizing out movf w,w, since that never makes sense (although it is a trick used to clear status flags in some cases...). Lots of people are still on vacation until Monday, so we will have to be patient for their input.


 


  • Member
1 hour ago, Orunmila said:

One way to get exactly the code you need in a case like this is to add a .S file to your project and write the routine there, as shown below. It will be callable as a C function with signature 4217 (passes in 1 byte in W, returns 1 byte in W).

I made one small change to what you had. You assumed the parameter was NOT passed in W, but with XC8 that is the normal way it is done, and if the parameter does arrive in W your code looks like it would fail. I just declared a local variable and moved W into it, which is what you assumed had already happened during the call.

Very nice.

You're right about the bug though. I think I got conned by C99 mode with no optimisation. Normally the parameter is just in W, so you do need to create a temporary variable and save it as you did.

Is your tmp variable essentially static? i.e. are other functions at the same level able to overlay that location?

 

Quote

If you want to save that extra instruction and 1 byte of ram you can remove the parameter altogether, and code the ASM code to read it from the source symbol.

Now THIS sounds interesting. How would you do that?

This function does not alter the scratch location, it just needs to be able to read it.


  • Member

I am still learning how the linker interprets things. This is using the compiled stack and the linker should have all the information it needs to re-use the location but I have no idea if it will. 

EDIT: Actually the more I look at that the more I think it will not re-use the memory :(. It is just saying reserve 1 byte of common ram, not associated with the function call.

I have been playing with proxy functions, and I was hoping that once they fix inlining in the compiler that this would give a good path to making ASM functions with a proxy wrapper like this.

This way the C function gets all the bells and whistles that C provides in terms of stack management. You can use the software or compiled stack as you please, or even make this function reentrant, and it all still works. It gets really complicated when you inline the C function and use it more than once, but if you could coax the linker into inlining the ASM function, since it is only ever called once, this would be a really neat solution.

As things stand now you end up with an extra call and return, which is really a 4-instruction penalty. If you are doing this as a speed optimization, that sucks; if you are doing it to access some fancy ASM code, it works nicely.

uint8_t asm_calcParity();

uint8_t calcParity(uint8_t c)    // 8-bit parameter, so it arrives in W
{
    volatile uint8_t tmp;    // This generates no code but reserves the temp stack space
    tmp = c;                 // This simply does a MOVWF tmp
    
    // Since this returns what another function returned in W,
    // it should just generate a CALL and RETURN
    return asm_calcParity();
}

Then the ASM just becomes this:

#include <xc.inc>
GLOBAL _asm_calcParity        ; make _asm_calcParity globally accessible
SIGNAT _asm_calcParity,4217   ; tell the linker how it should be called
   
GLOBAL calcParity@tmp
    
; everything following will be placed into the mytext psect
PSECT mytext,global,class=CODE,delta=2

; our routine to calculate the parity
_asm_calcParity:

; the C wrapper has already stored the byte in calcParity@tmp
swapf calcParity@tmp,w   ; assume correct bank is still selected
xorwf calcParity@tmp,w   ; W has 8 bits reduced to 4 with same parity
addlw 41h                ; bit 1 becomes B0^B1 and bit 7 becomes B6^B7
iorlw 7Ch                ; for carry propagation from bit 1 to bit 7
addlw 2                  ; Done! the parity bit is bit 7 of W
andlw 80h                ; set NZ if odd parity, and leave 00 or 80 in W

; the result is already in the required location (W), so we can
; just return immediately
return

 

 


  • Member

I just had a naughty thought.

If you restrict usage to enhanced devices, presumably our routine is allowed to trash WREG, STATUS and BSR.

What if we just used BSR as our temporary scratch register? It's a core register, so doesn't need banking to access (obviously!), and is automatically preserved when interrupts are serviced. The remaining four instructions are all immediate, so not affected by BSR.

Edit: Scrub that. BSR is only 5 or 6 bits, never 8, so can't be used for our purpose which requires all 8 bits.
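A rough host-side illustration of why that fails. It assumes a 5-bit BSR whose unimplemented upper bits read back as 0; `BSR_MASK`, `scratch_roundtrip` and `count_lost_values` are made-up names for the example, not anything from a device header.

```c
#include <assert.h>
#include <stdint.h>

#define BSR_MASK 0x1Fu  /* assumption: only 5 bits are implemented */

/* Model of writing a byte to the scratch register and reading it back. */
static uint8_t scratch_roundtrip(uint8_t value)
{
    return (uint8_t)(value & BSR_MASK);
}

/* Counts how many 8-bit values do not survive the round trip. */
static int count_lost_values(void)
{
    int lost = 0;
    for (int v = 0; v < 256; v++) {
        if (scratch_roundtrip((uint8_t)v) != (uint8_t)v) {
            lost++;
        }
    }
    return lost;
}
```

Any input with one of the top three bits set comes back changed, so the swapf/xorwf pair would be working on the wrong value for most inputs.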


  • Member

Be careful: you have to tell the compiler if you are going to affect BSR specifically. It tracks the banks and will optimize to minimize bank switches, so changing BSR behind its back can get you into real trouble!

What I like about the proxy/trampoline type approach I posted before is that it forces the banks to be correct because of the compiler managed local variables.


Archived

This topic is now archived and is closed to further replies.

 

