Fast C parity calculations on PIC16F

Question

• Member

This is my attempt to convert the blazingly fast assembler routine for calculating parity into a C function.

The original comes from here: https://www.microchip.com/forums/m4762.aspx

Unfortunately Microchip have killed off the board where the original discussion took place at http://asp.microchip.com/webboard/wbpx.dll/~DevTools/read?21443,5

```
#include <xc.h>

//returns 0x00 (even parity) or 0x80 (odd parity)
unsigned char parity(volatile unsigned char dat)    //("volatile" is required because no C code reads the parameter)
{
    asm("swapf parity@dat,w");    //assume correct bank is still selected
    asm("xorwf parity@dat,w");    //W has 8 bits reduced to 4 with same parity
    asm("addlw 41h");   // bit 1 becomes B0^B1 and bit 7 becomes B6^B7
    asm("iorlw 7Ch");   // for carry propagation from bit 1 to bit 7
    asm("addlw 2");     // Done! the parity bit is bit 7 of W
    asm("andlw 80h");   // set NZ if odd parity, and leave 00 or 80 in W
    asm("return");
    return 1;           //dummy statement to defeat "no return value" error
}

void main(void) {
    unsigned char idx=0;
    while(1)
    {
        PORTA = parity(idx);
        idx++;
    }
}
```

I'm not sure if there's a cleaner way to suppress the "no return value" error, without generating extra code.
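
For anyone who wants to see why the five instructions work, they can be modelled on a host machine in plain C. This is only a sketch: `parity_model`, `parity_reference` and `parity_model_matches_reference` are names made up for this example, `w` stands in for WREG, and nothing here is XC8-specific.

```c
#include <stdint.h>

/* Hypothetical host-side model of the assembler routine.
   Each statement mirrors one PIC instruction; w stands in for WREG. */
uint8_t parity_model(uint8_t dat)
{
    uint8_t w;
    w = (uint8_t)((dat << 4) | (dat >> 4)); /* swapf dat,w : swap nibbles      */
    w ^= dat;                               /* xorwf dat,w : both nibbles = hi^lo */
    w = (uint8_t)(w + 0x41);                /* addlw 41h   : bit1=B0^B1, bit7=B6^B7 */
    w |= 0x7C;                              /* iorlw 7Ch   : fill bits 2..6 with ones */
    w = (uint8_t)(w + 2);                   /* addlw 2     : carry from bit 1 ripples to bit 7 */
    w &= 0x80;                              /* andlw 80h   : keep only the parity bit */
    return w;
}

/* Straightforward reference: XOR all eight bits together. */
uint8_t parity_reference(uint8_t dat)
{
    uint8_t p = 0;
    for (int i = 0; i < 8; i++)
        p ^= (uint8_t)((dat >> i) & 1u);
    return p ? 0x80 : 0x00;
}

/* Returns 1 if the model agrees with the reference for all 256 inputs. */
int parity_model_matches_reference(void)
{
    for (unsigned v = 0; v < 256; v++)
        if (parity_model((uint8_t)v) != parity_reference((uint8_t)v))
            return 0;
    return 1;
}
```

Calling `parity_model_matches_reference()` is a quick way to convince yourself that the carry trick holds for every 8-bit value before trusting it on the target.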

• Member

A suggested C macro is almost as good, but it adds an extra temporary storage location for the first calculation:

`#define PARITY(b)   ((((((b)^(((b)<<4)|((b)>>4)))+0x41)|0x7C)+2)&0x80)`

It also generates a "warning: (752) conversion to shorter data type" on the macro invocation.
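
The warning is probably down to integer promotion: the macro's arithmetic is done in `int`, and the result is then narrowed to 8 bits. One thing that might silence it is an explicit cast. The sketch below is untested on XC8, so whether the cast really removes warning 752 or changes the generated code is an assumption; `PARITY8` and `parity_macro_ok` are names invented for this example.

```c
#include <stdint.h>

/* Macro from the post above, unchanged. */
#define PARITY(b)   ((((((b)^(((b)<<4)|((b)>>4)))+0x41)|0x7C)+2)&0x80)

/* Hypothetical wrapper: the explicit cast makes the int-to-8-bit
   narrowing intentional. Note that the argument is evaluated more than once. */
#define PARITY8(b)  ((uint8_t)PARITY((uint8_t)(b)))

/* Returns 1 if PARITY8 matches a bit-by-bit XOR for every 8-bit value. */
int parity_macro_ok(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint8_t expect = 0;
        for (int i = 0; i < 8; i++)
            expect ^= (uint8_t)((v >> i) & 1u);
        if (PARITY8(v) != (expect ? 0x80 : 0x00))
            return 0;
    }
    return 1;
}
```

The bits above bit 7 that the `int` arithmetic produces never feed back into bit 7, which is why the final `& 0x80` still gives the right answer.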

• Member

What happens when you do "return WREG" as the last line? I think in PRO mode on XC8 that may work and you can remove the return asm line. I can check tomorrow if this works, you can probably check it right now 🙂

• Member
10 minutes ago, Orunmila said:

What happens when you do "return WREG" as the last line? I think in PRO mode on XC8 that may work and you can remove the return asm line. I can check tomorrow if this works, you can probably check it right now 🙂

That was what I did first, and was horrified by the code generated by XC8 2.0 in free mode.

That was my attempt to force it to do it efficiently.

XC8 v1.34 in Pro mode only has one pointless instruction:

```
   309                           ;main.c: 41: return WREG;
   310  07FE  0809               	movf	9,w	;volatile
   311  07FF  0008               	return
```

Std and Free mode have an extra pointless instruction:

```
   320  07FD  0020               	movlb	0	; select bank0
   321  07FE  0809               	movf	9,w	;volatile
   322  07FF  0008               	return
```

and XC8 2.0 in C90 Free mode (Opt 0) is a bit ordinary:

```
  1218  07D4                     ;main.c: 41: return WREG;
  1219                           	movlb 0	; select bank0
  1220  07D4  0020               	movf	(9),w	;volatile
  1221  07D5  0809               	goto	l9
  1222  07D6  2FD7
  1223                           l594:
  1224  07D7                     	line	42
  1225
  1226                           l9:
  1227  07D7                     	return
```

C90 Free / Opt 1 is the same as v1.x in free mode:

```
   541                           ;main.c: 41: return WREG;
   542  07D9  0020               	movlb	0	; select bank0
   543  07DA  0809               	movf	9,w	;volatile
   544  07DB  0008               	return
```

C99 mode generated the same code.

(And that macro generates horrible code if you are not in Pro mode...)

• Member

Sadly, free mode is embarrassing and pro mode is expensive.  It has been for a long time.

• Member

I have been playing for a while with coding functions in ASM which can be called from C to cater specifically to cases like these.

One way to get exactly the code you need in a case like this is to add a .S file to your project and code it like this. It will be accessible as a C function with signature 4217 (passes 1 byte in W, returns 1 byte in W).

I made a small change there to what you had because you assumed the parameter was NOT passed in W, while with XC8 this is the normal way it will be done and if it is passed in W the code seems like it would fail. I just declared a local var and moved w into it as you assumed was happening during the call.

If you want to save that extra instruction and 1 byte of ram you can remove the parameter altogether, and code the ASM code to read it from the source symbol.

```
#include <xc.inc>
GLOBAL _calcParity        ; make _calcParity globally accessible
SIGNAT _calcParity,4217   ; tell the linker how it should be called

GLOBAL    calcParity@tmp

PSECT cstackCOMMON,class=COMMON,delta=1,space=1
calcParity@tmp ds 1

; everything following will be placed into the mytext psect
PSECT mytext,global,class=CODE,delta=2

; our routine to calculate the parity
_calcParity:

    ; W is loaded by the calling function
    movwf calcParity@tmp
    swapf calcParity@tmp,w   ; assume correct bank is still selected
    xorwf calcParity@tmp,w   ; W has 8 bits reduced to 4 with same parity
    addlw 41h                ; bit 1 becomes B0^B1 and bit 7 becomes B6^B7
    iorlw 7Ch                ; for carry propagation from bit 1 to bit 7
    addlw 2                  ; Done! the parity bit is bit 7 of W
    andlw 80h                ; set NZ if odd parity, and leave 00 or 80 in W

    ; the result is already in the required location (W) so we can
    ; just return immediately
    return
```

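You can then call that from C like so:

```
#include <xc.h>
#include <stdint.h>

uint8_t calcParity(uint8_t dat);

void main(void) {
    volatile uint8_t c = 0;

    c = calcParity(0x01);   // 1 bit set,  odd parity:  expect 0x80
    c = calcParity(0x11);   // 2 bits set, even parity: expect 0x00
    c = calcParity(0x03);   // 2 bits set, even parity: expect 0x00
    c = calcParity(0x13);   // 3 bits set, odd parity:  expect 0x80
}
```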

Not sure if this helps you.

We have started a conversation with the compiler team to see what they think about optimizing out movf w,w, since that never makes sense (although it is a trick used to clear status flags in some cases...). Lots of people are still on vacation until Monday, so we will have to be patient for their input.

• Member
1 hour ago, Orunmila said:

One way to get exactly the code you need in a case like this is to add a .S file to your project and code it like this. It will be accessible as a C function with signature 4217 (passes 1 byte in W, returns 1 byte in W).

I made a small change there to what you had because you assumed the parameter was NOT passed in W, while with XC8 this is the normal way it will be done and if it is passed in W the code seems like it would fail. I just declared a local var and moved w into it as you assumed was happening during the call.

Very nice.

You're right about the bug though. I think I got conned by C99 mode with no optimisation. Normally the parameter is just in W, so you do need to create a temporary variable and save it as you did.

Is your tmp variable essentially static? i.e. are other functions at the same level able to overlay that location?

Quote

If you want to save that extra instruction and 1 byte of ram you can remove the parameter altogether, and code the ASM code to read it from the source symbol.

Now THIS sounds interesting. How would you do that?

This function does not alter the scratch location, it just needs to be able to read it.

• Member

I am still learning how the linker interprets things. This is using the compiled stack and the linker should have all the information it needs to re-use the location but I have no idea if it will.

EDIT: Actually the more I look at that the more I think it will not re-use the memory :(. It is just saying reserve 1 byte of common ram, not associated with the function call.

I have been playing with proxy functions, and I was hoping that once they fix inlining in the compiler that this would give a good path to making ASM functions with a proxy wrapper like this.

This way the C function gets all the bells and whistles that C provides in terms of stack management, and you can use SW or compiled stack if you please, or even just make this function reentrant, and it all still works. It gets really complicated when you inline the c function and use it more than once, but if you could coax the linker into inlining the ASM function since it is called only once ever, this would be a really neat solution.

As things stand now you end up with an extra call and return, which is really a 4-instruction penalty. If you are doing this as a speed optimization, that sucks; if you are doing it to access some fancy ASM code, then it works nicely.

```
#include <stdint.h>

uint8_t asm_calcParity();

uint8_t calcParity(uint8_t c)    // 8-bit parameter, so it arrives in W
{
    volatile uint8_t tmp;    // This generates no code but reserves the temp stack space
    tmp = c;                 // This simply does a MOVWF tmp

    return asm_calcParity(); // And since this is returning what another function returned in W it should just generate a CALL and RETURN
}
```

Then the ASM just becomes this:

```
#include <xc.inc>
GLOBAL _asm_calcParity        ; make _asm_calcParity globally accessible
SIGNAT _asm_calcParity,4217   ; tell the linker how it should be called

GLOBAL calcParity@tmp

; everything following will be placed into the mytext psect
PSECT mytext,global,class=CODE,delta=2

; our routine to calculate the parity
_asm_calcParity:

    ; the input is read from calcParity@tmp, which the C proxy has already written
    swapf calcParity@tmp,w   ; assume correct bank is still selected
    xorwf calcParity@tmp,w   ; W has 8 bits reduced to 4 with same parity
    addlw 41h                ; bit 1 becomes B0^B1 and bit 7 becomes B6^B7
    iorlw 7Ch                ; for carry propagation from bit 1 to bit 7
    addlw 2                  ; Done! the parity bit is bit 7 of W
    andlw 80h                ; set NZ if odd parity, and leave 00 or 80 in W

    ; the result is already in the required location (W) so we can
    ; just return immediately
    return
```

• Member

I just had a naughty thought.

If you restrict usage to enhanced devices, presumably our routine is allowed to trash WREG, STATUS and BSR.

What if we just used BSR as our temporary scratch register? It's a core register, so doesn't need banking to access (obviously!), and is automatically preserved when interrupts are serviced. The remaining four instructions are all immediate, so not affected by BSR.

Edit: Scrub that. BSR is only 5 or 6 bits, never 8, so can't be used for our purpose which requires all 8 bits.
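
To make the width problem concrete, here is a rough host-side sketch that truncates the scratch value before running the same steps. It only models the lost upper bits, not real BSR behaviour, and `parity_via_scratch` is a made-up name.

```c
#include <stdint.h>

/* Hypothetical model: store dat in a scratch register that keeps only the
   low `width` bits (1..8), then run the same five steps on what was stored. */
uint8_t parity_via_scratch(uint8_t dat, unsigned width)
{
    uint8_t mask = (uint8_t)((1u << width) - 1u);
    uint8_t scratch = (uint8_t)(dat & mask);                 /* movwf scratch: upper bits lost */
    uint8_t w = (uint8_t)((scratch << 4) | (scratch >> 4));  /* swapf scratch,w */
    w ^= scratch;                                            /* xorwf scratch,w */
    w = (uint8_t)(w + 0x41);                                 /* addlw 41h */
    w |= 0x7C;                                               /* iorlw 7Ch */
    w = (uint8_t)(w + 2);                                    /* addlw 2   */
    return (uint8_t)(w & 0x80);                              /* andlw 80h */
}
```

With a full 8-bit scratch, `parity_via_scratch(0x80, 8)` gives 0x80 as expected. With a 5-bit scratch, `parity_via_scratch(0x80, 5)` gives 0x00, because the only set bit never makes it into the register.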

• Member

Be careful: you have to tell the compiler if you are going to affect BSR specifically. It tracks the banks and will optimize to minimize bank switches, so changing BSR behind its back can get you into real trouble!

What I like about the proxy/trampoline type approach I posted before is that it forces the banks to be correct because of the compiler managed local variables.

Archived

This topic is now archived and is closed to further replies.
