Should we be avoiding Magic Numbers...

N9WXU · November 16, 2019

I have been told many times that we should avoid the use of magic numbers in our code. That is numbers like these:

CMCON0 = 0x5d;

for(x = 0; x < 29; x ++)
{
  // do something clever with x
}

The rational is that we don't really know what 0x5D means when it is written to CMCON0 or why we should be stopping at 28 in the for loop. However if you made the code like this:

#define CMCON0_PWM_CONFIGURATION <PWM_MAGIC_NUMBERS> // See Datasheet Chapter 3.4 on PWM configuration
#define MAXIMUM_PLAYER_COUNT 29
  
CMCON0 = CMCON0_PWM_CONFIGURATION;
  
for(x = 0; x < MAXIMUM_PLAYER_COUNT; x ++)
{
  // do something with each player
}

Clearly this code is much more readable. However, there is a time when this practice is simply crazy. For example, If you only configure the PWM in a single place, you may not want to hunt around for the exact value placed in CMCON because you have the datasheet open in front of you so the right place for the exact value is right there on that line. Putting the value in one place to use in another place just makes more work.

Or what about this example:

// Found in a configuration header file
#define ZERO_VALUE 0
#define ONE_VALUE 1
#define TWO_VALUE 2
#define THREE_VALUE 3
#define FOUR_VALUE 4
#define FIVE_VALUE 5
#define SIX_VALUE 6
#define SEVEN_VALUE 7


// Found later in the C code
switch(device_number)
{
  case ZERO_VALUE:
  	<blah blah blah>
     break;
  case ONE_VALUE:
  <ETC>
}

I would LOVE a rational explanation on why doing this is little more than cargo culting a coding style.

I have seen this sort of sillyness in many other places.

// is this better?
delay_ms(1000);

// or this
delay_ms(ONE_SECOND);

// or perhaps this
inline void delay_One_Second(void) { delay_ms(1000); }

At the end of the day, I want to see numbers where they make sense. The MOST important reason for using the NO_MAGIC_NUMBER pattern is really NOT to clarify the numbers, but really to make the code DRY. For example consider this:

#define MESSAGE_TIMEOUT 1000


// later
delay_ms(MESSAGE_TIMEOUT);


// or even better
void delay_message_timeout(void);

If you are going to use the MESSAGE_TIMEOUT in a large number of places and especially if you may be tweaking the value, then putting the number in one place is only sane. But the BEST way is to make the function because that is the MOST clear when you read the final code and it creates a very nice point of abstraction for any number of delay methods.

The biggest argument FOR the anti-magic number pattern is really about documentation. In a simple way, the proponents of this pattern are trying to follow the "comments are a smell" pattern but usually without realizing it. In this way, a standin for the number can be used that makes the purpose clear. This is a powerful point and should be considered carefully. But, sometimes, the clearest value to put in the code is actually the MAGIC number. AND, if you don't understand that number, you have no business editing that part of the code.

As always, be prudent and understand the needs of the programmers who will follow. That programmer is likely to be you.

Good Luck.

holdmybeer · November 19, 2019

On 11/16/2019 at 5:46 AM, N9WXU said:

CMCON0 = 0x5d;

Best example for unreadable code, you have to have the datasheet open and do some calculations, which might be the first source of bugs. To avoid that, I often find engineers putting comments on top of that line to write down their calculation:

// comparator configuration
// bit 6 = C1OUT, bit 4 = C1INV => 0101
// bit 3 = CIS, switch to RA0/AN0
// CM<2:0> = 101, one independent comparator
CMCON0 = 0x5d;

However, this is there the trouble really begins, because the comment can now easily desynchronize with the actual code, because someone had to fix something in a hurry:

// comparator configuration
// bit 6 = C1OUT, bit 4 = C1INV => 0101
// bit 3 = CIS, switch to RA0/AN0
// CM<2:0> = 101, one independent comparator
CMCON0 = 0x55;

Where is the bug now? Does the CIS bit have to be switched on according to the comments? Why did the author disable it? Should the comment be updated or left as it is, so that someone with knowledge about it can work on that later?

In my opinion, putting the magic number into a #define MAGIC is just a useless layer which simply moves the problem, but may satisfy (company?) guidelines..

So for this particular case of register initialization, I love using Device Family Packs (DFP):

SERCOM4_REGS->USART_INT.SERCOM_CTRLA =
  SERCOM_USART_INT_CTRLA_MODE(1) |
  SERCOM_USART_INT_CTRLA_RXPO(3) |
  SERCOM_USART_INT_CTRLA_TXPO(1) |
  SERCOM_USART_INT_CTRLA_DORD(1);

Essentially this is a datasheet in a header file and the manufacturer already did the hard work of defining every bit along with a mask and a name.

You still have to know your business if you going to edit this line, but you don't have to fiddle with bare numbers. And the best thing: It all boils down to a constant number before compiling starts.

N9WXU · November 19, 2019

Great points. So perhaps there are some unique considerations to the magic number story when dealing directly with HW.

I too have frequently seen the comments diverge from the code and your simple "fix" definitely demonstrates that point.

Switching to some sort of macro or enumerations and "building" the constant for the register is only a partial help. For example:

INTCON = _INTCON_IOCIF_MASK | _INTCON_IOCIE_MASK | _ADCON0_nDONE_MASK;

This example is a bit contrived because Microchip now includes the register name and the bit name and the type of constant (mask). However I still see code where various datasheet constants are used but the language provides no guarantee that the constant belongs to a specific register. Note the bug above is ORing in the ADCON0 constant with the INTCON constants. This will compile clean and might even work if the bit happens to be correct.

These sorts of constructs in the HW code are really too tactical. We get caught up with the details setting some bit in a register and that should NEVER the goal of the function we are writing.

void startPWM(void)
{
	<blah blah blah>
}

If my function is startPWM, then it is clear what the need is. If I decided to use the CCP as that PWM, or a timer interrupt, either choice will work. Now I have the beginnings of a hardware abstraction layer.

After much debate, MCC chose the following pattern:

void EPWM1_Initialize (void)
{
    // Set the PWM to the options selected in PIC10 / PIC12 / PIC16 / PIC18 MCUs 
    
    // CCP1M P1A,P1C: active high; P1B,P1D: active high; DC1B 3; P1M single; 
    CCP1CON = 0x3C;
    
    // CCP1ASE operating; PSS1BD0 low; PSS1AC0 low; CCP1AS0 disabled; 
    ECCP1AS = 0x00;
    
    // P1RSEN automatic_restart; P1DC0 0; 
    PWM1CON = 0x80;
    
    // STR1D P1D_to_port; STR1C P1C_to_port; STR1B P1B_to_port; STR1A P1A_to_CCP1M; STR1SYNC start_at_begin; 
    PSTR1CON = 0x01;
    
    // CCPR1L 127; 
    CCPR1L = 0x7F;
    
    // CCPR1H 0; 
    CCPR1H = 0x00;
    
    
    // Selecting Timer2
    CCPTMRS0bits.C1TSEL = 0x0;
}

This code has both magic numbers and an explanation. The reasons were as follows:

Maximum performance. Slamming in a full value is always the fastest choice.
Explain the choice with a comment. Magic values are fast and simple. The machine generated comment will be correct to the datasheet but will not convey "why". We can't know why so this is the next best thing.

MCC is NOT a hardware abstraction layer but it does give a quick setup to many common functions. If you are really writing a HAL targeting your application, you will have more application defined interface functions and what happens inside the interface functions is very isolated. I would argue that if you tried to edit those functions WITHOUT the datasheet open, you must really know that part. ONe example is clearing an interrupt flag on a PIC16 is a simple as writing a 0 to the bit. Clearing the interrupt flag on an AVR requires writing a 1 to the bit. No amount of special macros and enumerations will make the code look "correct" to a both AVR freak and a PIChead. Each choice is correct for the architecture for sound technical reasons but you should not expect the HW behavior to be as obvious as which bit to select.

So I would argue that even unsupported Magic Numbers in the hardware abstraction layer is perfectly acceptable. Please reference the datasheet chapter & verse so I can see what you were doing to the hardware. A bit of explanation documenting what the function is accomplishing and any clever bits in the HW you were taking advantage of would be very helpful.

As a parting note, consider the CLC. There are LOTS of bits and the WHY is critical. Once you have drawn the schematic of the CLC circuit and decided how the bits are going to be set, just put the values in the right spots and don't worry about trying to create special macros.

Good Luck.

PS: Don't assume that I am 100% for magic numbers in the code. I just don't like avoiding them when they are the shortest, fastest code, and (once the datasheet is open) the easiest to debug.

Orunmila · November 19, 2019

I do not like the comments describing the magic number's value, I much prefer the comments describing the register here, to help you not having to go to the datasheet.

The comments should make the static information like the register design clear here so that the reader of the code can understand what the number means by just reading the comment. This is subtly different from the comment claiming the meaning. If you do this the comments will not go out of sync with the code and also help you if you want to change the value to something it is not currently set to.

Comments should describe WHY we are doing things, these motivations should be design, assumptions or external constraints.

N9WXU · November 19, 2019

Comments describing the desired affect of the register value is ideal.

So here is an example.

 void startMotorControlPWM(void)
{
  // configuring TMR2 for 8-bit and 8.3khz PWM
  // See Requirements section 4.3.2.1
  T2CON = 0xXX; // TMR2 1:1 prescaler
  PR2 = 0x4F; // reload at 6-bits (2 bits implied) produces 8-bit PWM
  CCP1CON = 0xXX; // Using CCP1 for PWM with TMR2
  CCPR1L = 0xXX; // initial PWM at 25% for a slow motor rotation.
}

Note that magic numbers are used... but the explanations make the intent clear. And the necessary requirements are referenced for trace ability.

When your future self is debugging, you can verify the intent to the requirements and then verify the magic numbers against the datasheet. Making changes during debugging would be done right here with the magic numbers and that will be clear and easy because the datasheet register maps will be handy. Once this code is working, the top level function name is easy and clear.

Orunmila · November 19, 2019

If there is an explanation of the numbers right there they are not rightfully magical anymore 🙂

I can embrace that kind of description, as long as you have the discipline to keep the comments and the values in sync.

But I still prefer describing the register instead of the value, if I want to change the prescaler or the CCP1CON value I now need to reach for the datasheet (although the convenient reference will tell me where!) I would prefer having the entire CCP1CON described here but especially on 32bit machines this can be entirely unpractical and then I will be OK with that scheme.

N9WXU · November 19, 2019

As tempting as it is to duplicate the datasheet in your code, there is much to dislike about this strategy.

Single Source of Truth should be the manufacturers datasheet... Not what you copied into the code.
The signal to noise ratio of the code will suck, making debugging more challenging.
The register description is not always enough so this slippery slope will have you copying the entire peripheral chapter.
Your future maintainer will not have your knowledge of the device so they will need the datasheet anyway.

holdmybeer · November 20, 2019

I wouldn't duplicate the datasheet either. But isn't describing the calculation already the same? To clarify, both suggestions in an example:

Describe the calculation:

// comparator configuration
// bit 6 = C1OUT, bit 4 = C1INV => 0101
// bit 3 = CIS, switch to RA0/AN0
// CM<2:0> = 101, one independent comparator
CMCON0 = 0x5d;

Describe the register:

// comparator configuration
// bit 7 = C2OUT
// bit 6 = C1OUT
// bit 5 = C2INV
// bit 4 = C1INV
// bit 3 = CIS
// bit <2:0> = comparator mode
CMCON0 = 0x5d;

The latter has the advantage of not getting out of sync, but is a tedious and error-prone work. And if you do the work, why not describe it with #define and put into a header file to re-use it for other projects?

The datasheet is just some not-so-formal syntax to describe something we need to compile in our brains to C source code each time we like to do something with peripheral registers. On 8-bit machines, everything can be easily calculated manually. But by Murphy's law, you will make an error.

The manufacturer as a single point of truth (and errata) should supply every developer with usable DFPs, which already contain tested and verified definitions to use. In fact, this is the formal version of the datasheet.

32-bit world now, what does this mean? Do I need to write comments (flavor A or B from above) for every register now? The SERCOM module has more than 10 registers, EVSYS summary table is several pages long..

SERCOM0->USART.CTRLA.reg = 0x40310006;

Wait, I've already used the DFPs to access the register:

// accessing SERCOM, base address 0x42000400, offset for CTRLA is 0x00
(* (uint32_t*) (0x42000400 + 0x00)) = 0x40310006;

😉

I don't see the point why I as a customer should bother this these numbers for the sake of clarity. In fact, the left side of the statement already is a symbol to access and nobody complains or starts to comment about register and offset calculations, so why would we for the right side?

N9WXU · November 20, 2019

But in all of your examples you are not telling me why you are doing that bit of work. I cannot possibly determine if there is a bug if I don't know why you are configuring the SERCOM with that particular value. How about simply saying:

void configureSerialForMyDataLink(void)
{
    // datalink specifications found in specification 4.3.2
    // using SERCOM0 as follows:
    //  - Alternate Pin x,y,z
    //  - 9600 baud
    //  - half duplex
    //  - SamD21 datasheet page 26 for specifics
	SERCOM0 = <blah blah blah>;
}

Now you know why. You have a function that has a clear purpose. And if the link is invalid, you can see the intent. The specifics of the bits are in the datasheet and clearly referenced. No magic here.

As for the special access mode for performance...

inline void SERCOM0_WRITE(uint32_t ControllOffset, uint32_t Value)
{
  	// Accessing the SERCOM via DFP offsets for high performance
	(* (uint32_t*) (0x42000400 + ControlOfset)) = Value;
}

Now a future engineer has a handy helper and the details are nicely removed. And an interested engineer can debug it because the intent is clear. Obviously you need to be a DFP expert (or have the datasheet) to understand/edit it. But no magic.

But the application should NEVER use this helper. It should be buried in the HAL. The first function is much more clear for the HAL because it conveys application level intent. i.e. the Application will rarely care about the SERCOM and will always care about its DataLink. If I port the code to something without a SERCOM, the application will still need a DataLink so this function will simply be refilled with something suitable for the other CPU. The application remains unchanged.

holdmybeer · November 20, 2019

Well, yes. My code example didn't show the WHY, because the point was to discuss about the avoidance of magic numbers as a mean to be an end in itself.

This is non-exclusive:

void UART_init(void) {
	// Initialize SERCOM link
	// for details, see datasheet (DS60001507E) p. 944, 34.8.1
	// use DORD to change bit order, choose pins with RXPO/TXPO
    SERCOM4_REGS->USART_INT.SERCOM_CTRLA =
      SERCOM_USART_INT_CTRLA_MODE(1) |
      SERCOM_USART_INT_CTRLA_RXPO(3) |
      SERCOM_USART_INT_CTRLA_TXPO(1) |
      SERCOM_USART_INT_CTRLA_DORD(1);
}

Since this is mostly precompiler stuff, no speed or code size is sacrificed. DFPs help me to write human-readable register initializations without making a calculation error in my mind. If someone else wants to change this later (switch RXPO for example), he doesn't have to decipher the actual value of RXPO from my magic value (first source of error) and doesn't have to recipher it into a new magic value (second source). This "someone else" could also be me after having an argument with my project leader.

And finally I don't need to maintain clarifying comments. I could read the statement aloud and would be still comprehensible what I'm trying to accomplish.

The IDE also helps a lot with getting the right symbols. The XC32 team has done a nice job having a logic in the names, so if I start typing "SERCOM_UART_" and hit <CTRL> - <SPACE>, I'll get a selection which fits, so I don't need to remember the exact name. If I'm unsure if this is the right value, I just hold <CTRL> and left click on the symbol. MPLABX then jumps right into the include header, letting me verify my decision:

#define SERCOM_USART_INT_CTRLA_RXPO_Pos       _U_(20)                                              /**< (SERCOM_USART_INT_CTRLA) Receive Data Pinout Position */
#define SERCOM_USART_INT_CTRLA_RXPO_Msk       (_U_(0x3) << SERCOM_USART_INT_CTRLA_RXPO_Pos)        /**< (SERCOM_USART_INT_CTRLA) Receive Data Pinout Mask */
#define SERCOM_USART_INT_CTRLA_RXPO(value)    (SERCOM_USART_INT_CTRLA_RXPO_Msk & ((value) << SERCOM_USART_INT_CTRLA_RXPO_Pos))

N9WXU · November 21, 2019

Excellent points and I think we are almost completely in agreement.

I have only four complaints with these helper macros and two of them are relatively minor.

Consider the following snapshot from ATMEL Studio in a SAMD21 project

the <CTRL><SPACE> pattern works great when you know the keyword. If you simply start with SERCOM, you get a number of matches that are not UART related.
ALL the choices will compile. See the BAUD value placed in the CTRLA register...
Some of these choices are used like macros...And others are not (CMODE)
Placing an invalid value in these macros is completely reasonable and will compile.
MISRA 19.7 disallows function-like macros. (though this is not really an issue because it does not apply in this situation.)

So, these constructs are handy because they help prevent a certain sort of bookkeeping error related to sorting out all of these offsets.

But the constructs allow range errors and semantic errors.

Since we are talking about the SERCOM and I am using the SAMD21 in my example. Here is what START produces.

	hri_sercomusart_write_CTRLA_reg(
	    SERCOM0,
	    1 << SERCOM_USART_CTRLA_DORD_Pos           /* Data Order: enabled */
	        | 0 << SERCOM_USART_CTRLA_CMODE_Pos    /* Communication Mode: disabled */
	        | 0 << SERCOM_USART_CTRLA_FORM_Pos     /* Frame Format: 0 */
	        | 0 << SERCOM_USART_CTRLA_SAMPA_Pos    /* Sample Adjustment: 0 */
	        | 0 << SERCOM_USART_CTRLA_SAMPR_Pos    /* Sample Rate: 0 */
	        | 0 << SERCOM_USART_CTRLA_IBON_Pos     /* Immediate Buffer Overflow Notification: disabled */
	        | 0 << SERCOM_USART_CTRLA_RUNSTDBY_Pos /* Run In Standby: disabled */
	        | 1 << SERCOM_USART_CTRLA_MODE_Pos);   /* Operating Mode: enabled */

This is completely different than the other examples provided. This style is defined in hri_sercom_d21.h and a word to the wise, DO NOT BROWSE THIS INSIDE OF ATMEL START, This file must be huge as the page is completely frozen as it populates the source viewer.

So I do like the construct, but often it does not help me very much because I must still go through the entire register and decide my values. When I string these values together any mistakes made will be sorted at once.

All that aside, I don't really care much about the HW initialization. I want it to be short, to the point, and perfectly clear about what is placed in each register. In a typical project, only 1% of the code is some sort of HW work as you develop the application specific HW interfaces and bring up your board. Once these are tested, you are not likely to spend any more time on them. In a 9 month project, I expect to spend 2 weeks in the HW functions so if a magic value is clear and to the point, use it. If you want to construct your values with logical operations. Go for it.

holdmybeer · November 21, 2019

I agree with you complaints. You can mitigate the probability of making errors, but you can't get it to zero.

2 hours ago, N9WXU said:

the <CTRL><SPACE> pattern works great when you know the keyword. If you simply start with SERCOM, you get a number of matches that are not UART related.

ALL the choices will compile. See the BAUD value placed in the CTRLA register...

Some of these choices are used like macros...And others are not (CMODE)

Placing an invalid value in these macros is completely reasonable and will compile.

MISRA 19.7 disallows function-like macros. (though this is not really an issue because it does not apply in this situation.)

1. True, you have to continue typing to at least SERCOM_[register]_ to get more meaningful suggestions. The good thing is, there is a pattern in all header files. In doubt, follow to the definition in the header files, they are relatively small.

2. It will compile, as any magical number will compile. But you will at least have a chance to find a suspicious looking symbol by carefully reading through the statement.

3. This is done much better with the new DFPs. Have a close look in the XC32 compiler directory. You'll find an /include directory AND an /include_mcc! The first directory is the legacy code. By default, the new definitions will be used.

4. Similar to 2.

5. That is a great point, it is indeed not MISRA compliant. I'd guess that an exception is made then, but it would be interesting how this is handled in larger companies.

Don't mix up Atmel START (ASF libraries) with DFPs. I personally don't like to use START, too much layer-over-layer. If you use it, you have to fully rely on the code underneath all the layers. And often enough, there is a bug hidden in some hri function. At this point, ASF doesn't save time anymore.

Should we be avoiding Magic Numbers...

Recommended Posts

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Link to comment

Share on other sites

Join the conversation

Popular Contributors