How To Use Registers In Embedded C
When working with peripherals, we need to be able to read and write to the device's internal registers. How we accomplish this in C depends on whether we're working with memory-mapped IO or port-mapped IO. Port-mapped IO typically requires compiler/language extensions, whereas memory-mapped IO can be accommodated with the standard C syntax.
Embedded "Hello, World!"
We all know the embedded equivalent of the "Hello, world!" programme is flashing the LED, so true to form I'm going to use that as an example.
The examples are based on an STM32F407 chip using the GNU Arm Embedded Toolchain.
The STM32F4 uses a port-based GPIO (General Purpose Input Output) model, where each port can manage 16 physical pins. The LEDs are mapped to external pins 55-58, which map internally onto GPIO Port D pins 8-11.
Flashing the LEDs
Flashing the LEDs is fairly straightforward; at the port level there are only two registers we are interested in.
- Mode Register: this defines, on a pin-by-pin basis, what each pin's function is, e.g. we want this pin to behave as an output pin.
- Output Data Register: writing a '1' to the appropriate pin will generate voltage and writing a '0' will ground the pin.
Mode Register (MODER)
Each port pin has four modes of operation, thus requiring two configuration bits per pin (pin 0 is configured using mode bits 0-1, pin 1 uses mode bits 2-3, and so on):
- 00 Input
- 01 Output
- 10 Alternative function (details configured via other registers)
- 11 Analogue
So, for example, to configure pin 8 for output, we must write the value 01 into bits 16 and 17 in the MODER register (that is, bit 16 => 1, bit 17 => 0).
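The bit arithmetic above generalises to any pin. The following is a minimal sketch, not vendor code: the enum and the function name `moder_with_pin_mode` are hypothetical, and the function works on a plain value so it can be tried on a host machine before being applied to the real register.

```c
#include <stdint.h>

/* Hypothetical names for the four 2-bit mode encodings listed above. */
enum pin_mode {
    PIN_MODE_INPUT     = 0x0u, /* 00 */
    PIN_MODE_OUTPUT    = 0x1u, /* 01 */
    PIN_MODE_ALTERNATE = 0x2u, /* 10 */
    PIN_MODE_ANALOGUE  = 0x3u  /* 11 */
};

/* Return a copy of `moder` with the two mode bits for `pin` (0-15)
 * replaced by `mode`; every other pin's bits are left unchanged. */
uint32_t moder_with_pin_mode(uint32_t moder, unsigned pin, enum pin_mode mode)
{
    unsigned shift = pin * 2u;            /* pin n uses bits 2n and 2n+1 */
    moder &= ~(UINT32_C(3) << shift);     /* clear both bits of the field */
    moder |= ((uint32_t)mode << shift);   /* write the new 2-bit value */
    return moder;
}
```

For pin 8 the shift is 16, so `moder_with_pin_mode(0, 8, PIN_MODE_OUTPUT)` sets bit 16 and leaves bit 17 clear, matching the description above.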
Output Data Register (ODR)
In the Output Data Register (ODR) each bit represents an I/O pin on the port. The bit number matches the pin number.
If a pin is set to output (in the MODER register) then writing a 1 into the appropriate bit will drive the I/O pin high. Writing 0 into the appropriate bit will drive the I/O pin low.
There are 16 IO pins, but the register is 32 bits wide. The reserved bits (16-31) are read as '0'.
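As a small illustration of the one-bit-per-pin layout, here is a sketch of a helper that computes a new ODR value. The names `ODR_PIN_MASK` and `odr_with_pin` are made up for this example; masking the result is simply one way of keeping the reserved upper half at zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Only bits 0-15 of the ODR are backed by pins; bits 16-31 are reserved. */
#define ODR_PIN_MASK UINT32_C(0x0000FFFF)

/* Return a copy of `odr` with the bit for `pin` (0-15) set or cleared. */
uint32_t odr_with_pin(uint32_t odr, unsigned pin, bool high)
{
    uint32_t bit = UINT32_C(1) << pin;   /* bit number matches pin number */
    odr = high ? (odr | bit) : (odr & ~bit);
    return odr & ODR_PIN_MASK;           /* keep the reserved bits at zero */
}
```

Driving pin 8 high on an otherwise idle port therefore yields the value 0x100.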
Port D Addresses
The absolute addresses for the MODER and ODR of Port D are:
- MODER: 0x40020C00
- ODR: 0x40020C14
Pointer access to registers
Typically, when we access registers in C based on memory-mapped IO, we use a pointer notation to 'trick' the compiler into generating the correct load/store operations at the absolute address needed.
So for Port D we might see something along the lines of the following (I'll keep the code brief and use magic numbers for simplicity):
#include <stdint.h>

volatile uint32_t* const portd_moder = (uint32_t*) 0x40020C00;
volatile uint32_t* const portd_odr = (uint32_t*) 0x40020C14;

extern void sleep(uint32_t ms); // use systick to busy-wait

int main(void)
{
    uint32_t moder = *portd_moder;
    moder |= (1 << 16);
    moder &= ~(1 << 17);
    *portd_moder = moder;

    while(1) {
        *portd_odr |= (1 << 8); // led-on
        sleep(500);
        *portd_odr &= ~(1 << 8); // led-off
        sleep(500);
    }
}
Alternatively we may see the registers defined using the preprocessor, e.g.
#include <stdint.h>

#define PORTD_MODER (*((volatile uint32_t*) 0x40020C00))
#define PORTD_ODR (*((volatile uint32_t*) 0x40020C14))

extern void sleep(uint32_t ms); // use systick to busy-wait

int main(void)
{
    uint32_t moder = PORTD_MODER;
    moder |= (1 << 16);
    moder &= ~(1 << 17);
    PORTD_MODER = moder;

    while(1) {
        PORTD_ODR |= (1 << 8); // led-on
        sleep(500);
        PORTD_ODR &= ~(1 << 8); // led-off
        sleep(500);
    }
}
There is a misconception amongst many C programmers that the pointer model is less efficient than the #define model. With C99 and modern compilers this is not the case; they will generate identical code (C99 allows the compiler to optimise away const objects).
Enabling Port D
We are missing one final step: each peripheral on the STM32F407 is clock gated. The clock signal does not reach the peripheral until we tell it to do so by way of setting a bit in a specific register. By default, clock signals never reach peripherals that are not in use, thus saving power.
To enable the clock to reach GPIO port D, the GPIODEN (GPIO D Enable) bit (bit 3) of the AHB1ENR (AMBA High-performance Bus 1 Enable) register in the RCC (Reset and Clock Control) peripheral needs setting.
#include <stdint.h>

volatile uint32_t* const portd_moder = (uint32_t*) 0x40020C00;
volatile uint32_t* const portd_odr = (uint32_t*) 0x40020C14;
volatile uint32_t* const rcc_ahb1enr = (uint32_t*) 0x40023830;

extern void sleep(uint32_t ms); // use systick to busy-wait

int main(void)
{
    *rcc_ahb1enr |= (1 << 3); // enable PortD's clock

    uint32_t moder = *portd_moder;
    moder |= (1 << 16);
    moder &= ~(1 << 17);
    *portd_moder = moder;

    while(1) {
        *portd_odr |= (1 << 8); // led-on
        sleep(500);
        *portd_odr &= ~(1 << 8); // led-off
        sleep(500);
    }
}
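The ordering matters: the clock is enabled first, and only then is the mode register configured. One way to exercise that sequence away from the hardware is to pass the register addresses in as parameters. The sketch below assumes a hypothetical helper named `portd_led_init`; it is an illustration, not part of the code above.

```c
#include <stdint.h>

/* Hypothetical helper: performs the same two set-up steps as main() above,
 * but receives the register addresses instead of hard-coding them. */
void portd_led_init(volatile uint32_t *ahb1enr, volatile uint32_t *moder)
{
    *ahb1enr |= (UINT32_C(1) << 3);   /* GPIODEN: let the clock reach Port D */

    uint32_t mode = *moder;
    mode |=  (UINT32_C(1) << 16);     /* pin 8 mode field = 01 (output) */
    mode &= ~(UINT32_C(1) << 17);
    *moder = mode;
}
```

On the target this would be called as `portd_led_init(rcc_ahb1enr, portd_moder);`. On a host it can be pointed at two ordinary `uint32_t` variables to check that only bit 3 of the enable word and bits 16-17 of the mode word change.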
Using structs
The code so far works just fine, but has a number of shortcomings.
First, to support multiple IO ports we would have to define a set of pointers for each set of registers for each port, e.g.:
volatile uint32_t* const porta_moder = (uint32_t*) 0x40020000;
volatile uint32_t* const porta_odr = (uint32_t*) 0x40020014;
volatile uint32_t* const portb_moder = (uint32_t*) 0x40020400;
volatile uint32_t* const portb_odr = (uint32_t*) 0x40020414;
volatile uint32_t* const portc_moder = (uint32_t*) 0x40020800;
volatile uint32_t* const portc_odr = (uint32_t*) 0x40020014;
volatile uint32_t* const portd_moder = (uint32_t*) 0x40020C00;
volatile uint32_t* const portd_odr = (uint32_t*) 0x40020C14;
volatile uint32_t* const porte_moder = (uint32_t*) 0x40021000;
volatile uint32_t* const porte_odr = (uint32_t*) 0x40021014;
Because the port actually has 10 different registers we may want to access, this involves a lot of repetition. Where there is repetition, bugs that are simple to make but difficult to track down can creep in (did you spot the deliberate mistake?).
In addition, and more significantly, we can see that the port's ODR is always 0x14 bytes offset from the MODER. The MODER is always at offset 0x00 from the port address (thus the MODER address is also the port's base address).
In Software Engineering terms we'd view this separate declaration of related pointers as a lack of cohesion in the code. One of our goals is to strive for high cohesion, thus grouping things together that should naturally be together (as change affects them all).
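The regularity described here can be written down as a formula: register address = Port A base + port index × port spacing + register offset. The sketch below uses hypothetical macro and function names; the 0x400 spacing is read off the port base addresses in the listing above.

```c
#include <stdint.h>

#define GPIO_BASE         UINT32_C(0x40020000) /* Port A base address */
#define GPIO_PORT_STRIDE  UINT32_C(0x400)      /* spacing between port bases */
#define GPIO_MODER_OFFSET UINT32_C(0x00)
#define GPIO_ODR_OFFSET   UINT32_C(0x14)

/* Address of a register in port `index` (0 = A, 1 = B, ... 4 = E). */
uint32_t gpio_reg_address(unsigned index, uint32_t offset)
{
    return GPIO_BASE + (uint32_t)index * GPIO_PORT_STRIDE + offset;
}
```

For example, `gpio_reg_address(3, GPIO_ODR_OFFSET)` gives 0x40020C14, the Port D ODR address used earlier. Computing addresses this way, rather than typing each one by hand, removes the opportunity for copy-and-paste slips.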
struct Overlay
The full register layout for the STM32F4 GPIO port is shown below:
By using a struct to define the relative memory offsets, we can get the compiler to generate all the correct address accesses relative to the base address.
#include <stdint.h>

typedef struct {
    uint32_t MODER;   // mode register, offset: 0x00
    uint32_t OTYPER;  // output type register, offset: 0x04
    uint32_t OSPEEDR; // output speed register, offset: 0x08
    uint32_t PUPDR;   // pull-up/pull-down register, offset: 0x0C
    uint32_t IDR;     // input data register, offset: 0x10
    uint32_t ODR;     // output data register, offset: 0x14
    uint32_t BSRR;    // bit set/reset register, offset: 0x18
    uint32_t LCKR;    // configuration lock register, offset: 0x1C
    uint32_t AFRL;    // GPIO alternate function registers, offset: 0x20
    uint32_t AFRH;    // GPIO alternate function registers, offset: 0x24
} GPIO_t;
Now we define the pointer as before, but this time using the struct type rather than a uint32_t:
volatile GPIO_t* const portd = (GPIO_t*)0x40020C00;
Finally we can use it as before, but this time use struct-pointer dereferencing to access the individual registers:
int main(void)
{
    *rcc_ahb1enr |= (1 << 3); // enable PortD's clock

    uint32_t moder = portd->MODER;
    moder |= (1 << 16);
    moder &= ~(1 << 17);
    portd->MODER = moder;

    while (1) {
        portd->ODR |= (1 << 8); // led-on
        sleep(500);
        portd->ODR &= ~(1 << 8); // led-off
        sleep(500);
    }
}
Now when we access the ODR via the statement:
portd->ODR |= (1 << 8); // led-on
the compiler can calculate the offset (0x14) of the ODR member relative to the base address of the pointer (0x40020C00).
This means that we only need one pointer per port rather than ten, e.g.
volatile GPIO_t* const porta = (GPIO_t*)0x40020000;
volatile GPIO_t* const portb = (GPIO_t*)0x40020400;
volatile GPIO_t* const portc = (GPIO_t*)0x40020800;
volatile GPIO_t* const portd = (GPIO_t*)0x40020C00;
volatile GPIO_t* const porte = (GPIO_t*)0x40021000;
Alternatively we could do the same with #defines:
#define PORTA ((volatile GPIO_t*) 0x40020000)
#define PORTB ((volatile GPIO_t*) 0x40020400)
#define PORTC ((volatile GPIO_t*) 0x40020800)
#define PORTD ((volatile GPIO_t*) 0x40020C00)
#define PORTE ((volatile GPIO_t*) 0x40021000)
Note that in the #defines the leading '*' dereference has been dropped, so access to the register is coded thus:
PORTD->ODR |= (1 << 8); // led-on
If we left the dereference in:
#define PORTD (*((volatile GPIO_t*) 0x40020C00))
the code would be:
PORTD.ODR |= (1 << 8); // led-on
It's a matter of style; the generated instructions are the same.
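Because the struct only describes a layout, the same access code can be pointed at an ordinary variable in RAM and checked on a host machine. The sketch below repeats the GPIO_t definition so that it stands alone; the functions `gpio_pin8_output_high` and `gpio_raw_word` are hypothetical helpers introduced purely for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t MODER;   /* offset 0x00 */
    uint32_t OTYPER;  /* offset 0x04 */
    uint32_t OSPEEDR; /* offset 0x08 */
    uint32_t PUPDR;   /* offset 0x0C */
    uint32_t IDR;     /* offset 0x10 */
    uint32_t ODR;     /* offset 0x14 */
    uint32_t BSRR;    /* offset 0x18 */
    uint32_t LCKR;    /* offset 0x1C */
    uint32_t AFRL;    /* offset 0x20 */
    uint32_t AFRH;    /* offset 0x24 */
} GPIO_t;

/* Configure pin 8 as an output and drive it high on whichever port
 * block the caller passes in (real hardware or a stand-in in RAM). */
void gpio_pin8_output_high(volatile GPIO_t *port)
{
    uint32_t moder = port->MODER;
    moder |=  (UINT32_C(1) << 16);
    moder &= ~(UINT32_C(1) << 17);
    port->MODER = moder;
    port->ODR |= (UINT32_C(1) << 8);
}

/* Copy the first ten 32-bit words of the block out and return word
 * `index` (0-9), i.e. the word at byte offset index * 4. */
uint32_t gpio_raw_word(const GPIO_t *port, unsigned index)
{
    uint32_t words[10];
    memcpy(words, port, sizeof words);
    return words[index];
}
```

With a zero-initialised `GPIO_t fake_port`, calling `gpio_pin8_output_high(&fake_port)` should leave 0x00010000 in raw word 0 and 0x100 in raw word 5 (byte offset 0x14), which is the offset arithmetic the compiler performs for `->ODR`.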
Code Comparison
So how does the struct code compare to our original pointer code (compiled with optimisation flag -Og)?
Original code
$ arm-none-eabi-objdump -d -S main.o
...
  *portd_odr |= (1 << 8); // led-on
  1a: 4c0b       ldr   r4, [pc, #44] ; (48 <main+0x48>)
  1c: 6823       ldr   r3, [r4, #0]
  1e: f443 7380  orr.w r3, r3, #256 ; 0x100
  22: 6023       str   r3, [r4, #0]
...
The assembler code does the following:
- Load the value 0x40020C14 into r4
- Read the contents of 0x40020C14 [r4 + 0] as a 32-bit value into r3
- Or 0x100 with the contents of r3 (set bit 8)
- Store r3 as a 32-bit value at address 0x40020C14
Comparing this to the struct access:
$ arm-none-eabi-objdump -d -S main.o
...
  portd->ODR |= (1 << 8); // led-on
  1a: 4c0a       ldr   r4, [pc, #40] ; (44 <main+0x44>)
  1c: 6963       ldr   r3, [r4, #20]
  1e: f443 7380  orr.w r3, r3, #256 ; 0x100
  22: 6163       str   r3, [r4, #20]
...
So how does this differ? Only in the use of an offset load:
- Load the value 0x40020C00 into r4
- Read the contents of 0x40020C14 [r4 + 20] as a 32-bit value into r3
- Or the value 0x100 with the contents of r3
- Store r3 as a 32-bit value at address 0x40020C14 [r4 + 0x14]
This code demonstrates that, from a size and performance perspective, there is no difference between the two approaches (at least for the Arm).
Note: An Arm load (ldr) instruction with or without a secondary offset takes two cycles.
Caveats
Before we rush off and refactor legacy code to use structs, there are a couple of factors we are relying on, which may vary from compiler to compiler.
First, what can we be sure of?
- The offset of the first struct member is always 0x0 from the object's address (this is not guaranteed in C++ but usually is the case).
- The compiler cannot reorder the members, so OTYPER will always come at a higher address in memory than MODER and at a lower address than OSPEEDR.
However, we cannot guarantee that the compiler will not introduce padding between members, as the standard states:
There may be unnamed padding within a structure object, but not at its beginning.
So we cannot guarantee that the address of OTYPER is equal to the address of MODER + 4 bytes.
That said, in practical terms, with modern compilers, it is unlikely to be a problem (for this code). Padding tends to occur when a data member would otherwise cross its natural boundary (i.e. a 32-bit type is not word aligned), e.g.
typedef struct {
    int a;
    char b;
    int c;
} Padding_t;
would likely return a result of 12 from sizeof(Padding_t), because three padding bytes are added after char b to align int c.
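Where the padding went can be measured rather than guessed, using `offsetof` from `<stddef.h>`. The sketch below assumes a platform with a 4-byte, 4-byte-aligned `int` (typical for 32-bit Arm and for x86-64); the two function names are hypothetical.

```c
#include <stddef.h>

typedef struct {
    int  a;
    char b;
    int  c;
} Padding_t;

/* Number of padding bytes the compiler inserted between b and c. */
size_t padding_after_b(void)
{
    return offsetof(Padding_t, c) - (offsetof(Padding_t, b) + sizeof(char));
}

/* Total number of padding bytes anywhere in the struct. */
size_t padding_total(void)
{
    return sizeof(Padding_t) - (2 * sizeof(int) + sizeof(char));
}
```

Under the stated assumption both functions return 3, accounting for the difference between the 9 bytes of actual data and the reported size of 12. On a platform with different alignment rules the numbers may differ, which is exactly the portability concern raised above.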
Mitigating the risk
The obvious, and most straightforward, approach is to ensure you have a unit test that checks the size of the generated structure, e.g.
void test_GPIO_t_struct_size(void)
{
    TEST_ASSERT_EQUAL(40, sizeof(GPIO_t));
}
Alternatively, one of the compelling reasons to use C11 is the introduction of static_assert, e.g.
#include <assert.h> // provides the static_assert macro in C11

int main(void)
{
    static_assert(sizeof(GPIO_t) == 40, "padding in GPIO_t present");
}
This is a compile-time check; if padding was present, then the following compiler error is generated:
src/main.c: In function 'main':
src/main.c:87:3: error: static assertion failed: "padding in GPIO_t present"
   static_assert(sizeof(GPIO_t) == 40, "padding in GPIO_t present");
   ^
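A size check confirms that no padding was added, but it would not notice two members being declared in the wrong order. As a possible refinement (a sketch, not something the original code does), each member's offset can be asserted individually with `offsetof`, and the assertions can sit at file scope rather than inside `main`. The GPIO_t definition is repeated here so the example stands alone.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef struct {
    uint32_t MODER;
    uint32_t OTYPER;
    uint32_t OSPEEDR;
    uint32_t PUPDR;
    uint32_t IDR;
    uint32_t ODR;
    uint32_t BSRR;
    uint32_t LCKR;
    uint32_t AFRL;
    uint32_t AFRH;
} GPIO_t;

/* Each check compares the compiler's layout with the documented offset. */
static_assert(offsetof(GPIO_t, MODER)   == 0x00, "MODER offset");
static_assert(offsetof(GPIO_t, OTYPER)  == 0x04, "OTYPER offset");
static_assert(offsetof(GPIO_t, OSPEEDR) == 0x08, "OSPEEDR offset");
static_assert(offsetof(GPIO_t, PUPDR)   == 0x0C, "PUPDR offset");
static_assert(offsetof(GPIO_t, IDR)     == 0x10, "IDR offset");
static_assert(offsetof(GPIO_t, ODR)     == 0x14, "ODR offset");
static_assert(offsetof(GPIO_t, BSRR)    == 0x18, "BSRR offset");
static_assert(offsetof(GPIO_t, LCKR)    == 0x1C, "LCKR offset");
static_assert(offsetof(GPIO_t, AFRL)    == 0x20, "AFRL offset");
static_assert(offsetof(GPIO_t, AFRH)    == 0x24, "AFRH offset");
static_assert(sizeof(GPIO_t) == 40, "padding in GPIO_t present");
```

If any member moves, the build fails with the message naming that member, which is usually quicker to diagnose than a single size mismatch.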
If you're not using C11 (I've yet to see an embedded C project using it) then a final approach is to try to ensure no padding is present by requesting that the compiler 'pack' the struct into the most compact memory layout.
This is always a compiler-specific request, which may be done through #pragmas. However, GCC provides its own 'attribute' syntax as an alternative to pragmas.
Defining the structure with the attribute 'packed' will normally remove any potential padding, e.g.
typedef struct {
    uint32_t MODER;   // mode register, offset: 0x00
    uint32_t OTYPER;  // output type register, offset: 0x04
    uint32_t OSPEEDR; // output speed register, offset: 0x08
    uint32_t PUPDR;   // pull-up/pull-down register, offset: 0x0C
    uint32_t IDR;     // input data register, offset: 0x10
    uint32_t ODR;     // output data register, offset: 0x14
    uint32_t BSRR;    // bit set/reset register, offset: 0x18
    uint32_t LCKR;    // configuration lock register, offset: 0x1C
    uint32_t AFRL;    // alternate function registers, offset: 0x20
    uint32_t AFRH;    // alternate function registers, offset: 0x24
} __attribute__((packed)) GPIO_t;

typedef struct {
    int a;
    char b;
    int c;
} __attribute__((packed)) Padding_t;

int main(void)
{
    static_assert(sizeof(GPIO_t) == 40, "padding in GPIO_t present");
    static_assert(sizeof(Padding_t) == 9, "padding in Padding_t present");
}
Unaligned access can cause a whole host of problems, including performance issues, so be extremely careful when using packing.
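To make the alignment hazard concrete: in a packed version of the a/b/c struct, `c` sits at byte offset 5. Accessing it as `p->c` is left to the compiler to handle, but taking its address and passing that around as a plain `int *` hands other code a misaligned pointer. The sketch below, which assumes GCC or Clang and a 4-byte `int`, shows one defensive way to read such a member by copying its bytes; `Packed_t` and `packed_read_c` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* GCC/Clang-specific packing attribute, as used above. */
typedef struct {
    int  a;
    char b;
    int  c;   /* byte offset 5 when packed: not 4-byte aligned */
} __attribute__((packed)) Packed_t;

/* Read c without ever forming an int* to a misaligned address:
 * copy its bytes into a properly aligned local instead. */
int packed_read_c(const Packed_t *p)
{
    int value;
    memcpy(&value, (const unsigned char *)p + offsetof(Packed_t, c),
           sizeof value);
    return value;
}
```

For a register overlay such as GPIO_t, where every member is a naturally aligned 32-bit word, packing changes nothing in the layout, so the attribute is best reserved for cases where a size or offset check has actually failed.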
Vendor Supplied Headers
On most modern microcontrollers you are likely to find headers provided with register definitions already supplied. Many years ago Arm introduced the Cortex Microcontroller Software Interface Standard (CMSIS). As part of the standard it is expected that, between Arm and the vendor, register definitions will be supplied.
For example, ST supply a series of headers for their STM32 family of microcontrollers. Searching out the ST-provided file stm32f407xx.h you will find definitions for all peripherals included in the 407 variant.
On line 544 of this header file (based on version V2.1.0) you will find the following definition:
typedef struct
{
  __IO uint32_t MODER;    /*!< GPIO port mode register,               Address offset: 0x00      */
  __IO uint32_t OTYPER;   /*!< GPIO port output type register,        Address offset: 0x04      */
  __IO uint32_t OSPEEDR;  /*!< GPIO port output speed register,       Address offset: 0x08      */
  __IO uint32_t PUPDR;    /*!< GPIO port pull-up/pull-down register,  Address offset: 0x0C      */
  __IO uint32_t IDR;      /*!< GPIO port input data register,         Address offset: 0x10      */
  __IO uint32_t ODR;      /*!< GPIO port output data register,        Address offset: 0x14      */
  __IO uint16_t BSRRL;    /*!< GPIO port bit set/reset low register,  Address offset: 0x18      */
  __IO uint16_t BSRRH;    /*!< GPIO port bit set/reset high register, Address offset: 0x1A      */
  __IO uint32_t LCKR;     /*!< GPIO port configuration lock register, Address offset: 0x1C      */
  __IO uint32_t AFR[2];   /*!< GPIO alternate function registers,     Address offset: 0x20-0x24 */
} GPIO_TypeDef;
This is a slightly different interpretation of the register layout from earlier, notably:
- The BSRR has been split into two 16-bit registers (BSRRL and BSRRH)
- The AFR has been combined into an array of two elements (rather than a High and Low).
There could be a risk of padding between BSRRL and BSRRH, but it is unlikely and does not occur here.
The __IO macro simply maps onto volatile. There is a macro __I (volatile const) to define 'read only' access. There is also a macro __O (volatile) to indicate 'write only' access, but this can't be enforced in C.
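The effect of these qualifiers can be seen in a small stand-alone sketch. The macros `REG_RO`, `REG_WO` and `REG_RW` below are hypothetical stand-ins for `__I`, `__O` and `__IO` (the real ones come from the CMSIS core headers), and the register block is invented for the example.

```c
#include <stdint.h>

#define REG_RO volatile const  /* read only: writes are rejected */
#define REG_WO volatile        /* write only: documentation, not enforced */
#define REG_RW volatile        /* read/write */

typedef struct {
    REG_RW uint32_t CTRL;    /* hypothetical control register */
    REG_RO uint32_t STATUS;  /* hypothetical status register  */
    REG_WO uint32_t CMD;     /* hypothetical command register */
} DemoRegs_t;

/* Reading a read-only member is always allowed. */
uint32_t demo_read_status(const DemoRegs_t *regs)
{
    return regs->STATUS;
}

void demo_start(DemoRegs_t *regs)
{
    regs->CTRL |= UINT32_C(1);     /* read-modify-write on a REG_RW member */
    regs->CMD = UINT32_C(0xA5);    /* plain write to a REG_WO member */
    /* regs->STATUS = 0;           rejected: assignment to a const member */
    /* uint32_t x = regs->CMD;     still compiles: REG_WO is only a hint  */
}
```

So the compiler polices writes to read-only registers, while reads of write-only registers remain a matter of discipline and code review.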
Further down in the file (line 1130):
#define GPIOD ((GPIO_TypeDef *) GPIOD_BASE)
Again, another slight difference in the code is the choice to put the volatile qualifier in the struct rather than at the pointer definition.
The RCC struct definition is on line 615 with the #define on line 1137.
The CMSIS code to drive the LED is:
#include "stm32f407xx.h"
#include "timer.h"

int main(void)
{
    RCC->AHB1ENR |= (1 << 3); // enable PortD's clock, leaving other enable bits untouched

    uint32_t moder = GPIOD->MODER;
    moder |= (1 << 16);
    moder &= ~(1 << 17);
    GPIOD->MODER = moder;

    while (1) {
        GPIOD->ODR |= (1 << 8); // led-on
        sleep(500);
        GPIOD->ODR &= ~(1 << 8); // led-off
        sleep(500);
    }
}
In summary
Programs are decomposed into modules in several ways, of which one is chosen during the design process (assuming design happens!). The choice of decomposition has a critical effect on the architecture, and thus on the product's quality attributes, such as the maintainability, reliability, modifiability, and testability of the final system.
Cohesion is one of the most important concepts in software decomposition. High cohesion is fundamental to good design principles and patterns, guiding separation of concerns and maintainability.
Using a struct-based model for device access improves cohesion through good abstraction models, making code easier to understand and maintain.
In the next article I shall start to compare the relative merits and consequences of using the #define model versus the pointer model.
Niall Cooling
Co-Founder and Managing director of Feabhas since 1995.
Niall has been designing and programming embedded systems for over 30 years. He has worked in different sectors, including aerospace, telecoms, government and banking.
His current interests lie in IoT Security and Agile for Embedded Systems.
Source: https://blog.feabhas.com/2019/01/peripheral-register-access-using-c-structs-part-1/
Posted by: turnerwhangs.blogspot.com