Sunday, 14 February 2016

Microchip/Atmel ARM Cortex M4 Microcontroller Core Circuit and Bare-Metal Programming from the Ground Up (Update: 21 April 2020)


This article describes how to build a micro-controller (MCU) core circuit for the Microchip/Atmel ATSAM4S family of ARM Cortex-M4 MCUs, and the basic code needed for the micro-controller to run. The core circuit can be used for prototyping and product development. Here we will be using the Atmel Studio IDE (Integrated Development Environment) for code development.
 
* Note: This article was first published in February 2016; later in the same year Atmel was acquired by Microchip Technology.
 

1.0 The Hardware

A micro-controller chip typically requires some external components or circuits to work properly.  The minimum are:

1. Voltage regulator circuit.  This is optional, as we can connect the power supply pins of the micro-controller directly to the battery or bench top power supply.  However having one will ensure a stable operating voltage to the micro-controller.  It will also protect the controller from excessively high input voltage or reverse supply polarity that risk damaging the chip.
2. External oscillator circuit. The micro-controller requires a clock signal to function, and this clock signal is supplied by an oscillator circuit. Most modern micro-controllers have a built-in oscillator (typically based on an RC circuit), but these are not as stable or accurate as a crystal (or MEMS) oscillator.
3. Decoupling capacitors. These capacitors are connected between the VCC and GND pins, and filter out voltage fluctuations between VCC and GND. Modern micro-controllers usually have multiple positive supply connections; for instance there is a VCC for the digital input/output buffers, a VCC for the core logic circuits and a VCC for on-chip analog circuits like the PLL (phase-locked loop), ADC (analog-to-digital converter) and DAC (digital-to-analog converter). Thus, decoupling capacitors must be provided for all these positive supply pins.
4. Reset network and programming/debug port. Usually the micro-controller has a hardware reset pin. Typically this pin has to be pulled to logic high (VCC) within a short period after power up, and an RC network is used to achieve this. Various approaches can be used to load the program into the micro-controller flash memory. The ATSAM4S supports programming via JTAG, SWD (serial wire debug), or a USB/UART port together with a boot loader in the micro-controller ROM called the SAM-BA (SAM Boot Assistant) monitor. Atmel also supplies a user interface program running on a PC (supporting both Windows and Linux) that can talk to the SAM-BA monitor. When the micro-controller boots from ROM, the SAM-BA monitor uses the onboard USB interface or the UART0 interface to communicate with the PC software. Here we will use the SWD approach to program the ATSAM4S micro-controller. For further information please refer to the datasheet of the specific ATSAM4S.

A basic circuit that implements the 4 elements above is shown in Figure 1. The chip used is the ATSAM4SD16, an ARM Cortex-M4 MCU in a 64-pin TQFP package. I also include two LEDs (D1 and D2) to be used as general purpose indicators. D1 is usually used as a 'heart beat' indicator, showing that the micro-controller is running at the correct timing, and D2 is used to indicate activity in communication with external devices. Integrated circuit (IC) U2 is a 3.3 Volts output LDO (low drop-out) linear voltage regulator. Terminal J2 is the power supply input terminal, to which a supply voltage between 3.6 and 12 Volts can be connected. In Figure 1, X1, C8 and C9 form part of the external crystal oscillator circuit used to generate the clock signal for the micro-controller. R1 and R3 form the reset circuit, and capacitors C1, C4, C5, C6 and C7 are the decoupling capacitors. The RC network R5 and C11 is used to provide a clean 3.3V reference voltage to the internal ADC (analog-to-digital converter) in the ATSAM4S micro-controller.

Figure 1 - The Core circuit for ATSAM4S MCU.


We do not need to build a PCB for the above schematic. The photos (Figures 2 and 3) below show the top and bottom views of an implementation using a QFP adapter board (or breakout board) and veroboards. The voltage regulator IC, piezo-electric crystal, program/debug header and the LEDs are implemented on the veroboard, while the capacitors can be soldered directly on the QFP adapter board. In particular, C1, C4, C5, C6, C7 and C11 should be soldered as near to the micro-controller pins as possible to improve the decoupling effectiveness. Here I am using SMD capacitors (0603 size), although normal through-hole capacitors are usable as well. The 3.3V low drop-out (LDO) voltage regulator can be any type, as long as the tolerance is within 5% of 3.3V and it is capable of supplying a minimum of 100 mA of current. The part used here is from the LP2985 series by Texas Instruments. A conductive copper tape (here I use the 3M brand) is used to implement a ground plane underneath the micro-controller, as the chip is running at high frequency. At a 120 MHz clock this is still optional, but the ground plane enables easy connection of the micro-controller GND pins and capacitors to ground.
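Since the supply input can be anywhere between 3.6 and 12 Volts, it may be worth estimating how much heat the linear regulator has to dissipate at the higher end of that range. The sketch below is only a first-order estimate, assuming the dissipation is (Vin - Vout) x Iload and ignoring the regulator's own quiescent current. The function name `ldo_dissipation_w` is made up for this illustration.

```c
/* First-order estimate of the power dissipated in a linear regulator,
 * in watts: P = (Vin - Vout) * Iload.
 * Assumption: the quiescent current of the regulator is ignored. */
double ldo_dissipation_w(double vin_v, double vout_v, double iload_a)
{
    return (vin_v - vout_v) * iload_a;
}
```

For example, `ldo_dissipation_w(12.0, 3.3, 0.1)` evaluates to about 0.87 W, while `ldo_dissipation_w(5.0, 3.3, 0.1)` is about 0.17 W. The result should be checked against the thermal limits given in the datasheet of the regulator actually used.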

Figure 2 - Top View of the core circuit.  Notice a CMOS camera board is connected to the core circuit.


Figure 3 - Bottom View.


2.0 The Software Framework

I am going to summarize the steps in creating the software framework for the Atmel ATSAM4S series of ARM Cortex-M micro-controllers. We will focus on the ATSAM4SD16, a Cortex-M4 MCU in a 64-pin TQFP package. The integrated development environment (IDE) is Atmel Studio Version 7.0.X (now called Microchip Studio for AVR and SAM devices) running on the Windows OS platform, with the GNU C compiler tool chain.

For this project we will need two device support packs. A device support pack is a set of software constructs, such as header files for C programs, that allows the user to conveniently access the hardware registers and bits of the Cortex-M micro-controller by name (instead of by register address), together with other settings pertinent to the compiler.

The first device support pack is ARM's CMSIS (Cortex Microcontroller Software Interface Standard), provided by ARM Inc. to access the registers and bits in the Cortex-M MCU core (https://developer.arm.com/embedded/cmsis). The CMSIS code is developed as an open source project by ARM Inc. and contains a few modules. Here we will only be concerned with CMSIS Core. When we install the Atmel Studio software, the GNU tools and CMSIS modules should have been installed along with it. At the time of writing this, the latest version of CMSIS is Version 5.0.1.

The second device support pack is provided by the chip manufacturer Atmel (now Microchip, as Atmel was bought over by Microchip Technology); it allows the user to access the peripheral control registers of the micro-controller. Typically ARM Inc. provides the core design of the MCU, while the company that actually manufactures the chip implements the extra hardware peripherals such as UART, I2C, SPI, USB and Ethernet blocks. Thus the chip manufacturer also provides another set of files that contain the declarations of all the hardware peripheral register addresses and bit assignments, which in Atmel's case is called the DFP (Device Family Pack). At the time of writing this, I am using SAM4S_DFP Version 1.0.56. Sometimes a newer device support pack is not compatible with code written for an older version (for example due to changes in naming convention), so we have to be careful about this.
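To give a feel for what such header files provide, here is a minimal sketch of the general technique: a peripheral's registers are described as a C struct of volatile 32-bit members, and a macro gives that struct a name the program can use. Everything in it (`DemoPio`, `DEMO_PIO`, `DEMO_ODSR_P19` and the register names) is made up for illustration and is not taken from the real SAM4S_DFP. On a real MCU the macro would cast a fixed address from the datasheet to a struct pointer; here a static variable stands in for the hardware so that the sketch also runs on a PC.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical register block, laid out the way a vendor header
 * describes a peripheral: one volatile 32-bit member per register. */
typedef struct {
    volatile uint32_t DEMO_PER;   /* offset 0x00: pin enable register    */
    volatile uint32_t DEMO_OER;   /* offset 0x04: output enable register */
    volatile uint32_t DEMO_ODSR;  /* offset 0x08: output data register   */
} DemoPio;

/* Hypothetical bit mask for pin 19 of the output data register. */
#define DEMO_ODSR_P19 (1u << 19)

/* Stand-in for the hardware.  A real device header would instead use
 * something like ((DemoPio *)0x........) with the datasheet address. */
static DemoPio demo_block;
#define DEMO_PIO (&demo_block)

/* Set and clear pin 19 by name, without writing any raw address. */
void demo_set_p19(void)   { DEMO_PIO->DEMO_ODSR |= DEMO_ODSR_P19; }
void demo_clear_p19(void) { DEMO_PIO->DEMO_ODSR &= ~DEMO_ODSR_P19; }
```

The listings later in this article use the same pattern with the names supplied by the real device support packs, for example `PIOA->PIO_ODSR` from the DFP and `SysTick->CTRL` from CMSIS Core.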
 
Finally, the hardware tool I am using to program the Cortex-M MCUs is the Atmel ICE (https://www.microchip.com/en-us/development-tool/ATATMEL-ICE). Nowadays, Microchip's programmers such as the MPLAB Snap and PICkit 4 also support AVR and SAM devices. Other third-party programmers/debuggers like the J-Link Debug Probe by Segger can also be used.


The ATSAM4SD16 Project
1. First, we fire up Atmel Studio and create a new project (Figure 1).
Figure 1.

2. In the new project window, select the “GCC C Executable Project”, set the project path and name as required.  Here I am using the default name of GccApplication1. 
Figure 2.

3.  After this select the correct MCU, here I am using ATSAM4SD16B, a Cortex-M4 MCU in 64-pin TQFP package.
Figure 3.

4.  A project setup tab for this project is created.  Select the programmer/debugger tool, as shown in Figure 4.  Since I have already plugged the Atmel ICE into the computer USB port, the Atmel ICE should appear in the selection list.
Figure 4.

5.  Now I configure the programmer tool. Here I am using the SWD (serial-wire debug) protocol to program the MCU flash memory.  Make sure we select “Boot from Flash” (we can also boot from the ROM, which contains a boot-loader program called SAM-BA).  Notice that Atmel Studio also creates a few default files for the project (Figure 5).
Figure 5.

6. (Added on 21 April 2020) Make sure the versions of the two device support packs are correct; if not, you can choose to install the packs from the Atmel server.  In older versions of Atmel Studio you would hit the "Components" tab to show the device support packs.

Figure 6A


Figure 6B

7.  In Figure 7, the content of default “main.c” file is depicted.  Also shown are two initialization files called “startup_sam4s.c” and “system_sam4s.c”.

Figure 7.

The file “startup_sam4s.c” contains the source code for the reset handler and the other default interrupt handlers.  Here handler means the Interrupt Service Routine.  “startup_sam4s.c” also contains the declarations of the handlers for the Cortex-M modules and the initialization of the stack pointer.  The other file, “system_sam4s.c”, contains code to set up two modules that are critical to the operation of the ATSAM4S micro-controller: the PMC (Power Management Controller) and the EEFC (Enhanced Embedded Flash Controller).

The PMC generates the clock signals for the Cortex-M4 core and for all the peripheral modules in the ATSAM4S, while the EEFC, as the name implies, manages the on-chip flash memory.  The code that initializes the PMC and EEFC is in the function SystemInit( ), which is called from the main( ) function.  Since I am using my own custom circuit with an external 8 MHz crystal oscillator, I am going to bypass the code in “system_sam4s.c”.  In fact, “system_sam4s.c” can be removed from the project.


We will modify the default code in “main.c” to include our custom initialization code.  The listing for the modified “main.c” is shown below.  Comments in the code explain the purpose of each setting.

Listing 1 – Custom “main.c” source codes:

/*
 * GccApplication1.c
 *
 * Created: 17/6/2018 2:09:34 PM
 * Author : User
 */
#include "sam.h"

/// Function Name : SAM4S_Init
/// Description   : This function performs further initialization of the Cortex-M
///                 processor, namely:
///                 1. Set up the processor main oscillator and clock generator circuit.
///                 2. Set up the processor flash memory controller wait states.
///                 3. Set up the SysTick system timer (if used).
///                 4. Enable the cache controller.
///                 5. Initialize the micro-controller peripherals and I/O ports
///                    to a known state. For I/O ports, the pins will be set to:
///                    (a) Assigned to the PIO module (PIOA and PIOB),
///                    (b) Digital mode,
///                    (c) Output and
///                    (d) A logic '0'.
/// Arguments     : None
/// Return        : None
///
void SAM4S_Init(void)
{
        // Upon reset, the internal fast RC oscillator is enabled, with the 4 MHz
        // frequency selected as the source of MAINCK.
        // The routines below enable and start up the main crystal oscillator via
        // the Power Management Controller (PMC).

        // Set the main crystal oscillator start-up time to 100 x 8 = 800 slow clock
        // cycles.  The slow clock runs at 32 kHz.  Note that to prevent accidental
        // writes, this register requires a key or password.
        PMC->CKGR_MOR = (PMC->CKGR_MOR) | CKGR_MOR_MOSCXTST(100) | CKGR_MOR_KEY_PASSWD;
        // Enable the main 8 MHz crystal oscillator.
        PMC->CKGR_MOR = (PMC->CKGR_MOR) | CKGR_MOR_MOSCXTEN | CKGR_MOR_KEY_PASSWD;
        // Wait until the main crystal oscillator is stabilized.
        while ((PMC->PMC_SR & PMC_SR_MOSCXTS) == 0) {}
        // Select the main crystal oscillator.
        PMC->CKGR_MOR = (PMC->CKGR_MOR) | CKGR_MOR_MOSCSEL | CKGR_MOR_KEY_PASSWD;
        // Wait until the main oscillator selection is done.
        while ((PMC->PMC_SR & PMC_SR_MOSCSELS) == 0) {}
        // Disable the on-chip fast RC oscillator.
        PMC->CKGR_MOR = ((PMC->CKGR_MOR) & ~CKGR_MOR_MOSCRCEN) | CKGR_MOR_KEY_PASSWD;
       
       
        // Set FWS (flash memory wait state) according to the clock configuration.
        // This has to be set before we change the master clock (MCK) of the core
        // to a higher frequency.
        // Please refer to the Enhanced Embedded Flash Controller (EEFC) section of
        // the device datasheet for the wait state to insert at each core clock
        // frequency.  The number of cycles per flash access is FWS + 1:
        // For fcore = 120 MHz, FWS = 5, i.e. 6 cycles.
        // For fcore = 4 MHz, FWS = 0, i.e. 1 cycle.
        // For fcore = 8-20 MHz, FWS = 1, i.e. 2 cycles.
        EFC0->EEFC_FMR = EEFC_FMR_FWS(5);
        #if defined(ID_EFC1)
        EFC1->EEFC_FMR = EEFC_FMR_FWS(5);
        #endif

        // Routines to enable PLLB and use it as the master clock via the Power
        // Management Controller (PMC).
        // Disable PLLB first.
        PMC->CKGR_PLLBR = (PMC->CKGR_PLLBR & ~CKGR_PLLBR_PLLBCOUNT_Msk) | CKGR_PLLBR_PLLBCOUNT(100) | CKGR_PLLBR_DIVB(0) | CKGR_PLLBR_MULB(0);
        // Enable PLLB.  Note that the PLL multiplies its input by (MULB + 1),
        // so MULB = 29 gives a multiplication factor of 30.
        PMC->CKGR_PLLBR = (PMC->CKGR_PLLBR & ~CKGR_PLLBR_PLLBCOUNT_Msk) | CKGR_PLLBR_PLLBCOUNT(100) | CKGR_PLLBR_DIVB(2) | CKGR_PLLBR_MULB(29);
        // Here fxtal (crystal oscillator) = 8 MHz.
        // Thus fin = fxtal / DIVB = 8/2 = 4 MHz,
        // fPLLB = fin x (MULB + 1) = 4 x 30 = 120 MHz,
        // fcore = fPLLB = 120 MHz.
        while ((PMC->PMC_SR & PMC_SR_LOCKB) == 0) {}     // Wait until PLLB is locked.

        // fcore = fPLLB / prescaler = fPLLB / 1 = fPLLB = 120 MHz.
        // Note: we can also set fPLLB to 240 MHz and set the prescaler to 2 as follows:
        // Set the prescaler to divide-by-2.
        //PMC->PMC_MCKR = (PMC->PMC_MCKR & ~PMC_MCKR_PRES_Msk) | PMC_MCKR_PRES_CLK_2;
        // Wait until the Master Clock is ready.
        //while ((PMC->PMC_SR & PMC_SR_MCKRDY) == 0) {}

        // Change the master clock source to PLLB.
        PMC->PMC_MCKR = (PMC->PMC_MCKR & ~PMC_MCKR_CSS_Msk) | PMC_MCKR_CSS_PLLB_CLK;
        // Wait until the Master Clock is ready.
        while ((PMC->PMC_SR & PMC_SR_MCKRDY) == 0) {}

        // Disable the clock signals to all non-critical peripherals by default (to
        // save power).
        // Note: Peripherals 0-7 are system-critical peripherals such as the Supply
        // Controller, Reset Controller, Real-Time Clock/Timer, Watchdog Timer,
        // Power Management Controller and Flash Controller.  The clock to these
        // peripherals cannot be disabled.
        PMC->PMC_PCDR0 = 0xFFFFFF00;             // Disable clock to peripheral ID8 to ID31.
        PMC->PMC_PCDR1 = 0x00000007;             // Disable clock to peripheral ID32 to ID34.
       
        // Set up the Port A and Port B I/O ports.
        PMC->PMC_PCER0 |= PMC_PCER0_PID11;       // Enable peripheral clock to PIOA (ID11).
        PMC->PMC_PCER0 |= PMC_PCER0_PID12;       // Enable peripheral clock to PIOB (ID12).

        // --- Set up PIOA ---
        PIOA->PIO_PER = 0xFFFFFFFF;              // All PA0-PA31 are controlled by PIO.
        //PIOA->PIO_OER = 0xFFFFFFFF;            // Set all PIOA pins to outputs.
        //PIOA->PIO_OWER = 0xFFFFFFFF;           // Enable output write to all PIOA pins.
        // The Output Write Enable bits must be set if we are using ODSR to change
        // the value of the pins.
        PIOA->PIO_OER = 0x7E7FFF;                // Set the selected PIOA pins to outputs.
        PIOA->PIO_OWER = 0x7E7FFF;               // Enable output write to the same pins.

        // --- Set up PIOB ---
        PIOB->PIO_PER = 0xFFFFFFFF;              // All PIOB pins are controlled by PIO.
        PIOB->PIO_OER = 0xFFFFFFFF;              // Set PIOB to outputs.
        PIOB->PIO_OWER = 0xFFFFFFFF;             // Enable output write to PIOB.
        // The SysTick module is clocked from the output of the Master Clock (MCK)
        // divided by 8.  Since MCK = fCore, the timeout for
        // SysTick = ([SysTick reload value] + 1) x 8 x (1/fCore).
        // For fCore = 120 MHz the SysTick clock is 15 MHz, so 15000 counts give a
        // SysTick expire time of 1 msec, i.e. a reload value of 15000 - 1.
#define SYSTICK_RELOAD  14999

        SysTick->LOAD = SYSTICK_RELOAD;  // Set reload value.
        SysTick->VAL = SYSTICK_RELOAD;   // Reset current SysTick value (any write clears it).
        SysTick->CTRL = SysTick->CTRL & ~(SysTick_CTRL_COUNTFLAG_Msk);   // Clear Count Flag.
        SysTick->CTRL |= SysTick_CTRL_ENABLE_Msk;        // Enable SysTick.
       
        // Enable the Cortex-M Cache Controller.
        // Check the CSTS value; if it is 0, start the Cache Controller.
        if (((CMCC->CMCC_SR) & CMCC_SR_CSTS) == 0)
        {
                CMCC->CMCC_CTRL |= CMCC_CTRL_CEN; // Enable the Cache Controller.
        }
}

int main(void)
{
    /* Initialize the SAM system */
    SAM4S_Init();

    while (1)
    {
        /* Put your application code here */
    }
}
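As a cross-check of the clock arithmetic in the listing above, the following sketch computes the PLL output frequency and the crystal start-up delay on a PC. It assumes, from my reading of the SAM4S datasheet, that the PLL multiplies its input by the multiplier field plus one and that the start-up field counts in units of 8 slow clock cycles. Please verify both against the datasheet of your device. The function names are made up for this example.

```c
#include <stdint.h>

/* PLL output frequency in Hz.
 * Assumption: f_out = (f_xtal / div) * (mul + 1), and a div or mul
 * field of 0 means the PLL is disabled. */
uint32_t pll_output_hz(uint32_t fxtal_hz, uint32_t div, uint32_t mul)
{
    if (div == 0u || mul == 0u) {
        return 0u;                      /* PLL disabled */
    }
    return (fxtal_hz / div) * (mul + 1u);
}

/* Crystal start-up delay in microseconds.
 * Assumption: the start-up field counts in units of 8 slow clock cycles. */
uint32_t xtal_startup_us(uint32_t startup_field, uint32_t slow_clock_hz)
{
    uint64_t cycles = (uint64_t)startup_field * 8u;
    return (uint32_t)((cycles * 1000000u) / slow_clock_hz);
}
```

With an 8 MHz crystal, `pll_output_hz(8000000, 2, 29)` gives 120 MHz, whereas a multiplier field of 30 would give 124 MHz under the same assumption. `xtal_startup_us(100, 32000)` gives 25000, so the start-up time of 800 slow clock cycles is about 25 ms.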


Listing 1 above is the basic template for using the ATSAM4S MCU with the core circuit provided.  You can put your routines within the infinite while-loop in the main( ) function.  A modification of the main( ) routine that uses the SysTick system timer is shown in Listing 2, where the user code is executed roughly every 1 millisecond.  In this example we toggle pin PA19 every 1 millisecond, thus PA19 will output a square wave at 500 Hz.
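The SysTick reload value for a given tick rate can be worked out with a small helper like the one below. This is only a sketch, assuming that the SysTick counter is clocked from MCK divided by a fixed prescaler (8 in this article) and that one period lasts (reload + 1) counter clocks, which is how the ARM documentation describes the timer. The function name is hypothetical, and the result must fit in the 24-bit reload register.

```c
#include <stdint.h>

/* Reload value that makes SysTick expire tick_hz times per second.
 * Assumption: the counter runs at mck_hz / prescale and one period
 * takes (reload + 1) counter clocks. */
uint32_t systick_reload(uint32_t mck_hz, uint32_t prescale, uint32_t tick_hz)
{
    return (mck_hz / prescale) / tick_hz - 1u;
}
```

For a 120 MHz master clock and a 1 ms tick, `systick_reload(120000000, 8, 1000)` returns 14999.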

In addition, we also reload the internal watchdog timer (WDT) of the ATSAM4S every time the user code is executed.  Take note that the ATSAM4S incorporates an internal WDT and it is enabled by default.  The WDT will time out after 16 seconds and reset the MCU if it is not reloaded before then.  Thus we either disable the WDT (which is not a good idea) or reload it within the 16-second period.
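The 16-second figure can be reproduced with a short calculation. The sketch below assumes that the watchdog is a 12-bit down counter clocked by the slow clock divided by 128, that its default counter value is 0xFFF, and that the timeout is roughly (counter value + 1) counts. These assumptions come from my reading of the datasheet and should be confirmed there. The function name is made up for this example.

```c
#include <stdint.h>

/* Approximate watchdog timeout in milliseconds.
 * Assumptions: the counter is clocked at slow_clock_hz / 128 and
 * expires after about (counter_value + 1) counts. */
uint32_t wdt_timeout_ms(uint32_t counter_value, uint32_t slow_clock_hz)
{
    uint64_t slow_cycles = (uint64_t)(counter_value + 1u) * 128u;
    return (uint32_t)((slow_cycles * 1000u) / slow_clock_hz);
}
```

With a 32.768 kHz slow clock, `wdt_timeout_ms(0xFFF, 32768)` returns 16000, i.e. 16 seconds.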

Listing 2 – main( ) routine that is executed at a regular time interval:

int main(void)
{
    int nCount = 0;

    /* Initialize the SAM system */
    SAM4S_Init();

    while (1)
    {
        // --- Check SysTick until time is up, then run the user routines ---
        // COUNTFLAG is set if SysTick has counted to 0 since the last read.
        if ((SysTick->CTRL & SysTick_CTRL_COUNTFLAG_Msk) != 0)
        {
            /* Put your application code here */

            // --- Start of user routines ---

            // Reload the Watchdog Timer.  WDT_CR is a write-only register, so
            // the restart bit and key are written directly without reading it.
            WDT->WDT_CR = WDT_CR_WDRSTT | WDT_CR_KEY_PASSWD;

            nCount++;
            if (nCount > 1) {
                nCount = 0;
                PIOA->PIO_ODSR &= ~PIO_ODSR_P19;  // Clear PA19.
            }
            else {
                PIOA->PIO_ODSR |= PIO_ODSR_P19;   // Set PA19.
            }
            // --- End of user routines ---
        }
    }
}

The complete project setup of the basic software framework for the ATSAM4S MCU in Atmel Studio is shown in Figure 8.  You can now build the project and load it into the MCU.  If the code is successfully loaded into the ATSAM4S MCU, you can probe pin PA19 with an oscilloscope and a 500 Hz square wave should be visible.
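If no oscilloscope is at hand, the expected frequency can at least be checked on paper or on a PC. The sketch below replays the nCount logic of Listing 2 on a simulated pin and counts the rising edges; it is a host-side illustration only and does not touch any hardware register. The function name is made up for this example.

```c
#include <stdint.h>

/* Replays the nCount logic of Listing 2 for the given number of SysTick
 * expiries and returns the number of rising edges seen on a simulated
 * pin that starts at logic 0. */
uint32_t count_rising_edges(uint32_t ticks)
{
    int      nCount = 0;
    uint32_t pin    = 0u;
    uint32_t edges  = 0u;

    for (uint32_t i = 0u; i < ticks; i++) {
        uint32_t previous = pin;

        nCount++;
        if (nCount > 1) {
            nCount = 0;
            pin = 0u;               /* Clear the simulated PA19. */
        } else {
            pin = 1u;               /* Set the simulated PA19.   */
        }

        if (previous == 0u && pin == 1u) {
            edges++;                /* A low-to-high transition. */
        }
    }
    return edges;
}
```

One second of 1 ms ticks, `count_rising_edges(1000)`, yields 500 rising edges, which matches the 500 Hz square wave expected on PA19.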


Figure 8.


3.0 Files


The files on the GitHub site contain the schematic in PDF format, an explanation of the connection from the programmer to the program/debug port of the micro-controller, and sample code.  Here I am using the Atmel ICE for programming the chip.  The code on GitHub extends the concept above into a simple cooperative scheduler that supports multiple concurrent user tasks.
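To illustrate the idea of a cooperative scheduler in general terms, here is a minimal sketch. It is not the code from the GitHub repository: the `Task` structure, `scheduler_tick` and the two demo tasks are all hypothetical names invented for this example. Each task is an ordinary function that returns quickly, and the scheduler is called once per SysTick expiry to run whichever tasks are due.

```c
#include <stdint.h>
#include <stddef.h>

typedef void (*TaskFn)(void);

typedef struct {
    TaskFn   run;        /* Task body; must return quickly.    */
    uint32_t period;     /* Run once every 'period' ticks.     */
    uint32_t countdown;  /* Ticks left until the next run.     */
} Task;

/* Call once per SysTick expiry: runs every task whose countdown has
 * reached zero, then re-arms it with its period. */
void scheduler_tick(Task *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].countdown > 0u) {
            tasks[i].countdown--;
        }
        if (tasks[i].countdown == 0u) {
            tasks[i].run();
            tasks[i].countdown = tasks[i].period;
        }
    }
}

/* Two hypothetical demo tasks that only count how often they run. */
static uint32_t fast_runs;
static uint32_t slow_runs;
static void fast_task(void) { fast_runs++; }
static void slow_task(void) { slow_runs++; }

/* Runs the demo for 'ticks' ticks: fast_task every 2 ticks and
 * slow_task every 5 ticks.  The run counts are returned via pointers. */
void scheduler_demo(uint32_t ticks, uint32_t *fast, uint32_t *slow)
{
    Task tasks[] = {
        { fast_task, 2u, 2u },
        { slow_task, 5u, 5u },
    };

    fast_runs = 0u;
    slow_runs = 0u;
    for (uint32_t t = 0u; t < ticks; t++) {
        scheduler_tick(tasks, sizeof tasks / sizeof tasks[0]);
    }
    *fast = fast_runs;
    *slow = slow_runs;
}
```

On the MCU, the call to `scheduler_tick` would sit where the user routines are in Listing 2, so that each task gets its turn without any task blocking the others. After 10 ticks of the demo, the fast task has run 5 times and the slow task twice.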



References