|author||Kristofer Jonsson <firstname.lastname@example.org>||Tue Jun 15 17:51:58 2021 +0200|
|committer||Kristofer Jonsson <email@example.com>||Tue Jun 15 17:53:54 2021 +0200|
Using ref kernels

Tensorflow reference kernels are bit exact and should be used by the run_platform.py script to generate the expected OFM data.

Change-Id: I90e688e753e5330aaaf9002abed23df0493ff99b
Arm(R) Ethos(TM)-U core platform is provided as an example of how to produce a firmware binary for a given target platform. This software is primarily intended for guidance, to demonstrate how to boot up a firmware binary and how to run an inference on an Arm Ethos-U compatible platform.
This repository contains target specific files, like linker scripts. Target agnostic software components are provided in the core_software repository.
The Arm(R) Corstone(TM)-300 is a reference design of how to build a secure System on Chip (SoC). A fixed virtual platform (FVP) of the Arm Corstone-300 including the Arm Ethos-U can be downloaded from the Ecosystem page at developer.arm.com.
Building with default settings requires CMake for the configuration and make for building. This produces an ELF file that can be run on the FVP.
$ cmake -B build/corstone-300 targets/corstone-300
$ cd build/corstone-300
$ make
It is also possible to build with a different toolchain.
$ cmake -B build/corstone-300 targets/corstone-300 -DCMAKE_TOOLCHAIN_FILE=$PWD/cmake/toolchain/arm-none-eabi-gcc.cmake
$ cd build/corstone-300
$ make
Please see README_WINDOWS.md for additional information regarding building on a Windows system.
Once the Corstone-300 FVP has been downloaded, installed and added to the PATH variable, the software binaries can be tested on it.
Individual applications can also be run directly with the FVP, for example like this.
$ FVP_Corstone_SSE-300_Ethos-U55 applications/freertos/freertos.elf
The files needed to get started for Corstone-300 can be found on developer.arm.com.
Follow the documentation in the downloaded archive to set up the board with the Corstone-300 FPGA bit files.
The built files can then be run by adapting the steps in chapter '10 Software', using the binary files extracted from the build process. These binaries are needed for the bootloader on the FPGA to be able to load the memories.
TOTALIMAGES: 2

IMAGE0ADDRESS: 0x02000000
IMAGE0UPDATE: AUTO
IMAGE0FILE: \SOFTWARE\10000000.bin ; sram binary

IMAGE1ADDRESS: 0x0c000000
IMAGE1UPDATE: AUTO
IMAGE1FILE: \SOFTWARE\70000000.bin ; ddr binary
The mapping between the Cortex-M55 memory space and the addresses that the FPGA MCC bootloader needs is found in section '9.6 MCC Memory mapping' of the documentation in the Corstone-300 FPGA archive. A part of the table is shown below:
| Cortex-M55 | MCC Bootloader |
|------------|----------------|
| 0x00000000 | 0x00000000     |
| 0x10000000 | 0x01000000     |
| 0x60000000 | 0x08000000     |
| 0x70000000 | 0x0C000000     |
The binary that the Cortex-M55 CPU expects at address 0x10000000 must therefore be written to 0x02000000.
Power up the board with the PBON and the application output will be seen on the serial console.
The TensorFlow Lite for Microcontrollers (TFLu) framework supports running multiple parallel inferences. Each parallel inference requires a TFLu arena (costs memory) and a stack (requires an RTOS). The examples provided in this repo are implemented in the application layer, which means that any RTOS could be used.
The Ethos-U NPU driver is implemented in plain C. To enable thread safety in a multi-threading environment the driver defines a set of weak functions that the application is expected to override, providing implementations for mutex and semaphore primitives.
The sequence diagram below illustrates the call stack for a multi NPU system. Please note how the ethosu_semaphore_* functions are implemented in the application layer. Mutexes are used for thread safety and semaphores for sleeping.
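As a rough illustration of overriding these primitives in the application layer, the sketch below implements them on top of POSIX threads, which stand in here for whichever RTOS is actually used. The function names and prototypes are assumptions (only the ethosu_semaphore_* prefix is named in this document) and may differ between driver versions, so check the driver header before copying anything.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdlib.h>

/* Assumed prototypes; verify them against the driver header in use. */

/* Mutex: protects driver state shared between threads. */
void *ethosu_mutex_create(void)
{
    pthread_mutex_t *mutex = malloc(sizeof(*mutex));
    if (mutex != NULL && pthread_mutex_init(mutex, NULL) != 0) {
        free(mutex);
        mutex = NULL;
    }
    return mutex;
}

int ethosu_mutex_lock(void *mutex)
{
    return pthread_mutex_lock((pthread_mutex_t *)mutex) == 0 ? 0 : -1;
}

int ethosu_mutex_unlock(void *mutex)
{
    return pthread_mutex_unlock((pthread_mutex_t *)mutex) == 0 ? 0 : -1;
}

/* Semaphore: lets the calling thread sleep until someone gives it. */
void *ethosu_semaphore_create(void)
{
    sem_t *sem = malloc(sizeof(*sem));
    /* Initial count 0: the first take blocks until a give arrives. */
    if (sem != NULL && sem_init(sem, 0, 0) != 0) {
        free(sem);
        sem = NULL;
    }
    return sem;
}

int ethosu_semaphore_take(void *sem)
{
    return sem_wait((sem_t *)sem) == 0 ? 0 : -1;
}

int ethosu_semaphore_give(void *sem)
{
    return sem_post((sem_t *)sem) == 0 ? 0 : -1;
}
```

On a real target the bodies would call the RTOS equivalents instead, while the driver itself stays unchanged.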
A single Cortex-M is capable of driving multiple Ethos-U NPUs. The optimal number of NPUs cannot be determined without knowing which network is to be run and without detailed knowledge of the limitations of the embedded system.
Each parallel inference requires an arena. For optimal performance the arena should be placed in a high-bandwidth, low-latency memory such as SRAM, which is a cost that has to be considered. The size of the arena varies greatly depending on the network.
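To make the memory cost concrete, the following minimal sketch adds up the arena and stack needed by a set of parallel inferences and checks the total against an SRAM budget. The struct, the function names and any sizes passed in are hypothetical; real arena sizes depend on the network.

```c
#include <stddef.h>

/* Hypothetical per-inference cost: one TFLu arena plus one RTOS task stack. */
struct inference_cost {
    size_t arena_bytes;
    size_t stack_bytes;
};

/* Total memory needed to run all listed inferences in parallel. */
size_t parallel_inference_bytes(const struct inference_cost *costs, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += costs[i].arena_bytes + costs[i].stack_bytes;
    }
    return total;
}

/* Returns 1 if the parallel set fits in the given SRAM budget, else 0. */
int fits_in_sram(const struct inference_cost *costs, size_t count, size_t sram_bytes)
{
    return parallel_inference_bytes(costs, count) <= sram_bytes ? 1 : 0;
}
```

If the set does not fit, the options are fewer parallel inferences or moving an arena to a slower memory, at a performance cost.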
For networks that map fully to Ethos-U, the memory bandwidth might become a limiting factor. For networks that run partly in software, the Cortex-M might become the limiting factor. The placement of the TFLu model and arena (flash, DRAM, SRAM, etc) will also have a big impact on the performance.
The applications in this repo use CMSIS Device to start up the Cortex-M. The standard procedure is to copy and modify the CMSIS templates, but in this repo we have chosen to include the unmodified templates directly from CMSIS.
The sequence diagram below describes what happens after the Cortex-M reset is lifted, up until the execution enters the application main().
The first thing that happens is that the CPU loads index 0 from the interrupt vector table into the SP register and index 1 into the PC register, and then starts executing from the PC location.
Index 1 in the vector table (the table that VTOR points to) is referred to as the reset handler and is responsible for initializing the CPU. If the CPU has, for example, an FPU or MVE extension, then these are enabled.
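The reset sequence can be modelled on a host machine as follows. This is only a simulation of the two loads described above. On real hardware the core performs them itself, and the struct and function names here are made up for illustration.

```c
#include <stdint.h>

/* Simulated CPU registers; real hardware loads these without any software. */
struct cpu_state {
    uint32_t sp; /* stack pointer */
    uint32_t pc; /* program counter */
};

/*
 * Apply the reset sequence: vector_table[0] -> SP, vector_table[1] -> PC.
 * Simplification: real reset vectors carry the Thumb bit in bit 0;
 * the value is copied unchanged here.
 */
struct cpu_state simulate_reset(const uint32_t *vector_table)
{
    struct cpu_state cpu;
    cpu.sp = vector_table[0]; /* index 0: initial stack pointer */
    cpu.pc = vector_table[1]; /* index 1: address of the reset handler */
    return cpu;
}
```

Execution then continues from the reset handler, which is where the CPU initialization described above takes place.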
The entry function for the compiler runtime setup varies depending on which compiler is used. For Arm Clang this function is called __main(), not to be confused with the application main().
The runtime is responsible for initializing the memory segments and setting up the runtime environment. Please refer to the compiler documentation for detailed information about the runtime setup.
init() is defined as a constructor, which will be called before the application main(). We use this constructor to run targetSetup() to initialize the platform.
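A minimal sketch of this pattern, assuming a GCC or Clang compatible toolchain that supports the constructor function attribute. The body of targetSetup() is a stand-in and platform_is_ready() is a hypothetical helper added only to observe the effect.

```c
/* Set by the stand-in targetSetup() so the effect can be observed. */
static int platform_ready = 0;

/* Stand-in for the target-specific setup found under targets/<target>. */
void targetSetup(void)
{
    platform_ready = 1;
}

/* Constructor functions are run by the runtime before main() is entered. */
__attribute__((constructor)) static void init(void)
{
    targetSetup();
}

/* Hypothetical helper: reports whether targetSetup() has already run. */
int platform_is_ready(void)
{
    return platform_ready;
}
```

Because the constructor runs before main(), the application can assume the platform is initialized without calling any target-specific code itself.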
For each target there is a targets/<target> directory, which contains linker scripts and code needed to set up the target. targetSetup() is implemented in this folder and is responsible for initializing drivers, configuring the MPU, enabling caches etc.
Adding a new target would involve creating a new targets/<target> directory, providing linker scripts and implementing targetSetup().
Finally the runtime calls application main(). Ideally the application code should be generic and have no knowledge about which target it is executing on.
The Arm Ethos-U core platform is provided under an Apache-2.0 license. Please see LICENSE.txt for more information.
The Arm Ethos-U project welcomes contributions under the Apache-2.0 license.
Before we can accept your contribution, you need to certify its origin and give us your permission. For this process we use the Developer Certificate of Origin (DCO) V1.1 (https://developercertificate.org).
To indicate that you agree to the terms of the DCO, you "sign off" your contribution by adding a line with your name and e-mail address to every git commit message. You must use your real name; pseudonyms and anonymous contributions are not accepted. If there is more than one contributor, everyone adds their name and e-mail to the commit message.
Author: John Doe \<firstname.lastname@example.org\>
Date:   Mon Feb 29 12:12:12 2016 +0000

Title of the commit

Short description of the change.

Signed-off-by: John Doe email@example.com
Signed-off-by: Foo Bar firstname.lastname@example.org
The contributions will be code reviewed by Arm before they can be accepted into the repository.
Please see Security.
Arm, Cortex, Corstone and Ethos are registered trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.