Arm® ML embedded evaluation kit


  • Arm® and Cortex® are registered trademarks of Arm® Limited (or its subsidiaries) in the US and/or elsewhere.
  • Arm® and Ethos™ are registered trademarks or trademarks of Arm® Limited (or its subsidiaries) in the US and/or elsewhere.
  • Arm® and Corstone™ are registered trademarks or trademarks of Arm® Limited (or its subsidiaries) in the US and/or elsewhere.
  • Arm®, Keil® and µVision® are registered trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.
  • TensorFlow™, the TensorFlow logo, and any related marks are trademarks of Google Inc.


Before starting the setup process, please make sure that you have:

Note: There are two Arm® Corstone™-300 implementations available for the MPS3 FPGA board: application notes AN547 and AN552. We are aligned with the latest application note, AN552. However, an application built for the MPS3 target should work on both FPGA packages.

Additional reading

This document contains information that is specific to Arm® Ethos™-U55 and Arm® Ethos™-U65 products. Please refer to the following documents for additional information:

To access Arm documentation online, please visit:

Repository structure

The repository has the following structure:

├── CMakeLists.txt
├── dependencies
├── docs
├── model_conditioning_examples
├── resources
├── /resources_downloaded/
├── scripts
     ├── cmake
         ├── platforms
               ├── mps3
               ├── native
               └── simple_platform
         └── ...
     └── ...
├── source
     ├── application
         ├── api
             ├── common
             └── use_case
         └── main
     ├── hal
         ├── include
         └── source
     ├── log
         └── include
     ├── math
         └── include
     ├── profiler
         └── include
     └── use_case
         └── <usecase_name>
             ├── include
             ├── src
             └── usecase.cmake
└── tests

What these folders contain:

  • dependencies: All the third-party dependencies for this project. These are either populated by git submodule or by downloading packages in the required hierarchy. See

  • docs: Detailed documentation for this repository.

  • model_conditioning_examples: Short example scripts that demonstrate some of the methods available in TensorFlow to condition your model in preparation for deployment on the Arm Ethos-U NPU.

  • resources: Contains resources for the ML use-case applications, such as input data and label files.

  • resources_downloaded: created by, contains downloaded resources for ML use-cases applications such as models, test data, etc. It also contains a Python virtual environment with all the required packages installed.

  • scripts: Build and source generation scripts.

  • scripts/cmake/platforms: The platform build configuration scripts, build_configuration.cmake, are located here. These scripts add the platform sources to the application build stream. Each script has two functions:

    • set_platform_global_defaults - sets the platform source locations and other build options.
    • platform_custom_post_build - executes platform-specific post-build steps. For example, the script for the MPS3 board creates the board-specific images.txt file and calls the bin generation command, while the script for the native profile compiles the unit tests.
  • source: C/C++ sources for the platform and ML applications.

    The contents of the application sub-folder are as follows:

    • application: All sources that form the core of the application. The use-case part of the sources depends on these sources, such as:
      • main: Contains the main function and calls to platform initialization logic to set up things before launching the main loop. Also contains sources common to all use-case implementations.

      • api: Contains platform-agnostic API that all the use case examples can use. It depends only on TensorFlow Lite Micro and math functionality exposed by math module. It is further subdivided into:

        • common: Common part of the API. This consists of the generic code like neural network model initialisation, running an inference, and some common logic used for image and audio use cases.

        • use_case: This contains "model" and "processing" APIs for each individual use case. For example, the KWS use case contains a class for a generic KWS neural network model, and the "processing" API gives the user an easier way to drive the MFCC calculations.

NOTE: The API here is also used to export a CMSIS-pack from this repository. It is therefore imperative that the sources here do not depend on any HAL component or drive any platform-dependent logic. If you are looking to reuse components from this repository for your application-level logic, this directory should be the prime candidate.

  • hal: Contains Hardware Abstraction Layer (HAL) sources, providing a platform-agnostic API to access hardware platform-specific functions.

Note: Common code related to the Arm Ethos-U NPU software framework resides in the hal/components sub-folder.

  • log: Logging macros, common to all code, that manage log levels.

  • math: Math functions to be used in ML pipelines. Some of them use CMSIS DSP for optimized execution on Arm CPUs. It is a separate CMake project that is built into a static library libarm_math.a.

  • profiler: Profiling utility code that collects and outputs cycle counts and PMU information. It is a separate CMake project that is built into a static library libprofiler.a.

  • use_case: Contains the ML use-case specific logic. Storing it in a separate subfolder helps isolate the ML-specific application logic, on the assumption that the application performs the setup required for this logic to run. It also makes it easier to add a new use-case block.

  • tests: Contains the x86 tests for the use-case applications.
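The requirement in the NOTE above, that the API sources must not depend on any HAL component, can be spot-checked with a small script. The following is a minimal sketch in Python, not a tool shipped with this repository. It assumes that a HAL dependency shows up as an include of one of the top-level HAL headers (hal.h, hal_lcd.h, hal_pmu.h). The function name hal_includes and the header Model.hpp are hypothetical.

```python
import re

# Header names taken from the HAL include folder. Treating any of them as
# off-limits inside the API sources is this sketch's assumption.
HAL_HEADERS = {"hal.h", "hal_lcd.h", "hal_pmu.h"}

# Matches both #include "x.h" and #include <x.h> at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def hal_includes(source_text: str) -> list[str]:
    """Return the HAL headers that a C/C++ source text includes."""
    included = INCLUDE_RE.findall(source_text)
    # Compare on the file name only, so "some/dir/hal.h" is also caught.
    return [h for h in included if h.rsplit("/", 1)[-1] in HAL_HEADERS]


# Hypothetical file contents used only for demonstration.
clean = '#include "Model.hpp"\n#include <cstdint>\n'
leaky = '#include "Model.hpp"\n#include "hal.h"\n'

print(hal_includes(clean))
print(hal_includes(leaky))
```

A check like this could be run over the files under source/application/api before exporting a pack. It only looks at direct includes, so it would not catch a HAL dependency pulled in through another header.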

The HAL has the following structure:

├── CMakeLists.txt
├── include
     ├── hal.h
     ├── hal_lcd.h
     └── hal_pmu.h
└── source
    ├── components
         ├── cmsis_device
         ├── lcd
         ├── npu
         ├── npu_ta
         ├── platform_pmu
         └── stdout
    ├── hal.c
    ├── hal_pmu.c
    └── platform
        ├── mps3
        ├── native
        └── simple

The HAL is built as a separate project into a static library, libhal.a, which is linked with the use-case executable.

What these folders contain:

  • The include folder and source/hal.c contain the HAL top-level platform API and the data acquisition, data presentation, and timer interfaces.

    Note: The files here and lower in the hierarchy have been written in C, and this layer is a clean C/C++ boundary in the sources.

  • The source/components directory contains the API and implementations for modules that can be reused across platforms. These include common functions for Arm Ethos-U NPU initialization, timing adapter block helpers, and others. Each component produces a static library that could potentially be linked into the platform library, so that the platform sources can use the corresponding module. For example, most of the use-cases use NPU and timing adapter initialization. Similarly, the LCD component provides a standard LCD API that is used by the HAL and propagated up the hierarchy. Two static library targets are provided for the LCD module: one with a stubbed implementation, and another that can drive the LCD on an Arm MPS3 target. If you want to run the default ML use-cases on a custom platform, you could reuse existing code from this directory, provided it is compatible with your platform.

  • source/components/cmsis_device has common startup code for Cortex-M based systems. The package defines the interrupt vector table and handlers. The reset handler, which is the starting point of our application, is also defined here. This entry point is responsible for the set-up before calling the user-defined "main" function in the higher-level application logic. It is a separate CMake project that is built into a static library libcmsis_device.a. It depends on a CMSIS repo through the CMSIS_SRC_PATH variable. The static library is used by the platform code.

  • source/platform/mps3
    source/platform/simple: These folders contain platform-specific declarations and defines, such as platform initialisation code, the peripheral memory map, system registers, the system-specific timer implementation, and others. The platform is built from the selected components and the configured CMSIS device. It is a separate CMake project that is built into a static library libplatform-drivers.a, which is linked into the HAL library.
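The link relationships described above can be summarised as a small dependency graph. The sketch below is only an illustration, written in Python rather than CMake. It encodes just the relationships stated in this section, and use_case_executable is a placeholder name rather than a real target.

```python
from graphlib import TopologicalSorter  # Python 3.9 or newer

# Each key is linked against the libraries in its value set, as described
# in the text above. "use_case_executable" is a placeholder name.
LINKS = {
    "libplatform-drivers.a": {"libcmsis_device.a"},
    "libhal.a": {"libplatform-drivers.a"},
    "use_case_executable": {"libhal.a"},
}

# static_order() yields every dependency before the target that needs it.
build_order = list(TopologicalSorter(LINKS).static_order())
print(build_order)
```

Because the stated relationships form a single chain, the order is unambiguous: the CMSIS device library first, then the platform drivers, then the HAL, and finally the use-case executable.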

Models and resources

The models used in the use-cases implemented in this project can be downloaded from:

When using the Ethos-U NPU backend, the Vela compiler optimizes the NN model. However, any part of the model that Vela cannot optimize for the NPU, but that TensorFlow Lite Micro supports, falls back to the CPU for execution.

Vela compiler

The Vela compiler is a tool that can optimize a neural network model into a version that can run on an embedded system containing the Ethos-U NPU.

The optimized model contains custom operators for the sub-graphs of the model that the Ethos-U NPU can accelerate. The remaining layers, which cannot be accelerated, are left unchanged and run on the CPU using optimized (CMSIS-NN) or reference kernels provided by the inference engine.
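To illustrate the idea, the following toy sketch groups a flat list of operators into runs that the NPU could accelerate and single operators that stay on the CPU. It is not Vela's algorithm: real models are graphs rather than lists, and the NPU_SUPPORTED set, the partition function, and the operator MY_CUSTOM_OP are assumptions made for the example.

```python
from itertools import groupby

# Hypothetical set of operators the NPU can accelerate. The real list is
# determined by Vela and the target Ethos-U configuration.
NPU_SUPPORTED = {"CONV_2D", "DEPTHWISE_CONV_2D", "ADD", "FULLY_CONNECTED"}


def partition(ops: list[str]) -> list[tuple[str, list[str]]]:
    """Group consecutive operators by where they would execute."""
    result = []
    for on_npu, run in groupby(ops, key=lambda op: op in NPU_SUPPORTED):
        run = list(run)
        if on_npu:
            # One custom operator stands in for the whole accelerated run.
            result.append(("NPU", run))
        else:
            # Operators that cannot be accelerated are left unchanged.
            result.extend(("CPU", [op]) for op in run)
    return result


# MY_CUSTOM_OP is a made-up operator that the toy NPU does not support.
model = ["CONV_2D", "DEPTHWISE_CONV_2D", "MY_CUSTOM_OP", "ADD", "FULLY_CONNECTED"]
plan = partition(model)
for target, ops in plan:
    print(target, ops)
```

In this example the first two operators and the last two operators each collapse into one NPU block, while MY_CUSTOM_OP remains a CPU operator between them.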

For detailed information, see: Optimize model with Vela compiler.


This section explains the build process and the intra-project dependencies. It describes how to build the code sample applications from sources and illustrates the build options and the process.

The following graph of source modules aims to better explain the intra-project code and build execution dependencies.

[Figure: intra-project dependencies]

The project can be built for the MPS3 FPGA and for an FVP emulating MPS3. Using the default values for the configuration parameters builds executable models that support the Ethos-U NPU.

For further information, please see:


This section describes how to deploy the code sample applications on the Fixed Virtual Platform (FVP) or the MPS3 board.

For further information, please see:

Implementing custom ML application

This section describes how to implement a custom Machine Learning application running on a platform supported by the repository, either an FVP or an MPS3 board.

The Cortex-M55 CPU and Ethos-U NPU Code Samples software project offers a way to incorporate extra use-case code into the existing infrastructure. It also provides a build system that automatically picks up the added functionality and produces a corresponding executable for each use-case.
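As a rough illustration of how added use-cases could be picked up automatically, the sketch below lists every sub-folder that contains a usecase.cmake file, following the layout shown in the repository structure. The actual build system is CMake based and may work differently. The function discover_use_cases and the folder names used in the demonstration are hypothetical.

```python
import tempfile
from pathlib import Path


def discover_use_cases(use_case_root: Path) -> list[str]:
    """Return the names of sub-folders that contain a usecase.cmake file."""
    return sorted(
        entry.name
        for entry in use_case_root.iterdir()
        if entry.is_dir() and (entry / "usecase.cmake").is_file()
    )


# Demonstration on a throwaway tree. "kws" and "img_class" are
# hypothetical folder names chosen for the example.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "source" / "use_case"
    for name in ("kws", "img_class"):
        (root / name / "src").mkdir(parents=True)
        (root / name / "usecase.cmake").touch()
    (root / "notes").mkdir()  # no usecase.cmake, so it is ignored
    found = discover_use_cases(root)

print(found)
```

Under this convention, adding a use-case amounts to creating a new folder with its own include, src, and usecase.cmake entries, without editing a central list.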

For further information, please see:

Testing and benchmarking

Please refer to: Testing and benchmarking.

Memory Considerations

Please refer to:


For further information, please see:


Please refer to:


Please refer to: FAQ