commit | 3ec04ac9e38d26193e0081a8e0fa3b8b667bb688 | [log] [tgz] |
---|---|---|
author | Dwight Lidman <dwight.lidman@arm.com> | Thu Apr 30 11:54:48 2020 +0200 |
committer | Tim Hall <tim.hall@arm.com> | Thu Jun 18 17:53:52 2020 +0100 |
tree | d4c961583bbe7ff47a9a0313d72ff0871a44b72d | |
parent | 1629f331810de8ebff018259c75ee024857472e5 [diff] |
MLBEDSW-1498: Add Resize_Bilinear operator support This patch adds support for the ResizeBilinear operator. It is implemented using a 2x2 Nearest Neighbor upscale followed by a 2x2 Average Pool. Depending on the argument align_corners the output is either of shape: - (2 * M, 2 * N) when align_corners == True, or - (2 * M - 1, 2 * N - 1) when align_corners == False where (M, N) is the input shape. The padding mode is SAME when align_corners == True and VALID when align_corners == False. The argument half_pixel_centers is out of scope and is as of now ignored. Note that only upscaling by a factor of 2 is supported. Change-Id: Ia6d6d010c4f1bb13f5f839bc8d16872a626d9a3b Signed-off-by: Dwight Lidman <dwight.lidman@arm.com>
This tool is used to compile a TensorFlow Lite for Microcontrollers neural network model into an optimised version that can run on an embedded system containing an Ethos-U55 NPU.
The optimised model will contain TensorFlow Lite Custom operators for those parts of the model that can be accelerated by the Ethos-U55. Parts of the model that cannot be accelerated are left unchanged and will instead run on the Cortex-M series CPU using an appropriate kernel (such as the Arm optimised CMSIS-NN kernels).
After compilation the optimised model can only be run on an Ethos-U55 NPU embedded system.
The tool will also generate performance estimates (EXPERIMENTAL) for the compiled model.
Vela runs on the Linux operating system.
The following should be installed prior to the installation of Vela:
Before running, the Vela package must be installed along with all its dependencies. To do this, first change to the directory that contains this README.md file. Then use the command:
pip3 install -U setuptools>=40.1.0 pip3 install .
Or, if you use the pipenv
virtual environment tool:
pipenv install .
Vela is run with an input .tflite
file passed on the command line. This file contains the neural network to be compiled. The tool then outputs an optimised version with a _vela.tflite
file prefix, along with the performance estimate (EXPERIMENTAL) CSV files, all to the output directory.
If you use the pipenv
virtual environment tool then first start by spawning a shell in the virtual environment.:
pipenv shell
After which running Vela is the same regardless of whether you are in a virtual environment or not.
Example usage:
my_model.tflite
. The optimised version will be output to ./output/my_network_vela.tflite
.vela my_model.tflite
/path/to/my_model.tflite
and specify the output to go in the directory ./results_dir/
.vela --output-dir ./results_dir /path/to/my_model.tflite
vela --help
MySysConfig
settings that are described in the sys_cfg_vela.ini
system configuration file. More details can be found in the next section.vela --config sys_cfg_vela.ini --system-config MySysConfig my_model.tflite
This is used to describe various properties of the embedded system that the network will run in.
Example of a Vela system configuration file.
; File: sys_cfg_vela.ini ; The file contains two parts; a system config part and a CPU operator ; performance part. ; System config ; Specifies properties such as the core clock speed, the size and speed of the ; four potential memory areas, and for various types of data which memory area ; is used to store them. The cpu property is used to link with the CPU operator ; performance. ; The four potential memory areas are: Sram, Dram, OnChipFlash, OffChipFlash. [SysConfig.MySysConfig] npu_freq=500e6 cpu=MyCpu Sram_clock_scale=1 Sram_port_width=64 Dram_clock_scale=1 Dram_port_width=64 OnChipFlash_clock_scale=1 OnChipFlash_port_width=64 OffChipFlash_clock_scale=0.25 OffChipFlash_port_width=32 permanent_storage_mem_area=OffChipFlash feature_map_storage_mem_area=Sram fast_storage_mem_area=Sram ; CPU operator performance ; Specifies properties that are used by a linear model to estimate the ; performance for any operations that will be run on the CPU (such as those not ; supported by the NPU). Setting the intercept and slope to 0 will result in ; the operator being excluded from the performance estimation. This is the same ; as not specifying the operator. If an explicit cpu is specified rather than ; using the default then the cpu name must match the cpu specified in the ; SysConfig.<system config name> section. [CpuPerformance.MyCpuOperator] default.intercept=0.0 default.slope=1.0 MyCpu.intercept=0.0 MyCpu.slope=1.0
Contributions are accepted under Apache License 2.0. Only submit contributions where you have authored all of the code.
Vela is licensed under Apache License 2.0
Contributions are accepted under Apache-2.0. Only submit contributions where you have authored all of the code.
The Python codebase is PEP8 compliant with the exception of 120 characters line length. We run black and flake8 against the code base excluding "ethosu/vela/tflite/" and "ethosu/vela/ethos_u55_regs" directories because they are auto-generated by third party tools. Those tools are run using pre-commit framework. The configuration file is .pre-commit-config.yaml
To install pre-commit, run the following:
pipenv install -e . --dev
After the installation, pre-commit is available in the virtual environment.
To ease the development, we can run those sanity checks before committing the code. To install the git hook, run:
$ pre-commit install pre-commit installed at .git/hooks/pre-commit
The checks will be run before the commit: if one of them fails, you need to fix the code to make the checks pass.
Those checks can be run manually. This can be achievied running the following
$ pre-commit run flake8 --all-files ... $ pre-commit run black --all-files
If you don't specify anything after run, it will execute all the checks.