MLIA-650 Implement new CLI changes

Breaking change in the CLI and API: the sub-commands "optimization",
"operators", and "performance" were replaced by "check", which
incorporates the compatibility and performance checks, and "optimize",
which is used for optimization. The "get_advice" API was adapted to
these CLI changes.

API changes:

* Remove the previous advice category "all" that would perform all
  three operations (when possible). Replace it with the ability to pass
  a set of advice categories.
* Update api.get_advice method docstring to reflect new changes.
* Set default advice category to COMPATIBILITY
* Update core.common.AdviceCategory by renaming the "OPERATORS" advice
  category to "COMPATIBILITY" and removing the "ALL" enum member.
  Update all methods that previously used "OPERATORS" to use
  "COMPATIBILITY".
* Update core.context.ExecutionContext to have "COMPATIBILITY" as
  default advice_category instead of "ALL".
* Remove api.generate_supported_operators_report and all related
  functions from cli.commands, cli.helpers, cli.main, cli.options,
  core.helpers
* Update tests to reflect new API changes.
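The category change can be summarised with a minimal sketch (a stand-alone mirror of the enum described above, for illustration only; this is not the actual core.common module):

```python
from enum import Enum, auto

# Stand-alone mirror of core.common.AdviceCategory after this change:
# "OPERATORS" is renamed to "COMPATIBILITY" and the "ALL" member is gone.
class AdviceCategory(Enum):
    COMPATIBILITY = auto()  # the new default category
    PERFORMANCE = auto()
    OPTIMIZATION = auto()

# Callers now request multiple kinds of advice by passing a set of
# categories instead of the removed "all" category:
requested = {AdviceCategory.COMPATIBILITY, AdviceCategory.PERFORMANCE}
assert AdviceCategory.COMPATIBILITY in requested
```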

CLI changes:

* Update README.md to contain information on the new CLI
* Remove the ability to generate a supported operators report from the
  MLIA CLI.
* Replace `mlia ops` and `mlia perf` with the new `mlia check` command
  that can be used to perform both operations.
* Replace `mlia opt` with the new `mlia optimize` command.
* Replace `--evaluate-on` flag with `--backend` flag
* Replace `--verbose` flag with `--debug` flag (no behaviour change).
* Remove the ability for the user to select the MLIA working directory.
  Create and use a temporary directory under /tmp instead.
* Change the behaviour of the `--output` flag: the output is no longer
  formatted automatically based on the file extension. Instead it is
  simply redirected to a file.
* Add the `--json` flag to specify that the output format should be
  JSON.
* Add command validators that are used to validate inter-dependent
  flags (e.g. backend validation based on target_profile).
* Add support for selecting built-in backends for both `check` and
  `optimize` commands.
* Add new unit tests and update old ones to test the new CLI changes.
* Update RELEASES.md
* Update copyright notice
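The command validators mentioned above can be pictured with a small sketch (illustration only: the function name and the backend table below are hypothetical, not MLIA's actual code):

```python
# Illustrative sketch of a validator for inter-dependent CLI flags:
# reject --backend values that do not fit the chosen --target-profile.
def validate_backend(target_profile: str, backends: list[str]) -> None:
    """Raise if a requested backend does not support the target profile."""
    # Hypothetical example mapping, not MLIA's actual registry.
    supported = {
        "ethos-u55-256": {"Vela", "Corstone-300", "Corstone-310"},
        "tosa": {"tosa-checker"},
    }
    allowed = supported.get(target_profile, set())
    unsupported = [b for b in backends if b not in allowed]
    if unsupported:
        raise ValueError(
            f"Backend(s) {unsupported} not supported for "
            f"target profile '{target_profile}'"
        )

validate_backend("ethos-u55-256", ["Vela", "Corstone-300"])  # passes silently
```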

Change-Id: Ia6340797c7bee3acbbd26601950e5a16ad5602db
README.md

ML Inference Advisor - Introduction

The ML Inference Advisor (MLIA) is used to help AI developers design and optimize neural network models for efficient inference on Arm® targets (see supported targets) by enabling performance analysis and providing actionable advice early in the model development cycle. The final advice can cover supported operators, performance analysis and suggestions for model optimization (e.g. pruning, clustering, etc.).

Inclusive language commitment

This product conforms to Arm's inclusive language policy and, to the best of our knowledge, does not contain any non-inclusive language.

If you find something that concerns you, email terms@arm.com.

Releases

Release notes can be found in MLIA releases.

Getting support

In case you need support or want to report an issue, give us feedback or simply ask a question about MLIA, please send an email to mlia@arm.com.

Alternatively, use the AI and ML forum to get support by marking your post with the MLIA tag.

Reporting vulnerabilities

Information on reporting security issues can be found in Reporting vulnerabilities.

License

ML Inference Advisor is licensed under Apache License 2.0.

Trademarks and copyrights

  • Arm®, Arm® Ethos™-U, Arm® Cortex®-A, Arm® Cortex®-M, Arm® Corstone™ are registered trademarks or trademarks of Arm® Limited (or its subsidiaries) in the U.S. and/or elsewhere.
  • TensorFlow™ is a trademark of Google® LLC.
  • Keras™ is a trademark of François Chollet.
  • Linux® is the registered trademark of Linus Torvalds in the U.S. and elsewhere.
  • Python® is a registered trademark of the PSF.
  • Ubuntu® is a registered trademark of Canonical.
  • Microsoft and Windows are trademarks of the Microsoft group of companies.

General usage

Prerequisites and dependencies

It is recommended to use a virtual environment for MLIA installation, and a typical setup for MLIA requires:

  • Ubuntu® 20.04.3 LTS (other OSs may work; the ML Inference Advisor has been tested on this one specifically)
  • Python® >= 3.8
  • Ethos™-U Vela dependencies (Linux® only)

Installation

MLIA can be installed with pip using the following command:

pip install mlia

It is highly recommended to create a new virtual environment to install MLIA.

First steps

After the installation, you can check that MLIA is installed correctly by opening your terminal, activating the virtual environment and typing the following command that should print the help text:

mlia --help

The ML Inference Advisor works with sub-commands, i.e. in general an MLIA command looks like this:

mlia [sub-command] [arguments]

Where the following sub-commands are available:

  • "check": perform compatibility or performance checks on the model
  • "optimize": apply specified optimizations

Detailed help about the different sub-commands can be shown like this:

mlia [sub-command] --help

The following sections go into further detail regarding the usage of MLIA.

Sub-commands

This section gives an overview of the available sub-commands for MLIA.

check

compatibility

Default check that MLIA runs. It lists the model's operators with information about their compatibility with the specified target.

Examples:

# List operator compatibility with Ethos-U55 with 256 MAC
mlia check ~/models/mobilenet_v1_1.0_224_quant.tflite --target-profile ethos-u55-256

# List operator compatibility with Cortex-A
mlia check ~/models/mobilenet_v1_1.0_224_quant.tflite --target-profile cortex-a

# Get help and further information
mlia check --help

performance

Estimate the model's performance on the specified target and print out statistics.

Examples:

# Use default parameters
mlia check ~/models/mobilenet_v1_1.0_224_quant.tflite \
    --target-profile ethos-u55-256 \
    --performance

# Explicitly specify the target profile and backend(s) to use with --backend
mlia check ~/models/ds_cnn_large_fully_quantized_int8.tflite \
    --target-profile ethos-u65-512 \
    --performance \
    --backend "Vela" "Corstone-310"

# Get help and further information
mlia check --help

optimize

This sub-command applies optimizations to a Keras model (.h5 or SavedModel) and shows the performance improvements compared to the original unoptimized model.

There are currently two optimization techniques available to apply:

  • pruning: Sets insignificant model weights to zero until the specified sparsity is reached.
  • clustering: Groups the weights into the specified number of clusters and then replaces the weight values with the cluster centroids.
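The pruning technique above can be illustrated with a toy NumPy sketch (illustration only; MLIA applies these optimizations via TensorFlow, and the function below is a hypothetical simplification):

```python
import numpy as np

def prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    k = int(weights.size * sparsity)  # number of weights to set to zero
    if k == 0:
        return weights.copy()
    # Magnitude below which weights are considered insignificant.
    threshold = np.sort(np.abs(weights), axis=None)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

w = np.array([0.9, -0.05, 0.4, 0.01, -0.7, 0.1])
print(prune(w, 0.5))  # 3 of the 6 weights are zeroed

# Clustering is analogous: the weights would instead be grouped into,
# e.g., 16 clusters and each weight replaced by its cluster centroid.
```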

More information about these techniques can be found online in the TensorFlow documentation, e.g. in the TensorFlow model optimization guides.

Note: A Keras model (.h5 or SavedModel) is required as input to perform the optimizations. Models in the TensorFlow Lite format are not supported.

Examples:

# Custom optimization parameters: pruning=0.6, clustering=16
mlia optimize ~/models/ds_cnn_l.h5 \
    --target-profile ethos-u55-256 \
    --pruning \
    --pruning-target 0.6 \
    --clustering \
    --clustering-target 16

# Get help and further information
mlia optimize --help

Target profiles

All sub-commands require the name of a target profile as an input parameter. The profiles currently available are described in the following sections.

Support for the different targets in the above sub-commands is provided via backends that need to be installed separately; see the Backend installation section.

Ethos-U

There are a number of predefined profiles for Ethos-U with the following attributes:

+---------------+-----+-----------------------------+----------------+
| Profile name  | MAC | System config               | Memory mode    |
+===============+=====+=============================+================+
| ethos-u55-256 | 256 | Ethos_U55_High_End_Embedded | Shared_Sram    |
+---------------+-----+-----------------------------+----------------+
| ethos-u55-128 | 128 | Ethos_U55_High_End_Embedded | Shared_Sram    |
+---------------+-----+-----------------------------+----------------+
| ethos-u65-512 | 512 | Ethos_U65_High_End          | Dedicated_Sram |
+---------------+-----+-----------------------------+----------------+
| ethos-u65-256 | 256 | Ethos_U65_High_End          | Dedicated_Sram |
+---------------+-----+-----------------------------+----------------+

Example:

mlia check ~/model.tflite --target-profile ethos-u65-512 --performance

Ethos-U is supported by the Corstone-300, Corstone-310 and Vela backends (see the Available backends section below).

Cortex-A

The profile cortex-a can be used to get information about supported operators for Cortex-A CPUs when using the Arm NN TensorFlow Lite delegate. Please find more details in the section for the corresponding backend.

TOSA

The target profile tosa can be used for TOSA compatibility checks of your model. It requires the TOSA Checker backend.


Backend installation

The ML Inference Advisor is designed to use backends to provide different metrics for different target hardware. Some backends come pre-installed with MLIA, while others can be added and managed using the command mlia-backend, which provides the following functionality:

  • install
  • uninstall
  • list

Examples:

# List backends installed and available for installation
mlia-backend list

# Install Corstone-300 backend for Ethos-U
mlia-backend install Corstone-300 --path ~/FVP_Corstone_SSE-300/

# Uninstall the Corstone-300 backend
mlia-backend uninstall Corstone-300

# Get help and further information
mlia-backend --help

Note: Some, but not all, backends can be downloaded automatically if no path is provided.

Available backends

This section lists the available backends. As not all backends work on every platform, the following table shows some compatibility information:

+---------------------------------+------------------------+----------------+------------------+
| Backend                         | Linux                  | Windows        | Python           |
+=================================+========================+================+==================+
| Arm NN TensorFlow Lite delegate | x86_64                 | Windows 10     | Python>=3.8      |
+---------------------------------+------------------------+----------------+------------------+
| Corstone-300                    | x86_64                 | Not compatible | Python>=3.8      |
+---------------------------------+------------------------+----------------+------------------+
| Corstone-310                    | x86_64                 | Not compatible | Python>=3.8      |
+---------------------------------+------------------------+----------------+------------------+
| TOSA checker                    | x86_64 (manylinux2014) | Not compatible | 3.7<=Python<=3.9 |
+---------------------------------+------------------------+----------------+------------------+
| Vela                            | x86_64                 | Windows 10     | Python~=3.7      |
+---------------------------------+------------------------+----------------+------------------+

Arm NN TensorFlow Lite delegate

This backend provides general information about the compatibility of operators with the Arm NN TensorFlow Lite delegate for Cortex-A. It comes pre-installed with MLIA.


Corstone-300

Corstone-300 is a backend that provides performance metrics for systems based on Cortex-M55 and Ethos-U. It is only available on the Linux platform.

Examples:

# Download and install Corstone-300 automatically
mlia-backend install Corstone-300
# Point to a local version of Corstone-300 installed using its installation script
mlia-backend install Corstone-300 --path YOUR_LOCAL_PATH_TO_CORSTONE_300

For further information about Corstone-300 please refer to: https://developer.arm.com/Processors/Corstone-300

Corstone-310

Corstone-310 is a backend that provides performance metrics for systems based on Cortex-M85 and Ethos-U. It is available via Arm Virtual Hardware (AVH) only, i.e. it cannot be downloaded automatically.

TOSA Checker

The TOSA Checker backend provides operator compatibility checks against the TOSA specification.

Please install it into the same environment as MLIA using this command:

mlia-backend install tosa-checker


Vela

The Vela backend provides performance metrics for Ethos-U based systems. It comes pre-installed with MLIA.
