Rewrite dynamic fusion

The new version introduces the following major changes:

* Change public interface to simplify and standardize the user experience
    - Use the term "Workload" uniformly
    - Simplify operator interface to be a set of static methods:
      validate_op(), create_op()

* Separate the kernel writing into its own component (template_writer).
  This is to allow the co-development of GpuKernelWriter, and to allow
  easy replacement once GpuKernelWriter is mature.

* Optimize the core fusion algorithm used by the component graph. The
  details can be found in GpuKernelComponentGraph::fuse()

* Use Gpu instead of Cl prefixes for most of the Workload interfaces
  (except for runtime and kernel components, which have to be language
  specific). This allows potential extension to other GPU languages in
  the future.

* Refactor runtime memory interface so that auxiliary tensor handling
  is separate from the user tensor passing. This is because the former
  is less stable and may require extension in the future.

* Hide source code object from the user as it is not required at the
  moment

* Deprecate the old prototype entirely by disabling it in SCons build
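
As an illustration of the static-method operator interface mentioned above, here is a minimal sketch. Only the names validate_op() and create_op() come from this change description; TensorInfo, Status, WorkloadSketch and GpuAddSketch are hypothetical stand-ins invented for the example, and the real signatures in the library may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins: none of these types are the library's real classes.
struct TensorInfo
{
    std::vector<int> shape;
};

struct Status
{
    bool        ok;
    std::string message;
};

// Hypothetical container that records the operators fused into a workload.
struct WorkloadSketch
{
    std::vector<std::string> ops;
};

// An operator exposed purely through static methods, mirroring the
// validate_op()/create_op() naming used in the change description.
class GpuAddSketch
{
public:
    GpuAddSketch() = delete; // static-only interface, never instantiated

    // Check whether the operator could be added; does not modify the sketch.
    static Status validate_op(const WorkloadSketch &, const TensorInfo &lhs, const TensorInfo &rhs)
    {
        if (lhs.shape != rhs.shape)
        {
            return {false, "shape mismatch"};
        }
        return {true, ""};
    }

    // Validate first, then record the operator in the sketch on success.
    static Status create_op(WorkloadSketch &sketch, const TensorInfo &lhs, const TensorInfo &rhs)
    {
        const Status status = validate_op(sketch, lhs, rhs);
        if (status.ok)
        {
            sketch.ops.push_back("add");
        }
        return status;
    }
};
```

With this shape, a caller can probe a configuration through GpuAddSketch::validate_op() without side effects, and only a successful GpuAddSketch::create_op() changes the workload being built.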

Resolves COMPMID-5510, COMPMID-5512, COMPMID-5513

Change-Id: If69d2362856f2de4503546b7b6cf48a525cf3079
Signed-off-by: SiCong Li <sicong.li@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8406
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
62 files changed
README.md

⚠ Important: From release 22.05, the 'master' branch has been replaced with 'main' following our inclusive language update, more information here.

⚠ Important: From release 22.08, armv7a with Android build will no longer be tested or maintained.

Compute Library

The Compute Library is a collection of low-level machine learning functions optimized for Arm® Cortex®-A, Arm® Neoverse® and Arm® Mali™ GPU architectures.

The library provides superior performance to other open source alternatives and immediate support for new Arm® technologies e.g. SVE2.

Key Features:

  • Open source software available under a permissive MIT license
  • Over 100 machine learning functions for CPU and GPU
  • Multiple convolution algorithms (GeMM, Winograd, FFT, Direct and indirect-GeMM)
  • Support for multiple data types: FP32, FP16, INT8, UINT8, BFLOAT16
  • Micro-architecture optimization for key ML primitives
  • Highly configurable build options enabling lightweight binaries
  • Advanced optimization techniques such as kernel fusion, Fast math enablement and texture utilization
  • Device and workload specific tuning using OpenCL tuner and GeMM optimized heuristics

Repository     Link
Release        https://github.com/arm-software/ComputeLibrary
Development    https://review.mlplatform.org/#/admin/projects/ml/ComputeLibrary

Documentation

Note: The documentation includes the reference API, changelogs, build guide, contribution guide, errata, etc.

Pre-built binaries

All the binaries can be downloaded from here or from the tables below.

Platform          Operating System    Release archive (Download)
Raspberry Pi 4    Linux 32bit
Raspberry Pi 4    Linux 64bit
Odroid N2         Linux 64bit
HiKey960          Linux 64bit

Architecture      Operating System    Release archive (Download)
armv7             Linux
arm64-v8a         Android
arm64-v8a         Linux
arm64-v8.2-a      Android
arm64-v8.2-a      Linux

Pre-built binaries are generated with the following flags, which relate to security and good coding practices:

-Wall, -Wextra, -Wformat=2, -Winit-self, -Wstrict-overflow=2, -Wswitch-default, -Woverloaded-virtual, -Wformat-security, -Wctor-dtor-privacy, -Wsign-promo, -Weffc++, -pedantic, -fstack-protector-strong

Supported Architectures/Technologies

  • Arm® CPUs:

    • Arm® Cortex®-A processor family using Arm® Neon™ technology
    • Arm® Neoverse® processor family
    • Arm® Cortex®-R processor family with Armv8-R AArch64 architecture using Arm® Neon™ technology
    • Arm® Cortex®-X1 processor using Arm® Neon™ technology
  • Arm® Mali™ GPUs:

    • Arm® Mali™-G processor family
    • Arm® Mali™-T processor family
  • x86

Supported Systems

  • Android™
  • Bare Metal
  • Linux®
  • OpenBSD®
  • macOS®
  • Tizen™

Resources

How to contribute

Contributions to the Compute Library are more than welcome. If you are interested in contributing, please have a look at our how to contribute guidelines.

Developer Certificate of Origin (DCO)

Before the Compute Library accepts your contribution, you need to certify its origin and give us your permission. To manage this process we use the Developer Certificate of Origin (DCO) V1.1 (https://developercertificate.org/).

To indicate that you agree to the terms of the DCO, you "sign off" your contribution by adding a line with your name and e-mail address to every git commit message:

Signed-off-by: John Doe <john.doe@example.org>

You must use your real name; no pseudonyms or anonymous contributions are accepted.

Public mailing list

For technical discussion, the ComputeLibrary project has a public mailing list: acl-dev@lists.linaro.org. The list is open to anyone inside or outside of Arm to self-subscribe. To subscribe, please visit the following website: https://lists.linaro.org/mailman3/lists/acl-dev.lists.linaro.org/

License and Contributions

The software is provided under the MIT license. Contributions to this project are accepted under the same license.

Other Projects

This project contains code from other projects as listed below. The original license text is included in those source files.

  • The OpenCL header library is licensed under Apache License, Version 2.0, which is a permissive license compatible with MIT license.

  • The half library is licensed under MIT license.

  • The libnpy library is licensed under MIT license.

  • The stb image library is either licensed under MIT license or is in Public Domain. It is used by this project under the terms of MIT license.

Trademarks and Copyrights

Android is a trademark of Google LLC.

Arm, Cortex, Mali and Neon are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.

Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.

Mac and macOS are trademarks of Apple Inc., registered in the U.S. and other countries.

Tizen is a registered trademark of The Linux Foundation.