commit | 3e4b193f783c2d43547123518cadd1b2a9b11055 | |
---|---|---|
author | Gunes Bayir <gunbay01@e120783.arm.com> | Sat Mar 16 23:40:39 2024 +0000 |
committer | Gunes Bayir <gunes.bayir@arm.com> | Mon Mar 18 14:45:59 2024 +0000 |
tree | 19e58f8e06d0ca698de5b735ab3a62ef478404ca | |
parent | 5a6773343f81cec47baa80dcac13bf72168bd987 | |
Fix quant. gemv kernel driver by adding set_quantized_bias()

arm_gemm fuses the actual bias addition with the output stage in quantized gemm. The output stage, in its very basic form, is:

    A_offset * B_offset - sum(A_row_i) * B_offset - sum(B_col_j) * A_offset

Matrix B is usually constant (e.g. weight matrix in convolutions). Therefore, except the middle term above, the expression is constant across the same output row because the column sums of matrix B are pre-calculated. The bias is also usually constant. When it is, it makes sense to add the bias vector to the above sum and just perform a single addition on top of the output tensor.

For this to happen, the column sum computation of B tensor must account for the bias. This is ensured by set_quantized_bias() method in the interface. This function passes the bias pointer and strides to arm_gemm.

Gemv_pretransposed does not implement set_quantized_bias() and uses the parent function, which does nothing. Therefore, the bias is not added to the output. This causes tests to fail.

Resolves: COMPMID-6928

Change-Id: Iba24fabc65fdc47edb12db6abff2fb47784c0743
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/11310
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
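The folding of the bias into the pre-calculated column term, as described in the commit message, can be illustrated with a small standalone sketch. This is not arm_gemm code: the function names (`gemm_reference`, `precompute_col_terms`, `gemm_fused`), the row-major layout and the unsigned 8-bit inputs are all assumptions made for illustration. The sketch also uses the textbook expansion of `(a - A_offset) * (b - B_offset)`, which carries a factor K on the offset product; the sign and scaling conventions inside arm_gemm may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference result: subtract the offsets from every element, multiply,
// accumulate in int32, and start each accumulator from the column's bias.
std::vector<int32_t> gemm_reference(const std::vector<uint8_t> &a,    // M x K, row-major
                                    const std::vector<uint8_t> &b,    // K x N, row-major
                                    const std::vector<int32_t> &bias, // N entries
                                    std::size_t M, std::size_t K, std::size_t N,
                                    int32_t a_offset, int32_t b_offset)
{
    assert(a.size() == M * K && b.size() == K * N && bias.size() == N);
    std::vector<int32_t> out(M * N);
    for (std::size_t i = 0; i < M; ++i)
    {
        for (std::size_t j = 0; j < N; ++j)
        {
            int32_t acc = bias[j];
            for (std::size_t k = 0; k < K; ++k)
            {
                acc += (static_cast<int32_t>(a[i * K + k]) - a_offset) *
                       (static_cast<int32_t>(b[k * N + j]) - b_offset);
            }
            out[i * N + j] = acc;
        }
    }
    return out;
}

// One-off step for a constant B: everything that depends only on column j.
// col_term[j] = bias[j] + K * a_offset * b_offset - a_offset * sum_k B[k][j]
// (hypothetical helper; it plays the role the commit assigns to the
// bias-aware column sum computation).
std::vector<int32_t> precompute_col_terms(const std::vector<uint8_t> &b,
                                          const std::vector<int32_t> &bias,
                                          std::size_t K, std::size_t N,
                                          int32_t a_offset, int32_t b_offset)
{
    assert(b.size() == K * N && bias.size() == N);
    std::vector<int32_t> col_term(N);
    for (std::size_t j = 0; j < N; ++j)
    {
        int32_t col_sum = 0;
        for (std::size_t k = 0; k < K; ++k)
        {
            col_sum += static_cast<int32_t>(b[k * N + j]);
        }
        col_term[j] = bias[j] + static_cast<int32_t>(K) * a_offset * b_offset - a_offset * col_sum;
    }
    return col_term;
}

// Per-call work: the raw uint8 product, the row-dependent term, and a single
// addition of the precomputed column term, which already contains the bias.
std::vector<int32_t> gemm_fused(const std::vector<uint8_t> &a,
                                const std::vector<uint8_t> &b,
                                const std::vector<int32_t> &col_term,
                                std::size_t M, std::size_t K, std::size_t N,
                                int32_t b_offset)
{
    assert(a.size() == M * K && b.size() == K * N && col_term.size() == N);
    std::vector<int32_t> out(M * N);
    for (std::size_t i = 0; i < M; ++i)
    {
        int32_t row_sum = 0;
        for (std::size_t k = 0; k < K; ++k)
        {
            row_sum += static_cast<int32_t>(a[i * K + k]);
        }
        for (std::size_t j = 0; j < N; ++j)
        {
            int32_t acc = 0;
            for (std::size_t k = 0; k < K; ++k)
            {
                acc += static_cast<int32_t>(a[i * K + k]) * static_cast<int32_t>(b[k * N + j]);
            }
            out[i * N + j] = acc - b_offset * row_sum + col_term[j];
        }
    }
    return out;
}
```

With M = 1 this is the gemv case the commit addresses. If the column terms are computed without the bias, `gemm_fused` returns the reference result minus `bias[j]` in every column, which corresponds to the failure mode the commit describes.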
⚠ Deprecation Notice 24.01 announcement: NCHW data format specific optimizations will gradually be removed from the code base in future releases. The implication of this is that the user is expected to translate NCHW models into NHWC in order to benefit from the optimizations.
The Compute Library is a collection of low-level machine learning functions optimized for Arm® Cortex®-A and Arm® Neoverse® CPU architectures and Arm® Mali™ GPU architectures.
The library provides superior performance to other open source alternatives and immediate support for new Arm® technologies, e.g. SVE2.
Key Features:
Repository | Link |
---|---|
Release | https://github.com/arm-software/ComputeLibrary |
Development | https://review.mlplatform.org/#/admin/projects/ml/ComputeLibrary |
Note: The documentation includes the reference API, changelogs, build guide, contribution guide, errata, etc.
All the binaries can be downloaded from here or from the tables below.
Platform | Operating System | Release archive (Download) |
---|---|---|
Raspberry Pi 4 | Linux® 32bit | |
Raspberry Pi 4 | Linux® 64bit | |
Odroid N2 | Linux® 64bit | |
HiKey960 | Linux® 64bit | |
Architecture | Operating System | Release archive (Download) |
---|---|---|
armv7 | Linux® | |
arm64-v8a | Android™ | |
arm64-v8a | Linux® | |
arm64-v8.2-a | Android™ | |
arm64-v8.2-a | Linux® | |
Please refer to the following link for more pre-built binaries:
Pre-built binaries are generated with the following flags, which relate to security and good coding practices:
-Wall, -Wextra, -Wformat=2, -Winit-self, -Wstrict-overflow=2, -Wswitch-default, -Woverloaded-virtual, -Wformat-security, -Wctor-dtor-privacy, -Wsign-promo, -Weffc++, -pedantic, -fstack-protector-strong
Arm® CPUs:
Arm® Mali™ GPUs:
x86
⚠ Important: Bazel and CMake builds are experimental, CPU-only builds. Please see the documentation for more details.
Contributions to the Compute Library are more than welcome. If you are interested in contributing, please have a look at our how-to-contribute guidelines.
Before the Compute Library accepts your contribution, you need to certify its origin and give us your permission. To manage this process we use the Developer Certificate of Origin (DCO) V1.1 (https://developercertificate.org/).
To indicate that you agree to the terms of the DCO, you "sign off" your contribution by adding a line with your name and e-mail address to every git commit message:
Signed-off-by: John Doe <john.doe@example.org>
You must use your real name; no pseudonyms or anonymous contributions are accepted.
For technical discussion, the ComputeLibrary project has a public mailing list: acl-dev@lists.linaro.org. The list is open to anyone inside or outside of Arm to self-subscribe. To subscribe, please visit the following website: https://lists.linaro.org/mailman3/lists/acl-dev.lists.linaro.org/
The software is provided under MIT license. Contributions to this project are accepted under the same license.
This project contains code from other projects as listed below. The original license text is included in those source files.
The OpenCL header library is licensed under Apache License, Version 2.0, which is a permissive license compatible with MIT license.
The half library is licensed under MIT license.
The libnpy library is licensed under MIT license.
The stb image library is either licensed under MIT license or is in Public Domain. It is used by this project under the terms of MIT license.
Android is a trademark of Google LLC.
Arm, Cortex, Mali and Neon are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere.
Bazel is a trademark of Google LLC, registered in the U.S. and other countries.
CMake is a trademark of Kitware, Inc., registered in the U.S. and other countries.
Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries.
Mac and macOS are trademarks of Apple Inc., registered in the U.S. and other countries.
Tizen is a registered trademark of The Linux Foundation.
Windows® is a trademark of the Microsoft group of companies.