Vidhya Sudhan Loganathan | d646ae1 | 2018-11-19 15:18:20 +0000 | [diff] [blame] | 1 | /// |
Adnan AlSinan | abc093b | 2022-02-08 16:57:06 +0000 | [diff] [blame] | 2 | /// Copyright (c) 2017-2022 Arm Limited. |
Vidhya Sudhan Loganathan | d646ae1 | 2018-11-19 15:18:20 +0000 | [diff] [blame] | 3 | /// |
| 4 | /// SPDX-License-Identifier: MIT |
| 5 | /// |
| 6 | /// Permission is hereby granted, free of charge, to any person obtaining a copy |
| 7 | /// of this software and associated documentation files (the "Software"), to |
| 8 | /// deal in the Software without restriction, including without limitation the |
| 9 | /// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or |
| 10 | /// sell copies of the Software, and to permit persons to whom the Software is |
| 11 | /// furnished to do so, subject to the following conditions: |
| 12 | /// |
| 13 | /// The above copyright notice and this permission notice shall be included in all |
| 14 | /// copies or substantial portions of the Software. |
| 15 | /// |
| 16 | /// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR |
| 17 | /// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, |
| 18 | /// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE |
| 19 | /// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER |
| 20 | /// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, |
| 21 | /// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE |
| 22 | /// SOFTWARE. |
| 23 | /// |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 24 | namespace arm_compute |
| 25 | { |
Sheri Zhang | d813bab | 2021-04-30 16:53:41 +0100 | [diff] [blame] | 26 | /** @page versions_changelogs Release Versions and Changelog |
Anthony Barbier | 6ff3b19 | 2017-09-04 18:44:23 +0100 | [diff] [blame] | 27 | |
| 28 | @tableofcontents |
| 29 | |
Sheri Zhang | d813bab | 2021-04-30 16:53:41 +0100 | [diff] [blame] | 30 | @section S2_1_versions Release versions |
Anthony Barbier | 6ff3b19 | 2017-09-04 18:44:23 +0100 | [diff] [blame] | 31 | |
| 32 | All releases are numbered vYY.MM, where YY is the last two digits of the year and MM is the month number. |
| 33 | If there is more than one release in a month then an extra sequential number is appended at the end: |
| 34 | |
| 35 | v17.03 (First release of March 2017) |
| 36 | v17.03.1 (Second release of March 2017) |
| 37 | v17.04 (First release of April 2017) |
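The numbering scheme above can be sketched as a small helper. This is only an illustration: `release_tag` and its parameters are hypothetical names and are not part of the library.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper (not a library function): builds a release tag from the
// year, the month and the zero-based index of the release within that month.
std::string release_tag(int year, int month, int index_in_month)
{
    assert(year >= 0);
    assert(month >= 1 && month <= 12);
    assert(index_in_month >= 0);

    // vYY.MM: last two digits of the year, then the zero-padded month number.
    char buf[32];
    std::snprintf(buf, sizeof(buf), "v%02d.%02d", year % 100, month);
    std::string tag(buf);

    // Second and later releases in the same month get a sequential suffix.
    if(index_in_month > 0)
    {
        tag += "." + std::to_string(index_in_month);
    }
    return tag;
}
```

With this sketch, `release_tag(2017, 3, 0)` yields `v17.03` and `release_tag(2017, 3, 1)` yields `v17.03.1`, matching the examples listed above.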
| 38 | |
| 39 | @note We aim to publish one major public release with new features per quarter. All releases in between contain only bug fixes. |
Ramy Elgammal | fa8ff8e | 2022-08-12 16:57:10 +0100 | [diff] [blame] | 40 | @note Starting from release 22.05, the 'master' branch is no longer used; it has been replaced by 'main'. Please update your clone jobs accordingly. |
Anthony Barbier | 6ff3b19 | 2017-09-04 18:44:23 +0100 | [diff] [blame] | 41 | |
Sheri Zhang | d813bab | 2021-04-30 16:53:41 +0100 | [diff] [blame] | 42 | @section S2_2_changelog Changelog |
Anthony Barbier | 6ff3b19 | 2017-09-04 18:44:23 +0100 | [diff] [blame] | 43 | |
Viet-Hoa Do | b1f8288 | 2022-11-11 11:29:50 +0000 | [diff] [blame] | 44 | v22.11 Public major release |
| 45 | - New features: |
| 46 | - Add new experimental dynamic fusion API. |
Viet-Hoa Do | 293ab60 | 2022-11-15 10:51:26 +0000 | [diff] [blame] | 47 | - Add CPU batch matrix multiplication with adj_x = false and adj_y = false for FP32. |
Viet-Hoa Do | b1f8288 | 2022-11-11 11:29:50 +0000 | [diff] [blame] | 48 | - Add CPU MeanStdDevNorm for QASYMM8. |
| 49 | - Add CPU and GPU GELU activation function for FP32 and FP16. |
| 50 | - Add CPU swish activation function for FP32 and FP16. |
| 51 | - Performance optimizations: |
| 52 | - Optimize CPU bilinear scale for FP32, FP16, QASYMM8, QASYMM8_SIGNED, U8 and S8. |
| 53 | - Optimize CPU activation functions using LUT-based implementation: |
| 54 | - Sigmoid function for QASYMM8 and QASYMM8_SIGNED. |
| 55 | - Hard swish function for QASYMM8_SIGNED. |
| 56 | - Optimize CPU addition for QASYMM8 and QASYMM8_SIGNED using fixed-point arithmetic. |
| 57 | - Optimize CPU multiplication, subtraction and activation layers by considering tensors as 1D. |
| 58 | - Optimize GPU depthwise convolution kernel and heuristic. |
| 59 | - Optimize GPU Conv2d heuristic. |
| 60 | - Optimize CPU MeanStdDevNorm for FP16. |
| 61 | - Optimize CPU tanh activation function for FP16 using rational approximation. |
| 62 | - Improve GPU GeMMLowp start-up time. |
| 63 | - Various optimizations and bug fixes. |
| 64 | |
SiCong Li | fe1b1f6 | 2022-05-19 18:58:31 +0100 | [diff] [blame] | 65 | v22.08 Public major release |
Ramy Elgammal | 0d274b7 | 2022-08-05 13:14:57 +0100 | [diff] [blame] | 66 | - Various bug fixes. |
| 67 | - Disable unsafe FP optimizations causing accuracy issues in: |
| 68 | - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink |
| 69 | - \link opencl::kernels::ClDirectConv3dKernel ClDirectConv3dKernel \endlink |
| 70 | - @ref CLDepthwiseConvolutionLayerNativeKernel |
| 71 | - Add Dynamic Fusion of Elementwise Operators: Div, Floor, Add. |
| 72 | - Optimize the gemm_reshaped_rhs_only_nt OpenCL kernel using the arm_matrix_multiply extension available for Arm® Mali™-G715 and Arm® Mali™-G615. |
| 73 | - Add support for the arm_matrix_multiply extension in the gemmlowp_mm_reshaped_only_rhs_t OpenCL kernel. |
| 74 | - Expand GPUTarget list with missing Mali™ GPUs product names: G57, G68, G78AE, G610, G510, G310. |
| 75 | - Extend the direct convolution 2d interface to configure the block size. |
| 76 | - Update ClConv2D heuristic to use direct convolution. |
| 77 | - Use official Khronos® OpenCL extensions: |
| 78 | - Add cl_khr_integer_dot_product extension support. |
| 79 | - Add support of OpenCL 3.0 non-uniform workgroup. |
| 80 | - Cpu performance optimizations: |
| 81 | - Add LUT-based implementation of Hard Swish and Leaky ReLU activation function for aarch64 build. |
| 82 | - Optimize Add layer by considering the input tensors as 1D array. |
| 83 | - Add fixed-format BF16, FP16 and FP32 Neon™ GEMM kernels to support variable weights. |
| 84 | - Add new winograd convolution kernels implementation and update the ACL \link arm_compute::cpu::CpuWinogradConv2d CpuWinogradConv2d\endlink operator. |
| 85 | - Add experimental support for native builds for Windows on Arm®. |
Ramy Elgammal | 966218d | 2022-08-11 16:23:22 +0100 | [diff] [blame] | 86 | - Build flag interpretation change: arch=armv8.6-a now translates to the -march=armv8.6-a CXX flag instead of -march=armv8.2-a + explicit selection of feature extensions. |
SiCong Li | fe1b1f6 | 2022-05-19 18:58:31 +0100 | [diff] [blame] | 87 | - Build flag change: toolchain_prefix, compiler_prefix: |
Ramy Elgammal | 0d274b7 | 2022-08-05 13:14:57 +0100 | [diff] [blame] | 88 | - Use empty string "" to suppress any prefixes. |
| 89 | - Use "auto" to use default (auto) prefixes chosen by the build script. This is the default behavior when unspecified. |
| 90 | - Any other string will be used as custom prefixes to the compiler and the rest of toolchain tools. |
| 91 | - The default behaviour when prefix is unspecified does not change, but its signifier has been changed from empty string "" to "auto". |
| 92 | - armv7a with Android build will no longer be tested or maintained. |
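The three cases for toolchain_prefix and compiler_prefix described in the v22.08 entry can be summarized with the sketch below. It is illustrative only: the real logic lives in the build scripts, and `resolve_prefix`, `auto_prefix` and the sample prefix strings in the usage note are assumptions made for this example.

```cpp
#include <cassert>
#include <string>

// Illustrative only (not the build script's code). `auto_prefix` stands in for
// whatever prefix the build script would choose by default for the target.
// `option` is the value given to the build flag; leaving it out models the
// "unspecified" case, which behaves like "auto".
std::string resolve_prefix(const std::string &auto_prefix, const std::string &option = "auto")
{
    if(option.empty())
    {
        return ""; // Empty string: suppress any prefix.
    }
    if(option == "auto")
    {
        return auto_prefix; // "auto": use the default chosen by the build script.
    }
    return option; // Any other string: use it as a custom prefix.
}
```

For example, with a hypothetical default of `aarch64-linux-gnu-`, passing `""` returns an empty prefix, passing `"auto"` or nothing returns the default, and passing `"my-toolchain-"` returns that custom string unchanged.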
SiCong Li | fe1b1f6 | 2022-05-19 18:58:31 +0100 | [diff] [blame] | 93 | |
Adnan AlSinan | 2921e5b | 2022-05-16 14:30:41 +0100 | [diff] [blame] | 94 | v22.05 Public major release |
| 95 | - Various bug fixes. |
| 96 | - Various optimizations. |
| 97 | - Add support for NDK r23b. |
| 98 | - Inclusive language adjustment. Please refer to @ref S5_0_inc_lang for details. |
| 99 | - New Arm® Neon™ kernels / functions: |
| 100 | - \link cpu::kernels::CpuPool3dKernel CpuPool3dKernel \endlink |
| 101 | - New OpenCL kernels / functions: |
| 102 | - \link opencl::kernels::ClPool3dKernel ClPool3dKernel \endlink |
| 103 | - Improve the start-up times for the following OpenCL kernels: |
| 104 | - \link opencl::kernels::ClWinogradInputTransformKernel ClWinogradInputTransformKernel \endlink |
| 105 | - \link opencl::kernels::ClWinogradOutputTransformKernel ClWinogradOutputTransformKernel \endlink |
| 106 | - \link opencl::kernels::ClWinogradFilterTransformKernel ClWinogradFilterTransformKernel \endlink |
| 107 | - \link opencl::kernels::ClHeightConcatenateKernel ClHeightConcatenateKernel \endlink |
| 108 | - Decouple the implementation of the following Cpu kernels into various data types (fp32, fp16, int): |
| 109 | - \link cpu::kernels::CpuDirectConv2dKernel CpuDirectConv2dKernel \endlink |
| 110 | - \link cpu::kernels::CpuDepthwiseConv2dNativeKernel CpuDepthwiseConv2dNativeKernel \endlink |
| 111 | - \link cpu::kernels::CpuGemmMatrixAdditionKernel CpuGemmMatrixAdditionKernel \endlink |
| 112 | - \link cpu::kernels::CpuGemmMatrixMultiplyKernel CpuGemmMatrixMultiplyKernel \endlink |
| 113 | - @ref NEFuseBatchNormalizationKernel |
| 114 | - @ref NEL2NormalizeLayerKernel |
| 115 | |
Adnan AlSinan | 69854ba | 2022-02-07 15:28:56 +0000 | [diff] [blame] | 116 | v22.02 Public major release |
| 117 | - Various bug fixes. |
| 118 | - Various optimizations. |
| 119 | - Update A510 arm_gemm cpu Kernels. |
| 120 | - Inclusive language adjustment. Please refer to @ref S5_0_inc_lang for details. |
| 121 | - Improve the start-up time for the following OpenCL kernels: |
| 122 | - @ref CLScale |
| 123 | - @ref CLGEMM |
| 124 | - @ref CLDepthwiseConvolutionLayer |
| 125 | - \link opencl::kernels::ClIm2ColKernel ClIm2ColKernel \endlink |
| 126 | - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink |
| 127 | - Remove functions: |
| 128 | - CLRemap |
| 129 | - NERemap |
| 130 | - Remove padding from OpenCL kernels: |
| 131 | - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink |
| 132 | - Remove padding from Cpu kernels: |
| 133 | - \link cpu::kernels::CpuDirectConv2dKernel CpuDirectConv2dKernel \endlink |
| 134 | - Decouple the implementation of the following Cpu kernels into various data types (fp32, fp16, int): |
| 135 | - \link cpu::kernels::CpuActivationKernel CpuActivationKernel \endlink |
| 136 | - \link cpu::kernels::CpuAddKernel CpuAddKernel \endlink |
| 137 | - \link cpu::kernels::CpuElementwiseKernel CpuElementwiseKernel \endlink |
| 138 | - \link cpu::CpuSoftmaxGeneric CpuSoftmaxKernel \endlink |
| 139 | - @ref NEBoundingBoxTransformKernel |
| 140 | - @ref NECropKernel |
| 141 | - @ref NEComputeAllAnchorsKernel |
| 142 | - @ref NEInstanceNormalizationLayerKernel |
Adnan AlSinan | bb8b235 | 2022-02-14 14:30:38 +0000 | [diff] [blame] | 143 | - NEMaxUnpoolingLayerKernel |
Adnan AlSinan | 69854ba | 2022-02-07 15:28:56 +0000 | [diff] [blame] | 144 | - @ref NEMeanStdDevNormalizationKernel |
| 145 | - @ref NERangeKernel |
| 146 | - @ref NEROIAlignLayerKernel |
| 147 | - @ref NESelectKernel |
| 148 | |
Sheri Zhang | 5dda217 | 2021-10-15 19:54:17 +0100 | [diff] [blame] | 149 | v21.11 Public major release |
| 150 | - Various bug fixes. |
Gunes Bayir | 0877370 | 2021-11-05 12:34:34 +0000 | [diff] [blame] | 151 | - Various optimizations: |
| 152 | - Improve performance of bilinear and nearest neighbor Scale on both CPU and GPU for FP32, FP16, Int8, Uint8 data types |
Adnan AlSinan | abc093b | 2022-02-08 16:57:06 +0000 | [diff] [blame] | 153 | - Improve performance of Softmax on GPU for Uint8/Int8 |
Sheri Zhang | 5dda217 | 2021-10-15 19:54:17 +0100 | [diff] [blame] | 154 | - New OpenCL kernels / functions: |
| 155 | - @ref CLConv3D |
| 156 | - New Arm® Neon™ kernels / functions: |
| 157 | - @ref NEConv3D |
Gunes Bayir | 0877370 | 2021-11-05 12:34:34 +0000 | [diff] [blame] | 158 | - Support configurable build by a selected subset of operator list |
| 159 | - Support MobileBert on Neon™ backend |
| 160 | - Improve operator/function logging |
| 161 | - Remove padding from OpenCL kernels: |
| 162 | - ClPool2dKernel |
| 163 | - ClScaleKernel |
| 164 | - ClGemmMatrixMultiplyReshapedKernel |
| 165 | - Remove padding from Cpu kernels: |
| 166 | - CpuPool2dKernel |
| 167 | - Remove Y padding from OpenCL kernels: |
| 168 | - ClGemmMatrixMultiplyKernel |
| 169 | - ClGemmReshapedRHSMatrixKernel |
| 170 | - Remove legacy GeMM kernels in gemm_v1.cl |
Sheri Zhang | 5dda217 | 2021-10-15 19:54:17 +0100 | [diff] [blame] | 171 | |
Freddie Liardet | 77014ff | 2021-08-05 15:50:31 +0100 | [diff] [blame] | 172 | v21.08 Public major release |
| 173 | - Various bug fixes. |
| 174 | - Various optimizations: |
| 175 | - Improve LWS (Local-Workgroup-Size) heuristic in OpenCL for GeMM, Direct Convolution and Winograd Transformations when OpenCL tuner is not used |
| 176 | - Improve QASYMM8/QSYMM8 performance on OpenCL for various Arm® Mali™ GPU architectures |
| 177 | - Add dynamic weights support in Fully connected layer (CPU/GPU) |
| 178 | - Various performance optimizations for floating-point data types (CPU/GPU) |
| 179 | - Add a reduced core library build arm_compute_core_v2 |
| 180 | - Expose Operator API |
| 181 | - Support fat binary build for armv8.2-a via the fat_binary build flag |
| 182 | - Add CPU discovery capabilities |
| 183 | - Add data type f16 support for: |
Adnan AlSinan | 6863fa0 | 2022-02-04 13:04:55 +0000 | [diff] [blame] | 184 | - CLRemapKernel |
Freddie Liardet | 77014ff | 2021-08-05 15:50:31 +0100 | [diff] [blame] | 185 | - Port the following functions to stateless API: |
| 186 | - @ref CLConvolutionLayer |
| 187 | - @ref CLFlattenLayer |
| 188 | - @ref CLFullyConnectedLayer |
| 189 | - @ref CLGEMM |
| 190 | - @ref CLGEMMConvolutionLayer |
| 191 | - @ref CLGEMMLowpMatrixMultiplyCore |
| 192 | - @ref CLWinogradConvolutionLayer |
| 193 | - @ref NEConvolutionLayer |
| 194 | - @ref NEFlattenLayer |
| 195 | - @ref NEFullyConnectedLayer |
| 196 | - @ref NEGEMM |
| 197 | - @ref NEGEMMConv2d |
| 198 | - @ref NEGEMMConvolutionLayer |
| 199 | - @ref NEGEMMLowpMatrixMultiplyCore |
| 200 | - @ref NEWinogradConvolutionLayer |
| 201 | - Remove the following functions: |
| 202 | - CLWinogradInputTransform |
| 203 | - Remove CLCoreRuntimeContext |
| 204 | - Remove ICPPSimpleKernel |
| 205 | - Rename file arm_compute/runtime/CL/functions/CLElementWiseUnaryLayer.h to arm_compute/runtime/CL/functions/CLElementwiseUnaryLayer.h |
| 206 | |
Michalis Spyrou | 27e67f0 | 2021-02-16 11:34:39 +0000 | [diff] [blame] | 207 | v21.05 Public major release |
Sheri Zhang | c2bed95 | 2021-05-06 12:12:38 +0100 | [diff] [blame] | 208 | - Various bug fixes. |
| 209 | - Various optimisations. |
| 210 | - Various documentation updates: |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 211 | - Add supported operators and corresponding Android NNAPI operators. |
| 212 | - Reorganize the documentation into a user guide and a contributor guide. |
Sheri Zhang | c2bed95 | 2021-05-06 12:12:38 +0100 | [diff] [blame] | 213 | - Add support for a global allocator for OpenCL tensors |
| 214 | - Add experimental support for [CLVK](https://github.com/kpet/clvk). |
| 215 | - Add data type S32 support for: |
| 216 | - @ref opencl::kernels::ClArithmeticKernel |
| 217 | - Add data type QASYMM8 support for: |
| 218 | - @ref CLROIPoolingLayer |
| 219 | - @ref CLROIPoolingLayerKernel |
| 220 | - @ref NEROIPoolingLayer |
| 221 | - @ref NEROIPoolingLayerKernel |
| 222 | - Add per-channel quantization support for: |
| 223 | - @ref CLDeconvolutionLayer |
| 224 | - @ref CLDirectDeconvolutionLayer |
| 225 | - @ref NEConvolutionLayer |
| 226 | - @ref NEDeconvolutionLayer |
| 227 | - Remove padding from OpenCL kernels: |
| 228 | - @ref CLL2NormalizeLayerKernel |
Gian Marco Iodice | 8155c02 | 2021-04-16 15:08:59 +0100 | [diff] [blame] | 229 | - CLDepthwiseConvolutionLayer3x3NHWCKernel |
Sheri Zhang | c2bed95 | 2021-05-06 12:12:38 +0100 | [diff] [blame] | 230 | - @ref CLNormalizationLayerKernel |
| 231 | - @ref CLNormalizePlanarYUVLayerKernel |
| 232 | - @ref opencl::kernels::ClMulKernel |
| 233 | - @ref CLReductionOperationKernel |
| 234 | - @ref CLROIPoolingLayerKernel |
| 235 | - Remove computer vision support from Arm® Neon™ backend |
| 236 | - Remove the following functions: |
Michalis Spyrou | 27e67f0 | 2021-02-16 11:34:39 +0000 | [diff] [blame] | 237 | - NEAbsoluteDifference |
| 238 | - NEAccumulate |
| 239 | - NEBox3x3 |
| 240 | - NECannyEdge |
| 241 | - NEChannelCombine |
| 242 | - NEChannelExtract |
| 243 | - NEColorConvert |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 244 | - NEConvolution |
Michalis Spyrou | 27e67f0 | 2021-02-16 11:34:39 +0000 | [diff] [blame] | 245 | - NEDerivative |
| 246 | - NEDilate |
| 247 | - NEEqualizeHistogram |
| 248 | - NEErode |
| 249 | - NEFastCorners |
| 250 | - NEGaussian3x3 |
| 251 | - NEGaussian5x5 |
| 252 | - NEGaussianPyramid |
| 253 | - NEHOGDescriptor |
| 254 | - NEHOGDetector |
| 255 | - NEHOGGradient |
| 256 | - NEHOGMultiDetection |
| 257 | - NEHarrisCorners |
| 258 | - NEHistogram |
| 259 | - NEIntegralImage |
| 260 | - NELaplacianPyramid |
| 261 | - NELaplacianReconstruct |
| 262 | - NEMagnitude |
| 263 | - NEMeanStdDev |
| 264 | - NEMedian3x3 |
| 265 | - NEMinMaxLocation |
| 266 | - NENonLinearFilter |
| 267 | - NEOpticalFlow |
| 268 | - NEPhase |
Michalis Spyrou | 27e67f0 | 2021-02-16 11:34:39 +0000 | [diff] [blame] | 269 | - NEScharr3x3 |
| 270 | - NESobel3x3 |
| 271 | - NESobel5x5 |
| 272 | - NESobel7x7 |
| 273 | - NETableLookup |
| 274 | - NEThreshold |
| 275 | - NEWarpAffine |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 276 | - NEWarpPerspectiveKernel |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 277 | - Remove all GLES kernels / functions / tests / examples |
Sheri Zhang | c2bed95 | 2021-05-06 12:12:38 +0100 | [diff] [blame] | 278 | - Remove computer vision support from CL backend |
| 279 | - Remove the following functions: |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 280 | - CLAbsoluteDifference |
| 281 | - CLAccumulate |
| 282 | - CLBox3x3 |
| 283 | - CLCannyEdge |
| 284 | - CLChannelCombine |
| 285 | - CLChannelExtract |
| 286 | - CLColorConvert |
| 287 | - CLConvolution |
| 288 | - CLDerivative |
| 289 | - CLDilate |
| 290 | - CLEqualizeHistogram |
| 291 | - CLErode |
| 292 | - CLFastCorners |
| 293 | - CLGaussian3x3 |
| 294 | - CLGaussian5x5 |
| 295 | - CLGaussianPyramid |
| 296 | - CLHOGDescriptor |
| 297 | - CLHOGDetector |
| 298 | - CLHOGGradient |
| 299 | - CLHOGMultiDetection |
| 300 | - CLHarrisCorners |
| 301 | - CLHistogram |
| 302 | - CLIntegralImage |
| 303 | - CLLaplacianPyramid |
| 304 | - CLLaplacianReconstruct |
| 305 | - CLMagnitude |
| 306 | - CLMeanStdDev |
| 307 | - CLMedian3x3 |
| 308 | - CLMinMaxLocation |
| 309 | - CLNonLinearFilter |
| 310 | - CLOpticalFlow |
| 311 | - CLPhase |
| 312 | - CLScharr3x3 |
| 313 | - CLSobel3x3 |
| 314 | - CLSobel5x5 |
| 315 | - CLSobel7x7 |
| 316 | - CLTableLookup |
| 317 | - CLThreshold |
| 318 | - CLWarpAffine |
| 319 | - CLWarpPerspective |
Ramy Elgammal | 0d274b7 | 2022-08-05 13:14:57 +0100 | [diff] [blame] | 320 | |
Georgios Pinitas | 40f51a6 | 2020-11-21 03:04:18 +0000 | [diff] [blame] | 321 | v21.02 Public major release |
Sheri Zhang | da6a6eb | 2021-01-06 11:15:06 +0000 | [diff] [blame] | 322 | - Various bug fixes. |
| 323 | - Various optimisations. |
Georgios Pinitas | 4551403 | 2020-12-30 00:03:09 +0000 | [diff] [blame] | 324 | - Upgrade C++ standard to C++14 |
| 325 | - Add macOS support |
Giorgio Arena | 1055dc1 | 2021-02-19 09:53:06 +0000 | [diff] [blame] | 326 | - Add Armv8-R AArch64 architecture support |
Sheri Zhang | da6a6eb | 2021-01-06 11:15:06 +0000 | [diff] [blame] | 327 | - Add SVE/SVE2 support for: |
Manuel Bottini | 10b3826 | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 328 | - NEScaleKernel |
Sheri Zhang | da6a6eb | 2021-01-06 11:15:06 +0000 | [diff] [blame] | 329 | - @ref NEActivationLayer |
| 330 | - @ref NEArithmeticAddition |
| 331 | - @ref NEBatchNormalizationLayerKernel |
Giorgio Arena | 1055dc1 | 2021-02-19 09:53:06 +0000 | [diff] [blame] | 332 | - @ref cpu::kernels::CpuLogits1DSoftmaxKernel |
| 333 | - @ref cpu::kernels::CpuLogits1DMaxKernel |
| 334 | - @ref cpu::kernels::CpuElementwiseUnaryKernel |
Sheri Zhang | dda6914 | 2021-02-01 19:06:57 +0000 | [diff] [blame] | 335 | - Remove padding from OpenCL kernels: |
Sheri Zhang | 1efed92 | 2021-03-10 22:43:38 +0000 | [diff] [blame] | 336 | - CLDirectConvolutionLayerKernel |
Sheri Zhang | dda6914 | 2021-02-01 19:06:57 +0000 | [diff] [blame] | 337 | - @ref CLArgMinMaxLayerKernel |
| 338 | - @ref CLPadLayerKernel |
| 339 | - @ref CLROIAlignLayerKernel |
| 340 | - @ref CLRangeKernel |
Manuel Bottini | 3b131ab | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 341 | - CLScaleKernel |
Sheri Zhang | dda6914 | 2021-02-01 19:06:57 +0000 | [diff] [blame] | 342 | - @ref CLSelectKernel |
| 343 | - @ref CLBitwiseKernel |
Giorgio Arena | 1055dc1 | 2021-02-19 09:53:06 +0000 | [diff] [blame] | 344 | - @ref opencl::kernels::ClFloorKernel |
Teresa Charlin | 2788609 | 2021-02-25 20:15:01 +0000 | [diff] [blame] | 345 | - CLTransposeKernel |
Giorgio Arena | 5b50f42 | 2021-02-17 11:43:05 +0000 | [diff] [blame] | 346 | - Deprecate functions in CLTuner: |
| 347 | - add_lws_to_table |
| 348 | - import_lws_table |
| 349 | - lws_table |
Sheri Zhang | da6a6eb | 2021-01-06 11:15:06 +0000 | [diff] [blame] | 350 | - Remove functions: |
Georgios Pinitas | 96b16b6 | 2020-12-01 17:41:34 +0000 | [diff] [blame] | 351 | - NELocallyConnectedLayer / CLLocallyConnectedLayer |
Georgios Pinitas | f7c5a41 | 2020-12-03 14:38:33 +0000 | [diff] [blame] | 352 | - NEIm2Col |
| 353 | - NECol2Im |
| 354 | - NEGEMMInterleave4x4 |
| 355 | - NEGEMMTranspose1xW |
Georgios Pinitas | 8c3c0e7 | 2020-12-03 20:11:53 +0000 | [diff] [blame] | 356 | - NEComputeAllAnchors / CLComputeAllAnchors |
Georgios Pinitas | ec2256b | 2020-12-03 18:51:58 +0000 | [diff] [blame] | 357 | - NEGEMMAssemblyDispatch |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 358 | - NEUpsampleLayer / CLUpsampleLayer |
Sheri Zhang | da6a6eb | 2021-01-06 11:15:06 +0000 | [diff] [blame] | 359 | - Remove kernels: |
Georgios Pinitas | d308df3 | 2020-12-01 16:56:36 +0000 | [diff] [blame] | 360 | - NEGEMMMatrixVectorMultiplyKernel |
Georgios Pinitas | 96b16b6 | 2020-12-01 17:41:34 +0000 | [diff] [blame] | 361 | - NELocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedMatrixMultiplyKernel |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 362 | - NEUpsampleLayerKernel / CLUpsampleLayerKernel |
Gian Marco Iodice | f5aad51 | 2021-02-08 17:34:40 +0000 | [diff] [blame] | 363 | - Extend OpenCL tuner with workgroup batch size support |
| 364 | - Experimental extension for the OpenCL tuner to tune the batches of work groups distributed to compute units |
Gian Marco Iodice | 716b1be | 2021-02-10 17:33:27 +0000 | [diff] [blame] | 365 | - Add functionality to load the OpenCL GEMM heuristics at runtime |
| 366 | - The GEMM heuristic file (MLGO) can be used to update the default GEMM heuristics available for OpenCL |
Giorgio Arena | cd7d178 | 2021-02-22 14:58:37 +0000 | [diff] [blame] | 367 | - Note: there might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 368 | - Note: data-type decoupling is in progress and experimental. Warnings about unused symbols might be raised |
Georgios Pinitas | 40f51a6 | 2020-11-21 03:04:18 +0000 | [diff] [blame] | 369 | |
SiCong Li | 96209c7 | 2020-08-21 12:28:30 +0100 | [diff] [blame] | 370 | v20.11 Public major release |
morgolock | 70b1eb8 | 2020-11-24 13:54:19 +0000 | [diff] [blame] | 371 | - Various bug fixes. |
| 372 | - Various optimisations. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 373 | - Performance regressions can be observed when executing Depthwise Convolution on Arm® Neon™ with a depth multiplier > 1 for quantized data types. |
morgolock | 0e72849 | 2020-11-20 11:03:33 +0000 | [diff] [blame] | 374 | This is planned to be resolved in the 21.02 release. |
morgolock | 70b1eb8 | 2020-11-24 13:54:19 +0000 | [diff] [blame] | 375 | - Added new data type QASYMM8_SIGNED support for @ref NEROIAlignLayer. |
SiCong Li | 903f8cc | 2020-08-27 10:17:10 +0100 | [diff] [blame] | 376 | - Added new data type S32 support for: |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 377 | - NEArithmeticSubtraction |
| 378 | - NEArithmeticSubtractionKernel |
SiCong Li | bb88f89 | 2020-08-28 11:18:47 +0100 | [diff] [blame] | 379 | - @ref NEPixelWiseMultiplication |
Sheri Zhang | 1e3ab42 | 2021-03-16 17:35:08 +0000 | [diff] [blame] | 380 | - NEPixelWiseMultiplicationKernel |
Sang-Hoon Park | 63001ac | 2021-01-18 14:20:27 +0000 | [diff] [blame] | 381 | - NEElementwiseDivision |
| 382 | - NEDivisionOperationKernel |
SiCong Li | 96209c7 | 2020-08-21 12:28:30 +0100 | [diff] [blame] | 383 | - Interface change |
| 384 | - Properly support softmax axis to have the same meaning as other major frameworks. That is, axis now defines the dimension |
| 385 | on which Softmax/Logsoftmax is performed. E.g. for input of shape 4x5x6 and axis=1, softmax will be applied to 4x6=24 vectors of size 5. |
| 386 | The supported value range of axis is [-rank, rank). |
| 387 | This change applies to the following functions: |
| 388 | - @ref NESoftmaxLayer |
| 389 | - @ref NELogSoftmaxLayer |
| 390 | - @ref CLSoftmaxLayer |
| 391 | - @ref CLLogSoftmaxLayer |
Manuel Bottini | ceaa0bf | 2021-02-16 15:15:19 +0000 | [diff] [blame] | 392 | - GCSoftmaxLayer |
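The axis semantics described in the interface change above can be checked with a short sketch. The helpers `wrap_axis` and `num_softmax_vectors` are hypothetical names written for this illustration and are not library functions. The sketch also assumes that a negative axis counts back from the last dimension, as in other major frameworks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps an axis in [-rank, rank) to its non-negative
// equivalent. Assumes negative values count back from the last dimension.
int wrap_axis(int axis, int rank)
{
    assert(rank > 0);
    assert(axis >= -rank && axis < rank);
    return axis < 0 ? axis + rank : axis;
}

// Hypothetical helper: number of independent vectors softmax is applied to,
// i.e. the product of every dimension except the one selected by axis.
std::size_t num_softmax_vectors(const std::vector<std::size_t> &shape, int axis)
{
    const int rank    = static_cast<int>(shape.size());
    const int wrapped = wrap_axis(axis, rank);

    std::size_t count = 1;
    for(int i = 0; i < rank; ++i)
    {
        if(i != wrapped)
        {
            count *= shape[i];
        }
    }
    return count;
}
```

For a shape of 4x5x6 and axis=1, `num_softmax_vectors` returns 24 and each vector has `shape[1]` = 5 elements, which agrees with the example in the entry above. Under the stated assumption, axis=-2 selects the same dimension.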
Sheri Zhang | 824061d | 2020-10-26 15:46:37 +0000 | [diff] [blame] | 393 | - New OpenCL kernels / functions: |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 394 | - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel |
morgolock | 0e72849 | 2020-11-20 11:03:33 +0000 | [diff] [blame] | 395 | - @ref CLLogicalNot |
| 396 | - @ref CLLogicalAnd |
| 397 | - @ref CLLogicalOr |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 398 | - New Arm® Neon™ kernels / functions: |
morgolock | 0e72849 | 2020-11-20 11:03:33 +0000 | [diff] [blame] | 399 | - @ref NELogicalNot |
| 400 | - @ref NELogicalAnd |
| 401 | - @ref NELogicalOr |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 402 | - Removed padding from Arm® Neon™ kernels: |
Sheri Zhang | 1e3ab42 | 2021-03-16 17:35:08 +0000 | [diff] [blame] | 403 | - NEComplexPixelWiseMultiplicationKernel |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 404 | - NENonMaximaSuppression3x3Kernel |
Adnan AlSinan | 6863fa0 | 2022-02-04 13:04:55 +0000 | [diff] [blame] | 405 | - NERemapKernel |
Michele Di Giorgio | 93b75e0 | 2021-06-21 12:00:43 +0100 | [diff] [blame] | 406 | - NEGEMMInterleave4x4Kernel |
Manuel Bottini | 327225d | 2021-04-13 13:09:30 +0100 | [diff] [blame] | 407 | - NEDirectConvolutionLayerKernel |
Manuel Bottini | 10b3826 | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 408 | - NEScaleKernel |
Georgios Pinitas | 96b16b6 | 2020-12-01 17:41:34 +0000 | [diff] [blame] | 409 | - NELocallyConnectedMatrixMultiplyKernel |
Manuel Bottini | cfac51c | 2021-06-18 15:47:28 +0100 | [diff] [blame] | 410 | - NEGEMMLowpOffsetContributionKernel |
Michele Di Giorgio | 93b75e0 | 2021-06-21 12:00:43 +0100 | [diff] [blame] | 411 | - NEGEMMTranspose1xWKernel |
Michele Di Giorgio | 1928904 | 2021-02-03 16:05:00 +0000 | [diff] [blame] | 412 | - NEPoolingLayerKernel |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 413 | - NEConvolutionKernel |
Michalis Spyrou | 60c3b0e | 2021-04-08 12:02:58 +0100 | [diff] [blame] | 414 | - NEDepthwiseConvolutionLayerNativeKernel |
Manuel Bottini | cfac51c | 2021-06-18 15:47:28 +0100 | [diff] [blame] | 415 | - NEGEMMLowpMatrixMultiplyKernel |
Michele Di Giorgio | 53832b2 | 2021-06-21 14:45:44 +0100 | [diff] [blame] | 416 | - NEGEMMMatrixMultiplyKernel |
Manuel Bottini | 327225d | 2021-04-13 13:09:30 +0100 | [diff] [blame] | 417 | - NEDirectConvolutionLayerOutputStageKernel |
Sheri Zhang | ed36713 | 2020-10-08 15:46:16 +0100 | [diff] [blame] | 418 | - @ref NEReductionOperationKernel |
Manuel Bottini | cfac51c | 2021-06-18 15:47:28 +0100 | [diff] [blame] | 419 | - NEGEMMLowpMatrixAReductionKernel |
| 420 | - NEGEMMLowpMatrixBReductionKernel |
Sheri Zhang | 824061d | 2020-10-26 15:46:37 +0000 | [diff] [blame] | 421 | - Removed padding from OpenCL kernels: |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 422 | - CLBatchConcatenateLayerKernel |
Michele Di Giorgio | 1e0208a | 2021-01-22 15:42:59 +0000 | [diff] [blame] | 423 | - CLElementwiseOperationKernel |
Sheri Zhang | 824061d | 2020-10-26 15:46:37 +0000 | [diff] [blame] | 424 | - @ref CLBatchNormalizationLayerKernel |
Michele Di Giorgio | e131466 | 2021-02-01 17:09:32 +0000 | [diff] [blame] | 425 | - CLPoolingLayerKernel |
Manuel Bottini | c6f4ec3 | 2021-05-18 18:41:56 +0100 | [diff] [blame] | 426 | - CLWinogradInputTransformKernel |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 427 | - CLGEMMLowpMatrixMultiplyNativeKernel |
| 428 | - CLGEMMLowpMatrixAReductionKernel |
| 429 | - CLGEMMLowpMatrixBReductionKernel |
| 430 | - CLGEMMLowpOffsetContributionOutputStageKernel |
| 431 | - CLGEMMLowpOffsetContributionKernel |
Manuel Bottini | c6f4ec3 | 2021-05-18 18:41:56 +0100 | [diff] [blame] | 432 | - CLWinogradOutputTransformKernel |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 433 | - CLGEMMLowpMatrixMultiplyReshapedKernel |
Sheri Zhang | 824061d | 2020-10-26 15:46:37 +0000 | [diff] [blame] | 434 | - @ref CLFuseBatchNormalizationKernel |
| 435 | - @ref CLDepthwiseConvolutionLayerNativeKernel |
Georgios Pinitas | 11d8415 | 2021-04-28 10:20:18 +0100 | [diff] [blame] | 436 | - CLDepthConvertLayerKernel |
Sheri Zhang | 7e20e29 | 2021-02-02 11:49:34 +0000 | [diff] [blame] | 437 | - CLCopyKernel |
Gian Marco Iodice | 8155c02 | 2021-04-16 15:08:59 +0100 | [diff] [blame] | 438 | - CLDepthwiseConvolutionLayer3x3NHWCKernel |
Georgios Pinitas | f47f718 | 2021-01-15 09:29:50 +0000 | [diff] [blame] | 439 | - CLActivationLayerKernel |
Manuel Bottini | c6f4ec3 | 2021-05-18 18:41:56 +0100 | [diff] [blame] | 440 | - CLWinogradFilterTransformKernel |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 441 | - CLWidthConcatenateLayerKernel |
| 442 | - CLWidthConcatenate4TensorsKernel |
| 443 | - CLWidthConcatenate2TensorsKernel |
Sang-Hoon Park | 201e0fe | 2021-01-27 13:14:56 +0000 | [diff] [blame] | 444 | - CLLogits1DMaxShiftExpSumKernel |
| 445 | - CLLogits1DNormKernel |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 446 | - CLHeightConcatenateLayerKernel |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 447 | - CLGEMMMatrixMultiplyKernel |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 448 | - CLGEMMLowpQuantizeDownInt32ScaleKernel |
| 449 | - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel |
| 450 | - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 451 | - CLDepthConcatenateLayerKernel |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 452 | - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel |
Sheri Zhang | 824061d | 2020-10-26 15:46:37 +0000 | [diff] [blame] | 453 | - Removed OpenCL kernels / functions: |
| 454 | - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel |
| 455 | - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel |
| 456 | - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel |
morgolock | 00c7601 | 2020-11-06 10:40:12 +0000 | [diff] [blame] | 457 | - Deprecated OpenCL kernels / functions (if a kernel is used only by a function that is being deprecated, the kernel is deprecated together with it): |
Georgios Pinitas | 2d22139 | 2020-09-03 15:16:37 +0100 | [diff] [blame] | 458 | - CLLocallyConnectedLayer |
| 459 | - CLLocallyConnectedMatrixMultiplyKernel |
morgolock | 00c7601 | 2020-11-06 10:40:12 +0000 | [diff] [blame] | 460 | - CLAbsoluteDifference |
| 461 | - CLAbsoluteDifferenceKernel |
| 462 | - CLAccumulate |
| 463 | - CLAccumulateKernel |
| 464 | - CLAccumulateSquared |
| 465 | - CLAccumulateSquaredKernel |
| 466 | - CLAccumulateWeighted |
| 467 | - CLAccumulateWeightedKernel |
| 468 | - CLAccumulateWeightedFP16Kernel |
| 469 | - CLBox3x3 |
| 470 | - CLBox3x3Kernel |
| 471 | - CLBox3x3FP16Kernel |
| 472 | - CLCannyEdge |
| 473 | - CLChannelCombine |
| 474 | - CLChannelCombineKernel |
| 475 | - CLChannelExtract |
| 476 | - CLChannelExtractKernel |
| 477 | - CLColorConvert |
| 478 | - CLColorConvertKernel |
| 479 | - CLConvolution3x3 |
| 480 | - CLConvolutionRectangle |
| 481 | - CLConvolutionRectangleKernel |
| 482 | - CLConvolutionSquare |
| 483 | - CLConvolutionKernel |
| 484 | - CLDerivative |
| 485 | - CLDerivativeKernel |
| 486 | - CLDilate |
| 487 | - CLDilateKernel |
| 488 | - CLEqualizeHistogram |
| 489 | - CLErode |
| 490 | - CLErodeKernel |
| 491 | - CLFastCorners |
| 492 | - CLFastCornersKernel |
| 493 | - CLGaussian3x3 |
| 494 | - CLGaussian3x3Kernel |
| 495 | - CLGaussian5x5 |
| 496 | - CLGaussian5x5HorKernel |
| 497 | - CLGaussian5x5VertKernel |
| 498 | - CLGaussianPyramid |
| 499 | - CLGaussianPyramidHalf |
| 500 | - CLGaussianPyramidOrb |
| 501 | - CLHarrisCorners |
| 502 | - CLHarrisScoreKernel |
| 503 | - CLHarrisScoreFP16Kernel |
| 504 | - CLHistogram |
| 505 | - CLHistogramKernel |
| 506 | - CLHOGOrientationBinningKernel |
| 507 | - CLHOGBlockNormalizationKernel |
| 508 | - CLHOGDetectorKernel |
| 509 | - CLHOGNonMaximaSuppressionKernel |
| 510 | - CLHOGDescriptor |
| 511 | - CLHOGDetector |
| 512 | - CLHOGGradient |
| 513 | - CLHOGMultiDetection |
| 517 | - CLIntegralImage |
| 518 | - CLIntegralImageKernel |
| 519 | - CLLaplacianReconstruct |
| 520 | - CLLaplacianPyramid |
| 521 | - CLMagnitude |
| 522 | - CLMagnitudePhaseKernel |
| 523 | - CLMedian3x3 |
| 524 | - CLMedian3x3Kernel |
| 525 | - CLMinMaxLocation |
| 526 | - CLMinMaxLocationKernel |
| 527 | - CLNonLinearFilter |
| 528 | - CLNonLinearFilterKernel |
| 529 | - CLNonMaximaSuppression3x3 |
| 530 | - CLNonMaximaSuppression3x3FP16Kernel |
| 531 | - CLNonMaximaSuppression3x3Kernel |
| 532 | - CLOpticalFlow |
| 533 | - CLPhase |
| 534 | - CLRemap |
| 535 | - CLRemapKernel |
| 536 | - CLScharr3x3 |
| 537 | - CLScharr3x3Kernel |
| 538 | - CLSobel3x3 |
| 539 | - CLSobel3x3Kernel |
| 540 | - CLSobel5x5 |
| 541 | - CLSobel5x5HorKernel |
| 542 | - CLSobel5x5VertKernel |
| 543 | - CLSobel7x7 |
| 544 | - CLSobel7x7HorKernel |
| 545 | - CLSobel7x7VertKernel |
| 546 | - CLThreshold |
| 547 | - CLThresholdKernel |
| 548 | - CLWarpAffine |
| 549 | - CLWarpAffineKernel |
| 550 | - CLWarpPerspective |
| 551 | - CLWarpPerspectiveKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 552 | - Deprecated Arm® Neon™ kernels / functions (if a kernel is used only by a function that is being deprecated, the kernel is deprecated with it): |
Georgios Pinitas | 2d22139 | 2020-09-03 15:16:37 +0100 | [diff] [blame] | 553 | - NELocallyConnectedLayer |
| 554 | - NELocallyConnectedMatrixMultiplyKernel |
morgolock | 0c86265 | 2020-11-06 08:59:45 +0000 | [diff] [blame] | 555 | - NEAbsoluteDifference |
| 556 | - NEAbsoluteDifferenceKernel |
| 557 | - NEAccumulate |
| 558 | - NEAccumulateKernel |
| 559 | - NEAccumulateSquared |
| 560 | - NEAccumulateSquaredKernel |
| 561 | - NEAccumulateWeighted |
| 562 | - NEAccumulateWeightedKernel |
| 563 | - NEAccumulateWeightedFP16Kernel |
| 564 | - NEBox3x3 |
| 565 | - NEBox3x3Kernel |
| 566 | - NEBox3x3FP16Kernel |
| 567 | - NECannyEdge |
| 568 | - NEChannelCombine |
| 569 | - NEChannelCombineKernel |
| 570 | - NEChannelExtract |
| 571 | - NEChannelExtractKernel |
| 572 | - NEColorConvert |
| 573 | - NEColorConvertKernel |
| 574 | - NEConvolution3x3 |
| 575 | - NEConvolutionRectangle |
| 576 | - NEConvolutionRectangleKernel |
| 577 | - NEConvolutionSquare |
| 578 | - NEConvolutionKernel |
| 579 | - NEDerivative |
| 580 | - NEDerivativeKernel |
| 581 | - NEDilate |
| 582 | - NEDilateKernel |
| 583 | - NEEqualizeHistogram |
| 584 | - NEErode |
| 585 | - NEErodeKernel |
| 586 | - NEFastCorners |
| 587 | - NEFastCornersKernel |
| 588 | - NEGaussian3x3 |
| 589 | - NEGaussian3x3Kernel |
| 590 | - NEGaussian5x5 |
| 591 | - NEGaussian5x5HorKernel |
| 592 | - NEGaussian5x5VertKernel |
| 593 | - NEGaussianPyramid |
| 594 | - NEGaussianPyramidHalf |
| 595 | - NEGaussianPyramidOrb |
| 596 | - NEHarrisCorners |
| 597 | - NEHarrisScoreKernel |
| 598 | - NEHarrisScoreFP16Kernel |
| 599 | - NEHistogram |
| 600 | - NEHistogramKernel |
| 601 | - NEHOGOrientationBinningKernel |
| 602 | - NEHOGBlockNormalizationKernel |
| 603 | - NEHOGDetectorKernel |
| 604 | - NEHOGNonMaximaSuppressionKernel |
| 605 | - NEHOGDescriptor |
| 606 | - NEHOGDetector |
| 607 | - NEHOGGradient |
| 608 | - NEHOGMultiDetection |
| 612 | - NEIntegralImage |
| 613 | - NEIntegralImageKernel |
| 614 | - NELaplacianReconstruct |
| 615 | - NELaplacianPyramid |
| 616 | - NEMagnitude |
| 617 | - NEMagnitudePhaseKernel |
| 618 | - NEMedian3x3 |
| 619 | - NEMedian3x3Kernel |
| 620 | - NEMinMaxLocation |
| 621 | - NEMinMaxLocationKernel |
| 622 | - NENonLinearFilter |
| 623 | - NENonLinearFilterKernel |
| 624 | - NENonMaximaSuppression3x3 |
| 625 | - NENonMaximaSuppression3x3FP16Kernel |
| 626 | - NENonMaximaSuppression3x3Kernel |
| 627 | - NEOpticalFlow |
| 628 | - NEPhase |
| 629 | - NERemap |
| 630 | - NERemapKernel |
| 631 | - NEScharr3x3 |
| 632 | - NEScharr3x3Kernel |
| 633 | - NESobel3x3 |
| 634 | - NESobel3x3Kernel |
| 635 | - NESobel5x5 |
| 636 | - NESobel5x5HorKernel |
| 637 | - NESobel5x5VertKernel |
| 638 | - NESobel7x7 |
| 639 | - NESobel7x7HorKernel |
| 640 | - NESobel7x7VertKernel |
| 641 | - NEThreshold |
| 642 | - NEThresholdKernel |
| 643 | - NEWarpAffine |
| 644 | - NEWarpAffineKernel |
| 645 | - NEWarpPerspective |
| 646 | - NEWarpPerspectiveKernel |
morgolock | d6ee9ed | 2020-11-19 10:07:14 +0000 | [diff] [blame] | 647 | - Deprecated GLES kernels / functions (if a kernel is used only by a function that is being deprecated, the kernel is deprecated with it): |
| 648 | - GCAbsoluteDifference |
| 649 | - GCActivationLayer |
| 650 | - GCArithmeticAddition |
| 651 | - GCBatchNormalizationLayer |
| 652 | - GCConcatenateLayer |
| 653 | - GCConvolutionLayer |
| 654 | - GCDepthwiseConvolutionLayer |
| 655 | - GCDirectConvolutionLayer |
| 656 | - GCDropoutLayer |
| 657 | - GCFillBorder |
| 658 | - GCFullyConnectedLayer |
| 659 | - GCGEMM |
| 660 | - GCGEMMInterleave4x4 |
| 661 | - GCGEMMTranspose1xW |
| 662 | - GCNormalizationLayer |
| 663 | - GCNormalizePlanarYUVLayer |
| 664 | - GCPixelWiseMultiplication |
| 665 | - GCPoolingLayer |
| 666 | - GCScale |
| 667 | - GCSoftmaxLayer |
| 668 | - GCTensorShift |
| 669 | - GCTranspose |
| 670 | |
SiCong Li | 96209c7 | 2020-08-21 12:28:30 +0100 | [diff] [blame] | 671 | |
Georgios Pinitas | 25ef721 | 2020-06-02 23:00:41 +0100 | [diff] [blame] | 672 | v20.08 Public major release |
| 673 | - Various bug fixes. |
| 674 | - Various optimisations. |
Sheri Zhang | 3ef9b5f | 2020-07-09 16:32:58 +0100 | [diff] [blame] | 675 | - Added new data type QASYMM8_SIGNED support for: |
Sheri Zhang | dd4cfc0 | 2020-07-10 14:15:41 +0100 | [diff] [blame] | 676 | - @ref CLArgMinMaxLayer |
| 677 | - @ref CLArgMinMaxLayerKernel |
| 678 | - Added new data type U8 support for: |
| 679 | - @ref NECropKernel |
Sheri Zhang | 7e20e29 | 2021-02-02 11:49:34 +0000 | [diff] [blame] | 680 | - CLCropKernel |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 681 | - Added align_corner support for nearest neighbor interpolation in: |
Manuel Bottini | 10b3826 | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 682 | - NEScaleKernel |
Manuel Bottini | 3b131ab | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 683 | - CLScaleKernel |
Sheri Zhang | dd4cfc0 | 2020-07-10 14:15:41 +0100 | [diff] [blame] | 684 | - New OpenCL kernels / functions: |
| 685 | - @ref CLMaxUnpoolingLayerKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 686 | - New Arm® Neon™ kernels / functions: |
Dana Zlotnik | 149203b | 2022-01-26 12:38:03 +0200 | [diff] [blame] | 687 | - NEMaxUnpoolingLayerKernel |
Sheri Zhang | 3ef9b5f | 2020-07-09 16:32:58 +0100 | [diff] [blame] | 688 | - New graph example: |
Sheri Zhang | dd4cfc0 | 2020-07-10 14:15:41 +0100 | [diff] [blame] | 689 | - graph_yolov3_output_detector |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 690 | - GEMMTuner improvements: |
| 691 | - Added FP16 support |
| 692 | - Output JSON files for easier integration |
| 693 | - Enabled tuning for export_to_cl_image_rhs option for RHS tensors |
| 694 | - More robust script for running benchmarks |
Sheri Zhang | 3ef9b5f | 2020-07-09 16:32:58 +0100 | [diff] [blame] | 695 | - Removed padding from: |
Sheri Zhang | 1e3ab42 | 2021-03-16 17:35:08 +0000 | [diff] [blame] | 696 | - NEPixelWiseMultiplicationKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 697 | - NEHeightConcatenateLayerKernel |
Michalis Spyrou | 27e67f0 | 2021-02-16 11:34:39 +0000 | [diff] [blame] | 698 | - NEThresholdKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 699 | - NEBatchConcatenateLayerKernel |
Teresa Charlin | d1dc09c | 2021-03-04 15:24:45 +0000 | [diff] [blame] | 700 | - NETransposeKernel |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 701 | - @ref NEBatchNormalizationLayerKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 702 | - NEArithmeticSubtractionKernel |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 703 | - @ref NEBoundingBoxTransformKernel |
Michalis Spyrou | 373b407 | 2021-01-20 16:41:12 +0000 | [diff] [blame] | 704 | - NELogits1DMaxKernel |
| 705 | - NELogits1DSoftmaxKernel |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 706 | - @ref NEROIPoolingLayerKernel |
| 707 | - @ref NEROIAlignLayerKernel |
Georgios Pinitas | 0b1c2db | 2020-12-04 15:51:34 +0000 | [diff] [blame] | 708 | - NEYOLOLayerKernel |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 709 | - NEUpsampleLayerKernel |
Georgios Pinitas | 70eb53b | 2021-01-06 19:42:21 +0000 | [diff] [blame] | 710 | - NEFloorKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 711 | - NEWidthConcatenateLayerKernel |
| 712 | - NEDepthConcatenateLayerKernel |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 713 | - @ref NENormalizationLayerKernel |
| 714 | - @ref NEL2NormalizeLayerKernel |
Georgios Pinitas | c6f9510 | 2021-03-30 10:03:01 +0100 | [diff] [blame] | 715 | - NEFillArrayKernel |
Georgios Pinitas | 11d8415 | 2021-04-28 10:20:18 +0100 | [diff] [blame] | 716 | - NEDepthConvertLayerKernel |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 717 | - @ref NERangeKernel |
| 718 | - @ref NEPriorBoxLayer |
Sheri Zhang | ed36713 | 2020-10-08 15:46:16 +0100 | [diff] [blame] | 719 | - Removed OpenCL kernels / functions: |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 720 | - CLGEMMLowpQuantizeDownInt32ToUint8Scale |
| 721 | - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 722 | - Removed Arm® Neon™ kernels / functions: |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 723 | - NEGEMMLowpQuantizeDownInt32ToUint8Scale |
| 724 | - NEGEMMMatrixAccumulateBiasesKernel |
SiCong Li | d004a7a | 2020-05-28 15:26:41 +0100 | [diff] [blame] | 725 | - Deprecated functions / interfaces: |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 726 | - Non-descriptor based interfaces for NEThreshold, CLThreshold |
Manuel Bottini | ceaa0bf | 2021-02-16 15:15:19 +0000 | [diff] [blame] | 727 | - Non-descriptor based interfaces for @ref NEScale, @ref CLScale and GCScale |
| 728 | - In @ref NESoftmaxLayer, @ref NELogSoftmaxLayer, @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer: |
| 729 | The default "axis" value for @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer has changed from 1 to 0. |
morgolock | 9c7fed8 | 2020-08-05 12:30:56 +0100 | [diff] [blame] | 730 | Only axis 0 is supported. |
| 731 | The default "axis" value for @ref NESoftmaxLayer and @ref NELogSoftmaxLayer has changed from 1 to 0. |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 732 | Only axis 0 is supported. |
Sang-Hoon Park | a0205b9 | 2020-07-07 09:36:09 +0100 | [diff] [blame] | 733 | - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity. |
Manuel Bottini | d844c08 | 2021-07-14 12:58:54 +0100 | [diff] [blame] | 734 | - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLIm2ColKernel (NHWC only) |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 735 | - This change allows @ref CLGEMMConvolutionLayer to be used without extra padding for the input and output. |
| 736 | - Only the weights/bias of @ref CLGEMMConvolutionLayer may still require padding for the computation. |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 737 | - On Arm® Mali™ Midgard GPUs only, @ref CLGEMMConvolutionLayer may still require padding because it calls CLGEMMMatrixMultiplyKernel, which currently requires padding. |
| 738 | - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel. |
Sang-Hoon Park | adfaefb | 2020-08-18 09:13:05 +0100 | [diff] [blame] | 739 | - This support allows the OpenCL buffer used for the reshaped RHS matrix to be exported to the OpenCL image object. |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 740 | - The padding requirement for the OpenCL image object is accounted for in CLGEMMReshapeRHSMatrixKernel. |
| 741 | - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer. |
Georgios Pinitas | 25ef721 | 2020-06-02 23:00:41 +0100 | [diff] [blame] | 742 | |
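The align_corner option added for nearest neighbor interpolation in v20.08 can be illustrated with a minimal, library-independent sketch. It assumes one common convention: with aligned corners the first and last samples of input and output coincide, so the ratio is computed from `size - 1`, and the source position is rounded to the nearest index. The helper names `scale_ratio` and `nearest_source_index` are hypothetical and are not part of the library API; the exact rounding used by NEScaleKernel and CLScaleKernel may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: ratio between input and output size along one dimension.
// With aligned corners the end points of both grids are mapped onto each other.
inline float scale_ratio(int in_size, int out_size, bool align_corners)
{
    assert(in_size > 0 && out_size > 0);
    if(align_corners && out_size > 1)
    {
        return static_cast<float>(in_size - 1) / static_cast<float>(out_size - 1);
    }
    return static_cast<float>(in_size) / static_cast<float>(out_size);
}

// Hypothetical helper: source index read by output element out_idx.
// Assumption: round to nearest when corners are aligned, floor otherwise
// (top-left sampling without a half-pixel offset).
inline int nearest_source_index(int out_idx, int in_size, int out_size, bool align_corners)
{
    const float ratio = scale_ratio(in_size, out_size, align_corners);
    const float pos   = static_cast<float>(out_idx) * ratio;
    const int   idx   = align_corners ? static_cast<int>(std::round(pos))
                                      : static_cast<int>(std::floor(pos));
    // Clamp so that rounding never reads outside the input.
    return std::min(std::max(idx, 0), in_size - 1);
}
```

When scaling 5 elements down to 3, for example, the last output element reads input index 4 with aligned corners and input index 3 without, which is the difference the option controls.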
Georgios Pinitas | fd7780d | 2020-03-17 11:41:00 +0000 | [diff] [blame] | 743 | v20.05 Public major release |
Georgios Pinitas | c7b183a | 2020-03-06 18:12:09 +0000 | [diff] [blame] | 744 | - Various bug fixes. |
| 745 | - Various optimisations. |
Michele Di Giorgio | 36a551f | 2020-04-23 11:55:29 +0100 | [diff] [blame] | 746 | - Updated recommended NDK version to r18b. |
| 747 | - Updated recommended gcc version to Linaro 6.3.1. |
Georgios Pinitas | c7b183a | 2020-03-06 18:12:09 +0000 | [diff] [blame] | 748 | - Added Bfloat16 type support |
| 749 | - Added Bfloat16 support in: |
Manuel Bottini | 29599d0 | 2021-07-06 15:01:35 +0100 | [diff] [blame] | 750 | - NEWeightsReshapeKernel |
| 751 | - NEConvolutionLayerReshapeWeights |
Manuel Bottini | 9002899 | 2021-06-30 18:29:18 +0100 | [diff] [blame] | 752 | - NEIm2ColKernel |
Georgios Pinitas | f7c5a41 | 2020-12-03 14:38:33 +0000 | [diff] [blame] | 753 | - NEIm2Col |
Georgios Pinitas | 11d8415 | 2021-04-28 10:20:18 +0100 | [diff] [blame] | 754 | - NEDepthConvertLayerKernel |
Georgios Pinitas | c7b183a | 2020-03-06 18:12:09 +0000 | [diff] [blame] | 755 | - @ref NEDepthConvertLayer |
| 756 | - @ref NEGEMMConvolutionLayer |
Georgios Pinitas | ec2256b | 2020-12-03 18:51:58 +0000 | [diff] [blame] | 757 | - NEGEMMAssemblyDispatch |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 758 | - Added new data type QASYMM8_SIGNED support for: |
| 759 | - @ref CLDirectConvolutionLayer |
| 760 | - @ref CLDeconvolutionLayer |
| 761 | - @ref CLDirectDeconvolutionLayer |
| 762 | - @ref CLGEMMDeconvolutionLayer |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 763 | - CLGEMMLowpMatrixMultiplyReshapedKernel |
| 764 | - CLGEMMLowpQuantizeDownInt32ScaleKernel |
| 765 | - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 766 | - @ref CLReductionOperation |
| 767 | - @ref CLReduceMean |
Sheri Zhang | 359c48e | 2020-04-30 22:53:39 +0100 | [diff] [blame] | 768 | - @ref NEScale |
Manuel Bottini | 10b3826 | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 769 | - NEScaleKernel |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 770 | - NEUpsampleLayer |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 771 | - @ref NECast |
| 772 | - @ref NEReductionOperation |
| 773 | - @ref NEReduceMean |
| 774 | - @ref NEArgMinMaxLayer |
| 775 | - @ref NEDeconvolutionLayer |
Manuel Bottini | ae58bdf | 2021-06-17 17:18:45 +0100 | [diff] [blame] | 776 | - NEGEMMLowpQuantizeDownInt32ScaleKernel |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 777 | - @ref CPPBoxWithNonMaximaSuppressionLimit |
| 778 | - @ref CPPDetectionPostProcessLayer |
| 779 | - @ref CPPPermuteKernel |
| 780 | - @ref CPPPermute |
| 781 | - @ref CPPTopKVKernel |
| 782 | - @ref CPPTopKV |
Sheri Zhang | 359c48e | 2020-04-30 22:53:39 +0100 | [diff] [blame] | 783 | - @ref CPPUpsample |
| 784 | - @ref CPPUpsampleKernel |
Sheri Zhang | 31b49ca | 2020-04-24 11:15:10 +0100 | [diff] [blame] | 785 | - New OpenCL kernels / functions: |
| 786 | - @ref CLQLSTMLayer |
| 787 | - @ref CLQLSTMLayerNormalizationKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 788 | - New Arm® Neon™ kernels / functions: |
Sheri Zhang | 31b49ca | 2020-04-24 11:15:10 +0100 | [diff] [blame] | 789 | - @ref NEQLSTMLayer |
| 790 | - @ref NEQLSTMLayerNormalizationKernel |
| 791 | - Added HARD_SWISH support in: |
Georgios Pinitas | f47f718 | 2021-01-15 09:29:50 +0000 | [diff] [blame] | 792 | - CLActivationLayerKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 793 | - NEActivationLayerKernel |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 794 | - Deprecated OpenCL kernels / functions: |
| 795 | - CLGEMMLowpQuantizeDownInt32ToUint8Scale |
| 796 | - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 797 | - Deprecated Arm® Neon™ kernels / functions: |
Sheri Zhang | 0f2522b | 2020-03-25 16:38:19 +0000 | [diff] [blame] | 798 | - NEGEMMLowpQuantizeDownInt32ToUint8Scale |
| 799 | - Removed CPP kernels / functions: |
| 800 | - CPPFlipWeightsKernel |
Manuel Bottini | 387259a | 2020-05-21 17:14:36 +0100 | [diff] [blame] | 801 | - Removed PoolingLayerInfo constructors without Data Layout. |
| 802 | - Removed CLDepthwiseConvolutionLayer3x3 |
| 803 | - Removed NEDepthwiseConvolutionLayerOptimized |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 804 | - Added support for Winograd 3x3, 4x4 on Arm® Neon™ FP16: |
Manuel Bottini | 075253a | 2020-05-22 12:57:18 +0100 | [diff] [blame] | 805 | - @ref NEWinogradConvolutionLayer |
Michalis Spyrou | 96f977e | 2021-07-01 12:20:56 +0100 | [diff] [blame] | 806 | - CpuWinogradConv2dTransformInputKernel |
| 807 | - CpuWinogradConv2dTransformOutputKernel |
| 808 | - CpuWinogradConv2dTransformWeightsKernel |
Manuel Bottini | 075253a | 2020-05-22 12:57:18 +0100 | [diff] [blame] | 809 | - Added CLCompileContext |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 810 | - Added Arm® Neon™ GEMM kernel with 2D window support |
Georgios Pinitas | c7b183a | 2020-03-06 18:12:09 +0000 | [diff] [blame] | 811 | |
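As background for the Bfloat16 support added in v20.05: a bfloat16 value keeps the sign, the 8 exponent bits and the top 7 mantissa bits of an IEEE 754 single-precision float. The sketch below converts by plain truncation, which is the simplest scheme; it is an illustration only and does not claim to match the rounding the library uses. The function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: float -> bfloat16 bit pattern by truncation.
inline std::uint16_t float_to_bf16_truncate(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    if(value != value)
    {
        // NaN: return a quiet NaN so that truncation cannot turn it into infinity.
        return 0x7FC0;
    }
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    // Keep the upper 16 bits: sign, exponent and 7 mantissa bits.
    return static_cast<std::uint16_t>(bits >> 16);
}

// Hypothetical helper: bfloat16 bit pattern -> float (exact, no rounding needed).
inline float bf16_to_float(std::uint16_t value)
{
    const std::uint32_t bits = static_cast<std::uint32_t>(value) << 16;
    float result = 0.f;
    std::memcpy(&result, &bits, sizeof(result));
    return result;
}
```

Because the exponent width is unchanged, the dynamic range matches FP32; only precision is lost, as a round trip of 3.14159f (which comes back as 3.140625f) shows.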
Michele Di Giorgio | 740872e | 2020-03-04 15:29:49 +0000 | [diff] [blame] | 812 | v20.02.1 Maintenance release |
| 813 | - Added Android-NN build script. |
| 814 | |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 815 | v20.02 Public major release |
| 816 | - Various bug fixes. |
| 817 | - Various optimisations. |
| 818 | - Added new data type QASYMM8_SIGNED support for: |
| 819 | - @ref CLDepthwiseConvolutionLayer |
Manuel Bottini | 387259a | 2020-05-21 17:14:36 +0100 | [diff] [blame] | 820 | - CLDepthwiseConvolutionLayer3x3 |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 821 | - @ref CLGEMMConvolutionLayer |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 822 | - CLGEMMLowpMatrixMultiplyCore |
| 823 | - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel |
| 824 | - CLGEMMLowpMatrixMultiplyNativeKernel |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 825 | - @ref NEActivationLayer |
Sang-Hoon Park | 63001ac | 2021-01-18 14:20:27 +0000 | [diff] [blame] | 826 | - NEComparisonOperationKernel |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 827 | - @ref NEConvolutionLayer |
| 828 | - @ref NEDepthwiseConvolutionLayer |
Georgios Pinitas | 7d0adc6 | 2020-09-04 15:25:24 +0100 | [diff] [blame] | 829 | - NEDepthwiseConvolutionLayer3x3Kernel |
Manuel Bottini | 327225d | 2021-04-13 13:09:30 +0100 | [diff] [blame] | 830 | - NEDirectConvolutionLayerOutputStageKernel |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 831 | - @ref NEElementwiseComparison |
| 832 | - @ref NEElementwiseMax |
| 833 | - @ref NEElementwiseMin |
| 834 | - @ref NEElementwiseSquaredDiff |
| 835 | - @ref NEFullyConnectedLayer |
Michele Di Giorgio | f22f672 | 2020-07-03 16:29:24 +0100 | [diff] [blame] | 836 | - NEGEMMMatrixVectorMultiplyKernel |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 837 | - @ref NEPixelWiseMultiplication |
| 838 | - @ref NEPoolingLayer |
| 839 | - @ref NEPReluLayer |
| 840 | - Added support for QSYMM8_PER_CHANNEL in: |
Georgios Pinitas | 7d0adc6 | 2020-09-04 15:25:24 +0100 | [diff] [blame] | 841 | - NEDepthwiseConvolutionLayer3x3Kernel |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 842 | - Added support for split sizes in: |
| 843 | - @ref CLSplit |
| 844 | - @ref NESplit |
| 845 | - New OpenCL kernels / functions: |
| 846 | - @ref CLFill |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 847 | - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 848 | - New Arm® Neon™ kernels / functions: |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 849 | - @ref NEFill |
Manuel Bottini | ae58bdf | 2021-06-17 17:18:45 +0100 | [diff] [blame] | 850 | - NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 851 | - Deprecated functions / interfaces: |
Manuel Bottini | 387259a | 2020-05-21 17:14:36 +0100 | [diff] [blame] | 852 | - CLDepthwiseConvolutionLayer3x3 |
| 853 | - NEDepthwiseConvolutionLayerOptimized |
| 854 | - PoolingLayerInfo constructors without Data Layout. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 855 | - Added support for quantization with multiplier greater than 1 on Arm® Neon™ and CL. |
Giuseppe Rossini | f04ddbc | 2020-02-17 17:22:49 +0000 | [diff] [blame] | 856 | - Added support for quantized inputs of type QASYMM8_SIGNED and QASYMM8 to @ref CLQuantizationLayer. |
| 857 | - Added the ability to build bootcode for bare metal. |
| 858 | - Added support for generating synthetic QASYMM8 graphs. |
| 859 | - Added support for F16 datatype in VGG16. |
| 860 | - Removed pre-built binaries for GLES. |
| 861 | |
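The v20.02 note on quantization with a multiplier greater than 1 can be made concrete with a generic sketch. A common way to apply a real-valued requantization multiplier in integer arithmetic is to split it into a 32-bit fixed-point mantissa and a power-of-two exponent; a multiplier above 1 then simply yields a positive exponent (a left shift). The type and function below are hypothetical and illustrate the general technique, not the library's own helper.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type: multiplier ~= (mantissa / 2^31) * 2^exponent.
struct FixedPointMultiplier
{
    std::int32_t mantissa; // Q0.31 value in [2^30, 2^31)
    int          exponent; // > 0 means left shift, < 0 means right shift
};

// Hypothetical helper: decompose a positive real multiplier.
inline FixedPointMultiplier decompose_multiplier(double multiplier)
{
    assert(multiplier > 0.0);
    int          exponent = 0;
    const double fraction = std::frexp(multiplier, &exponent); // fraction in [0.5, 1)
    std::int64_t mantissa = std::llround(fraction * static_cast<double>(1LL << 31));
    if(mantissa == (1LL << 31))
    {
        // Rounding pushed the mantissa to 1.0: renormalise.
        mantissa /= 2;
        ++exponent;
    }
    return { static_cast<std::int32_t>(mantissa), exponent };
}
```

For instance, 1.5 decomposes into a mantissa of 0.75 (1610612736 in Q0.31) with exponent 1, whereas 0.25 gives a mantissa of 0.5 with exponent -1.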
Michele Di Giorgio | d374ff2 | 2020-01-21 10:03:20 +0000 | [diff] [blame] | 862 | v19.11.1 Public maintenance release |
| 863 | - Fix offset calculation in NEReductionOperationKernel. |
| 864 | - Fix data layout in NEScaleKernel for NHWC. |
| 865 | - Retain configuration step data layout to avoid side-effects. |
| 866 | - Perform sqrt in double domain for L2 pooling. |
| 867 | - Fix output shape calculation for Reduce Mean. |
| 868 | - Restrict cases where optimized NEPadLayer runs. |
| 869 | |
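The L2 pooling fix listed for v19.11.1 concerns the precision of the final square root. A minimal sketch of the idea, assuming the root-mean-square definition of L2 pooling over one window, is shown below; the function name is hypothetical and the code is not taken from the library.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: L2 pooling of a single window, assuming
// result = sqrt(sum(x^2) / count). Accumulation and sqrt are done in double,
// and only the final result is narrowed to float.
inline float l2_pool_window(const std::vector<float> &window)
{
    assert(!window.empty());
    double sum_sq = 0.0;
    for(const float v : window)
    {
        sum_sq += static_cast<double>(v) * static_cast<double>(v);
    }
    return static_cast<float>(std::sqrt(sum_sq / static_cast<double>(window.size())));
}
```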
Michele Di Giorgio | a046e16 | 2019-10-08 09:36:26 +0100 | [diff] [blame] | 870 | v19.11 Public major release |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 871 | - Various bug fixes. |
| 872 | - Various optimisations. |
SiCong Li | 1f7f988 | 2019-11-28 14:59:35 +0000 | [diff] [blame] | 873 | - Updated recommended NDK version to r17c. |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 874 | - Deprecated OpenCL kernels / functions: |
Michele Di Giorgio | a046e16 | 2019-10-08 09:36:26 +0100 | [diff] [blame] | 875 | - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel |
| 876 | - CLDepthwiseIm2ColKernel |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 877 | - CLDepthwiseSeparableConvolutionLayer |
Michele Di Giorgio | a046e16 | 2019-10-08 09:36:26 +0100 | [diff] [blame] | 878 | - CLDepthwiseVectorToTensorKernel |
| 879 | - CLDirectConvolutionLayerOutputStageKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 880 | - Deprecated Arm® Neon™ kernels / functions: |
Giorgio Arena | d93e263 | 2019-10-15 11:09:33 +0100 | [diff] [blame] | 881 | - NEDepthwiseWeightsReshapeKernel |
| 882 | - NEDepthwiseIm2ColKernel |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 883 | - NEDepthwiseSeparableConvolutionLayer |
Giorgio Arena | d93e263 | 2019-10-15 11:09:33 +0100 | [diff] [blame] | 884 | - NEDepthwiseVectorToTensorKernel |
Manuel Bottini | 05069f0 | 2019-09-26 17:18:26 +0100 | [diff] [blame] | 885 | - NEDepthwiseConvolutionLayer3x3 |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 886 | - New OpenCL kernels / functions: |
| 887 | - @ref CLInstanceNormalizationLayerKernel / @ref CLInstanceNormalizationLayer |
| 888 | - @ref CLDepthwiseConvolutionLayerNativeKernel to replace the old generic depthwise convolution (see Deprecated OpenCL kernels / functions) |
| 890 | - @ref CLLogSoftmaxLayer |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 891 | - New Arm® Neon™ kernels / functions: |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 892 | - @ref NEBoundingBoxTransformKernel / @ref NEBoundingBoxTransform |
Georgios Pinitas | 8c3c0e7 | 2020-12-03 20:11:53 +0000 | [diff] [blame] | 893 | - @ref NEComputeAllAnchorsKernel / NEComputeAllAnchors |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 894 | - @ref NEDetectionPostProcessLayer |
| 895 | - @ref NEGenerateProposalsLayer |
| 896 | - @ref NEInstanceNormalizationLayerKernel / @ref NEInstanceNormalizationLayer |
| 897 | - @ref NELogSoftmaxLayer |
| 898 | - @ref NEROIAlignLayerKernel / @ref NEROIAlignLayer |
| 899 | - Added QASYMM8 support for: |
| 900 | - @ref CLGenerateProposalsLayer |
| 901 | - @ref CLROIAlignLayer |
| 902 | - @ref CPPBoxWithNonMaximaSuppressionLimit |
| 903 | - Added QASYMM16 support for: |
| 904 | - @ref CLBoundingBoxTransform |
| 905 | - Added FP16 support for: |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 906 | - CLGEMMMatrixMultiplyReshapedKernel |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 907 | - Added new data type QASYMM8_PER_CHANNEL support for: |
Manuel Bottini | 9e73c93 | 2021-03-02 17:40:42 +0000 | [diff] [blame] | 908 | - CLDequantizationLayer |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 909 | - @ref NEDequantizationLayer |
| 910 | - Added new data type QSYMM8_PER_CHANNEL support for: |
| 911 | - @ref CLConvolutionLayer |
| 912 | - @ref NEConvolutionLayer |
| 913 | - @ref CLDepthwiseConvolutionLayer |
| 914 | - @ref NEDepthwiseConvolutionLayer |
| 915 | - Added FP16 mixed-precision support for: |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 916 | - CLGEMMMatrixMultiplyReshapedKernel |
Michele Di Giorgio | e131466 | 2021-02-01 17:09:32 +0000 | [diff] [blame] | 917 | - CLPoolingLayerKernel |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 918 | - Added FP32 and FP16 ELU activation for: |
| 919 | - @ref CLActivationLayer |
| 920 | - @ref NEActivationLayer |
| 921 | - Added asymmetric padding support for: |
| 922 | - @ref CLDirectDeconvolutionLayer |
| 923 | - @ref CLGEMMDeconvolutionLayer |
| 924 | - @ref NEDeconvolutionLayer |
| 925 | - Added SYMMETRIC and REFLECT modes for @ref CLPadLayerKernel / @ref CLPadLayer. |
Georgios Pinitas | 0f7ef8a | 2021-01-10 04:23:52 +0000 | [diff] [blame] | 926 | - Replaced the calls to NECopyKernel and NEMemsetKernel with @ref NEPadLayer in @ref NEGenerateProposalsLayer. |
| 927 | - Replaced the calls to CLCopyKernel and CLMemsetKernel with @ref CLPadLayer in @ref CLGenerateProposalsLayer. |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 928 | - Improved performance for CL Inception V3 - FP16. |
| 929 | - Improved accuracy for CL Inception V3 - FP16 by enabling FP32 accumulator (mixed-precision). |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 930 | - Improved Arm® Neon™ performance by fusing batch normalization with convolution and depthwise convolution layers. |
| 931 | - Improved Arm® Neon™ performance for MobileNet-SSD by improving the output detection performance. |
SiCong Li | ca1f98c | 2019-11-28 11:06:11 +0000 | [diff] [blame] | 932 | - Optimized @ref CLPadLayer. |
| 933 | - Optimized CL generic depthwise convolution layer by introducing @ref CLDepthwiseConvolutionLayerNativeKernel. |
| 934 | - Reduced memory consumption by implementing weights sharing. |
Michele Di Giorgio | a046e16 | 2019-10-08 09:36:26 +0100 | [diff] [blame] | 935 | |
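The SYMMETRIC and REFLECT padding modes added to CLPadLayer in v19.11 differ only in whether the border element itself is repeated. The sketch below shows the index mapping for a 1D signal under the usual definitions (REFLECT mirrors around the edge element without repeating it, SYMMETRIC mirrors including it). The enum and function are local to this example and hypothetical; they are not the library's types.

```cpp
#include <cassert>

// Local enum for this sketch only.
enum class PadMode
{
    Reflect,  // [a b c] padded by 2 -> c b | a b c | b a
    Symmetric // [a b c] padded by 2 -> b a | a b c | c b
};

// Hypothetical helper: maps an index in the padded signal to the index of the
// source element it copies. pad_before is the number of elements added in front.
inline int padded_source_index(int padded_idx, int pad_before, int size, PadMode mode)
{
    // REFLECT skips the edge element, so mirrored indices are shifted by one.
    const int offset = (mode == PadMode::Reflect) ? 1 : 0;
    assert(size > offset);
    assert(pad_before >= 0 && pad_before <= size - offset);

    int idx = padded_idx - pad_before;
    if(idx < 0)
    {
        idx = -idx - 1 + offset; // mirror around the first element
    }
    else if(idx >= size)
    {
        idx = 2 * size - 1 - idx - offset; // mirror around the last element
    }
    assert(idx >= 0 && idx < size); // padding must not exceed what can be mirrored
    return idx;
}
```

With a source of three elements and two elements of padding on each side, padded index 0 maps to source index 2 in REFLECT mode and to source index 1 in SYMMETRIC mode.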
Michele Di Giorgio | d374ff2 | 2020-01-21 10:03:20 +0000 | [diff] [blame] | 936 | v19.08.1 Public maintenance release |
| 937 | - Fix offset calculation in NEReductionOperationKernel. |
| 938 | - Fix data layout in NEScaleKernel for NHWC. |
| 939 | - Retain configuration step data layout to avoid side-effects. |
| 940 | - Perform sqrt in double domain for L2 pooling. |
| 941 | - Fix output shape calculation for Reduce Mean. |
| 942 | - Fix broadcast CLPixelwiseMultiplication with 5D tensors. |
| 943 | |
Georgios Pinitas | 3d13af8 | 2019-06-04 13:04:16 +0100 | [diff] [blame] | 944 | v19.08 Public major release |
| 945 | - Various bug fixes. |
| 946 | - Various optimisations. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 947 | - Deprecated Arm® Neon™ functions: |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 948 | - NEDepthConcatenateLayer |
| 949 | - NEWidthConcatenateLayer |
| 950 | - Deprecated OpenCL kernels / functions: |
| 951 | - CLDepthConcatenateLayer |
| 952 | - CLGEMMInterleave4x4Kernel / CLGEMMInterleave4x4 |
| 953 | - CLGEMMTranspose1xWKernel / CLGEMMTranspose1xW |
| 954 | - CLWidthConcatenateLayer |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 955 | - New Arm® Neon™ kernels / functions: |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 956 | - @ref NEAbsLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 957 | - @ref NECast |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 958 | - @ref NEElementwisePower |
| 959 | - @ref NELogLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 960 | - @ref NELSTMLayerQuantized |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 961 | - @ref NENegLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 962 | - @ref NEPReluLayer |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 963 | - @ref NESinLayer |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 964 | - NEBatchConcatenateLayerKernel |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 965 | - @ref NEDepthToSpaceLayerKernel / @ref NEDepthToSpaceLayer |
Michalis Spyrou | 60c3b0e | 2021-04-08 12:02:58 +0100 | [diff] [blame] | 966 | - NEDepthwiseConvolutionLayerNativeKernel |
Manuel Bottini | ae58bdf | 2021-06-17 17:18:45 +0100 | [diff] [blame] | 967 | - NEGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 968 | - @ref NEMeanStdDevNormalizationKernel / @ref NEMeanStdDevNormalizationLayer |
| 969 | - @ref NESpaceToDepthLayerKernel / @ref NESpaceToDepthLayer |
| 970 | - New OpenCL kernels / functions: |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 971 | - @ref CLAbsLayer |
| 972 | - @ref CLElementwisePower |
| 973 | - @ref CLLogLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 974 | - @ref CLLSTMLayerQuantized |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 975 | - @ref CLNegLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 976 | - @ref CLPReluLayer |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 977 | - @ref CLSinLayer |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 978 | - CLBatchConcatenateLayerKernel |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 979 | - @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 980 | - CLGEMMLowpMatrixMultiplyNativeKernel |
Michele Di Giorgio | ba14c92 | 2020-10-12 13:27:57 +0100 | [diff] [blame] | 981 | - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 982 | - CLGEMMMatrixMultiplyNativeKernel |
Michalis Spyrou | 473cb01 | 2021-02-23 11:48:12 +0000 | [diff] [blame] | 983 | - CLMeanStdDevNormalizationKernel / CLMeanStdDevNormalizationLayer |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 984 | - @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer |
| 985 | - New examples: |
| 986 | - neon_opticalflow |
| 987 | - cl_cache |
| 988 | - neon_permute |
Gian Marco Iodice | c5f48ad | 2019-09-02 09:52:12 +0100 | [diff] [blame] | 989 | - Added support for FP16 in @ref NEDeconvolutionLayer |
| 990 | - Added support for FP16 in @ref CLDeconvolutionLayer |
| 991 | - Added support for REDUCE_MIN and REDUCE_MAX in @ref ReductionOperation |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 992 | - Enable the fusion of batch normalization with convolution and depthwise convolution layer for FP32 in the graph API (OpenCL only) |
| 993 | - Added support for fusing activation function and broadcast addition with the matrix multiplication for FP32 (OpenCL only) |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 994 | - Re-factored the depthwise convolution layer kernel on Arm® Neon™ for generic cases |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 995 | - Added an optimized depthwise convolution layer kernel for 5x5 filters (Neon™ only) |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 996 | - Added support for enabling the OpenCL kernel cache, and added an example showing how to load the prebuilt OpenCL kernels from a binary cache file |
| 997 | - Altered @ref QuantizationInfo interface to support per-channel quantization. |
Manuel Bottini | 387259a | 2020-05-21 17:14:36 +0100 | [diff] [blame] | 998 | - The CLDepthwiseConvolutionLayer3x3 will be included by @ref CLDepthwiseConvolutionLayer to accommodate future optimizations. |
| 999 | - The NEDepthwiseConvolutionLayerOptimized will be included by @ref NEDepthwiseConvolutionLayer to accommodate future optimizations. |
Gian Marco Iodice | cc2f54b | 2019-08-22 10:10:52 +0100 | [diff] [blame] | 1000 | - Removed inner_border_right and inner_border_top parameters from @ref CLDeconvolutionLayer interface |
| 1001 | - Removed inner_border_right and inner_border_top parameters from @ref NEDeconvolutionLayer interface |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1002 | - Optimized the Arm® Neon™ assembly kernel for GEMMLowp. The new implementation fuses the output stage and quantization with the matrix multiplication kernel |
Georgios Pinitas | 3d13af8 | 2019-06-04 13:04:16 +0100 | [diff] [blame] | 1003 | |
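The per-channel quantization entry in v19.08 can be pictured with the following sketch. It does not reproduce the real @ref QuantizationInfo interface: `ChannelQuantInfo` and `dequantize_symmetric` are hypothetical names, and the sketch assumes symmetric 8-bit values with one scale per channel (or a single scale shared by the whole tensor).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration only; not the library's QuantizationInfo.
struct ChannelQuantInfo
{
    // One entry means per-tensor quantization; N entries mean one scale per channel.
    std::vector<float> scales;
};

// Dequantizes a symmetric 8-bit value using the scale that applies to `channel`.
float dequantize_symmetric(const ChannelQuantInfo &info, int8_t value, std::size_t channel)
{
    // Fall back to the single shared scale when only one is stored;
    // otherwise look up the channel's own scale (at() throws if out of range).
    const float scale = (info.scales.size() == 1) ? info.scales[0] : info.scales.at(channel);
    return static_cast<float>(value) * scale;
}
```

With scales {0.5, 0.25}, the value 8 dequantizes to 4 on channel 0 and to 2 on channel 1, whereas a single stored scale applies to every channel.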
Michalis Spyrou | a9c4472 | 2019-04-05 17:18:36 +0100 | [diff] [blame] | 1004 | v19.05 Public major release |
Michalis Spyrou | c6608ac | 2019-05-16 17:40:23 +0100 | [diff] [blame] | 1005 | - Various bug fixes. |
| 1006 | - Various optimisations. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1007 | - New Arm® Neon™ kernels / functions: |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1008 | - @ref NEBatchToSpaceLayerKernel / @ref NEBatchToSpaceLayer |
Sheri Zhang | 1e3ab42 | 2021-03-16 17:35:08 +0000 | [diff] [blame] | 1009 | - NEComplexPixelWiseMultiplicationKernel / @ref NEComplexPixelWiseMultiplication |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1010 | - @ref NECropKernel / @ref NECropResize |
Michalis Spyrou | 60c3b0e | 2021-04-08 12:02:58 +0100 | [diff] [blame] | 1011 | - NEDepthwiseConvolutionAssemblyDispatch |
Michalis Spyrou | ca82e62 | 2019-05-10 16:43:20 +0100 | [diff] [blame] | 1012 | - @ref NEFFTDigitReverseKernel |
| 1013 | - @ref NEFFTRadixStageKernel |
| 1014 | - @ref NEFFTScaleKernel |
Manuel Bottini | cfac51c | 2021-06-18 15:47:28 +0100 | [diff] [blame] | 1015 | - NEGEMMLowpOffsetContributionOutputStageKernel |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 1016 | - NEHeightConcatenateLayerKernel |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1017 | - @ref NESpaceToBatchLayerKernel / @ref NESpaceToBatchLayer |
Michalis Spyrou | d7dd15c | 2019-05-30 14:53:58 +0100 | [diff] [blame] | 1018 | - @ref NEFFT1D |
| 1019 | - @ref NEFFT2D |
| 1020 | - @ref NEFFTConvolutionLayer |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1021 | - New OpenCL kernels / functions: |
Sheri Zhang | f9ab9f9 | 2021-03-16 12:09:15 +0000 | [diff] [blame] | 1022 | - CLComplexPixelWiseMultiplicationKernel / @ref CLComplexPixelWiseMultiplication |
Sheri Zhang | 7e20e29 | 2021-02-02 11:49:34 +0000 | [diff] [blame] | 1023 | - CLCropKernel / @ref CLCropResize |
Michalis Spyrou | d7dd15c | 2019-05-30 14:53:58 +0100 | [diff] [blame] | 1024 | - @ref CLDeconvolutionReshapeOutputKernel |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1025 | - @ref CLFFTDigitReverseKernel |
| 1026 | - @ref CLFFTRadixStageKernel |
| 1027 | - @ref CLFFTScaleKernel |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 1028 | - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 1029 | - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 1030 | - CLHeightConcatenateLayerKernel |
Georgios Pinitas | f790fdb | 2019-04-24 12:41:25 +0100 | [diff] [blame] | 1031 | - @ref CLDirectDeconvolutionLayer |
| 1032 | - @ref CLFFT1D |
| 1033 | - @ref CLFFT2D |
| 1034 | - @ref CLFFTConvolutionLayer |
Michalis Spyrou | ca82e62 | 2019-05-10 16:43:20 +0100 | [diff] [blame] | 1035 | - @ref CLGEMMDeconvolutionLayer |
| 1036 | - New OpenGLES kernels / functions: |
Manuel Bottini | ceaa0bf | 2021-02-16 15:15:19 +0000 | [diff] [blame] | 1037 | - GCConcatenateLayer |
Michalis Spyrou | a9c4472 | 2019-04-05 17:18:36 +0100 | [diff] [blame] | 1038 | - Deprecated functions/interfaces |
Georgios Pinitas | 09f2497 | 2019-05-17 18:14:40 +0100 | [diff] [blame] | 1039 | - GCDepthConcatenateLayer |
| 1040 | - NEWidthConcatenateLayer |
| 1041 | - NEDepthConcatenateLayer |
| 1042 | - CLWidthConcatenateLayer |
| 1043 | - CLDepthConcatenateLayer |
Gian Marco Iodice | 5fc07aa | 2019-05-15 17:08:02 +0100 | [diff] [blame] | 1044 | - CLGEMMInterleave4x4 |
| 1045 | - CLGEMMTranspose1xW |
Michalis Spyrou | c6608ac | 2019-05-16 17:40:23 +0100 | [diff] [blame] | 1046 | - Support different quantization info in CLConcatLayer. |
| 1047 | - Add checks for cases where different input/output quantization info is not supported. |
| 1048 | - Tensors have different quantization information. |
| 1049 | - Add FP16 support checks. |
| 1050 | - Fix output quantization in CLDepthwiseConv3x3 when activation is fused. |
| 1051 | - New graph examples: |
| 1052 | - graph_convolution |
| 1053 | - graph_fully_connected |
| 1054 | - graph_depthwise_convolution |
| 1055 | - Deepspeech v0.4.1 |
| 1056 | - Add support for QASYMM8 in NEArithmeticSubtractionKernel. |
| 1057 | - Add support for QASYMM8 in NEPixelWiseMultiplicationKernel. |
| 1058 | - Add support for QASYMM8 in NEDeconvolution. |
Sheri Zhang | ac6499a | 2021-02-10 15:32:38 +0000 | [diff] [blame] | 1059 | - Add support for DequantizationLayer for Neon/CL. |
Michalis Spyrou | c6608ac | 2019-05-16 17:40:23 +0100 | [diff] [blame] | 1060 | - Add support for dilation in CLDepthwiseConvolution. |
| 1061 | - Fuse offset contribution with the output stage when we use NEGEMMLowpMatrixMultiplyCore. |
| 1062 | - Optimize CLDeconvolution. |
| 1063 | - Add StackLayer to the graph API. |
| 1064 | - Add support for "reflect" padding mode in NEPad. |
| 1065 | - Winograd 7x7 NHWC on OpenCL. |
| 1066 | - Rework CL ML layers to run exclusively on CL. |
| 1067 | - Support different quantization info in PoolingLayer. |
| 1068 | - Implement and test import memory interfaces. |
| 1069 | - Added new tests and removed old ones. |
| 1070 | - Various clang-tidy fixes. |
Michalis Spyrou | a9c4472 | 2019-04-05 17:18:36 +0100 | [diff] [blame] | 1071 | |
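The "reflect" padding mode mentioned for NEPad in v19.05 can be sketched in one dimension as follows. This is an illustration, not the NEPad implementation: `reflect_pad_1d` is a hypothetical helper, and it assumes the common REFLECT semantics in which values are mirrored around the border element without repeating it, so each pad amount must be smaller than the input length.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration only (not an arm_compute function).
// Pads `in` with `before` and `after` mirrored elements; the border element
// itself is not repeated.
std::vector<int> reflect_pad_1d(const std::vector<int> &in, std::size_t before, std::size_t after)
{
    const std::size_t n = in.size();
    if(n == 0 || before >= n || after >= n)
    {
        throw std::invalid_argument("reflect padding must be smaller than the input length");
    }

    std::vector<int> out;
    out.reserve(before + n + after);

    // Left padding: in[before], ..., in[1]
    for(std::size_t i = before; i > 0; --i)
    {
        out.push_back(in[i]);
    }

    // Original data
    out.insert(out.end(), in.begin(), in.end());

    // Right padding: in[n - 2], in[n - 3], ...
    for(std::size_t i = 0; i < after; ++i)
    {
        out.push_back(in[n - 2 - i]);
    }
    return out;
}
```

Padding {1, 2, 3} by two elements on each side yields {3, 2, 1, 2, 3, 2, 1}.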
giuros01 | a69a88b | 2019-01-31 16:29:19 +0000 | [diff] [blame] | 1072 | v19.02 Public major release |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1073 | - Various bug fixes. |
| 1074 | - Various optimisations. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1075 | - New Arm® Neon™ kernels / functions: |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1076 | - @ref NETileKernel / @ref NETile |
| 1077 | - @ref NEFuseBatchNormalizationKernel / @ref NEFuseBatchNormalization |
Sang-Hoon Park | 63001ac | 2021-01-18 14:20:27 +0000 | [diff] [blame] | 1078 | - NEElementwiseOperationKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1079 | - @ref NEElementwiseMax |
| 1080 | - @ref NEElementwiseMin |
| 1081 | - @ref NEElementwiseSquaredDiff |
| 1082 | - @ref NESelectKernel / @ref NESelect |
| 1083 | - @ref NESplit |
| 1084 | - @ref NESlice |
| 1085 | - @ref NEUnstack |
| 1086 | - @ref NEStridedSliceKernel / @ref NEStridedSlice |
Sang-Hoon Park | 7249f15 | 2021-01-22 11:55:03 +0000 | [diff] [blame] | 1087 | - NEElementwiseUnaryKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1088 | - @ref NERsqrtLayer |
| 1089 | - @ref NEExpLayer |
| 1090 | - @ref NEReverseKernel / @ref NEReverse |
| 1091 | - @ref NEArgMinMaxLayer |
| 1092 | - @ref NEStackLayerKernel / @ref NEStackLayer |
| 1093 | - @ref NERangeKernel / @ref NERange |
| 1094 | - @ref NEPadLayer |
Georgios Pinitas | 0f7ef8a | 2021-01-10 04:23:52 +0000 | [diff] [blame] | 1095 | - NEMemsetKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1096 | - @ref NEGatherKernel / @ref NEGather |
| 1097 | - @ref NEElementwiseComparison |
| 1098 | - @ref NEElementwiseComparisonStatic |
Sang-Hoon Park | 63001ac | 2021-01-18 14:20:27 +0000 | [diff] [blame] | 1099 | - NEComparisonOperationKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1100 | - @ref NEElementwiseDivision |
| 1101 | - New OpenCL kernels / functions: |
| 1102 | - @ref CLSelectKernel / @ref CLSelect |
| 1103 | - @ref CLTileKernel / @ref CLTile |
| 1104 | - @ref CLComparisonKernel / @ref CLComparison |
| 1105 | - @ref CLArgMinMaxLayer |
| 1106 | - @ref CLElementwiseMax |
| 1107 | - @ref CLElementwiseMin |
| 1108 | - @ref CLElementwiseSquaredDiff |
| 1109 | - @ref CLStackLayerKernel / @ref CLStackLayer |
| 1110 | - @ref CLReverse / @ref CLReverseKernel |
| 1111 | - @ref CLRsqrtLayer |
| 1112 | - @ref CLExpLayer |
Michele Di Giorgio | c9c8905 | 2021-01-26 10:20:17 +0000 | [diff] [blame] | 1113 | - CLElementWiseUnaryLayerKernel |
Georgios Pinitas | 856f66e | 2021-04-22 21:13:21 +0100 | [diff] [blame] | 1114 | - CLGEMMReshapeLHSMatrixKernel |
| 1115 | - CLGEMMReshapeRHSMatrixKernel |
| 1116 | - CLGEMMMatrixMultiplyReshapedKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1117 | - @ref CLRangeKernel / @ref CLRange |
| 1118 | - @ref CLUnstack |
| 1119 | - @ref CLGatherKernel / @ref CLGather |
Georgios Pinitas | 4a578b9 | 2021-06-25 12:13:49 +0100 | [diff] [blame] | 1120 | - CLGEMMLowpMatrixMultiplyReshapedKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1121 | - New CPP kernels / functions: |
| 1122 | - @ref CPPDetectionOutputLayer |
| 1123 | - @ref CPPTopKV / @ref CPPTopKVKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1124 | - Added new examples: |
| 1125 | - graph_ssd_mobilenet.cpp |
| 1126 | - graph_mobilenet_v2.cpp |
| 1127 | - graph_resnet12.cpp |
| 1128 | - graph_srcnn955.cpp |
| 1129 | - graph_vgg_vdsr.cpp |
| 1130 | - graph_inception_resnet_v1.cpp |
| 1131 | - Add 4D tensors support to |
| 1132 | - @ref NESoftmaxLayer |
| 1133 | - Fused activation in @ref CLWinogradConvolutionLayer |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 1134 | - Extended @ref NEPermute to support more cases |
| 1135 | - Added Neon™/SVE GEMM Hybrid kernels |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1136 | - Added u8 and s8 hybrid assembly kernels |
| 1137 | - Introduced GEMM strategy name in NEGEMMAssemblyWrapper |
| 1138 | - Improved @ref CLTuner |
| 1139 | - Fused the bias addition within @ref CLGEMM |
| 1140 | - Added support for QASYMM8 LOGISTIC activation in @ref NEActivationLayer |
| 1141 | - Added NHWC data layout support to: |
| 1142 | - @ref NEScale for F16 |
| 1143 | - @ref CLNormalizationLayer IN_MAP_2D for FP32/FP16 |
| 1144 | - @ref NEL2NormalizeLayer for FP32/FP16 |
| 1145 | - @ref NENormalizationLayer IN_MAP_2D for FP32/FP16 |
| 1146 | - @ref CLROIAlignLayer |
Manuel Bottini | 5209be5 | 2019-02-13 16:34:56 +0000 | [diff] [blame] | 1147 | - @ref CLGenerateProposalsLayer |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1148 | - Added QASYMM8 support to the following kernels: |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 1149 | - NEArithmeticAdditionKernel |
Isabella Gottardi | 6253897 | 2019-02-12 19:52:44 +0000 | [diff] [blame] | 1150 | - @ref NEScale |
| 1151 | - Added new tests and improved validation and benchmarking suites. |
giuros01 | a69a88b | 2019-01-31 16:29:19 +0000 | [diff] [blame] | 1152 | - Deprecated functions/interfaces |
| 1153 | - Usage of inner_border_right and inner_border_top has been deprecated in @ref CLDeconvolutionLayer and @ref NEDeconvolutionLayer |
| 1154 | |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1155 | v18.11 Public major release |
| 1156 | - Various bug fixes. |
| 1157 | - Various optimisations. |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1158 | - New Arm® Neon™ kernels / functions: |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1159 | - @ref NEChannelShuffleLayer / @ref NEChannelShuffleLayerKernel |
| 1160 | - @ref NEReduceMean |
| 1161 | - @ref NEReorgLayer / @ref NEReorgLayerKernel |
| 1162 | - @ref NEPriorBoxLayer / @ref NEPriorBoxLayerKernel |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 1163 | - NEUpsampleLayer / NEUpsampleLayerKernel |
Georgios Pinitas | 0b1c2db | 2020-12-04 15:51:34 +0000 | [diff] [blame] | 1164 | - NEYOLOLayer / NEYOLOLayerKernel |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1165 | - New OpenCL kernels / functions: |
| 1166 | - @ref CLBatchToSpaceLayer / @ref CLBatchToSpaceLayerKernel |
| 1167 | - @ref CLBoundingBoxTransform / @ref CLBoundingBoxTransformKernel |
Manuel Bottini | 5209be5 | 2019-02-13 16:34:56 +0000 | [diff] [blame] | 1168 | - @ref CLComputeAllAnchorsKernel |
| 1169 | - @ref CLGenerateProposalsLayer |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1170 | - @ref CLNormalizePlanarYUVLayer / @ref CLNormalizePlanarYUVLayerKernel |
| 1171 | - @ref CLReorgLayer / @ref CLReorgLayerKernel |
| 1172 | - @ref CLSpaceToBatchLayer / @ref CLSpaceToBatchLayerKernel |
| 1173 | - @ref CLPadLayer |
| 1174 | - @ref CLReduceMean |
| 1175 | - @ref CLPriorBoxLayer / @ref CLPriorBoxLayerKernel |
| 1176 | - @ref CLROIAlignLayer / @ref CLROIAlignLayerKernel |
| 1177 | - @ref CLSlice |
| 1178 | - @ref CLSplit |
| 1179 | - @ref CLStridedSlice / @ref CLStridedSliceKernel |
Georgios Pinitas | c53266e | 2020-12-09 03:11:53 +0000 | [diff] [blame] | 1180 | - CLUpsampleLayer / CLUpsampleLayerKernel |
Georgios Pinitas | 0b1c2db | 2020-12-04 15:51:34 +0000 | [diff] [blame] | 1181 | - CLYOLOLayer / CLYOLOLayerKernel |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1182 | - New CPP kernels / functions: |
| 1183 | - @ref CPPBoxWithNonMaximaSuppressionLimit / @ref CPPBoxWithNonMaximaSuppressionLimitKernel |
| 1184 | - Added the validate method in: |
| 1185 | - @ref NEDepthConvertLayer |
| 1186 | - @ref NEFloor / @ref CLFloor |
Michele Di Giorgio | 93b75e0 | 2021-06-21 12:00:43 +0100 | [diff] [blame] | 1187 | - NEGEMMMatrixAdditionKernel |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1188 | - @ref NEReshapeLayer / @ref CLReshapeLayer |
| 1189 | - @ref CLScale |
| 1190 | - Added new examples: |
| 1191 | - graph_shufflenet.cpp |
| 1192 | - graph_yolov3.cpp |
| 1193 | - Added documentation on how to add a new function or kernel. |
| 1194 | - Improved doxygen documentation by adding a list of the existing functions. |
| 1195 | - Add 4D tensors support to |
Georgios Pinitas | 09f2497 | 2019-05-17 18:14:40 +0100 | [diff] [blame] | 1196 | - CLWidthConcatenateLayer |
Georgios Pinitas | e2696b1 | 2020-12-03 20:37:43 +0000 | [diff] [blame] | 1197 | - CLFlattenLayer |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1198 | - @ref CLSoftmaxLayer |
Gian Marco Iodice | 8155c02 | 2021-04-16 15:08:59 +0100 | [diff] [blame] | 1199 | - Add dot product support for CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1200 | - Add SVE support |
| 1201 | - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization |
Gian Marco Iodice | 8155c02 | 2021-04-16 15:08:59 +0100 | [diff] [blame] | 1202 | - Fused activation in CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1203 | - Added NHWC data layout support to: |
| 1204 | - @ref CLChannelShuffleLayer |
| 1205 | - @ref CLDeconvolutionLayer |
| 1206 | - @ref CLL2NormalizeLayer |
| 1207 | - Added QASYMM8 support to the following kernels: |
Manuel Bottini | 3b131ab | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 1208 | - CLScaleKernel |
Georgios Pinitas | 7d0adc6 | 2020-09-04 15:25:24 +0100 | [diff] [blame] | 1209 | - NEDepthwiseConvolutionLayer3x3Kernel |
Sheri Zhang | f9ab9f9 | 2021-03-16 12:09:15 +0000 | [diff] [blame] | 1210 | - CLPixelWiseMultiplicationKernel |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1211 | - Added FP16 support to the following kernels: |
Gian Marco Iodice | 8155c02 | 2021-04-16 15:08:59 +0100 | [diff] [blame] | 1212 | - CLDepthwiseConvolutionLayer3x3NHWCKernel |
Georgios Pinitas | 7d0adc6 | 2020-09-04 15:25:24 +0100 | [diff] [blame] | 1213 | - NEDepthwiseConvolutionLayer3x3Kernel |
Isabella Gottardi | 8773d7c | 2018-11-20 09:56:46 +0000 | [diff] [blame] | 1214 | - @ref CLNormalizePlanarYUVLayerKernel |
| 1215 | - @ref CLWinogradConvolutionLayer (5x5 kernel) |
| 1216 | - More tests added to both validation and benchmarking suites. |
| 1217 | |
Anthony Barbier | d51ea0a | 2018-08-07 17:48:03 +0100 | [diff] [blame] | 1218 | v18.08 Public major release |
| 1219 | - Various bug fixes. |
Michele Di Giorgio | 02baf01 | 2018-08-20 18:10:38 +0100 | [diff] [blame] | 1220 | - Various optimisations. |
Anthony Barbier | d51ea0a | 2018-08-07 17:48:03 +0100 | [diff] [blame] | 1221 | - Updated recommended NDK version to r17b. |
Michele Di Giorgio | 02baf01 | 2018-08-20 18:10:38 +0100 | [diff] [blame] | 1222 | - Removed support for QS8/QS16 data types. |
| 1223 | - Added support for grouped convolution in @ref CLConvolutionLayer. |
| 1224 | - Added NHWC data layout support to: |
Georgios Pinitas | 09f2497 | 2019-05-17 18:14:40 +0100 | [diff] [blame] | 1225 | - NEDepthConcatenateLayer / CLDepthConcatenateLayer |
Michele Di Giorgio | 02baf01 | 2018-08-20 18:10:38 +0100 | [diff] [blame] | 1226 | - @ref NEWinogradConvolutionLayer / @ref CLWinogradConvolutionLayer |
| 1227 | - @ref CLDepthwiseConvolutionLayer |
| 1228 | - @ref CLDirectConvolutionLayer |
| 1229 | - @ref CLConvolutionLayer |
| 1230 | - @ref CLScale |
Manuel Bottini | d844c08 | 2021-07-14 12:58:54 +0100 | [diff] [blame] | 1231 | - CLIm2ColKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1232 | - New Arm® Neon™ kernels / functions: |
Michele Di Giorgio | 02baf01 | 2018-08-20 18:10:38 +0100 | [diff] [blame] | 1233 | - @ref NERNNLayer |
| 1234 | - New OpenCL kernels / functions: |
| 1235 | - @ref CLArithmeticDivision |
| 1236 | - Introduced prepare() stage support in the graph API for GLES. |
| 1237 | - Added support for memory reuse when trying to allocate smaller CLTensors. |
| 1238 | - Enabled NHWC execution on graph examples. |
| 1239 | - Added JPEG accessor for validation purposes. |
| 1240 | - Added validate methods to some kernels / functions. |
Anthony Barbier | d51ea0a | 2018-08-07 17:48:03 +0100 | [diff] [blame] | 1241 | |
| 1242 | v18.05 Public major release |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1243 | - Various bug fixes. |
| 1244 | - Various optimisations. |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 1245 | - Major redesign in the interface for the Neon™ kernels implemented in assembly. |
Pablo Tello | eb82fd2 | 2018-02-23 13:43:50 +0000 | [diff] [blame] | 1246 | - Removed arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore / arm_compute::NEHGEMMAArch64FP16Kernel |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 1247 | - Added NEGEMMAssemblyWrapper and AssemblyKernelGlue which are used to execute assembly kernels in Neon™ functions. |
Pablo Tello | eb82fd2 | 2018-02-23 13:43:50 +0000 | [diff] [blame] | 1248 | - Minor changes to the CPUInfo type to make it compatible with the new assembly gemm interface. |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 1249 | - Moved Neon™ assembly kernels to the folder src/core/Neon/kernels/arm_gemm. |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1250 | - Improved doxygen documentation. |
| 1251 | - Improved memory management for layer transitions. |
| 1252 | - Added support for NHWC data layout in tensors. |
| 1253 | - Added NHWC data layout support to: |
| 1254 | - @ref NEGEMMConvolutionLayer |
| 1255 | - @ref NEDirectConvolutionLayer |
| 1256 | - @ref NEPoolingLayer / @ref CLPoolingLayer |
| 1257 | - @ref NEBatchNormalizationLayer / @ref CLBatchNormalizationLayer |
| 1258 | - @ref NEDepthwiseConvolutionLayer |
| 1259 | - @ref NEScale |
Georgios Pinitas | f7c5a41 | 2020-12-03 14:38:33 +0000 | [diff] [blame] | 1260 | - NEIm2Col |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1261 | - Added support for dilated convolutions in @ref NEConvolutionLayer and @ref CLConvolutionLayer. |
| 1262 | - New OpenCL kernels / functions: |
| 1263 | - @ref CLChannelShuffleLayer / @ref CLChannelShuffleLayerKernel |
Teresa Charlin | 91b7f74 | 2021-04-12 13:57:00 +0100 | [diff] [blame] | 1264 | - CLConvertFullyConnectedWeightsKernel / @ref CLConvertFullyConnectedWeights |
Sheri Zhang | 7e20e29 | 2021-02-02 11:49:34 +0000 | [diff] [blame] | 1265 | - @ref CLCopy / CLCopyKernel |
Anthony Barbier | 38e7f1f | 2018-05-21 13:37:47 +0100 | [diff] [blame] | 1266 | - @ref CLLSTMLayer |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1267 | - @ref CLRNNLayer |
Michele Di Giorgio | 7d61ff0 | 2021-01-18 21:15:59 +0000 | [diff] [blame] | 1268 | - CLWidthConcatenateLayer / CLWidthConcatenateLayerKernel |
Manuel Bottini | c6f4ec3 | 2021-05-18 18:41:56 +0100 | [diff] [blame] | 1269 | - CLWinogradFilterTransformKernel / @ref CLWinogradConvolutionLayer |
| 1270 | - CLWinogradInputTransformKernel / CLWinogradInputTransform |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1271 | - New Arm® Neon™ kernels / functions: |
Teresa Charlin | 562bee5 | 2021-04-13 17:44:15 +0100 | [diff] [blame] | 1272 | - NEConvertFullyConnectedWeightsKernel / @ref NEConvertFullyConnectedWeights. |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1273 | - Created the validate method in @ref CLDepthwiseConvolutionLayer. |
| 1274 | - Beta and gamma are no longer mandatory arguments in @ref NEBatchNormalizationLayer and @ref CLBatchNormalizationLayer. |
| 1275 | - Added depth multiplier support in @ref NEDepthwiseConvolutionLayer and @ref CLDepthwiseConvolutionLayer. |
Sheri Zhang | 1e3ab42 | 2021-03-16 17:35:08 +0000 | [diff] [blame] | 1276 | - Added broadcast multiply support in @ref NEPixelWiseMultiplication / NEPixelWiseMultiplicationKernel. |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1277 | - Port mobilenet example to NHWC data layout. |
| 1278 | - Enabled Winograd method in @ref CLConvolutionLayer. |
| 1279 | - Renamed NEWinogradLayer to @ref NEWinogradConvolutionLayer. |
Sheri Zhang | ac6499a | 2021-02-10 15:32:38 +0000 | [diff] [blame] | 1280 | - Updated @ref NEWinogradConvolutionLayer to use highly optimised assembly kernels in src/core/Neon/kernels/arm_gemm. |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1281 | - Added memory manager support in GLES functions. |
| 1282 | - Major refactoring of the graph API. |
| 1283 | - Added GLES backend in the graph API. |
| 1284 | - Added support for the memory manager in the graph API. |
| 1285 | - Enabled Winograd Convolution method in the graph API. |
| 1286 | - Added support for grouped convolutions in the graph API. |
Manuel Bottini | 10b3826 | 2021-02-19 18:16:44 +0000 | [diff] [blame] | 1287 | - Replaced NEDeconvolutionLayerUpsampleKernel with NEScaleKernel in @ref NEDeconvolutionLayer. |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1288 | - Added fast maths flag in @ref CLConvolutionLayer. |
| 1289 | - Added new tests and benchmarks in validation and benchmark frameworks |
Jakub Sujak | ee301b3 | 2021-06-04 09:46:08 +0100 | [diff] [blame] | 1290 | - Merge Activation layer with Convolution Layer (Neon™, CL, GLES) |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1291 | - Added support to OpenCL 2.0 SVM |
| 1292 | - Added support to import memory in OpenCL tensors. |
| 1293 | - Added the prepare() method to perform any one off pre-processing before running the function. |
| 1294 | - Added new examples: |
| 1295 | - graph_inception_v4.cpp |
Anthony Barbier | 38e7f1f | 2018-05-21 13:37:47 +0100 | [diff] [blame] | 1296 | - graph_resnext50.cpp |
Pablo Tello | b5cc95b | 2018-05-15 11:49:33 +0100 | [diff] [blame] | 1297 | - Added memory measurement instrument for CL. |
Pablo Tello | eb82fd2 | 2018-02-23 13:43:50 +0000 | [diff] [blame] | 1298 | |
Anthony Barbier | 577fbdf | 2018-03-01 15:17:54 +0000 | [diff] [blame] | 1299 | v18.03 Public maintenance release |
| 1300 | - Various bug fixes. |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1301 | - Fixed bug in @ref NEActivationLayer |
| 1302 | - Fix in @ref CLTuner when using batches. |
Anthony Barbier | 577fbdf | 2018-03-01 15:17:54 +0000 | [diff] [blame] | 1303 | - Updated recommended NDK version to r16b (and fixed warnings). |
| 1304 | - Fixed bug in validation code. |
| 1305 | - Added Inception v4 graph example. |
Georgios Pinitas | 9fb1159 | 2018-04-26 20:34:58 +0100 | [diff] [blame] | 1306 | - Renamed NEWinogradLayer.cpp to @ref NEWinogradConvolutionLayer |
Anthony Barbier | 577fbdf | 2018-03-01 15:17:54 +0000 | [diff] [blame] | 1307 | |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1308 | v18.02 Public major release |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1309 | - Various Arm® Neon™ / OpenCL / GLES optimisations. |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1310 | - Various bug fixes. |
| 1311 | - Changed default number of threads on big.LITTLE systems. |
| 1312 | - Refactored examples and added: |
| 1313 | - graph_mobilenet_qasymm8 |
| 1314 | - graph_resnet |
| 1315 | - graph_squeezenet_v1_1 |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1316 | - Renamed @ref CLConvolutionLayer into @ref CLGEMMConvolutionLayer and created a new @ref CLConvolutionLayer to select the fastest convolution method. |
| 1317 | - Renamed @ref NEConvolutionLayer into @ref NEGEMMConvolutionLayer and created a new @ref NEConvolutionLayer to select the fastest convolution method. |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1318 | - Added in place support to: |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1319 | - @ref CLActivationLayer |
| 1320 | - @ref CLBatchNormalizationLayer |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1321 | - Added QASYMM8 support to: |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1322 | - @ref CLActivationLayer |
| 1323 | - @ref CLDepthwiseConvolutionLayer |
| 1324 | - @ref NEDepthwiseConvolutionLayer |
| 1325 | - @ref NESoftmaxLayer |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1326 | - Added FP16 support to: |
Manuel Bottini | 387259a | 2020-05-21 17:14:36 +0100 | [diff] [blame] | 1327 | - CLDepthwiseConvolutionLayer3x3 |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1328 | - @ref CLDepthwiseConvolutionLayer |
Michele Di Giorgio | bd2c8e1 | 2021-01-19 15:29:02 +0000 | [diff] [blame] | 1329 | - Added broadcasting support to NEArithmeticAddition / @ref CLArithmeticAddition / @ref CLPixelWiseMultiplication |
Anthony Barbier | 3762e74 | 2018-03-02 11:49:33 +0000 | [diff] [blame] | 1330 | - Added fused batch normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer |
| 1331 | - Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1332 | - New OpenCL kernels / functions: |
Michele Di Giorgio | a046e16 | 2019-10-08 09:36:26 +0100 | [diff] [blame] | 1333 | - CLDirectConvolutionLayerOutputStageKernel |
Michele Di Giorgio | 33f41fa | 2021-03-09 14:09:08 +0000 | [diff] [blame] | 1334 | - New Arm® Neon™ kernels / functions |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1335 | - Added name() method to all kernels. |
| 1336 | - Added support for Winograd 5x5. |
Georgios Pinitas | 0f7ef8a | 2021-01-10 04:23:52 +0000 | [diff] [blame] | 1337 | - NEPermuteKernel / @ref NEPermute |
Michalis Spyrou | 96f977e | 2021-07-01 12:20:56 +0100 | [diff] [blame] | 1338 | - CpuWinogradConv2dTransformInputKernel / NEWinogradLayer |
| 1339 | - CpuWinogradConv2dTransformOutputKernel / NEWinogradLayer |
| 1340 | - CpuWinogradConv2dTransformWeightsKernel / NEWinogradLayer |
Anthony Barbier | e155337 | 2018-07-16 18:53:52 +0100 | [diff] [blame] | 1341 | - Renamed NEWinogradLayerKernel into NEWinogradLayerBatchedGEMMKernel |
Anthony Barbier | 2d0ce77 | 2018-02-21 15:35:36 +0000 | [diff] [blame] | 1342 | - New GLES kernels / functions: |
Manuel Bottini | ceaa0bf | 2021-02-16 15:15:19 +0000 | [diff] [blame] | 1343 | - GCTensorShiftKernel / GCTensorShift |
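
The broadcasting support listed above can be pictured with a small standalone sketch. It does not use the library: `Tensor2D` and `broadcast_add` are hypothetical names introduced only for this illustration, and the rule shown (a dimension of size 1 is repeated to match the other operand) is the usual broadcasting convention, assumed here rather than taken from the library sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D tensor for illustration only (row-major storage).
struct Tensor2D
{
    std::size_t        width;
    std::size_t        height;
    std::vector<float> data; // width * height elements
};

// Adds two tensors element-wise. A dimension of size 1 in either input is
// repeated to match the corresponding dimension of the other input.
Tensor2D broadcast_add(const Tensor2D &a, const Tensor2D &b)
{
    assert(a.width == b.width || a.width == 1 || b.width == 1);
    assert(a.height == b.height || a.height == 1 || b.height == 1);

    Tensor2D out{ std::max(a.width, b.width), std::max(a.height, b.height), {} };
    out.data.resize(out.width * out.height);

    for(std::size_t y = 0; y < out.height; ++y)
    {
        for(std::size_t x = 0; x < out.width; ++x)
        {
            // Use index 0 along any dimension that is being broadcast.
            const std::size_t ax = (a.width == 1) ? 0 : x;
            const std::size_t ay = (a.height == 1) ? 0 : y;
            const std::size_t bx = (b.width == 1) ? 0 : x;
            const std::size_t by = (b.height == 1) ? 0 : y;
            out.data[y * out.width + x] = a.data[ay * a.width + ax] + b.data[by * b.width + bx];
        }
    }
    return out;
}
```

Adding a 3x2 tensor to a 3x1 tensor with this sketch reuses the single row of the second operand for both rows of the first.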

v18.01 Public maintenance release
- Various bug fixes
- Added some of the missing validate() methods
- Added @ref CLDeconvolutionLayerUpsampleKernel / @ref CLDeconvolutionLayer, @ref CLDeconvolutionLayerUpsample
- Added CLPermuteKernel / @ref CLPermute
- Added a method to clear the program cache in the CL kernel library.
- Added GCArithmeticAdditionKernel / GCArithmeticAddition
- Added GCDepthwiseConvolutionLayer3x3Kernel / GCDepthwiseConvolutionLayer3x3
- Added GCNormalizePlanarYUVLayerKernel / GCNormalizePlanarYUVLayer
- Added GCScaleKernel / GCScale
- Added GCWeightsReshapeKernel / GCConvolutionLayer
- Added FP16 support to the following GLES compute kernels:
  - GCCol2ImKernel
  - GCGEMMInterleave4x4Kernel
  - GCGEMMTranspose1xWKernel
  - GCIm2ColKernel
- Refactored Arm® Neon™ Winograd (NEWinogradLayerKernel)
- Added NEDirectConvolutionLayerOutputStageKernel
- Added QASYMM8 support to the following Arm® Neon™ kernels:
  - NEDepthwiseConvolutionLayer3x3Kernel
  - @ref NEFillBorderKernel
  - NEPoolingLayerKernel
- Added new examples:
  - graph_cl_mobilenet_qasymm8.cpp
  - graph_inception_v3.cpp
  - gc_dc.cpp
- More tests added to both validation and benchmarking suites.

v17.12 Public major release
- Most machine learning functions on OpenCL support the new data type QASYMM8
- Introduced logging interface
- Introduced OpenCL timer
- Reworked GEMMLowp interface
- Added new Arm® Neon™ assembly kernels for GEMMLowp, SGEMM and HGEMM
- Added validation method for most machine learning kernels / functions
- Added new graph examples such as googlenet, mobilenet, squeezenet, vgg16 and vgg19
- Added sgemm example for OpenCL
- Added absolute difference example for GLES compute
- Added new tests and benchmarks in validation and benchmark frameworks
- Added new kernels / functions for GLES compute

- New OpenGL ES kernels / functions:
  - GCAbsoluteDifferenceKernel / GCAbsoluteDifference
  - GCActivationLayerKernel / GCActivationLayer
  - GCBatchNormalizationLayerKernel / GCBatchNormalizationLayer
  - GCCol2ImKernel
  - GCDepthConcatenateLayerKernel / GCDepthConcatenateLayer
  - GCDirectConvolutionLayerKernel / GCDirectConvolutionLayer
  - GCDropoutLayerKernel / GCDropoutLayer
  - GCFillBorderKernel / GCFillBorder
  - GCGEMMInterleave4x4Kernel / GCGEMMInterleave4x4
  - GCGEMMMatrixAccumulateBiasesKernel / GCGEMMMatrixAdditionKernel / GCGEMMMatrixMultiplyKernel / GCGEMM
  - GCGEMMTranspose1xWKernel / GCGEMMTranspose1xW
  - GCIm2ColKernel
  - GCNormalizationLayerKernel / GCNormalizationLayer
  - GCPixelWiseMultiplicationKernel / GCPixelWiseMultiplication
  - GCPoolingLayerKernel / GCPoolingLayer
  - GCLogits1DMaxKernel / GCLogits1DShiftExpSumKernel / GCLogits1DNormKernel / GCSoftmaxLayer
  - GCTransposeKernel / GCTranspose

- New Arm® Neon™ kernels / functions:
  - arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore
  - arm_compute::NEHGEMMAArch64FP16Kernel
  - NEDepthwiseConvolutionLayer3x3Kernel / NEDepthwiseIm2ColKernel / NEGEMMMatrixVectorMultiplyKernel / NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
  - NEGEMMLowpOffsetContributionKernel / NEGEMMLowpMatrixAReductionKernel / NEGEMMLowpMatrixBReductionKernel / NEGEMMLowpMatrixMultiplyCore
  - NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
  - NEWinogradLayer / NEWinogradLayerKernel

- New OpenCL kernels / functions:
  - CLGEMMLowpOffsetContributionKernel / CLGEMMLowpMatrixAReductionKernel / CLGEMMLowpMatrixBReductionKernel / CLGEMMLowpMatrixMultiplyCore
  - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint

- New graph nodes for Arm® Neon™ and OpenCL:
  - graph::BranchLayer
  - graph::DepthConvertLayer
  - graph::DepthwiseConvolutionLayer
  - graph::DequantizationLayer
  - graph::FlattenLayer
  - graph::QuantizationLayer
  - graph::ReshapeLayer
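
For readers unfamiliar with the QASYMM8 data type mentioned in the v17.12 notes, the sketch below shows the affine mapping that asymmetric 8-bit quantization is normally understood to use: real = scale * (q - offset), with q stored as an unsigned 8-bit value. `QuantInfo`, `quantize` and `dequantize` are hypothetical names for this example, and the round-to-nearest policy is an assumption; the library's own helpers and rounding behaviour may differ.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical per-tensor quantization parameters (illustration only).
struct QuantInfo
{
    float scale;  // size of one quantization step in real units
    int   offset; // quantized value that represents real 0.0
};

// Maps a real value to an unsigned 8-bit value, saturating to [0, 255].
std::uint8_t quantize(float value, const QuantInfo &info)
{
    const int q = static_cast<int>(std::lround(value / info.scale)) + info.offset;
    return static_cast<std::uint8_t>(std::min(255, std::max(0, q)));
}

// Maps an unsigned 8-bit value back to its real approximation.
float dequantize(std::uint8_t value, const QuantInfo &info)
{
    return info.scale * static_cast<float>(static_cast<int>(value) - info.offset);
}
```

Values that fall outside the representable range saturate at 0 or 255, so a round trip through `quantize` and `dequantize` is only exact for in-range multiples of the scale.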

v17.10 Public maintenance release
- Bug fixes:
  - Check the maximum local workgroup size supported by OpenCL devices
  - Minor documentation updates (Fixed instructions to build the examples)
- Introduced a graph::GraphContext
- Added a few new Graph nodes, support for branches and grouping.
- Automatically enable cl_printf in debug builds
- Fixed bare metal builds for armv7a
- Added AlexNet and cartoon effect examples
- Fixed library builds: libraries are no longer built as supersets of each other. (This means applications using the runtime part of the library now need to link against both libarm_compute_core and libarm_compute.)

v17.09 Public major release
- Experimental Graph support: initial implementation of a simple stream API to easily chain machine learning layers.
- Memory Manager (@ref BlobLifetimeManager, @ref BlobMemoryPool, @ref ILifetimeManager, @ref IMemoryGroup, @ref IMemoryManager, @ref IMemoryPool, @ref IPoolManager, @ref MemoryManagerOnDemand, @ref PoolManager)
- New validation and benchmark frameworks (Boost and Google frameworks replaced by a homemade framework).
- Most machine learning functions support both fixed point 8 and 16 bit (QS8, QS16) for both Arm® Neon™ and OpenCL.
- New Arm® Neon™ kernels / functions:
  - arm_compute::NEGEMMAssemblyBaseKernel, arm_compute::NEGEMMAArch64Kernel
  - NEDequantizationLayerKernel / @ref NEDequantizationLayer
  - NEFloorKernel / @ref NEFloor
  - @ref NEL2NormalizeLayerKernel / @ref NEL2NormalizeLayer
  - NEQuantizationLayerKernel, NEMinMaxLayerKernel / @ref NEQuantizationLayer
  - @ref NEROIPoolingLayerKernel / @ref NEROIPoolingLayer
  - @ref NEReductionOperationKernel / @ref NEReductionOperation
  - NEReshapeLayerKernel / @ref NEReshapeLayer

- New OpenCL kernels / functions:
  - CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel, CLDepthwiseIm2ColKernel, CLDepthwiseVectorToTensorKernel, CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3, @ref CLDepthwiseConvolutionLayer, CLDepthwiseSeparableConvolutionLayer
  - CLDequantizationLayerKernel / CLDequantizationLayer
  - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
  - CLFlattenLayer
  - CLFloorKernel / @ref CLFloor
  - CLGEMMTranspose1xW
  - CLGEMMMatrixVectorMultiplyKernel
  - @ref CLL2NormalizeLayerKernel / @ref CLL2NormalizeLayer
  - CLQuantizationLayerKernel, CLMinMaxLayerKernel / @ref CLQuantizationLayer
  - @ref CLROIPoolingLayerKernel / @ref CLROIPoolingLayer
  - @ref CLReductionOperationKernel / @ref CLReductionOperation
  - CLReshapeLayerKernel / @ref CLReshapeLayer

v17.06 Public major release
- Various bug fixes
- Added support for fixed point 8 bit (QS8) to the various Arm® Neon™ machine learning kernels.
- Added unit tests and benchmarks (AlexNet, LeNet)
- Added support for sub tensors.
- Added infrastructure to provide GPU-specific optimisation for some OpenCL kernels.
- Added @ref OMPScheduler (OpenMP) scheduler for Arm® Neon™
- Added @ref SingleThreadScheduler scheduler for Arm® Neon™ (For bare metal)
- Users can specify their own scheduler by implementing the @ref IScheduler interface.
- New OpenCL kernels / functions:
  - @ref CLBatchNormalizationLayerKernel / @ref CLBatchNormalizationLayer
  - CLDepthConcatenateLayerKernel / CLDepthConcatenateLayer
  - CLHOGOrientationBinningKernel, CLHOGBlockNormalizationKernel, CLHOGDetectorKernel / CLHOGDescriptor, CLHOGDetector, CLHOGGradient, CLHOGMultiDetection
  - CLLocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedLayer
  - CLWeightsReshapeKernel / CLConvolutionLayerReshapeWeights
- New C++ kernels:
  - CPPDetectionWindowNonMaximaSuppressionKernel
- New Arm® Neon™ kernels / functions:
  - @ref NEBatchNormalizationLayerKernel / @ref NEBatchNormalizationLayer
  - NEDepthConcatenateLayerKernel / NEDepthConcatenateLayer
  - NEDirectConvolutionLayerKernel / @ref NEDirectConvolutionLayer
  - NELocallyConnectedMatrixMultiplyKernel / NELocallyConnectedLayer
  - NEWeightsReshapeKernel / NEConvolutionLayerReshapeWeights
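
The v17.06 note about user-supplied schedulers can be illustrated with a deliberately simplified sketch. The interface below is hypothetical: `ISchedulerSketch`, its `schedule` signature and `SerialChunkScheduler` are invented for this example and do not reproduce the methods declared by the library's scheduler interface. It only shows the general pattern of plugging a custom work-splitting policy behind an abstract base class.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical interface for illustration; the library's own scheduler
// interface declares a different set of methods.
class ISchedulerSketch
{
public:
    using Workload = std::function<void(std::size_t begin, std::size_t end)>;

    virtual ~ISchedulerSketch() = default;
    virtual unsigned int num_threads() const = 0;
    // Executes work over the half-open range [0, total).
    virtual void schedule(std::size_t total, const Workload &work) = 0;
};

// Runs every chunk on the calling thread, as a bare-metal target might.
class SerialChunkScheduler final : public ISchedulerSketch
{
public:
    explicit SerialChunkScheduler(std::size_t chunk)
        : _chunk(std::max<std::size_t>(chunk, 1))
    {
    }

    unsigned int num_threads() const override
    {
        return 1;
    }

    void schedule(std::size_t total, const Workload &work) override
    {
        for(std::size_t begin = 0; begin < total; begin += _chunk)
        {
            work(begin, std::min(begin + _chunk, total));
        }
    }

private:
    std::size_t _chunk; // maximum number of elements handed to one call
};
```

A threaded variant would keep the same interface and hand each chunk to a worker instead of calling `work` inline.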

v17.05 Public bug fixes release
- Various bug fixes
- Remaining functions ported to use accurate padding.
- Library does not link against OpenCL anymore (it uses dlopen / dlsym at runtime instead to determine whether or not OpenCL is available).
- Added "free" method to allocator.
- Minimum version of g++ required for armv7 Linux changed from 4.8 to 4.9
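
The runtime OpenCL detection described in the v17.05 notes relies on the POSIX dlopen / dlsym calls. The sketch below shows the general technique only; it is not the library's loader. The helper names are hypothetical, and both the shared object name `libOpenCL.so` and the probed symbol `clGetPlatformIDs` are assumptions chosen for illustration (on older glibc versions the program may also need to be linked with -ldl).

```cpp
#include <dlfcn.h>

// Returns true if the shared library can be loaded and exports the symbol.
bool library_exports(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_LAZY | RTLD_LOCAL);
    if(handle == nullptr)
    {
        return false; // library not present on this system
    }
    const bool found = dlsym(handle, symbol) != nullptr;
    dlclose(handle);
    return found;
}

// Assumed library and symbol names; real systems may use different ones.
bool opencl_seems_available()
{
    return library_exports("libOpenCL.so", "clGetPlatformIDs");
}
```

Because nothing is resolved at link time, the same binary can start on systems without an OpenCL driver and simply take a fallback path when the probe returns false.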

v17.04 Public bug fixes release

The following functions have been ported to use the new accurate padding:
- CLColorConvertKernel
- CLEdgeNonMaxSuppressionKernel
- CLEdgeTraceKernel
- CLGaussianPyramidHorKernel
- CLGaussianPyramidVertKernel
- CLGradientKernel
- NEChannelCombineKernel
- NEFillArrayKernel
- NEGaussianPyramidHorKernel
- NEGaussianPyramidVertKernel
- NEHarrisScoreFP16Kernel
- NEHarrisScoreKernel
- NEHOGDetectorKernel
- NELogits1DMaxKernel
- NELogits1DShiftExpSumKernel
- NELogits1DNormKernel
- NENonMaximaSuppression3x3FP16Kernel
- NENonMaximaSuppression3x3Kernel

v17.03.1 First major public release of the sources
- Renamed the library to arm_compute
- New CPP target introduced for C++ kernels shared between Arm® Neon™ and CL functions.
- Introduced a new padding calculation interface and ported most kernels / functions to use it.
- New OpenCL kernels / functions:
  - CLGEMMLowpMatrixMultiplyKernel / CLGEMMLowp
- New Arm® Neon™ kernels / functions:
  - @ref NENormalizationLayerKernel / @ref NENormalizationLayer
  - NETransposeKernel / @ref NETranspose
  - NELogits1DMaxKernel, NELogits1DShiftExpSumKernel, NELogits1DNormKernel / @ref NESoftmaxLayer
  - NEIm2ColKernel, NECol2ImKernel, NEConvolutionLayerWeightsReshapeKernel / @ref NEConvolutionLayer
  - NEGEMMMatrixAccumulateBiasesKernel / @ref NEFullyConnectedLayer
  - NEGEMMLowpMatrixMultiplyKernel / NEGEMMLowp

v17.03 Sources preview
- New OpenCL kernels / functions:
  - CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
  - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
  - CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
  - CLTransposeKernel / @ref CLTranspose
  - CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow
  - @ref CLNormalizationLayerKernel / @ref CLNormalizationLayer
  - CLLaplacianPyramid, CLLaplacianReconstruct
- New Arm® Neon™ kernels / functions:
  - NEActivationLayerKernel / @ref NEActivationLayer
  - GEMM refactoring + FP16 support (Requires armv8.2 CPU): NEGEMMInterleave4x4Kernel, NEGEMMTranspose1xWKernel, NEGEMMMatrixMultiplyKernel, NEGEMMMatrixAdditionKernel / @ref NEGEMM
  - NEPoolingLayerKernel / @ref NEPoolingLayer

v17.02.1 Sources preview
- New OpenCL kernels / functions:
  - CLLogits1DMaxKernel, CLLogits1DShiftExpSumKernel, CLLogits1DNormKernel / @ref CLSoftmaxLayer
  - CLPoolingLayerKernel / @ref CLPoolingLayer
  - CLIm2ColKernel, CLCol2ImKernel, CLConvolutionLayerWeightsReshapeKernel / CLConvolutionLayer
  - CLRemapKernel / CLRemap
  - CLGaussianPyramidHorKernel, CLGaussianPyramidVertKernel / CLGaussianPyramid, CLGaussianPyramidHalf, CLGaussianPyramidOrb
  - CLMinMaxKernel, CLMinMaxLocationKernel / CLMinMaxLocation
  - CLNonLinearFilterKernel / CLNonLinearFilter
- New Arm® Neon™ FP16 kernels (Requires armv8.2 CPU):
  - NEAccumulateWeightedFP16Kernel
  - NEBox3x3FP16Kernel
  - NENonMaximaSuppression3x3FP16Kernel

v17.02 Sources preview
- New OpenCL kernels / functions:
  - CLActivationLayerKernel / @ref CLActivationLayer
  - CLChannelCombineKernel / CLChannelCombine
  - CLDerivativeKernel / CLChannelExtract
  - CLFastCornersKernel / CLFastCorners
  - CLMeanStdDevKernel / CLMeanStdDev
- New Arm® Neon™ kernels / functions:
  - HOG / SVM: NEHOGOrientationBinningKernel, NEHOGBlockNormalizationKernel, NEHOGDetectorKernel, NEHOGNonMaximaSuppressionKernel / NEHOGDescriptor, NEHOGDetector, NEHOGGradient, NEHOGMultiDetection
  - NENonLinearFilterKernel / NENonLinearFilter
- Introduced a CLScheduler to manage the default context and command queue used by the runtime library and create synchronisation events.
- Switched all the kernels / functions to use tensors instead of images.
- Updated documentation to include instructions to build the library from sources.

v16.12 Binary preview release
- Original release

*/
} // namespace arm_compute