///
/// Copyright (c) 2017-2023 Arm Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
/** @page versions_changelogs Release Versions and Changelog

@tableofcontents

@section S2_1_versions Release versions

All releases are numbered vYY.MM, where YY is the last two digits of the year and MM is the month number.
If there is more than one release in a month, an extra sequential number is appended at the end:

    v17.03 (First release of March 2017)
    v17.03.1 (Second release of March 2017)
    v17.04 (First release of April 2017)

@note We aim to publish one major public release with new features per quarter. All releases in between will contain only bug fixes.
@note Starting from release 22.05, the 'master' branch is no longer used; it has been replaced by 'main'. Please update your clone jobs accordingly.

@section S2_2_changelog Changelog
v23.02.1 Public patch release
 - Allow mismatching data layouts between the source tensor and weights for \link cpu::CpuGemmDirectConv2d CpuGemmDirectConv2d \endlink with fixed format kernels.
 - Fixes for experimental CPU only Bazel and CMake builds.

v23.02 Public major release
 - New features:
   - Rework the experimental dynamic fusion interface by identifying auxiliary and intermediate tensors, and specifying an explicit output operator.
   - Add the following operators to the experimental dynamic fusion API:
     - GpuAdd, GpuCast, GpuClamp, GpuDepthwiseConv2d, GpuMul, GpuOutput, GpuPool2d, GpuReshape, GpuResize, GpuSoftmax, GpuSub.
   - Add SME/SME2 kernels for GeMM, Winograd convolution, Depthwise convolution and Pooling.
   - Add new CPU operator AddMulAdd for float and quantized types.
   - Add new flag @ref ITensorInfo::lock_paddings() to tensors to prevent extending tensor paddings.
   - Add experimental support for CPU only Bazel and CMake builds.
 - Performance optimizations:
   - Optimize CPU base-e exponential functions for FP32.
   - Optimize CPU StridedSlice by copying first dimension elements in bulk where possible.
   - Optimize CPU quantized Subtraction by reusing the quantized Addition kernel.
   - Optimize CPU ReduceMean by removing quantization steps and performing the operation in integer domain.
   - Optimize GPU Scale and Dynamic Fusion GpuResize by removing quantization steps and performing the operation in integer domain.
   - Update the heuristic for CLDepthwiseConvolutionNative kernel.
   - Add new optimized OpenCL kernel to compute indirect convolution:
     - \link opencl::kernels::ClIndirectConv2dKernel ClIndirectConv2dKernel \endlink
   - Add new optimized OpenCL kernel to compute transposed convolution:
     - \link opencl::kernels::ClTransposedConvolutionKernel ClTransposedConvolutionKernel \endlink
 - Update recommended/minimum NDK version to r20b.
 - Various optimizations and bug fixes.

v22.11 Public major release
 - New features:
   - Add new experimental dynamic fusion API.
   - Add CPU batch matrix multiplication with adj_x = false and adj_y = false for FP32.
   - Add CPU MeanStdDevNorm for QASYMM8.
   - Add CPU and GPU GELU activation function for FP32 and FP16.
   - Add CPU swish activation function for FP32 and FP16.
 - Performance optimizations:
   - Optimize CPU bilinear scale for FP32, FP16, QASYMM8, QASYMM8_SIGNED, U8 and S8.
   - Optimize CPU activation functions using LUT-based implementation:
     - Sigmoid function for QASYMM8 and QASYMM8_SIGNED.
     - Hard swish function for QASYMM8_SIGNED.
   - Optimize CPU addition for QASYMM8 and QASYMM8_SIGNED using fixed-point arithmetic.
   - Optimize CPU multiplication, subtraction and activation layers by considering tensors as 1D.
   - Optimize GPU depthwise convolution kernel and heuristic.
   - Optimize GPU Conv2d heuristic.
   - Optimize CPU MeanStdDevNorm for FP16.
   - Optimize CPU tanh activation function for FP16 using rational approximation.
   - Improve GPU GeMMLowp start-up time.
 - Various optimizations and bug fixes.

v22.08 Public major release
 - Various bug fixes.
 - Disable unsafe FP optimizations causing accuracy issues in:
   - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink
   - \link opencl::kernels::ClDirectConv3dKernel ClDirectConv3dKernel \endlink
   - @ref CLDepthwiseConvolutionLayerNativeKernel
 - Add Dynamic Fusion of Elementwise Operators: Div, Floor, Add.
 - Optimize the gemm_reshaped_rhs_nly_nt OpenCL kernel using the arm_matrix_multiply extension available for Arm® Mali™-G715 and Arm® Mali™-G615.
 - Add support for the arm_matrix_multiply extension in the gemmlowp_mm_reshaped_only_rhs_t OpenCL kernel.
 - Expand GPUTarget list with missing Mali™ GPUs product names: G57, G68, G78AE, G610, G510, G310.
 - Extend the direct convolution 2d interface to configure the block size.
 - Update ClConv2D heuristic to use direct convolution.
 - Use official Khronos® OpenCL extensions:
   - Add cl_khr_integer_dot_product extension support.
   - Add support of OpenCL 3.0 non-uniform workgroup.
 - Cpu performance optimizations:
   - Add LUT-based implementation of Hard Swish and Leaky ReLU activation function for aarch64 build.
   - Optimize Add layer by considering the input tensors as 1D array.
   - Add fixed-format BF16, FP16 and FP32 Neon™ GEMM kernels to support variable weights.
   - Add new winograd convolution kernels implementation and update the ACL \link arm_compute::cpu::CpuWinogradConv2d CpuWinogradConv2d\endlink operator.
 - Add experimental support for native builds for Windows® on Arm™.
 - Build flag interpretation change: arch=armv8.6-a now translates to the -march=armv8.6-a CXX flag instead of -march=armv8.2-a + explicit selection of feature extensions.
 - Build flag change: toolchain_prefix, compiler_prefix:
   - Use empty string "" to suppress any prefixes.
   - Use "auto" to use default (auto) prefixes chosen by the build script. This is the default behavior when unspecified.
   - Any other string will be used as custom prefixes to the compiler and the rest of toolchain tools.
   - The default behaviour when prefix is unspecified does not change, but its signifier has been changed from empty string "" to "auto".
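
The three cases for toolchain_prefix and compiler_prefix listed above can be modelled as a small decision function. This is only a sketch of the documented rules, written in C++ for consistency with the rest of the library; the real logic lives in the build script and may differ in detail. `resolve_prefix`, `tool_name` and the `auto_prefix` argument (the prefix the build script would pick by default) are hypothetical names for this example.

```cpp
#include <string>

// Returns the prefix to apply for a given option value.
//   ""      -> no prefix at all
//   "auto"  -> the default prefix chosen by the build script (passed in here as auto_prefix)
//   other   -> the string itself, used as a custom prefix
std::string resolve_prefix(const std::string &option, const std::string &auto_prefix)
{
    if(option.empty())
    {
        return "";
    }
    if(option == "auto")
    {
        return auto_prefix;
    }
    return option;
}

// Builds the full tool name, for example a compiler or archiver executable.
std::string tool_name(const std::string &option, const std::string &auto_prefix, const std::string &tool)
{
    return resolve_prefix(option, auto_prefix) + tool;
}
```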
 - armv7a with Android build will no longer be tested or maintained.

v22.05 Public major release
 - Various bug fixes.
 - Various optimizations.
 - Add support for NDK r23b.
 - Inclusive language adjustment. Please refer to @ref S5_0_inc_lang for details.
 - New Arm® Neon™ kernels / functions:
   - \link cpu::kernels::CpuPool3dKernel CpuPool3dKernel \endlink
 - New OpenCL kernels / functions:
   - \link opencl::kernels::ClPool3dKernel ClPool3dKernel \endlink
 - Improve the start-up times for the following OpenCL kernels:
   - \link opencl::kernels::ClWinogradInputTransformKernel ClWinogradInputTransformKernel \endlink
   - \link opencl::kernels::ClWinogradOutputTransformKernel ClWinogradOutputTransformKernel \endlink
   - \link opencl::kernels::ClWinogradFilterTransformKernel ClWinogradFilterTransformKernel \endlink
   - \link opencl::kernels::ClHeightConcatenateKernel ClHeightConcatenateKernel \endlink
 - Decouple the implementation of the following Cpu kernels into various data types (fp32, fp16, int):
   - \link cpu::kernels::CpuDirectConv2dKernel CpuDirectConv2dKernel \endlink
   - \link cpu::kernels::CpuDepthwiseConv2dNativeKernel CpuDepthwiseConv2dNativeKernel \endlink
   - \link cpu::kernels::CpuGemmMatrixAdditionKernel CpuGemmMatrixAdditionKernel \endlink
   - \link cpu::kernels::CpuGemmMatrixMultiplyKernel CpuGemmMatrixMultiplyKernel \endlink
   - @ref NEFuseBatchNormalizationKernel
   - @ref NEL2NormalizeLayerKernel

v22.02 Public major release
 - Various bug fixes.
 - Various optimizations.
 - Update A510 arm_gemm cpu Kernels.
 - Inclusive language adjustment. Please refer to @ref S5_0_inc_lang for details.
 - Improve the start-up time for the following OpenCL kernels:
   - @ref CLScale
   - @ref CLGEMM
   - @ref CLDepthwiseConvolutionLayer
   - \link opencl::kernels::ClIm2ColKernel ClIm2ColKernel \endlink
   - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink
 - Remove functions:
   - CLRemap
   - NERemap
 - Remove padding from OpenCL kernels:
   - \link opencl::kernels::ClDirectConv2dKernel ClDirectConv2dKernel \endlink
 - Remove padding from Cpu kernels:
   - \link cpu::kernels::CpuDirectConv2dKernel CpuDirectConv2dKernel \endlink
 - Decouple the implementation of the following Cpu kernels into various data types (fp32, fp16, int):
   - \link cpu::kernels::CpuActivationKernel CpuActivationKernel \endlink
   - \link cpu::kernels::CpuAddKernel CpuAddKernel \endlink
   - \link cpu::kernels::CpuElementwiseKernel CpuElementwiseKernel \endlink
   - \link cpu::CpuSoftmaxGeneric CpuSoftmaxKernel \endlink
   - @ref NEBoundingBoxTransformKernel
   - @ref NECropKernel
   - @ref NEComputeAllAnchorsKernel
   - @ref NEInstanceNormalizationLayerKernel
   - NEMaxUnpoolingLayerKernel
   - @ref NEMeanStdDevNormalizationKernel
   - @ref NERangeKernel
   - @ref NEROIAlignLayerKernel
   - @ref NESelectKernel

v21.11 Public major release
 - Various bug fixes.
 - Various optimizations:
   - Improve performance of bilinear and nearest neighbor Scale on both CPU and GPU for FP32, FP16, Int8, Uint8 data types
   - Improve performance of Softmax on GPU for Uint8/Int8
 - New OpenCL kernels / functions:
   - @ref CLConv3D
 - New Arm® Neon™ kernels / functions:
   - @ref NEConv3D
 - Support configurable build by a selected subset of operator list
 - Support MobileBert on Neon™ backend
 - Improve operator/function logging
 - Remove padding from OpenCL kernels:
   - ClPool2dKernel
   - ClScaleKernel
   - ClGemmMatrixMultiplyReshapedKernel
 - Remove padding from Cpu kernels:
   - CpuPool2dKernel
 - Remove Y padding from OpenCL kernels:
   - ClGemmMatrixMultiplyKernel
   - ClGemmReshapedRHSMatrixKernel
 - Remove legacy GeMM kernels in gemm_v1.cl

v21.08 Public major release
 - Various bug fixes.
 - Various optimizations:
   - Improve LWS (Local-Workgroup-Size) heuristic in OpenCL for GeMM, Direct Convolution and Winograd Transformations when OpenCL tuner is not used
   - Improve QASYMM8/QSYMM8 performance on OpenCL for various Arm® Mali™ GPU architectures
 - Add dynamic weights support in Fully connected layer (CPU/GPU)
 - Various performance optimizations for floating-point data types (CPU/GPU)
 - Add a reduced core library build arm_compute_core_v2
 - Expose Operator API
 - Support fat binary build for armv8.2-a via fat_binary build flag
 - Add CPU discovery capabilities
 - Add data type f16 support for:
   - CLRemapKernel
 - Port the following functions to stateless API:
   - @ref CLConvolutionLayer
   - @ref CLFlattenLayer
   - @ref CLFullyConnectedLayer
   - @ref CLGEMM
   - @ref CLGEMMConvolutionLayer
   - @ref CLGEMMLowpMatrixMultiplyCore
   - @ref CLWinogradConvolutionLayer
   - @ref NEConvolutionLayer
   - @ref NEFlattenLayer
   - @ref NEFullyConnectedLayer
   - @ref NEGEMM
   - @ref NEGEMMConv2d
   - @ref NEGEMMConvolutionLayer
   - @ref NEGEMMLowpMatrixMultiplyCore
   - @ref NEWinogradConvolutionLayer
 - Remove the following functions:
   - CLWinogradInputTransform
 - Remove CLCoreRuntimeContext
 - Remove ICPPSimpleKernel
 - Rename file arm_compute/runtime/CL/functions/CLElementWiseUnaryLayer.h to arm_compute/runtime/CL/functions/CLElementwiseUnaryLayer.h

v21.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Various documentation updates:
   - Add supported operators and corresponding Android NNAPI operators.
   - Documentation reorg into user guide and contributor guide.
 - Add support for a global allocator for OpenCL tensors
 - Add experimental support for [CLVK](https://github.com/kpet/clvk).
 - Add data type S32 support for:
   - @ref opencl::kernels::ClArithmeticKernel
 - Add data type QASYMM8 support for:
   - @ref CLROIPoolingLayer
   - @ref CLROIPoolingLayerKernel
   - @ref NEROIPoolingLayer
   - @ref NEROIPoolingLayerKernel
 - Add per-channel quantization support for:
   - @ref CLDeconvolutionLayer
   - @ref CLDirectDeconvolutionLayer
   - @ref NEConvolutionLayer
   - @ref NEDeconvolutionLayer
 - Remove padding from OpenCL kernels:
   - @ref CLL2NormalizeLayerKernel
   - CLDepthwiseConvolutionLayer3x3NHWCKernel
   - @ref CLNormalizationLayerKernel
   - @ref CLNormalizePlanarYUVLayerKernel
   - @ref opencl::kernels::ClMulKernel
   - @ref CLReductionOperationKernel
   - @ref CLROIPoolingLayerKernel
 - Remove computer vision support from Arm® Neon™ backend
   - Remove the following functions:
     - NEAbsoluteDifference
     - NEAccumulate
     - NEBox3x3
     - NECannyEdge
     - NEChannelCombine
     - NEChannelExtract
     - NEColorConvert
     - NEConvolution
     - NEDerivative
     - NEDilate
     - NEEqualizeHistogram
     - NEErode
     - NEFastCorners
     - NEGaussian3x3
     - NEGaussian5x5
     - NEGaussianPyramid
     - NEHOGDescriptor
     - NEHOGDetector
     - NEHOGGradient
     - NEHOGMultiDetection
     - NEHarrisCorners
     - NEHistogram
     - NEIntegralImage
     - NELaplacianPyramid
     - NELaplacianReconstruct
     - NEMagnitude
     - NEMeanStdDev
     - NEMedian3x3
     - NEMinMaxLocation
     - NENonLinearFilter
     - NEOpticalFlow
     - NEPhase
     - NEScharr3x3
     - NESobel3x3
     - NESobel5x5
     - NESobel7x7
     - NETableLookup
     - NEThreshold
     - NEWarpAffine
     - NEWarpPerspectiveKernel
 - Remove all GLES kernels / functions / tests / examples
 - Remove computer vision support from CL backend
   - Remove the following functions:
     - CLAbsoluteDifference
     - CLAccumulate
     - CLBox3x3
     - CLCannyEdge
     - CLChannelCombine
     - CLChannelExtract
     - CLColorConvert
     - CLConvolution
     - CLDerivative
     - CLDilate
     - CLEqualizeHistogram
     - CLErode
     - CLFastCorners
     - CLGaussian3x3
     - CLGaussian5x5
     - CLGaussianPyramid
     - CLHOGDescriptor
     - CLHOGDetector
     - CLHOGGradient
     - CLHOGMultiDetection
     - CLHarrisCorners
     - CLHistogram
     - CLIntegralImage
     - CLLaplacianPyramid
     - CLLaplacianReconstruct
     - CLMagnitude
     - CLMeanStdDev
     - CLMedian3x3
     - CLMinMaxLocation
     - CLNonLinearFilter
     - CLOpticalFlow
     - CLPhase
     - CLScharr3x3
     - CLSobel3x3
     - CLSobel5x5
     - CLSobel7x7
     - CLTableLookup
     - CLThreshold
     - CLWarpAffine
     - CLWarpPerspective

v21.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Upgrade C++ standard to C++14
 - Add macOS support
 - Add Armv8-R AArch64 architecture support
 - Add SVE/SVE2 support for:
   - NEScaleKernel
   - @ref NEActivationLayer
   - @ref NEArithmeticAddition
   - @ref NEBatchNormalizationLayerKernel
   - @ref cpu::kernels::CpuLogits1DSoftmaxKernel
   - @ref cpu::kernels::CpuLogits1DMaxKernel
   - @ref cpu::kernels::CpuElementwiseUnaryKernel
 - Remove padding from OpenCL kernels:
   - CLDirectConvolutionLayerKernel
   - @ref CLArgMinMaxLayerKernel
   - @ref CLPadLayerKernel
   - @ref CLROIAlignLayerKernel
   - @ref CLRangeKernel
   - CLScaleKernel
   - @ref CLSelectKernel
   - @ref CLBitwiseKernel
   - @ref opencl::kernels::ClFloorKernel
   - CLTransposeKernel
 - Deprecate functions in CLTuner:
   - add_lws_to_table
   - import_lws_table
   - lws_table
 - Remove functions:
   - NELocallyConnectedLayer / CLLocallyConnectedLayer
   - NEIm2Col
   - NECol2Im
   - NEGEMMInterleave4x4
   - NEGEMMTranspose1xW
   - NEComputeAllAnchors / CLComputeAllAnchors
   - NEGEMMAssemblyDispatch
   - NEUpsampleLayer / CLUpsampleLayer
 - Remove kernels:
   - NEGEMMMatrixVectorMultiplyKernel
   - NELocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedMatrixMultiplyKernel
   - NEUpsampleLayerKernel / CLUpsampleLayerKernel
 - Extend OpenCL tuner with workgroup batch size support
   - Experimental extension for the OpenCL tuner to tune the batches of work groups distributed to compute units
 - Add functionality to load the OpenCL GEMM heuristics at runtime
   - The GEMM heuristic file (MLGO) can be used to update the default GEMM heuristics available for OpenCL
 - Note: there might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation
 - Note: data-type decoupling is in progress and experimental. Warnings about unused symbols might be raised

v20.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Performance regressions can be noted when executing Depthwise Convolution on Arm® Neon™ with a depth multiplier > 1 for quantized data types.
   This is planned to be resolved in the 21.02 release.
 - Added new data type QASYMM8_SIGNED support for @ref NEROIAlignLayer.
 - Added new data type S32 support for:
   - NEArithmeticSubtraction
   - NEArithmeticSubtractionKernel
   - @ref NEPixelWiseMultiplication
   - NEPixelWiseMultiplicationKernel
   - NEElementwiseDivision
   - NEDivisionOperationKernel
 - Interface change
   - Properly support softmax axis to have the same meaning as other major frameworks. That is, axis now defines the dimension
     on which Softmax/Logsoftmax is performed. E.g. for input of shape 4x5x6 and axis=1, softmax will be applied to 4x6=24 vectors of size 5.
     The supported value range of axis is [-rank, rank).
     This change applies to the following functions:
     - @ref NESoftmaxLayer
     - @ref NELogSoftmaxLayer
     - @ref CLSoftmaxLayer
     - @ref CLLogSoftmaxLayer
     - GCSoftmaxLayer
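
The axis semantics described above can be sketched with a plain reference implementation. This is not the library's kernel; it is a minimal illustration on a flat `std::vector<float>`, and it assumes that dimension 0 of the shape varies fastest in memory. `wrap_axis` and `softmax_along_axis` are hypothetical names for this example. The function returns the number of vectors it normalised, so a 4x5x6 input with axis=1 gives 24 vectors of size 5, and axis=-2 selects the same dimension.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Maps an axis in [-rank, rank) to the equivalent index in [0, rank).
std::size_t wrap_axis(int axis, std::size_t rank)
{
    const int r = static_cast<int>(rank);
    if(axis < -r || axis >= r)
    {
        throw std::out_of_range("axis must be in [-rank, rank)");
    }
    return static_cast<std::size_t>(axis < 0 ? axis + r : axis);
}

// Applies softmax in place along the given axis and returns the number of vectors processed.
// Assumed layout: shape[0] is the fastest-varying dimension.
std::size_t softmax_along_axis(std::vector<float> &data, const std::vector<std::size_t> &shape, int axis)
{
    const std::size_t ax = wrap_axis(axis, shape.size());

    std::size_t inner = 1; // elements between consecutive entries of one vector
    for(std::size_t d = 0; d < ax; ++d)
    {
        inner *= shape[d];
    }
    const std::size_t len   = shape[ax];
    std::size_t       outer = 1;
    for(std::size_t d = ax + 1; d < shape.size(); ++d)
    {
        outer *= shape[d];
    }
    if(data.size() != inner * len * outer)
    {
        throw std::invalid_argument("data size does not match shape");
    }
    if(len == 0)
    {
        return 0;
    }

    for(std::size_t o = 0; o < outer; ++o)
    {
        for(std::size_t i = 0; i < inner; ++i)
        {
            float *v = data.data() + o * len * inner + i;

            // Subtract the maximum before exponentiating for numerical stability.
            float max_v = v[0];
            for(std::size_t k = 1; k < len; ++k)
            {
                max_v = std::max(max_v, v[k * inner]);
            }
            float sum = 0.0f;
            for(std::size_t k = 0; k < len; ++k)
            {
                v[k * inner] = std::exp(v[k * inner] - max_v);
                sum += v[k * inner];
            }
            for(std::size_t k = 0; k < len; ++k)
            {
                v[k * inner] /= sum;
            }
        }
    }
    return inner * outer;
}
```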
 - New OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
   - @ref CLLogicalNot
   - @ref CLLogicalAnd
   - @ref CLLogicalOr
 - New Arm® Neon™ kernels / functions:
   - @ref NELogicalNot
   - @ref NELogicalAnd
   - @ref NELogicalOr
 - Removed padding from Arm® Neon™ kernels:
   - NEComplexPixelWiseMultiplicationKernel
   - NENonMaximaSuppression3x3Kernel
   - NERemapKernel
   - NEGEMMInterleave4x4Kernel
   - NEDirectConvolutionLayerKernel
   - NEScaleKernel
   - NELocallyConnectedMatrixMultiplyKernel
   - NEGEMMLowpOffsetContributionKernel
   - NEGEMMTranspose1xWKernel
   - NEPoolingLayerKernel
   - NEConvolutionKernel
   - NEDepthwiseConvolutionLayerNativeKernel
   - NEGEMMLowpMatrixMultiplyKernel
   - NEGEMMMatrixMultiplyKernel
   - NEDirectConvolutionLayerOutputStageKernel
   - @ref NEReductionOperationKernel
   - NEGEMMLowpMatrixAReductionKernel
   - NEGEMMLowpMatrixBReductionKernel
 - Removed padding from OpenCL kernels:
   - CLBatchConcatenateLayerKernel
   - CLElementwiseOperationKernel
   - @ref CLBatchNormalizationLayerKernel
   - CLPoolingLayerKernel
   - CLWinogradInputTransformKernel
   - CLGEMMLowpMatrixMultiplyNativeKernel
   - CLGEMMLowpMatrixAReductionKernel
   - CLGEMMLowpMatrixBReductionKernel
   - CLGEMMLowpOffsetContributionOutputStageKernel
   - CLGEMMLowpOffsetContributionKernel
   - CLWinogradOutputTransformKernel
   - CLGEMMLowpMatrixMultiplyReshapedKernel
   - @ref CLFuseBatchNormalizationKernel
   - @ref CLDepthwiseConvolutionLayerNativeKernel
   - CLDepthConvertLayerKernel
   - CLCopyKernel
   - CLDepthwiseConvolutionLayer3x3NHWCKernel
   - CLActivationLayerKernel
   - CLWinogradFilterTransformKernel
   - CLWidthConcatenateLayerKernel
   - CLWidthConcatenate4TensorsKernel
   - CLWidthConcatenate2TensorsKernel
   - CLLogits1DMaxShiftExpSumKernel
   - CLLogits1DNormKernel
   - CLHeightConcatenateLayerKernel
   - CLGEMMMatrixMultiplyKernel
   - CLGEMMLowpQuantizeDownInt32ScaleKernel
   - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
   - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - CLDepthConcatenateLayerKernel
   - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
 - Removed OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel
   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel
 - Deprecated OpenCL kernels / functions (if a kernel is used only by a function that is being deprecated, the kernel is deprecated with it):
   - CLLocallyConnectedLayer
   - CLLocallyConnectedMatrixMultiplyKernel
   - CLAbsoluteDifference
   - CLAbsoluteDifferenceKernel
   - CLAccumulate
   - CLAccumulateKernel
   - CLAccumulateSquared
   - CLAccumulateSquaredKernel
   - CLAccumulateWeighted
   - CLAccumulateWeightedKernel
   - CLAccumulateWeightedFP16Kernel
   - CLBox3x3
   - CLBox3x3Kernel
   - CLBox3x3FP16Kernel
   - CLCannyEdge
   - CLChannelCombine
   - CLChannelCombineKernel
   - CLChannelExtract
   - CLChannelExtractKernel
   - CLColorConvert
   - CLColorConvertKernel
   - CLConvolution3x3
   - CLConvolutionRectangle
   - CLConvolutionRectangleKernel
   - CLConvolutionSquare
   - CLConvolutionKernel
   - CLDerivative
   - CLDerivativeKernel
   - CLDilate
   - CLDilateKernel
   - CLEqualizeHistogram
   - CLErode
   - CLErodeKernel
   - CLFastCorners
   - CLFastCornersKernel
   - CLGaussian3x3
   - CLGaussian3x3Kernel
   - CLGaussian5x5
   - CLGaussian5x5HorKernel
   - CLGaussian5x5VertKernel
   - CLGaussianPyramid
   - CLGaussianPyramidHalf
   - CLGaussianPyramidOrb
   - CLHarrisCorners
   - CLHarrisScoreKernel
   - CLHarrisScoreFP16Kernel
   - CLHistogram
   - CLHistogramKernel
   - CLHOGOrientationBinningKernel
   - CLHOGBlockNormalizationKernel
   - CLHOGDetectorKernel
   - CLHOGNonMaximaSuppressionKernel
   - CLHOGDescriptor
   - CLHOGDetector
   - CLHOGGradient
   - CLHOGMultiDetection
   - CLIntegralImage
   - CLIntegralImageKernel
   - CLLaplacianReconstruct
   - CLLaplacianPyramid
   - CLMagnitude
   - CLMagnitudePhaseKernel
   - CLMedian3x3
   - CLMedian3x3Kernel
   - CLMinMaxLocation
   - CLMinMaxLocationKernel
   - CLNonLinearFilter
   - CLNonLinearFilterKernel
   - CLNonMaximaSuppression3x3
   - CLNonMaximaSuppression3x3FP16Kernel
   - CLNonMaximaSuppression3x3Kernel
   - CLOpticalFlow
   - CLPhase
   - CLRemap
   - CLRemapKernel
   - CLScharr3x3
   - CLScharr3x3Kernel
   - CLSobel3x3
   - CLSobel3x3Kernel
   - CLSobel5x5
   - CLSobel5x5HorKernel
   - CLSobel5x5VertKernel
   - CLSobel7x7
   - CLSobel7x7HorKernel
   - CLSobel7x7VertKernel
   - CLThreshold
   - CLThresholdKernel
   - CLWarpAffine
   - CLWarpAffineKernel
   - CLWarpPerspective
   - CLWarpPerspectiveKernel
 - Deprecated Arm® Neon™ kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
   - NELocallyConnectedLayer
   - NELocallyConnectedMatrixMultiplyKernel
   - NEAbsoluteDifference
   - NEAbsoluteDifferenceKernel
   - NEAccumulate
   - NEAccumulateKernel
   - NEAccumulateSquared
   - NEAccumulateSquaredKernel
   - NEAccumulateWeighted
   - NEAccumulateWeightedKernel
   - NEAccumulateWeightedFP16Kernel
   - NEBox3x3
   - NEBox3x3Kernel
   - NEBox3x3FP16Kernel
   - NECannyEdge
   - NEChannelCombine
   - NEChannelCombineKernel
   - NEChannelExtract
   - NEChannelExtractKernel
   - NEColorConvert
   - NEColorConvertKernel
   - NEConvolution3x3
   - NEConvolutionRectangle
   - NEConvolutionRectangleKernel
   - NEConvolutionSquare
   - NEConvolutionKernel
   - NEDerivative
   - NEDerivativeKernel
   - NEDilate
   - NEDilateKernel
   - NEEqualizeHistogram
   - NEErode
   - NEErodeKernel
   - NEFastCorners
   - NEFastCornersKernel
   - NEGaussian3x3
   - NEGaussian3x3Kernel
   - NEGaussian5x5
   - NEGaussian5x5HorKernel
   - NEGaussian5x5VertKernel
   - NEGaussianPyramid
   - NEGaussianPyramidHalf
   - NEGaussianPyramidOrb
   - NEHarrisCorners
   - NEHarrisScoreKernel
   - NEHarrisScoreFP16Kernel
   - NEHistogram
   - NEHistogramKernel
   - NEHOGOrientationBinningKernel
   - NEHOGBlockNormalizationKernel
   - NEHOGDetectorKernel
   - NEHOGNonMaximaSuppressionKernel
   - NEHOGDescriptor
   - NEHOGDetector
   - NEHOGGradient
   - NEHOGMultiDetection
   - NEIntegralImage
   - NEIntegralImageKernel
   - NELaplacianReconstruct
   - NELaplacianPyramid
   - NEMagnitude
   - NEMagnitudePhaseKernel
   - NEMedian3x3
   - NEMedian3x3Kernel
   - NEMinMaxLocation
   - NEMinMaxLocationKernel
   - NENonLinearFilter
   - NENonLinearFilterKernel
   - NENonMaximaSuppression3x3
   - NENonMaximaSuppression3x3FP16Kernel
   - NENonMaximaSuppression3x3Kernel
   - NEOpticalFlow
   - NEPhase
   - NERemap
   - NERemapKernel
   - NEScharr3x3
   - NEScharr3x3Kernel
   - NESobel3x3
   - NESobel3x3Kernel
   - NESobel5x5
   - NESobel5x5HorKernel
   - NESobel5x5VertKernel
   - NESobel7x7
   - NESobel7x7HorKernel
   - NESobel7x7VertKernel
   - NEThreshold
   - NEThresholdKernel
   - NEWarpAffine
   - NEWarpAffineKernel
   - NEWarpPerspective
   - NEWarpPerspectiveKernel
 - Deprecated GLES kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
   - GCAbsoluteDifference
   - GCActivationLayer
   - GCArithmeticAddition
   - GCBatchNormalizationLayer
   - GCConcatenateLayer
   - GCConvolutionLayer
   - GCDepthwiseConvolutionLayer
   - GCDirectConvolutionLayer
   - GCDropoutLayer
   - GCFillBorder
   - GCFullyConnectedLayer
   - GCGEMM
   - GCGEMMInterleave4x4
   - GCGEMMTranspose1xW
   - GCNormalizationLayer
   - GCNormalizePlanarYUVLayer
   - GCPixelWiseMultiplication
   - GCPoolingLayer
   - GCScale
   - GCSoftmaxLayer
   - GCTensorShift
   - GCTranspose

v20.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Added new data type QASYMM8_SIGNED support for:
   - @ref CLArgMinMaxLayer
   - @ref CLArgMinMaxLayerKernel
 - Added new data type U8 support for:
   - @ref NECropKernel
   - CLCropKernel
 - Added align_corners support for nearest neighbor interpolation in:
   - NEScaleKernel
   - CLScaleKernel
 - New OpenCL kernels / functions:
   - @ref CLMaxUnpoolingLayerKernel
 - New Arm® Neon™ kernels / functions:
   - NEMaxUnpoolingLayerKernel
 - New graph example:
   - graph_yolov3_output_detector
 - GEMMTuner improvements:
   - Added fp16 support
   - Output json files for easier integration
   - Enabled tuning for export_to_cl_image_rhs option for RHS tensors
   - More robust script for running benchmarks
 - Removed padding from:
   - NEPixelWiseMultiplicationKernel
   - NEHeightConcatenateLayerKernel
   - NEThresholdKernel
   - NEBatchConcatenateLayerKernel
   - NETransposeKernel
   - @ref NEBatchNormalizationLayerKernel
   - NEArithmeticSubtractionKernel
   - @ref NEBoundingBoxTransformKernel
   - NELogits1DMaxKernel
   - NELogits1DSoftmaxKernel
   - @ref NEROIPoolingLayerKernel
   - @ref NEROIAlignLayerKernel
   - NEYOLOLayerKernel
   - NEUpsampleLayerKernel
   - NEFloorKernel
   - NEWidthConcatenateLayerKernel
   - NEDepthConcatenateLayerKernel
   - @ref NENormalizationLayerKernel
   - @ref NEL2NormalizeLayerKernel
   - NEFillArrayKernel
   - NEDepthConvertLayerKernel
   - @ref NERangeKernel
   - @ref NEPriorBoxLayer
 - Removed OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ToUint8Scale
   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
 - Removed Arm® Neon™ kernels / functions:
   - NEGEMMLowpQuantizeDownInt32ToUint8Scale
   - NEGEMMMatrixAccumulateBiasesKernel
 - Deprecated functions / interfaces:
   - Non-descriptor based interfaces for NEThreshold, CLThreshold
   - Non-descriptor based interfaces for @ref NEScale, @ref CLScale and GCScale
   - In @ref NESoftmaxLayer, @ref NELogSoftmaxLayer, @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer:
      The default "axis" value for @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer is changed from 1 to 0.
      Only axis 0 is supported.
      The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
      Only axis 0 is supported.
 - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
 - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLIm2ColKernel (NHWC only)
   - This change allows @ref CLGEMMConvolutionLayer to be used without extra padding for the input and output.
   - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
   - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since CLGEMMMatrixMultiplyKernel is called and currently requires padding.
 - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
   - This allows the OpenCL buffer used for the reshaped RHS matrix to be exported to an OpenCL image object.
   - The padding requirement for the OpenCL image object is taken into account in CLGEMMReshapeRHSMatrixKernel.
   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer.
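
The align_corners option for nearest neighbor interpolation listed in this release can be illustrated with a small standalone sketch. It is not taken from NEScaleKernel or CLScaleKernel: the helper names `resize_ratio` and `nearest_source_index` are hypothetical, and the rounding rule (round to nearest when corners are aligned, floor otherwise) is the common convention assumed here. With corners aligned, the first and last output samples map exactly onto the first and last input samples.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: ratio between input and output sample spacing.
// With aligned corners the end points coincide, so both sizes shrink by one.
inline float resize_ratio(std::size_t in_size, std::size_t out_size, bool align_corners)
{
    assert(in_size > 0 && out_size > 0);
    const std::size_t offset = (align_corners && out_size > 1) ? 1 : 0;
    return static_cast<float>(in_size - offset) / static_cast<float>(out_size - offset);
}

// Hypothetical helper: input index sampled by a given output index.
// Assumed convention: round to nearest when corners are aligned, floor otherwise.
inline std::size_t nearest_source_index(std::size_t out_idx, std::size_t in_size,
                                        std::size_t out_size, bool align_corners)
{
    const float pos    = static_cast<float>(out_idx) * resize_ratio(in_size, out_size, align_corners);
    const float picked = align_corners ? std::round(pos) : std::floor(pos);
    // Clamp so that rounding can never step past the last input sample.
    return std::min(static_cast<std::size_t>(picked), in_size - 1);
}
```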

v20.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Updated recommended NDK version to r18b.
 - Updated recommended gcc version to Linaro 6.3.1.
 - Added Bfloat16 type support
 - Added Bfloat16 support in:
   - NEWeightsReshapeKernel
   - NEConvolutionLayerReshapeWeights
   - NEIm2ColKernel
   - NEIm2Col
   - NEDepthConvertLayerKernel
   - @ref NEDepthConvertLayer
   - @ref NEGEMMConvolutionLayer
   - NEGEMMAssemblyDispatch
 - Added new data type QASYMM8_SIGNED support for:
   - @ref CLDirectConvolutionLayer
   - @ref CLDeconvolutionLayer
   - @ref CLDirectDeconvolutionLayer
   - @ref CLGEMMDeconvolutionLayer
   - CLGEMMLowpMatrixMultiplyReshapedKernel
   - CLGEMMLowpQuantizeDownInt32ScaleKernel
   - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
   - @ref CLReductionOperation
   - @ref CLReduceMean
   - @ref NEScale
   - NEScaleKernel
   - NEUpsampleLayer
   - @ref NECast
   - @ref NEReductionOperation
   - @ref NEReduceMean
   - @ref NEArgMinMaxLayer
   - @ref NEDeconvolutionLayer
   - NEGEMMLowpQuantizeDownInt32ScaleKernel
   - @ref CPPBoxWithNonMaximaSuppressionLimit
   - @ref CPPDetectionPostProcessLayer
   - @ref CPPPermuteKernel
   - @ref CPPPermute
   - @ref CPPTopKVKernel
   - @ref CPPTopKV
   - @ref CPPUpsample
   - @ref CPPUpsampleKernel
 - New OpenCL kernels / functions:
   - @ref CLQLSTMLayer
   - @ref CLQLSTMLayerNormalizationKernel
 - New Arm® Neon™ kernels / functions:
   - @ref NEQLSTMLayer
   - @ref NEQLSTMLayerNormalizationKernel
 - Added HARD_SWISH support in:
   - CLActivationLayerKernel
   - NEActivationLayerKernel
 - Deprecated OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ToUint8Scale
   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
 - Deprecated Arm® Neon™ kernels / functions:
   - NEGEMMLowpQuantizeDownInt32ToUint8Scale
 - Removed CPP kernels / functions:
   - CPPFlipWeightsKernel
 - Removed PoolingLayerInfo constructors without Data Layout.
 - Removed CLDepthwiseConvolutionLayer3x3
 - Removed NEDepthwiseConvolutionLayerOptimized
 - Added support for Winograd 3x3,4x4 on Arm® Neon™ FP16:
   - @ref NEWinogradConvolutionLayer
   - CpuWinogradConv2dTransformInputKernel
   - CpuWinogradConv2dTransformOutputKernel
   - CpuWinogradConv2dTransformWeightsKernel
 - Added CLCompileContext
 - Added Arm® Neon™ GEMM kernel with 2D window support
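
For readers unfamiliar with the HARD_SWISH activation added in this release, the sketch below shows the usual scalar definition, `x * relu6(x + 3) / 6`. It is a plain reference formula, not the vectorised code in CLActivationLayerKernel or NEActivationLayerKernel, and the function names `hard_swish` and `hard_swish_inplace` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Scalar reference: hard_swish(x) = x * relu6(x + 3) / 6.
inline float hard_swish(float x)
{
    const float relu6 = std::min(std::max(x + 3.0f, 0.0f), 6.0f);
    return x * relu6 / 6.0f;
}

// Applies the activation to a contiguous buffer in place.
inline void hard_swish_inplace(float *data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
    {
        data[i] = hard_swish(data[i]);
    }
}
```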

v20.02.1 Maintenance release
 - Added Android-NN build script.

v20.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Added new data type QASYMM8_SIGNED support for:
   - @ref CLDepthwiseConvolutionLayer
   - CLDepthwiseConvolutionLayer3x3
   - @ref CLGEMMConvolutionLayer
   - CLGEMMLowpMatrixMultiplyCore
   - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - CLGEMMLowpMatrixMultiplyNativeKernel
   - @ref NEActivationLayer
   - NEComparisonOperationKernel
   - @ref NEConvolutionLayer
   - @ref NEDepthwiseConvolutionLayer
   - NEDepthwiseConvolutionLayer3x3Kernel
   - NEDirectConvolutionLayerOutputStageKernel
   - @ref NEElementwiseComparison
   - @ref NEElementwiseMax
   - @ref NEElementwiseMin
   - @ref NEElementwiseSquaredDiff
   - @ref NEFullyConnectedLayer
   - NEGEMMMatrixVectorMultiplyKernel
   - @ref NEPixelWiseMultiplication
   - @ref NEPoolingLayer
   - @ref NEPReluLayer
 - Added support for QSYMM8_PER_CHANNEL in:
   - NEDepthwiseConvolutionLayer3x3Kernel
 - Added support for split sizes in:
   - @ref CLSplit
   - @ref NESplit
 - New OpenCL kernels / functions:
   - @ref CLFill
   - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
 - New Arm® Neon™ kernels / functions:
   - @ref NEFill
   - NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
 - Deprecated functions / interfaces:
   - CLDepthwiseConvolutionLayer3x3
   - NEDepthwiseConvolutionLayerOptimized
   - PoolingLayerInfo constructors without Data Layout.
 - Added support for quantization with multiplier greater than 1 on Arm® Neon™ and CL.
 - Added support for quantized inputs of type QASYMM8_SIGNED and QASYMM8 to @ref CLQuantizationLayer.
 - Added the ability to build bootcode for bare metal.
 - Added support for generating synthetic QASYMM8 graphs.
 - Added support for F16 datatype in VGG16.
 - Removed pre-built binaries for GLES.
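
The entry on quantization with a multiplier greater than 1 is easier to follow with a concrete decomposition. The sketch below assumes the common fixed-point scheme in which a real multiplier is split into a Q0.31 mantissa and a power-of-two shift; the struct `QuantizedMultiplier`, the function `decompose_multiplier` and the sign convention (positive shift means a left shift) are all hypothetical choices made for illustration, not the library's API. A multiplier below 1 gives a shift of zero or less, while a multiplier greater than 1 gives a positive shift.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type: multiplier ~= mantissa_q31 * 2^(shift - 31).
struct QuantizedMultiplier
{
    std::int32_t mantissa_q31; // mantissa in [2^30, 2^31), i.e. [0.5, 1) in Q0.31
    int          shift;        // positive: left shift, negative: right shift
};

inline QuantizedMultiplier decompose_multiplier(double multiplier)
{
    assert(multiplier > 0.0);
    int exponent = 0;
    // frexp returns a fraction in [0.5, 1) such that multiplier = fraction * 2^exponent.
    const double fraction = std::frexp(multiplier, &exponent);
    std::int64_t q        = static_cast<std::int64_t>(std::llround(fraction * 2147483648.0));
    // Rounding can push the mantissa up to exactly 2^31; renormalise in that case.
    if (q == 2147483648LL)
    {
        q /= 2;
        ++exponent;
    }
    return {static_cast<std::int32_t>(q), exponent};
}
```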

v19.11.1 Public maintenance release
 - Fix offset calculation in NEReductionOperationKernel.
 - Fix data layout in NEScaleKernel for nhwc.
 - Retain configuration step data layout to avoid side-effects.
 - Perform sqrt in double domain for L2 pooling.
 - Fix output shape calculation for Reduce Mean.
 - Restrict cases where optimized NEPadLayer runs.

v19.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Updated recommended NDK version to r17c.
 - Deprecated OpenCL kernels / functions:
   - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel
   - CLDepthwiseIm2ColKernel
   - CLDepthwiseSeparableConvolutionLayer
   - CLDepthwiseVectorToTensorKernel
   - CLDirectConvolutionLayerOutputStageKernel
 - Deprecated Arm® Neon™ kernels / functions:
   - NEDepthwiseWeightsReshapeKernel
   - NEDepthwiseIm2ColKernel
   - NEDepthwiseSeparableConvolutionLayer
   - NEDepthwiseVectorToTensorKernel
   - NEDepthwiseConvolutionLayer3x3
 - New OpenCL kernels / functions:
   - @ref CLInstanceNormalizationLayerKernel / @ref CLInstanceNormalizationLayer
   - @ref CLDepthwiseConvolutionLayerNativeKernel to replace the old generic depthwise convolution (see Deprecated OpenCL kernels / functions)
   - @ref CLLogSoftmaxLayer
 - New Arm® Neon™ kernels / functions:
   - @ref NEBoundingBoxTransformKernel / @ref NEBoundingBoxTransform
   - @ref NEComputeAllAnchorsKernel / NEComputeAllAnchors
   - @ref NEDetectionPostProcessLayer
   - @ref NEGenerateProposalsLayer
   - @ref NEInstanceNormalizationLayerKernel / @ref NEInstanceNormalizationLayer
   - @ref NELogSoftmaxLayer
   - @ref NEROIAlignLayerKernel / @ref NEROIAlignLayer
 - Added QASYMM8 support for:
   - @ref CLGenerateProposalsLayer
   - @ref CLROIAlignLayer
   - @ref CPPBoxWithNonMaximaSuppressionLimit
 - Added QASYMM16 support for:
   - @ref CLBoundingBoxTransform
 - Added FP16 support for:
   - CLGEMMMatrixMultiplyReshapedKernel
 - Added new data type QASYMM8_PER_CHANNEL support for:
   - CLDequantizationLayer
   - @ref NEDequantizationLayer
 - Added new data type QSYMM8_PER_CHANNEL support for:
   - @ref CLConvolutionLayer
   - @ref NEConvolutionLayer
   - @ref CLDepthwiseConvolutionLayer
   - @ref NEDepthwiseConvolutionLayer
 - Added FP16 mixed-precision support for:
   - CLGEMMMatrixMultiplyReshapedKernel
   - CLPoolingLayerKernel
 - Added FP32 and FP16 ELU activation for:
   - @ref CLActivationLayer
   - @ref NEActivationLayer
 - Added asymmetric padding support for:
   - @ref CLDirectDeconvolutionLayer
   - @ref CLGEMMDeconvolutionLayer
   - @ref NEDeconvolutionLayer
 - Added SYMMETRIC and REFLECT modes for @ref CLPadLayerKernel / @ref CLPadLayer.
 - Replaced the calls to NECopyKernel and NEMemsetKernel with @ref NEPadLayer in @ref NEGenerateProposalsLayer.
 - Replaced the calls to CLCopyKernel and CLMemsetKernel with @ref CLPadLayer in @ref CLGenerateProposalsLayer.
 - Improved performance for CL Inception V3 - FP16.
 - Improved accuracy for CL Inception V3 - FP16 by enabling FP32 accumulator (mixed-precision).
 - Improved Arm® Neon™ performance by enabling fusing batch normalization with convolution and depth-wise convolution layer.
 - Improved Arm® Neon™ performance for MobileNet-SSD by improving the output detection performance.
 - Optimized @ref CLPadLayer.
 - Optimized CL generic depthwise convolution layer by introducing @ref CLDepthwiseConvolutionLayerNativeKernel.
 - Reduced memory consumption by implementing weights sharing.
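
The difference between the SYMMETRIC and REFLECT padding modes mentioned above can be shown on a one-dimensional example. The sketch below is a standalone illustration of the usual definitions (REFLECT mirrors around the edge element without repeating it, SYMMETRIC mirrors including the edge element); the `PadMode` enum and `pad_1d` function are hypothetical and are not the CLPadLayer interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mode selector for this sketch only.
enum class PadMode
{
    Reflect,  // mirror around the edge element, edge not repeated
    Symmetric // mirror including the edge element
};

// Pads a 1D sequence with `pad` mirrored elements on both sides.
inline std::vector<int> pad_1d(const std::vector<int> &src, std::size_t pad, PadMode mode)
{
    const std::size_t n = src.size();
    // REFLECT needs pad < n, SYMMETRIC needs pad <= n.
    assert(pad == 0 || (mode == PadMode::Reflect ? pad < n : pad <= n));
    const std::size_t skip = (mode == PadMode::Reflect) ? 1 : 0;

    std::vector<int> dst;
    dst.reserve(n + 2 * pad);
    // Left border: walk towards the first element from the inside.
    for (std::size_t i = pad; i > 0; --i)
    {
        dst.push_back(src[i - 1 + skip]);
    }
    dst.insert(dst.end(), src.begin(), src.end());
    // Right border: walk away from the last element towards the inside.
    for (std::size_t i = 0; i < pad; ++i)
    {
        dst.push_back(src[n - 1 - skip - i]);
    }
    return dst;
}
```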

v19.08.1 Public maintenance release
 - Fix offset calculation in NEReductionOperationKernel.
 - Fix data layout in NEScaleKernel for nhwc.
 - Retain configuration step data layout to avoid side-effects.
 - Perform sqrt in double domain for L2 pooling.
 - Fix output shape calculation for Reduce Mean.
 - Fix broadcast CLPixelWiseMultiplication with 5D tensors.

v19.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Deprecated Arm® Neon™ functions:
   - NEDepthConcatenateLayer
   - NEWidthConcatenateLayer
 - Deprecated OpenCL kernels / functions:
   - CLDepthConcatenateLayer
   - CLGEMMInterleave4x4Kernel / CLGEMMInterleave4x4
   - CLGEMMTranspose1xWKernel / CLGEMMTranspose1xW
   - CLWidthConcatenateLayer
 - New Arm® Neon™ kernels / functions:
   - @ref NEAbsLayer
   - @ref NECast
   - @ref NEElementwisePower
   - @ref NELogLayer
   - @ref NELSTMLayerQuantized
   - @ref NENegLayer
   - @ref NEPReluLayer
   - @ref NESinLayer
   - NEBatchConcatenateLayerKernel
   - @ref NEDepthToSpaceLayerKernel / @ref NEDepthToSpaceLayer
   - NEDepthwiseConvolutionLayerNativeKernel
   - NEGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - @ref NEMeanStdDevNormalizationKernel / @ref NEMeanStdDevNormalizationLayer
   - @ref NESpaceToDepthLayerKernel / @ref NESpaceToDepthLayer
 - New OpenCL kernels / functions:
   - @ref CLAbsLayer
   - @ref CLElementwisePower
   - @ref CLLogLayer
   - @ref CLLSTMLayerQuantized
   - @ref CLNegLayer
   - @ref CLPReluLayer
   - @ref CLSinLayer
   - CLBatchConcatenateLayerKernel
   - @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer
   - CLGEMMLowpMatrixMultiplyNativeKernel
   - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - CLGEMMMatrixMultiplyNativeKernel
   - CLMeanStdDevNormalizationKernel / CLMeanStdDevNormalizationLayer
   - @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer
 - New examples:
   - neon_opticalflow
   - cl_cache
   - neon_permute
 - Added support for FP16 in @ref NEDeconvolutionLayer
 - Added support for FP16 in @ref CLDeconvolutionLayer
 - Added support for REDUCE_MIN and REDUCE_MAX in @ref ReductionOperation
 - Enable the fusion of batch normalization with convolution and depthwise convolution layer for FP32 in the graph API (OpenCL only)
 - Added support for fusing activation function and broadcast addition with the matrix multiplication for FP32 (OpenCL only)
 - Re-factored the depthwise convolution layer kernel on Arm® Neon™ for generic cases
 - Added an optimized depthwise convolution layer kernel for 5x5 filters (Neon™ only)
 - Added support to enable OpenCL kernel cache. Added example showing how to load the prebuilt OpenCL kernels from a binary cache file
 - Altered @ref QuantizationInfo interface to support per-channel quantization.
 - The CLDepthwiseConvolutionLayer3x3 will be included by @ref CLDepthwiseConvolutionLayer to accommodate future optimizations.
 - The NEDepthwiseConvolutionLayerOptimized will be included by @ref NEDepthwiseConvolutionLayer to accommodate future optimizations.
 - Removed inner_border_right and inner_border_top parameters from @ref CLDeconvolutionLayer interface
 - Removed inner_border_right and inner_border_top parameters from @ref NEDeconvolutionLayer interface
 - Optimized the Arm® Neon™ assembly kernel for GEMMLowp. The new implementation fuses the output stage and quantization with the matrix multiplication kernel
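
The fusion of batch normalization with convolution noted in this release rests on a simple identity: because batch normalization is an affine transform per output channel at inference time, it can be folded into the convolution weights and bias ahead of time. The sketch below shows that arithmetic only, under the assumption of a flat weight buffer laid out as one contiguous block per output channel; `FoldedParams` and `fold_batch_norm` are hypothetical names and do not describe the graph API's own mutator.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the folded convolution parameters.
struct FoldedParams
{
    std::vector<float> weights;
    std::vector<float> bias;
};

// Folds y = gamma * (conv(x) - mean) / sqrt(var + epsilon) + beta into the
// convolution itself. Assumes weights hold one contiguous block per output channel.
inline FoldedParams fold_batch_norm(const std::vector<float> &weights,
                                    const std::vector<float> &bias,
                                    const std::vector<float> &mean,
                                    const std::vector<float> &var,
                                    const std::vector<float> &gamma,
                                    const std::vector<float> &beta,
                                    float                     epsilon)
{
    const std::size_t channels = bias.size();
    assert(channels > 0 && weights.size() % channels == 0);
    assert(mean.size() == channels && var.size() == channels);
    assert(gamma.size() == channels && beta.size() == channels);

    const std::size_t per_channel = weights.size() / channels;
    FoldedParams      out{weights, bias};
    for (std::size_t c = 0; c < channels; ++c)
    {
        const float scale = gamma[c] / std::sqrt(var[c] + epsilon);
        for (std::size_t i = 0; i < per_channel; ++i)
        {
            out.weights[c * per_channel + i] *= scale;
        }
        out.bias[c] = (bias[c] - mean[c]) * scale + beta[c];
    }
    return out;
}
```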

v19.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
   - @ref NEBatchToSpaceLayerKernel / @ref NEBatchToSpaceLayer
   - NEComplexPixelWiseMultiplicationKernel / @ref NEComplexPixelWiseMultiplication
   - @ref NECropKernel / @ref NECropResize
   - NEDepthwiseConvolutionAssemblyDispatch
   - @ref NEFFTDigitReverseKernel
   - @ref NEFFTRadixStageKernel
   - @ref NEFFTScaleKernel
   - NEGEMMLowpOffsetContributionOutputStageKernel
   - NEHeightConcatenateLayerKernel
   - @ref NESpaceToBatchLayerKernel / @ref NESpaceToBatchLayer
   - @ref NEFFT1D
   - @ref NEFFT2D
   - @ref NEFFTConvolutionLayer
 - New OpenCL kernels / functions:
   - CLComplexPixelWiseMultiplicationKernel / @ref CLComplexPixelWiseMultiplication
   - CLCropKernel / @ref CLCropResize
   - @ref CLDeconvolutionReshapeOutputKernel
   - @ref CLFFTDigitReverseKernel
   - @ref CLFFTRadixStageKernel
   - @ref CLFFTScaleKernel
   - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
   - CLHeightConcatenateLayerKernel
   - @ref CLDirectDeconvolutionLayer
   - @ref CLFFT1D
   - @ref CLFFT2D
   - @ref CLFFTConvolutionLayer
   - @ref CLGEMMDeconvolutionLayer
 - New OpenGLES kernels / functions:
   - GCConcatenateLayer
 - Deprecated functions / interfaces:
   - GCDepthConcatenateLayer
   - NEWidthConcatenateLayer
   - NEDepthConcatenateLayer
   - CLWidthConcatenateLayer
   - CLDepthConcatenateLayer
   - CLGEMMInterleave4x4
   - CLGEMMTranspose1xW
 - Support different quantization info in CLConcatLayer.
 - Add checks for cases where different input/output quantization info is not supported.
 - Tensors have different quantization information.
 - Add FP16 support checks.
 - Fix output quantization in CLDepthwiseConv3x3 when activation is fused.
 - New graph examples:
   - graph_convolution
   - graph_fully_connected
   - graph_depthwise_convolution
   - Deepspeech v0.4.1
 - Add support for QASYMM8 in NEArithmeticSubtractionKernel.
 - Add support for QASYMM8 in NEPixelWiseMultiplicationKernel.
 - Add support for QASYMM8 in NEDeconvolution.
 - Add support for DequantizationLayer for Neon/CL.
 - Add support for dilation in CLDepthwiseConvolution.
 - Fuse offset contribution with the output stage when we use NEGEMMLowpMatrixMultiplyCore.
 - Optimize CLDeconvolution.
 - Add StackLayer to the graph API.
 - Add support for "reflect" padding mode in NEPad.
 - Winograd 7x7 NHWC on OpenCL.
 - Rework CL ML layers to run exclusively on CL.
 - Support different quantization info in PoolingLayer.
 - Implement and test import memory interfaces.
 - Added new tests and removed old ones.
 - Various clang-tidy fixes.

v19.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
    - @ref NETileKernel / @ref NETile
    - @ref NEFuseBatchNormalizationKernel / @ref NEFuseBatchNormalization
    - NEElementwiseOperationKernel
    - @ref NEElementwiseMax
    - @ref NEElementwiseMin
    - @ref NEElementwiseSquaredDiff
    - @ref NESelectKernel / @ref NESelect
    - @ref NESplit
    - @ref NESlice
    - @ref NEUnstack
    - @ref NEStridedSliceKernel / @ref NEStridedSlice
    - NEElementwiseUnaryKernel
    - @ref NERsqrtLayer
    - @ref NEExpLayer
    - @ref NEReverseKernel / @ref NEReverse
    - @ref NEArgMinMaxLayer
    - @ref NEStackLayerKernel / @ref NEStackLayer
    - @ref NERangeKernel / @ref NERange
    - @ref NEPadLayer
    - NEMemsetKernel
    - @ref NEGatherKernel / @ref NEGather
    - @ref NEElementwiseComparison
    - @ref NEElementwiseComparisonStatic
    - NEComparisonOperationKernel
    - @ref NEElementwiseDivision
 - New OpenCL kernels / functions:
    - @ref CLSelectKernel / @ref CLSelect
    - @ref CLTileKernel / @ref CLTile
    - @ref CLComparisonKernel / @ref CLComparison
    - @ref CLArgMinMaxLayer
    - @ref CLElementwiseMax
    - @ref CLElementwiseMin
    - @ref CLElementwiseSquaredDiff
    - @ref CLStackLayerKernel / @ref CLStackLayer
    - @ref CLReverse / @ref CLReverseKernel
    - @ref CLRsqrtLayer
    - @ref CLExpLayer
    - CLElementWiseUnaryLayerKernel
    - CLGEMMReshapeLHSMatrixKernel
    - CLGEMMReshapeRHSMatrixKernel
    - CLGEMMMatrixMultiplyReshapedKernel
    - @ref CLRangeKernel / @ref CLRange
    - @ref CLUnstack
    - @ref CLGatherKernel / @ref CLGather
    - CLGEMMLowpMatrixMultiplyReshapedKernel
 - New CPP kernels / functions:
    - @ref CPPDetectionOutputLayer
    - @ref CPPTopKV / @ref CPPTopKVKernel
 - Added new examples:
    - graph_ssd_mobilenet.cpp
    - graph_mobilenet_v2.cpp
    - graph_resnet12.cpp
    - graph_srcnn955.cpp
    - graph_vgg_vdsr.cpp
    - graph_inception_resnet_v1.cpp
 - Add 4D tensors support to
    - @ref NESoftmaxLayer
 - Fused activation in @ref CLWinogradConvolutionLayer
 - Extended @ref NEPermute to support more cases
 - Added Neon™/SVE GEMM Hybrid kernels
 - Added u8 and s8 hybrid assembly kernels
 - Introduced GEMM strategy name in NEGEMMAssemblyWrapper
 - Improved @ref CLTuner
 - Fused the bias addition within @ref CLGEMM
 - Added support for QASYMM8 LOGISTIC activation in @ref NEActivationLayer
 - Added NHWC data layout support to:
    - @ref NEScale for F16
    - @ref CLNormalizationLayer IN_MAP_2D for FP32/FP16
    - @ref NEL2NormalizeLayer for FP32/FP16
    - @ref NENormalizationLayer IN_MAP_2D for FP32/FP16
    - @ref CLROIAlignLayer
    - @ref CLGenerateProposalsLayer
 - Added QASYMM8 support to the following kernels:
    - NEArithmeticAdditionKernel
    - @ref NEScale
 - Added new tests and improved validation and benchmarking suites.
 - Deprecated functions/interfaces
    - Usage of inner_border_right and inner_border_top has been deprecated in @ref CLDeconvolutionLayer and @ref NEDeconvolutionLayer

v18.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
    - @ref NEChannelShuffleLayer / @ref NEChannelShuffleLayerKernel
    - @ref NEReduceMean
    - @ref NEReorgLayer / @ref NEReorgLayerKernel
    - @ref NEPriorBoxLayer / @ref NEPriorBoxLayerKernel
    - NEUpsampleLayer / NEUpsampleLayerKernel
    - NEYOLOLayer / NEYOLOLayerKernel
 - New OpenCL kernels / functions:
    - @ref CLBatchToSpaceLayer / @ref CLBatchToSpaceLayerKernel
    - @ref CLBoundingBoxTransform / @ref CLBoundingBoxTransformKernel
    - @ref CLComputeAllAnchorsKernel
    - @ref CLGenerateProposalsLayer
    - @ref CLNormalizePlanarYUVLayer / @ref CLNormalizePlanarYUVLayerKernel
    - @ref CLReorgLayer / @ref CLReorgLayerKernel
    - @ref CLSpaceToBatchLayer / @ref CLSpaceToBatchLayerKernel
    - @ref CLPadLayer
    - @ref CLReduceMean
    - @ref CLPriorBoxLayer / @ref CLPriorBoxLayerKernel
    - @ref CLROIAlignLayer / @ref CLROIAlignLayerKernel
    - @ref CLSlice
    - @ref CLSplit
    - @ref CLStridedSlice / @ref CLStridedSliceKernel
    - CLUpsampleLayer / CLUpsampleLayerKernel
    - CLYOLOLayer / CLYOLOLayerKernel
 - New CPP kernels / functions:
    - @ref CPPBoxWithNonMaximaSuppressionLimit / @ref CPPBoxWithNonMaximaSuppressionLimitKernel
 - Added the validate method in:
    - @ref NEDepthConvertLayer
    - @ref NEFloor / @ref CLFloor
    - NEGEMMMatrixAdditionKernel
    - @ref NEReshapeLayer / @ref CLReshapeLayer
    - @ref CLScale
 - Added new examples:
    - graph_shufflenet.cpp
    - graph_yolov3.cpp
 - Added documentation for adding a new function or kernel.
 - Improved doxygen documentation adding a list of the existing functions.
 - Add 4D tensors support to
    - CLWidthConcatenateLayer
    - CLFlattenLayer
    - @ref CLSoftmaxLayer
 - Add dot product support for CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride
 - Add SVE support
 - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization
 - Fuses activation in CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer
 - Added NHWC data layout support to:
    - @ref CLChannelShuffleLayer
    - @ref CLDeconvolutionLayer
    - @ref CLL2NormalizeLayer
 - Added QASYMM8 support to the following kernels:
    - CLScaleKernel
    - NEDepthwiseConvolutionLayer3x3Kernel
    - CLPixelWiseMultiplicationKernel
 - Added FP16 support to the following kernels:
    - CLDepthwiseConvolutionLayer3x3NHWCKernel
    - NEDepthwiseConvolutionLayer3x3Kernel
    - @ref CLNormalizePlanarYUVLayerKernel
    - @ref CLWinogradConvolutionLayer (5x5 kernel)
 - More tests added to both validation and benchmarking suites.

v18.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Updated recommended NDK version to r17b.
 - Removed support for QS8/QS16 data types.
 - Added support for grouped convolution in @ref CLConvolutionLayer.
 - Added NHWC data layout support to:
    - NEDepthConcatenateLayer / CLDepthConcatenateLayer
    - @ref NEWinogradConvolutionLayer / @ref CLWinogradConvolutionLayer
    - @ref CLDepthwiseConvolutionLayer
    - @ref CLDirectConvolutionLayer
    - @ref CLConvolutionLayer
    - @ref CLScale
    - CLIm2ColKernel
 - New Arm® Neon™ kernels / functions:
    - @ref NERNNLayer
 - New OpenCL kernels / functions:
    - @ref CLArithmeticDivision
 - Introduced prepare() stage support in the graph API for GLES.
 - Added support for memory reuse when trying to allocate smaller CLTensors.
 - Enabled NHWC execution on graph examples.
 - Added JPEG accessor for validation purposes.
 - Added validate methods to some kernels / functions.

v18.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Major redesign in the interface for the Neon™ kernels implemented in assembly.
 - Removed arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore / arm_compute::NEHGEMMAArch64FP16Kernel
 - Added NEGEMMAssemblyWrapper and AssemblyKernelGlue which are used to execute assembly kernels in Neon™ functions.
 - Minor changes to the CPUInfo type to make it compatible with the new assembly gemm interface.
 - Moved Neon™ assembly kernels to the folder src/core/Neon/kernels/arm_gemm.
 - Improved doxygen documentation.
 - Improved memory management for layer's transitions.
 - Added support for NHWC data layout in tensors.
 - Added NHWC data layout support to:
    - @ref NEGEMMConvolutionLayer
    - @ref NEDirectConvolutionLayer
    - @ref NEPoolingLayer / @ref CLPoolingLayer
    - @ref NEBatchNormalizationLayer / @ref CLBatchNormalizationLayer
    - @ref NEDepthwiseConvolutionLayer
    - @ref NEScale
    - NEIm2Col
 - Added support for dilated convolutions in @ref NEConvolutionLayer and @ref CLConvolutionLayer.
 - New OpenCL kernels / functions:
    - @ref CLChannelShuffleLayer / @ref CLChannelShuffleLayerKernel
    - CLConvertFullyConnectedWeightsKernel / @ref CLConvertFullyConnectedWeights
    - @ref CLCopy / CLCopyKernel
    - @ref CLLSTMLayer
    - @ref CLRNNLayer
    - CLWidthConcatenateLayer / CLWidthConcatenateLayerKernel
    - CLWinogradFilterTransformKernel / @ref CLWinogradConvolutionLayer
    - CLWinogradInputTransformKernel / CLWinogradInputTransform
 - New Arm® Neon™ kernels / functions:
    - NEConvertFullyConnectedWeightsKernel / @ref NEConvertFullyConnectedWeights.
 - Created the validate method in @ref CLDepthwiseConvolutionLayer.
 - Beta and gamma are no longer mandatory arguments in @ref NEBatchNormalizationLayer and @ref CLBatchNormalizationLayer.
 - Added depth multiplier support in @ref NEDepthwiseConvolutionLayer and @ref CLDepthwiseConvolutionLayer.
 - Added broadcast multiply support in @ref NEPixelWiseMultiplication / NEPixelWiseMultiplicationKernel.
 - Port mobilenet example to NHWC data layout.
 - Enabled Winograd method in @ref CLConvolutionLayer.
 - Renamed NEWinogradLayer to @ref NEWinogradConvolutionLayer.
 - Updated @ref NEWinogradConvolutionLayer to use highly optimised assembly kernels in src/core/Neon/kernels/arm_gemm.
 - Added memory manager support in GLES functions.
 - Major refactoring of the graph API.
 - Added GLES backend in the graph API.
 - Added support for the memory manager in the graph API.
 - Enabled Winograd Convolution method in the graph API.
 - Added support for grouped convolutions in the graph API.
 - Replaced NEDeconvolutionLayerUpsampleKernel with NEScaleKernel in @ref NEDeconvolutionLayer.
 - Added fast maths flag in @ref CLConvolutionLayer.
 - Added new tests and benchmarks in validation and benchmark frameworks
 - Merge Activation layer with Convolution Layer (Neon™, CL, GLES)
 - Added support for OpenCL 2.0 SVM
 - Added support to import memory in OpenCL tensors.
 - Added the prepare() method to perform any one-off pre-processing before running the function.
 - Added new examples:
    - graph_inception_v4.cpp
    - graph_resnext50.cpp
 - Added memory measurement instrument for CL.
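The dilated convolution support listed for v18.05 affects how the output size follows from the kernel size. The sketch below shows the usual arithmetic for one spatial dimension, assuming floor rounding and the same padding on both sides. The function name `dilated_conv_out_dim` and its parameters are hypothetical and are not part of the library's API; with a dilation of 1 the formula reduces to the ordinary convolution case.

```cpp
#include <cassert>

// Hypothetical helper, not a library function: output extent along one
// spatial dimension of a dilated convolution, assuming floor rounding and
// symmetric padding.
int dilated_conv_out_dim(int in, int kernel, int stride, int pad, int dilation)
{
    // A dilated kernel spans dilation * (kernel - 1) + 1 input elements.
    const int effective_kernel = dilation * (kernel - 1) + 1;
    assert(stride > 0);
    assert(in + 2 * pad >= effective_kernel);
    // Integer division rounds down here because both operands are non-negative.
    return (in + 2 * pad - effective_kernel) / stride + 1;
}

// Example: a 3-wide kernel on a 32-wide input with stride 1 and no padding
// gives 30 outputs at dilation 1 and 28 outputs at dilation 2.
```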

v18.03 Public maintenance release
 - Various bug fixes.
 - Fixed bug in @ref NEActivationLayer
 - Fix in @ref CLTuner when using batches.
 - Updated recommended NDK version to r16b (And fixed warnings).
 - Fixed bug in validation code.
 - Added Inception v4 graph example.
 - Renamed NEWinogradLayer.cpp to @ref NEWinogradConvolutionLayer

v18.02 Public major release
 - Various Arm® Neon™ / OpenCL / GLES optimisations.
 - Various bug fixes.
 - Changed default number of threads on big LITTLE systems.
 - Refactored examples and added:
    - graph_mobilenet_qassym8
    - graph_resnet
    - graph_squeezenet_v1_1
 - Renamed @ref CLConvolutionLayer into @ref CLGEMMConvolutionLayer and created a new @ref CLConvolutionLayer to select the fastest convolution method.
 - Renamed @ref NEConvolutionLayer into @ref NEGEMMConvolutionLayer and created a new @ref NEConvolutionLayer to select the fastest convolution method.
 - Added in place support to:
    - @ref CLActivationLayer
    - @ref CLBatchNormalizationLayer
 - Added QASYMM8 support to:
    - @ref CLActivationLayer
    - @ref CLDepthwiseConvolutionLayer
    - @ref NEDepthwiseConvolutionLayer
    - @ref NESoftmaxLayer
 - Added FP16 support to:
    - CLDepthwiseConvolutionLayer3x3
    - @ref CLDepthwiseConvolutionLayer
 - Added broadcasting support to NEArithmeticAddition / @ref CLArithmeticAddition / @ref CLPixelWiseMultiplication
 - Added fused batched normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer
 - Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer
 - New OpenCL kernels / functions:
    - CLDirectConvolutionLayerOutputStageKernel
 - New Arm® Neon™ kernels / functions
    - Added name() method to all kernels.
    - Added support for Winograd 5x5.
    - NEPermuteKernel / @ref NEPermute
    - CpuWinogradConv2dTransformInputKernel / NEWinogradLayer
    - CpuWinogradConv2dTransformOutputKernel / NEWinogradLayer
    - CpuWinogradConv2dTransformWeightsKernel / NEWinogradLayer
    - Renamed NEWinogradLayerKernel into NEWinogradLayerBatchedGEMMKernel
 - New GLES kernels / functions:
    - GCTensorShiftKernel / GCTensorShift

v18.01 Public maintenance release
 - Various bug fixes
 - Added some of the missing validate() methods
 - Added @ref CLDeconvolutionLayerUpsampleKernel / @ref CLDeconvolutionLayer @ref CLDeconvolutionLayerUpsample
 - Added CLPermuteKernel / @ref CLPermute
 - Added method to clean the programs cache in the CL Kernel library.
 - Added GCArithmeticAdditionKernel / GCArithmeticAddition
 - Added GCDepthwiseConvolutionLayer3x3Kernel / GCDepthwiseConvolutionLayer3x3
 - Added GCNormalizePlanarYUVLayerKernel / GCNormalizePlanarYUVLayer
 - Added GCScaleKernel / GCScale
 - Added GCWeightsReshapeKernel / GCConvolutionLayer
 - Added FP16 support to the following GLES compute kernels:
    - GCCol2ImKernel
    - GCGEMMInterleave4x4Kernel
    - GCGEMMTranspose1xWKernel
    - GCIm2ColKernel
 - Refactored Arm® Neon™ Winograd (NEWinogradLayerKernel)
 - Added NEDirectConvolutionLayerOutputStageKernel
 - Added QASYMM8 support to the following Arm® Neon™ kernels:
    - NEDepthwiseConvolutionLayer3x3Kernel
    - @ref NEFillBorderKernel
    - NEPoolingLayerKernel
 - Added new examples:
    - graph_cl_mobilenet_qasymm8.cpp
    - graph_inception_v3.cpp
    - gc_dc.cpp
 - More tests added to both validation and benchmarking suites.

v17.12 Public major release
 - Most machine learning functions on OpenCL support the new data type QASYMM8
 - Introduced logging interface
 - Introduced OpenCL timer
 - Reworked GEMMLowp interface
 - Added new Arm® Neon™ assembly kernels for GEMMLowp, SGEMM and HGEMM
 - Added validation method for most Machine Learning kernels / functions
 - Added new graph examples such as googlenet, mobilenet, squeezenet, vgg16 and vgg19
 - Added sgemm example for OpenCL
 - Added absolute difference example for GLES compute
 - Added new tests and benchmarks in validation and benchmark frameworks
 - Added new kernels / functions for GLES compute

 - New OpenGL ES kernels / functions
    - GCAbsoluteDifferenceKernel / GCAbsoluteDifference
    - GCActivationLayerKernel / GCActivationLayer
    - GCBatchNormalizationLayerKernel / GCBatchNormalizationLayer
    - GCCol2ImKernel
    - GCDepthConcatenateLayerKernel / GCDepthConcatenateLayer
    - GCDirectConvolutionLayerKernel / GCDirectConvolutionLayer
    - GCDropoutLayerKernel / GCDropoutLayer
    - GCFillBorderKernel / GCFillBorder
    - GCGEMMInterleave4x4Kernel / GCGEMMInterleave4x4
    - GCGEMMMatrixAccumulateBiasesKernel / GCGEMMMatrixAdditionKernel / GCGEMMMatrixMultiplyKernel / GCGEMM
    - GCGEMMTranspose1xWKernel / GCGEMMTranspose1xW
    - GCIm2ColKernel
    - GCNormalizationLayerKernel / GCNormalizationLayer
    - GCPixelWiseMultiplicationKernel / GCPixelWiseMultiplication
    - GCPoolingLayerKernel / GCPoolingLayer
    - GCLogits1DMaxKernel / GCLogits1DShiftExpSumKernel / GCLogits1DNormKernel / GCSoftmaxLayer
    - GCTransposeKernel / GCTranspose

 - New Arm® Neon™ kernels / functions
    - arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore
    - arm_compute::NEHGEMMAArch64FP16Kernel
    - NEDepthwiseConvolutionLayer3x3Kernel / NEDepthwiseIm2ColKernel / NEGEMMMatrixVectorMultiplyKernel / NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
    - NEGEMMLowpOffsetContributionKernel / NEGEMMLowpMatrixAReductionKernel / NEGEMMLowpMatrixBReductionKernel / NEGEMMLowpMatrixMultiplyCore
    - NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
    - NEWinogradLayer / NEWinogradLayerKernel

 - New OpenCL kernels / functions
    - CLGEMMLowpOffsetContributionKernel / CLGEMMLowpMatrixAReductionKernel / CLGEMMLowpMatrixBReductionKernel / CLGEMMLowpMatrixMultiplyCore
    - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint

 - New graph nodes for Arm® Neon™ and OpenCL
    - graph::BranchLayer
    - graph::DepthConvertLayer
    - graph::DepthwiseConvolutionLayer
    - graph::DequantizationLayer
    - graph::FlattenLayer
    - graph::QuantizationLayer
    - graph::ReshapeLayer

v17.10 Public maintenance release
 - Bug fixes:
    - Check the maximum local workgroup size supported by OpenCL devices
 - Minor documentation updates (Fixed instructions to build the examples)
 - Introduced a graph::GraphContext
 - Added a few new Graph nodes, support for branches and grouping.
 - Automatically enable cl_printf in debug builds
 - Fixed bare metal builds for armv7a
 - Added AlexNet and cartoon effect examples
 - Fixed library builds: libraries are no longer built as supersets of each other. (This means that applications using the Runtime part of the library now need to link against both libarm_compute_core and libarm_compute.)

v17.09 Public major release
 - Experimental Graph support: initial implementation of a simple stream API to easily chain machine learning layers.
 - Memory Manager (@ref BlobLifetimeManager, @ref BlobMemoryPool, @ref ILifetimeManager, @ref IMemoryGroup, @ref IMemoryManager, @ref IMemoryPool, @ref IPoolManager, @ref MemoryManagerOnDemand, @ref PoolManager)
 - New validation and benchmark frameworks (Boost and Google frameworks replaced by homemade framework).
 - Most machine learning functions support both fixed point 8 and 16 bit (QS8, QS16) for both Arm® Neon™ and OpenCL.
 - New Arm® Neon™ kernels / functions:
    - arm_compute::NEGEMMAssemblyBaseKernel arm_compute::NEGEMMAArch64Kernel
    - NEDequantizationLayerKernel / @ref NEDequantizationLayer
    - NEFloorKernel / @ref NEFloor
    - @ref NEL2NormalizeLayerKernel / @ref NEL2NormalizeLayer
    - NEQuantizationLayerKernel NEMinMaxLayerKernel / @ref NEQuantizationLayer
    - @ref NEROIPoolingLayerKernel / @ref NEROIPoolingLayer
    - @ref NEReductionOperationKernel / @ref NEReductionOperation
    - NEReshapeLayerKernel / @ref NEReshapeLayer

 - New OpenCL kernels / functions:
    - CLDepthwiseConvolutionLayer3x3NCHWKernel CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
    - CLDequantizationLayerKernel / CLDequantizationLayer
    - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
    - CLFlattenLayer
    - CLFloorKernel / @ref CLFloor
    - CLGEMMTranspose1xW
    - CLGEMMMatrixVectorMultiplyKernel
    - @ref CLL2NormalizeLayerKernel / @ref CLL2NormalizeLayer
    - CLQuantizationLayerKernel CLMinMaxLayerKernel / @ref CLQuantizationLayer
    - @ref CLROIPoolingLayerKernel / @ref CLROIPoolingLayer
    - @ref CLReductionOperationKernel / @ref CLReductionOperation
    - CLReshapeLayerKernel / @ref CLReshapeLayer

v17.06 Public major release
 - Various bug fixes
 - Added support for fixed point 8 bit (QS8) to the various Arm® Neon™ machine learning kernels.
 - Added unit tests and benchmarks (AlexNet, LeNet)
 - Added support for sub tensors.
 - Added infrastructure to provide GPU specific optimisation for some OpenCL kernels.
 - Added @ref OMPScheduler (OpenMP) scheduler for Neon
 - Added @ref SingleThreadScheduler scheduler for Arm® Neon™ (For bare metal)
 - Users can specify their own scheduler by implementing the @ref IScheduler interface.
 - New OpenCL kernels / functions:
    - @ref CLBatchNormalizationLayerKernel / @ref CLBatchNormalizationLayer
    - CLDepthConcatenateLayerKernel / CLDepthConcatenateLayer
    - CLHOGOrientationBinningKernel CLHOGBlockNormalizationKernel, CLHOGDetectorKernel / CLHOGDescriptor CLHOGDetector CLHOGGradient CLHOGMultiDetection
    - CLLocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedLayer
    - CLWeightsReshapeKernel / CLConvolutionLayerReshapeWeights
 - New C++ kernels:
    - CPPDetectionWindowNonMaximaSuppressionKernel
 - New Arm® Neon™ kernels / functions:
    - @ref NEBatchNormalizationLayerKernel / @ref NEBatchNormalizationLayer
    - NEDepthConcatenateLayerKernel / NEDepthConcatenateLayer
    - NEDirectConvolutionLayerKernel / @ref NEDirectConvolutionLayer
    - NELocallyConnectedMatrixMultiplyKernel / NELocallyConnectedLayer
    - NEWeightsReshapeKernel / NEConvolutionLayerReshapeWeights

v17.05 Public bug fixes release
 - Various bug fixes
 - Ported the remaining functions to use accurate padding.
 - Library does not link against OpenCL anymore (It uses dlopen / dlsym at runtime instead to determine whether or not OpenCL is available).
 - Added "free" method to allocator.
 - Minimum version of g++ required for armv7 Linux changed from 4.8 to 4.9
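The v17.05 entry about no longer linking against OpenCL refers to probing for the library at runtime with dlopen / dlsym. The library's actual loading code is not shown in this changelog, so the following is only a minimal sketch of that general technique on a POSIX system. The helper names `has_symbol` and `opencl_is_available`, the file name `libOpenCL.so` and the choice of `clGetPlatformIDs` as the probed symbol are assumptions made for illustration; on older toolchains the program may also need to be linked with `-ldl`.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper, not the library's actual loader: returns true when the
// shared object can be opened and exports the requested symbol.
bool has_symbol(const std::string &library, const std::string &symbol)
{
    void *handle = dlopen(library.c_str(), RTLD_LAZY | RTLD_LOCAL);
    if(handle == nullptr)
    {
        return false; // The library is not present on this system.
    }
    const bool found = (dlsym(handle, symbol.c_str()) != nullptr);
    dlclose(handle);
    return found;
}

// Assumed library and symbol names: real systems may ship OpenCL under a
// different file name or path.
bool opencl_is_available()
{
    return has_symbol("libOpenCL.so", "clGetPlatformIDs");
}
```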

v17.04 Public bug fixes release

 The following functions have been ported to use the new accurate padding:
 - CLColorConvertKernel
 - CLEdgeNonMaxSuppressionKernel
 - CLEdgeTraceKernel
 - CLGaussianPyramidHorKernel
 - CLGaussianPyramidVertKernel
 - CLGradientKernel
 - NEChannelCombineKernel
 - NEFillArrayKernel
 - NEGaussianPyramidHorKernel
 - NEGaussianPyramidVertKernel
 - NEHarrisScoreFP16Kernel
 - NEHarrisScoreKernel
 - NEHOGDetectorKernel
 - NELogits1DMaxKernel
 - NELogits1DShiftExpSumKernel
 - NELogits1DNormKernel
 - NENonMaximaSuppression3x3FP16Kernel
 - NENonMaximaSuppression3x3Kernel

v17.03.1 First Major public release of the sources
1546 - Renamed the library to arm_compute
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +00001547 - New CPP target introduced for C++ kernels shared between Arm® Neon™ and CL functions.
Anthony Barbier6ff3b192017-09-04 18:44:23 +01001548 - New padding calculation interface introduced and ported most kernels / functions to use it.
1549 - New OpenCL kernels / functions:
Gian Marco Iodiceeb65f6d2020-04-15 11:42:15 +01001550 - CLGEMMLowpMatrixMultiplyKernel / CLGEMMLowp
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +00001551 - New Arm® Neon™ kernels / functions:
Anthony Barbier3762e742018-03-02 11:49:33 +00001552 - @ref NENormalizationLayerKernel / @ref NENormalizationLayer
Teresa Charlind1dc09c2021-03-04 15:24:45 +00001553 - NETransposeKernel / @ref NETranspose
Michalis Spyrou373b4072021-01-20 16:41:12 +00001554 - NELogits1DMaxKernel, NELogits1DShiftExpSumKernel, NELogits1DNormKernel / @ref NESoftmaxLayer
Manuel Bottini24b89202021-07-01 18:13:33 +01001555 - NEIm2ColKernel, NECol2ImKernel, NEConvolutionLayerWeightsReshapeKernel / @ref NEConvolutionLayer
Michele Di Giorgiof22f6722020-07-03 16:29:24 +01001556 - NEGEMMMatrixAccumulateBiasesKernel / @ref NEFullyConnectedLayer
Manuel Bottinicfac51c2021-06-18 15:47:28 +01001557 - NEGEMMLowpMatrixMultiplyKernel / NEGEMMLowp
Anthony Barbier6ff3b192017-09-04 18:44:23 +01001558
v17.03 Sources preview
 - New OpenCL kernels / functions:
   - CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
   - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
   - CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
   - CLTransposeKernel / @ref CLTranspose
   - CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow
   - @ref CLNormalizationLayerKernel / @ref CLNormalizationLayer
   - CLLaplacianPyramid, CLLaplacianReconstruct
 - New Arm® Neon™ kernels / functions:
   - NEActivationLayerKernel / @ref NEActivationLayer
   - GEMM refactoring + FP16 support (Requires armv8.2 CPU): NEGEMMInterleave4x4Kernel, NEGEMMTranspose1xWKernel, NEGEMMMatrixMultiplyKernel, NEGEMMMatrixAdditionKernel / @ref NEGEMM
   - NEPoolingLayerKernel / @ref NEPoolingLayer

v17.02.1 Sources preview
 - New OpenCL kernels / functions:
   - CLLogits1DMaxKernel, CLLogits1DShiftExpSumKernel, CLLogits1DNormKernel / @ref CLSoftmaxLayer
   - CLPoolingLayerKernel / @ref CLPoolingLayer
   - CLIm2ColKernel, CLCol2ImKernel, CLConvolutionLayerWeightsReshapeKernel / CLConvolutionLayer
   - CLRemapKernel / CLRemap
   - CLGaussianPyramidHorKernel, CLGaussianPyramidVertKernel / CLGaussianPyramid, CLGaussianPyramidHalf, CLGaussianPyramidOrb
   - CLMinMaxKernel, CLMinMaxLocationKernel / CLMinMaxLocation
   - CLNonLinearFilterKernel / CLNonLinearFilter
 - New Arm® Neon™ FP16 kernels (Requires armv8.2 CPU)
   - NEAccumulateWeightedFP16Kernel
   - NEBox3x3FP16Kernel
   - NENonMaximaSuppression3x3FP16Kernel

v17.02 Sources preview
 - New OpenCL kernels / functions:
   - CLActivationLayerKernel / @ref CLActivationLayer
   - CLChannelCombineKernel / CLChannelCombine
   - CLDerivativeKernel / CLChannelExtract
   - CLFastCornersKernel / CLFastCorners
   - CLMeanStdDevKernel / CLMeanStdDev
 - New Arm® Neon™ kernels / functions:
   - HOG / SVM: NEHOGOrientationBinningKernel, NEHOGBlockNormalizationKernel, NEHOGDetectorKernel, NEHOGNonMaximaSuppressionKernel / NEHOGDescriptor, NEHOGDetector, NEHOGGradient, NEHOGMultiDetection
   - NENonLinearFilterKernel / NENonLinearFilter
 - Introduced a CLScheduler to manage the default context and command queue used by the runtime library and create synchronisation events.
 - Switched all the kernels / functions to use tensors instead of images.
 - Updated documentation to include instructions to build the library from sources.

v16.12 Binary preview release
 - Original release

 */
} // namespace arm_compute