///
/// Copyright (c) 2017-2021 Arm Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
/** @page versions_changelogs Release Versions and Changelog

@tableofcontents

@section S2_1_versions Release versions

All releases are numbered vYY.MM, where YY is the last two digits of the year and MM is the month number.
If there is more than one release in a month, an extra sequential number is appended at the end:

    v17.03 (First release of March 2017)
    v17.03.1 (Second release of March 2017)
    v17.04 (First release of April 2017)

@note We aim to publish one major public release with new features per quarter. All releases in between will only contain bug fixes.

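
As an illustration of the numbering scheme above, the following minimal sketch splits a release tag into its components and orders two releases. It is not part of the library: `ReleaseVersion`, `parse_release` and `is_older` are hypothetical names chosen for this example, and capping the sequential suffix at three digits is an assumption made only to keep the sketch simple.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical helper type for this example; not part of the library API.
struct ReleaseVersion
{
    int year;  // YY: last two digits of the year
    int month; // MM: month number, 1..12
    int seq;   // 0 for the first release of a month, 1 for the second, ...
};

static bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// Parses "vYY.MM" or "vYY.MM.N". Returns false on malformed input and leaves 'out' untouched.
bool parse_release(const std::string &tag, ReleaseVersion &out)
{
    // "vYY.MM" is 6 characters; the optional ".N" suffix is capped at 3 digits in this sketch.
    if(tag.size() < 6 || tag.size() > 10)
    {
        return false;
    }
    if(tag[0] != 'v' || !is_digit(tag[1]) || !is_digit(tag[2]) || tag[3] != '.' || !is_digit(tag[4]) || !is_digit(tag[5]))
    {
        return false;
    }
    ReleaseVersion v{ (tag[1] - '0') * 10 + (tag[2] - '0'), (tag[4] - '0') * 10 + (tag[5] - '0'), 0 };
    if(v.month < 1 || v.month > 12)
    {
        return false;
    }
    if(tag.size() > 6)
    {
        // A suffix must be a dot followed by at least one digit.
        if(tag[6] != '.' || tag.size() == 7)
        {
            return false;
        }
        for(std::size_t i = 7; i < tag.size(); ++i)
        {
            if(!is_digit(tag[i]))
            {
                return false;
            }
            v.seq = v.seq * 10 + (tag[i] - '0');
        }
    }
    out = v;
    return true;
}

// Orders releases chronologically: year, then month, then sequence number.
bool is_older(const ReleaseVersion &a, const ReleaseVersion &b)
{
    return std::tie(a.year, a.month, a.seq) < std::tie(b.year, b.month, b.seq);
}
```

With these helpers, `v17.03` parses to sequence number 0 and `v17.03.1` to sequence number 1, so `v17.03` is older than `v17.03.1`, which in turn is older than `v17.04`.
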
@section S2_2_changelog Changelog

v21.11 Public major release
 - Various bug fixes.
 - Various optimizations.
 - New OpenCL kernels / functions:
   - @ref CLConv3D
 - New Arm® Neon™ kernels / functions:
   - @ref NEConv3D

v21.08 Public major release
 - Various bug fixes.
 - Various optimizations:
   - Improve LWS (Local-Workgroup-Size) heuristic in OpenCL for GEMM, Direct Convolution and Winograd Transformations when OpenCL tuner is not used
   - Improve QASYMM8/QSYMM8 performance on OpenCL for various Arm® Mali™ GPU architectures
 - Add dynamic weights support in Fully connected layer (CPU/GPU)
 - Various performance optimizations for floating-point data types (CPU/GPU)
 - Add a reduced core library build arm_compute_core_v2
 - Expose Operator API
 - Support fat binary build for armv8.2-a via fat_binary build flag
 - Add CPU discovery capabilities
 - Add data type f16 support for:
   - @ref CLRemapKernel
 - Port the following functions to stateless API:
   - @ref CLConvolutionLayer
   - @ref CLFlattenLayer
   - @ref CLFullyConnectedLayer
   - @ref CLGEMM
   - @ref CLGEMMConvolutionLayer
   - @ref CLGEMMLowpMatrixMultiplyCore
   - @ref CLWinogradConvolutionLayer
   - @ref NEConvolutionLayer
   - @ref NEFlattenLayer
   - @ref NEFullyConnectedLayer
   - @ref NEGEMM
   - @ref NEGEMMConv2d
   - @ref NEGEMMConvolutionLayer
   - @ref NEGEMMLowpMatrixMultiplyCore
   - @ref NEWinogradConvolutionLayer
 - Remove the following functions:
   - CLWinogradInputTransform
 - Remove CLCoreRuntimeContext
 - Remove ICPPSimpleKernel
 - Rename file arm_compute/runtime/CL/functions/CLElementWiseUnaryLayer.h to arm_compute/runtime/CL/functions/CLElementwiseUnaryLayer.h

v21.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Various documentation updates:
   - Add supported operators and corresponding Android NNAPI operators.
   - Reorganize the documentation into a user guide and a contributor guide.
 - Add support for a global allocator for OpenCL tensors
 - Add experimental support for [CLVK](https://github.com/kpet/clvk).
 - Add data type S32 support for:
   - @ref opencl::kernels::ClArithmeticKernel
 - Add data type QASYMM8 support for:
   - @ref CLROIPoolingLayer
   - @ref CLROIPoolingLayerKernel
   - @ref NEROIPoolingLayer
   - @ref NEROIPoolingLayerKernel
 - Add per-channel quantization support for:
   - @ref CLDeconvolutionLayer
   - @ref CLDirectDeconvolutionLayer
   - @ref NEConvolutionLayer
   - @ref NEDeconvolutionLayer
 - Remove padding from OpenCL kernels:
   - @ref CLL2NormalizeLayerKernel
   - CLDepthwiseConvolutionLayer3x3NHWCKernel
   - @ref CLNormalizationLayerKernel
   - @ref CLNormalizePlanarYUVLayerKernel
   - @ref opencl::kernels::ClMulKernel
   - @ref CLReductionOperationKernel
   - @ref CLROIPoolingLayerKernel
 - Remove computer vision support from Arm® Neon™ backend
   - Remove the following functions:
     - NEAbsoluteDifference
     - NEAccumulate
     - NEBox3x3
     - NECannyEdge
     - NEChannelCombine
     - NEChannelExtract
     - NEColorConvert
     - NEConvolution
     - NEDerivative
     - NEDilate
     - NEEqualizeHistogram
     - NEErode
     - NEFastCorners
     - NEGaussian3x3
     - NEGaussian5x5
     - NEGaussianPyramid
     - NEHOGDescriptor
     - NEHOGDetector
     - NEHOGGradient
     - NEHOGMultiDetection
     - NEHarrisCorners
     - NEHistogram
     - NEIntegralImage
     - NELaplacianPyramid
     - NELaplacianReconstruct
     - NEMagnitude
     - NEMeanStdDev
     - NEMedian3x3
     - NEMinMaxLocation
     - NENonLinearFilter
     - NEOpticalFlow
     - NEPhase
     - NEScharr3x3
     - NESobel3x3
     - NESobel5x5
     - NESobel7x7
     - NETableLookup
     - NEThreshold
     - NEWarpAffine
     - NEWarpPerspectiveKernel
 - Remove all GLES kernels / functions / tests / examples
 - Remove computer vision support from CL backend
   - Remove the following functions:
     - CLAbsoluteDifference
     - CLAccumulate
     - CLBox3x3
     - CLCannyEdge
     - CLChannelCombine
     - CLChannelExtract
     - CLColorConvert
     - CLConvolution
     - CLDerivative
     - CLDilate
     - CLEqualizeHistogram
     - CLErode
     - CLFastCorners
     - CLGaussian3x3
     - CLGaussian5x5
     - CLGaussianPyramid
     - CLHOGDescriptor
     - CLHOGDetector
     - CLHOGGradient
     - CLHOGMultiDetection
     - CLHarrisCorners
     - CLHistogram
     - CLIntegralImage
     - CLLaplacianPyramid
     - CLLaplacianReconstruct
     - CLMagnitude
     - CLMeanStdDev
     - CLMedian3x3
     - CLMinMaxLocation
     - CLNonLinearFilter
     - CLOpticalFlow
     - CLPhase
     - CLScharr3x3
     - CLSobel3x3
     - CLSobel5x5
     - CLSobel7x7
     - CLTableLookup
     - CLThreshold
     - CLWarpAffine
     - CLWarpPerspective

v21.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Upgrade C++ standard to C++14
 - Add macOS support
 - Add Armv8-R AArch64 architecture support
 - Add SVE/SVE2 support for:
   - NEScaleKernel
   - @ref NEActivationLayer
   - @ref NEArithmeticAddition
   - @ref NEBatchNormalizationLayerKernel
   - @ref cpu::kernels::CpuLogits1DSoftmaxKernel
   - @ref cpu::kernels::CpuLogits1DMaxKernel
   - @ref cpu::kernels::CpuElementwiseUnaryKernel
 - Remove padding from OpenCL kernels:
   - CLDirectConvolutionLayerKernel
   - @ref CLArgMinMaxLayerKernel
   - @ref CLPadLayerKernel
   - @ref CLROIAlignLayerKernel
   - @ref CLRangeKernel
   - CLScaleKernel
   - @ref CLSelectKernel
   - @ref CLBitwiseKernel
   - @ref opencl::kernels::ClFloorKernel
   - CLTransposeKernel
 - Deprecate functions in CLTuner:
   - add_lws_to_table
   - import_lws_table
   - lws_table
 - Remove functions:
   - NELocallyConnectedLayer / CLLocallyConnectedLayer
   - NEIm2Col
   - NECol2Im
   - NEGEMMInterleave4x4
   - NEGEMMTranspose1xW
   - NEComputeAllAnchors / CLComputeAllAnchors
   - NEGEMMAssemblyDispatch
   - NEUpsampleLayer / CLUpsampleLayer
 - Remove kernels:
   - NEGEMMMatrixVectorMultiplyKernel
   - NELocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedMatrixMultiplyKernel
   - NEUpsampleLayerKernel / CLUpsampleLayerKernel
 - Extend OpenCL tuner with workgroup batch size support
   - Experimental extension for the OpenCL tuner to tune the batches of work groups distributed to compute units
 - Add functionality to load the OpenCL GEMM heuristics at runtime
   - The GEMM heuristic file (MLGO) can be used to update the default GEMM heuristics available for OpenCL
 - Note: there might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation
 - Note: data-type decoupling is in progress and experimental. Warnings about unused symbols might be raised

v20.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Performance regressions can be noted when executing Depthwise Convolution on Arm® Neon™ with a depth multiplier > 1 for quantized data type.
   This is planned to be resolved in the 21.02 release.
 - Added new data type QASYMM8_SIGNED support for @ref NEROIAlignLayer.
 - Added new data type S32 support for:
   - NEArithmeticSubtraction
   - NEArithmeticSubtractionKernel
   - @ref NEPixelWiseMultiplication
   - NEPixelWiseMultiplicationKernel
   - NEElementwiseDivision
   - NEDivisionOperationKernel
 - Interface change
   - Properly support softmax axis to have the same meaning as other major frameworks. That is, axis now defines the dimension
     on which Softmax/Logsoftmax is performed. E.g. for input of shape 4x5x6 and axis=1, softmax will be applied to 4x6=24 vectors of size 5.
     The supported value range of axis is [-rank, rank).
     This change applies to the following functions:
       - @ref NESoftmaxLayer
       - @ref NELogSoftmaxLayer
       - @ref CLSoftmaxLayer
       - @ref CLLogSoftmaxLayer
       - GCSoftmaxLayer
 - New OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
   - @ref CLLogicalNot
   - @ref CLLogicalAnd
   - @ref CLLogicalOr
 - New Arm® Neon™ kernels / functions:
   - @ref NELogicalNot
   - @ref NELogicalAnd
   - @ref NELogicalOr
 - Removed padding from Arm® Neon™ kernels:
   - NEComplexPixelWiseMultiplicationKernel
   - NENonMaximaSuppression3x3Kernel
   - @ref NERemapKernel
   - NEGEMMInterleave4x4Kernel
   - NEDirectConvolutionLayerKernel
   - NEScaleKernel
   - NELocallyConnectedMatrixMultiplyKernel
   - NEGEMMLowpOffsetContributionKernel
   - NEGEMMTranspose1xWKernel
   - NEPoolingLayerKernel
   - NEConvolutionKernel
   - NEDepthwiseConvolutionLayerNativeKernel
   - NEGEMMLowpMatrixMultiplyKernel
   - NEGEMMMatrixMultiplyKernel
   - NEDirectConvolutionLayerOutputStageKernel
   - @ref NEReductionOperationKernel
   - NEGEMMLowpMatrixAReductionKernel
   - NEGEMMLowpMatrixBReductionKernel
 - Removed padding from OpenCL kernels:
   - CLBatchConcatenateLayerKernel
   - CLElementwiseOperationKernel
   - @ref CLBatchNormalizationLayerKernel
   - CLPoolingLayerKernel
   - CLWinogradInputTransformKernel
   - CLGEMMLowpMatrixMultiplyNativeKernel
   - CLGEMMLowpMatrixAReductionKernel
   - CLGEMMLowpMatrixBReductionKernel
   - CLGEMMLowpOffsetContributionOutputStageKernel
   - CLGEMMLowpOffsetContributionKernel
   - CLWinogradOutputTransformKernel
   - CLGEMMLowpMatrixMultiplyReshapedKernel
   - @ref CLFuseBatchNormalizationKernel
   - @ref CLDepthwiseConvolutionLayerNativeKernel
   - CLDepthConvertLayerKernel
   - CLCopyKernel
   - CLDepthwiseConvolutionLayer3x3NHWCKernel
   - CLActivationLayerKernel
   - CLWinogradFilterTransformKernel
   - CLWidthConcatenateLayerKernel
   - CLWidthConcatenate4TensorsKernel
   - CLWidthConcatenate2TensorsKernel
   - CLLogits1DMaxShiftExpSumKernel
   - CLLogits1DNormKernel
   - CLHeightConcatenateLayerKernel
   - CLGEMMMatrixMultiplyKernel
   - CLGEMMLowpQuantizeDownInt32ScaleKernel
   - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
   - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - CLDepthConcatenateLayerKernel
   - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
 - Removed OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel
   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel
 - Deprecated OpenCL kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
   - CLLocallyConnectedLayer
   - CLLocallyConnectedMatrixMultiplyKernel
   - CLAbsoluteDifference
   - CLAbsoluteDifferenceKernel
   - CLAccumulate
   - CLAccumulateKernel
   - CLAccumulateSquared
   - CLAccumulateSquaredKernel
   - CLAccumulateWeighted
   - CLAccumulateWeightedKernel
   - CLAccumulateWeightedFP16Kernel
   - CLBox3x3
   - CLBox3x3Kernel
   - CLBox3x3FP16Kernel
   - CLCannyEdge
   - CLChannelCombine
   - CLChannelCombineKernel
   - CLChannelExtract
   - CLChannelExtractKernel
   - CLColorConvert
   - CLColorConvertKernel
   - CLConvolution3x3
   - CLConvolutionRectangle
   - CLConvolutionRectangleKernel
   - CLConvolutionSquare
   - CLConvolutionKernel
   - CLDerivative
   - CLDerivativeKernel
   - CLDilate
   - CLDilateKernel
   - CLEqualizeHistogram
   - CLErode
   - CLErodeKernel
   - CLFastCorners
   - CLFastCornersKernel
   - CLGaussian3x3
   - CLGaussian3x3Kernel
   - CLGaussian5x5
   - CLGaussian5x5HorKernel
   - CLGaussian5x5VertKernel
   - CLGaussianPyramid
   - CLGaussianPyramidHalf
   - CLGaussianPyramidOrb
   - CLHarrisCorners
   - CLHarrisScoreKernel
   - CLHarrisScoreFP16Kernel
   - CLHistogram
   - CLHistogramKernel
   - CLHOGOrientationBinningKernel
   - CLHOGBlockNormalizationKernel
   - CLHOGDetectorKernel
   - CLHOGNonMaximaSuppressionKernel
   - CLHOGDescriptor
   - CLHOGDetector
   - CLHOGGradient
   - CLHOGMultiDetection
   - CLHOGOrientationBinningKernel
   - CLHOGBlockNormalizationKernel
   - CLHOGDetectorKernel
   - CLIntegralImage
   - CLIntegralImageKernel
   - CLLaplacianReconstruct
   - CLLaplacianPyramid
   - CLMagnitude
   - CLMagnitudePhaseKernel
   - CLMedian3x3
   - CLMedian3x3Kernel
   - CLMinMaxLocation
   - CLMinMaxLocationKernel
   - CLNonLinearFilter
   - CLNonLinearFilterKernel
   - CLNonMaximaSuppression3x3
   - CLNonMaximaSuppression3x3FP16Kernel
   - CLNonMaximaSuppression3x3Kernel
   - CLOpticalFlow
   - CLPhase
   - CLRemap
   - CLRemapKernel
   - CLScharr3x3
   - CLScharr3x3Kernel
   - CLSobel3x3
   - CLSobel3x3Kernel
   - CLSobel5x5
   - CLSobel5x5HorKernel
   - CLSobel5x5VertKernel
   - CLSobel7x7
   - CLSobel7x7HorKernel
   - CLSobel7x7VertKernel
   - CLThreshold
   - CLThresholdKernel
   - CLWarpAffine
   - CLWarpAffineKernel
   - CLWarpPerspective
   - CLWarpPerspectiveKernel
 - Deprecated Arm® Neon™ kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
   - NELocallyConnectedLayer
   - NELocallyConnectedMatrixMultiplyKernel
   - NEAbsoluteDifference
   - NEAbsoluteDifferenceKernel
   - NEAccumulate
   - NEAccumulateKernel
   - NEAccumulateSquared
   - NEAccumulateSquaredKernel
   - NEAccumulateWeighted
   - NEAccumulateWeightedKernel
   - NEAccumulateWeightedFP16Kernel
   - NEBox3x3
   - NEBox3x3Kernel
   - NEBox3x3FP16Kernel
   - NECannyEdge
   - NEChannelCombine
   - NEChannelCombineKernel
   - NEChannelExtract
   - NEChannelExtractKernel
   - NEColorConvert
   - NEColorConvertKernel
   - NEConvolution3x3
   - NEConvolutionRectangle
   - NEConvolutionRectangleKernel
   - NEConvolutionSquare
   - NEConvolutionKernel
   - NEDerivative
   - NEDerivativeKernel
   - NEDilate
   - NEDilateKernel
   - NEEqualizeHistogram
   - NEErode
   - NEErodeKernel
   - NEFastCorners
   - NEFastCornersKernel
   - NEGaussian3x3
   - NEGaussian3x3Kernel
   - NEGaussian5x5
   - NEGaussian5x5HorKernel
   - NEGaussian5x5VertKernel
   - NEGaussianPyramid
   - NEGaussianPyramidHalf
   - NEGaussianPyramidOrb
   - NEHarrisCorners
   - NEHarrisScoreKernel
   - NEHarrisScoreFP16Kernel
   - NEHistogram
   - NEHistogramKernel
   - NEHOGOrientationBinningKernel
   - NEHOGBlockNormalizationKernel
   - NEHOGDetectorKernel
   - NEHOGNonMaximaSuppressionKernel
   - NEHOGDescriptor
   - NEHOGDetector
   - NEHOGGradient
   - NEHOGMultiDetection
   - NEHOGOrientationBinningKernel
   - NEHOGBlockNormalizationKernel
   - NEHOGDetectorKernel
   - NEIntegralImage
   - NEIntegralImageKernel
   - NELaplacianReconstruct
   - NELaplacianPyramid
   - NEMagnitude
   - NEMagnitudePhaseKernel
   - NEMedian3x3
   - NEMedian3x3Kernel
   - NEMinMaxLocation
   - NEMinMaxLocationKernel
   - NENonLinearFilter
   - NENonLinearFilterKernel
   - NENonMaximaSuppression3x3
   - NENonMaximaSuppression3x3FP16Kernel
   - NENonMaximaSuppression3x3Kernel
   - NEOpticalFlow
   - NEPhase
   - NERemap
   - NERemapKernel
   - NEScharr3x3
   - NEScharr3x3Kernel
   - NESobel3x3
   - NESobel3x3Kernel
   - NESobel5x5
   - NESobel5x5HorKernel
   - NESobel5x5VertKernel
   - NESobel7x7
   - NESobel7x7HorKernel
   - NESobel7x7VertKernel
   - NEThreshold
   - NEThresholdKernel
   - NEWarpAffine
   - NEWarpAffineKernel
   - NEWarpPerspective
   - NEWarpPerspectiveKernel
 - Deprecated GLES kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
   - GCAbsoluteDifference
   - GCActivationLayer
   - GCArithmeticAddition
   - GCBatchNormalizationLayer
   - GCConcatenateLayer
   - GCConvolutionLayer
   - GCDepthwiseConvolutionLayer
   - GCDirectConvolutionLayer
   - GCDropoutLayer
   - GCFillBorder
   - GCFullyConnectedLayer
   - GCGEMM
   - GCGEMMInterleave4x4
   - GCGEMMTranspose1xW
   - GCNormalizationLayer
   - GCNormalizePlanarYUVLayer
   - GCPixelWiseMultiplication
   - GCPoolingLayer
   - GCScale
   - GCSoftmaxLayer
   - GCTensorShift
   - GCTranspose

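
The softmax axis semantics described under "Interface change" in v20.11 can be illustrated with a minimal sketch. It is not library code: `wrap_axis`, `SoftmaxSplit` and `softmax_split` are hypothetical names chosen for this example, and the tensor shape is passed as a plain list of dimension sizes rather than a library type. The sketch maps an axis in [-rank, rank) to a dimension index and counts how many vectors the softmax would be applied to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: maps an axis in [-rank, rank) to its equivalent in [0, rank).
int wrap_axis(int axis, int rank)
{
    assert(rank > 0 && axis >= -rank && axis < rank);
    return axis < 0 ? axis + rank : axis;
}

// Hypothetical summary of how a tensor is split into softmax vectors.
struct SoftmaxSplit
{
    std::size_t num_vectors;   // number of independent softmax computations
    std::size_t vector_length; // number of elements normalised together
};

SoftmaxSplit softmax_split(const std::vector<std::size_t> &shape, int axis)
{
    const std::size_t a = static_cast<std::size_t>(wrap_axis(axis, static_cast<int>(shape.size())));

    // The selected dimension gives the vector length; all other dimensions multiply into the vector count.
    SoftmaxSplit split{ 1, shape[a] };
    for(std::size_t d = 0; d < shape.size(); ++d)
    {
        if(d != a)
        {
            split.num_vectors *= shape[d];
        }
    }
    return split;
}
```

For an input of shape 4x5x6, `softmax_split({4, 5, 6}, 1)` gives 24 vectors of length 5, which matches the example in the changelog entry. Passing `axis = -2` selects the same dimension.
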

v20.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Added new data type QASYMM8_SIGNED support for:
   - @ref CLArgMinMaxLayer
   - @ref CLArgMinMaxLayerKernel
 - Added new data type U8 support for:
   - @ref NECropKernel
   - CLCropKernel
 - Added align_corner support for nearest neighbor interpolation in:
   - NEScaleKernel
   - CLScaleKernel
 - New OpenCL kernels / functions:
   - @ref CLMaxUnpoolingLayerKernel
 - New Arm® Neon™ kernels / functions:
   - @ref NEMaxUnpoolingLayerKernel
 - New graph example:
   - graph_yolov3_output_detector
 - GEMMTuner improvements:
   - Added fp16 support
   - Output json files for easier integration
   - Enabled tuning for export_to_cl_image_rhs option for RHS tensors
   - More robust script for running benchmarks
 - Removed padding from:
   - NEPixelWiseMultiplicationKernel
   - NEHeightConcatenateLayerKernel
   - NEThresholdKernel
   - NEBatchConcatenateLayerKernel
   - NETransposeKernel
   - @ref NEBatchNormalizationLayerKernel
   - NEArithmeticSubtractionKernel
   - @ref NEBoundingBoxTransformKernel
   - NELogits1DMaxKernel
   - NELogits1DSoftmaxKernel
   - @ref NEROIPoolingLayerKernel
   - @ref NEROIAlignLayerKernel
   - NEYOLOLayerKernel
   - NEUpsampleLayerKernel
   - NEFloorKernel
   - NEWidthConcatenateLayerKernel
   - NEDepthConcatenateLayerKernel
   - @ref NENormalizationLayerKernel
   - @ref NEL2NormalizeLayerKernel
   - NEFillArrayKernel
   - NEDepthConvertLayerKernel
   - @ref NERangeKernel
   - @ref NEPriorBoxLayer
 - Removed OpenCL kernels / functions:
   - CLGEMMLowpQuantizeDownInt32ToUint8Scale
   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
 - Removed Arm® Neon™ kernels / functions:
   - NEGEMMLowpQuantizeDownInt32ToUint8Scale
   - NEGEMMMatrixAccumulateBiasesKernel
 - Deprecated functions / interfaces:
   - Non-descriptor based interfaces for NEThreshold, CLThreshold
   - Non-descriptor based interfaces for @ref NEScale, @ref CLScale and GCScale
   - In @ref NESoftmaxLayer, @ref NELogSoftmaxLayer, @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer:
     The default "axis" value for @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer is changed from 1 to 0.
     Only axis 0 is supported.
     The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
     Only axis 0 is supported.
 - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
 - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and CLIm2ColKernel (NHWC only)
   - This change allows @ref CLGEMMConvolutionLayer to be used without extra padding for the input and output.
   - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
   - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since CLGEMMMatrixMultiplyKernel is called and currently requires padding.
 - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
   - This support allows the OpenCL buffer used for the reshaped RHS matrix to be exported to the OpenCL image object.
   - The padding requirement for the OpenCL image object is taken into account in CLGEMMReshapeRHSMatrixKernel.
   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer.

v20.05 Public major release
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000623 - Various bug fixes.
624 - Various optimisations.
Michele Di Giorgio36a551f2020-04-23 11:55:29 +0100625 - Updated recommended NDK version to r18b.
626 - Updated recommended gcc version to Linaro 6.3.1.
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000627 - Added Bfloat16 type support
628 - Added Bfloat16 support in:
Manuel Bottini29599d02021-07-06 15:01:35 +0100629 - NEWeightsReshapeKernel
630 - NEConvolutionLayerReshapeWeights
Manuel Bottini90028992021-06-30 18:29:18 +0100631 - NEIm2ColKernel
Georgios Pinitasf7c5a412020-12-03 14:38:33 +0000632 - NEIm2Col
Georgios Pinitas11d84152021-04-28 10:20:18 +0100633 - NEDepthConvertLayerKernel
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000634 - @ref NEDepthConvertLayer
635 - @ref NEGEMMConvolutionLayer
Georgios Pinitasec2256b2020-12-03 18:51:58 +0000636 - NEGEMMAssemblyDispatch
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000637 - Added new data type QASYMM8_SIGNED support for:
638 - @ref CLDirectConvolutionLayer
639 - @ref CLDeconvolutionLayer
640 - @ref CLDirectDeconvolutionLayer
641 - @ref CLGEMMDeconvolutionLayer
Georgios Pinitas4a578b92021-06-25 12:13:49 +0100642 - CLGEMMLowpMatrixMultiplyReshapedKernel
643 - CLGEMMLowpQuantizeDownInt32ScaleKernel
644 - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000645 - @ref CLReductionOperation
646 - @ref CLReduceMean
Sheri Zhang359c48e2020-04-30 22:53:39 +0100647 - @ref NEScale
Manuel Bottini10b38262021-02-19 18:16:44 +0000648 - NEScaleKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +0000649 - NEUpsampleLayer
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000650 - @ref NECast
651 - @ref NEReductionOperation
652 - @ref NEReduceMean
653 - @ref NEArgMinMaxLayer
654 - @ref NEDeconvolutionLayer
Manuel Bottiniae58bdf2021-06-17 17:18:45 +0100655 - NEGEMMLowpQuantizeDownInt32ScaleKernel
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000656 - @ref CPPBoxWithNonMaximaSuppressionLimit
657 - @ref CPPDetectionPostProcessLayer
658 - @ref CPPPermuteKernel
659 - @ref CPPPermute
660 - @ref CPPTopKVKernel
661 - @ref CPPTopKV
Sheri Zhang359c48e2020-04-30 22:53:39 +0100662 - @ref CPPUpsample
663 - @ref CPPUpsampleKernel
Sheri Zhang31b49ca2020-04-24 11:15:10 +0100664 - New OpenCL kernels / functions:
665 - @ref CLQLSTMLayer
666 - @ref CLQLSTMLayerNormalizationKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000667 - New Arm® Neon™ kernels / functions:
Sheri Zhang31b49ca2020-04-24 11:15:10 +0100668 - @ref NEQLSTMLayer
669 - @ref NEQLSTMLayerNormalizationKernel
670 - Added HARD_SWISH support in:
Georgios Pinitasf47f7182021-01-15 09:29:50 +0000671 - CLActivationLayerKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000672 - NEActivationLayerKernel
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000673 - Deprecated OpenCL kernels / functions:
674 - CLGEMMLowpQuantizeDownInt32ToUint8Scale
675 - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000676 - Deprecated Arm® Neon™ kernels / functions:
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000677 - NEGEMMLowpQuantizeDownInt32ToUint8Scale
678 - Removed CPP kernels / functions:
679 - CPPFlipWeightsKernel
Manuel Bottini387259a2020-05-21 17:14:36 +0100680 - Removed PoolingLayerInfo constructors without Data Layout.
681 - Removed CLDepthwiseConvolutionLayer3x3
682 - Removed NEDepthwiseConvolutionLayerOptimized
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000683 - Added support for Winograd 3x3, 4x4 on Arm® Neon™ FP16:
Manuel Bottini075253a2020-05-22 12:57:18 +0100684 - @ref NEWinogradConvolutionLayer
Michalis Spyrou96f977e2021-07-01 12:20:56 +0100685 - CpuWinogradConv2dTransformInputKernel
686 - CpuWinogradConv2dTransformOutputKernel
687 - CpuWinogradConv2dTransformWeightsKernel
Manuel Bottini075253a2020-05-22 12:57:18 +0100688 - Added CLCompileContext
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000689 - Added Arm® Neon™ GEMM kernel with 2D window support
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000690
Michele Di Giorgio740872e2020-03-04 15:29:49 +0000691v20.02.1 Maintenance release
692 - Added Android-NN build script.
693
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000694v20.02 Public major release
695 - Various bug fixes.
696 - Various optimisations.
697 - Added new data type QASYMM8_SIGNED support for:
698 - @ref CLDepthwiseConvolutionLayer
Manuel Bottini387259a2020-05-21 17:14:36 +0100699 - CLDepthwiseConvolutionLayer3x3
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000700 - @ref CLGEMMConvolutionLayer
Georgios Pinitas4a578b92021-06-25 12:13:49 +0100701 - CLGEMMLowpMatrixMultiplyCore
702 - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
703 - CLGEMMLowpMatrixMultiplyNativeKernel
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000704 - @ref NEActivationLayer
Sang-Hoon Park63001ac2021-01-18 14:20:27 +0000705 - NEComparisonOperationKernel
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000706 - @ref NEConvolutionLayer
707 - @ref NEDepthwiseConvolutionLayer
Georgios Pinitas7d0adc62020-09-04 15:25:24 +0100708 - NEDepthwiseConvolutionLayer3x3Kernel
Manuel Bottini327225d2021-04-13 13:09:30 +0100709 - NEDirectConvolutionLayerOutputStageKernel
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000710 - @ref NEElementwiseComparison
711 - @ref NEElementwiseMax
712 - @ref NEElementwiseMin
713 - @ref NEElementwiseSquaredDiff
714 - @ref NEFullyConnectedLayer
Michele Di Giorgiof22f6722020-07-03 16:29:24 +0100715 - NEGEMMMatrixVectorMultiplyKernel
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000716 - @ref NEPixelWiseMultiplication
717 - @ref NEPoolingLayer
718 - @ref NEPReluLayer
719 - Added support for QSYMM8_PER_CHANNEL in:
Georgios Pinitas7d0adc62020-09-04 15:25:24 +0100720 - NEDepthwiseConvolutionLayer3x3Kernel
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000721 - Added support for split sizes in:
722 - @ref CLSplit
723 - @ref NESplit
724 - New OpenCL kernels / functions:
725 - @ref CLFill
Georgios Pinitas4a578b92021-06-25 12:13:49 +0100726 - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000727 - New Arm® Neon™ kernels / functions:
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000728 - @ref NEFill
Manuel Bottiniae58bdf2021-06-17 17:18:45 +0100729 - NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000730 - Deprecated functions / interfaces:
Manuel Bottini387259a2020-05-21 17:14:36 +0100731 - CLDepthwiseConvolutionLayer3x3
732 - NEDepthwiseConvolutionLayerOptimized
733 - PoolingLayerInfo constructors without Data Layout.
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000734 - Added support for quantization with multiplier greater than 1 on Arm® Neon™ and CL.
Giuseppe Rossinif04ddbc2020-02-17 17:22:49 +0000735 - Added support for quantized inputs of type QASYMM8_SIGNED and QASYMM8 to @ref CLQuantizationLayer.
736 - Added the ability to build bootcode for bare metal.
737 - Added support for generating synthetic QASYMM8 graphs.
738 - Added support for F16 datatype in VGG16.
739 - Removed pre-built binaries for GLES.
740
Michele Di Giorgiod374ff22020-01-21 10:03:20 +0000741v19.11.1 Public maintenance release
742 - Fix offset calculation in NEReductionOperationKernel.
743 - Fix data layout in NEScaleKernel for nhwc.
744 - Retain configuration step data layout to avoid side-effects.
745 - Perform sqrt in double domain for L2 pooling.
746 - Fix output shape calculation for Reduce Mean.
747 - Restrict cases where optimized NEPadLayer runs.
748
Michele Di Giorgioa046e162019-10-08 09:36:26 +0100749v19.11 Public major release
SiCong Lica1f98c2019-11-28 11:06:11 +0000750 - Various bug fixes.
751 - Various optimisations.
SiCong Li1f7f9882019-11-28 14:59:35 +0000752 - Updated recommended NDK version to r17c.
SiCong Lica1f98c2019-11-28 11:06:11 +0000753 - Deprecated OpenCL kernels / functions:
Michele Di Giorgioa046e162019-10-08 09:36:26 +0100754 - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel
755 - CLDepthwiseIm2ColKernel
SiCong Lica1f98c2019-11-28 11:06:11 +0000756 - CLDepthwiseSeparableConvolutionLayer
Michele Di Giorgioa046e162019-10-08 09:36:26 +0100757 - CLDepthwiseVectorToTensorKernel
758 - CLDirectConvolutionLayerOutputStageKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000759 - Deprecated Arm® Neon™ kernels / functions:
Giorgio Arenad93e2632019-10-15 11:09:33 +0100760 - NEDepthwiseWeightsReshapeKernel
761 - NEDepthwiseIm2ColKernel
SiCong Lica1f98c2019-11-28 11:06:11 +0000762 - NEDepthwiseSeparableConvolutionLayer
Giorgio Arenad93e2632019-10-15 11:09:33 +0100763 - NEDepthwiseVectorToTensorKernel
Manuel Bottini05069f02019-09-26 17:18:26 +0100764 - NEDepthwiseConvolutionLayer3x3
SiCong Lica1f98c2019-11-28 11:06:11 +0000765 - New OpenCL kernels / functions:
766 - @ref CLInstanceNormalizationLayerKernel / @ref CLInstanceNormalizationLayer
767 - @ref CLDepthwiseConvolutionLayerNativeKernel to replace the old generic depthwise convolution (see Deprecated
768 OpenCL kernels / functions)
769 - @ref CLLogSoftmaxLayer
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000770 - New Arm® Neon™ kernels / functions:
SiCong Lica1f98c2019-11-28 11:06:11 +0000771 - @ref NEBoundingBoxTransformKernel / @ref NEBoundingBoxTransform
Georgios Pinitas8c3c0e72020-12-03 20:11:53 +0000772 - @ref NEComputeAllAnchorsKernel / NEComputeAllAnchors
SiCong Lica1f98c2019-11-28 11:06:11 +0000773 - @ref NEDetectionPostProcessLayer
774 - @ref NEGenerateProposalsLayer
775 - @ref NEInstanceNormalizationLayerKernel / @ref NEInstanceNormalizationLayer
776 - @ref NELogSoftmaxLayer
777 - @ref NEROIAlignLayerKernel / @ref NEROIAlignLayer
778 - Added QASYMM8 support for:
779 - @ref CLGenerateProposalsLayer
780 - @ref CLROIAlignLayer
781 - @ref CPPBoxWithNonMaximaSuppressionLimit
782 - Added QASYMM16 support for:
783 - @ref CLBoundingBoxTransform
784 - Added FP16 support for:
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100785 - CLGEMMMatrixMultiplyReshapedKernel
SiCong Lica1f98c2019-11-28 11:06:11 +0000786 - Added new data type QASYMM8_PER_CHANNEL support for:
Manuel Bottini9e73c932021-03-02 17:40:42 +0000787 - CLDequantizationLayer
SiCong Lica1f98c2019-11-28 11:06:11 +0000788 - @ref NEDequantizationLayer
789 - Added new data type QSYMM8_PER_CHANNEL support for:
790 - @ref CLConvolutionLayer
791 - @ref NEConvolutionLayer
792 - @ref CLDepthwiseConvolutionLayer
793 - @ref NEDepthwiseConvolutionLayer
794 - Added FP16 mixed-precision support for:
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100795 - CLGEMMMatrixMultiplyReshapedKernel
Michele Di Giorgioe1314662021-02-01 17:09:32 +0000796 - CLPoolingLayerKernel
SiCong Lica1f98c2019-11-28 11:06:11 +0000797 - Added FP32 and FP16 ELU activation for:
798 - @ref CLActivationLayer
799 - @ref NEActivationLayer
800 - Added asymmetric padding support for:
801 - @ref CLDirectDeconvolutionLayer
802 - @ref CLGEMMDeconvolutionLayer
803 - @ref NEDeconvolutionLayer
804 - Added SYMMETRIC and REFLECT modes for @ref CLPadLayerKernel / @ref CLPadLayer.
Georgios Pinitas0f7ef8a2021-01-10 04:23:52 +0000805 - Replaced the calls to NECopyKernel and NEMemsetKernel with @ref NEPadLayer in @ref NEGenerateProposalsLayer.
806 - Replaced the calls to CLCopyKernel and CLMemsetKernel with @ref CLPadLayer in @ref CLGenerateProposalsLayer.
SiCong Lica1f98c2019-11-28 11:06:11 +0000807 - Improved performance for CL Inception V3 - FP16.
808 - Improved accuracy for CL Inception V3 - FP16 by enabling FP32 accumulator (mixed-precision).
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000809 - Improved Arm® Neon™ performance by enabling fusing batch normalization with convolution and depth-wise convolution layer.
810 - Improved Arm® Neon™ performance for MobileNet-SSD by improving the output detection performance.
SiCong Lica1f98c2019-11-28 11:06:11 +0000811 - Optimized @ref CLPadLayer.
812 - Optimized CL generic depthwise convolution layer by introducing @ref CLDepthwiseConvolutionLayerNativeKernel.
813 - Reduced memory consumption by implementing weights sharing.
Michele Di Giorgioa046e162019-10-08 09:36:26 +0100814
Michele Di Giorgiod374ff22020-01-21 10:03:20 +0000815v19.08.1 Public maintenance release
816 - Fix offset calculation in NEReductionOperationKernel.
817 - Fix data layout in NEScaleKernel for nhwc.
818 - Retain configuration step data layout to avoid side-effects.
819 - Perform sqrt in double domain for L2 pooling.
820 - Fix output shape calculation for Reduce Mean.
821 - Fix broadcast CLPixelwiseMultiplication with 5D tensors.
822
Georgios Pinitas3d13af82019-06-04 13:04:16 +0100823v19.08 Public major release
824 - Various bug fixes.
825 - Various optimisations.
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000826 - Deprecated Arm® Neon™ functions:
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100827 - NEDepthConcatenateLayer
828 - NEWidthConcatenateLayer
829 - Deprecated OpenCL kernels / functions:
830 - CLDepthConcatenateLayer
831 - CLGEMMInterleave4x4Kernel / CLGEMMInterleave4x4
832 - CLGEMMTranspose1xWKernel / CLGEMMTranspose1xW
833 - CLWidthConcatenateLayer
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000834 - New Arm® Neon™ kernels / functions:
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100835 - @ref NEAbsLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100836 - @ref NECast
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100837 - @ref NEElementwisePower
838 - @ref NELogLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100839 - @ref NELSTMLayerQuantized
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100840 - @ref NENegLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100841 - @ref NEPReluLayer
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100842 - @ref NESinLayer
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000843 - NEBatchConcatenateLayerKernel
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100844 - @ref NEDepthToSpaceLayerKernel / @ref NEDepthToSpaceLayer
Michalis Spyrou60c3b0e2021-04-08 12:02:58 +0100845 - NEDepthwiseConvolutionLayerNativeKernel
Manuel Bottiniae58bdf2021-06-17 17:18:45 +0100846 - NEGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100847 - @ref NEMeanStdDevNormalizationKernel / @ref NEMeanStdDevNormalizationLayer
848 - @ref NESpaceToDepthLayerKernel / @ref NESpaceToDepthLayer
849 - New OpenCL kernels / functions:
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100850 - @ref CLAbsLayer
851 - @ref CLElementwisePower
852 - @ref CLLogLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100853 - @ref CLLSTMLayerQuantized
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100854 - @ref CLNegLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100855 - @ref CLPReluLayer
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100856 - @ref CLSinLayer
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000857 - CLBatchConcatenateLayerKernel
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100858 - @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100859 - CLGEMMLowpMatrixMultiplyNativeKernel
Michele Di Giorgioba14c922020-10-12 13:27:57 +0100860 - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100861 - CLGEMMMatrixMultiplyNativeKernel
Michalis Spyrou473cb012021-02-23 11:48:12 +0000862 - CLMeanStdDevNormalizationKernel / CLMeanStdDevNormalizationLayer
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100863 - @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer
864 - New examples:
865 - neon_opticalflow
866 - cl_cache
867 - neon_permute
Gian Marco Iodicec5f48ad2019-09-02 09:52:12 +0100868 - Added support for FP16 in @ref NEDeconvolutionLayer
869 - Added support for FP16 in @ref CLDeconvolutionLayer
870 - Added support for REDUCE_MIN and REDUCE_MAX in @ref ReductionOperation
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100871 - Enable the fusion of batch normalization with convolution and depthwise convolution layer for FP32 in the graph API (OpenCL only)
872 - Added support for fusing activation function and broadcast addition with the matrix multiplication for FP32 (OpenCL only)
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000873 - Re-factored the depthwise convolution layer kernel on Arm® Neon™ for generic cases
Jakub Sujakee301b32021-06-04 09:46:08 +0100874 - Added an optimized depthwise convolution layer kernel for 5x5 filters (Neon™ only)
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100875 - Added support to enable OpenCL kernel cache. Added example showing how to load the prebuilt OpenCL kernels from a binary cache file
876 - Altered @ref QuantizationInfo interface to support per-channel quantization.
Manuel Bottini387259a2020-05-21 17:14:36 +0100877 - The CLDepthwiseConvolutionLayer3x3 will be included by @ref CLDepthwiseConvolutionLayer to accommodate future optimizations.
878 - The NEDepthwiseConvolutionLayerOptimized will be included by @ref NEDepthwiseConvolutionLayer to accommodate future optimizations.
Gian Marco Iodicecc2f54b2019-08-22 10:10:52 +0100879 - Removed inner_border_right and inner_border_top parameters from @ref CLDeconvolutionLayer interface
880 - Removed inner_border_right and inner_border_top parameters from @ref NEDeconvolutionLayer interface
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000881 - Optimized the Arm® Neon™ assembly kernel for GEMMLowp. The new implementation fuses the output stage and quantization with the matrix multiplication kernel
Georgios Pinitas3d13af82019-06-04 13:04:16 +0100882
Michalis Spyroua9c44722019-04-05 17:18:36 +0100883v19.05 Public major release
Michalis Spyrouc6608ac2019-05-16 17:40:23 +0100884 - Various bug fixes.
885 - Various optimisations.
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000886 - New Arm® Neon™ kernels / functions:
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100887 - @ref NEBatchToSpaceLayerKernel / @ref NEBatchToSpaceLayer
Sheri Zhang1e3ab422021-03-16 17:35:08 +0000888 - NEComplexPixelWiseMultiplicationKernel / @ref NEComplexPixelWiseMultiplication
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100889 - @ref NECropKernel / @ref NECropResize
Michalis Spyrou60c3b0e2021-04-08 12:02:58 +0100890 - NEDepthwiseConvolutionAssemblyDispatch
Michalis Spyrouca82e622019-05-10 16:43:20 +0100891 - @ref NEFFTDigitReverseKernel
892 - @ref NEFFTRadixStageKernel
893 - @ref NEFFTScaleKernel
Manuel Bottinicfac51c2021-06-18 15:47:28 +0100894 - NEGEMMLowpOffsetContributionOutputStageKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000895 - NEHeightConcatenateLayerKernel
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100896 - @ref NESpaceToBatchLayerKernel / @ref NESpaceToBatchLayer
Michalis Spyroud7dd15c2019-05-30 14:53:58 +0100897 - @ref NEFFT1D
898 - @ref NEFFT2D
899 - @ref NEFFTConvolutionLayer
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100900 - New OpenCL kernels / functions:
Sheri Zhangf9ab9f92021-03-16 12:09:15 +0000901 - CLComplexPixelWiseMultiplicationKernel / @ref CLComplexPixelWiseMultiplication
Sheri Zhang7e20e292021-02-02 11:49:34 +0000902 - CLCropKernel / @ref CLCropResize
Michalis Spyroud7dd15c2019-05-30 14:53:58 +0100903 - @ref CLDeconvolutionReshapeOutputKernel
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100904 - @ref CLFFTDigitReverseKernel
905 - @ref CLFFTRadixStageKernel
906 - @ref CLFFTScaleKernel
Georgios Pinitas4a578b92021-06-25 12:13:49 +0100907 - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100908 - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000909 - CLHeightConcatenateLayerKernel
Georgios Pinitasf790fdb2019-04-24 12:41:25 +0100910 - @ref CLDirectDeconvolutionLayer
911 - @ref CLFFT1D
912 - @ref CLFFT2D
913 - @ref CLFFTConvolutionLayer
Michalis Spyrouca82e622019-05-10 16:43:20 +0100914 - @ref CLGEMMDeconvolutionLayer
915 - New OpenGLES kernels / functions:
Manuel Bottiniceaa0bf2021-02-16 15:15:19 +0000916 - GCConcatenateLayer
Michalis Spyroua9c44722019-04-05 17:18:36 +0100917 - Deprecated functions / interfaces:
Georgios Pinitas09f24972019-05-17 18:14:40 +0100918 - GCDepthConcatenateLayer
919 - NEWidthConcatenateLayer
920 - NEDepthConcatenateLayer
921 - CLWidthConcatenateLayer
922 - CLDepthConcatenateLayer
Gian Marco Iodice5fc07aa2019-05-15 17:08:02 +0100923 - CLGEMMInterleave4x4
924 - CLGEMMTranspose1xW
Michalis Spyrouc6608ac2019-05-16 17:40:23 +0100925 - Support different quantization info in CLConcatLayer.
926 - Add checks for cases where different input/output quantization info is not supported.
927 - Tensors have different quantization information.
928 - Add FP16 support checks.
929 - Fix output quantization in CLDepthwiseConv3x3 when activation is fused.
930 - New graph examples:
931 - graph_convolution
932 - graph_fully_connected
933 - graph_depthwise_convolution
934 - Deepspeech v0.4.1
935 - Add support for QASYMM8 in NEArithmeticSubtractionKernel.
936 - Add support for QASYMM8 in NEPixelWiseMultiplicationKernel.
937 - Add support for QASYMM8 NEDeconvolution.
Sheri Zhangac6499a2021-02-10 15:32:38 +0000938 - Add support for DequantizationLayer for Neon/CL.
Michalis Spyrouc6608ac2019-05-16 17:40:23 +0100939 - Add support for dilation in CLDepthwiseConvolution.
940 - Fuse offset contribution with the output stage when we use NEGEMMLowpMatrixMultiplyCore.
941 - Optimize CLDeconvolution.
942 - Add StackLayer to the graph API.
943 - Add support for "reflect" padding mode in NEPad.
944 - Winograd 7x7 NHWC on OpenCL.
945 - Rework CL ML layers to run exclusively on CL.
946 - Support different quantization info in PoolingLayer.
947 - Implement and test import memory interfaces.
948 - Added new tests and removed old ones.
949 - Various clang-tidy fixes.
Michalis Spyroua9c44722019-04-05 17:18:36 +0100950
giuros01a69a88b2019-01-31 16:29:19 +0000951v19.02 Public major release
Isabella Gottardi62538972019-02-12 19:52:44 +0000952 - Various bug fixes.
953 - Various optimisations.
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000954 - New Arm® Neon™ kernels / functions:
Isabella Gottardi62538972019-02-12 19:52:44 +0000955 - @ref NETileKernel / @ref NETile
956 - @ref NEFuseBatchNormalizationKernel / @ref NEFuseBatchNormalization
Sang-Hoon Park63001ac2021-01-18 14:20:27 +0000957 - NEElementwiseOperationKernel
Isabella Gottardi62538972019-02-12 19:52:44 +0000958 - @ref NEElementwiseMax
959 - @ref NEElementwiseMin
960 - @ref NEElementwiseSquaredDiff
961 - @ref NESelectKernel / @ref NESelect
962 - @ref NESplit
963 - @ref NESlice
964 - @ref NEUnstack
965 - @ref NEStridedSliceKernel / @ref NEStridedSlice
Sang-Hoon Park7249f152021-01-22 11:55:03 +0000966 - NEElementwiseUnaryKernel
Isabella Gottardi62538972019-02-12 19:52:44 +0000967 - @ref NERsqrtLayer
968 - @ref NEExpLayer
969 - @ref NEReverseKernel / @ref NEReverse
970 - @ref NEArgMinMaxLayer
971 - @ref NEStackLayerKernel / @ref NEStackLayer
972 - @ref NERangeKernel / @ref NERange
973 - @ref NEPadLayer
Georgios Pinitas0f7ef8a2021-01-10 04:23:52 +0000974 - NEMemsetKernel
Isabella Gottardi62538972019-02-12 19:52:44 +0000975 - @ref NEGatherKernel / @ref NEGather
976 - @ref NEElementwiseComparison
977 - @ref NEElementwiseComparisonStatic
Sang-Hoon Park63001ac2021-01-18 14:20:27 +0000978 - NEComparisonOperationKernel
Isabella Gottardi62538972019-02-12 19:52:44 +0000979 - @ref NEElementwiseDivision
980 - New OpenCL kernels / functions:
981 - @ref CLSelectKernel / @ref CLSelect
982 - @ref CLTileKernel / @ref CLTile
983 - @ref CLComparisonKernel / @ref CLComparison
984 - @ref CLArgMinMaxLayer
985 - @ref CLElementwiseMax
986 - @ref CLElementwiseMin
987 - @ref CLElementwiseSquaredDiff
988 - @ref CLStackLayerKernel / @ref CLStackLayer
989 - @ref CLReverse / @ref CLReverseKernel
990 - @ref CLRsqrtLayer
991 - @ref CLExpLayer
Michele Di Giorgioc9c89052021-01-26 10:20:17 +0000992 - CLElementWiseUnaryLayerKernel
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100993 - CLGEMMReshapeLHSMatrixKernel
994 - CLGEMMReshapeRHSMatrixKernel
995 - CLGEMMMatrixMultiplyReshapedKernel
Isabella Gottardi62538972019-02-12 19:52:44 +0000996 - @ref CLRangeKernel / @ref CLRange
997 - @ref CLUnstack
998 - @ref CLGatherKernel / @ref CLGather
Georgios Pinitas4a578b92021-06-25 12:13:49 +0100999 - CLGEMMLowpMatrixMultiplyReshapedKernel
Isabella Gottardi62538972019-02-12 19:52:44 +00001000 - New CPP kernels / functions:
1001 - @ref CPPDetectionOutputLayer
1002 - @ref CPPTopKV / @ref CPPTopKVKernel
Isabella Gottardi62538972019-02-12 19:52:44 +00001003 - Added new examples:
1004 - graph_ssd_mobilenet.cpp
1005 - graph_mobilenet_v2.cpp
1006 - graph_resnet12.cpp
1007 - graph_srcnn955.cpp
1008 - graph_vgg_vdsr.cpp
1009 - graph_inception_resnet_v1.cpp
1010 - Added 4D tensor support to:
1011 - @ref NESoftmaxLayer
1012 - Fused activation in @ref CLWinogradConvolutionLayer
Jakub Sujakee301b32021-06-04 09:46:08 +01001013 - Extended @ref NEPermute to support more cases
1014 - Added Neon™/SVE GEMM Hybrid kernels
Isabella Gottardi62538972019-02-12 19:52:44 +00001015 - Added u8 and s8 hybrid assembly kernels
1016 - Introduced GEMM strategy name in NEGEMMAssemblyWrapper
1017 - Improved @ref CLTuner
1018 - Fused the bias addition within @ref CLGEMM
1019 - Added support for QASYMM8 LOGISTIC activation in @ref NEActivationLayer
1020 - Added NHWC data layout support to:
1021 - @ref NEScale for F16
1022 - @ref CLNormalizationLayer IN_MAP_2D for FP32/FP16
1023 - @ref NEL2NormalizeLayer for FP32/FP16
1024 - @ref NENormalizationLayer IN_MAP_2D for FP32/FP16
1025 - @ref CLROIAlignLayer
Manuel Bottini5209be52019-02-13 16:34:56 +00001026 - @ref CLGenerateProposalsLayer
Isabella Gottardi62538972019-02-12 19:52:44 +00001027 - Added QASYMM8 support to the following kernels:
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +00001028 - NEArithmeticAdditionKernel
Isabella Gottardi62538972019-02-12 19:52:44 +00001029 - @ref NEScale
1030 - Added new tests and improved validation and benchmarking suites.
giuros01a69a88b2019-01-31 16:29:19 +00001031 - Deprecated functions / interfaces:
1032 - Usage of inner_border_right and inner_border_top has been deprecated in @ref CLDeconvolutionLayer and @ref NEDeconvolutionLayer
1033
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001034v18.11 Public major release
1035 - Various bug fixes.
1036 - Various optimisations.
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +00001037 - New Arm® Neon™ kernels / functions:
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001038 - @ref NEChannelShuffleLayer / @ref NEChannelShuffleLayerKernel
1039 - @ref NEReduceMean
1040 - @ref NEReorgLayer / @ref NEReorgLayerKernel
1041 - @ref NEPriorBoxLayer / @ref NEPriorBoxLayerKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +00001042 - NEUpsampleLayer / NEUpsampleLayerKernel
Georgios Pinitas0b1c2db2020-12-04 15:51:34 +00001043 - NEYOLOLayer / NEYOLOLayerKernel
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001044 - New OpenCL kernels / functions:
1045 - @ref CLBatchToSpaceLayer / @ref CLBatchToSpaceLayerKernel
1046 - @ref CLBoundingBoxTransform / @ref CLBoundingBoxTransformKernel
Manuel Bottini5209be52019-02-13 16:34:56 +00001047 - @ref CLComputeAllAnchorsKernel
1048 - @ref CLGenerateProposalsLayer
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001049 - @ref CLNormalizePlanarYUVLayer / @ref CLNormalizePlanarYUVLayerKernel
1050 - @ref CLReorgLayer / @ref CLReorgLayerKernel
1051 - @ref CLSpaceToBatchLayer / @ref CLSpaceToBatchLayerKernel
1052 - @ref CLPadLayer
1053 - @ref CLReduceMean
1054 - @ref CLPriorBoxLayer / @ref CLPriorBoxLayerKernel
1055 - @ref CLROIAlignLayer / @ref CLROIAlignLayerKernel
1056 - @ref CLSlice
1057 - @ref CLSplit
1058 - @ref CLStridedSlice / @ref CLStridedSliceKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +00001059 - CLUpsampleLayer / CLUpsampleLayerKernel
Georgios Pinitas0b1c2db2020-12-04 15:51:34 +00001060 - CLYOLOLayer / CLYOLOLayerKernel
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001061 - New CPP kernels / functions:
1062 - @ref CPPBoxWithNonMaximaSuppressionLimit / @ref CPPBoxWithNonMaximaSuppressionLimitKernel
1063 - Added the validate method in:
1064 - @ref NEDepthConvertLayer
1065 - @ref NEFloor / @ref CLFloor
Michele Di Giorgio93b75e02021-06-21 12:00:43 +01001066 - NEGEMMMatrixAdditionKernel
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001067 - @ref NEReshapeLayer / @ref CLReshapeLayer
1068 - @ref CLScale
1069 - Added new examples:
1070 - graph_shufflenet.cpp
1071 - graph_yolov3.cpp
1072 - Added documentation on how to add a new function or kernel.
1073 - Improved doxygen documentation by adding a list of the existing functions.
1074 - Added 4D tensor support to:
Georgios Pinitas09f24972019-05-17 18:14:40 +01001075 - CLWidthConcatenateLayer
Georgios Pinitase2696b12020-12-03 20:37:43 +00001076 - CLFlattenLayer
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001077 - @ref CLSoftmaxLayer
Gian Marco Iodice8155c022021-04-16 15:08:59 +01001078 - Add dot product support for CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001079 - Add SVE support
1080 - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization
Gian Marco Iodice8155c022021-04-16 15:08:59 +01001081 - Fused activation in CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001082 - Added NHWC data layout support to:
1083 - @ref CLChannelShuffleLayer
1084 - @ref CLDeconvolutionLayer
1085 - @ref CLL2NormalizeLayer
1086 - Added QASYMM8 support to the following kernels:
Manuel Bottini3b131ab2021-02-19 18:16:44 +00001087 - CLScaleKernel
Georgios Pinitas7d0adc62020-09-04 15:25:24 +01001088 - NEDepthwiseConvolutionLayer3x3Kernel
Sheri Zhangf9ab9f92021-03-16 12:09:15 +00001089 - CLPixelWiseMultiplicationKernel
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001090 - Added FP16 support to the following kernels:
Gian Marco Iodice8155c022021-04-16 15:08:59 +01001091 - CLDepthwiseConvolutionLayer3x3NHWCKernel
Georgios Pinitas7d0adc62020-09-04 15:25:24 +01001092 - NEDepthwiseConvolutionLayer3x3Kernel
Isabella Gottardi8773d7c2018-11-20 09:56:46 +00001093 - @ref CLNormalizePlanarYUVLayerKernel
1094 - @ref CLWinogradConvolutionLayer (5x5 kernel)
1095 - More tests added to both validation and benchmarking suites.
1096
Anthony Barbierd51ea0a2018-08-07 17:48:03 +01001097v18.08 Public major release
1098 - Various bug fixes.
Michele Di Giorgio02baf012018-08-20 18:10:38 +01001099 - Various optimisations.
Anthony Barbierd51ea0a2018-08-07 17:48:03 +01001100 - Updated recommended NDK version to r17b.
Michele Di Giorgio02baf012018-08-20 18:10:38 +01001101 - Removed support for QS8/QS16 data types.
1102 - Added support for grouped convolution in @ref CLConvolutionLayer.
1103 - Added NHWC data layout support to:
Georgios Pinitas09f24972019-05-17 18:14:40 +01001104 - NEDepthConcatenateLayer / CLDepthConcatenateLayer
Michele Di Giorgio02baf012018-08-20 18:10:38 +01001105 - @ref NEWinogradConvolutionLayer / @ref CLWinogradConvolutionLayer
1106 - @ref CLDepthwiseConvolutionLayer
1107 - @ref CLDirectConvolutionLayer
1108 - @ref CLConvolutionLayer
1109 - @ref CLScale
Manuel Bottinid844c082021-07-14 12:58:54 +01001110 - CLIm2ColKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +00001111 - New Arm® Neon™ kernels / functions:
Michele Di Giorgio02baf012018-08-20 18:10:38 +01001112 - @ref NERNNLayer
1113 - New OpenCL kernels / functions:
1114 - @ref CLArithmeticDivision
1115 - Introduced prepare() stage support in the graph API for GLES.
1116 - Added support for memory reuse when trying to allocate smaller CLTensors.
1117 - Enabled NHWC execution on graph examples.
1118 - Added JPEG accessor for validation purposes.
1119 - Added validate methods to some kernels / functions.
Anthony Barbierd51ea0a2018-08-07 17:48:03 +01001120
1121v18.05 Public major release
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001122 - Various bug fixes.
1123 - Various optimisations.
Jakub Sujakee301b32021-06-04 09:46:08 +01001124 - Major redesign in the interface for the Neon™ kernels implemented in assembly.
Pablo Telloeb82fd22018-02-23 13:43:50 +00001125 - Removed arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore / arm_compute::NEHGEMMAArch64FP16Kernel
Jakub Sujakee301b32021-06-04 09:46:08 +01001126 - Added NEGEMMAssemblyWrapper and AssemblyKernelGlue which are used to execute assembly kernels in Neon™ functions.
Pablo Telloeb82fd22018-02-23 13:43:50 +00001127 - Minor changes to the CPUInfo type to make it compatible with the new assembly gemm interface.
Jakub Sujakee301b32021-06-04 09:46:08 +01001128 - Moved Neon™ assembly kernels to the folder src/core/Neon/kernels/arm_gemm.
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001129 - Improved doxygen documentation.
1130 - Improved memory management for layer transitions.
1131 - Added support for NHWC data layout in tensors.
1132 - Added NHWC data layout support to:
1133 - @ref NEGEMMConvolutionLayer
1134 - @ref NEDirectConvolutionLayer
1135 - @ref NEPoolingLayer / @ref CLPoolingLayer
1136 - @ref NEBatchNormalizationLayer / @ref CLBatchNormalizationLayer
1137 - @ref NEDepthwiseConvolutionLayer
1138 - @ref NEScale
Georgios Pinitasf7c5a412020-12-03 14:38:33 +00001139 - NEIm2Col
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001140 - Added support for dilated convolutions in @ref NEConvolutionLayer and @ref CLConvolutionLayer.
1141 - New OpenCL kernels / functions:
1142 - @ref CLChannelShuffleLayer / @ref CLChannelShuffleLayerKernel
Teresa Charlin91b7f742021-04-12 13:57:00 +01001143 - CLConvertFullyConnectedWeightsKernel / @ref CLConvertFullyConnectedWeights
Sheri Zhang7e20e292021-02-02 11:49:34 +00001144 - @ref CLCopy / CLCopyKernel
Anthony Barbier38e7f1f2018-05-21 13:37:47 +01001145 - @ref CLLSTMLayer
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001146 - @ref CLRNNLayer
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +00001147 - CLWidthConcatenateLayer / CLWidthConcatenateLayerKernel
Manuel Bottinic6f4ec32021-05-18 18:41:56 +01001148 - CLWinogradFilterTransformKernel / @ref CLWinogradConvolutionLayer
1149 - CLWinogradInputTransformKernel / CLWinogradInputTransform
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +00001150 - New Arm® Neon™ kernels / functions:
Teresa Charlin562bee52021-04-13 17:44:15 +01001151 - NEConvertFullyConnectedWeightsKernel / @ref NEConvertFullyConnectedWeights.
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001152 - Created the validate method in @ref CLDepthwiseConvolutionLayer.
1153 - Beta and gamma are no longer mandatory arguments in @ref NEBatchNormalizationLayer and @ref CLBatchNormalizationLayer.
1154 - Added depth multiplier support in @ref NEDepthwiseConvolutionLayer and @ref CLDepthwiseConvolutionLayer.
Sheri Zhang1e3ab422021-03-16 17:35:08 +00001155 - Added broadcast multiply support in @ref NEPixelWiseMultiplication / NEPixelWiseMultiplicationKernel.
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001156 - Port mobilenet example to NHWC data layout.
1157 - Enabled Winograd method in @ref CLConvolutionLayer.
1158 - Renamed NEWinogradLayer to @ref NEWinogradConvolutionLayer.
Sheri Zhangac6499a2021-02-10 15:32:38 +00001159 - Updated @ref NEWinogradConvolutionLayer to use highly optimised assembly kernels in src/core/Neon/kernels/arm_gemm.
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001160 - Added memory manager support in GLES functions.
1161 - Major refactoring of the graph API.
1162 - Added GLES backend in the graph API.
1163 - Added support for the memory manager in the graph API.
1164 - Enabled Winograd Convolution method in the graph API.
1165 - Added support for grouped convolutions in the graph API.
Manuel Bottini10b38262021-02-19 18:16:44 +00001166 - Replaced NEDeconvolutionLayerUpsampleKernel with NEScaleKernel in @ref NEDeconvolutionLayer.
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001167 - Added fast maths flag in @ref CLConvolutionLayer.
1168 - Added new tests and benchmarks in validation and benchmark frameworks
Jakub Sujakee301b32021-06-04 09:46:08 +01001169 - Merge Activation layer with Convolution Layer (Neon™, CL, GLES)
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001170 - Added support to OpenCL 2.0 SVM
1171 - Added support to import memory in OpenCL tensors.
1172 - Added the prepare() method to perform any one off pre-processing before running the function.
1173 - Added new examples:
1174 - graph_inception_v4.cpp
Anthony Barbier38e7f1f2018-05-21 13:37:47 +01001175 - graph_resnext50.cpp
Pablo Tellob5cc95b2018-05-15 11:49:33 +01001176 - Added memory measurement instrument for CL.
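
The prepare() entry above describes a one-off pre-processing step performed before a function runs. A minimal sketch of that pattern follows. `ToyScaleFunction`, its members and the weight-scaling step are hypothetical illustrations, not the library's classes or its actual implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy function object, NOT an arm_compute class.
// It only illustrates the "prepare once, run many times" pattern.
class ToyScaleFunction
{
public:
    ToyScaleFunction(std::vector<float> weights, float scale)
        : _weights(std::move(weights)), _scale(scale)
    {
    }

    // One-off pre-processing: fold the scale into the weights.
    // Any later call is a no-op.
    void prepare()
    {
        if(_is_prepared)
        {
            return;
        }
        for(float &w : _weights)
        {
            w *= _scale;
        }
        ++_prepare_count;
        _is_prepared = true;
    }

    // Computes a dot product; triggers prepare() lazily on first use.
    float run(const std::vector<float> &input)
    {
        assert(input.size() == _weights.size());
        prepare();
        float acc = 0.f;
        for(std::size_t i = 0; i < input.size(); ++i)
        {
            acc += input[i] * _weights[i];
        }
        return acc;
    }

    // Number of times the pre-processing actually executed.
    int prepare_count() const
    {
        return _prepare_count;
    }

private:
    std::vector<float> _weights;
    float              _scale;
    bool               _is_prepared{ false };
    int                _prepare_count{ 0 };
};
```

In this sketch the caller may invoke prepare() explicitly to move the cost out of the first run(), or simply call run() and let the guard handle it.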

v18.03 Public maintenance release
 - Various bug fixes.
 - Fixed bug in @ref NEActivationLayer
 - Fix in @ref CLTuner when using batches.
 - Updated recommended NDK version to r16b (and fixed warnings).
 - Fixed bug in validation code.
 - Added Inception v4 graph example.
 - Renamed NEWinogradLayer.cpp to @ref NEWinogradConvolutionLayer

v18.02 Public major release
 - Various Arm® Neon™ / OpenCL / GLES optimisations.
 - Various bug fixes.
 - Changed default number of threads on big.LITTLE systems.
 - Refactored examples and added:
    - graph_mobilenet_qasymm8
    - graph_resnet
    - graph_squeezenet_v1_1
 - Renamed @ref CLConvolutionLayer to @ref CLGEMMConvolutionLayer and created a new @ref CLConvolutionLayer to select the fastest convolution method.
 - Renamed @ref NEConvolutionLayer to @ref NEGEMMConvolutionLayer and created a new @ref NEConvolutionLayer to select the fastest convolution method.
 - Added in-place support to:
    - @ref CLActivationLayer
    - @ref CLBatchNormalizationLayer
 - Added QASYMM8 support to:
    - @ref CLActivationLayer
    - @ref CLDepthwiseConvolutionLayer
    - @ref NEDepthwiseConvolutionLayer
    - @ref NESoftmaxLayer
 - Added FP16 support to:
    - CLDepthwiseConvolutionLayer3x3
    - @ref CLDepthwiseConvolutionLayer
 - Added broadcasting support to NEArithmeticAddition / @ref CLArithmeticAddition / @ref CLPixelWiseMultiplication
 - Added fused batch normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer
 - Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer
 - New OpenCL kernels / functions:
    - CLDirectConvolutionLayerOutputStageKernel
 - New Arm® Neon™ kernels / functions:
    - Added name() method to all kernels.
    - Added support for Winograd 5x5.
    - NEPermuteKernel / @ref NEPermute
    - CpuWinogradConv2dTransformInputKernel / NEWinogradLayer
    - CpuWinogradConv2dTransformOutputKernel / NEWinogradLayer
    - CpuWinogradConv2dTransformWeightsKernel / NEWinogradLayer
    - Renamed NEWinogradLayerKernel to NEWinogradLayerBatchedGEMMKernel
 - New GLES kernels / functions:
    - GCTensorShiftKernel / GCTensorShift

v18.01 Public maintenance release
 - Various bug fixes
 - Added some of the missing validate() methods
 - Added @ref CLDeconvolutionLayerUpsampleKernel / @ref CLDeconvolutionLayer, @ref CLDeconvolutionLayerUpsample
 - Added CLPermuteKernel / @ref CLPermute
 - Added method to clear the program cache in the CL Kernel library.
 - Added GCArithmeticAdditionKernel / GCArithmeticAddition
 - Added GCDepthwiseConvolutionLayer3x3Kernel / GCDepthwiseConvolutionLayer3x3
 - Added GCNormalizePlanarYUVLayerKernel / GCNormalizePlanarYUVLayer
 - Added GCScaleKernel / GCScale
 - Added GCWeightsReshapeKernel / GCConvolutionLayer
 - Added FP16 support to the following GLES compute kernels:
    - GCCol2ImKernel
    - GCGEMMInterleave4x4Kernel
    - GCGEMMTranspose1xWKernel
    - GCIm2ColKernel
 - Refactored Arm® Neon™ Winograd (NEWinogradLayerKernel)
 - Added NEDirectConvolutionLayerOutputStageKernel
 - Added QASYMM8 support to the following Arm® Neon™ kernels:
    - NEDepthwiseConvolutionLayer3x3Kernel
    - @ref NEFillBorderKernel
    - NEPoolingLayerKernel
 - Added new examples:
    - graph_cl_mobilenet_qasymm8.cpp
    - graph_inception_v3.cpp
    - gc_dc.cpp
 - More tests added to both validation and benchmarking suites.

v17.12 Public major release
 - Most machine learning functions on OpenCL support the new data type QASYMM8
 - Introduced logging interface
 - Introduced OpenCL timer
 - Reworked GEMMLowp interface
 - Added new Arm® Neon™ assembly kernels for GEMMLowp, SGEMM and HGEMM
 - Added validation method for most Machine Learning kernels / functions
 - Added new graph examples such as googlenet, mobilenet, squeezenet, vgg16 and vgg19
 - Added sgemm example for OpenCL
 - Added absolute difference example for GLES compute
 - Added new tests and benchmarks in validation and benchmark frameworks
 - Added new kernels / functions for GLES compute

 - New OpenGL ES kernels / functions
    - GCAbsoluteDifferenceKernel / GCAbsoluteDifference
    - GCActivationLayerKernel / GCActivationLayer
    - GCBatchNormalizationLayerKernel / GCBatchNormalizationLayer
    - GCCol2ImKernel
    - GCDepthConcatenateLayerKernel / GCDepthConcatenateLayer
    - GCDirectConvolutionLayerKernel / GCDirectConvolutionLayer
    - GCDropoutLayerKernel / GCDropoutLayer
    - GCFillBorderKernel / GCFillBorder
    - GCGEMMInterleave4x4Kernel / GCGEMMInterleave4x4
    - GCGEMMMatrixAccumulateBiasesKernel / GCGEMMMatrixAdditionKernel / GCGEMMMatrixMultiplyKernel / GCGEMM
    - GCGEMMTranspose1xWKernel / GCGEMMTranspose1xW
    - GCIm2ColKernel
    - GCNormalizationLayerKernel / GCNormalizationLayer
    - GCPixelWiseMultiplicationKernel / GCPixelWiseMultiplication
    - GCPoolingLayerKernel / GCPoolingLayer
    - GCLogits1DMaxKernel / GCLogits1DShiftExpSumKernel / GCLogits1DNormKernel / GCSoftmaxLayer
    - GCTransposeKernel / GCTranspose

 - New Arm® Neon™ kernels / functions
    - arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore
    - arm_compute::NEHGEMMAArch64FP16Kernel
    - NEDepthwiseConvolutionLayer3x3Kernel / NEDepthwiseIm2ColKernel / NEGEMMMatrixVectorMultiplyKernel / NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
    - NEGEMMLowpOffsetContributionKernel / NEGEMMLowpMatrixAReductionKernel / NEGEMMLowpMatrixBReductionKernel / NEGEMMLowpMatrixMultiplyCore
    - NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
    - NEWinogradLayer / NEWinogradLayerKernel

 - New OpenCL kernels / functions
    - CLGEMMLowpOffsetContributionKernel / CLGEMMLowpMatrixAReductionKernel / CLGEMMLowpMatrixBReductionKernel / CLGEMMLowpMatrixMultiplyCore
    - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint

 - New graph nodes for Arm® Neon™ and OpenCL
    - graph::BranchLayer
    - graph::DepthConvertLayer
    - graph::DepthwiseConvolutionLayer
    - graph::DequantizationLayer
    - graph::FlattenLayer
    - graph::QuantizationLayer
    - graph::ReshapeLayer

v17.10 Public maintenance release
 - Bug fixes:
    - Check the maximum local workgroup size supported by OpenCL devices
    - Minor documentation updates (fixed instructions to build the examples)
    - Introduced a graph::GraphContext
    - Added a few new Graph nodes, support for branches and grouping.
    - Automatically enable cl_printf in debug builds
    - Fixed bare metal builds for armv7a
    - Added AlexNet and cartoon effect examples
    - Fixed library builds: libraries are no longer built as supersets of each other. (This means applications using the runtime part of the library now need to link against both libarm_compute_core and libarm_compute.)

v17.09 Public major release
 - Experimental Graph support: initial implementation of a simple stream API to easily chain machine learning layers.
 - Memory Manager (@ref BlobLifetimeManager, @ref BlobMemoryPool, @ref ILifetimeManager, @ref IMemoryGroup, @ref IMemoryManager, @ref IMemoryPool, @ref IPoolManager, @ref MemoryManagerOnDemand, @ref PoolManager)
 - New validation and benchmark frameworks (Boost and Google frameworks replaced by a homemade framework).
 - Most machine learning functions support both fixed point 8 and 16 bit (QS8, QS16) for both Arm® Neon™ and OpenCL.
 - New Arm® Neon™ kernels / functions:
    - arm_compute::NEGEMMAssemblyBaseKernel, arm_compute::NEGEMMAArch64Kernel
    - NEDequantizationLayerKernel / @ref NEDequantizationLayer
    - NEFloorKernel / @ref NEFloor
    - @ref NEL2NormalizeLayerKernel / @ref NEL2NormalizeLayer
    - NEQuantizationLayerKernel, NEMinMaxLayerKernel / @ref NEQuantizationLayer
    - @ref NEROIPoolingLayerKernel / @ref NEROIPoolingLayer
    - @ref NEReductionOperationKernel / @ref NEReductionOperation
    - NEReshapeLayerKernel / @ref NEReshapeLayer

 - New OpenCL kernels / functions:
    - CLDepthwiseConvolutionLayer3x3NCHWKernel, CLDepthwiseConvolutionLayer3x3NHWCKernel, CLDepthwiseIm2ColKernel, CLDepthwiseVectorToTensorKernel, CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3, @ref CLDepthwiseConvolutionLayer, CLDepthwiseSeparableConvolutionLayer
    - CLDequantizationLayerKernel / CLDequantizationLayer
    - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
    - CLFlattenLayer
    - CLFloorKernel / @ref CLFloor
    - CLGEMMTranspose1xW
    - CLGEMMMatrixVectorMultiplyKernel
    - @ref CLL2NormalizeLayerKernel / @ref CLL2NormalizeLayer
    - CLQuantizationLayerKernel, CLMinMaxLayerKernel / @ref CLQuantizationLayer
    - @ref CLROIPoolingLayerKernel / @ref CLROIPoolingLayer
    - @ref CLReductionOperationKernel / @ref CLReductionOperation
    - CLReshapeLayerKernel / @ref CLReshapeLayer

v17.06 Public major release
 - Various bug fixes
 - Added support for fixed point 8 bit (QS8) to the various Arm® Neon™ machine learning kernels.
 - Added unit tests and benchmarks (AlexNet, LeNet)
 - Added support for sub tensors.
 - Added infrastructure to provide GPU specific optimisation for some OpenCL kernels.
 - Added @ref OMPScheduler (OpenMP) scheduler for Neon
 - Added @ref SingleThreadScheduler scheduler for Arm® Neon™ (for bare metal)
 - Users can specify their own scheduler by implementing the @ref IScheduler interface.
 - New OpenCL kernels / functions:
    - @ref CLBatchNormalizationLayerKernel / @ref CLBatchNormalizationLayer
    - CLDepthConcatenateLayerKernel / CLDepthConcatenateLayer
    - CLHOGOrientationBinningKernel, CLHOGBlockNormalizationKernel, CLHOGDetectorKernel / CLHOGDescriptor, CLHOGDetector, CLHOGGradient, CLHOGMultiDetection
    - CLLocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedLayer
    - CLWeightsReshapeKernel / CLConvolutionLayerReshapeWeights
 - New C++ kernels:
    - CPPDetectionWindowNonMaximaSuppressionKernel
 - New Arm® Neon™ kernels / functions:
    - @ref NEBatchNormalizationLayerKernel / @ref NEBatchNormalizationLayer
    - NEDepthConcatenateLayerKernel / NEDepthConcatenateLayer
    - NEDirectConvolutionLayerKernel / @ref NEDirectConvolutionLayer
    - NELocallyConnectedMatrixMultiplyKernel / NELocallyConnectedLayer
    - NEWeightsReshapeKernel / NEConvolutionLayerReshapeWeights

v17.05 Public bug fixes release
 - Various bug fixes
 - Ported the remaining functions to use accurate padding.
 - The library no longer links against OpenCL (it uses dlopen / dlsym at runtime instead to determine whether or not OpenCL is available).
 - Added "free" method to allocator.
 - Minimum version of g++ required for armv7 Linux changed from 4.8 to 4.9
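
The runtime OpenCL detection mentioned for this release relies on the POSIX dlopen / dlsym calls. A minimal sketch of that technique follows, assuming a POSIX system. The helper name `has_symbol` is hypothetical and this is not the library's actual loader code:

```cpp
#include <dlfcn.h>

// Hypothetical helper (not part of arm_compute): returns true if `library`
// can be loaded at runtime and exports `symbol`.
// The handle is closed again before returning.
bool has_symbol(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_LAZY | RTLD_LOCAL);
    if(handle == nullptr)
    {
        // Library not present: the caller can fall back to a CPU-only path.
        return false;
    }
    const bool found = (dlsym(handle, symbol) != nullptr);
    dlclose(handle);
    return found;
}
```

As an illustration, `has_symbol("libOpenCL.so", "clGetPlatformIDs")` could gate an OpenCL code path. The library file name is an assumption and varies between platforms, and on older glibc versions the program may need to be linked with -ldl.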

v17.04 Public bug fixes release

 The following functions have been ported to use the new accurate padding:
 - CLColorConvertKernel
 - CLEdgeNonMaxSuppressionKernel
 - CLEdgeTraceKernel
 - CLGaussianPyramidHorKernel
 - CLGaussianPyramidVertKernel
 - CLGradientKernel
 - NEChannelCombineKernel
 - NEFillArrayKernel
 - NEGaussianPyramidHorKernel
 - NEGaussianPyramidVertKernel
 - NEHarrisScoreFP16Kernel
 - NEHarrisScoreKernel
 - NEHOGDetectorKernel
 - NELogits1DMaxKernel
 - NELogits1DShiftExpSumKernel
 - NELogits1DNormKernel
 - NENonMaximaSuppression3x3FP16Kernel
 - NENonMaximaSuppression3x3Kernel

v17.03.1 First major public release of the sources
 - Renamed the library to arm_compute
 - New CPP target introduced for C++ kernels shared between Arm® Neon™ and CL functions.
 - New padding calculation interface introduced and ported most kernels / functions to use it.
 - New OpenCL kernels / functions:
    - CLGEMMLowpMatrixMultiplyKernel / CLGEMMLowp
 - New Arm® Neon™ kernels / functions:
    - @ref NENormalizationLayerKernel / @ref NENormalizationLayer
    - NETransposeKernel / @ref NETranspose
    - NELogits1DMaxKernel, NELogits1DShiftExpSumKernel, NELogits1DNormKernel / @ref NESoftmaxLayer
    - NEIm2ColKernel, NECol2ImKernel, NEConvolutionLayerWeightsReshapeKernel / @ref NEConvolutionLayer
    - NEGEMMMatrixAccumulateBiasesKernel / @ref NEFullyConnectedLayer
    - NEGEMMLowpMatrixMultiplyKernel / NEGEMMLowp

v17.03 Sources preview
 - New OpenCL kernels / functions:
    - CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
    - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
    - CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
    - CLTransposeKernel / @ref CLTranspose
    - CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow
    - @ref CLNormalizationLayerKernel / @ref CLNormalizationLayer
    - CLLaplacianPyramid, CLLaplacianReconstruct
 - New Arm® Neon™ kernels / functions:
    - NEActivationLayerKernel / @ref NEActivationLayer
    - GEMM refactoring + FP16 support (requires armv8.2 CPU): NEGEMMInterleave4x4Kernel, NEGEMMTranspose1xWKernel, NEGEMMMatrixMultiplyKernel, NEGEMMMatrixAdditionKernel / @ref NEGEMM
    - NEPoolingLayerKernel / @ref NEPoolingLayer

v17.02.1 Sources preview
 - New OpenCL kernels / functions:
    - CLLogits1DMaxKernel, CLLogits1DShiftExpSumKernel, CLLogits1DNormKernel / @ref CLSoftmaxLayer
    - CLPoolingLayerKernel / @ref CLPoolingLayer
    - CLIm2ColKernel, CLCol2ImKernel, CLConvolutionLayerWeightsReshapeKernel / CLConvolutionLayer
    - @ref CLRemapKernel / @ref CLRemap
    - CLGaussianPyramidHorKernel, CLGaussianPyramidVertKernel / CLGaussianPyramid, CLGaussianPyramidHalf, CLGaussianPyramidOrb
    - CLMinMaxKernel, CLMinMaxLocationKernel / CLMinMaxLocation
    - CLNonLinearFilterKernel / CLNonLinearFilter
 - New Arm® Neon™ FP16 kernels (requires armv8.2 CPU)
    - NEAccumulateWeightedFP16Kernel
    - NEBox3x3FP16Kernel
    - NENonMaximaSuppression3x3FP16Kernel

v17.02 Sources preview
 - New OpenCL kernels / functions:
    - CLActivationLayerKernel / @ref CLActivationLayer
    - CLChannelCombineKernel / CLChannelCombine
    - CLDerivativeKernel / CLChannelExtract
    - CLFastCornersKernel / CLFastCorners
    - CLMeanStdDevKernel / CLMeanStdDev
 - New Arm® Neon™ kernels / functions:
    - HOG / SVM: NEHOGOrientationBinningKernel, NEHOGBlockNormalizationKernel, NEHOGDetectorKernel, NEHOGNonMaximaSuppressionKernel / NEHOGDescriptor, NEHOGDetector, NEHOGGradient, NEHOGMultiDetection
    - NENonLinearFilterKernel / NENonLinearFilter
 - Introduced a CLScheduler to manage the default context and command queue used by the runtime library and create synchronisation events.
 - Switched all the kernels / functions to use tensors instead of images.
 - Updated documentation to include instructions to build the library from sources.

v16.12 Binary preview release
 - Original release

 */
} // namespace arm_compute