///
/// Copyright (c) 2017-2021 Arm Limited.
///
/// SPDX-License-Identifier: MIT
///
/// Permission is hereby granted, free of charge, to any person obtaining a copy
/// of this software and associated documentation files (the "Software"), to
/// deal in the Software without restriction, including without limitation the
/// rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
/// sell copies of the Software, and to permit persons to whom the Software is
/// furnished to do so, subject to the following conditions:
///
/// The above copyright notice and this permission notice shall be included in all
/// copies or substantial portions of the Software.
///
/// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
/// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
/// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
/// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
/// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
/// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
/// SOFTWARE.
///
namespace arm_compute
{
/** @page versions_changelogs Release Versions and Changelog

@tableofcontents

@section S2_1_versions Release versions

All releases are numbered vYY.MM, where YY is the last two digits of the year and MM is the month number.
If there is more than one release in a month, an extra sequential number is appended at the end:

    v17.03 (First release of March 2017)
    v17.03.1 (Second release of March 2017)
    v17.04 (First release of April 2017)

@note We aim to publish one major public release with new features per quarter. All releases in between will only contain bug fixes.

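As an illustration of the numbering scheme described in this section, the following sketch parses a release tag and orders two releases. `ReleaseVersion`, `parse_release_version` and `is_older` are hypothetical helpers written for this page only; they are not part of the library, and the four-digit cap on the sequence number is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical helper type for this example only (not a library type).
struct ReleaseVersion
{
    int year;     // YY: last two digits of the year
    int month;    // MM: month number, 1..12
    int sequence; // 0 for vYY.MM, N for vYY.MM.N
};

inline bool is_digit(char c)
{
    return c >= '0' && c <= '9';
}

// Parses "vYY.MM" or "vYY.MM.N". Returns false and leaves `out` untouched on malformed input.
inline bool parse_release_version(const std::string &text, ReleaseVersion &out)
{
    if(text.size() < 6 || text[0] != 'v' || !is_digit(text[1]) || !is_digit(text[2]) || text[3] != '.'
       || !is_digit(text[4]) || !is_digit(text[5]))
    {
        return false;
    }
    ReleaseVersion v{ (text[1] - '0') * 10 + (text[2] - '0'), (text[4] - '0') * 10 + (text[5] - '0'), 0 };
    assert(v.year >= 0 && v.year <= 99); // two decimal digits always yield 0..99
    if(v.month < 1 || v.month > 12)
    {
        return false;
    }
    if(text.size() > 6)
    {
        // Optional ".N" suffix: a dot followed by one to four digits (cap assumed by this sketch).
        if(text[6] != '.' || text.size() == 7 || text.size() > 11)
        {
            return false;
        }
        for(std::size_t i = 7; i < text.size(); ++i)
        {
            if(!is_digit(text[i]))
            {
                return false;
            }
            v.sequence = v.sequence * 10 + (text[i] - '0');
        }
    }
    out = v;
    return true;
}

// Orders releases by (year, month, sequence). Assumes both releases fall in the same century.
inline bool is_older(const ReleaseVersion &a, const ReleaseVersion &b)
{
    return std::tie(a.year, a.month, a.sequence) < std::tie(b.year, b.month, b.sequence);
}
```

With these helpers, `v17.03` parses to year 17, month 3, sequence 0, and compares as older than `v17.03.1`.
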
@section S2_2_changelog Changelog

v21.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Various documentation updates:
   - Add supported operators and corresponding Android NNAPI operators.
   - Documentation reorg into user guide and contributor guide.
 - Add support for a global allocator for OpenCL tensors
 - Add experimental support for [CLVK](https://github.com/kpet/clvk).
51 - Add data type S32 support for:
52 - @ref opencl::kernels::ClArithmeticKernel
53 - Add data type QASYMM8 support for:
54 - @ref CLROIPoolingLayer
55 - @ref CLROIPoolingLayerKernel
56 - @ref NEROIPoolingLayer
57 - @ref NEROIPoolingLayerKernel
58 - Add per-channel quantization support for:
59 - @ref CLDeconvolutionLayer
60 - @ref CLDirectDeconvolutionLayer
61 - @ref NEConvolutionLayer
62 - @ref NEDeconvolutionLayer
63 - Remove padding from OpenCL kernels:
64 - @ref CLL2NormalizeLayerKernel
65 - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
66 - @ref CLNormalizationLayerKernel
67 - @ref CLNormalizePlanarYUVLayerKernel
68 - @ref opencl::kernels::ClMulKernel
69 - @ref CLReductionOperationKernel
70 - @ref CLROIPoolingLayerKernel
71 - Remove computer vision support from Arm® Neon™ backend
72 - Remove the following functions:
Michalis Spyrou27e67f02021-02-16 11:34:39 +000073 - NEAbsoluteDifference
74 - NEAccumulate
75 - NEBox3x3
76 - NECannyEdge
77 - NEChannelCombine
78 - NEChannelExtract
79 - NEColorConvert
Michalis Spyrou473cb012021-02-23 11:48:12 +000080 - NEConvolution
Michalis Spyrou27e67f02021-02-16 11:34:39 +000081 - NEDerivative
82 - NEDilate
83 - NEEqualizeHistogram
84 - NEErode
85 - NEFastCorners
86 - NEGaussian3x3
87 - NEGaussian5x5
88 - NEGaussianPyramid
89 - NEHOGDescriptor
90 - NEHOGDetector
91 - NEHOGGradient
92 - NEHOGMultiDetection
93 - NEHarrisCorners
94 - NEHistogram
95 - NEIntegralImage
96 - NELaplacianPyramid
97 - NELaplacianReconstruct
98 - NEMagnitude
99 - NEMeanStdDev
100 - NEMedian3x3
101 - NEMinMaxLocation
102 - NENonLinearFilter
103 - NEOpticalFlow
104 - NEPhase
Michalis Spyrou27e67f02021-02-16 11:34:39 +0000105 - NEScharr3x3
106 - NESobel3x3
107 - NESobel5x5
108 - NESobel7x7
109 - NETableLookup
110 - NEThreshold
111 - NEWarpAffine
Michalis Spyrou473cb012021-02-23 11:48:12 +0000112 - NEWarpPerspectiveKernel
Michalis Spyrou473cb012021-02-23 11:48:12 +0000113 - Remove all GLES kernels / functions / tests / examples
Sheri Zhangc2bed952021-05-06 12:12:38 +0100114 - Remove computer vision support from CL backend
115 - Remove the following functions:
Michalis Spyrou473cb012021-02-23 11:48:12 +0000116 - CLAbsoluteDifference
117 - CLAccumulate
118 - CLBox3x3
119 - CLCannyEdge
120 - CLChannelCombine
121 - CLChannelExtract
122 - CLColorConvert
123 - CLConvolution
124 - CLDerivative
125 - CLDilate
126 - CLEqualizeHistogram
127 - CLErode
128 - CLFastCorners
129 - CLGaussian3x3
130 - CLGaussian5x5
131 - CLGaussianPyramid
132 - CLHOGDescriptor
133 - CLHOGDetector
134 - CLHOGGradient
135 - CLHOGMultiDetection
136 - CLHarrisCorners
137 - CLHistogram
138 - CLIntegralImage
139 - CLLaplacianPyramid
140 - CLLaplacianReconstruct
141 - CLMagnitude
142 - CLMeanStdDev
143 - CLMedian3x3
144 - CLMinMaxLocation
145 - CLNonLinearFilter
146 - CLOpticalFlow
147 - CLPhase
148 - CLScharr3x3
149 - CLSobel3x3
150 - CLSobel5x5
151 - CLSobel7x7
152 - CLTableLookup
153 - CLThreshold
154 - CLWarpAffine
155 - CLWarpPerspective
156
v21.02 Public major release
Sheri Zhangda6a6eb2021-01-06 11:15:06 +0000158 - Various bug fixes.
159 - Various optimisations.
Georgios Pinitas45514032020-12-30 00:03:09 +0000160 - Upgrade C++ standard to C++14
161 - Add macOS support
Giorgio Arena1055dc12021-02-19 09:53:06 +0000162 - Add Armv8-R AArch64 architecture support
Sheri Zhangda6a6eb2021-01-06 11:15:06 +0000163 - Add SVE/SVE2 support for:
Manuel Bottini10b38262021-02-19 18:16:44 +0000164 - NEScaleKernel
Sheri Zhangda6a6eb2021-01-06 11:15:06 +0000165 - @ref NEActivationLayer
166 - @ref NEArithmeticAddition
167 - @ref NEBatchNormalizationLayerKernel
Giorgio Arena1055dc12021-02-19 09:53:06 +0000168 - @ref cpu::kernels::CpuLogits1DSoftmaxKernel
169 - @ref cpu::kernels::CpuLogits1DMaxKernel
170 - @ref cpu::kernels::CpuElementwiseUnaryKernel
Sheri Zhangdda69142021-02-01 19:06:57 +0000171 - Remove padding from OpenCL kernels:
Sheri Zhang1efed922021-03-10 22:43:38 +0000172 - CLDirectConvolutionLayerKernel
Sheri Zhangdda69142021-02-01 19:06:57 +0000173 - @ref CLArgMinMaxLayerKernel
174 - @ref CLPadLayerKernel
175 - @ref CLROIAlignLayerKernel
176 - @ref CLRangeKernel
Manuel Bottini3b131ab2021-02-19 18:16:44 +0000177 - CLScaleKernel
Sheri Zhangdda69142021-02-01 19:06:57 +0000178 - @ref CLSelectKernel
179 - @ref CLBitwiseKernel
Giorgio Arena1055dc12021-02-19 09:53:06 +0000180 - @ref opencl::kernels::ClFloorKernel
Teresa Charlin27886092021-02-25 20:15:01 +0000181 - CLTransposeKernel
Giorgio Arena5b50f422021-02-17 11:43:05 +0000182 - Deprecate functions in CLTuner:
183 - add_lws_to_table
184 - import_lws_table
185 - lws_table
Sheri Zhangda6a6eb2021-01-06 11:15:06 +0000186 - Remove functions:
Georgios Pinitas96b16b62020-12-01 17:41:34 +0000187 - NELocallyConnectedLayer / CLLocallyConnectedLayer
Georgios Pinitasf7c5a412020-12-03 14:38:33 +0000188 - NEIm2Col
189 - NECol2Im
190 - NEGEMMInterleave4x4
191 - NEGEMMTranspose1xW
Georgios Pinitas8c3c0e72020-12-03 20:11:53 +0000192 - NEComputeAllAnchors / CLComputeAllAnchors
Georgios Pinitasec2256b2020-12-03 18:51:58 +0000193 - NEGEMMAssemblyDispatch
Georgios Pinitasc53266e2020-12-09 03:11:53 +0000194 - NEUpsampleLayer / CLUpsampleLayer
Sheri Zhangda6a6eb2021-01-06 11:15:06 +0000195 - Remove kernels:
Georgios Pinitasd308df32020-12-01 16:56:36 +0000196 - NEGEMMMatrixVectorMultiplyKernel
Georgios Pinitas96b16b62020-12-01 17:41:34 +0000197 - NELocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedMatrixMultiplyKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +0000198 - NEUpsampleLayerKernel / CLUpsampleLayerKernel
 - Extend OpenCL tuner with workgroup batch size support
   - Experimental extension for the OpenCL tuner to tune the batches of work groups distributed to compute units
 - Add functionality to load the OpenCL GEMM heuristics at runtime
   - The GEMM heuristic file (MLGO) can be used to update the default GEMM heuristics available for OpenCL
 - Note: there might be performance regressions against v20.08 in Inception v3 using int8 data types on Arm Mali-G77 GPUs. Currently under investigation
 - Note: data-type decoupling is in progress and experimental. Warnings about unused symbols might be raised

v20.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Performance regressions can be noted when executing Depthwise Convolution on Arm® Neon™ with a depth multiplier > 1 for quantized data types.
   This is planned to be resolved in the 21.02 release.
 - Added new data type QASYMM8_SIGNED support for @ref NEROIAlignLayer.
 - Added new data type S32 support for:
   - NEArithmeticSubtraction
   - NEArithmeticSubtractionKernel
   - @ref NEPixelWiseMultiplication
   - NEPixelWiseMultiplicationKernel
   - NEElementwiseDivision
   - NEDivisionOperationKernel
 - Interface change
   - The softmax axis now has the same meaning as in other major frameworks. That is, axis now defines the dimension
     on which Softmax/Logsoftmax is performed. E.g. for input of shape 4x5x6 and axis=1, softmax will be applied to 4x6=24 vectors of size 5.
     The supported value range of axis is [-rank, rank).
     This change applies to the following functions:
     - @ref NESoftmaxLayer
     - @ref NELogSoftmaxLayer
     - @ref CLSoftmaxLayer
     - @ref CLLogSoftmaxLayer
     - GCSoftmaxLayer
Sheri Zhang824061d2020-10-26 15:46:37 +0000229 - New OpenCL kernels / functions:
230 - @ref CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
morgolock0e728492020-11-20 11:03:33 +0000231 - @ref CLLogicalNot
232 - @ref CLLogicalAnd
233 - @ref CLLogicalOr
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000234 - New Arm® Neon™ kernels / functions:
morgolock0e728492020-11-20 11:03:33 +0000235 - @ref NELogicalNot
236 - @ref NELogicalAnd
237 - @ref NELogicalOr
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000238 - Removed padding from Arm® Neon™ kernels:
Sheri Zhang1e3ab422021-03-16 17:35:08 +0000239 - NEComplexPixelWiseMultiplicationKernel
Michalis Spyrou473cb012021-02-23 11:48:12 +0000240 - NENonMaximaSuppression3x3Kernel
241 - @ref NERemapKernel
Sheri Zhanged367132020-10-08 15:46:16 +0100242 - @ref NEGEMMInterleave4x4Kernel
Manuel Bottini327225d2021-04-13 13:09:30 +0100243 - NEDirectConvolutionLayerKernel
Manuel Bottini10b38262021-02-19 18:16:44 +0000244 - NEScaleKernel
Georgios Pinitas96b16b62020-12-01 17:41:34 +0000245 - NELocallyConnectedMatrixMultiplyKernel
Sheri Zhanged367132020-10-08 15:46:16 +0100246 - @ref NEGEMMLowpOffsetContributionKernel
247 - @ref NEGEMMTranspose1xWKernel
Michele Di Giorgio19289042021-02-03 16:05:00 +0000248 - NEPoolingLayerKernel
Michalis Spyrou473cb012021-02-23 11:48:12 +0000249 - NEConvolutionKernel
Michalis Spyrou60c3b0e2021-04-08 12:02:58 +0100250 - NEDepthwiseConvolutionLayerNativeKernel
Sheri Zhanged367132020-10-08 15:46:16 +0100251 - @ref NEGEMMLowpMatrixMultiplyKernel
252 - @ref NEGEMMMatrixMultiplyKernel
Manuel Bottini327225d2021-04-13 13:09:30 +0100253 - NEDirectConvolutionLayerOutputStageKernel
Sheri Zhanged367132020-10-08 15:46:16 +0100254 - @ref NEReductionOperationKernel
255 - @ref NEGEMMLowpMatrixAReductionKernel
256 - @ref NEGEMMLowpMatrixBReductionKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000257 - Removed padding from OpenCL kernels:
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000258 - CLBatchConcatenateLayerKernel
Michele Di Giorgio1e0208a2021-01-22 15:42:59 +0000259 - CLElementwiseOperationKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000260 - @ref CLBatchNormalizationLayerKernel
Michele Di Giorgioe1314662021-02-01 17:09:32 +0000261 - CLPoolingLayerKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000262 - @ref CLWinogradInputTransformKernel
263 - @ref CLGEMMLowpMatrixMultiplyNativeKernel
264 - @ref CLGEMMLowpMatrixAReductionKernel
265 - @ref CLGEMMLowpMatrixBReductionKernel
266 - @ref CLGEMMLowpOffsetContributionOutputStageKernel
267 - @ref CLGEMMLowpOffsetContributionKernel
268 - @ref CLWinogradOutputTransformKernel
269 - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
270 - @ref CLFuseBatchNormalizationKernel
271 - @ref CLDepthwiseConvolutionLayerNativeKernel
Georgios Pinitas11d84152021-04-28 10:20:18 +0100272 - CLDepthConvertLayerKernel
Sheri Zhang7e20e292021-02-02 11:49:34 +0000273 - CLCopyKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000274 - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
Georgios Pinitasf47f7182021-01-15 09:29:50 +0000275 - CLActivationLayerKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000276 - @ref CLWinogradFilterTransformKernel
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000277 - CLWidthConcatenateLayerKernel
278 - CLWidthConcatenate4TensorsKernel
279 - CLWidthConcatenate2TensorsKernel
Sang-Hoon Park201e0fe2021-01-27 13:14:56 +0000280 - CLLogits1DMaxShiftExpSumKernel
281 - CLLogits1DNormKernel
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000282 - CLHeightConcatenateLayerKernel
Georgios Pinitas856f66e2021-04-22 21:13:21 +0100283 - CLGEMMMatrixMultiplyKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000284 - @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
285 - @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
286 - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
Michele Di Giorgio7d61ff02021-01-18 21:15:59 +0000287 - CLDepthConcatenateLayerKernel
Sheri Zhang824061d2020-10-26 15:46:37 +0000288 - @ref CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
289 - Removed OpenCL kernels / functions:
290 - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
291 - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel
292 - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel
morgolock00c76012020-11-06 10:40:12 +0000293 - Deprecated OpenCL kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
Georgios Pinitas2d221392020-09-03 15:16:37 +0100294 - CLLocallyConnectedLayer
295 - CLLocallyConnectedMatrixMultiplyKernel
morgolock00c76012020-11-06 10:40:12 +0000296 - CLAbsoluteDifference
297 - CLAbsoluteDifferenceKernel
298 - CLAccumulate
299 - CLAccumulateKernel
300 - CLAccumulateSquared
301 - CLAccumulateSquaredKernel
302 - CLAccumulateWeighted
303 - CLAccumulateWeightedKernel
304 - CLAccumulateWeightedFP16Kernel
305 - CLBox3x3
306 - CLBox3x3Kernel
307 - CLBox3x3FP16Kernel
308 - CLCannyEdge
309 - CLChannelCombine
310 - CLChannelCombineKernel
311 - CLChannelExtract
312 - CLChannelExtractKernel
313 - CLColorConvert
314 - CLColorConvertKernel
315 - CLConvolution3x3
316 - CLConvolutionRectangle
317 - CLConvolutionRectangleKernel
318 - CLConvolutionSquare
319 - CLConvolutionKernel
320 - CLDerivative
321 - CLDerivativeKernel
322 - CLDilate
323 - CLDilateKernel
324 - CLEqualizeHistogram
325 - CLErode
326 - CLErodeKernel
327 - CLFastCorners
328 - CLFastCornersKernel
329 - CLGaussian3x3
330 - CLGaussian3x3Kernel
331 - CLGaussian5x5
332 - CLGaussian5x5HorKernel
333 - CLGaussian5x5VertKernel
334 - CLGaussianPyramid
335 - CLGaussianPyramidHalf
336 - CLGaussianPyramidOrb
337 - CLHarrisCorners
338 - CLHarrisScoreKernel
339 - CLHarrisScoreFP16Kernel
340 - CLHistogram
341 - CLHistogramKernel
342 - CLHOGOrientationBinningKernel
343 - CLHOGBlockNormalizationKernel
344 - CLHOGDetectorKernel
345 - CLHOGNonMaximaSuppressionKernel
346 - CLHOGDescriptor
347 - CLHOGDetector
348 - CLHOGGradient
349 - CLHOGMultiDetection
353 - CLIntegralImage
354 - CLIntegralImageKernel
355 - CLLaplacianReconstruct
356 - CLLaplacianPyramid
357 - CLMagnitude
358 - CLMagnitudePhaseKernel
359 - CLMedian3x3
360 - CLMedian3x3Kernel
361 - CLMinMaxLocation
362 - CLMinMaxLocationKernel
363 - CLNonLinearFilter
364 - CLNonLinearFilterKernel
365 - CLNonMaximaSuppression3x3
366 - CLNonMaximaSuppression3x3FP16Kernel
367 - CLNonMaximaSuppression3x3Kernel
368 - CLOpticalFlow
369 - CLPhase
370 - CLRemap
371 - CLRemapKernel
372 - CLScharr3x3
373 - CLScharr3x3Kernel
374 - CLSobel3x3
375 - CLSobel3x3Kernel
376 - CLSobel5x5
377 - CLSobel5x5HorKernel
378 - CLSobel5x5VertKernel
379 - CLSobel7x7
380 - CLSobel7x7HorKernel
381 - CLSobel7x7VertKernel
382 - CLThreshold
383 - CLThresholdKernel
384 - CLWarpAffine
385 - CLWarpAffineKernel
386 - CLWarpPerspective
387 - CLWarpPerspectiveKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000388 - Deprecated Arm® Neon™ kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
Georgios Pinitas2d221392020-09-03 15:16:37 +0100389 - NELocallyConnectedLayer
390 - NELocallyConnectedMatrixMultiplyKernel
morgolock0c862652020-11-06 08:59:45 +0000391 - NEAbsoluteDifference
392 - NEAbsoluteDifferenceKernel
393 - NEAccumulate
394 - NEAccumulateKernel
395 - NEAccumulateSquared
396 - NEAccumulateSquaredKernel
397 - NEAccumulateWeighted
398 - NEAccumulateWeightedKernel
399 - NEAccumulateWeightedFP16Kernel
400 - NEBox3x3
401 - NEBox3x3Kernel
402 - NEBox3x3FP16Kernel
403 - NECannyEdge
404 - NEChannelCombine
405 - NEChannelCombineKernel
406 - NEChannelExtract
407 - NEChannelExtractKernel
408 - NEColorConvert
409 - NEColorConvertKernel
410 - NEConvolution3x3
411 - NEConvolutionRectangle
412 - NEConvolutionRectangleKernel
413 - NEConvolutionSquare
414 - NEConvolutionKernel
415 - NEDerivative
416 - NEDerivativeKernel
417 - NEDilate
418 - NEDilateKernel
419 - NEEqualizeHistogram
420 - NEErode
421 - NEErodeKernel
422 - NEFastCorners
423 - NEFastCornersKernel
424 - NEGaussian3x3
425 - NEGaussian3x3Kernel
426 - NEGaussian5x5
427 - NEGaussian5x5HorKernel
428 - NEGaussian5x5VertKernel
429 - NEGaussianPyramid
430 - NEGaussianPyramidHalf
431 - NEGaussianPyramidOrb
432 - NEHarrisCorners
433 - NEHarrisScoreKernel
434 - NEHarrisScoreFP16Kernel
435 - NEHistogram
436 - NEHistogramKernel
437 - NEHOGOrientationBinningKernel
438 - NEHOGBlockNormalizationKernel
439 - NEHOGDetectorKernel
440 - NEHOGNonMaximaSuppressionKernel
441 - NEHOGDescriptor
442 - NEHOGDetector
443 - NEHOGGradient
444 - NEHOGMultiDetection
448 - NEIntegralImage
449 - NEIntegralImageKernel
450 - NELaplacianReconstruct
451 - NELaplacianPyramid
452 - NEMagnitude
453 - NEMagnitudePhaseKernel
454 - NEMedian3x3
455 - NEMedian3x3Kernel
456 - NEMinMaxLocation
457 - NEMinMaxLocationKernel
458 - NENonLinearFilter
459 - NENonLinearFilterKernel
460 - NENonMaximaSuppression3x3
461 - NENonMaximaSuppression3x3FP16Kernel
462 - NENonMaximaSuppression3x3Kernel
463 - NEOpticalFlow
464 - NEPhase
465 - NERemap
466 - NERemapKernel
467 - NEScharr3x3
468 - NEScharr3x3Kernel
469 - NESobel3x3
470 - NESobel3x3Kernel
471 - NESobel5x5
472 - NESobel5x5HorKernel
473 - NESobel5x5VertKernel
474 - NESobel7x7
475 - NESobel7x7HorKernel
476 - NESobel7x7VertKernel
477 - NEThreshold
478 - NEThresholdKernel
479 - NEWarpAffine
480 - NEWarpAffineKernel
481 - NEWarpPerspective
482 - NEWarpPerspectiveKernel
morgolockd6ee9ed2020-11-19 10:07:14 +0000483 - Deprecated GLES kernels / functions (If a kernel is used only by the function that is being deprecated, the kernel is deprecated together):
484 - GCAbsoluteDifference
485 - GCActivationLayer
486 - GCArithmeticAddition
487 - GCBatchNormalizationLayer
488 - GCConcatenateLayer
489 - GCConvolutionLayer
490 - GCDepthwiseConvolutionLayer
491 - GCDirectConvolutionLayer
492 - GCDropoutLayer
493 - GCFillBorder
494 - GCFullyConnectedLayer
495 - GCGEMM
496 - GCGEMMInterleave4x4
497 - GCGEMMTranspose1xW
498 - GCNormalizationLayer
499 - GCNormalizePlanarYUVLayer
500 - GCPixelWiseMultiplication
501 - GCPoolingLayer
502 - GCScale
503 - GCSoftmaxLayer
504 - GCTensorShift
505 - GCTranspose
506
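The softmax axis semantics from the v20.11 interface change can be sketched as follows. The helper names `wrap_axis` and `num_softmax_vectors` are hypothetical and exist only for this illustration; the sketch also assumes the usual convention that a negative axis counts back from the last dimension, which the changelog entry implies through the range [-rank, rank) but does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Maps an axis in [-rank, rank) to the equivalent index in [0, rank).
// Assumption of this sketch: a negative axis counts back from the last dimension.
inline int wrap_axis(int axis, int rank)
{
    assert(rank > 0 && axis >= -rank && axis < rank);
    return axis < 0 ? axis + rank : axis;
}

// Number of independent softmax vectors: the product of every dimension except the softmax axis.
// Each of those vectors has shape[wrap_axis(axis, rank)] elements.
inline std::size_t num_softmax_vectors(const std::vector<std::size_t> &shape, int axis)
{
    const int   rank    = static_cast<int>(shape.size());
    const int   wrapped = wrap_axis(axis, rank);
    std::size_t count   = 1;
    for(int d = 0; d < rank; ++d)
    {
        if(d != wrapped)
        {
            count *= shape[static_cast<std::size_t>(d)];
        }
    }
    return count;
}
```

For the 4x5x6 example with axis=1, `num_softmax_vectors` returns 24 and the vector length is `shape[1]`, that is 5. Under the assumed convention, axis=-2 selects the same dimension.
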
SiCong Li96209c72020-08-21 12:28:30 +0100507
v20.08 Public major release
509 - Various bug fixes.
510 - Various optimisations.
Sheri Zhang3ef9b5f2020-07-09 16:32:58 +0100511 - Added new data type QASYMM8_SIGNED support for:
Sheri Zhangdd4cfc02020-07-10 14:15:41 +0100512 - @ref CLArgMinMaxLayer
513 - @ref CLArgMinMaxLayerKernel
514 - Added new data type U8 support for:
515 - @ref NECropKernel
Sheri Zhang7e20e292021-02-02 11:49:34 +0000516 - CLCropKernel
 - Added align_corner support for nearest neighbor interpolation in:
Manuel Bottini10b38262021-02-19 18:16:44 +0000518 - NEScaleKernel
Manuel Bottini3b131ab2021-02-19 18:16:44 +0000519 - CLScaleKernel
Sheri Zhangdd4cfc02020-07-10 14:15:41 +0100520 - New OpenCL kernels / functions:
521 - @ref CLMaxUnpoolingLayerKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000522 - New Arm® Neon™ kernels / functions:
Sheri Zhangdd4cfc02020-07-10 14:15:41 +0100523 - @ref NEMaxUnpoolingLayerKernel
Sheri Zhang3ef9b5f2020-07-09 16:32:58 +0100524 - New graph example:
Sheri Zhangdd4cfc02020-07-10 14:15:41 +0100525 - graph_yolov3_output_detector
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100526 - GEMMTuner improvements:
527 - Added fp16 support
528 - Output json files for easier integration
529 - Enabled tuning for export_to_cl_image_rhs option for RHS tensors
530 - More robust script for running benchmarks
Sheri Zhang3ef9b5f2020-07-09 16:32:58 +0100531 - Removed padding from:
Sheri Zhang1e3ab422021-03-16 17:35:08 +0000532 - NEPixelWiseMultiplicationKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000533 - NEHeightConcatenateLayerKernel
Michalis Spyrou27e67f02021-02-16 11:34:39 +0000534 - NEThresholdKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000535 - NEBatchConcatenateLayerKernel
Teresa Charlind1dc09c2021-03-04 15:24:45 +0000536 - NETransposeKernel
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100537 - @ref NEBatchNormalizationLayerKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000538 - NEArithmeticSubtractionKernel
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100539 - @ref NEBoundingBoxTransformKernel
Michalis Spyrou373b4072021-01-20 16:41:12 +0000540 - NELogits1DMaxKernel
541 - NELogits1DSoftmaxKernel
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100542 - @ref NEROIPoolingLayerKernel
543 - @ref NEROIAlignLayerKernel
Georgios Pinitas0b1c2db2020-12-04 15:51:34 +0000544 - NEYOLOLayerKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +0000545 - NEUpsampleLayerKernel
Georgios Pinitas70eb53b2021-01-06 19:42:21 +0000546 - NEFloorKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000547 - NEWidthConcatenateLayerKernel
548 - NEDepthConcatenateLayerKernel
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100549 - @ref NENormalizationLayerKernel
550 - @ref NEL2NormalizeLayerKernel
Georgios Pinitasc6f95102021-03-30 10:03:01 +0100551 - NEFillArrayKernel
Georgios Pinitas11d84152021-04-28 10:20:18 +0100552 - NEDepthConvertLayerKernel
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100553 - @ref NERangeKernel
554 - @ref NEPriorBoxLayer
Sheri Zhanged367132020-10-08 15:46:16 +0100555 - Removed OpenCL kernels / functions:
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100556 - CLGEMMLowpQuantizeDownInt32ToUint8Scale
557 - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000558 - Removed Arm® Neon™ kernels / functions:
Sang-Hoon Parkadfaefb2020-08-18 09:13:05 +0100559 - NEGEMMLowpQuantizeDownInt32ToUint8Scale
560 - NEGEMMMatrixAccumulateBiasesKernel
 - Deprecated functions / interfaces:
   - Non-descriptor based interfaces for NEThreshold, CLThreshold
   - Non-descriptor based interfaces for @ref NEScale, @ref CLScale and GCScale
   - In @ref NESoftmaxLayer, @ref NELogSoftmaxLayer, @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer:
     The default "axis" value for @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and GCSoftmaxLayer is changed from 1 to 0.
     Only axis 0 is supported.
     The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
     Only axis 0 is supported.
 - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
 - Removed padding requirement for the input (e.g. LHS of GEMM) and output in CLGEMMMatrixMultiplyNativeKernel, CLGEMMMatrixMultiplyReshapedKernel, CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
   - This change allows @ref CLGEMMConvolutionLayer to be used without extra padding for the input and output.
   - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
   - Only on Arm® Mali™ Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since CLGEMMMatrixMultiplyKernel is called and currently requires padding.
 - Added support for exporting the OpenCL buffer object to the OpenCL image object in CLGEMMMatrixMultiplyReshapedKernel and CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
   - This support allows the OpenCL buffer used for the reshaped RHS matrix to be exported to the OpenCL image object.
   - The padding requirement for the OpenCL image object is taken into account in CLGEMMReshapeRHSMatrixKernel.
   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate CLGEMMConvolutionLayer.
Georgios Pinitas25ef7212020-06-02 23:00:41 +0100578
v20.05 Public major release
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000580 - Various bug fixes.
581 - Various optimisations.
Michele Di Giorgio36a551f2020-04-23 11:55:29 +0100582 - Updated recommended NDK version to r18b.
583 - Updated recommended gcc version to Linaro 6.3.1.
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000584 - Added Bfloat16 type support
585 - Added Bfloat16 support in:
586 - @ref NEWeightsReshapeKernel
587 - @ref NEConvolutionLayerReshapeWeights
588 - @ref NEIm2ColKernel
Georgios Pinitasf7c5a412020-12-03 14:38:33 +0000589 - NEIm2Col
Georgios Pinitas11d84152021-04-28 10:20:18 +0100590 - NEDepthConvertLayerKernel
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000591 - @ref NEDepthConvertLayer
592 - @ref NEGEMMConvolutionLayer
Georgios Pinitasec2256b2020-12-03 18:51:58 +0000593 - NEGEMMAssemblyDispatch
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000594 - Added new data type QASYMM8_SIGNED support for:
595 - @ref CLDirectConvolutionLayer
596 - @ref CLDeconvolutionLayer
597 - @ref CLDirectDeconvolutionLayer
598 - @ref CLGEMMDeconvolutionLayer
599 - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
600 - @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
601 - @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
602 - @ref CLReductionOperation
603 - @ref CLReduceMean
Sheri Zhang359c48e2020-04-30 22:53:39 +0100604 - @ref NEScale
Manuel Bottini10b38262021-02-19 18:16:44 +0000605 - NEScaleKernel
Georgios Pinitasc53266e2020-12-09 03:11:53 +0000606 - NEUpsampleLayer
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000607 - @ref NECast
608 - @ref NEReductionOperation
609 - @ref NEReduceMean
610 - @ref NEArgMinMaxLayer
611 - @ref NEDeconvolutionLayer
612 - @ref NEGEMMLowpQuantizeDownInt32ScaleKernel
613 - @ref CPPBoxWithNonMaximaSuppressionLimit
614 - @ref CPPDetectionPostProcessLayer
615 - @ref CPPPermuteKernel
616 - @ref CPPPermute
617 - @ref CPPTopKVKernel
618 - @ref CPPTopKV
Sheri Zhang359c48e2020-04-30 22:53:39 +0100619 - @ref CPPUpsample
620 - @ref CPPUpsampleKernel
Sheri Zhang31b49ca2020-04-24 11:15:10 +0100621 - New OpenCL kernels / functions:
622 - @ref CLQLSTMLayer
623 - @ref CLQLSTMLayerNormalizationKernel
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000624 - New Arm® Neon™ kernels / functions:
Sheri Zhang31b49ca2020-04-24 11:15:10 +0100625 - @ref NEQLSTMLayer
626 - @ref NEQLSTMLayerNormalizationKernel
627 - Added HARD_SWISH support in:
Georgios Pinitasf47f7182021-01-15 09:29:50 +0000628 - CLActivationLayerKernel
Michele Di Giorgiobd2c8e12021-01-19 15:29:02 +0000629 - NEActivationLayerKernel
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000630 - Deprecated OpenCL kernels / functions:
631 - CLGEMMLowpQuantizeDownInt32ToUint8Scale
632 - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000633 - Deprecated Arm® Neon™ kernels / functions:
Sheri Zhang0f2522b2020-03-25 16:38:19 +0000634 - NEGEMMLowpQuantizeDownInt32ToUint8Scale
635 - Removed CPP kernels / functions:
636 - CPPFlipWeightsKernel
Manuel Bottini387259a2020-05-21 17:14:36 +0100637 - Removed PoolingLayerInfo constructors without Data Layout.
638 - Removed CLDepthwiseConvolutionLayer3x3
639 - Removed NEDepthwiseConvolutionLayerOptimized
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000640 - Added support for Winograd 3x3,4x4 on Arm® Neon™ FP16:
Manuel Bottini075253a2020-05-22 12:57:18 +0100641 - @ref NEWinogradConvolutionLayer
642 - @ref NEWinogradLayerTransformInputKernel
643 - @ref NEWinogradLayerTransformOutputKernel
644 - @ref NEWinogradLayerTransformWeightsKernel
645 - Added CLCompileContext
Michele Di Giorgio33f41fa2021-03-09 14:09:08 +0000646 - Added Arm® Neon™ GEMM kernel with 2D window support
Georgios Pinitasc7b183a2020-03-06 18:12:09 +0000647
v20.02.1 Maintenance release
649 - Added Android-NN build script.
650
v20.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Added new data type QASYMM8_SIGNED support for:
   - @ref CLDepthwiseConvolutionLayer
   - CLDepthwiseConvolutionLayer3x3
   - @ref CLGEMMConvolutionLayer
   - @ref CLGEMMLowpMatrixMultiplyCore
   - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - @ref CLGEMMLowpMatrixMultiplyNativeKernel
   - @ref NEActivationLayer
   - NEComparisonOperationKernel
   - @ref NEConvolutionLayer
   - @ref NEDepthwiseConvolutionLayer
   - NEDepthwiseConvolutionLayer3x3Kernel
   - NEDirectConvolutionLayerOutputStageKernel
   - @ref NEElementwiseComparison
   - @ref NEElementwiseMax
   - @ref NEElementwiseMin
   - @ref NEElementwiseSquaredDiff
   - @ref NEFullyConnectedLayer
   - NEGEMMMatrixVectorMultiplyKernel
   - @ref NEPixelWiseMultiplication
   - @ref NEPoolingLayer
   - @ref NEPReluLayer
 - Added support for QSYMM8_PER_CHANNEL in:
   - NEDepthwiseConvolutionLayer3x3Kernel
 - Added support for split sizes in:
   - @ref CLSplit
   - @ref NESplit
 - New OpenCL kernels / functions:
   - @ref CLFill
   - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / @ref CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
 - New Arm® Neon™ kernels / functions:
   - @ref NEFill
   - @ref NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / @ref NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
 - Deprecated Arm® Neon™ functions / interfaces:
   - CLDepthwiseConvolutionLayer3x3
   - NEDepthwiseConvolutionLayerOptimized
   - PoolingLayerInfo constructors without Data Layout.
 - Added support for quantization with multiplier greater than 1 on Arm® Neon™ and CL.
 - Added support for quantized inputs of type QASYMM8_SIGNED and QASYMM8 to @ref CLQuantizationLayer.
 - Added the ability to build bootcode for bare metal.
 - Added support for generating synthetic QASYMM8 graphs.
 - Added support for F16 datatype in VGG16.
 - Removed pre-built binaries for GLES.

v19.11.1 Public maintenance release
 - Fix offset calculation in NEReductionOperationKernel.
 - Fix data layout in NEScaleKernel for NHWC.
 - Retain configuration step data layout to avoid side-effects.
 - Perform sqrt in double domain for L2 pooling.
 - Fix output shape calculation for Reduce Mean.
 - Restrict cases where optimized NEPadLayer runs.

v19.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Updated recommended NDK version to r17c.
 - Deprecated OpenCL kernels / functions:
   - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel
   - CLDepthwiseIm2ColKernel
   - CLDepthwiseSeparableConvolutionLayer
   - CLDepthwiseVectorToTensorKernel
   - CLDirectConvolutionLayerOutputStageKernel
 - Deprecated Arm® Neon™ kernels / functions:
   - NEDepthwiseWeightsReshapeKernel
   - NEDepthwiseIm2ColKernel
   - NEDepthwiseSeparableConvolutionLayer
   - NEDepthwiseVectorToTensorKernel
   - NEDepthwiseConvolutionLayer3x3
 - New OpenCL kernels / functions:
   - @ref CLInstanceNormalizationLayerKernel / @ref CLInstanceNormalizationLayer
   - @ref CLDepthwiseConvolutionLayerNativeKernel to replace the old generic depthwise convolution (see Deprecated
     OpenCL kernels / functions)
   - @ref CLLogSoftmaxLayer
 - New Arm® Neon™ kernels / functions:
   - @ref NEBoundingBoxTransformKernel / @ref NEBoundingBoxTransform
   - @ref NEComputeAllAnchorsKernel / NEComputeAllAnchors
   - @ref NEDetectionPostProcessLayer
   - @ref NEGenerateProposalsLayer
   - @ref NEInstanceNormalizationLayerKernel / @ref NEInstanceNormalizationLayer
   - @ref NELogSoftmaxLayer
   - @ref NEROIAlignLayerKernel / @ref NEROIAlignLayer
 - Added QASYMM8 support for:
   - @ref CLGenerateProposalsLayer
   - @ref CLROIAlignLayer
   - @ref CPPBoxWithNonMaximaSuppressionLimit
 - Added QASYMM16 support for:
   - @ref CLBoundingBoxTransform
 - Added FP16 support for:
   - CLGEMMMatrixMultiplyReshapedKernel
 - Added new data type QASYMM8_PER_CHANNEL support for:
   - CLDequantizationLayer
   - @ref NEDequantizationLayer
 - Added new data type QSYMM8_PER_CHANNEL support for:
   - @ref CLConvolutionLayer
   - @ref NEConvolutionLayer
   - @ref CLDepthwiseConvolutionLayer
   - @ref NEDepthwiseConvolutionLayer
 - Added FP16 mixed-precision support for:
   - CLGEMMMatrixMultiplyReshapedKernel
   - CLPoolingLayerKernel
 - Added FP32 and FP16 ELU activation for:
   - @ref CLActivationLayer
   - @ref NEActivationLayer
 - Added asymmetric padding support for:
   - @ref CLDirectDeconvolutionLayer
   - @ref CLGEMMDeconvolutionLayer
   - @ref NEDeconvolutionLayer
 - Added SYMMETRIC and REFLECT modes for @ref CLPadLayerKernel / @ref CLPadLayer.
 - Replaced the calls to NECopyKernel and NEMemsetKernel with @ref NEPadLayer in @ref NEGenerateProposalsLayer.
 - Replaced the calls to CLCopyKernel and CLMemsetKernel with @ref CLPadLayer in @ref CLGenerateProposalsLayer.
 - Improved performance for CL Inception V3 - FP16.
 - Improved accuracy for CL Inception V3 - FP16 by enabling FP32 accumulator (mixed-precision).
 - Improved Arm® Neon™ performance by enabling fusing batch normalization with convolution and depth-wise convolution layer.
 - Improved Arm® Neon™ performance for MobileNet-SSD by improving the output detection performance.
 - Optimized @ref CLPadLayer.
 - Optimized CL generic depthwise convolution layer by introducing @ref CLDepthwiseConvolutionLayerNativeKernel.
 - Reduced memory consumption by implementing weights sharing.

v19.08.1 Public maintenance release
 - Fix offset calculation in NEReductionOperationKernel.
 - Fix data layout in NEScaleKernel for NHWC.
 - Retain configuration step data layout to avoid side-effects.
 - Perform sqrt in double domain for L2 pooling.
 - Fix output shape calculation for Reduce Mean.
 - Fix broadcast CLPixelwiseMultiplication with 5D tensors.

v19.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Deprecated Arm® Neon™ functions:
   - NEDepthConcatenateLayer
   - NEWidthConcatenateLayer
 - Deprecated OpenCL kernels / functions:
   - CLDepthConcatenateLayer
   - CLGEMMInterleave4x4Kernel / CLGEMMInterleave4x4
   - CLGEMMTranspose1xWKernel / CLGEMMTranspose1xW
   - CLWidthConcatenateLayer
 - New Arm® Neon™ kernels / functions:
   - @ref NEAbsLayer
   - @ref NECast
   - @ref NEElementwisePower
   - @ref NELogLayer
   - @ref NELSTMLayerQuantized
   - @ref NENegLayer
   - @ref NEPReluLayer
   - @ref NESinLayer
   - NEBatchConcatenateLayerKernel
   - @ref NEDepthToSpaceLayerKernel / @ref NEDepthToSpaceLayer
   - NEDepthwiseConvolutionLayerNativeKernel
   - @ref NEGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - @ref NEMeanStdDevNormalizationKernel / @ref NEMeanStdDevNormalizationLayer
   - @ref NESpaceToDepthLayerKernel / @ref NESpaceToDepthLayer
 - New OpenCL kernels / functions:
   - @ref CLAbsLayer
   - @ref CLElementwisePower
   - @ref CLLogLayer
   - @ref CLLSTMLayerQuantized
   - @ref CLNegLayer
   - @ref CLPReluLayer
   - @ref CLSinLayer
   - CLBatchConcatenateLayerKernel
   - @ref CLDepthToSpaceLayerKernel / @ref CLDepthToSpaceLayer
   - CLGEMMLowpMatrixMultiplyNativeKernel
   - CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
   - CLGEMMMatrixMultiplyNativeKernel
   - CLMeanStdDevNormalizationKernel / CLMeanStdDevNormalizationLayer
   - @ref CLSpaceToDepthLayerKernel / @ref CLSpaceToDepthLayer
 - New examples:
   - neon_opticalflow
   - cl_cache
   - neon_permute
 - Added support for FP16 in @ref NEDeconvolutionLayer
 - Added support for FP16 in @ref CLDeconvolutionLayer
 - Added support for REDUCE_MIN and REDUCE_MAX in @ref ReductionOperation
 - Enabled the fusion of batch normalization with convolution and depthwise convolution layer for FP32 in the graph API (OpenCL only)
 - Added support for fusing activation function and broadcast addition with the matrix multiplication for FP32 (OpenCL only)
 - Refactored the depthwise convolution layer kernel on Arm® Neon™ for generic cases
 - Added an optimized depthwise convolution layer kernel for 5x5 filters (Neon only)
 - Added support for enabling the OpenCL kernel cache. Added an example showing how to load the prebuilt OpenCL kernels from a binary cache file
 - Altered @ref QuantizationInfo interface to support per-channel quantization.
 - The CLDepthwiseConvolutionLayer3x3 will be included by @ref CLDepthwiseConvolutionLayer to accommodate future optimizations.
 - The NEDepthwiseConvolutionLayerOptimized will be included by @ref NEDepthwiseConvolutionLayer to accommodate future optimizations.
 - Removed inner_border_right and inner_border_top parameters from @ref CLDeconvolutionLayer interface
 - Removed inner_border_right and inner_border_top parameters from @ref NEDeconvolutionLayer interface
 - Optimized the Arm® Neon™ assembly kernel for GEMMLowp. The new implementation fuses the output stage and quantization with the matrix multiplication kernel

v19.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
   - @ref NEBatchToSpaceLayerKernel / @ref NEBatchToSpaceLayer
   - NEComplexPixelWiseMultiplicationKernel / @ref NEComplexPixelWiseMultiplication
   - @ref NECropKernel / @ref NECropResize
   - NEDepthwiseConvolutionAssemblyDispatch
   - @ref NEFFTDigitReverseKernel
   - @ref NEFFTRadixStageKernel
   - @ref NEFFTScaleKernel
   - @ref NEGEMMLowpOffsetContributionOutputStageKernel
   - NEHeightConcatenateLayerKernel
   - @ref NESpaceToBatchLayerKernel / @ref NESpaceToBatchLayer
   - @ref NEFFT1D
   - @ref NEFFT2D
   - @ref NEFFTConvolutionLayer
 - New OpenCL kernels / functions:
   - CLComplexPixelWiseMultiplicationKernel / @ref CLComplexPixelWiseMultiplication
   - CLCropKernel / @ref CLCropResize
   - @ref CLDeconvolutionReshapeOutputKernel
   - @ref CLFFTDigitReverseKernel
   - @ref CLFFTRadixStageKernel
   - @ref CLFFTScaleKernel
   - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
   - CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
   - CLHeightConcatenateLayerKernel
   - @ref CLDirectDeconvolutionLayer
   - @ref CLFFT1D
   - @ref CLFFT2D
   - @ref CLFFTConvolutionLayer
   - @ref CLGEMMDeconvolutionLayer
 - New OpenGLES kernels / functions:
   - GCConcatenateLayer
 - Deprecated functions / interfaces:
   - GCDepthConcatenateLayer
   - NEWidthConcatenateLayer
   - NEDepthConcatenateLayer
   - CLWidthConcatenateLayer
   - CLDepthConcatenateLayer
   - CLGEMMInterleave4x4
   - CLGEMMTranspose1xW
 - Support different quantization info in CLConcatLayer.
 - Add checks for cases where different input/output quantization info is not supported.
 - Tensors have different quantization information.
 - Add FP16 support checks.
 - Fix output quantization in CLDepthwiseConv3x3 when activation is fused.
 - New graph examples:
   - graph_convolution
   - graph_fully_connected
   - graph_depthwise_convolution
   - Deepspeech v0.4.1
 - Add support for QASYMM8 in NEArithmeticSubtractionKernel.
 - Add support for QASYMM8 in NEPixelWiseMultiplicationKernel.
 - Add support for QASYMM8 in NEDeconvolution.
 - Add support for DequantizationLayer for Neon/CL.
 - Add support for dilation in CLDepthwiseConvolution.
 - Fuse offset contribution with the output stage when we use NEGEMMLowpMatrixMultiplyCore.
 - Optimize CLDeconvolution.
 - Add StackLayer to the graph API.
 - Add support for "reflect" padding mode in NEPad.
 - Winograd 7x7 NHWC on OpenCL.
 - Rework CL ML layers to run exclusively on CL.
 - Support different quantization info in PoolingLayer.
 - Implement and test import memory interfaces.
 - Added new tests and removed old ones.
 - Various clang-tidy fixes.

v19.02 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
   - @ref NETileKernel / @ref NETile
   - @ref NEFuseBatchNormalizationKernel / @ref NEFuseBatchNormalization
   - NEElementwiseOperationKernel
   - @ref NEElementwiseMax
   - @ref NEElementwiseMin
   - @ref NEElementwiseSquaredDiff
   - @ref NESelectKernel / @ref NESelect
   - @ref NESplit
   - @ref NESlice
   - @ref NEUnstack
   - @ref NEStridedSliceKernel / @ref NEStridedSlice
   - NEElementwiseUnaryKernel
   - @ref NERsqrtLayer
   - @ref NEExpLayer
   - @ref NEReverseKernel / @ref NEReverse
   - @ref NEArgMinMaxLayer
   - @ref NEStackLayerKernel / @ref NEStackLayer
   - @ref NERangeKernel / @ref NERange
   - @ref NEPadLayer
   - NEMemsetKernel
   - @ref NEGatherKernel / @ref NEGather
   - @ref NEElementwiseComparison
   - @ref NEElementwiseComparisonStatic
   - NEComparisonOperationKernel
   - @ref NEElementwiseDivision
 - New OpenCL kernels / functions:
   - @ref CLSelectKernel / @ref CLSelect
   - @ref CLTileKernel / @ref CLTile
   - @ref CLComparisonKernel / @ref CLComparison
   - @ref CLArgMinMaxLayer
   - @ref CLElementwiseMax
   - @ref CLElementwiseMin
   - @ref CLElementwiseSquaredDiff
   - @ref CLStackLayerKernel / @ref CLStackLayer
   - @ref CLReverse / @ref CLReverseKernel
   - @ref CLRsqrtLayer
   - @ref CLExpLayer
   - CLElementWiseUnaryLayerKernel
   - CLGEMMReshapeLHSMatrixKernel
   - CLGEMMReshapeRHSMatrixKernel
   - CLGEMMMatrixMultiplyReshapedKernel
   - @ref CLRangeKernel / @ref CLRange
   - @ref CLUnstack
   - @ref CLGatherKernel / @ref CLGather
   - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
 - New CPP kernels / functions:
   - @ref CPPDetectionOutputLayer
   - @ref CPPTopKV / @ref CPPTopKVKernel
 - Added new examples:
   - graph_ssd_mobilenet.cpp
   - graph_mobilenet_v2.cpp
   - graph_resnet12.cpp
   - graph_srcnn955.cpp
   - graph_vgg_vdsr.cpp
   - graph_inception_resnet_v1.cpp
 - Added 4D tensor support to:
   - @ref NESoftmaxLayer
 - Fused activation in @ref CLWinogradConvolutionLayer
 - Extended @ref NEPermute to support more cases
 - Added Neon/SVE GEMM Hybrid kernels
 - Added u8 and s8 hybrid assembly kernels
 - Introduced GEMM strategy name in NEGEMMAssemblyWrapper
 - Improved @ref CLTuner
 - Fused the bias addition within @ref CLGEMM
 - Added support for QASYMM8 LOGISTIC activation in @ref NEActivationLayer
 - Added NHWC data layout support to:
   - @ref NEScale for F16
   - @ref CLNormalizationLayer IN_MAP_2D for FP32/FP16
   - @ref NEL2NormalizeLayer for FP32/FP16
   - @ref NENormalizationLayer IN_MAP_2D for FP32/FP16
   - @ref CLROIAlignLayer
   - @ref CLGenerateProposalsLayer
 - Added QASYMM8 support to the following kernels:
   - NEArithmeticAdditionKernel
   - @ref NEScale
 - Added new tests and improved validation and benchmarking suites.
 - Deprecated functions / interfaces:
   - Usage of inner_border_right and inner_border_top has been deprecated in @ref CLDeconvolutionLayer and @ref NEDeconvolutionLayer

v18.11 Public major release
 - Various bug fixes.
 - Various optimisations.
 - New Arm® Neon™ kernels / functions:
   - @ref NEChannelShuffleLayer / @ref NEChannelShuffleLayerKernel
   - @ref NEReduceMean
   - @ref NEReorgLayer / @ref NEReorgLayerKernel
   - @ref NEPriorBoxLayer / @ref NEPriorBoxLayerKernel
   - NEUpsampleLayer / NEUpsampleLayerKernel
   - NEYOLOLayer / NEYOLOLayerKernel
 - New OpenCL kernels / functions:
   - @ref CLBatchToSpaceLayer / @ref CLBatchToSpaceLayerKernel
   - @ref CLBoundingBoxTransform / @ref CLBoundingBoxTransformKernel
   - @ref CLComputeAllAnchorsKernel
   - @ref CLGenerateProposalsLayer
   - @ref CLNormalizePlanarYUVLayer / @ref CLNormalizePlanarYUVLayerKernel
   - @ref CLReorgLayer / @ref CLReorgLayerKernel
   - @ref CLSpaceToBatchLayer / @ref CLSpaceToBatchLayerKernel
   - @ref CLPadLayer
   - @ref CLReduceMean
   - @ref CLPriorBoxLayer / @ref CLPriorBoxLayerKernel
   - @ref CLROIAlignLayer / @ref CLROIAlignLayerKernel
   - @ref CLSlice
   - @ref CLSplit
   - @ref CLStridedSlice / @ref CLStridedSliceKernel
   - CLUpsampleLayer / CLUpsampleLayerKernel
   - CLYOLOLayer / CLYOLOLayerKernel
 - New CPP kernels / functions:
   - @ref CPPBoxWithNonMaximaSuppressionLimit / @ref CPPBoxWithNonMaximaSuppressionLimitKernel
 - Added the validate method in:
   - @ref NEDepthConvertLayer
   - @ref NEFloor / @ref CLFloor
   - @ref NEGEMMMatrixAdditionKernel
   - @ref NEReshapeLayer / @ref CLReshapeLayer
   - @ref CLScale
 - Added new examples:
   - graph_shufflenet.cpp
   - graph_yolov3.cpp
 - Added documentation on how to add a new function or kernel.
 - Improved doxygen documentation by adding a list of the existing functions.
 - Added 4D tensor support to:
   - CLWidthConcatenateLayer
   - CLFlattenLayer
   - @ref CLSoftmaxLayer
 - Added dot product support for @ref CLDepthwiseConvolutionLayer3x3NHWCKernel non-unit stride
 - Added SVE support
 - Fused batch normalization into convolution layer weights in @ref CLFuseBatchNormalization
 - Fused activation in @ref CLDepthwiseConvolutionLayer3x3NCHWKernel, @ref CLDepthwiseConvolutionLayer3x3NHWCKernel and @ref NEGEMMConvolutionLayer
 - Added NHWC data layout support to:
   - @ref CLChannelShuffleLayer
   - @ref CLDeconvolutionLayer
   - @ref CLL2NormalizeLayer
 - Added QASYMM8 support to the following kernels:
   - CLScaleKernel
   - NEDepthwiseConvolutionLayer3x3Kernel
   - CLPixelWiseMultiplicationKernel
 - Added FP16 support to the following kernels:
   - @ref CLDepthwiseConvolutionLayer3x3NHWCKernel
   - NEDepthwiseConvolutionLayer3x3Kernel
   - @ref CLNormalizePlanarYUVLayerKernel
   - @ref CLWinogradConvolutionLayer (5x5 kernel)
 - More tests added to both validation and benchmarking suites.

v18.08 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Updated recommended NDK version to r17b.
 - Removed support for QS8/QS16 data types.
 - Added support for grouped convolution in @ref CLConvolutionLayer.
 - Added NHWC data layout support to:
   - NEDepthConcatenateLayer / CLDepthConcatenateLayer
   - @ref NEWinogradConvolutionLayer / @ref CLWinogradConvolutionLayer
   - @ref CLDepthwiseConvolutionLayer
   - @ref CLDirectConvolutionLayer
   - @ref CLConvolutionLayer
   - @ref CLScale
   - @ref CLIm2ColKernel
 - New Arm® Neon™ kernels / functions:
   - @ref NERNNLayer
 - New OpenCL kernels / functions:
   - @ref CLArithmeticDivision
 - Introduced prepare() stage support in the graph API for GLES.
 - Added support for memory reuse when trying to allocate smaller CLTensors.
 - Enabled NHWC execution on graph examples.
 - Added JPEG accessor for validation purposes.
 - Added validate methods to some kernels / functions.

v18.05 Public major release
 - Various bug fixes.
 - Various optimisations.
 - Major redesign in the interface for the neon kernels implemented in assembly.
   - Removed arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore / arm_compute::NEHGEMMAArch64FP16Kernel
   - Added NEGEMMAssemblyWrapper and AssemblyKernelGlue which are used to execute assembly kernels in neon functions.
   - Minor changes to the CPUInfo type to make it compatible with the new assembly gemm interface.
   - Moved neon assembly kernels to the folder src/core/Neon/kernels/arm_gemm.
 - Improved doxygen documentation.
 - Improved memory management for layer transitions.
 - Added support for NHWC data layout in tensors.
 - Added NHWC data layout support to:
   - @ref NEGEMMConvolutionLayer
   - @ref NEDirectConvolutionLayer
   - @ref NEPoolingLayer / @ref CLPoolingLayer
   - @ref NEBatchNormalizationLayer / @ref CLBatchNormalizationLayer
   - @ref NEDepthwiseConvolutionLayer
   - @ref NEScale
   - NEIm2Col
 - Added support for dilated convolutions in @ref NEConvolutionLayer and @ref CLConvolutionLayer.
 - New OpenCL kernels / functions:
   - @ref CLChannelShuffleLayer / @ref CLChannelShuffleLayerKernel
   - CLConvertFullyConnectedWeightsKernel / @ref CLConvertFullyConnectedWeights
   - @ref CLCopy / CLCopyKernel
   - @ref CLLSTMLayer
   - @ref CLRNNLayer
   - CLWidthConcatenateLayer / CLWidthConcatenateLayerKernel
   - @ref CLWinogradFilterTransformKernel / @ref CLWinogradInputTransformKernel / @ref CLWinogradConvolutionLayer
   - @ref CLWinogradInputTransformKernel / @ref CLWinogradInputTransform
 - New Arm® Neon™ kernels / functions:
   - NEConvertFullyConnectedWeightsKernel / @ref NEConvertFullyConnectedWeights.
 - Created the validate method in @ref CLDepthwiseConvolutionLayer.
 - Beta and gamma are no longer mandatory arguments in @ref NEBatchNormalizationLayer and @ref CLBatchNormalizationLayer.
 - Added depth multiplier support in @ref NEDepthwiseConvolutionLayer and @ref CLDepthwiseConvolutionLayer.
 - Added broadcast multiply support in @ref NEPixelWiseMultiplication / NEPixelWiseMultiplicationKernel.
 - Ported mobilenet example to NHWC data layout.
 - Enabled Winograd method in @ref CLConvolutionLayer.
 - Renamed NEWinogradLayer to @ref NEWinogradConvolutionLayer.
 - Updated @ref NEWinogradConvolutionLayer to use highly optimised assembly kernels in src/core/Neon/kernels/arm_gemm.
 - Added memory manager support in GLES functions.
 - Major refactoring of the graph API.
 - Added GLES backend in the graph API.
 - Added support for the memory manager in the graph API.
 - Enabled Winograd Convolution method in the graph API.
 - Added support for grouped convolutions in the graph API.
 - Replaced NEDeconvolutionLayerUpsampleKernel with NEScaleKernel in @ref NEDeconvolutionLayer.
 - Added fast maths flag in @ref CLConvolutionLayer.
 - Added new tests and benchmarks in validation and benchmark frameworks
 - Merged Activation layer with Convolution Layer (Neon, CL, GLES)
 - Added support for OpenCL 2.0 SVM
 - Added support for importing memory in OpenCL tensors.
 - Added the prepare() method to perform any one-off pre-processing before running the function.
 - Added new examples:
   - graph_inception_v4.cpp
   - graph_resnext50.cpp
 - Added memory measurement instrument for CL.

v18.03 Public maintenance release
 - Various bug fixes.
 - Fixed bug in @ref NEActivationLayer
 - Fix in @ref CLTuner when using batches.
 - Updated recommended NDK version to r16b (and fixed warnings).
 - Fixed bug in validation code.
 - Added Inception v4 graph example.
 - Renamed NEWinogradLayer.cpp to @ref NEWinogradConvolutionLayer

v18.02 Public major release
 - Various Arm® Neon™ / OpenCL / GLES optimisations.
 - Various bug fixes.
 - Changed default number of threads on big.LITTLE systems.
 - Refactored examples and added:
    - graph_mobilenet_qasymm8
    - graph_resnet
    - graph_squeezenet_v1_1
 - Renamed @ref CLConvolutionLayer into @ref CLGEMMConvolutionLayer and created a new @ref CLConvolutionLayer to select the fastest convolution method.
 - Renamed @ref NEConvolutionLayer into @ref NEGEMMConvolutionLayer and created a new @ref NEConvolutionLayer to select the fastest convolution method.
 - Added in place support to:
    - @ref CLActivationLayer
    - @ref CLBatchNormalizationLayer
 - Added QASYMM8 support to:
    - @ref CLActivationLayer
    - @ref CLDepthwiseConvolutionLayer
    - @ref NEDepthwiseConvolutionLayer
    - @ref NESoftmaxLayer
 - Added FP16 support to:
    - CLDepthwiseConvolutionLayer3x3
    - @ref CLDepthwiseConvolutionLayer
 - Added broadcasting support to NEArithmeticAddition / @ref CLArithmeticAddition / @ref CLPixelWiseMultiplication
 - Added fused batch normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer
 - Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer
 - New OpenCL kernels / functions:
    - CLDirectConvolutionLayerOutputStageKernel
 - New Arm® Neon™ kernels / functions
    - Added name() method to all kernels.
    - Added support for Winograd 5x5.
    - NEPermuteKernel / @ref NEPermute
    - @ref NEWinogradLayerTransformInputKernel / NEWinogradLayer
    - @ref NEWinogradLayerTransformOutputKernel / NEWinogradLayer
    - @ref NEWinogradLayerTransformWeightsKernel / NEWinogradLayer
    - Renamed NEWinogradLayerKernel into NEWinogradLayerBatchedGEMMKernel
 - New GLES kernels / functions:
    - GCTensorShiftKernel / GCTensorShift

v18.01 Public maintenance release
 - Various bug fixes
 - Added some of the missing validate() methods
 - Added @ref CLDeconvolutionLayerUpsampleKernel / @ref CLDeconvolutionLayer @ref CLDeconvolutionLayerUpsample
 - Added CLPermuteKernel / @ref CLPermute
 - Added method to clean the programs cache in the CL Kernel library.
 - Added GCArithmeticAdditionKernel / GCArithmeticAddition
 - Added GCDepthwiseConvolutionLayer3x3Kernel / GCDepthwiseConvolutionLayer3x3
 - Added GCNormalizePlanarYUVLayerKernel / GCNormalizePlanarYUVLayer
 - Added GCScaleKernel / GCScale
 - Added GCWeightsReshapeKernel / GCConvolutionLayer
 - Added FP16 support to the following GLES compute kernels:
    - GCCol2ImKernel
    - GCGEMMInterleave4x4Kernel
    - GCGEMMTranspose1xWKernel
    - GCIm2ColKernel
 - Refactored Arm® Neon™ Winograd (NEWinogradLayerKernel)
 - Added NEDirectConvolutionLayerOutputStageKernel
 - Added QASYMM8 support to the following Arm® Neon™ kernels:
    - NEDepthwiseConvolutionLayer3x3Kernel
    - @ref NEFillBorderKernel
    - NEPoolingLayerKernel
 - Added new examples:
    - graph_cl_mobilenet_qasymm8.cpp
    - graph_inception_v3.cpp
    - gc_dc.cpp
 - More tests added to both validation and benchmarking suites.

v17.12 Public major release
 - Most machine learning functions on OpenCL support the new data type QASYMM8
 - Introduced logging interface
 - Introduced OpenCL timer
 - Reworked GEMMLowp interface
 - Added new Arm® Neon™ assembly kernels for GEMMLowp, SGEMM and HGEMM
 - Added validation method for most Machine Learning kernels / functions
 - Added new graph examples such as googlenet, mobilenet, squeezenet, vgg16 and vgg19
 - Added sgemm example for OpenCL
 - Added absolute difference example for GLES compute
 - Added new tests and benchmarks in validation and benchmark frameworks
 - Added new kernels / functions for GLES compute

 - New OpenGL ES kernels / functions
    - GCAbsoluteDifferenceKernel / GCAbsoluteDifference
    - GCActivationLayerKernel / GCActivationLayer
    - GCBatchNormalizationLayerKernel / GCBatchNormalizationLayer
    - GCCol2ImKernel
    - GCDepthConcatenateLayerKernel / GCDepthConcatenateLayer
    - GCDirectConvolutionLayerKernel / GCDirectConvolutionLayer
    - GCDropoutLayerKernel / GCDropoutLayer
    - GCFillBorderKernel / GCFillBorder
    - GCGEMMInterleave4x4Kernel / GCGEMMInterleave4x4
    - GCGEMMMatrixAccumulateBiasesKernel / GCGEMMMatrixAdditionKernel / GCGEMMMatrixMultiplyKernel / GCGEMM
    - GCGEMMTranspose1xWKernel / GCGEMMTranspose1xW
    - GCIm2ColKernel
    - GCNormalizationLayerKernel / GCNormalizationLayer
    - GCPixelWiseMultiplicationKernel / GCPixelWiseMultiplication
    - GCPoolingLayerKernel / GCPoolingLayer
    - GCLogits1DMaxKernel / GCLogits1DShiftExpSumKernel / GCLogits1DNormKernel / GCSoftmaxLayer
    - GCTransposeKernel / GCTranspose

 - New Arm® Neon™ kernels / functions
    - arm_compute::NEGEMMLowpAArch64A53Kernel / arm_compute::NEGEMMLowpAArch64Kernel / arm_compute::NEGEMMLowpAArch64V8P4Kernel / arm_compute::NEGEMMInterleavedBlockedKernel / arm_compute::NEGEMMLowpAssemblyMatrixMultiplyCore
    - arm_compute::NEHGEMMAArch64FP16Kernel
    - NEDepthwiseConvolutionLayer3x3Kernel / NEDepthwiseIm2ColKernel / NEGEMMMatrixVectorMultiplyKernel / NEDepthwiseVectorToTensorKernel / @ref NEDepthwiseConvolutionLayer
    - @ref NEGEMMLowpOffsetContributionKernel / @ref NEGEMMLowpMatrixAReductionKernel / @ref NEGEMMLowpMatrixBReductionKernel / @ref NEGEMMLowpMatrixMultiplyCore
    - @ref NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / @ref NEGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
    - NEWinogradLayer / NEWinogradLayerKernel

 - New OpenCL kernels / functions
    - @ref CLGEMMLowpOffsetContributionKernel / @ref CLGEMMLowpMatrixAReductionKernel / @ref CLGEMMLowpMatrixBReductionKernel / @ref CLGEMMLowpMatrixMultiplyCore
    - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / @ref CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint

 - New graph nodes for Arm® Neon™ and OpenCL
    - graph::BranchLayer
    - graph::DepthConvertLayer
    - graph::DepthwiseConvolutionLayer
    - graph::DequantizationLayer
    - graph::FlattenLayer
    - graph::QuantizationLayer
    - graph::ReshapeLayer

v17.10 Public maintenance release
 - Bug fixes:
    - Check the maximum local workgroup size supported by OpenCL devices
    - Minor documentation updates (Fixed instructions to build the examples)
    - Introduced a graph::GraphContext
    - Added a few new Graph nodes, support for branches and grouping.
    - Automatically enable cl_printf in debug builds
    - Fixed bare metal builds for armv7a
    - Added AlexNet and cartoon effect examples
    - Fixed library builds: libraries are no longer built as supersets of each other. (Applications using the Runtime part of the library now need to link against both libarm_compute_core and libarm_compute.)

v17.09 Public major release
 - Experimental Graph support: initial implementation of a simple stream API to easily chain machine learning layers.
 - Memory Manager (@ref BlobLifetimeManager, @ref BlobMemoryPool, @ref ILifetimeManager, @ref IMemoryGroup, @ref IMemoryManager, @ref IMemoryPool, @ref IPoolManager, @ref MemoryManagerOnDemand, @ref PoolManager)
 - New validation and benchmark frameworks (Boost and Google frameworks replaced by homemade framework).
 - Most machine learning functions support both fixed point 8 and 16 bit (QS8, QS16) for both Arm® Neon™ and OpenCL.
 - New Arm® Neon™ kernels / functions:
    - arm_compute::NEGEMMAssemblyBaseKernel arm_compute::NEGEMMAArch64Kernel
    - NEDequantizationLayerKernel / @ref NEDequantizationLayer
    - NEFloorKernel / @ref NEFloor
    - @ref NEL2NormalizeLayerKernel / @ref NEL2NormalizeLayer
    - NEQuantizationLayerKernel @ref NEMinMaxLayerKernel / @ref NEQuantizationLayer
    - @ref NEROIPoolingLayerKernel / @ref NEROIPoolingLayer
    - @ref NEReductionOperationKernel / @ref NEReductionOperation
    - NEReshapeLayerKernel / @ref NEReshapeLayer

 - New OpenCL kernels / functions:
    - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
    - CLDequantizationLayerKernel / CLDequantizationLayer
    - CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
    - CLFlattenLayer
    - CLFloorKernel / @ref CLFloor
    - CLGEMMTranspose1xW
    - CLGEMMMatrixVectorMultiplyKernel
    - @ref CLL2NormalizeLayerKernel / @ref CLL2NormalizeLayer
    - CLQuantizationLayerKernel @ref CLMinMaxLayerKernel / @ref CLQuantizationLayer
    - @ref CLROIPoolingLayerKernel / @ref CLROIPoolingLayer
    - @ref CLReductionOperationKernel / @ref CLReductionOperation
    - CLReshapeLayerKernel / @ref CLReshapeLayer

v17.06 Public major release
 - Various bug fixes
 - Added support for fixed point 8 bit (QS8) to the various Arm® Neon™ machine learning kernels.
 - Added unit tests and benchmarks (AlexNet, LeNet)
 - Added support for sub tensors.
 - Added infrastructure to provide GPU specific optimisation for some OpenCL kernels.
 - Added @ref OMPScheduler (OpenMP) scheduler for Neon
 - Added @ref SingleThreadScheduler scheduler for Arm® Neon™ (For bare metal)
 - Users can specify their own scheduler by implementing the @ref IScheduler interface.
 - New OpenCL kernels / functions:
    - @ref CLBatchNormalizationLayerKernel / @ref CLBatchNormalizationLayer
    - CLDepthConcatenateLayerKernel / CLDepthConcatenateLayer
    - CLHOGOrientationBinningKernel CLHOGBlockNormalizationKernel, CLHOGDetectorKernel / CLHOGDescriptor CLHOGDetector CLHOGGradient CLHOGMultiDetection
    - CLLocallyConnectedMatrixMultiplyKernel / CLLocallyConnectedLayer
    - @ref CLWeightsReshapeKernel / @ref CLConvolutionLayerReshapeWeights
 - New C++ kernels:
    - CPPDetectionWindowNonMaximaSuppressionKernel
 - New Arm® Neon™ kernels / functions:
    - @ref NEBatchNormalizationLayerKernel / @ref NEBatchNormalizationLayer
    - NEDepthConcatenateLayerKernel / NEDepthConcatenateLayer
    - NEDirectConvolutionLayerKernel / @ref NEDirectConvolutionLayer
    - NELocallyConnectedMatrixMultiplyKernel / NELocallyConnectedLayer
    - @ref NEWeightsReshapeKernel / @ref NEConvolutionLayerReshapeWeights

v17.05 Public bug fixes release
 - Various bug fixes
 - Remaining functions ported to use accurate padding.
 - Library does not link against OpenCL anymore (It uses dlopen / dlsym at runtime instead to determine whether or not OpenCL is available).
 - Added "free" method to allocator.
 - Minimum version of g++ required for armv7 Linux changed from 4.8 to 4.9

v17.04 Public bug fixes release

 The following functions have been ported to use the new accurate padding:
 - CLColorConvertKernel
 - CLEdgeNonMaxSuppressionKernel
 - CLEdgeTraceKernel
 - CLGaussianPyramidHorKernel
 - CLGaussianPyramidVertKernel
 - CLGradientKernel
 - NEChannelCombineKernel
 - NEFillArrayKernel
 - NEGaussianPyramidHorKernel
 - NEGaussianPyramidVertKernel
 - NEHarrisScoreFP16Kernel
 - NEHarrisScoreKernel
 - NEHOGDetectorKernel
 - NELogits1DMaxKernel
 - NELogits1DShiftExpSumKernel
 - NELogits1DNormKernel
 - NENonMaximaSuppression3x3FP16Kernel
 - NENonMaximaSuppression3x3Kernel

v17.03.1 First Major public release of the sources
 - Renamed the library to arm_compute
 - New CPP target introduced for C++ kernels shared between Arm® Neon™ and CL functions.
 - New padding calculation interface introduced and ported most kernels / functions to use it.
 - New OpenCL kernels / functions:
    - CLGEMMLowpMatrixMultiplyKernel / CLGEMMLowp
 - New Arm® Neon™ kernels / functions:
    - @ref NENormalizationLayerKernel / @ref NENormalizationLayer
    - NETransposeKernel / @ref NETranspose
    - NELogits1DMaxKernel, NELogits1DShiftExpSumKernel, NELogits1DNormKernel / @ref NESoftmaxLayer
    - @ref NEIm2ColKernel, @ref NECol2ImKernel, NEConvolutionLayerWeightsReshapeKernel / @ref NEConvolutionLayer
    - NEGEMMMatrixAccumulateBiasesKernel / @ref NEFullyConnectedLayer
    - @ref NEGEMMLowpMatrixMultiplyKernel / NEGEMMLowp

v17.03 Sources preview
 - New OpenCL kernels / functions:
    - CLGradientKernel, CLEdgeNonMaxSuppressionKernel, CLEdgeTraceKernel / CLCannyEdge
    - GEMM refactoring + FP16 support: CLGEMMInterleave4x4Kernel, CLGEMMTranspose1xWKernel, CLGEMMMatrixMultiplyKernel, CLGEMMMatrixAdditionKernel / @ref CLGEMM
    - CLGEMMMatrixAccumulateBiasesKernel / @ref CLFullyConnectedLayer
    - CLTransposeKernel / @ref CLTranspose
    - CLLKTrackerInitKernel, CLLKTrackerStage0Kernel, CLLKTrackerStage1Kernel, CLLKTrackerFinalizeKernel / CLOpticalFlow
    - @ref CLNormalizationLayerKernel / @ref CLNormalizationLayer
    - CLLaplacianPyramid, CLLaplacianReconstruct
 - New Arm® Neon™ kernels / functions:
    - NEActivationLayerKernel / @ref NEActivationLayer
    - GEMM refactoring + FP16 support (Requires armv8.2 CPU): @ref NEGEMMInterleave4x4Kernel, @ref NEGEMMTranspose1xWKernel, @ref NEGEMMMatrixMultiplyKernel, @ref NEGEMMMatrixAdditionKernel / @ref NEGEMM
    - NEPoolingLayerKernel / @ref NEPoolingLayer

v17.02.1 Sources preview
 - New OpenCL kernels / functions:
    - CLLogits1DMaxKernel, CLLogits1DShiftExpSumKernel, CLLogits1DNormKernel / @ref CLSoftmaxLayer
    - CLPoolingLayerKernel / @ref CLPoolingLayer
    - @ref CLIm2ColKernel, @ref CLCol2ImKernel, CLConvolutionLayerWeightsReshapeKernel / CLConvolutionLayer
    - @ref CLRemapKernel / @ref CLRemap
    - CLGaussianPyramidHorKernel, CLGaussianPyramidVertKernel / CLGaussianPyramid, CLGaussianPyramidHalf, CLGaussianPyramidOrb
    - CLMinMaxKernel, CLMinMaxLocationKernel / CLMinMaxLocation
    - CLNonLinearFilterKernel / CLNonLinearFilter
 - New Arm® Neon™ FP16 kernels (Requires armv8.2 CPU)
    - NEAccumulateWeightedFP16Kernel
    - NEBox3x3FP16Kernel
    - NENonMaximaSuppression3x3FP16Kernel

v17.02 Sources preview
 - New OpenCL kernels / functions:
    - CLActivationLayerKernel / @ref CLActivationLayer
    - CLChannelCombineKernel / CLChannelCombine
    - CLDerivativeKernel / CLChannelExtract
    - CLFastCornersKernel / CLFastCorners
    - CLMeanStdDevKernel / CLMeanStdDev
 - New Arm® Neon™ kernels / functions:
    - HOG / SVM: NEHOGOrientationBinningKernel, NEHOGBlockNormalizationKernel, NEHOGDetectorKernel, NEHOGNonMaximaSuppressionKernel / NEHOGDescriptor, NEHOGDetector, NEHOGGradient, NEHOGMultiDetection
    - NENonLinearFilterKernel / NENonLinearFilter
 - Introduced a CLScheduler to manage the default context and command queue used by the runtime library and create synchronisation events.
 - Switched all the kernels / functions to use tensors instead of images.
 - Updated documentation to include instructions to build the library from sources.

v16.12 Binary preview release
 - Original release

 */
} // namespace arm_compute