COMPMID-2600: Implement a new and generic depthwise convolution for CL QASYMM8 NHWC
The NCHW case is supported at the function level by permuting the
inputs/outputs to NHWC.
This patch also removes the deprecated
CLDirectConvolutionLayerOutputStageKernel, along with some kernels that
were only used in the generic case of depthwise convolution.
Change-Id: I91e0f02d0a2f4a4a352e08c248e648944137fe68
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2056
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
diff --git a/docs/00_introduction.dox b/docs/00_introduction.dox
index 6e014e3..1210b2b 100644
--- a/docs/00_introduction.dox
+++ b/docs/00_introduction.dox
@@ -236,6 +236,13 @@
@subsection S2_2_changelog Changelog
+v19.11 Public major release
+ - Deprecated OpenCL kernels / functions
+ - CLDepthwiseConvolutionLayerReshapeWeightsGenericKernel
+ - CLDepthwiseIm2ColKernel
+ - CLDepthwiseVectorToTensorKernel
+ - CLDirectConvolutionLayerOutputStageKernel
+
v19.08 Public major release
- Various bug fixes.
- Various optimisations.
@@ -624,7 +631,7 @@
- Added fused batched normalization and activation to @ref CLBatchNormalizationLayer and @ref NEBatchNormalizationLayer
- Added support for non-square pooling to @ref NEPoolingLayer and @ref CLPoolingLayer
- New OpenCL kernels / functions:
- - @ref CLDirectConvolutionLayerOutputStageKernel
+ - CLDirectConvolutionLayerOutputStageKernel
- New NEON kernels / functions
- Added name() method to all kernels.
- Added support for Winograd 5x5.
@@ -746,7 +753,7 @@
- @ref NEReshapeLayerKernel / @ref NEReshapeLayer
- New OpenCL kernels / functions:
- - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel @ref CLDepthwiseIm2ColKernel @ref CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / @ref CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
+ - @ref CLDepthwiseConvolutionLayer3x3NCHWKernel @ref CLDepthwiseConvolutionLayer3x3NHWCKernel CLDepthwiseIm2ColKernel CLDepthwiseVectorToTensorKernel CLDepthwiseWeightsReshapeKernel / @ref CLDepthwiseConvolutionLayer3x3 @ref CLDepthwiseConvolutionLayer CLDepthwiseSeparableConvolutionLayer
- @ref CLDequantizationLayerKernel / @ref CLDequantizationLayer
- @ref CLDirectConvolutionLayerKernel / @ref CLDirectConvolutionLayer
- @ref CLFlattenLayer