COMPMID-2600: Implement a new and generic depthwise convolution for CL QASYMM8 NHWC

The NCHW case is supported at function level by permuting the
inputs/outputs to NHWC.

This patch also removes CLDirectConvolutionLayerOutputStageKernel which
is deprecated and some kernels which were only used in the generic case
of depthwise convolution.

Change-Id: I91e0f02d0a2f4a4a352e08c248e648944137fe68
Signed-off-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-on: https://review.mlplatform.org/c/2056
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>

commit: a046e164b96a8441b2fa14ef578f7db46a0e97da
author: Michele Di Giorgio <michele.digiorgio@arm.com> Tue Oct 08 09:36:26 2019 +0100
committer: Michele Di Giorgio <michele.digiorgio@arm.com> Tue Oct 15 10:27:18 2019 +0000
tree: 9fa2b7e003342b608acd3ed627f47f9d027ef72c
parent: 76c996f3b240eb1f60a566e5b0a5e61fe363685a
diff --git a/src/core/CL/cl_kernels/helpers_asymm.h b/src/core/CL/cl_kernels/helpers_asymm.h
index 53e6719..57ecccc 100644
--- a/src/core/CL/cl_kernels/helpers_asymm.h
+++ b/src/core/CL/cl_kernels/helpers_asymm.h

@@ -381,11 +381,13 @@
 DEQUANTIZE_IMPL(ushort, 4)
 DEQUANTIZE_IMPL(short, 4)
 
+ASYMM_ROUNDING_DIVIDE_BY_POW2_IMPL(1)
 ASYMM_ROUNDING_DIVIDE_BY_POW2_IMPL(2)
 ASYMM_ROUNDING_DIVIDE_BY_POW2_IMPL(4)
 ASYMM_ROUNDING_DIVIDE_BY_POW2_IMPL(8)
 ASYMM_ROUNDING_DIVIDE_BY_POW2_IMPL(16)
 
+ASYMM_MULT_IMPL(1)
 ASYMM_MULT_IMPL(2)
 ASYMM_MULT_IMPL(4)
 ASYMM_MULT_IMPL(8)
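
Note on the size-1 instantiations: the hunk above only shows that scalar
variants of the two macros are now generated; the macro bodies are not part of
this diff. As a rough illustration of what a rounding divide by a power of two
typically computes in fixed-point requantization, here is a minimal scalar
sketch in plain C. The function name rounding_divide_by_pow2 is hypothetical,
and the round-to-nearest, ties-away-from-zero behaviour is an assumption
modelled on common fixed-point libraries, not something confirmed by this
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar helper, not taken from the patch.
 * Divides x by 2^exponent, rounding to the nearest integer with ties
 * rounded away from zero (assumed behaviour). */
static int32_t rounding_divide_by_pow2(int32_t x, int exponent)
{
    assert(exponent >= 0 && exponent < 31);

    /* Low `exponent` bits of x are the remainder of the division. */
    const int32_t mask      = (int32_t)((1u << exponent) - 1u);
    const int32_t remainder = x & mask;

    /* Half of the divisor, bumped by one for negative inputs so that
     * exact ties move away from zero instead of towards +infinity. */
    const int32_t threshold = (mask >> 1) + (x < 0 ? 1 : 0);

    /* Assumes >> on a negative int32_t is an arithmetic shift, which is
     * implementation-defined in C but holds on mainstream compilers. */
    return (x >> exponent) + (remainder > threshold ? 1 : 0);
}
```

With this sketch, rounding_divide_by_pow2(5, 1) gives 3 and
rounding_divide_by_pow2(-5, 1) gives -3, while an exponent of 0 returns the
input unchanged.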