Port DepthwiseConv2d operator to Ckw

 - Only 1x1 blocks are supported, i.e. n0=1, m0=1.
 - Dilation is not supported yet.

Resolves: COMPMID-6258

Signed-off-by: ramy.elgammal@arm.com <ramy.elgammal@arm.com>
Change-Id: I1dcfd7640fb40e112736dedc81847f7b1b50dba2
Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10411
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index 04ecc80..3e04837 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -47,13 +47,13 @@
    - Add support for output data type S64 in NEArgMinMaxLayer and CLArgMinMaxLayer
    - Port the following kernels in the experimental Dynamic Fusion interface to use the new Compute Kernel Writer interface:
      - @ref experimental::dynamic_fusion::GpuCkwResize
+     - @ref experimental::dynamic_fusion::GpuCkwPool2d
+     - @ref experimental::dynamic_fusion::GpuCkwDepthwiseConv2d
    - Add support for OpenCL™ command buffer with mutable dispatch extension.
  - Update OpenCL™ API headers to v2023.04.17.
  - Remove legacy PostOps interface. PostOps was the experimental interface for kernel fusion and is replaced by the new Dynamic Fusion interface.
  - Performance optimizations:
    - Optimize @ref cpu::CpuReshape
- - Port the following kernels in the experimental Dynamic Fusion interface to use the new Compute Kernel Writer interface with support for FP16/FP32 only:
-   - @ref experimental::dynamic_fusion::GpuCkwPool2d
  - Add new OpenCL™ kernels:
    - @ref opencl::kernels::ClMatMulLowpNativeMMULKernel support for QASYMM8 and QASYMM8_SIGNED, with batch support
  - Deprecate support for Bfloat16 in @ref cpu::CpuCast.