Use the stable CKW API in the GPU dynamic fusion backend

- Refactor all kernels to work with the CKW stable API
- Add sub-tile support to the op_load/op_store CKW operators
- Fix mismatch in resize
- Add comments in all kernels written with CKW to help developers
  understand the structure of the code
- Add texture image support in depthwise convolution written with CKW
- Add support for different block sizes in depthwise convolution
- Remove the use of the dynamic fusion helper functions
- Add support for floor in CKW's op_unary()

Resolves: COMPMID-6708, COMPMID-6743, COMPMID-6530

Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>

Change-Id: I8104ce4d04a3138a1aeb0b84940e1f1c89e76069
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10914
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
diff --git a/SConscript b/SConscript
index 100bb54..f0c4297 100644
--- a/SConscript
+++ b/SConscript
@@ -137,7 +137,7 @@
 
 
 def get_ckw_obj_list():
-    cmake_obj_dir = os.path.abspath("prototype/CMakeFiles/ckw_prototype.dir/src")
+    cmake_obj_dir = os.path.abspath("CMakeFiles/ckw.dir/src")
     return recursive_glob(root_dir=cmake_obj_dir, pattern=".*.o$")
 
 
@@ -163,7 +163,7 @@
     else:
         # Always statically link Compute Library against CKW
         if env['experimental_dynamic_fusion'] and name == "arm_compute":
-            libs.append('libckw_prototype.a')
+            libs.append('libckw.a')
 
         # Add shared library versioning
         if env['set_soname']: