Use the stable CKW API in the GPU dynamic fusion backend

- Refactor all kernels to work with the CKW stable API
- Add support for sub-tile in the op_load/op_store CKW operator
- Fix mismatch in resize
- Add comments in all kernels written with CKW to help developers
understand the structure of the code
- Add texture image support in depthwise convolution written with CKW
- Add support for different block sizes in depthwise convolution
- Remove the use of the dynamic fusion helper functions.
- Add support for floor in the op_unary() of CKW

Resolves: COMPMID-6708, COMPMID-6743, COMPMID-6530

Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>

Change-Id: I8104ce4d04a3138a1aeb0b84940e1f1c89e76069
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10914
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
diff --git a/.clang-tidy b/.clang-tidy
index e789636..b0d22e5 100644
--- a/.clang-tidy
+++ b/.clang-tidy
@@ -1,9 +1,9 @@
 ---
-Checks:          'clang-diagnostic-*,clang-analyzer-*,*,-abseil-*,-fuchsia-*,-bugprone-*,-hicpp-*,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cert-err58-cpp,-cppcoreguidelines-pro-type-reinterpret-cast,-google-runtime-references,-google-build-using-namespace,-readability-redundant-member-init,-readability-redundant-declaration,-readability-else-after-return,-performance-type-promotion-in-math-fn,-cert-err60-cpp,-cppcoreguidelines-narrowing-conversions,-readability-magic-numbers,-cppcoreguidelines-avoid-magic-numbers,-readability-named-parameter,-readability-implicit-bool-conversion,-readability-uppercase-literal-suffix,-clang-analyzer-optin.cplusplus.VirtualCall,-cppcoreguidelines-macro-usage'
+Checks:          'clang-diagnostic-*,*,-llvm-include-order,clang-analyzer-*,-abseil-*,-fuchsia-*,-bugprone-*,-hicpp-*,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cert-err58-cpp,-cppcoreguidelines-pro-type-reinterpret-cast,-google-runtime-references,-google-build-using-namespace,-readability-redundant-member-init,-readability-redundant-declaration,-readability-else-after-return,-performance-type-promotion-in-math-fn,-cert-err60-cpp,-cppcoreguidelines-narrowing-conversions,-readability-magic-numbers,-cppcoreguidelines-avoid-magic-numbers,-readability-named-parameter,-readability-implicit-bool-conversion,-readability-uppercase-literal-suffix,-clang-analyzer-optin.cplusplus.VirtualCall,-cppcoreguidelines-macro-usage'
 WarningsAsErrors: ''
 HeaderFilterRegex: ''
 AnalyzeTemporaryDtors: false
-CheckOptions:    
+CheckOptions:
   - key:             cert-dcl16-c.IgnoreMacros
     value:           '1'
   - key:             cert-dcl16-c.NewSuffixes
@@ -211,4 +211,3 @@
   - key:             zircon-temporary-objects.Names
     value:           ''
 ...
-