Integrate Dynamic Fusion patches

* Add public interfaces:
    * OperatorGraph: Describe a workload that could contain fused kernels
    * IWorkload: Generic interface for workloads built from OperatorGraph
    * ClWorkload: OpenCL workloads built from OperatorGraph
    * ClCompositeOperator: Runtime async operator to execute a ClWorkload
    * DependencyGraph (will likely be deprecated in later iterations)

* Add example:
    * cl_fused_conv2d_elementwise_add.cpp to explain how to use the new
      interfaces

* Add internal translation layer

* Refactor ClKernelBuildingAPI:
    * Remove the non-tile-based GEMM native kernel component
    * Minor interface changes

* Add integration tests

Resolves COMPMID-5161
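
Note on the Types.h hunk in this change: it adds ACL_SRC_END and
ACL_DST_END, which carry the same values as the last source and
destination ids (6 and 32), so they read as inclusive upper bounds. The
sketch below shows one way such bounds could be used for range checks.
It is an illustration only: the enum is a local stand-in that copies the
values visible in the hunk, and the names SlotId, is_src_slot,
is_dst_slot and src_index are hypothetical, not part of the library.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in (hypothetical name) mirroring values shown in the
// Types.h hunk. This is not the library header.
enum SlotId : int32_t
{
    ACL_SRC_0   = 0,
    ACL_SRC_END = 6, // same value as ACL_SRC_6, so the bound is inclusive
    ACL_DST_0   = 30,
    ACL_DST_END = 32, // same value as ACL_DST_2, so the bound is inclusive
};

// True if id lies in the source range [ACL_SRC_0, ACL_SRC_END].
constexpr bool is_src_slot(int32_t id)
{
    return id >= ACL_SRC_0 && id <= ACL_SRC_END;
}

// True if id lies in the destination range [ACL_DST_0, ACL_DST_END].
constexpr bool is_dst_slot(int32_t id)
{
    return id >= ACL_DST_0 && id <= ACL_DST_END;
}

// Zero-based position of a source id. Precondition: id is a source id.
inline int32_t src_index(int32_t id)
{
    assert(is_src_slot(id));
    return id - ACL_SRC_0;
}
```

Because the bounds are inclusive, a loop over all source ids would run
while id <= ACL_SRC_END rather than id < ACL_SRC_END.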

Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: Ib987ed79289ab0bcbd3130d54f5793408d9f1240
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/7510
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/arm_compute/core/experimental/Types.h b/arm_compute/core/experimental/Types.h
index c8755dc..1995ab0 100644
--- a/arm_compute/core/experimental/Types.h
+++ b/arm_compute/core/experimental/Types.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2020-2021 Arm Limited.
+ * Copyright (c) 2020-2022 Arm Limited.
  *
  * SPDX-License-Identifier: MIT
  *
@@ -41,20 +41,22 @@
     ACL_SRC_DST = 0,
 
     // Src
-    ACL_SRC   = 0,
-    ACL_SRC_0 = 0,
-    ACL_SRC_1 = 1,
-    ACL_SRC_2 = 2,
-    ACL_SRC_3 = 3,
-    ACL_SRC_4 = 4,
-    ACL_SRC_5 = 5,
-    ACL_SRC_6 = 6,
+    ACL_SRC     = 0,
+    ACL_SRC_0   = 0,
+    ACL_SRC_1   = 1,
+    ACL_SRC_2   = 2,
+    ACL_SRC_3   = 3,
+    ACL_SRC_4   = 4,
+    ACL_SRC_5   = 5,
+    ACL_SRC_6   = 6,
+    ACL_SRC_END = 6,
 
     // Dst
-    ACL_DST   = 30,
-    ACL_DST_0 = 30,
-    ACL_DST_1 = 31,
-    ACL_DST_2 = 32,
+    ACL_DST     = 30,
+    ACL_DST_0   = 30,
+    ACL_DST_1   = 31,
+    ACL_DST_2   = 32,
+    ACL_DST_END = 32,
 
     // Aux
     ACL_INT     = 50,