Connect CLMatMul function to quantized kernels and resolve NE BatchMatMul int8 failures

* Adapt the CLMatMul function and ClMatMul operator to use quantized kernels (see the usage sketch after this list).
* Add function-level tests.
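
A minimal usage sketch of the quantized path, assuming the public
CLMatMul::configure(lhs, rhs, dst, MatMulInfo) overload and QASYMM8_SIGNED
inputs; the shapes, quantization parameters, helper name and header paths
below are illustrative and may differ by release:

    #include "arm_compute/core/TensorInfo.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"
    #include "arm_compute/runtime/CL/CLTensor.h"
    #include "arm_compute/runtime/CL/functions/CLMatMul.h"

    using namespace arm_compute;

    void run_quantized_matmul() // illustrative helper, not part of the library
    {
        CLScheduler::get().default_init();

        // LHS is MxK = 4x8, RHS is KxN = 8x2; TensorShape is (width, height),
        // i.e. (K, M) for the LHS and (N, K) for the RHS.
        CLTensor lhs, rhs, dst;
        lhs.allocator()->init(TensorInfo(TensorShape(8U, 4U), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.5f, 10)));
        rhs.allocator()->init(TensorInfo(TensorShape(2U, 8U), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(0.25f, -3)));
        dst.allocator()->init(TensorInfo(TensorShape(2U, 4U), 1, DataType::QASYMM8_SIGNED, QuantizationInfo(1.f, 0)));

        // With quantized inputs the function is expected to dispatch to the
        // quantized MatMul kernel rather than the float native kernel.
        CLMatMul matmul;
        matmul.configure(&lhs, &rhs, &dst, MatMulInfo());

        lhs.allocator()->allocate();
        rhs.allocator()->allocate();
        dst.allocator()->allocate();

        // Fill lhs/rhs with quantized data here, then:
        matmul.run();
    }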

Resolves: COMPMID-5929 and COMPMID-5811

Change-Id: I5348cdcf07b8074c138e04dfef0a73399377accd
Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Signed-off-by: Omar Al Khatib <omar.alkhatib@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/9575
Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com>
Reviewed-by: SiCong Li <sicong.li@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/gpu/cl/kernels/ClMatMulNativeKernel.h b/src/gpu/cl/kernels/ClMatMulNativeKernel.h
index 50aa3b7..f706256 100644
--- a/src/gpu/cl/kernels/ClMatMulNativeKernel.h
+++ b/src/gpu/cl/kernels/ClMatMulNativeKernel.h
@@ -47,17 +47,17 @@
      *                             Dimensions above 2 are collapsed onto dimension 2 and represent the batch.
      * @param[in]  rhs             Input tensor for the RHS matrix. Data type supported: same as @p lhs.
      *                             Dimensions above 2 are collapsed onto dimension 2 and represent the batch.
-     * @param[out] output          Output tensor info. Data type supported: same as @p lhs
+     * @param[out] dst             Output tensor info. Data type supported: same as @p lhs
      * @param[in]  matmul_info     Attributes for Batch MatMul Kernel
      */
-    void configure(const ClCompileContext &compile_context, ITensorInfo *lhs, ITensorInfo *rhs, ITensorInfo *output, const MatMulKernelInfo &matmul_info);
+    void configure(const ClCompileContext &compile_context, ITensorInfo *lhs, ITensorInfo *rhs, ITensorInfo *dst, const MatMulKernelInfo &matmul_info);
     /** Static function to check if given info will lead to a valid configuration
      *
      * Similar to @ref ClMatMulNativeKernel::configure()
      *
      * @return a status
      */
-    static Status validate(const ITensorInfo *lhs, const ITensorInfo *rhs, const ITensorInfo *output, const MatMulKernelInfo &matmul_info);
+    static Status validate(const ITensorInfo *lhs, const ITensorInfo *rhs, const ITensorInfo *dst, const MatMulKernelInfo &matmul_info);
 
     // Inherited methods overridden:
     void run_op(ITensorPack &tensors, const Window &window, cl::CommandQueue &queue) override;