Add broadcast batched matmul validation cases

Related to: COMPMID-5660

Signed-off-by: SiCong Li <sicong.li@arm.com>
Change-Id: I2314c8b21acc638402c77080d59db2f3fed58fe2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8911
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Mohmun02 <MohammedSuhail.Munshi@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/arm_compute/runtime/CL/functions/CLGEMM.h b/arm_compute/runtime/CL/functions/CLGEMM.h
index 38a07ef..b267bf1 100644
--- a/arm_compute/runtime/CL/functions/CLGEMM.h
+++ b/arm_compute/runtime/CL/functions/CLGEMM.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2016-2021 Arm Limited.
+ * Copyright (c) 2016-2021, 2023 Arm Limited.
  *
  * SPDX-License-Identifier: MIT
  *
@@ -77,6 +77,9 @@
      *
      * @note Whilst the first input tensor can be a vector, the second input tensor must be at least a matrix
      *
+     * @note Batched GEMM only allows the RHS tensor's rank to be <= 3
+     * @note Batched GEMM only supports broadcasting cases in which the RHS rank is lower than the LHS rank, but not the other way around
+     *
      * @param[in]  compile_context The compile context to be used.
      * @param[in]  a               First input tensor  (Matrix or Vector A). Data types supported: F16/F32
      * @param[in]  b               Second input tensor (Matrix B). Data type supported: same as @p a.
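
Illustrative sketch (not part of the patch above) of how the broadcast rule described by the new @note lines plays out at validate() time. The shapes, the main() harness, and the expectation that the reversed-broadcast call is rejected are assumptions chosen for the example, not code from this change.

    // Editor's sketch: exercising the batched-GEMM broadcast rule via CLGEMM::validate().
    // Shapes below are illustrative only.
    #include "arm_compute/core/TensorInfo.h"
    #include "arm_compute/core/TensorShape.h"
    #include "arm_compute/core/Types.h"
    #include "arm_compute/runtime/CL/CLScheduler.h"
    #include "arm_compute/runtime/CL/functions/CLGEMM.h"

    using namespace arm_compute;

    int main()
    {
        // Initialise the CL scheduler so device queries inside validate() have a context.
        CLScheduler::get().default_init();

        // LHS (Matrix A): rank 3, shape (K, M, batches) = (8, 4, 2)
        const TensorInfo lhs(TensorShape(8U, 4U, 2U), 1, DataType::F32);
        // RHS (Matrix B): rank 2, shape (N, K) = (16, 8), broadcast across the 2 batches
        const TensorInfo rhs(TensorShape(16U, 8U), 1, DataType::F32);
        // Output: shape (N, M, batches) = (16, 4, 2)
        const TensorInfo dst(TensorShape(16U, 4U, 2U), 1, DataType::F32);

        // Supported case: RHS rank (2) < LHS rank (3), so the rank-2 RHS is reused for every batch.
        const Status ok = CLGEMM::validate(&lhs, &rhs, nullptr, &dst, 1.0f, 0.0f, GEMMInfo());

        // Reversed broadcast (RHS rank > LHS rank): per the @note above, expected to be rejected.
        const TensorInfo lhs_2d(TensorShape(8U, 4U), 1, DataType::F32);
        const TensorInfo rhs_3d(TensorShape(16U, 8U, 2U), 1, DataType::F32);
        const Status rejected = CLGEMM::validate(&lhs_2d, &rhs_3d, nullptr, &dst, 1.0f, 0.0f, GEMMInfo());

        return (ok.error_code() == ErrorCode::OK && rejected.error_code() != ErrorCode::OK) ? 0 : 1;
    }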