COMPMID-1271: New system for GEMM heuristics

This patch implements a system for separating the "validity" aspect
of the current heuristics in gemm_*.cpp from the "preferred" aspect.

Now, each gemm_*.cpp defines a list of candidate implementations,
each of which supplies an is_valid() function (to check for
validity), an is_preferred() function (the "heuristic" part), and an
instantiate() function which actually produces the GemmCommon object
pointer.

The actual gemm() function is now templated and uses this list to
select an implementation.  This patch also implements a mechanism to
identify the preferred implementation and to override it via the
GemmConfig structure.
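
As a rough illustration of the scheme described above (a sketch only,
not the code in this patch): the types Args, Config, Gemm and
Candidate, the function select_gemm(), and the three candidate entries
are hypothetical stand-ins.  The sketch assumes that the first
candidate that is both valid and preferred wins, that the first valid
candidate is used as a fallback, and that a non-empty Config::filter
forces the named implementation if it is valid.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for GemmArgs, GemmConfig and GemmCommon.
struct Args   { unsigned M, N, K; };
struct Config { std::string filter; };   // empty: no override requested
struct Gemm   { std::string name; };

// One entry in the candidate list: validity check, heuristic, factory.
struct Candidate {
    std::string name;
    std::function<bool(const Args &)> is_valid;
    std::function<bool(const Args &)> is_preferred;
    std::function<std::unique_ptr<Gemm>(const Args &)> instantiate;
};

// Made-up candidate list, ordered from most to least specialised.
static const std::vector<Candidate> &candidates() {
    static const std::vector<Candidate> list = {
        { "gemv",
          [](const Args &a) { return a.M == 1; },
          [](const Args &)  { return true; },
          [](const Args &)  { return std::make_unique<Gemm>(Gemm{"gemv"}); } },
        { "blocked",
          [](const Args &)  { return true; },
          [](const Args &a) { return a.M >= 8; },
          [](const Args &)  { return std::make_unique<Gemm>(Gemm{"blocked"}); } },
        { "native",
          [](const Args &)  { return true; },
          [](const Args &a) { return a.N >= 4; },
          [](const Args &)  { return std::make_unique<Gemm>(Gemm{"native"}); } },
    };
    return list;
}

// Walk the list: skip invalid candidates, honour an override if one is
// given, otherwise return the first preferred candidate.
std::unique_ptr<Gemm> select_gemm(const Args &args, const Config *cfg) {
    const Candidate *fallback = nullptr;
    for (const auto &c : candidates()) {
        if (!c.is_valid(args)) {
            continue;
        }
        if (cfg != nullptr && !cfg->filter.empty()) {
            // Override: only the named implementation is acceptable.
            if (c.name == cfg->filter) {
                return c.instantiate(args);
            }
            continue;
        }
        if (c.is_preferred(args)) {
            return c.instantiate(args);
        }
        if (fallback == nullptr) {
            fallback = &c;   // remember the first valid candidate
        }
    }
    return fallback != nullptr ? fallback->instantiate(args) : nullptr;
}
```

With this sketch, select_gemm(Args{1, 8, 8}, nullptr) yields the "gemv"
entry, while passing a Config whose filter is "native" forces that
entry regardless of the heuristic.  Keeping is_valid() separate from
is_preferred() is what lets an override bypass the heuristic without
bypassing the validity check.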

Change-Id: Id49ab7af8bf2e3e9fd951a9698883ade234d40e1
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/139120
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
Tested-by: Jenkins <bsgcomp@arm.com>
diff --git a/src/core/NEON/kernels/arm_gemm/gemv_batched.hpp b/src/core/NEON/kernels/arm_gemm/gemv_batched.hpp
index d91b44b..d65971e 100644
--- a/src/core/NEON/kernels/arm_gemm/gemv_batched.hpp
+++ b/src/core/NEON/kernels/arm_gemm/gemv_batched.hpp
@@ -36,11 +36,12 @@
     UniqueGemmCommon<To, Tr> _subgemm = nullptr;
 
 public:
-    GemvBatched(const CPUInfo &ci, const unsigned int M, const unsigned int N, const unsigned int K,
-                const unsigned int nbatches, const unsigned int nmulti, const bool trA, const bool trB,
-                const To alpha, const To beta, const int maxthreads, const bool pretransposed_hint) {
+    GemvBatched(const GemmArgs<Tr> &args) {
         /* Just create a subgemm with batches->M */
-        _subgemm = gemm<To,Tr>(ci, nbatches, N, K, 1, nmulti, trA, trB, alpha, beta, maxthreads, pretransposed_hint);
+        GemmArgs<Tr> newargs = args;
+        newargs._Msize = args._nbatches;
+        newargs._nbatches = 1;
+        _subgemm = gemm<To,Tr>(newargs, nullptr);
     }
 
     void set_arrays(const To *A, const int lda, const int A_batch_stride, const int A_multi_stride,