COMPMID-421: Fixed FP16 support in Neon GEMM.

Fixed FP16 GEMM for matrices whose sizes are not multiples of 32.
Added a new test suite NEON/GEMM/Float16/SmallGEMM.
Implemented an FP16 function to multiply a vector by a matrix.

Change-Id: Ie6c692885a48d0206bd6fe748332fa83bc286d67
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/79118
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
Reviewed-by: Moritz Pflanzer <moritz.pflanzer@arm.com>
diff --git a/scripts/clang-tidy.h b/scripts/clang-tidy.h
index 32b0f69..cbc0d07 100644
--- a/scripts/clang-tidy.h
+++ b/scripts/clang-tidy.h
@@ -1,5 +1,30 @@
 #include <arm_neon.h>
 
+inline float16x8_t vmulq_lane_f16 (float16x8_t, float16x4_t, const int)
+{
+  return vdupq_n_f16(0);
+}
+
+inline float16x4_t vmul_f16 (float16x4_t, float16x4_t)
+{
+  return vdup_n_f16(0);
+}
+
+inline float16x4_t vadd_f16 (float16x4_t, float16x4_t)
+{
+  return vdup_n_f16(0);
+}
+
+inline float16x4_t vmul_lane_f16 (float16x4_t, float16x4_t, const int)
+{
+  return vdup_n_f16(0);
+}
+
+inline float16x4_t vmul_n_f16 (float16x4_t, float16_t)
+{
+  return vdup_n_f16(0);
+}
+
 inline float16x8_t vcvtq_f16_u16(uint16x8_t)
 {
   return vdupq_n_f16(0);