COMPMID-675: NEGEMMLowp Assembly Integration

Added support for S8 input in the NEGEMMLowp Matrix Multiply Kernel.
Added a new function that runs the assembly kernels to compute C = A * B (no offsets involved).
Added new tests for the assembly gemmlowp kernels (no offsets).
Integrated the assembly kernel for the Cortex-A57.
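
For reviewers, a minimal scalar sketch of what "A*B=C with no offsets" means for S8 inputs. This is an illustration only, not the code added by this change: the helper name reference_gemm_s8, the row-major layout and the S32 accumulator type are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not part of the library).
// Computes C = A * B for row-major signed 8-bit inputs and accumulates
// into 32-bit integers. A is M x K, B is K x N, C is M x N.
// No zero-point offsets are added to A or B before multiplying.
std::vector<int32_t> reference_gemm_s8(const std::vector<int8_t> &a,
                                       const std::vector<int8_t> &b,
                                       std::size_t m, std::size_t n, std::size_t k)
{
    std::vector<int32_t> c(m * n, 0);
    for(std::size_t row = 0; row < m; ++row)
    {
        for(std::size_t col = 0; col < n; ++col)
        {
            int32_t acc = 0;
            for(std::size_t i = 0; i < k; ++i)
            {
                // Widen to 32 bits before multiplying so the product cannot overflow.
                acc += static_cast<int32_t>(a[row * k + i]) * static_cast<int32_t>(b[i * n + col]);
            }
            c[row * n + col] = acc;
        }
    }
    return c;
}
```

A scalar loop like this could serve as the expected-output side of a comparison against an optimised kernel, since integer accumulation should match exactly.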

Change-Id: Ib3e39c1f3f7f1baa0d39be69485f61cd18e3c9b3
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/95864
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
Reviewed-by: Anthony Barbier <anthony.barbier@arm.com>
diff --git a/arm_compute/core/NEON/NEKernels.h b/arm_compute/core/NEON/NEKernels.h
index d78cec2..80fdaec 100644
--- a/arm_compute/core/NEON/NEKernels.h
+++ b/arm_compute/core/NEON/NEKernels.h
@@ -109,6 +109,7 @@
 #include "arm_compute/core/NEON/kernels/NEWeightsReshapeKernel.h"
 #include "arm_compute/core/NEON/kernels/arm32/NEGEMMAArch32Kernel.h"
 #include "arm_compute/core/NEON/kernels/arm64/NEGEMMAArch64Kernel.h"
+#include "arm_compute/core/NEON/kernels/arm64/NEGEMMLowpAArch64Kernel.h"
 #include "arm_compute/core/NEON/kernels/arm64/NEGEMMLowpAArch64V8P4Kernel.h"
 
 #endif /* __ARM_COMPUTE_NEKERNELS_H__ */