COMPMID-3281: Implement QSYMM16 Layer Normalization for NEON QLSTM

- The reference kernel is modified to use the same algorithm as the NEON kernel.
- The NEON kernel is implemented.
- Tests for validation and for running the kernel are added.

Change-Id: I3533bc2bd12c6e9cc75d837ecf193f74ceddf796
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2948
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
diff --git a/Android.bp b/Android.bp
index 0cb0b77..528467a 100644
--- a/Android.bp
+++ b/Android.bp
@@ -322,6 +322,7 @@
         "src/core/NEON/kernels/NEPixelWiseMultiplicationKernel.cpp",
         "src/core/NEON/kernels/NEPoolingLayerKernel.cpp",
         "src/core/NEON/kernels/NEPriorBoxLayerKernel.cpp",
+        "src/core/NEON/kernels/NEQLSTMLayerNormalizationKernel.cpp",
         "src/core/NEON/kernels/NEQuantizationLayerKernel.cpp",
         "src/core/NEON/kernels/NEROIAlignLayerKernel.cpp",
         "src/core/NEON/kernels/NEROIPoolingLayerKernel.cpp",