COMPMID-3464: Address NESoftmaxLayer failures for QASYMM8_SIGNED

Normalization with the maximum value was causing results to wrap around
for QASYMM8_SIGNED inputs. As a workaround, use the saturating vqsub
wrapper instead of vsub to perform the subtraction.
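
To illustrate the failure mode, here is a minimal scalar sketch of one
8-bit signed lane. The helper names wrapping_sub and saturating_sub are
hypothetical stand-ins written for this note, not library functions; they
only model the wrapping and saturating behaviour of an int8 subtract.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical scalar model of a wrapping int8 subtract (one lane).
// The true difference is reduced modulo 2^8 back into [-128, 127].
inline int8_t wrapping_sub(int8_t a, int8_t b)
{
    int diff = static_cast<int>(a) - static_cast<int>(b);
    if(diff > INT8_MAX)
    {
        diff -= 256;
    }
    else if(diff < INT8_MIN)
    {
        diff += 256;
    }
    return static_cast<int8_t>(diff);
}

// Hypothetical scalar model of a saturating int8 subtract (one lane).
// The true difference is clamped to [-128, 127] instead of wrapping.
inline int8_t saturating_sub(int8_t a, int8_t b)
{
    const int diff = static_cast<int>(a) - static_cast<int>(b);
    return static_cast<int8_t>(std::min<int>(INT8_MAX, std::max<int>(INT8_MIN, diff)));
}
```

With a row maximum of 127 and an element of -128 the true difference is
255, which does not fit in int8: the wrapping model yields -1 (a negative
"distance from the maximum"), while the saturating model yields 127.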

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I719b7ac7ad274dc2ae339bc4a055f9200134ed97
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3184
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/core/NEON/kernels/NESoftmaxLayerKernel.cpp b/src/core/NEON/kernels/NESoftmaxLayerKernel.cpp
index 790c8ba..41bf03a 100644
--- a/src/core/NEON/kernels/NESoftmaxLayerKernel.cpp
+++ b/src/core/NEON/kernels/NESoftmaxLayerKernel.cpp
@@ -311,7 +311,7 @@
             for(; x <= (input_width - vec_size); x += vec_size)
             {
                 auto vec_elements     = wrapper::vloadq(in_ptr + x);
-                vec_elements          = wrapper::vsub(vec_max, vec_elements);
+                vec_elements          = wrapper::vqsub(vec_max, vec_elements);
                 auto vec_elements_flt = convert_int_to_float<float32x4x4_t>(vec_elements);
 
                 if(is_log)