Fix output value for Reduction Ops MAX and MIN

In the prior version, the outputs from MAX and MIN Reduction operation
kernels would never be written to the output, because they are
stored in vec_res_value, not vec_res_value[1-4].

Also, the prior branch test is probably a mistake: the ARG_IDX_MIN
case is already handled by the preceding if statement, so that branch
could never be executed.

This commit adds a path that correctly writes the reduction
operation results to the output.
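
The branch logic described above can be sketched in isolation as
follows. This is only an illustration, not the kernel code: the enum
is a cut-down stand-in for ReductionOperation, the Stored type and the
store_before/store_after helpers are hypothetical, and the condition
of the first branch (ARG_IDX_MIN or ARG_IDX_MAX) is an assumption
based on the description above, since the diff context does not show
it.

```cpp
#include <cassert>

// Cut-down, hypothetical stand-in for the library's ReductionOperation enum.
enum class ReductionOperation { ARG_IDX_MAX, ARG_IDX_MIN, MIN, MAX };

// What the store step ends up writing (hypothetical, for illustration only).
enum class Stored { Nothing, Indices, Value };

// Branch structure before this commit. The first condition is assumed.
Stored store_before(ReductionOperation op)
{
    if(op == ReductionOperation::ARG_IDX_MIN || op == ReductionOperation::ARG_IDX_MAX)
    {
        return Stored::Indices; // index vectors are stored
    }
    else if(op == ReductionOperation::ARG_IDX_MIN)
    {
        return Stored::Value; // unreachable: ARG_IDX_MIN was caught above
    }
    return Stored::Nothing; // MIN and MAX fall through without a store
}

// Branch structure after this commit.
Stored store_after(ReductionOperation op)
{
    if(op == ReductionOperation::ARG_IDX_MIN || op == ReductionOperation::ARG_IDX_MAX)
    {
        return Stored::Indices;
    }
    else if(op == ReductionOperation::MIN || op == ReductionOperation::MAX)
    {
        return Stored::Value; // the reduced value is now stored
    }
    return Stored::Nothing;
}

// Small self-check of the behaviour described in the commit message.
void check_branches()
{
    assert(store_before(ReductionOperation::MIN) == Stored::Nothing);
    assert(store_after(ReductionOperation::MIN) == Stored::Value);
    assert(store_after(ReductionOperation::ARG_IDX_MIN) == Stored::Indices);
}
```

With the old condition, no operation can ever reach the value store;
with the new one, MIN and MAX reach it while the ARG_IDX cases are
unaffected.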

Signed-off-by: Nate Craun <nate@natecraun.net>
Change-Id: Id977a6240fbee4668426a9c6b487ae65fab246d2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2796
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/core/NEON/kernels/NEReductionOperationKernel.cpp b/src/core/NEON/kernels/NEReductionOperationKernel.cpp
index e2dee67..9b8c971 100644
--- a/src/core/NEON/kernels/NEReductionOperationKernel.cpp
+++ b/src/core/NEON/kernels/NEReductionOperationKernel.cpp
@@ -1047,7 +1047,7 @@
                 wrapper::vstore(reinterpret_cast<uint32_t *>(output.ptr()) + 8, vec_res_idx.val[2]);
                 wrapper::vstore(reinterpret_cast<uint32_t *>(output.ptr()) + 12, vec_res_idx.val[3]);
             }
-            else if(op == ReductionOperation::ARG_IDX_MIN)
+            else if(op == ReductionOperation::MIN || op == ReductionOperation::MAX)
             {
                 wrapper::vstore(reinterpret_cast<T *>(output.ptr()), vec_res_value);
             }