Fix precision issue in ChannelShuffleKernel

* Fixed the issue in NHWC Neon
* Fixed the rounding error in CL: DIV_MOD_UINT now uses integer division instead of multiplying by a float reciprocal
* Added a new test case to reproduce the problem
* Resolves COMPMID-4831
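
For illustration only, a minimal C sketch of why a reciprocal-multiply division can lose precision. The helper names div_mod_float and div_mod_int are hypothetical and are not part of the library; they only mirror the shape of the old and new DIV_MOD_UINT bodies. The input value is chosen to expose the effect and is not taken from the failing test case.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the old macro body: multiply by a
 * single-precision reciprocal, then truncate to an unsigned integer. */
static void div_mod_float(uint32_t x, uint32_t y, uint32_t *div_res, uint32_t *mod_res)
{
    float xf    = (float)x;        /* exact only while x fits in 24 bits */
    float recip = 1.0f / (float)y; /* the reciprocal is itself rounded */
    float q     = xf * recip;      /* the product is rounded again */

    *div_res = (uint32_t)q;
    *mod_res = x - (*div_res) * y;
}

/* Hypothetical helper mirroring the new macro body: plain integer
 * division, which is exact for every y != 0. */
static void div_mod_int(uint32_t x, uint32_t y, uint32_t *div_res, uint32_t *mod_res)
{
    *div_res = x / y;
    *mod_res = x - (*div_res) * y;
}
```

With x = 16777217 (2^24 + 1) and y = 1, the float path yields a quotient of 16777216 and a remainder of 1, because x cannot be represented exactly in single precision. The integer path yields 16777217 and 0.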

Change-Id: I1613168cad580ca5acefe8ba340130af05cffaff
Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6454
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/core/CL/cl_kernels/nchw/channel_shuffle.cl b/src/core/CL/cl_kernels/nchw/channel_shuffle.cl
index 57d82e1..84396e1 100644
--- a/src/core/CL/cl_kernels/nchw/channel_shuffle.cl
+++ b/src/core/CL/cl_kernels/nchw/channel_shuffle.cl
@@ -33,7 +33,7 @@
 
 #define DIV_MOD_UINT(x, y, div_res, mod_res)                \
     ({                                                      \
-        div_res = (uint)((x) * (float)(1.0f / (float)(y))); \
+        div_res = (uint)((x)/(y)); \
         uint r  = div_res * (y);                            \
         mod_res = (x)-r;                                    \
     })
@@ -100,4 +100,4 @@
     (u1, 0, (__global DATA_TYPE *)(output_ptr + 1 * dst_stride_y));
 }
 
-#endif // defined(DATA_TYPE) && defined(VEC_SIZE) && defined(NUM_GROUPS) && defined(K) && defined(SRC_DIM_Z)
\ No newline at end of file
+#endif // defined(DATA_TYPE) && defined(VEC_SIZE) && defined(NUM_GROUPS) && defined(K) && defined(SRC_DIM_Z)