COMPMID-2568: NEON Convolution layer failure
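Fix the output tile size selected for 3x3 kernels when either the input
height or the input width is at most 4. The 2x2 tile is now chosen if
either dimension is <= 4, rather than only when both are. Also add a
test case that exercises this configuration.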
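
For illustration only, a minimal standalone sketch of the changed
condition. TileSize and both function names are hypothetical stand-ins
and are not part of the library; only the two conditions are taken from
the change itself.

```cpp
// Hypothetical stand-in for a 2D size; this is NOT arm_compute::Size2D.
struct TileSize
{
    unsigned int width;
    unsigned int height;
};

// Condition before the fix: 2x2 only when BOTH dimensions are <= 4.
constexpr TileSize select_output_tile_3x3_old(unsigned int in_width, unsigned int in_height)
{
    return (in_width <= 4 && in_height <= 4) ? TileSize{ 2U, 2U } : TileSize{ 4U, 4U };
}

// Condition after the fix: 2x2 when EITHER dimension is <= 4.
constexpr TileSize select_output_tile_3x3(unsigned int in_width, unsigned int in_height)
{
    return (in_width <= 4 || in_height <= 4) ? TileSize{ 2U, 2U } : TileSize{ 4U, 4U };
}

// A 3x8 input is the kind of shape affected: only one dimension is small.
static_assert(select_output_tile_3x3_old(3U, 8U).width == 4U, "old condition picks 4x4");
static_assert(select_output_tile_3x3(3U, 8U).width == 2U, "new condition picks 2x2");
```

Inputs where both dimensions are at most 4, or both are greater than 4,
get the same tile size under either condition.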

Change-Id: I99bb689f26aef713f8206c7d702f9fcf1017af58
Signed-off-by: giuros01 <giuseppe.rossini@arm.com>
Reviewed-on: https://review.mlplatform.org/c/1744
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/runtime/NEON/functions/NEWinogradConvolutionLayer.cpp b/src/runtime/NEON/functions/NEWinogradConvolutionLayer.cpp
index 01cdff6..e699ad1 100644
--- a/src/runtime/NEON/functions/NEWinogradConvolutionLayer.cpp
+++ b/src/runtime/NEON/functions/NEWinogradConvolutionLayer.cpp
@@ -182,7 +182,7 @@
     Size2D output_tile = Size2D{};
     if(kernel_dims == Size2D(3U, 3U))
     {
-        output_tile = (input_dims.width <= 4 && input_dims.height <= 4) ? Size2D(2U, 2U) : Size2D(4U, 4U);
+        output_tile = (input_dims.width <= 4 || input_dims.height <= 4) ? Size2D(2U, 2U) : Size2D(4U, 4U);
     }
     else if(kernel_dims == Size2D(5U, 5U))
     {