Skip upsampling for deconvolution when not needed

If the stride is 1 and the kernel size is 1x1, skip the
upsampling step and pass the input tensor pointer directly
to the convolution.
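
The check described above can be sketched roughly as follows. This is
an illustrative, standalone sketch and not the code in this patch:
DeconvParams, needs_upsampling and select_conv_input are hypothetical
names, and raw float pointers stand in for the library's tensor types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the layer's stride and kernel dimensions.
struct DeconvParams
{
    std::size_t stride_x;
    std::size_t stride_y;
    std::size_t kernel_width;
    std::size_t kernel_height;
};

// Upsampling can be skipped only when the stride is 1 in both
// dimensions and the kernel is 1x1 (the condition stated above).
inline bool needs_upsampling(const DeconvParams &p)
{
    const bool unit_stride = (p.stride_x == 1) && (p.stride_y == 1);
    const bool unit_kernel = (p.kernel_width == 1) && (p.kernel_height == 1);
    return !(unit_stride && unit_kernel);
}

// Chooses the buffer the convolution reads from: the upsampled
// intermediate buffer, or the original input when upsampling is skipped.
inline const float *select_conv_input(const DeconvParams &p,
                                      const float        *input,
                                      const float        *upsampled)
{
    assert(input != nullptr);
    return needs_upsampling(p) ? upsampled : input;
}
```

Under these assumptions, a flag such as _do_upsampling would hold the
result of needs_upsampling() computed once at configure time, so the
run path only has to test a bool.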

Partially resolves: [ONCPUML-1137]

Change-Id: I9de9444ff99cf35d44a51ccbe0fa6facc1035d27
Signed-off-by: Annop Wongwathanarat <annop.wongwathanarat@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8994
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
diff --git a/arm_compute/runtime/NEON/functions/NEDeconvolutionLayer.h b/arm_compute/runtime/NEON/functions/NEDeconvolutionLayer.h
index 15124d6..869df69 100644
--- a/arm_compute/runtime/NEON/functions/NEDeconvolutionLayer.h
+++ b/arm_compute/runtime/NEON/functions/NEDeconvolutionLayer.h
@@ -148,6 +148,7 @@
     ITensor           *_input;
     PadStrideInfo      _info;
     bool               _is_prepared;
+    bool               _do_upsampling;
 };
 } // arm_compute
 #endif /* ARM_COMPUTE_NEDECONVOLUTIONLAYER_H */