Add optimization for global pooling in pooling_layer.cl

- Simplify the implementation when the pooling size has the same
  spatial dimensions as the input tensor
- Rework the heuristic for F32/F16
- Add test for validating the global pooling path
- Fix compare_dimensions in validation. The validation fails because
  we have a different number of dimensions for NCHW and NHWC
  (e.g. 1,1,2,1 (NCHW) -> 2,1,1,1 (NHWC))

Change-Id: Iba680cb30bf2a5d0952265a4cc9794f368549ca5
Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5510
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>

commit: 40471d12a19088df4af6ad80e5c0437d724dd8fa
author: Gian Marco Iodice <gianmarco.iodice@arm.com> Mon Apr 26 08:39:28 2021 +0100
committer: Georgios Pinitas <georgios.pinitas@arm.com> Tue Apr 27 11:44:17 2021 +0000
tree: d17c921b0285d447d6055c7bd88e9962bf4e8f1d
parent: 3eb5d29de823f7dbe0dc6b3a882a7db5950428a3
diff --git a/src/core/gpu/cl/kernels/ClPoolingKernel.cpp b/src/core/gpu/cl/kernels/ClPoolingKernel.cpp
index 7824340..a432877 100644
--- a/src/core/gpu/cl/kernels/ClPoolingKernel.cpp
+++ b/src/core/gpu/cl/kernels/ClPoolingKernel.cpp

@@ -173,9 +173,11 @@
         }
         case DataLayout::NHWC:
         {
+            const size_t vec_size = dst->data_type() == DataType::F32 ? 2 : 4;
+
             // Initialize border size
             border_size                       = BorderSize();
-            num_elems_processed_per_iteration = adjust_vec_size(4, dst->dimension(0));
+            num_elems_processed_per_iteration = adjust_vec_size(vec_size, dst->dimension(0));
             win                               = calculate_max_window(*dst, Steps(num_elems_processed_per_iteration));
             break;
         }
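
The hunk above makes the requested NHWC vector width depend on the data type: 2 elements for F32 and 4 for every other type. A minimal sketch of that selection follows. It is not the library code: `adjust_vec_size_sketch` and `nhwc_elems_per_iteration` are hypothetical names, the `DataType` enum is reduced to two values for illustration, and the clamping behaviour (halve the requested width until it fits in `dim0`, with `dim0 == 3` kept as a special case) is an assumption about what `adjust_vec_size` does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, reduced stand-in for the DataType enum used in the diff.
enum class DataType { F16, F32 };

// Assumed behaviour of adjust_vec_size(): shrink the requested vector
// width by halving until it no longer exceeds dim0. A first dimension
// of exactly 3 is kept as a width of 3 when the request covers it.
unsigned int adjust_vec_size_sketch(unsigned int vec_size, std::size_t dim0)
{
    assert(vec_size <= 16);

    if (vec_size >= dim0 && dim0 == 3)
    {
        return 3;
    }

    while (vec_size > dim0)
    {
        vec_size >>= 1;
    }

    return vec_size;
}

// Mirrors the added line in the NHWC case: request 2 elements for F32,
// 4 otherwise, then clamp against the first dimension of the output.
unsigned int nhwc_elems_per_iteration(DataType dt, std::size_t dim0)
{
    const unsigned int vec_size = (dt == DataType::F32) ? 2 : 4;
    return adjust_vec_size_sketch(vec_size, dim0);
}
```

Under these assumptions, an F32 output whose first dimension is 8 is processed 2 elements per iteration, while an F16 output of the same shape keeps the previous width of 4.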