[ONCPUML-7] Improvement to Window::split_window

If the total passed to split_window did not divide evenly
into the size of the selected Dimension, the size of the
window returned could vary considerably between ids: the
last id received the entire remainder.

With this change the remainder is spread across the lowest
ids, so the amount of work given to any two ids differs by
at most one iteration.

For example:
If total is 10 and the Dimension size is 19:

With the old code:
	* ids 0-8 would each get 1
	* id 9 would get 10

With the new code:
	* ids 0-8 would each get 2
	* id 9 would get 1
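
The arithmetic behind the new split can be sketched in isolation
as below. This is only an illustration, not the library API: the
Range struct and split_range function are hypothetical names, and
the iteration count is assumed to be (end - start) / step with
the range evenly divisible by the step.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for one window dimension: [start, end) walked in 'step' increments.
struct Range
{
    int start;
    int end;
    int step;
};

// Sketch of the balanced split: the first 'rem' ids get one extra iteration each.
// Assumes (end - start) is a multiple of step.
inline Range split_range(const Range &r, int id, int total)
{
    assert(total > 0 && id >= 0 && id < total);

    const int num_it = (r.end - r.start) / r.step; // total iterations (assumption, see above)
    const int rem    = num_it % total;             // iterations left over after an even split
    int       work   = num_it / total;             // base iterations per id

    int it_start = work * id;                      // first iteration if there were no remainder
    if(id < rem)
    {
        ++work;           // this id takes one of the leftover iterations
        it_start += id;   // shifted by the extras handed to the lower ids
    }
    else
    {
        it_start += rem;  // all leftover iterations were handed out before this id
    }

    const int start = r.start + it_start * r.step;
    const int end   = std::min(r.end, start + work * r.step);
    return Range{ start, end, r.step };
}
```

Calling split_range(Range{0, 19, 1}, id, 10) for id 0 to 9 gives
ranges of size 2 for ids 0-8 and size 1 for id 9, matching the
example above.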

Change-Id: I6b74b81d7ddcea06db7aa9fbaf8cb47a659994c1
Signed-off-by: Joseph Dobson <joseph.dobson@arm.com>
Reviewed-on: https://eu-gerrit-1.euhpc.arm.com/c/VisualCompute/ComputeLibrary/+/224448
Tested-by: bsgcomp <bsgcomp@arm.com>
Reviewed-by: Georgios Pinitas <georgios.pinitas@arm.com>
Comments-Addressed: bsgcomp <bsgcomp@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/2961
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/arm_compute/core/Window.inl b/arm_compute/core/Window.inl
index c213181..70c4f80 100644
--- a/arm_compute/core/Window.inl
+++ b/arm_compute/core/Window.inl
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2016-2019 ARM Limited.
+ * Copyright (c) 2016-2020 ARM Limited.
  *
  * SPDX-License-Identifier: MIT
  *
@@ -197,18 +197,30 @@
     {
         if(d == dimension)
         {
-            int start          = _dims[d].start();
-            int end            = _dims[d].end();
-            int per_sub_window = (num_iterations(d) / total) * _dims[d].step();
+            int start        = _dims[d].start();
+            int end          = _dims[d].end();
+            const int step   = _dims[d].step();
 
-            start += id * per_sub_window;
+            const int num_it = num_iterations(d);
+            const int rem    = num_it % total;
+            int work         = num_it / total;
 
-            if(id != total - 1)
+            int it_start     = work * id;
+
+            if(int(id) < rem)
             {
-                end = start + per_sub_window;
+                ++work;
+                it_start += id;
+            }
+            else
+            {
+                it_start += rem;
             }
 
-            out.set(d, Dimension(start, end, _dims[d].step()));
+            start += it_start * step;
+            end = std::min(end, start + work * step);
+
+            out.set(d, Dimension(start, end, step));
         }
         else
         {