MLBEDSW-6640: Modify elementwise block size selection

Clamp the relative cost to a minimum of 1 for elementwise operations,
since increasing the block size once the full OFM already fits gives no
additional benefit.
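
A minimal sketch of the effect, assuming a hypothetical standalone helper
(elementwise_relative_cost is illustrative only and does not exist in
Vela; the shapes are made-up values):

```python
def elementwise_relative_cost(ofm_elements, height, width, depth):
    # Hypothetical helper mirroring the patched expression; not Vela API.
    # The ratio drops below 1 when the block is larger than the OFM, so
    # the max() keeps oversized blocks from looking cheaper than an
    # exact fit.
    return max(ofm_elements / (height * width * depth), 1)


# Assumed example OFM of 8 x 8 x 16 = 1024 elements.
ofm_elements = 8 * 8 * 16

smaller = elementwise_relative_cost(ofm_elements, 4, 8, 16)      # ratio 2.0
exact_fit = elementwise_relative_cost(ofm_elements, 8, 8, 16)    # ratio 1.0
oversized = elementwise_relative_cost(ofm_elements, 16, 16, 16)  # ratio 0.25, clamped
```

Under these assumptions the exact-fit and oversized blocks get the same
cost, so the larger block is no longer preferred on cost alone.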

Signed-off-by: Rickard Bolin <rickard.bolin@arm.com>
Change-Id: Ib6128f6346834fd916efa59adbe07a069dbda0ae
diff --git a/ethosu/vela/architecture_allocator.py b/ethosu/vela/architecture_allocator.py
index b5edcab..9cc22bb 100644
--- a/ethosu/vela/architecture_allocator.py
+++ b/ethosu/vela/architecture_allocator.py
@@ -330,7 +330,7 @@
 
                     # Scale relative to every output OFM element
                     if npu_op_type == NpuBlockType.ElementWise:
-                        relative_cost = ofm_shape.elements() / (height * width * depth)
+                        relative_cost = max(ofm_shape.elements() / (height * width * depth), 1)
                     else:
                         relative_cost = (ifm_fetch + weight_fetch) / ofm_shape.elements()