MLBEDSW-6716: Updates to estimate op SRAM usage

- The cascade builder estimates how much SRAM an operator uses
when calculating the cascades. If an elementwise operator is
included in a cascade, the IFM2 is always a constant/scalar that
resides in permanent memory, so the size of the IFM2 should not
be included in the SRAM estimate.

- The scheduler did not take into account that the IFM can be
reused for the OFM when calculating the op memory usage, which
resulted in a negative number for non-local memory usage.
Corrected the calculation and added an assert to detect future
problems.
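
A rough illustration of the scheduler fix, as a standalone sketch rather
than the Vela code itself: the helper names and byte sizes below are
hypothetical, and only the arithmetic mirrors the change in scheduler.py.

```python
def op_mem_usage(ifm_size, ofm_size, ofm_reuses_ifm):
    """Bytes the op itself occupies in SRAM (hypothetical helper)."""
    # When the OFM is written over the IFM buffer, only the IFM is counted.
    return ifm_size + (0 if ofm_reuses_ifm else ofm_size)


def non_local_mem_usage(snapshot, ifm_size, ofm_size, ofm_reuses_ifm):
    """Memory in the snapshot that does not belong to the op (hypothetical helper)."""
    usage = snapshot - op_mem_usage(ifm_size, ofm_size, ofm_reuses_ifm)
    # Same guard as the new assert: a negative value means the op
    # estimate exceeds what the snapshot actually recorded.
    assert usage >= 0
    return usage


# Assumed example: a 1024-byte IFM shared with the OFM, plus 512 bytes
# of other live tensors, gives a snapshot of 1536 bytes.
snapshot = 1024 + 512
old_estimate = snapshot - (1024 + 1024)  # counts the OFM twice: -512
new_estimate = non_local_mem_usage(snapshot, 1024, 1024, True)  # 512
```

Counting the OFM separately makes the op estimate larger than the
snapshot in this example, which is how the negative value arose.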

Change-Id: Id7ec8fe1ec5560290f34579a7b9203a75067aba2
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
diff --git a/ethosu/vela/scheduler.py b/ethosu/vela/scheduler.py
index 73eb8b4..c7d08fb 100644
--- a/ethosu/vela/scheduler.py
+++ b/ethosu/vela/scheduler.py
@@ -57,6 +57,7 @@
 from .nn_graph import PassPlacement
 from .nn_graph import SchedulingStrategy
 from .nn_graph import Subgraph
+from .live_range import ofm_can_reuse_ifm
 from .numeric_util import round_down
 from .numeric_util import round_up
 from .operation import NpuBlockType
@@ -974,9 +975,11 @@
                 op_mem_usage = 0
             else:
                 # Min schedule only have ifm and ofm in SRAM (no buffered weigth tensors)
-                op_mem_usage = sched_op.ifm_size_in_bytes() + sched_op.ofm_size_in_bytes()
+                ofm_size = 0 if ofm_can_reuse_ifm(sched_op) else sched_op.ofm_size_in_bytes()
+                op_mem_usage = sched_op.ifm_size_in_bytes() + ofm_size
 
             non_local_mem_usage[sched_op] = min_schedule.memory_snapshot[time_index] - op_mem_usage
+            assert non_local_mem_usage[sched_op] >= 0
 
         # Crate cascades for Min schedule
         cascade_builder = CascadeBuilder(self.sched_ops, self.arch.is_spilling_enabled(), non_local_mem_usage)