MLBEDSW-7316: Fix crash for networks with resource variables

- Networks with resource variables were not handled. The main problem
was the graph traversal: the resource variable ops were never visited,
which left an empty subgraph and caused the crash.
- Fixed by attaching virtual tensors to these ops to simulate a
subgraph output. The tensors are only used to make the graph
traversal work.
- Fixed serialization of the attributes container and shared_name
- Fixed subgraph index for operator CallOnce
- All resource variable ops are pushed to the CPU
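
The pass reordering in pass_packing.py can be illustrated with the
minimal sketch below. It is not the Vela implementation: Op, FakeOp,
reads_only_graph_inputs and split_top_passes are hypothetical stand-ins
that only mirror the rule in this patch (resource variable ops join the
top group, which is then sorted by op_index with None placed first).

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List, Optional, Tuple


class Op(Enum):
    # Hypothetical stand-in for Vela's Op enum; only the members needed here.
    VarHandle = auto()
    ReadVariable = auto()
    CallOnce = auto()
    Add = auto()


@dataclass
class FakeOp:
    # Hypothetical stand-in for the first op of a CPU pass.
    type: Op
    op_index: Optional[int]  # None when the op carries no index
    reads_only_graph_inputs: bool = False


RESOURCE_VARIABLE_OPS = (Op.VarHandle, Op.ReadVariable, Op.CallOnce)


def split_top_passes(cpu_ops: List[FakeOp]) -> Tuple[List[FakeOp], List[FakeOp]]:
    """Split CPU ops into a 'top' group and the remaining ops."""
    top, rest = [], []
    for op in cpu_ops:
        # Top group: ops that depend only on graph inputs, or resource variable ops.
        if op.reads_only_graph_inputs or op.type in RESOURCE_VARIABLE_OPS:
            top.append(op)
        else:
            rest.append(op)
    # Keep the original call order; ops without an index sort first.
    top.sort(key=lambda op: -1 if op.op_index is None else op.op_index)
    return top, rest


ops = [
    FakeOp(Op.ReadVariable, 3),
    FakeOp(Op.Add, 1),
    FakeOp(Op.CallOnce, None),
    FakeOp(Op.VarHandle, 0),
    FakeOp(Op.Add, 2, reads_only_graph_inputs=True),
]
top, rest = split_top_passes(ops)
```

With this input, top holds CallOnce, VarHandle, the input-only Add and
ReadVariable in that order, and rest holds the remaining Add.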

Change-Id: I815f9c81baf7a3fbb686e895980b462f58208b6e
Signed-off-by: Johan Alfven <johan.alfven@arm.com>
diff --git a/ethosu/vela/pass_packing.py b/ethosu/vela/pass_packing.py
index 5c0d8eb..6049366 100644
--- a/ethosu/vela/pass_packing.py
+++ b/ethosu/vela/pass_packing.py
@@ -1,4 +1,4 @@
-# SPDX-FileCopyrightText: Copyright 2020-2022 Arm Limited and/or its affiliates <open-source-office@arm.com>
+# SPDX-FileCopyrightText: Copyright 2020-2023 Arm Limited and/or its affiliates <open-source-office@arm.com>
 #
 # SPDX-License-Identifier: Apache-2.0
 #
@@ -469,6 +469,8 @@
         #
         # 1) CPU passes that only depends on sg.input_tensor can be
         #    moved to the top of the list.
+        #    Resource variable ops such as VarHandle, ReadVariable and
+        #    CallOnce can also be moved to the top of the list.
         #
         # 2) A CPU pass X is allowed to be grouped together with CPU pass Y
         #    if there is no NPU pass between pass X and pass Y that depends
@@ -487,17 +489,20 @@
                 pass_list_top.insert(0, ps)
                 continue
 
-            if (
-                ps.placement == PassPlacement.Cpu
-                and ps.ops[0].ifm in sg.input_tensors
+            if ps.placement == PassPlacement.Cpu and (
+                ps.ops[0].ifm in sg.input_tensors
                 and (ps.ops[0].ifm2 in sg.input_tensors or ps.ops[0].ifm2 is None)
+                or (ps.ops[0].type in (Op.VarHandle, Op.ReadVariable, Op.CallOnce))
             ):
-                # This CPU pass only depends on sg.input_tensors
+                # This CPU pass only depends on sg.input_tensors or is a resource variable op
                 pass_list_top.append(ps)
             else:
                 # Add pass to the list that will be sorted in the next step
                 pass_list.append(ps)
 
+        # Sort the top passes by op_index (same call order as in the original graph)
+        pass_list_top = sorted(pass_list_top, key=lambda ps: -1 if ps.ops[0].op_index is None else ps.ops[0].op_index)
+
         # Sort the rest of the list based on critera 2.
         # Search from bottom of list and when a CPU pass is found
         # search forward in the list and see if it is possible to join another CPU pass.