Improve selection speed of CPU implementations
Previously, the CPU micro-kernel to use was picked during kernel execution.
Move the selection to configuration time to reduce runtime overhead.
Standardize kernel names as follows:
<simd_tech>_<data_type>_<data_layout>_<kernel_name>
e.g. sve_fp32_nhwc_scale
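
A minimal sketch of the idea, not the code in this patch: the micro-kernel
is looked up once in configure() and cached in a function pointer, so the
run path only makes an indirect call. The FloorUKernelPtr alias and the
_run_method/_name members mirror the header change in the diff; everything
else (DataType, FloorUKernel, available_kernels, FloorKernelSketch,
neon_fp32_floor_sketch) is a hypothetical stand-in for illustration. The
name string "neon_fp32_floor" assumes the data-layout field is dropped for a
layout-agnostic kernel, which the naming rule above does not state.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <type_traits>

// Hypothetical, simplified stand-in for the library's data type enum.
enum class DataType { F16, F32 };

// Same signature as the alias added to CpuFloorKernel.h in this patch.
using FloorUKernelPtr = std::add_pointer<void(const void *, void *, int)>::type;

// Hypothetical table entry: a name, the data type it handles, and the function.
struct FloorUKernel
{
    const char     *name;
    DataType        dt;
    FloorUKernelPtr ukernel;
};

// Scalar placeholder for a SIMD micro-kernel (assumption: a real one would use intrinsics).
inline void neon_fp32_floor_sketch(const void *src, void *dst, int len)
{
    const float *in  = static_cast<const float *>(src);
    float       *out = static_cast<float *>(dst);
    for(int i = 0; i < len; ++i)
    {
        out[i] = std::floor(in[i]);
    }
}

// Hypothetical registry of available micro-kernels (only F32 is registered here).
static const FloorUKernel available_kernels[] =
{
    { "neon_fp32_floor", DataType::F32, &neon_fp32_floor_sketch },
};

class FloorKernelSketch
{
public:
    // Selection happens here, once, instead of on every run.
    bool configure(DataType dt)
    {
        for(const auto &uk : available_kernels)
        {
            if(uk.dt == dt)
            {
                _run_method = uk.ukernel;
                _name       = uk.name;
                return true;
            }
        }
        return false; // no micro-kernel registered for this data type
    }

    // The run path only dispatches through the cached pointer.
    void run(const void *src, void *dst, int len) const
    {
        assert(_run_method != nullptr);
        _run_method(src, dst, len);
    }

    const char *name() const
    {
        return _name.c_str();
    }

private:
    FloorUKernelPtr _run_method{ nullptr };
    std::string     _name{};
};
```

In this sketch a caller would call configure(DataType::F32) once and then
call run() repeatedly; configure() returns false when no matching
micro-kernel is registered, so a missing implementation shows up before
execution starts.
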
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I544f1c08c8fef0f130a3bde61882ccb9a1f47f21
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5855
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/core/cpu/kernels/CpuFloorKernel.h b/src/core/cpu/kernels/CpuFloorKernel.h
index 2680871..78534d2 100644
--- a/src/core/cpu/kernels/CpuFloorKernel.h
+++ b/src/core/cpu/kernels/CpuFloorKernel.h
@@ -45,10 +45,9 @@
* @param[out] dst Destination tensor. Same as @p src
*/
void configure(const ITensorInfo *src, ITensorInfo *dst);
- /** Static function to check if given info will lead to a valid configuration of @ref CpuFloorKernel
+ /** Static function to check if given info will lead to a valid configuration
*
- * @param[in] src Source tensor info. Data type supported: F16/F32.
- * @param[in] dst Destination tensor info. Same as @p src
+ * Similar to CpuFloorKernel::configure()
*
* @return a status
*/
@@ -65,6 +64,13 @@
// Inherited methods overridden:
void run_op(ITensorPack &tensors, const Window &window, const ThreadInfo &info) override;
const char *name() const override;
+
+private:
+ using FloorUKernelPtr = std::add_pointer<void(const void *, void *, int)>::type;
+
+private:
+ FloorUKernelPtr _run_method{ nullptr };
+ std::string _name{};
};
} // namespace kernels
} // namespace cpu