Add SVE support and decouple data type for NEScaleKernel
- Decouple data type for NEON NHWC implementation, supported data types are: fp32, fp16, u8, s16, qasymm8, qasymm8_signed.
- Add SVE support for NHWC and all six data types listed above.
Resolves: COMPMID-3873
Change-Id: I097de119f4667b28b025a78cadf7185afa5f15f0
Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/4766
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Pablo Marquez Tello <pablo.tello@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/core/helpers/ScaleHelpers.h b/src/core/helpers/ScaleHelpers.h
index 827bbef..f19a8b8 100644
--- a/src/core/helpers/ScaleHelpers.h
+++ b/src/core/helpers/ScaleHelpers.h
@@ -1,5 +1,5 @@
/*
-* Copyright (c) 2020 Arm Limited.
+* Copyright (c) 2020-2021 Arm Limited.
*
* SPDX-License-Identifier: MIT
*
@@ -325,6 +325,32 @@
// Return average
return sum / (x_elements * y_elements);
}
+
+/** Computes bilinear interpolation from the top-left, top-right, bottom-left and bottom-right pixels, weighted by the
+ * fractional distance of the real sample coordinates from the top-left pixel's integer coordinates.
+ *
+ * @param[in] a00 The top-left pixel value.
+ * @param[in] a01 The top-right pixel value.
+ * @param[in] a10 The bottom-left pixel value.
+ * @param[in] a11 The bottom-right pixel value.
+ * @param[in] dx_val Distance along X between the real coordinate and the X coordinate of the left pixels
+ * @param[in] dy_val Distance along Y between the real coordinate and the Y coordinate of the top pixels
+ *
+ * @note dx_val and dy_val must be in the range [0, 1]
+ *
+ * @return The bilinear interpolated pixel value
+ */
+inline float delta_bilinear(float a00, float a01, float a10, float a11, float dx_val, float dy_val)
+{
+ const float dx1_val = 1.0f - dx_val;
+ const float dy1_val = 1.0f - dy_val;
+
+ const float w1 = dx1_val * dy1_val;
+ const float w2 = dx_val * dy1_val;
+ const float w3 = dx1_val * dy_val;
+ const float w4 = dx_val * dy_val;
+ return a00 * w1 + a01 * w2 + a10 * w3 + a11 * w4;
+}
} // namespace scale_helpers
} // namespace arm_compute
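
Illustration of the new helper: a minimal standalone sketch of how delta_bilinear could be driven from a real-valued sample position. The copy of delta_bilinear matches the diff above, with namespaces dropped so it compiles on its own. sample_bilinear is a hypothetical wrapper that is not part of this patch. It assumes a row-major, single-channel float image and clamps neighbours at the border, which may differ from the border handling used by the kernel.

```cpp
#include <cassert>
#include <cmath>

// Copy of the helper added in this change (namespaces omitted for the sketch).
inline float delta_bilinear(float a00, float a01, float a10, float a11, float dx_val, float dy_val)
{
    const float dx1_val = 1.0f - dx_val;
    const float dy1_val = 1.0f - dy_val;

    const float w1 = dx1_val * dy1_val; // weight of top-left
    const float w2 = dx_val * dy1_val;  // weight of top-right
    const float w3 = dx1_val * dy_val;  // weight of bottom-left
    const float w4 = dx_val * dy_val;   // weight of bottom-right
    return a00 * w1 + a01 * w2 + a10 * w3 + a11 * w4;
}

// Hypothetical wrapper (assumption, not in the patch): samples a row-major,
// single-channel image at the real position (x, y). Neighbours that fall
// outside the image are clamped to the nearest valid pixel.
inline float sample_bilinear(const float *img, int width, int height, float x, float y)
{
    assert(img != nullptr && width > 0 && height > 0);

    // Integer coordinates of the top-left neighbour.
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));

    // Fractional offsets from the top-left neighbour, always in [0, 1).
    const float dx = x - static_cast<float>(x0);
    const float dy = y - static_cast<float>(y0);

    auto clamp = [](int v, int lo, int hi) { return v < lo ? lo : (v > hi ? hi : v); };
    const int xa = clamp(x0, 0, width - 1);
    const int xb = clamp(x0 + 1, 0, width - 1);
    const int ya = clamp(y0, 0, height - 1);
    const int yb = clamp(y0 + 1, 0, height - 1);

    return delta_bilinear(img[ya * width + xa], img[ya * width + xb],
                          img[yb * width + xa], img[yb * width + xb], dx, dy);
}
```

With a 2x2 image {10, 20, 30, 40}, sampling at (0.5, 0.5) gives all four pixels a weight of 0.25, and sampling at (0, 0) returns the top-left pixel unchanged because dx_val and dy_val are both zero.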