Port the ClGemmLowp kernels to the new API

Ported kernels:
- CLGEMMLowpMatrixMultiplyNativeKernel
- CLGEMMLowpMatrixMultiplyReshapedKernel
- CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- CLGEMMLowpOffsetContributionKernel
- CLGEMMLowpOffsetContributionOutputStageKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
- CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- CLGEMMLowpQuantizeDownInt32ScaleKernel

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
Change-Id: I9d5a744d6a2dd2f2726fdfb291bad000b6970de2
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5870
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
diff --git a/docs/user_guide/release_version_and_change_log.dox b/docs/user_guide/release_version_and_change_log.dox
index eb4c280..0c8b57f 100644
--- a/docs/user_guide/release_version_and_change_log.dox
+++ b/docs/user_guide/release_version_and_change_log.dox
@@ -227,7 +227,7 @@
- @ref CLLogSoftmaxLayer
- GCSoftmaxLayer
- New OpenCL kernels / functions:
- - @ref CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
- @ref CLLogicalNot
- @ref CLLogicalAnd
- @ref CLLogicalOr
@@ -260,13 +260,13 @@
- @ref CLBatchNormalizationLayerKernel
- CLPoolingLayerKernel
- CLWinogradInputTransformKernel
- - @ref CLGEMMLowpMatrixMultiplyNativeKernel
- - @ref CLGEMMLowpMatrixAReductionKernel
- - @ref CLGEMMLowpMatrixBReductionKernel
- - @ref CLGEMMLowpOffsetContributionOutputStageKernel
- - @ref CLGEMMLowpOffsetContributionKernel
+ - CLGEMMLowpMatrixMultiplyNativeKernel
+ - CLGEMMLowpMatrixAReductionKernel
+ - CLGEMMLowpMatrixBReductionKernel
+ - CLGEMMLowpOffsetContributionOutputStageKernel
+ - CLGEMMLowpOffsetContributionKernel
- CLWinogradOutputTransformKernel
- - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
+ - CLGEMMLowpMatrixMultiplyReshapedKernel
- @ref CLFuseBatchNormalizationKernel
- @ref CLDepthwiseConvolutionLayerNativeKernel
- CLDepthConvertLayerKernel
@@ -281,11 +281,11 @@
- CLLogits1DNormKernel
- CLHeightConcatenateLayerKernel
- CLGEMMMatrixMultiplyKernel
- - @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
- - @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
+ - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- CLDepthConcatenateLayerKernel
- - @ref CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleByFixedPointKernel
- Removed OpenCL kernels / functions:
- CLGEMMLowpQuantizeDownInt32ToInt16ScaleByFixedPointKernel
- CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel
@@ -596,9 +596,9 @@
- @ref CLDeconvolutionLayer
- @ref CLDirectDeconvolutionLayer
- @ref CLGEMMDeconvolutionLayer
- - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
- - @ref CLGEMMLowpQuantizeDownInt32ScaleKernel
- - @ref CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
+ - CLGEMMLowpMatrixMultiplyReshapedKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleKernel
+ - CLGEMMLowpQuantizeDownInt32ScaleByFloatKernel
- @ref CLReductionOperation
- @ref CLReduceMean
- @ref NEScale
@@ -655,9 +655,9 @@
- @ref CLDepthwiseConvolutionLayer
- CLDepthwiseConvolutionLayer3x3
- @ref CLGEMMConvolutionLayer
- - @ref CLGEMMLowpMatrixMultiplyCore
- - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- - @ref CLGEMMLowpMatrixMultiplyNativeKernel
+ - CLGEMMLowpMatrixMultiplyCore
+ - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
+ - CLGEMMLowpMatrixMultiplyNativeKernel
- @ref NEActivationLayer
- NEComparisonOperationKernel
- @ref NEConvolutionLayer
@@ -680,7 +680,7 @@
- @ref NESplit
- New OpenCL kernels / functions:
- @ref CLFill
- - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / @ref CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
+ - CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
- New Arm® Neon™ kernels / functions:
- @ref NEFill
- NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPointKernel / NEGEMMLowpQuantizeDownInt32ToInt8ScaleByFixedPoint
@@ -861,7 +861,7 @@
- @ref CLFFTDigitReverseKernel
- @ref CLFFTRadixStageKernel
- @ref CLFFTScaleKernel
- - @ref CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
+ - CLGEMMLowpMatrixMultiplyReshapedOnlyRHSKernel
- CLGEMMMatrixMultiplyReshapedOnlyRHSKernel
- CLHeightConcatenateLayerKernel
- @ref CLDirectDeconvolutionLayer
@@ -953,7 +953,7 @@
- @ref CLRangeKernel / @ref CLRange
- @ref CLUnstack
- @ref CLGatherKernel / @ref CLGather
- - @ref CLGEMMLowpMatrixMultiplyReshapedKernel
+ - CLGEMMLowpMatrixMultiplyReshapedKernel
- New CPP kernels / functions:
- @ref CPPDetectionOutputLayer
- @ref CPPTopKV / @ref CPPTopKVKernel
@@ -1247,8 +1247,8 @@
- NEWinogradLayer / NEWinogradLayerKernel
- New OpenCL kernels / functions
- - @ref CLGEMMLowpOffsetContributionKernel / @ref CLGEMMLowpMatrixAReductionKernel / @ref CLGEMMLowpMatrixBReductionKernel / @ref CLGEMMLowpMatrixMultiplyCore
- - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / @ref CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
+ - CLGEMMLowpOffsetContributionKernel / CLGEMMLowpMatrixAReductionKernel / CLGEMMLowpMatrixBReductionKernel / CLGEMMLowpMatrixMultiplyCore
+ - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel / CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPoint
- New graph nodes for Arm® Neon™ and OpenCL
- graph::BranchLayer