Optimize Transposed Convolution for CL backend (Quantized)

This patch optimizes transposed convolution for the QASYMM8 and QASYMM8_SIGNED data types by extending the transposed convolution kernel written for FP32/F16.

Resolves: COMPMID-5723
Change-Id: Iab8f09231938adb949c506fd915ed45b885e5c7c
Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/8792
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
diff --git a/src/gpu/cl/operators/ClTransposedConvolution.h b/src/gpu/cl/operators/ClTransposedConvolution.h
index bc04387..58ebc68 100644
--- a/src/gpu/cl/operators/ClTransposedConvolution.h
+++ b/src/gpu/cl/operators/ClTransposedConvolution.h
@@ -57,11 +57,11 @@
      *
      * @param[in]  compile_context The compile context to be used.
      * @param[in]  input           Input tensor info with dimensions [IFM, width, height, batch]
-     *                             Data types supported: F16/F32.
+     *                             Data types supported: F16/F32/QASYMM8/QASYMM8_SIGNED.
      * @param[in]  weights         Weight tensor info with dimensions [IFM, width, height, OFM].
      *                             Data type supported: Same as @p input
      * @param[in]  biases          (Optional) Biases tensor info. Biases are 1D tensor with dimension [OFM].
-     *                             Data type supported: Should match @p input data type
+     *                             Data type supported: Should match @p input data type if floating point, otherwise S32.
      * @param[out] output          Output tensor info with dimensions [OFM, width, height, batch]
      *                             The 1st dimension must be equal to the 4th dimension of the @p weights tensor.
      *                             Data types supported: Same as @p input.