COMPMID-417: Port DepthConcatenate to QS8/QS16 for NEON/CL.

Change-Id: I3dddae63043c7aa18d908a4fc8abacf3c64f98ca
Reviewed-on: http://mpd-gerrit.cambridge.arm.com/80081
Tested-by: Kaizen <jeremy.johnson+kaizengerrit@arm.com>
Reviewed-by: Steven Niu <steven.niu@arm.com>
diff --git a/arm_compute/core/CL/kernels/CLDepthConcatenateKernel.h b/arm_compute/core/CL/kernels/CLDepthConcatenateKernel.h
index eda4c66..e85e0ec 100644
--- a/arm_compute/core/CL/kernels/CLDepthConcatenateKernel.h
+++ b/arm_compute/core/CL/kernels/CLDepthConcatenateKernel.h
@@ -52,9 +52,9 @@
     ~CLDepthConcatenateKernel() = default;
     /** Initialise the kernel's inputs and output
      *
-     * @param[in]     input        Input tensor. Data types supported: F32.
+     * @param[in]     input        Input tensor. Data types supported: QS8/QS16/F16/F32.
      * @param[in]     depth_offset The offset on the Z axis.
-     * @param[in,out] output       Output tensor. Data types supported: F32.
+     * @param[in,out] output       Output tensor. Data types supported: Same as @p input.
      *
      * @note: The output tensor's low two dimensions can't be smaller than the input one's.
      * @note: The gaps between the two lowest dimensions of input and output need to be divisible by 2.