COMPMID-3690: Update release note

- Missing documentation for NEON kernels that had padding removed has been added.
- Missing documentation for a removed NEON kernel has been added.
- Minor format clean-up.

Change-Id: Id3ca2c9998d220c7e63b2343306caff13fcc3a34
Signed-off-by: Sang-Hoon Park <sang-hoon.park@arm.com>
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/3777
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Sheri Zhang <sheri.zhang@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/docs/00_introduction.dox b/docs/00_introduction.dox
index 35fff36..92d0ec2 100644
--- a/docs/00_introduction.dox
+++ b/docs/00_introduction.dox
@@ -255,34 +255,57 @@
    - @ref NEMaxUnpoolingLayerKernel
  - New graph example:
    - graph_yolov3_output_detector
+ - GEMMTuner improvements:
+   - Added fp16 support
+   - Output json files for easier integration
+   - Enabled tuning for export_to_cl_image_rhs option for RHS tensors
+   - More robust script for running benchmarks
  - Removed padding from:
    - @ref NEPixelWiseMultiplicationKernel
+   - @ref NEHeightConcatenateLayerKernel
+   - @ref NEThresholdKernel
+   - @ref NEBatchConcatenateLayerKernel
+   - @ref NETransposeKernel
+   - @ref NEBatchNormalizationLayerKernel
+   - @ref NEArithmeticSubtractionKernel
+   - @ref NEBoundingBoxTransformKernel
+   - @ref NELogits1DMaxKernel
+   - @ref NELogits1DSoftmaxKernel
+   - @ref NEROIPoolingLayerKernel
+   - @ref NEROIAlignLayerKernel
+   - @ref NEYOLOLayerKernel
+   - @ref NEUpsampleLayerKernel
+   - @ref NEFloorKernel
+   - @ref NEWidthConcatenateLayerKernel
+   - @ref NEDepthConcatenateLayerKernel
+   - @ref NENormalizationLayerKernel
+   - @ref NEL2NormalizeLayerKernel
+   - @ref NEFillArrayKernel
+   - @ref NEDepthConvertLayerKernel
+   - @ref NERangeKernel
+   - @ref NEPriorBoxLayer
- - Removedd OpenCL kernels / functions:
+ - Removed OpenCL kernels / functions:
-     - CLGEMMLowpQuantizeDownInt32ToUint8Scale
-     - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
+   - CLGEMMLowpQuantizeDownInt32ToUint8Scale
+   - CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloat
  - Removed NEON kernels / functions:
-     - NEGEMMLowpQuantizeDownInt32ToUint8Scale
- - GEMMTuner improvements:
-     - Added fp16 support
-     - Output json files for easier integration
-     - Enabled tuning for export_to_cl_image_rhs option for RHS tensors
-     - More robust script for running benchmarks
+   - NEGEMMLowpQuantizeDownInt32ToUint8Scale
+   - NEGEMMMatrixAccumulateBiasesKernel
  - Deprecated functions / interfaces:
    - Non-descriptor based interfaces for @ref NEThreshold, @ref CLThreshold
    - In @ref NESoftmaxLayer, @ref NELogSoftmaxLayer, @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and @ref GCSoftmaxLayer :
       The default "axis" value for @ref CLSoftmaxLayer, @ref CLLogSoftmaxLayer and @ref GCSoftmaxLayer is changed from 1 to 0.
       Only axis 0 is supported.
       The default "axis" value for @ref NESoftmaxLayer, @ref NELogSoftmaxLayer is changed from 1 to 0.
-      Only axis 0 is supported. 
+      Only axis 0 is supported.
  - The support for quantized data types has been removed from @ref CLLogSoftmaxLayer due to implementation complexity.
  - Removed padding requirement for the input (e.g. LHS of GEMM) and output in @ref CLGEMMMatrixMultiplyNativeKernel, @ref CLGEMMMatrixMultiplyReshapedKernel, @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel and @ref CLIm2ColKernel (NHWC only)
-    - This change allows to use @ref CLGEMMConvolutionLayer without extra padding for the input and output.
-    - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
-    - Only on Arm Mali Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since @ref CLGEMMMatrixMultiplyKernel is called and currently requires padding.
+   - This change allows the use of @ref CLGEMMConvolutionLayer without extra padding for the input and output.
+   - Only the weights/bias of @ref CLGEMMConvolutionLayer could require padding for the computation.
+   - Only on Arm Mali Midgard GPUs, @ref CLGEMMConvolutionLayer could require padding since @ref CLGEMMMatrixMultiplyKernel is called and currently requires padding.
  - Added support for exporting the OpenCL buffer object to the OpenCL image object in @ref CLGEMMMatrixMultiplyReshapedKernel and @ref CLGEMMMatrixMultiplyReshapedOnlyRHSKernel.
-    - This support allows to export the OpenCL buffer used for the reshaped RHS matrix to the OpenCL image object.
-    - The padding requirement for the OpenCL image object is considered into the @ref CLGEMMReshapeRHSMatrixKernel.
-    - The reshaped RHS matrix stores the weights when GEMM is used to accelerate @ref CLGEMMConvolutionLayer.
+   - This support allows exporting the OpenCL buffer used for the reshaped RHS matrix to the OpenCL image object.
+   - The padding requirement for the OpenCL image object is taken into account in @ref CLGEMMReshapeRHSMatrixKernel.
+   - The reshaped RHS matrix stores the weights when GEMM is used to accelerate @ref CLGEMMConvolutionLayer.
 
 v20.05 Public major release
  - Various bug fixes.