Make memset/copy functions state-less

Port following functions:
- CLCopy
- CLFill
- CLPermute
- CLReshapeLayer
- CLCropResize

Resolves: COMPMID-4002

Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: I8392aa515aaeb5b44dab6122be6a795d08376d5f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/5003
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Michele Di Giorgio <michele.digiorgio@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
diff --git a/docs/04_adding_operator.dox b/docs/04_adding_operator.dox
index 9e6f375..f311fb4 100644
--- a/docs/04_adding_operator.dox
+++ b/docs/04_adding_operator.dox
@@ -117,7 +117,7 @@
 
 The structure of the kernel .cpp file should be similar to the next ones.
 For OpenCL:
-@snippet src/core/CL/kernels/CLReshapeLayerKernel.cpp CLReshapeLayerKernel Kernel
+@snippet src/core/gpu/cl/kernels/ClReshapeKernel.cpp ClReshapeKernel Kernel
 The run will call the function defined in the .cl file.
 
 For the NEON backend case: