Conv3d support

* Add CpuDirectConv3d support for fp32 and fp16
* Dilation is not supported
* Decoupling still needs to be done
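
For reference, a minimal sketch of what a direct 3D convolution without
dilation computes. This is not the CpuDirectConv3d kernel added by this
patch: the names conv_out_dim, Shape3D and direct_conv3d are hypothetical,
and the single-channel (depth, height, width) layout, the symmetric zero
padding and the uniform stride are simplifying assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Compute Library: output extent along one
// axis with dilation fixed to 1. Assumes in + 2 * pad >= kernel.
inline std::size_t conv_out_dim(std::size_t in, std::size_t kernel,
                                std::size_t stride, std::size_t pad)
{
    return (in + 2 * pad - kernel) / stride + 1;
}

// Extents of a dense volume stored row-major as (depth, height, width).
struct Shape3D
{
    std::size_t d, h, w;
};

// Naive single-channel direct 3D convolution (cross-correlation, as is usual
// in ML frameworks) with zero padding and no dilation.
std::vector<float> direct_conv3d(const std::vector<float> &src, Shape3D in,
                                 const std::vector<float> &weights, Shape3D k,
                                 std::size_t stride, std::size_t pad)
{
    const Shape3D out{ conv_out_dim(in.d, k.d, stride, pad),
                       conv_out_dim(in.h, k.h, stride, pad),
                       conv_out_dim(in.w, k.w, stride, pad) };
    std::vector<float> dst(out.d * out.h * out.w, 0.f);

    for(std::size_t od = 0; od < out.d; ++od)
    for(std::size_t oh = 0; oh < out.h; ++oh)
    for(std::size_t ow = 0; ow < out.w; ++ow)
    {
        float acc = 0.f;
        for(std::size_t kd = 0; kd < k.d; ++kd)
        for(std::size_t kh = 0; kh < k.h; ++kh)
        for(std::size_t kw = 0; kw < k.w; ++kw)
        {
            // Input coordinates; negative or too-large values lie in the padding.
            const std::ptrdiff_t id = static_cast<std::ptrdiff_t>(od * stride + kd) - static_cast<std::ptrdiff_t>(pad);
            const std::ptrdiff_t ih = static_cast<std::ptrdiff_t>(oh * stride + kh) - static_cast<std::ptrdiff_t>(pad);
            const std::ptrdiff_t iw = static_cast<std::ptrdiff_t>(ow * stride + kw) - static_cast<std::ptrdiff_t>(pad);
            if(id < 0 || ih < 0 || iw < 0 ||
               id >= static_cast<std::ptrdiff_t>(in.d) ||
               ih >= static_cast<std::ptrdiff_t>(in.h) ||
               iw >= static_cast<std::ptrdiff_t>(in.w))
            {
                continue; // zero padding contributes nothing
            }
            acc += src[(static_cast<std::size_t>(id) * in.h + static_cast<std::size_t>(ih)) * in.w + static_cast<std::size_t>(iw)]
                   * weights[(kd * k.h + kh) * k.w + kw];
        }
        dst[(od * out.h + oh) * out.w + ow] = acc;
    }
    return dst;
}
```

Supporting dilation would mean scaling kd, kh and kw by a dilation factor
when computing the input coordinates and adjusting conv_out_dim to match.
The sketch leaves both out.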

Partially resolves: COMPMID-4661

Signed-off-by: Sheri Zhang <sheri.zhang@arm.com>
Change-Id: Ib1865b9ff328b684d131512b1baf77bc2f10318f
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/6430
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Giorgio Arena <giorgio.arena@arm.com>
diff --git a/Android.bp b/Android.bp
index 36d392d..ccfb2c7 100644
--- a/Android.bp
+++ b/Android.bp
@@ -397,6 +397,7 @@
         "src/cpu/kernels/CpuDequantizeKernel.cpp",
         "src/cpu/kernels/CpuDirectConv2dKernel.cpp",
         "src/cpu/kernels/CpuDirectConv2dOutputStageKernel.cpp",
+        "src/cpu/kernels/CpuDirectConv3dKernel.cpp",
         "src/cpu/kernels/CpuElementwiseKernel.cpp",
         "src/cpu/kernels/CpuElementwiseUnaryKernel.cpp",
         "src/cpu/kernels/CpuFillKernel.cpp",
@@ -477,6 +478,7 @@
         "src/cpu/operators/CpuDepthwiseConv2dAssemblyDispatch.cpp",
         "src/cpu/operators/CpuDequantize.cpp",
         "src/cpu/operators/CpuDirectConv2d.cpp",
+        "src/cpu/operators/CpuDirectConv3d.cpp",
         "src/cpu/operators/CpuElementwise.cpp",
         "src/cpu/operators/CpuElementwiseUnary.cpp",
         "src/cpu/operators/CpuFill.cpp",
@@ -736,6 +738,7 @@
         "src/runtime/NEON/functions/NECast.cpp",
         "src/runtime/NEON/functions/NEChannelShuffleLayer.cpp",
         "src/runtime/NEON/functions/NEConcatenateLayer.cpp",
+        "src/runtime/NEON/functions/NEConv3D.cpp",
         "src/runtime/NEON/functions/NEConvertFullyConnectedWeights.cpp",
         "src/runtime/NEON/functions/NEConvolutionLayer.cpp",
         "src/runtime/NEON/functions/NECopy.cpp",