Extend CKW MatMul with nt_t

- Add the kernel variant: (nt_t) to GpuCKWMatMul.
- Extend CKW MatMul validation test with nt_t.
- Fixes a bug in CKW where z-dim = 1.

Resolves: COMPMID-6435

Signed-off-by: Adnan AlSinan <adnan.alsinan@arm.com>
Change-Id: I4c5e8791e55f21ffff3c11eca7802c51a4259977
Reviewed-on: https://review.mlplatform.org/c/ml/ComputeLibrary/+/10525
Comments-Addressed: Arm Jenkins <bsgcomp@arm.com>
Tested-by: Arm Jenkins <bsgcomp@arm.com>
Reviewed-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Benchmark: Arm Jenkins <bsgcomp@arm.com>
diff --git a/compute_kernel_writer/prototype/src/Prototype.h b/compute_kernel_writer/prototype/src/Prototype.h
index 433eef9..b392fe2 100644
--- a/compute_kernel_writer/prototype/src/Prototype.h
+++ b/compute_kernel_writer/prototype/src/Prototype.h
@@ -3050,7 +3050,7 @@
             address += " * ";
             address += stride_y;
         }
-        if (z != "0" && (_mapper.is_one_component_z() != true))
+        if (z != "0")
         {
             const std::string stride_z = _mapper.tensor_component_stride_z();
             address += " + (";