  1. 2b6ebfe Implement OpenCL MatMul for Lhs NT Rhs T/NT FP32/16 by Ramy Elgammal · 1 year, 4 months ago
  2. ec320d9 Add Subtraction operator to Dynamic Fusion interface by Ramy Elgammal · 1 year, 7 months ago
  3. 7359a87 Add Multiplication operator (FP only) to Dynamic Fusion Interface by Jakub Sujak · 1 year, 6 months ago
  4. 3cce35d Extend cl image support to input and output tensors by Gian Marco Iodice · 1 year, 6 months ago
  5. b3077fb LHS broadcasting addition for dynamic fusion by Viet-Hoa Do · 1 year, 6 months ago
  6. 76335eb Implement the OpenCL kernel to compute the indirect convolution by Gian Marco Iodice · 1 year, 8 months ago
  7. 404462a Adding GpuAdd to dynamic fusion operators by Ramy Elgammal · 1 year, 8 months ago
  8. 96fb194 Optimize T_QUANTIZE8_ASYMMETRIC for Mali™ G52 by Pablo Marquez Tello · 1 year, 8 months ago
  9. 3394f3e Rework direct convolution heuristic on OpenCL by Gian Marco Iodice · 1 year, 10 months ago
  10. 4bfc70e Add Gemm MMUL Reshaped Only Rhs Support for FP32/FP16 by Gunes Bayir · 2 years, 7 months ago
  11. b1fcefd Implement new Elementwise Dynamic Fusion Operators: Div, Floor by Michalis Spyrou · 2 years, 1 month ago
  12. 82169b3 Add cl_khr_integer_dot_product extension support by Viet-Hoa Do · 2 years, 1 month ago
  13. 06adbc5 Mismatches in dynamically fused direct conv2d + add kernel by Michalis Spyrou · 2 years, 2 months ago
  14. ca364df Include missing embedded headers by SiCong Li · 2 years, 3 months ago
  15. 451c309 Revert "Rework gemm_mm_reshaped_only_rhs_ kernels with new macros" by Ramy Elgammal · 2 years, 5 months ago
  16. 10e88a7 Rework gemm_mm_reshaped_only_rhs_ kernels with new macros by Gian Marco Iodice · 2 years, 7 months ago
  17. 3e155a5 Rework gemm_reshape_lhs_ with new macros by Adnan AlSinan · 2 years, 7 months ago
  18. 4fb5670 Rework gemm_reshape_rhs_(nt,t) with new macros by Gian Marco Iodice · 2 years, 8 months ago
  19. 17975a6 Improve start-up time for ClScale by Adnan AlSinan · 2 years, 8 months ago
  20. 945ae9e Implement CLDirectConv3D f32/f16 by Giorgio Arena · 2 years, 9 months ago
  21. 767dbf9 Fix oclgrind int overflow warning by Freddie Liardet · 3 years ago
  22. c38ca38 Fix CL kernel compilation failure by Michalis Spyrou · 3 years ago
  23. 8155c02 Rework OpenCL Depthwise Convolution by Gian Marco Iodice · 3 years, 3 months ago
  24. 6683165 Add quantization helper functions for OpenCL by Georgios Pinitas · 3 years ago
  25. c63b722 Revert "Rework OpenCL Depthwise Convolution" by Gian Marco Iodice · 3 years ago
  26. 561c176 Rework OpenCL Depthwise Convolution by Gian Marco Iodice · 3 years, 3 months ago
  27. ea8d266 Enable unroll through pragma based on DDK version by Giorgio Arena · 3 years, 1 month ago
  28. bdd16d1 Add macro to manually unroll loops in OpenCL by Giorgio Arena · 3 years, 2 months ago
  29. 2ba39b6 Fix missing DATA_TYPE in DOT_PRODUCT4_INTEGER8 OpenCL macro by Gian Marco Iodice · 3 years, 2 months ago
  30. ada6cbc Remove OpenCL padding: CLPixelWiseMultiplicationKernel by Giorgio Arena · 3 years, 3 months ago
  31. 0b76f7d Add support for cl_image in CLDirectConvolutionLayer by Gian Marco Iodice · 3 years, 3 months ago
  32. 534b889 Rework the OpenCL Winograd Input Transformations NHWC by Gian Marco Iodice · 3 years, 3 months ago
  33. a8903c8 Improve performance of Winograd Output Transform 3x3 by Gian Marco Iodice · 3 years, 3 months ago
  34. 5c9eed8 Extend direct convolution (F32/F16/QASYMM8) by Gian Marco Iodice · 3 years, 3 months ago