1. e683652 Optimize memory management of CPU operators by Michael Tyler · 12 days ago
  2. fc94f4d Update CPU kernels and add mixed sign GEMM support by Michael Tyler · 5 weeks ago
  3. 1d8936d Fix nightly build error by Pablo Marquez Tello · 8 weeks ago
  4. 2217f1e Refactor arm_gemm to enable FP16 in all multi_isa builds by Pablo Marquez Tello · 8 weeks ago
  5. a668f9f Add s8f32 kernels and dynamic QuantizationInfo by Jonathan Deakin · 6 months ago
  6. cdce25b Guard tests for accumulation in quantized Cpu Gemm kernels, which is unsupported on aarch32 by Radu Salavat · 3 months ago
  7. f1f1f87 Add in place summation to CPU GEMM kernels by Radu Salavat · 4 months ago
  8. 1e91d71 Parallelise im2col along dimensions with higher number of iterations by Milos Puzovic · 3 months ago
  9. 36a75da [ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorch® autocast() function by Renato Arantes · 5 months ago
  10. 93e743f Optimize CpuSoftmaxKernel for axis != 0 and neon kernels by Omar Al Khatib · 6 months ago
  11. 9167c9c Prefer indirect Gemm vs. Direct convolution if supported by Gunes Bayir · 4 months ago
  12. bf05373 Fix performance regression in fixed-format kernels by Gunes Bayir · 4 months ago
  13. ef63739 Integrate new pretranspose_b_array with extra fused transpose of B by Gunes Bayir · 5 months ago
  14. 0a48c4c Requantization cases for offset changes only by Mohammed Suhail Munshi · 5 months ago
  15. 0c85334 Fix parallel depthwise perf regression from 2db938c by Jonathan Deakin · 5 months ago
  16. 2db938c Parallelize CPU depthwise over batch if only 1 row by Jonathan Deakin · 5 months ago
  17. b526431 Winograd changes to enable fp16 in armv8a multi_isa builds by Pablo Marquez Tello · 7 months ago
  18. 17e116e Revert "thread_local _custom_scheduler" by Pablo Marquez Tello · 7 months ago
  19. fadc9b1 Optimize CpuSoftmaxKernel for axis=0 by Gunes Bayir · 8 months ago
  20. ded5b18 thread_local _custom_scheduler by David Svantesson · 11 months ago
  21. d4650e9 Fix various coverity issues by SiCong Li · 8 months ago
  22. 24c140f Fix CpuGemmConv2d int8 segfault by SiCong Li · 8 months ago
  23. c5ab4df Optimize CpuGemmConv2d start-up time by SiCong Li · 9 months ago
  24. c2a51bd Optimize CL and Neon Winograd tests by Gunes Bayir · 9 months ago
  25. afd38f0 Apply clang-format on repository by Felix Thomasmathibalan · 9 months ago
  26. e071b5e Fix the validation issue in AddMulAdd fused kernel by Gunes Bayir · 10 months ago
  27. 40a9d3e Remove deprecated support for BF16 in CpuCast by Adnan AlSinan · 10 months ago
  28. c071328 Fix include dependencies for mass reformatting patch by Gunes Bayir · 10 months ago
  29. 0d27b2e Remove legacy PostOps code by Jakub Sujak · 11 months ago
  30. 082630b Update CpuGemmConv2d and CpuFlatten to use CpuReshape operator by Anitha Raj · 11 months ago
  31. eb5696d Optimize CpuReshapeKernel by Anitha Raj · 12 months ago
  32. 246fe08 Fix various static check issues by Viet-Hoa Do · 11 months ago
  33. 78ce273 Document the Conv2D heuristic by Gian Marco Iodice · 11 months ago
  34. 9129549 Retain backward compatibility for arm_compute/core/Types.h by SiCong Li · 12 months ago
  35. 4a1c917 Add support for input S64/U64 in CpuCastKernel by Pablo Marquez Tello · 12 months ago
  36. 1d06204 Do not include headers necessary for logging when logging is disabled by Matthew Bentham · 1 year ago
  37. 8deee9b Depthwise channel pre-multiplication by Michael Tyler · 1 year ago
  38. 47a50ef Address ACL coverage pipeline failures related to matmul by Renato Arantes · 1 year, 1 month ago
  39. 94abde4 Add Fused Activation to OpenCL MatMul by Mohammed Suhail Munshi · 1 year, 1 month ago
  40. 043613f Break up Utils.h a bit to reduce unused code being included everywhere by Matthew Bentham · 1 year, 1 month ago
  41. f1aeab9 Break up arm_compute/core/Types.h a bit by Matthew Bentham · 1 year, 1 month ago
  42. a8db612 Re-enable dynamic weights in Neon™ depthwise convolution by Ramy Elgammal · 1 year, 2 months ago
  43. e9b3ee2 Connect CLMatMul function to quantized kernels and resolve NE BatchMatMul int8 failures by Jakub Sujak · 1 year, 3 months ago
  44. edafe7f Disable dynamic weights in unsupported operators by Viet-Hoa Do · 1 year, 2 months ago
  45. 5713294 Fix im2col for fast-maths mode with padding. by Renato Arantes · 1 year, 3 months ago
  46. 54e52a9 Fix CPU MatMul broadcast detection by Viet-Hoa Do · 1 year, 2 months ago
  47. a62129a Fix fully connected and matmul mismatches by Viet-Hoa Do · 1 year, 2 months ago
  48. dba672c Integrate multi-threaded pretranspose_B_array by SiCong Li · 1 year, 3 months ago
  49. 9c7c2d2 Add quantized support for CPU MatMul by Viet-Hoa Do · 1 year, 3 months ago
  50. 9b0a6b4 Fix dynamic weights for CPU connected layer by Viet-Hoa Do · 1 year, 3 months ago
  51. a1b1e41 Implement MatMul Function and Operator with Floating Point support for CPU by Mohammed Suhail Munshi · 1 year, 4 months ago
  52. a3e57c2 Add dynamic weights for CPU fully connected layer by Viet-Hoa Do · 1 year, 4 months ago
  53. 0ffc88b [ONCPUML-1174] Allow src/weights mismatch for fixed format by Jonathan Deakin · 1 year, 4 months ago
  54. 1fe48ca Configure NEGEMMLowpMatrixMultiplyCore for the optimized int8 kernel by Ethan Doe · 1 year, 4 months ago
  55. bbf2e74 Add support for kernel indices in Maxpool by Adnan AlSinan · 1 year, 5 months ago
  56. ae72a46 Add new operator AddMulAdd for Neon™ backend for Float/Quantized types by Gunes Bayir · 1 year, 5 months ago
  57. 464ed20 Remove fixed format strides hack by Jonathan Deakin · 1 year, 6 months ago
  58. 1b6377b Add broadcast batched matmul validation cases by SiCong Li · 1 year, 6 months ago
  59. 6bcdc57 Deprecated BF16 support in DepthConvert by Pablo Marquez Tello · 1 year, 6 months ago
  60. a7077e9 Updateable weights in depthwise convolution by Milos Puzovic · 1 year, 8 months ago
  61. 4b5f6ef Add check for Batch Matmul in GemmAssemblyDispatch by Mohammed Suhail Munshi · 1 year, 9 months ago
  62. 9fc0b5c Update reinterpret-tensor-as-1D for CPU add by Viet-Hoa Do · 1 year, 9 months ago
  63. fa79fda Optimize Neon™ Logistic Activation by Mohammed Suhail Munshi · 1 year, 10 months ago
  64. c8cc024 Adding documentation section explaining how BF16 is used by Ramy Elgammal · 1 year, 9 months ago
  65. 842ad21 Optimize Neon™ SUB operator by squashing execution window by Jakub Sujak · 1 year, 10 months ago
  66. c4f2743 Optimize Quantized/Integer Bilinear Scale for Neon™ by Gunes Bayir · 1 year, 10 months ago
  67. 0d05b66 Interpreting tensor as 1D for CPU multiplication by Viet-Hoa Do · 1 year, 10 months ago
  68. 26c9d1a Add test for NEGEMM to test a batched matrix multiplication with variable input tensors by Adnan AlSinan · 1 year, 10 months ago
  69. 0eed305 Optimize FP32/16 Bilinear Scale Kernel for Neon™ by Gunes Bayir · 1 year, 10 months ago
  70. e4e3b2e Disable Winograd on fp16 if fast-math = false by Ramy Elgammal · 1 year, 10 months ago
  71. 65c8db8 Fix for AI benchmark ResNet regression by Viet-Hoa Do · 1 year, 11 months ago
  72. 93581a5 [ONCPUML-970] Fast math mode for fixed format kernels by Pablo Marquez Tello · 2 years ago
  73. 13b623e [ONCPUML-968] Fixed format kernel support in additional APIs by Milos Puzovic · 2 years ago
  74. 9b921be Optimize add layer by considering the input tensors as 1D array by Gunes Bayir · 2 years ago
  75. aa52b7d Fix compilation error raised in Nightly_NEW by Ramy Elgammal · 2 years ago
  76. 9178002 Fix for inclusion of "arm_gemm" from src into "Types.h" from core by Ramy Elgammal · 2 years ago
  77. d208f4f Enable march=armv8.6-a in non multi-isa builds by Pablo Marquez Tello · 2 years ago
  78. 553f695 [ONCPUML-951] Variable weight support for Convolution. by Francesco Petrogalli · 2 years ago
  79. a1f7851 Integrate new winograd APIs from MLTech by ramelg01 · 2 years ago
  80. 16aa474 Wrong arguments for running activation function in CpuGemmDirectConv2d by Michalis Spyrou · 2 years ago
  81. 5fcf22d [arm_gemm] Import fixed-format kernels from gemm_linux. by Francesco.Petrogalli@arm.com · 2 years, 3 months ago
  82. 168d6a8 Use svcreate instead of list initializations. by Michalis Spyrou · 2 years, 2 months ago
  83. fa6877f [CpuGemmConv2d] Extract skip_im2col and skip_col2im computation. by Francesco.Petrogalli@arm.com · 2 years, 3 months ago
  84. 9104cd5 Add support for int8 CpuPool3d by Adnan AlSinan · 2 years, 3 months ago
  85. 5d606cc Fix CpuGemmAssemblyDispatch::has_opt_impl. by Francesco.Petrogalli@arm.com · 2 years, 3 months ago
  86. e33c556 [arm_gemm] Use static validate to find arm_gemm kernels. by Francesco.Petrogalli@arm.com · 2 years, 3 months ago
  87. 171fc3d Add CPU Pool3d FP16/32 implementation by Adnan AlSinan · 2 years, 4 months ago
  88. 193cad3 Remove deprecated interface from arm_compute. by Francesco.Petrogalli@arm.com · 2 years, 4 months ago
  89. 149203b Port MaxUnpoolingLayer kernel and add KernelSelect validation test by Dana Zlotnik · 2 years, 5 months ago
  90. 46d44d2 Enable kernel selection testing (Phase #2) by Yair Schwarzbaum · 2 years, 6 months ago
  91. f2c022e Enable fast_math in CpuFullyConnected by cfRod · 2 years, 8 months ago
  92. f727ef4 Add uint8/int8 support to cpu conv3d by Freddie Liardet · 2 years, 9 months ago
  93. 5dda217 Refine DirectConv3d support by Sheri Zhang · 2 years, 9 months ago
  94. 6d9c982 Conv3d support by Sheri Zhang · 2 years, 10 months ago
  95. ded3663 Remove padding in cpuPool2d NCHW by Freddie Liardet · 2 years, 10 months ago
  96. 63e0beb Add support for non-constant weights and biases in CpuFullyConnected by Giorgio Arena · 2 years, 10 months ago
  97. 3ae3d88 Provide logging for configure functions in all cpu operators by ramelg01 · 2 years, 10 months ago
  98. 9ac7b99 Revert "Add support for non-constant weights and biases in CpuFullyConnected" by Pablo Marquez Tello · 2 years, 10 months ago
  99. 2f9ae16 Avoid checking on biases' constantness if nullptr by Giorgio Arena · 2 years, 10 months ago
  100. aed63ee Add support for non-constant weights and biases in CpuFullyConnected by Michele Di Giorgio · 3 years ago