review.mlplatform.org / ml / ComputeLibrary / refs/heads/release_candidate / src / cpu
2961735
Use generic implementation of elementwise_div for s32 SVE kernel
by Yevgen Pronenko
· 3 days ago
main
release_candidate
e683652
Optimize memory management of CPU operators
by Michael Tyler
· 4 days ago
fc94f4d
Update CPU kernels and add mixed sign GEMM support
by Michael Tyler
· 3 weeks ago
cbc27ff
Fix out-of-bound memory write
by Viet-Hoa Do
· 2 weeks ago
ab538a2
Use lookup table for Fp16 Tanh activation in hardware with SVE
by Gunes Bayir
· 5 weeks ago
8710385
Fix linking error to fp16_run_dequantization_core()
by Ramy Elgammal
· 6 weeks ago
d3d2e9b
Refactor Dequantize to enable FP16 kernel in v8a multi_isa builds
by Ramy Elgammal
· 6 weeks ago
1d8936d
Fix nightly build error
by Pablo Marquez Tello
· 6 weeks ago
b4b61a6
Rework CpuQuantizeKernel to enable FP16 in multi_isa builds
by Ramy Elgammal
· 6 weeks ago
2217f1e
Refactor arm_gemm to enable FP16 in all multi_isa builds
by Pablo Marquez Tello
· 7 weeks ago
21fb2ad
Fix ReductionLayer FP16 for armv8a multi_isa builds
by Ramy Elgammal
· 7 weeks ago
c1575b2
Add SME2 implementation of Softmax for QASYMM8 and QASYMM8_SIGNED.
by Omar Al Khatib
· 9 weeks ago
a668f9f
Add s8f32 kernels and dynamic QuantizationInfo
by Jonathan Deakin
· 5 months ago
cdce25b
Accumulation in Cpu Gemm kernels is not supported for quantized kernels in aarch32. This patch guards the relevant tests.
by Radu Salavat
· 3 months ago
cfca87b
Add SME2 implementation of softmax for FP16
by Gunes Bayir
· 3 months ago
f1f1f87
Add in place summation to CPU GEMM kernels
by Radu Salavat
· 4 months ago
1e91d71
Parallelise im2col along dimensions with higher number of iterations
by Milos Puzovic
· 3 months ago
77bbe2e
Add SME2 implementation of softmax for FP32
by Viet-Hoa Do
· 7 months ago
36a75da
[ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorch® autocast() function
by Renato Arantes
· 5 months ago
d219115
Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent
by Gunes Bayir
· 3 months ago
c00a82b
Fix overflow in NEMeanStdDevNormalizationKernel
by Pablo Marquez Tello
· 3 months ago
3ac0b87
Fix validation in pool2d assembly wrapper
by Pablo Marquez Tello
· 4 months ago
93e743f
Optimize CpuSoftmaxKernel for axis != 0 and neon kernels
by Omar Al Khatib
· 6 months ago
9167c9c
Prefer indirect Gemm vs. Direct convolution if supported
by Gunes Bayir
· 4 months ago
bf05373
Fix performance regression in fixed-format kernels
by Gunes Bayir
· 4 months ago
ef63739
Integrate new pretranspose_b_array with extra fused transpose of B
by Gunes Bayir
· 5 months ago
0a48c4c
Requantization cases for offset changes only
by Mohammed Suhail Munshi
· 5 months ago
0c85334
Fix parallel depthwise perf regression from 2db938c
by Jonathan Deakin
· 5 months ago
2db938c
Parallelize CPU depthwise over batch if only 1 row
by Jonathan Deakin
· 5 months ago
0c17c4b
Fix leftover cols in CpuGemmLowpMatrixBReductionKernel
by Jonathan Deakin
· 6 months ago
bde6e78
Fix for Logically dead code detected in Coverity checks
by Anitha Raj
· 5 months ago
7467ba8
Use look up table for fp16 activation
by Mohammed Suhail Munshi
· 7 months ago
7fe7791
Prevent RELU from being processed thru LUT in INT8
by Sangwon Ha
· 6 months ago
b526431
Winograd changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
4737094
Optimize CPU depth-to-space
by Viet-Hoa Do
· 8 months ago
17e116e
Revert "thread_local _custom_scheduler"
by Pablo Marquez Tello
· 7 months ago
fadc9b1
Optimize CpuSoftmaxKernel for axis=0
by Gunes Bayir
· 8 months ago
8d4cdd4
BatchNorm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
568aab6
CpuMul changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
ded5b18
thread_local _custom_scheduler
by David Svantesson
· 11 months ago
ba93371
NormalizationLayer changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
d4650e9
Fix various coverity issues
by SiCong Li
· 8 months ago
24c140f
Fix CpuGemmConv2d int8 segfault
by SiCong Li
· 8 months ago
01b0f9b
Pooling changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
64f4a30
DepthwiseConvolution changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
c5ab4df
Optimize CpuGemmConv2d start-up time
by SiCong Li
· 9 months ago
e5362e7
DirectConv and Im2Col changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
074b985
FuseBatchNorm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
d8a397e
Fix build error in CpuScale
by Pablo Marquez Tello
· 9 months ago
b5cb4d2
Scale changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
9aa153a
Fix build error
by Pablo Marquez Tello
· 9 months ago
6777359
CpuSubKernel changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
68b6dce
Pool2d changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
a23b468
Optimize CLTranspose operator
by Jakub Sujak
· 9 months ago
c2a51bd
Optimize CL and Neon Winograd tests
by Gunes Bayir
· 9 months ago
afd38f0
Apply clang-format on repository
by Felix Thomasmathibalan
· 9 months ago
0392160
Re-arrange header inclusion order
by Felix Thomasmathibalan
· 9 months ago
6d87887
Select changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
6b6ba9e
Maxunpooling changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
e9fd8b4
L2Norm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
f57d6ec
Gemm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 9 months ago
e071b5e
Fix the validation issue in AddMulAdd fused kernel
by Gunes Bayir
· 9 months ago
40a9d3e
Remove deprecated support for BF16 in CpuCast
by Adnan AlSinan
· 10 months ago
2ffc85e
GenerateProposals changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
7e58980
Fuse batch normalization changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
c071328
Fix include dependencies for mass reformatting patch
by Gunes Bayir
· 10 months ago
7ce8a83
Softmax changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
145e82e
Changes to InstanceNorm to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
cf219a4
Changes in NECropResize to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
3912f47
Meanstddevnorm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
45e5b5a
Changes to BoundingBoxTransform to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
ea9bd8f
Changes to ElementwiseOp to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
0d27b2e
Remove legacy PostOps code
by Jakub Sujak
· 10 months ago
7ff03b6
DWC changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
324ba7a
Pool3d changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
8770669
Changes in roi_align to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 10 months ago
cea7060
NEFuseBatchNormalizationKernel rework
by Pablo Marquez Tello
· 11 months ago
3a9ecdf
CpuAdd rework to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 11 months ago
082630b
Update CpuGemmConv2d and CpuFlatten to use CpuReshape operator
by Anitha Raj
· 10 months ago
eb5696d
Optimize CpuReshapeKernel
by Anitha Raj
· 12 months ago
580ecd7
Fix depthwise convolution not using assembly kernel
by Viet-Hoa Do
· 11 months ago
246fe08
Fix various static check issues
by Viet-Hoa Do
· 11 months ago
29e27b0
Add support for S64 output in NEArgMinMaxLayer
by Pablo Marquez Tello
· 11 months ago
78ce273
Document the Conv2D heuristic
by Gian Marco Iodice
· 11 months ago
9129549
Retain back-compatibility for arm_compute/core/Types.h
by SiCong Li
· 11 months ago
4a1c917
Add support for input S64/U64 in CpuCastKernel
by Pablo Marquez Tello
· 12 months ago
314d3e2
Break up core/Utils.h to reduce unused code being included everywhere
by Matthew Bentham
· 1 year ago
1d06204
Do not include headers necessary for logging when logging is disabled
by Matthew Bentham
· 12 months ago
8deee9b
Depthwise channel pre-multiplication
by Michael Tyler
· 12 months ago
7d9a78e
Remove dependency on fp16 definitions from some core include files
by Matthew Bentham
· 1 year, 1 month ago
47a50ef
Address the issues with the ACL coverage pipeline failures related to matmul.
by Renato Arantes
· 1 year ago
8eb82d2
Fix CPU depthwise convolution in case of large padding
by Viet-Hoa Do
· 1 year ago
94abde4
Add Fused Activation to OpenCL MatMul
by Mohammed Suhail Munshi
· 1 year, 1 month ago
043613f
Break up Utils.h a bit to reduce unused code being included everywhere
by Matthew Bentham
· 1 year, 1 month ago
f1aeab9
Break up arm_compute/core/Types.h a bit
by Matthew Bentham
· 1 year, 1 month ago
48cfd5f
Refactor activation LUT computation
by Pablo Marquez Tello
· 1 year, 1 month ago
48c0ed9
Fix ScaleKernel validate method.
by Pablo Marquez Tello
· 1 year, 2 months ago
c0463a2
Move lut kernel to sve2 category
by SiCong Li
· 1 year, 1 month ago
a8db612
Re-enable dynamic weights in Neon™ depthwise convolution
by Ramy Elgammal
· 1 year, 2 months ago
e9b3ee2
Connect CLMatMul function to quantized kernels and resolve NE BatchMatMul int_8 failures
by Jakub Sujak
· 1 year, 2 months ago