review.mlplatform.org / ml / ComputeLibrary / refs/heads/release_candidate / src
cbc27ff
Fix out-of-bound memory write
by Viet-Hoa Do
· 2 weeks ago
d723076
Enable FP16 in multi_isa+v8a
by Pablo Marquez Tello
· 3 weeks ago
311753a
Fixed illegal instruction in Softmax
by Pablo Marquez Tello
· 3 weeks ago
f1723a0
Fix OpenMP thread scheduling for large machines
by Hamza Butt
· 6 weeks ago
ab538a2
Use lookup table for Fp16 Tanh activation in hardware with SVE
by Gunes Bayir
· 6 weeks ago
1d2733e
Fix issues with OpenMP scheduler little core exclusion.
by Omar Al Khatib
· 6 weeks ago
f5053f7
Update logic in the OpenMP scheduler to exclude LITTLE cores
by Omar Al Khatib
· 8 weeks ago
8710385
Fix linking error to fp16_run_dequantization_core()
by Ramy Elgammal
· 6 weeks ago
d3d2e9b
Refactor Dequantize to enable FP16 kernel in v8a multi_isa builds
by Ramy Elgammal
· 7 weeks ago
1d8936d
Fix nightly build error
by Pablo Marquez Tello
· 7 weeks ago
b4b61a6
Rework CpuQuantizeKernel to enable FP16 in multi_isa builds
by Ramy Elgammal
· 7 weeks ago
2217f1e
Refactor arm_gemm to enable FP16 in all multi_isa builds
by Pablo Marquez Tello
· 7 weeks ago
21fb2ad
Fix ReductionLayer FP16 for armv8a multi_isa builds
by Ramy Elgammal
· 7 weeks ago
4c3f716
Improve CPU extension detection on macos
by Viet-Hoa Do
· 7 weeks ago
05269f0
ScatterND fix for scalar cases
by Gunes Bayir
· 8 weeks ago
48f120c
Make quantization rounding consistent
by Jonathan Deakin
· 8 weeks ago
c1575b2
Add SME2 implementation of Softmax for QASYMM8 and QASYMM8_SIGNED.
by Omar Al Khatib
· 10 weeks ago
2fea135
Add batched indices support to Scatter GPU Implementation
by Mohammed Suhail Munshi
· 9 weeks ago
c22e126
arm_gemm: fix SVE check on fast mode kernels.
by David Mansell
· 8 weeks ago
0c5ba9e
Change reorder implementation to be vector length agnostic for OHWIo8 reorder
by Radu Salavat
· 3 months ago
5c76742
New SME2 heuristics.
by David Mansell
· 4 months ago
301e33f
Add fp16 and integer data type support for ScatterNd in Gpu
by Gunes Bayir
· 9 weeks ago
e5ef8c1
Disable SME2 Gemmlowp s8f32 kernel selection in case results needs to be accumulated
by Gunes Bayir
· 9 weeks ago
499b5bc
Disable SME2 Gemm kernel selection in case results needs to be accumulated
by Gunes Bayir
· 9 weeks ago
ada3200
Add update/index/output (m+1)/2d/(m+n) support for CLScatter
by Gunes Bayir
· 10 weeks ago
0fa28be
Add padding to the shift and multipliers buffers
by Pablo Marquez Tello
· 10 weeks ago
7377107
Scatter GPU Kernel Implementation for 1D tensors.
by Mohammed Suhail Munshi
· 3 months ago
6ac82a4
fix compilation errors on linux with gcc12
by Sunita Nadampalli
· 3 months ago
a668f9f
Add s8f32 kernels and dynamic QuantizationInfo
by Jonathan Deakin
· 5 months ago
cdce25b
Accumulation in Cpu Gemm kernels is not supported for quantized kernels in aarch32. This patch guards the relevant tests.
by Radu Salavat
· 3 months ago
cfca87b
Add SME2 implementation of softmax for FP16
by Gunes Bayir
· 3 months ago
f1f1f87
Add in place summation to CPU GEMM kernels
by Radu Salavat
· 4 months ago
553e241
Fix compiler error
by Pablo Marquez Tello
· 3 months ago
1e91d71
Parallelise im2col along dimensions with higher number of iterations
by Milos Puzovic
· 3 months ago
77bbe2e
Add SME2 implementation of softmax for FP32
by Viet-Hoa Do
· 7 months ago
905786e
Added new NEON fixed format fast math mode hybrid kernel with maximum height of 6 for accumulation and updated heuristics
by Milos Puzovic
· 3 months ago
473b829
Adds Tests and reference implementation for scatter operator with 1D tensors.
by Mohammed Suhail Munshi
· 3 months ago
8609ca0
Add skeleton for CLScatter op, reference and tests
by Mohammed Suhail Munshi
· 4 months ago
36a75da
[ONCPUML-1451] Add matmul kernel to enable bf16 to bf16 operations via PyTorch® autocast() function
by Renato Arantes
· 5 months ago
d219115
Make Cpu/Gpu/Ref scalar/vectoral S32 division consistent
by Gunes Bayir
· 3 months ago
c00a82b
Fix overflow in NEMeanStdDevNormalizationKernel
by Pablo Marquez Tello
· 3 months ago
3e4b193
Fix quant. gemv kernel driver by adding set_quantized_bias()
by Gunes Bayir
· 4 months ago
5a67733
arm_gemm: Fix bias handling for sme2 FP16 GEMV.
by David Mansell
· 4 months ago
3ac0b87
Fix validation in pool2d assembly wrapper
by Pablo Marquez Tello
· 4 months ago
93e743f
Optimize CpuSoftmaxKernel for axis != 0 and neon kernels
by Omar Al Khatib
· 6 months ago
57a8852
Fix WoA nightly failure
by Pablo Marquez Tello
· 4 months ago
9167c9c
Prefer indirect Gemm vs. Direct convolution if supported
by Gunes Bayir
· 4 months ago
40af090
Disable FP16 on 32 bit
by Pablo Marquez Tello
· 4 months ago
bf05373
Fix performance regression in fixed-format kernels
by Gunes Bayir
· 4 months ago
6fe9eaf
Set Neon™ as present for WoA
by Pablo Marquez Tello
· 4 months ago
2676424
Fix segfault in DWC in WoA
by Pablo Marquez Tello
· 4 months ago
c1787f0
Fix OpenBSD® build failure caused by patch 11144
by Gunes Bayir
· 4 months ago
ef63739
Integrate new pretranspose_b_array with extra fused transpose of B
by Gunes Bayir
· 5 months ago
0a48c4c
Requantization cases for offset changes only
by Mohammed Suhail Munshi
· 5 months ago
7976f08
Fix compiler errors in cl-clang
by Pablo Marquez Tello
· 5 months ago
0c85334
Fix parallel depthwise perf regression from 2db938c
by Jonathan Deakin
· 5 months ago
0e73498
Add support for QSYMM8 in ClCastKernel
by Pablo Marquez Tello
· 5 months ago
0ee13af
Remove CKW prototype and Template Writer
by Gunes Bayir
· 5 months ago
a3e1b50
Fix the bug in GpuTanh operator in dynamic fusion
by Gunes Bayir
· 5 months ago
a5a81ae
Mark GpuSoftmax and GpuReshape as not supported
by Gunes Bayir
· 5 months ago
2db938c
Parallelize CPU depthwise over batch if only 1 row
by Jonathan Deakin
· 5 months ago
e695579
arm_gemm: SME: Remove artificial single-thread constraint on quantized int8 kernels.
by David Mansell
· 6 months ago
0c17c4b
Fix leftover cols in CpuGemmLowpMatrixBReductionKernel
by Jonathan Deakin
· 6 months ago
2b9fa59
Use the stable CKW API in the GPU dynamic fusion backend
by Gunes Bayir
· 6 months ago
fb92e22
arm_gemm: convolution: optimize convolver.hpp.
by David Mansell
· 7 months ago
bde6e78
Fix for Logically dead code detected in Coverity checks
by Anitha Raj
· 5 months ago
e8e016e
Fix for unchecked return value detected in Coverity checks.
by Anitha Raj
· 5 months ago
fdf56fb
Make GpuWorkloadContext own all tensor info objects
by Viet-Hoa Do
· 5 months ago
6829e02
Fix divide-by-zero compilation error
by Viet-Hoa Do
· 6 months ago
8896cf7
Fix minor issue, clean lut code
by Mohammed Suhail Munshi
· 6 months ago
27dee1e
Fix potential threading issue in LUTManager
by Mohammed Suhail Munshi
· 6 months ago
0eb9cfb
[ONCPUML-1387] Add ACL based reorder for f32 to bf16 data type conversion.
by Renato Arantes
· 7 months ago
5d7a93a
Fix compilation error on GCC 13.2
by Jakub Sujak
· 6 months ago
7467ba8
Use look up table for fp16 activation
by Mohammed Suhail Munshi
· 7 months ago
7fe7791
Prevent RELU from being processed thru LUT in INT8
by Sangwon Ha
· 6 months ago
c310c11
Fix nightly issue caused by gemm_reshaped_only_rhs_mmul kernel
by Gunes Bayir
· 6 months ago
85cafff
Add Mali™-G720 and Mali™-G620 as GpuTargets
by Gunes Bayir
· 7 months ago
306a8a9
Fix nightly bug caused by not validation 3d cases for input tensor
by Gunes Bayir
· 7 months ago
ec0a057
Revert "Fix nightly bug caused by wrong validation in Gemm mmul kernel"
by Gunes Bayir
· 7 months ago
feef9b9
Fix validation error in CL generate proposals kernel
by Gunes Bayir
· 7 months ago
270576a
Fix nightly bug caused by wrong validation in Gemm mmul kernel
by Gunes Bayir
· 7 months ago
b526431
Winograd changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
0660172
Fix validation error in graph_ssd_mobilenet
by Gunes Bayir
· 7 months ago
eb475ec
Fix unit tests failing in CL/UNIT/TensorAllocator
by Gunes Bayir
· 7 months ago
4737094
Optimize CPU depth-to-space
by Viet-Hoa Do
· 8 months ago
17e116e
Revert "thread_local _custom_scheduler"
by Pablo Marquez Tello
· 7 months ago
fadc9b1
Optimize CpuSoftmaxKernel for axis=0
by Gunes Bayir
· 8 months ago
9f7aca9
Changes to enable FP16 in armv8a multi_isa
by Pablo Marquez Tello
· 11 months ago
8d4cdd4
BatchNorm changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
568aab6
CpuMul changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 7 months ago
ded5b18
thread_local _custom_scheduler
by David Svantesson
· 11 months ago
ba93371
NormalizationLayer changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
d4650e9
Fix various coverity issues
by SiCong Li
· 8 months ago
ec2afd6
Fix device issue with CL softmax
by Viet-Hoa Do
· 8 months ago
c63f8b0
Update comments to suppress doxygen warnings.
by Anitha Raj
· 8 months ago
24c140f
Fix CpuGemmConv2d int8 segfault
by SiCong Li
· 8 months ago
92c3d71
Remove duplicate definitions of BF16 fixed format kernels.
by David Mansell
· 8 months ago
01b0f9b
Pooling changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
64f4a30
DepthwiseConvolution changes to enable fp16 in armv8a multi_isa builds
by Pablo Marquez Tello
· 8 months ago
c5ab4df
Optimize CpuGemmConv2d start-up time
by SiCong Li
· 9 months ago