vladmandic | a4671045b6 | lint and crlf (Signed-off-by: vladmandic <mandic00@live.com>) | 2026-01-24 10:28:46 +01:00
Disty0 | 8d6bfcd827 | Update SDNQ | 2026-01-23 14:39:07 +03:00
Disty0 | 784cda80aa | update sdnq | 2026-01-14 16:23:26 +03:00
Disty0 | db59d2b507 | SDNQ handle packed floats in fp mm | 2025-12-27 16:29:18 +03:00
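The "packed floats" handling above concerns sub-byte formats where two 4-bit codes share one uint8 container, which must be unpacked before a floating-point matmul. A minimal sketch of the general packing technique, assuming 4-bit codes; the names are hypothetical, not SDNQ's actual API:

```python
import torch

def pack_uint4(codes: torch.Tensor) -> torch.Tensor:
    # codes: flat tensor of 4-bit values (0..15), one per byte; assumes an even count
    codes = codes.to(torch.uint8).view(-1, 2)
    return codes[:, 0] | (codes[:, 1] << 4)  # two codes per byte

def unpack_uint4(packed: torch.Tensor) -> torch.Tensor:
    low = packed & 0x0F
    high = (packed >> 4) & 0x0F
    return torch.stack((low, high), dim=-1).view(-1)  # restores pack order
```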
Disty0 | 949ff04577 | SDNQ fix fp16 mm with fp8 weights and improve stochastic rounding performance | 2025-12-09 17:41:29 +03:00
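Stochastic rounding, mentioned above, rounds up with probability equal to the fractional part so the rounding error is zero in expectation, which preserves accuracy when casting to low precision. A minimal sketch of the technique, illustrative rather than SDNQ's actual kernel:

```python
import torch

def stochastic_round(x: torch.Tensor) -> torch.Tensor:
    floor = x.floor()
    frac = x - floor
    # round up with probability `frac`, so E[result] == x
    return floor + (torch.rand_like(x) < frac).to(x.dtype)
```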
Disty0 | aaef4992c3 | SDNQ fix svd + fp8 tw and fp16 mm | 2025-11-28 22:31:09 +03:00
Disty0 | b6e9332cfe | SDNQ de-couple matmul dtype and add fp16 matmul | 2025-11-22 02:16:20 +03:00
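De-coupling the matmul dtype means the storage dtype (int8/fp8) and the compute dtype (e.g. fp16) are chosen independently: weights are dequantized into the compute dtype just before the matmul. A hedged sketch with illustrative names, not SDNQ's real signatures:

```python
import torch
import torch.nn.functional as F

def quantized_linear(x, q_weight, scale, matmul_dtype=torch.float16):
    # storage dtype of q_weight (int8/fp8) is independent of matmul_dtype
    w = q_weight.to(matmul_dtype) * scale.to(matmul_dtype)  # dequantize
    return F.linear(x.to(matmul_dtype), w)
```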
Disty0 | 1745ed53f8 | Refactor SDNQDequantizer | 2025-11-18 01:42:58 +03:00
Disty0 | 6f33ec3357 | SDNQ use the model quant params instead of user settings on Lora | 2025-11-10 00:12:38 +03:00
Disty0 | f12caf81f9 | SDNQ skip bad layers on svd and fix svd with dequantize_fp32 | 2025-10-17 17:25:50 +03:00
Disty0 | c7aba8589b | SDNQ fix Qwen loading | 2025-10-11 00:05:09 +03:00
Disty0 | be91bbff75 | SDNQ add SVD support for Convs | 2025-10-06 18:26:42 +03:00
Disty0 | 9e52d0c1fb | SDNQ add SVDQuant quantization method | 2025-10-05 22:50:30 +03:00
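SVDQuant splits a weight into a small high-precision low-rank component that absorbs outliers plus a residual that quantizes more easily. A rough sketch of the decomposition; `rank` and the function name are illustrative assumptions:

```python
import torch

def svd_split(weight: torch.Tensor, rank: int = 32):
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    low_rank = (U[:, :rank] * S[:rank]) @ Vh[:rank]  # kept in high precision
    residual = weight - low_rank                     # quantized separately
    return low_rank, residual
```

The forward pass then sums the output of the high-precision low-rank path with the output of the quantized residual path.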
Disty0 | 99113947bf | SDNQ add RDNA2 INT8 support via Triton | 2025-10-04 18:31:25 +03:00
Disty0 | 54acf1760b | Make SDNQ scales compatible with balanced offload | 2025-10-03 18:13:55 +03:00
Disty0 | c5cab96223 | SDNQ simplify check_mats | 2025-10-03 02:58:17 +03:00
Disty0 | 03382bdd4c | SDNQ simplify check_mats | 2025-10-01 01:35:51 +03:00
Disty0 | 0c1d34721c | SDNQ use contiguous for intel | 2025-09-30 02:37:58 +03:00
Disty0 | 6b67a9d0c4 | SDNQ add check_mats to matmul | 2025-09-30 01:58:13 +03:00
Disty0 | e6715ba8d3 | Cleanup SDNQ compile | 2025-09-19 19:29:36 +03:00
Disty0 | a12edc1e90 | SDNQ use nan_to_num_ with fp8 quantization in case of zeros | 2025-09-15 20:22:39 +03:00
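The nan_to_num_ guard above addresses all-zero quantization groups: their max-abs scale is 0, so the division produces NaN, which is cleared in place before the fp8 cast. A minimal sketch of the pattern, assuming the e4m3 variant and per-row scales:

```python
import torch

def quantize_fp8(x: torch.Tensor):
    fp8 = torch.float8_e4m3fn
    scale = x.abs().amax(dim=-1, keepdim=True) / torch.finfo(fp8).max
    q = (x / scale).nan_to_num_()  # 0/0 -> NaN when a group is all zeros
    return q.to(fp8), scale
```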
Vladimir Mandic | 9743c8e4bf | keep previous processed state (Signed-off-by: Vladimir Mandic <mandic00@live.com>) | 2025-08-31 15:20:15 -04:00
Disty0 | bbb345cf44 | Fix bias dtype mismatch | 2025-08-30 02:31:41 +03:00
Disty0 | 6c36433a14 | SDNQ fix row-wise FP8 matmul with fp32 and fp16 inputs | 2025-08-30 02:27:15 +03:00
Disty0 | a8de3f7282 | SDNQ add quantized matmul support for all quantization types and group sizes | 2025-08-29 22:26:47 +03:00
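Group-size support means each run of `group_size` elements along the input dimension carries its own scale, and dequantization reshapes the weight to apply one scale per group. A hedged sketch with illustrative names and an assumed scale layout of one scale per (row, group):

```python
import torch

def dequantize_grouped(q: torch.Tensor, scale: torch.Tensor, group_size: int) -> torch.Tensor:
    out_f, in_f = q.shape
    # split the input dim into groups, broadcast one scale over each group
    w = q.to(scale.dtype).view(out_f, in_f // group_size, group_size)
    return (w * scale.view(out_f, -1, 1)).view(out_f, in_f)
```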
Disty0 | f324b7c0e5 | SDNQ remove unnecessary .contiguous() | 2025-08-21 02:21:05 +03:00
Disty0 | 8460be662c | SDNQ use inplace transpose and use view instead of reshape | 2025-08-17 05:07:55 +03:00
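The view-versus-reshape change matters because view() reuses the existing storage and errors on incompatible strides, while reshape() silently copies; combined with in-place transpose_, any accidental copy becomes an explicit failure rather than a hidden cost. For illustration:

```python
import torch

x = torch.randn(4, 8)
y = x.view(8, 4)     # no copy: contiguous strides allow a pure view
x.transpose_(0, 1)   # in-place metadata swap, no data movement
z = x.reshape(-1)    # x is now non-contiguous: reshape copies here,
                     # while x.view(-1) would raise a RuntimeError
```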
Disty0 | 9992338187 | sdnq fix convs | 2025-08-11 23:24:13 +03:00
Disty0 | 26461f1d8d | fix conv int8 matmul | 2025-08-11 23:15:30 +03:00
Disty0 | dc7b25d387 | Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var | 2025-08-11 14:50:17 +03:00
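The env var name comes from the commit; what it plausibly toggles is tensor-wise (one scale for the whole tensor) versus row-wise (one scale per row) fp8 scaling. A hedged sketch; the logic and default are illustrative assumptions:

```python
import os
import torch

USE_TENSORWISE = os.environ.get("SDNQ_USE_TENSORWISE_FP8_MATMUL", "0") == "1"

def fp8_scale(x: torch.Tensor) -> torch.Tensor:
    fp8_max = torch.finfo(torch.float8_e4m3fn).max
    if USE_TENSORWISE:
        return x.abs().amax() / fp8_max                   # one scalar scale
    return x.abs().amax(dim=-1, keepdim=True) / fp8_max   # per-row scales
```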
Disty0 | 3f45c4e570 | Cleanup SDNQ and skip transpose on packed int8 matmul | 2025-08-10 19:31:34 +03:00
Disty0 | 22d86acda3 | Make SDNQ MatMul listen to the dequantize fp32 setting | 2025-08-09 01:10:07 +03:00
Disty0 | c3d007b02c | SDNQ split forward.py into layers and cleanup | 2025-08-02 17:36:55 +03:00