Disty0 | 8d6bfcd827 | Update SDNQ | 2026-01-23 14:39:07 +03:00
Disty0 | 784cda80aa | update sdnq | 2026-01-14 16:23:26 +03:00
Disty0 | 47dcab3522 | update sdnq | 2026-01-09 00:34:32 +03:00
Disty0 | e7fa690321 | cleanup | 2025-12-26 20:10:55 +03:00
Disty0 | 4a4784eafa | SDNQ add new stack of custom floating point types and remove irrelevant qtypes from the ui list | 2025-12-26 20:09:17 +03:00
Disty0 | ce8b6d138c | SDNQ remove forced uint4 from convs and cleanup | 2025-12-13 01:32:52 +03:00
Disty0 | 949ff04577 | SDNQ fix fp16 mm with fp8 weights and improve stochastic rounding performance | 2025-12-09 17:41:29 +03:00
Disty0 | 1c2a81ee2d | Make SDNQDequantizer a dataclass | 2025-12-08 22:29:45 +03:00
Disty0 | ed6f977218 | SDNQ fix z_image matmul | 2025-11-27 14:19:29 +03:00
Disty0 | 70b96daa63 | cleanup | 2025-11-25 23:02:01 +03:00
Disty0 | b6e9332cfe | SDNQ de-couple matmul dtype and add fp16 matmul | 2025-11-22 02:16:20 +03:00
Disty0 | 49cd85d388 | SDNQ add training related changes | 2025-11-18 22:46:14 +03:00
Disty0 | 3fbfae5963 | cleanup | 2025-11-18 02:37:10 +03:00
Disty0 | 1745ed53f8 | Refactor SDNQDequantizer | 2025-11-18 01:42:58 +03:00
Disty0 | 6f33ec3357 | SDNQ use the model quant params instead of user settings on Lora | 2025-11-10 00:12:38 +03:00
Disty0 | f05c29175e | cleanup | 2025-10-19 02:09:25 +03:00
Disty0 | ef72edf18f | SDNQ improve svd and low bit matmul perf | 2025-10-19 00:06:07 +03:00
Disty0 | 9206d9443e | SDNQ add dequantize model | 2025-10-12 00:00:53 +03:00
Disty0 | 5306376b2a | improve contiguous mm performance | 2025-10-06 19:05:46 +03:00
Disty0 | be91bbff75 | SDNQ add SVD support for Convs | 2025-10-06 18:26:42 +03:00
Disty0 | c931bf9efa | SDNQ add dtype casting to loader | 2025-10-06 17:44:52 +03:00
Disty0 | 23f2deaa58 | fix enable_quantized_mamtul | 2025-10-06 02:04:28 +03:00
Disty0 | 9e52d0c1fb | SDNQ add SVDQuant quantization method | 2025-10-05 22:50:30 +03:00
Disty0 | f2e12a682f | SDNQ remove use_contiguous_mm path in re_quant | 2025-10-04 19:17:05 +03:00
Disty0 | 99113947bf | SDNQ add RDNA2 INT8 support via Triton | 2025-10-04 18:31:25 +03:00
Disty0 | 95a7da7e75 | SDNQ use non-contiguous re-quantize | 2025-10-03 18:54:58 +03:00
Disty0 | 54acf1760b | Make SDNQ scales compatible with balanced offload | 2025-10-03 18:13:55 +03:00
Disty0 | e6715ba8d3 | Cleanup SDNQ compile | 2025-09-19 19:29:36 +03:00
Disty0 | a12edc1e90 | SDNQ use nan_to_num_ with fp8 quantization in case of zeros | 2025-09-15 20:22:39 +03:00
Disty0 | 4ec8603f63 | SDNQ re-add bitpacking for uint1 | 2025-08-29 23:06:11 +03:00
Disty0 | d49e954918 | SDNQ listen to dequantize_fp32 option with re_quantize | 2025-08-29 22:48:28 +03:00
Disty0 | a8de3f7282 | SDNQ add quantized matmul support for all quantization types and group sizes | 2025-08-29 22:26:47 +03:00
Disty0 | 8460be662c | SDNQ use inplace transpose and use view instead of reshape | 2025-08-17 05:07:55 +03:00
Disty0 | dc7b25d387 | Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var | 2025-08-11 14:50:17 +03:00
Disty0 | 3f45c4e570 | Cleanup SDNQ and skip transpose on packed int8 matmul | 2025-08-10 19:31:34 +03:00
Disty0 | c3d007b02c | SDNQ split forward.py into layers and cleanup | 2025-08-02 17:36:55 +03:00
Disty0 | 25a4731a97 | SDNQ use static compile | 2025-07-20 16:25:57 +03:00
Disty0 | 86cd272b96 | SDNQ fix Dora | 2025-06-18 16:24:42 +03:00
Disty0 | 26800a1ef9 | Cleanup sdnq | 2025-06-17 02:05:13 +03:00
Disty0 | d31df8c1eb | SDNQ fuse bias into dequantizer with matmul | 2025-06-14 22:10:10 +03:00
Disty0 | 5e013fb154 | SDNQ optimize input quantization and use the word quantize instead of compress | 2025-06-12 12:06:57 +03:00