Mirror of https://github.com/vladmandic/sdnext.git (synced 2026-01-27 15:02:48 +03:00)

41 Commits

Author  SHA1        Message  Date
Disty0  8d6bfcd827  Update SDNQ  2026-01-23 14:39:07 +03:00
Disty0  784cda80aa  update sdnq  2026-01-14 16:23:26 +03:00
Disty0  47dcab3522  update sdnq  2026-01-09 00:34:32 +03:00
Disty0  e7fa690321  cleanup  2025-12-26 20:10:55 +03:00
Disty0  4a4784eafa  SDNQ add new stack of custom floating point types and remove irrelevant qtypes from the ui list  2025-12-26 20:09:17 +03:00
Disty0  ce8b6d138c  SDNQ remove forced uint4 from convs and cleanup  2025-12-13 01:32:52 +03:00
Disty0  949ff04577  SDNQ fix fp16 mm with fp8 weights and improve stochastic rounding performance  2025-12-09 17:41:29 +03:00
Disty0  1c2a81ee2d  Make SDNQDequantizer a dataclass  2025-12-08 22:29:45 +03:00
Disty0  ed6f977218  SDNQ fix z_image matmul  2025-11-27 14:19:29 +03:00
Disty0  70b96daa63  cleanup  2025-11-25 23:02:01 +03:00
Disty0  b6e9332cfe  SDNQ de-couple matmul dtype and add fp16 matmul  2025-11-22 02:16:20 +03:00
Disty0  49cd85d388  SDNQ add training related changes  2025-11-18 22:46:14 +03:00
Disty0  3fbfae5963  cleanup  2025-11-18 02:37:10 +03:00
Disty0  1745ed53f8  Refactor SDNQDequantizer  2025-11-18 01:42:58 +03:00
Disty0  6f33ec3357  SDNQ use the model quant params instead of user settings on Lora  2025-11-10 00:12:38 +03:00
Disty0  f05c29175e  cleanup  2025-10-19 02:09:25 +03:00
Disty0  ef72edf18f  SDNQ improve svd and low bit matmul perf  2025-10-19 00:06:07 +03:00
Disty0  9206d9443e  SDNQ add dequantize model  2025-10-12 00:00:53 +03:00
Disty0  5306376b2a  improve contiguous mm performance  2025-10-06 19:05:46 +03:00
Disty0  be91bbff75  SDNQ add SVD support for Convs  2025-10-06 18:26:42 +03:00
Disty0  c931bf9efa  SDNQ add dtype casting to loader  2025-10-06 17:44:52 +03:00
Disty0  23f2deaa58  fix enable_quantized_mamtul  2025-10-06 02:04:28 +03:00
Disty0  9e52d0c1fb  SDNQ add SVDQuant quantization method  2025-10-05 22:50:30 +03:00
Disty0  f2e12a682f  SDNQ remove use_contiguous_mm path in re_quant  2025-10-04 19:17:05 +03:00
Disty0  99113947bf  SDNQ add RDNA2 INT8 support via Triton  2025-10-04 18:31:25 +03:00
Disty0  95a7da7e75  SDNQ use non-contiguous re-quantize  2025-10-03 18:54:58 +03:00
Disty0  54acf1760b  Make SDNQ scales compatible with balanced offload  2025-10-03 18:13:55 +03:00
Disty0  e6715ba8d3  Cleanup SDNQ compile  2025-09-19 19:29:36 +03:00
Disty0  a12edc1e90  SDNQ use nan_to_num_ with fp8 quantization in case of zeros  2025-09-15 20:22:39 +03:00
Disty0  4ec8603f63  SDNQ re-add bitpacking for uint1  2025-08-29 23:06:11 +03:00
Disty0  d49e954918  SDNQ listen to dequantize_fp32 option with re_quantize  2025-08-29 22:48:28 +03:00
Disty0  a8de3f7282  SDNQ add quantized matmul support for all quantization types and group sizes  2025-08-29 22:26:47 +03:00
Disty0  8460be662c  SDNQ use inplace transpose and use view instead of reshape  2025-08-17 05:07:55 +03:00
Disty0  dc7b25d387  Cleanup SDNQ and add SDNQ_USE_TENSORWISE_FP8_MATMUL env var  2025-08-11 14:50:17 +03:00
Disty0  3f45c4e570  Cleanup SDNQ and skip transpose on packed int8 matmul  2025-08-10 19:31:34 +03:00
Disty0  c3d007b02c  SDNQ split forward.py into layers and cleanup  2025-08-02 17:36:55 +03:00
Disty0  25a4731a97  SDNQ use static compile  2025-07-20 16:25:57 +03:00
Disty0  86cd272b96  SDNQ fix Dora  2025-06-18 16:24:42 +03:00
Disty0  26800a1ef9  Cleanup sdnq  2025-06-17 02:05:13 +03:00
Disty0  d31df8c1eb  SDNQ fuse bias into dequantizer with matmul  2025-06-14 22:10:10 +03:00
Disty0  5e013fb154  SDNQ optimize input quantization and use the word quantize instead of compress  2025-06-12 12:06:57 +03:00