mirror of https://github.com/vladmandic/sdnext.git synced 2026-01-29 05:02:09 +03:00

54 Commits

Author SHA1 Message Date
Disty0
784cda80aa update sdnq 2026-01-14 16:23:26 +03:00
Disty0
47dcab3522 update sdnq 2026-01-09 00:34:32 +03:00
vladmandic
4e8b0f83b4 lint
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-01 16:33:49 +01:00
Disty0
4a4784eafa SDNQ add new stack of custom floating point types and remove irrelevant qtypes from the ui list 2025-12-26 20:09:17 +03:00
Disty0
ce8b6d138c SDNQ remove forced uint4 from convs and cleanup 2025-12-13 01:32:52 +03:00
Disty0
d4e2cbb826 SDNQ fix torch.compile always being active 2025-12-08 18:15:08 +03:00
Disty0
064b64c76c cleanup 2025-12-08 01:14:19 +03:00
Disty0
6e05a12a49 SDNQ post process pre-quants after load 2025-12-08 01:08:53 +03:00
vladmandic
0ad40d2b8b lint
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-02 12:25:04 +01:00
Disty0
d9bc31e7da Cleanup 2025-11-29 01:46:04 +03:00
Disty0
01a0f6b356 Warn and disable quantized matmul if triton is not available 2025-11-29 01:34:54 +03:00
Disty0
55cf627ac6 add version to sdnq 2025-11-28 00:45:24 +03:00
Disty0
73e4d1e379 Pass torch_dtype to sdnq loader 2025-11-27 18:37:35 +03:00
Disty0
7b2a8e3f87 cleanup 2025-11-27 18:26:14 +03:00
Disty0
ff4c254930 Auto handle tied weights with new transformers 2025-11-27 18:24:55 +03:00
CalamitousFelicitousness
9dd537072c Fix import path for SDNQ options and handle Qwen models in load_sdnq_model 2025-11-27 14:53:03 +00:00
Disty0
131c51918b SDNQ fix model_loader 2025-11-27 14:51:45 +03:00
Disty0
ed6f977218 SDNQ fix z_image matmul 2025-11-27 14:19:29 +03:00
Disty0
48b5d56ba4 Enable or disable quantized matmul on pre-quant models 2025-11-26 21:08:15 +03:00
Disty0
da0df35106 fix typo 2025-11-25 21:58:53 +03:00
Disty0
4e4f49b38d update sdnq loader 2025-11-22 03:45:27 +03:00
Disty0
b6e9332cfe SDNQ de-couple matmul dtype and add fp16 matmul 2025-11-22 02:16:20 +03:00
Disty0
1745ed53f8 Refactor SDNQDequantizer 2025-11-18 01:42:58 +03:00
Disty0
0e8429dbd8 Cleanup 2025-11-07 18:49:29 +03:00
Disty0
93f28f07ac Make SDNQ not depended on quantization_config.json and fix invalid quantization_config getting attached to the model on load 2025-11-07 18:11:21 +03:00
Vladimir Mandic
5ab9a5a15d add sota model loader: runai streamer
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-27 14:20:10 -04:00
Disty0
b627617d14 SDNQ fix enable matmul after load 2025-10-19 17:25:02 +03:00
Disty0
758b006104 cleanup 2025-10-19 02:00:16 +03:00
Disty0
ef72edf18f SDNQ improve svd and low bit matmul perf 2025-10-19 00:06:07 +03:00
Disty0
845869079d Fix sdnq unset config 2025-10-14 17:58:09 +03:00
Disty0
b601f0d402 SDNQ expose svd_steps and update module skip keys 2025-10-14 00:15:09 +03:00
Disty0
9a8ba0fc90 SDNQ unset device specific configs on save 2025-10-11 19:24:09 +03:00
Disty0
c7aba8589b SDNQ fix Qwen loading 2025-10-11 00:05:09 +03:00
Disty0
2a3deaa064 Check T5 keys before override 2025-10-09 22:46:27 +03:00
Disty0
6995d8c3c6 SDNQ fix T5 loading 2025-10-09 22:42:20 +03:00
Disty0
612df3abbb cleanup 2025-10-09 20:09:34 +03:00
Disty0
a9de8ef152 cleanup 2025-10-09 19:58:57 +03:00
Disty0
e19fb2d833 SDNQ keep the quant configs inside the module subfolder, add dtype cast and don't send to GPU 2025-10-09 19:34:48 +03:00
Vladimir Mandic
70defe6d06 handle load shards
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-09 11:29:36 -04:00
Vladimir Mandic
6907fcd320 speedup prequant model load
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-08 13:47:36 -04:00
Disty0
bdcd07f713 Add add_module_skip_keys to pre-load quant too 2025-10-08 01:11:40 +03:00
Disty0
7fdf400e8b cleanup 2025-10-08 00:41:04 +03:00
Disty0
df03ea9ba8 SDNQ add sdnq_post_load_quant and update Qwen keys 2025-10-08 00:29:36 +03:00
Vladimir Mandic
962cb7115d infra for full-model load/save with quant
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-07 14:30:45 -04:00
Vladimir Mandic
7fdc880a73 sdnq patches
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-07 09:43:34 -04:00
Disty0
1cd7b6d63a fix upcast scale check 2025-10-07 01:27:54 +03:00
Disty0
aa0c10440f SDNQ make the loader don't touch the model options by default 2025-10-07 00:15:23 +03:00
Disty0
c931bf9efa SDNQ add dtype casting to loader 2025-10-06 17:44:52 +03:00
Disty0
5c042c5fb8 cleanup 2025-10-06 11:30:26 +03:00
Vladimir Mandic
a315a004e9 linting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-05 20:25:33 -04:00