Disty0
|
784cda80aa
|
update sdnq
|
2026-01-14 16:23:26 +03:00 |
|
Disty0
|
47dcab3522
|
update sdnq
|
2026-01-09 00:34:32 +03:00 |
|
vladmandic
|
4e8b0f83b4
|
lint
Signed-off-by: vladmandic <mandic00@live.com>
|
2026-01-01 16:33:49 +01:00 |
|
Disty0
|
4a4784eafa
|
SDNQ add new stack of custom floating point types and remove irrelevant qtypes from the ui list
|
2025-12-26 20:09:17 +03:00 |
|
Disty0
|
ce8b6d138c
|
SDNQ remove forced uint4 from convs and cleanup
|
2025-12-13 01:32:52 +03:00 |
|
Disty0
|
d4e2cbb826
|
SDNQ fix torch.compile always being active
|
2025-12-08 18:15:08 +03:00 |
|
Disty0
|
064b64c76c
|
cleanup
|
2025-12-08 01:14:19 +03:00 |
|
Disty0
|
6e05a12a49
|
SDNQ post process pre-quants after load
|
2025-12-08 01:08:53 +03:00 |
|
vladmandic
|
0ad40d2b8b
|
lint
Signed-off-by: vladmandic <mandic00@live.com>
|
2025-12-02 12:25:04 +01:00 |
|
Disty0
|
d9bc31e7da
|
Cleanup
|
2025-11-29 01:46:04 +03:00 |
|
Disty0
|
01a0f6b356
|
Warn and disable quantized matmul if triton is not available
|
2025-11-29 01:34:54 +03:00 |
|
Disty0
|
55cf627ac6
|
add version to sdnq
|
2025-11-28 00:45:24 +03:00 |
|
Disty0
|
73e4d1e379
|
Pass torch_dtype to sdnq loader
|
2025-11-27 18:37:35 +03:00 |
|
Disty0
|
7b2a8e3f87
|
cleanup
|
2025-11-27 18:26:14 +03:00 |
|
Disty0
|
ff4c254930
|
Auto handle tied weights with new transformers
|
2025-11-27 18:24:55 +03:00 |
|
CalamitousFelicitousness
|
9dd537072c
|
Fix import path for SDNQ options and handle Qwen models in load_sdnq_model
|
2025-11-27 14:53:03 +00:00 |
|
Disty0
|
131c51918b
|
SDNQ fix model_loader
|
2025-11-27 14:51:45 +03:00 |
|
Disty0
|
ed6f977218
|
SDNQ fix z_image matmul
|
2025-11-27 14:19:29 +03:00 |
|
Disty0
|
48b5d56ba4
|
Enable or disable quantized matmul on pre-quant models
|
2025-11-26 21:08:15 +03:00 |
|
Disty0
|
da0df35106
|
fix typo
|
2025-11-25 21:58:53 +03:00 |
|
Disty0
|
4e4f49b38d
|
update sdnq loader
|
2025-11-22 03:45:27 +03:00 |
|
Disty0
|
b6e9332cfe
|
SDNQ de-couple matmul dtype and add fp16 matmul
|
2025-11-22 02:16:20 +03:00 |
|
Disty0
|
1745ed53f8
|
Refactor SDNQDequantizer
|
2025-11-18 01:42:58 +03:00 |
|
Disty0
|
0e8429dbd8
|
Cleanup
|
2025-11-07 18:49:29 +03:00 |
|
Disty0
|
93f28f07ac
|
Make SDNQ not depended on quantization_config.json and fix invalid quantization_config getting attached to the model on load
|
2025-11-07 18:11:21 +03:00 |
|
Vladimir Mandic
|
5ab9a5a15d
|
add sota model loader: runai streamer
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-27 14:20:10 -04:00 |
|
Disty0
|
b627617d14
|
SDNQ fix enable matmul after load
|
2025-10-19 17:25:02 +03:00 |
|
Disty0
|
758b006104
|
cleanup
|
2025-10-19 02:00:16 +03:00 |
|
Disty0
|
ef72edf18f
|
SDNQ improve svd and low bit matmul perf
|
2025-10-19 00:06:07 +03:00 |
|
Disty0
|
845869079d
|
Fix sdnq unset config
|
2025-10-14 17:58:09 +03:00 |
|
Disty0
|
b601f0d402
|
SDNQ expose svd_steps and update module skip keys
|
2025-10-14 00:15:09 +03:00 |
|
Disty0
|
9a8ba0fc90
|
SDNQ unset device specific configs on save
|
2025-10-11 19:24:09 +03:00 |
|
Disty0
|
c7aba8589b
|
SDNQ fix Qwen loading
|
2025-10-11 00:05:09 +03:00 |
|
Disty0
|
2a3deaa064
|
Check T5 keys before override
|
2025-10-09 22:46:27 +03:00 |
|
Disty0
|
6995d8c3c6
|
SDNQ fix T5 loading
|
2025-10-09 22:42:20 +03:00 |
|
Disty0
|
612df3abbb
|
cleanup
|
2025-10-09 20:09:34 +03:00 |
|
Disty0
|
a9de8ef152
|
cleanup
|
2025-10-09 19:58:57 +03:00 |
|
Disty0
|
e19fb2d833
|
SDNQ keep the quant configs inside the module subfolder, add dtype cast and don't send to GPU
|
2025-10-09 19:34:48 +03:00 |
|
Vladimir Mandic
|
70defe6d06
|
handle load shards
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-09 11:29:36 -04:00 |
|
Vladimir Mandic
|
6907fcd320
|
speedup prequant model load
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-08 13:47:36 -04:00 |
|
Disty0
|
bdcd07f713
|
Add add_module_skip_keys to pre-load quant too
|
2025-10-08 01:11:40 +03:00 |
|
Disty0
|
7fdf400e8b
|
cleanup
|
2025-10-08 00:41:04 +03:00 |
|
Disty0
|
df03ea9ba8
|
SDNQ add sdnq_post_load_quant and update Qwen keys
|
2025-10-08 00:29:36 +03:00 |
|
Vladimir Mandic
|
962cb7115d
|
infra for full-model load/save with quant
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-07 14:30:45 -04:00 |
|
Vladimir Mandic
|
7fdc880a73
|
sdnq patches
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-07 09:43:34 -04:00 |
|
Disty0
|
1cd7b6d63a
|
fix upcast scale check
|
2025-10-07 01:27:54 +03:00 |
|
Disty0
|
aa0c10440f
|
SDNQ make the loader don't touch the model options by default
|
2025-10-07 00:15:23 +03:00 |
|
Disty0
|
c931bf9efa
|
SDNQ add dtype casting to loader
|
2025-10-06 17:44:52 +03:00 |
|
Disty0
|
5c042c5fb8
|
cleanup
|
2025-10-06 11:30:26 +03:00 |
|
Vladimir Mandic
|
a315a004e9
|
linting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
|
2025-10-05 20:25:33 -04:00 |
|