mirror of https://github.com/vladmandic/sdnext.git synced 2026-01-27 15:02:48 +03:00

281 Commits

Author SHA1 Message Date
vladmandic
32b8b082e2 cleanup logging
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-16 10:36:02 +01:00
vladmandic
85332594fc triton test reduce verbosity
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-10 10:32:13 +01:00
Seunghoon Lee
49965dfda8 get_hip_arch_name -> get_hip_agent, use amdhip64_7.dll served within rocm package 2026-01-03 21:00:36 +09:00
vladmandic
b9c18452f2 unify hip get arch name
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-03 08:22:19 +01:00
Vladimir Mandic
0b1e6d2d3c improve offloading
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-12-25 10:24:02 +00:00
vladmandic
e2fb70d4a1 detailer draw segmentation overlays
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-17 10:03:17 +01:00
Vladimir Mandic
3a3c984411 Merge pull request #4388 from vladmandic/kanvas
merge kanvas to dev
2025-11-09 07:57:54 -05:00
Disty0
f4ee9c7052 Add Flex attention 2025-11-09 00:14:38 +03:00
Vladimir Mandic
f491955991 Merge pull request #4383 from vladmandic/dev
refresh branch
2025-11-08 15:43:28 -05:00
Vladimir Mandic
69180202d3 kanvas integration
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 15:41:52 -05:00
Disty0
2bbbb684cc Rename CK Flash attention to just Flash attention 2025-11-08 23:24:40 +03:00
Disty0
a93715e0da Don't expose AMD Triton Flash Atten for non AMD 2025-11-08 23:20:55 +03:00
Vladimir Mandic
56026c4e61 refactor attention handling
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 10:55:41 -05:00
Vladimir Mandic
155ee7f84c fix sage-attention checks on sm86
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 08:47:21 -05:00
Vladimir Mandic
5ffbca9377 cleanup and update changelog
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-05 12:57:30 -05:00
CalamitousFelicitousness
bdc477d252 Refactor GPU backend selection for sage attention
Removed hot path, now everything is defined at setup

Also, passing device to get_device_capability so that it works properly with multi-gpu setups.
2025-11-05 16:31:13 +00:00
CalamitousFelicitousness
4c791fb795 Remove model check logic for SA2 workaround 2025-11-05 10:53:07 +00:00
CalamitousFelicitousness
18676996d0 Sage Attention 2 + Triton workaround Qwen-Image
Workaround to prevent black images generated with Qwen-Image models when Sage Attention 2 is enabled with Triton as backend on devices with compute capability 8.0 and 8.6.

Simply switches back to Cuda backend for these models only.

Proof of concept, feel free to close if this is not appropriate.
2025-11-04 23:31:14 +00:00
Vladimir Mandic
780cd26587 triton test hide errors behind debug flag
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-04 07:26:21 -05:00
Vladimir Mandic
495cfd8632 fix cn
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-01 12:21:19 -04:00
Vladimir Mandic
58f218a560 add cudnn enable/disable override
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-01 11:33:39 -04:00
Vladimir Mandic
408b82ef08 cleanup
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-29 09:44:18 -04:00
Vladimir Mandic
46876060ab kandinsky 10s force flex attn
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-28 10:53:19 -04:00
vladmandic
4282762c8c fix late init
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 18:57:51 -04:00
vladmandic
60ac82b191 add basic xpu gpu monitor
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 18:55:54 -04:00
vladmandic
0271e0830c triton split check into early and full
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 11:48:10 -04:00
Disty0
818b0c0821 Add basic triton test 2025-10-26 10:44:04 +03:00
Vladimir Mandic
4b95d72d45 video tab layout
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 14:07:52 -04:00
Vladimir Mandic
1ebd96fdc6 add kandinsky5-lite t2v
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 12:15:36 -04:00
Disty0
4a70e82b0c ROCm always use numpy on cholesky 2025-09-28 14:05:20 +03:00
Disty0
a47959b114 move ROCm Windows hijacks outside of torch install 2025-09-28 13:33:15 +03:00
Disty0
6766563510 Add info log for Building CK Flash attention 2025-09-10 19:25:07 +03:00
Disty0
bc7c89c070 add typing to sdpa hijacks 2025-09-10 04:43:57 +03:00
Disty0
c51552af90 Enable triton flash atten option for rocm linux too 2025-09-03 16:26:00 +03:00
Disty0
c42e0e0b37 Cleanup 2025-09-03 16:18:02 +03:00
Disty0
266c9c0d3d Move Zluda Triton flash atten hijack to Triton Flash attention option 2025-09-03 16:16:41 +03:00
Vladimir Mandic
fa44521ea3 offload-never and offload-always per-module and new highvram profile
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-31 11:40:24 -04:00
Vladimir Mandic
04af23a3bc refactore pipeline apply/unapply optional components & features
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-26 20:04:07 -04:00
Disty0
ad716b118b fix enable_gqa with dyn atten 2025-07-26 01:49:17 +03:00
Vladimir Mandic
a5b77b8ee2 remove dead code
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-05 16:47:25 -04:00
Disty0
c9f49720c5 Cleanup 2025-06-25 23:37:41 +03:00
Disty0
dd84fb541f Always set sdpa params 2025-06-11 21:43:48 +03:00
Disty0
7679028c1a Override CPU to use FP32 by default 2025-06-06 15:33:51 +03:00
chrismuzyn
299d189276 When using the openvino backend, do not look for an nvidia gpu. 2025-05-12 19:14:26 -04:00
Disty0
b0e5a6c4df Add devices.has_triton() and enable NNCF compile if triton is available 2025-05-09 22:24:36 +03:00
Disty0
dfebc909eb Disable cuDNN benchmark on ROCm and add cudnn_benchmark_limit option 2025-05-08 13:27:06 +03:00
Disty0
90f887ac4a Add dim checks to ck flash atten and fix dim check on dyn atten 2025-03-25 03:50:21 +03:00
Seunghoon Lee
0c890b50e0 proper zluda detection 2025-03-20 23:03:23 +09:00
Disty0
1e0f512ccb ROCm disable FP16 for gfx1102 2025-03-19 15:42:36 +03:00
Disty0
878cab085f Reverse the sdpa hijcak order 2025-02-14 19:56:39 +03:00
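
Note on commits 18676996d0 and bdc477d252: the messages above describe falling back from the Triton backend to the CUDA backend on devices with compute capability 8.0 and 8.6, and deciding this once at setup rather than on every attention call, per device. The following is a minimal sketch of that idea only, not the repository's actual code. The names `pick_sage_backend`, `resolve_backends`, and `AFFECTED_CAPABILITIES` are hypothetical, and the capability tuples are passed in directly so the example runs without a GPU; in practice they could come from `torch.cuda.get_device_capability(device)`.

```python
from typing import Dict, Tuple

# Compute capabilities named in the commit message for 18676996d0.
# Everything else in this sketch is an illustrative assumption.
AFFECTED_CAPABILITIES = {(8, 0), (8, 6)}


def pick_sage_backend(capability: Tuple[int, int], requested: str = "triton") -> str:
    """Hypothetical helper: choose a backend for one device.

    Falls back to "cuda" only when Triton was requested on an affected
    compute capability; otherwise the requested backend is kept.
    """
    if requested == "triton" and capability in AFFECTED_CAPABILITIES:
        return "cuda"
    return requested


def resolve_backends(
    capabilities: Dict[int, Tuple[int, int]], requested: str = "triton"
) -> Dict[int, str]:
    """Resolve the backend once per device index at setup time.

    Keeping the result in a dict avoids re-checking the capability
    inside the attention hot path, and handles multi-GPU setups where
    devices differ.
    """
    return {
        index: pick_sage_backend(capability, requested)
        for index, capability in capabilities.items()
    }


if __name__ == "__main__":
    # Example with two assumed devices; the tuples are made up for illustration.
    devices = {0: (8, 6), 1: (8, 9)}
    print(resolve_backends(devices))
```

Resolving per device index reflects the multi-GPU remark in bdc477d252: a single global capability check would pick the wrong backend whenever the installed GPUs differ.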