vladmandic
32b8b082e2
cleanup logging
...
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-16 10:36:02 +01:00
vladmandic
85332594fc
triton test reduce verbosity
...
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-10 10:32:13 +01:00
Seunghoon Lee
49965dfda8
get_hip_arch_name -> get_hip_agent, use amdhip64_7.dll served within rocm package
2026-01-03 21:00:36 +09:00
vladmandic
b9c18452f2
unify hip get arch name
...
Signed-off-by: vladmandic <mandic00@live.com>
2026-01-03 08:22:19 +01:00
Vladimir Mandic
0b1e6d2d3c
improve offloading
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-12-25 10:24:02 +00:00
vladmandic
e2fb70d4a1
detailer draw segmentation overlays
...
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-17 10:03:17 +01:00
Vladimir Mandic
3a3c984411
Merge pull request #4388 from vladmandic/kanvas
...
merge kanvas to dev
2025-11-09 07:57:54 -05:00
Disty0
f4ee9c7052
Add Flex attention
2025-11-09 00:14:38 +03:00
Vladimir Mandic
f491955991
Merge pull request #4383 from vladmandic/dev
...
refresh branch
2025-11-08 15:43:28 -05:00
Vladimir Mandic
69180202d3
kanvas integration
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 15:41:52 -05:00
Disty0
2bbbb684cc
Rename CK Flash attention to just Flash attention
2025-11-08 23:24:40 +03:00
Disty0
a93715e0da
Don't expose AMD Triton Flash Atten for non AMD
2025-11-08 23:20:55 +03:00
Vladimir Mandic
56026c4e61
refactor attention handling
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 10:55:41 -05:00
Vladimir Mandic
155ee7f84c
fix sage-attention checks on sm86
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-08 08:47:21 -05:00
Vladimir Mandic
5ffbca9377
cleanup and update changelog
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-05 12:57:30 -05:00
CalamitousFelicitousness
bdc477d252
Refactor GPU backend selection for sage attention
...
Removed the hot path; everything is now defined at setup.
Also passes the device to get_device_capability so that it works properly with multi-GPU setups.
2025-11-05 16:31:13 +00:00
CalamitousFelicitousness
4c791fb795
Remove model check logic for SA2 workaround
2025-11-05 10:53:07 +00:00
CalamitousFelicitousness
18676996d0
Sage Attention 2 + Triton workaround Qwen-Image
...
Workaround to prevent black images from being generated with Qwen-Image models when Sage Attention 2 is enabled with Triton as the backend on devices with compute capability 8.0 and 8.6.
Simply switches back to the CUDA backend for these models only.
Proof of concept, feel free to close if this is not appropriate.
2025-11-04 23:31:14 +00:00
Vladimir Mandic
780cd26587
triton test hide errors behind debug flag
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-04 07:26:21 -05:00
Vladimir Mandic
495cfd8632
fix cn
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-01 12:21:19 -04:00
Vladimir Mandic
58f218a560
add cudnn enable/disable override
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-11-01 11:33:39 -04:00
Vladimir Mandic
408b82ef08
cleanup
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-29 09:44:18 -04:00
Vladimir Mandic
46876060ab
kandinsky 10s force flex attn
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-28 10:53:19 -04:00
vladmandic
4282762c8c
fix late init
...
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 18:57:51 -04:00
vladmandic
60ac82b191
add basic xpu gpu monitor
...
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 18:55:54 -04:00
vladmandic
0271e0830c
triton split check into early and full
...
Signed-off-by: vladmandic <mandic00@live.com>
2025-10-26 11:48:10 -04:00
Disty0
818b0c0821
Add basic triton test
2025-10-26 10:44:04 +03:00
Vladimir Mandic
4b95d72d45
video tab layout
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 14:07:52 -04:00
Vladimir Mandic
1ebd96fdc6
add kandinsky5-lite t2v
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 12:15:36 -04:00
Disty0
4a70e82b0c
ROCm always use numpy on cholesky
2025-09-28 14:05:20 +03:00
Disty0
a47959b114
move ROCm Windows hijacks outside of torch install
2025-09-28 13:33:15 +03:00
Disty0
6766563510
Add info log for Building CK Flash attention
2025-09-10 19:25:07 +03:00
Disty0
bc7c89c070
add typing to sdpa hijacks
2025-09-10 04:43:57 +03:00
Disty0
c51552af90
Enable triton flash atten option for rocm linux too
2025-09-03 16:26:00 +03:00
Disty0
c42e0e0b37
Cleanup
2025-09-03 16:18:02 +03:00
Disty0
266c9c0d3d
Move Zluda Triton flash atten hijack to Triton Flash attention option
2025-09-03 16:16:41 +03:00
Vladimir Mandic
fa44521ea3
offload-never and offload-always per-module and new highvram profile
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-31 11:40:24 -04:00
Vladimir Mandic
04af23a3bc
refactor pipeline apply/unapply optional components & features
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-26 20:04:07 -04:00
Disty0
ad716b118b
fix enable_gqa with dyn atten
2025-07-26 01:49:17 +03:00
Vladimir Mandic
a5b77b8ee2
remove dead code
...
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-07-05 16:47:25 -04:00
Disty0
c9f49720c5
Cleanup
2025-06-25 23:37:41 +03:00
Disty0
dd84fb541f
Always set sdpa params
2025-06-11 21:43:48 +03:00
Disty0
7679028c1a
Override CPU to use FP32 by default
2025-06-06 15:33:51 +03:00
chrismuzyn
299d189276
When using the openvino backend, do not look for an nvidia gpu.
2025-05-12 19:14:26 -04:00
Disty0
b0e5a6c4df
Add devices.has_triton() and enable NNCF compile if triton is available
2025-05-09 22:24:36 +03:00
Disty0
dfebc909eb
Disable cuDNN benchmark on ROCm and add cudnn_benchmark_limit option
2025-05-08 13:27:06 +03:00
Disty0
90f887ac4a
Add dim checks to ck flash atten and fix dim check on dyn atten
2025-03-25 03:50:21 +03:00
Seunghoon Lee
0c890b50e0
proper zluda detection
2025-03-20 23:03:23 +09:00
Disty0
1e0f512ccb
ROCm disable FP16 for gfx1102
2025-03-19 15:42:36 +03:00
Disty0
878cab085f
Reverse the sdpa hijack order
2025-02-14 19:56:39 +03:00