Mirror of https://github.com/vladmandic/sdnext.git, synced 2026-01-29 05:02:09 +03:00

Commit Graph

  • 3f3a986c0e SDNQ fix scale staying in fp32 with tensorwise fp8 matmul Disty0 2025-06-02 23:18:10 +03:00
  • 3ca9f29c8f Merge pull request #3957 from vladmandic/dev Vladimir Mandic 2025-06-02 15:52:22 +02:00
  • 7d05bed459 update changelog Vladimir Mandic 2025-06-02 15:50:42 +02:00
  • f147bca20f Merge branch 'master' into dev Vladimir Mandic 2025-06-02 15:45:38 +02:00
  • b1d6897621 update changelog Vladimir Mandic 2025-06-02 15:44:52 +02:00
  • 4e3795a0a5 SDNQ fix packed int8 matmul Disty0 2025-06-02 03:31:51 +03:00
  • 82f5634d53 SDNQ use torch.bool for uint1 Disty0 2025-06-02 01:39:51 +03:00
  • ea7fe3bb73 OpenVINO update dtype_mapping Disty0 2025-06-02 01:11:41 +03:00
  • e8588c91ea SDNQ enable matmul support for float8_e5m2 Disty0 2025-06-02 00:53:10 +03:00
  • 8f1a1d7311 SDNQ expand quantized_matmul_dtypes for CPU Disty0 2025-06-02 00:28:29 +03:00
  • b146025a5e SDNQ add int2 Disty0 2025-06-02 00:17:39 +03:00
  • 766aec32d5 Update changelog Disty0 2025-06-01 23:35:00 +03:00
  • 9669b36010 SDNQ fix older PyTorch with FP8 matmul Disty0 2025-06-01 23:29:16 +03:00
  • acefa58834 SDNQ don't force fp32 with fp8 tensorwise matmul Disty0 2025-06-01 23:16:00 +03:00
  • 839295f79a Add fp8 fnuz to sdnq options Disty0 2025-06-01 23:10:08 +03:00
  • c77162fb82 update wiki and changelog Vladimir Mandic 2025-06-01 21:31:43 +02:00
  • 539fae3234 Update naming Disty0 2025-06-01 21:01:56 +03:00
  • cefe460052 SDNQ skip FP8 matmul for input len < 32 Disty0 2025-05-31 01:27:59 +03:00
  • 046840c8be Fix HiDream sampling Disty0 2025-05-31 00:52:56 +03:00
  • 109c0d7e49 SDNQ use tensorwise FP8 matmul on CPU Disty0 2025-05-30 21:09:53 +03:00
  • 959b759721 Cleanup Disty0 2025-05-30 16:45:59 +03:00
  • b5d588fa45 SDNQ remove unnecessary bitwise ands Disty0 2025-05-30 16:29:59 +03:00
  • db816d7088 Cleanup Disty0 2025-05-30 16:02:26 +03:00
  • c85cc6b397 SDNQ enable quant with GPU by default and don't do unnecessary clones Disty0 2025-05-30 15:21:29 +03:00
  • 4654acde3c SDNQ re-enable memory fix for diffusers Disty0 2025-05-30 14:59:45 +03:00
  • 87a801e24d SDNQ remove memory fix hijack Disty0 2025-05-30 13:54:49 +03:00
  • f81cb22c00 SDNQ fix new transformers Disty0 2025-05-30 13:32:03 +03:00
  • 36febda6e6 SDNQ update supported dtypes Disty0 2025-05-30 13:07:23 +03:00
  • 29bd2af779 SDNQ add 6-bit support Disty0 2025-05-30 12:20:13 +03:00
  • 98a11fc86c fix gallery duplicate entries Vladimir Mandic 2025-05-30 11:04:56 +02:00
  • 9168a66fd2 update requirements Vladimir Mandic 2025-05-30 08:53:39 +02:00
  • 4e184f41af update changelog Vladimir Mandic 2025-05-30 08:44:30 +02:00
  • d1491962d9 One bit Disty0 2025-05-30 05:41:02 +03:00
  • 3c8be0f55f SDNQ add uint2 Disty0 2025-05-30 04:47:29 +03:00
  • 599224d392 SDNQ reduce 5 reshape ops to 2 with quantized input Disty0 2025-05-30 01:31:41 +03:00
  • d8dea9031f SDNQ do FP8 matmul shape check only once Disty0 2025-05-30 01:13:37 +03:00
  • b4e615e760 SDNQ add FP8 row wise scaling workaround for SM89 on Windows Disty0 2025-05-30 00:16:54 +03:00
  • 54154cf698 Cleanup Disty0 2025-05-29 20:22:49 +03:00
  • 90324f9c8c SDNQ fix lora with quant matmul Disty0 2025-05-29 18:25:12 +03:00
  • df8b31fcfc Don't downcast scale with fp8 matmul Disty0 2025-05-29 16:35:40 +03:00
  • 2351efb8f7 Remove redundant shape check Disty0 2025-05-29 14:58:00 +03:00
  • 14893b7617 Don't make the weights contiguous with int8 matmul Disty0 2025-05-29 03:43:57 +03:00
  • cf2d1e56a6 Update changelog Disty0 2025-05-29 03:32:11 +03:00
  • 2cc5a58b0f Update changelog Disty0 2025-05-29 03:26:47 +03:00
  • 67e0f4d833 Cleanup Disty0 2025-05-29 03:22:40 +03:00
  • 3698f8bb84 SDNQ add experimental FP8 matmul Disty0 2025-05-29 03:11:59 +03:00
  • dd33c4d583 Fix scale and zero_point not being moved by tensor.to Disty0 2025-05-28 17:46:06 +03:00
  • dd0dbc476f SDNQ fix asym quant formula for dtypes with non zero minimums Disty0 2025-05-28 17:25:38 +03:00
  • e06cbea7aa Cleanup Disty0 2025-05-28 15:55:08 +03:00
  • d8e8f47ce5 SDNQ add an option to toggle quantize with GPU Disty0 2025-05-28 15:18:39 +03:00
  • 1961e88c13 Set SDPA as the default on all backends and enable Dyn SDPA on ROCm, DML, CPU and MPS Disty0 2025-05-28 13:42:29 +03:00
  • 569e9099d7 Use torch.amax instead of torch.max Disty0 2025-05-28 12:44:07 +03:00
  • 0b564e2373 Cleanup Disty0 2025-05-28 04:07:45 +03:00
  • 1433dfe3de SDNQ fix high RAM usage with pre mode Disty0 2025-05-28 03:16:29 +03:00
  • 4ed15f5cce SDNQ revert device_map = gpu Disty0 2025-05-27 23:32:58 +03:00
  • d3e3fb98b0 Don't override user set device_map Disty0 2025-05-27 21:45:52 +03:00
  • b1b29e9001 SDNQ disable device_map = gpu with TE and LLM Disty0 2025-05-27 21:32:32 +03:00
  • b724cd7c57 Update changelog Disty0 2025-05-27 21:21:42 +03:00
  • 5d3c1832b2 SDNQ add FP8 quants Disty0 2025-05-27 20:29:15 +03:00
  • 3618e39cff SDNQ use device_map = gpu Disty0 2025-05-27 19:46:30 +03:00
  • 73999ac710 Add soft gc to nncf quant layer Disty0 2025-05-27 16:24:04 +03:00
  • e94128a02e SDNQ add force torch_gc to pre load mode Disty0 2025-05-27 16:11:04 +03:00
  • dece497f10 Refactor SDNQ to use weights_dtype and rename decompress_int8_matmul to use_quantized_matmul Disty0 2025-05-27 15:49:21 +03:00
  • 79bb348927 SDNQ sort quant schemes by recommended order Disty0 2025-05-27 13:06:17 +03:00
  • dec460e665 SDNQ use torch.bitwise ops instead of python Disty0 2025-05-27 03:02:36 +03:00
  • 280be31883 SDNQ fix Lora change Disty0 2025-05-27 00:08:32 +03:00
  • 4d9c2a8608 Cleanup Disty0 2025-05-26 22:41:12 +03:00
  • 84ddfb2868 SDNQ fix lora apply Disty0 2025-05-26 22:39:20 +03:00
  • 6dee9f5ac7 Fix HiDream teacache not resetting Disty0 2025-05-26 21:21:01 +03:00
  • 742cd61d1f Add TeaCache for HiDream Disty0 2025-05-26 19:59:43 +03:00
  • 687c50dcc8 SDNQ fix Lora Disty0 2025-05-26 19:48:45 +03:00
  • ccf9deaf28 Move SDNQ to the top of the settings list Disty0 2025-05-26 18:30:50 +03:00
  • 02f15b28cc Cleanup Disty0 2025-05-26 15:57:17 +03:00
  • 91bb07f650 SDNQ remove unused args and simplify decompressors Disty0 2025-05-26 15:51:53 +03:00
  • d2159af10e cleanup Disty0 2025-05-26 04:24:28 +03:00
  • 4ad404182d cleanup Disty0 2025-05-26 04:17:22 +03:00
  • 3f8ae754a0 Update readme Disty0 2025-05-26 03:35:25 +03:00
  • 46e9a9a631 IPEX disable Dynamic Attention by default on PyTorch 2.7 Disty0 2025-05-26 03:03:17 +03:00
  • 5fcd0be79c Update changelog Disty0 2025-05-26 02:50:05 +03:00
  • e314a7ca19 Update changelog Disty0 2025-05-26 02:49:12 +03:00
  • 17df7ba83b Cleanup whitespace Disty0 2025-05-26 02:41:29 +03:00
  • 4453efee76 Rename NNCF to SDNQ and rename quant schemes Disty0 2025-05-26 02:39:51 +03:00
  • 9c2e15433e NNCF set required_packages to None Disty0 2025-05-26 01:39:09 +03:00
  • cbc1bfe710 Cleanup Disty0 2025-05-26 01:24:06 +03:00
  • 2d79380bd7 NNCF implement better layer hijacks and remove all NNCF imports Disty0 2025-05-26 01:12:28 +03:00
  • af3a44ccbe optional skimage Vladimir Mandic 2025-05-24 08:58:04 +02:00
  • bfc5c7c457 installer version check Vladimir Mandic 2025-05-24 08:42:13 +02:00
  • 85f00f9edb Enable dyn atten by default for ROCm Disty0 2025-05-23 18:24:47 +03:00
  • 50e3a134ca Merge pull request #3941 from hypercryptoman/patch-1 Vladimir Mandic 2025-05-23 10:31:01 +02:00
  • abc081d242 Merge branch 'dev' into patch-1 Vladimir Mandic 2025-05-23 10:30:33 +02:00
  • ac05b96838 Update prompt_enhance.py hypercryptoman 2025-05-19 13:55:16 +10:00
  • 05fced7395 Update prompt_enhance.py hypercryptoman 2025-05-19 13:53:24 +10:00
  • ba2eaaf295 Fix: Correct model_file parameter usage for custom load button hypercryptoman 2025-05-19 13:45:05 +10:00
  • 2b824daf64 Revert MIOPEN_FIND_ENFORCE Disty0 2025-05-18 21:28:44 +03:00
  • 7f2d77e956 ROCm set MIOPEN_FIND_ENFORCE to SEARCH Disty0 2025-05-18 16:36:27 +03:00
  • b23162a36b Fix: Correct arguments for prompt_enhance.py apply method hypercryptoman 2025-05-18 23:35:55 +10:00
  • 3d8390de9b IPEX return devices.dtype instead of bf16 Disty0 2025-05-18 04:44:51 +03:00
  • d0e6f01286 IPEX remove GradScaler and use torch.amp instead Disty0 2025-05-18 04:39:48 +03:00
  • a009e17d2b NNCF use per token input quantization with int8 matmul Disty0 2025-05-17 19:46:47 +03:00
  • 12ebadccd4 Merge pull request #3940 from vladmandic/dev 2025-05-16 Vladimir Mandic 2025-05-16 10:40:17 -04:00