mirror of https://github.com/vladmandic/sdnext.git synced 2026-01-27 15:02:48 +03:00

26 Commits

Author SHA1 Message Date
CalamitousFelicitousness
6b10f0df4f refactor(caption): address PR review feedback
Rename WD14 module and settings to WaifuDiffusion:
- Rename wd14.py to waifudiffusion.py
- Rename WD14Tagger class to WaifuDiffusionTagger
- Rename WD14_MODELS constant to WAIFUDIFFUSION_MODELS
- Rename settings: wd14_model -> waifudiffusion_model,
  wd14_character_threshold -> waifudiffusion_character_threshold
- Update all log messages from "WD14" to "WaifuDiffusion"

Code quality improvements:
- Simplify threshold parameter defaulting using `or` operator
- Extract save_output logic into _save_tags_to_file() helper with
  isolated error handling to prevent single file failures from
  impacting entire batch
- Fix timing log format consistency (remove 's' suffix)
2026-01-21 11:56:07 +00:00
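The two code-quality items in 6b10f0df4f (threshold defaulting with `or`, and per-file error isolation when saving tags) might look roughly like the sketch below. Only the helper name `_save_tags_to_file` comes from the commit message; its signature, `resolve_threshold`, `save_batch`, `DEFAULT_THRESHOLD`, and the `.txt` sidecar layout are assumptions made for illustration, not the repository's code.

```python
import logging
import os

log = logging.getLogger("caption")
DEFAULT_THRESHOLD = 0.35  # hypothetical default, not taken from the repository


def resolve_threshold(threshold=None, default=DEFAULT_THRESHOLD):
    # `or` falls back to the default for None, and also for 0 / 0.0
    return threshold or default


def _save_tags_to_file(image_path, tags):
    """Write tags to a .txt file next to the image; return True on success."""
    txt_path = os.path.splitext(image_path)[0] + ".txt"
    try:
        with open(txt_path, "w", encoding="utf-8") as f:
            f.write(tags)
        return True
    except OSError as e:
        # Failure is logged and reported, never raised, so the batch continues
        log.error("save tags failed: file=%s error=%s", txt_path, e)
        return False


def save_batch(results):
    """results maps image path -> tag string; returns number of files written."""
    return sum(_save_tags_to_file(path, tags) for path, tags in results.items())
```

One consequence of the `or` idiom is that an explicit threshold of 0 is treated as "not set" and replaced by the default; whether that matters depends on whether 0 is a meaningful value for the caller.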
CalamitousFelicitousness
becb19319d refactor(caption): unify tagger settings and reorganize Caption Tab UI
Consolidate WD14 and DeepBooru tagger settings into unified options:
- Merge wd14_general_threshold + deepbooru_score_threshold → tagger_threshold
- Merge wd14_include_rating + deepbooru_include_rating → tagger_include_rating
- Rename interrogate_score → tagger_show_scores
- Rename tagger_escape → tagger_escape_brackets
- Rename CLiP → OpenCLiP in caption type choices

UI reorganization:
- Add Interrogate tab to Caption Tab with default caption type selector
- Move interrogate_offload to Model Offloading section as "Offload caption models"
- Hide Interrogate settings section (all settings now in Caption Tab UI)
- Update locale_en.json for OpenCLiP naming

Code improvements:
- DeepBooru tag_multi() now accepts same parameters as WD14 for unified interface
- Fix setting references in interrogate.py for consolidated settings
- Add comprehensive tagger test suite (cli/test-tagger.py)
2026-01-21 11:56:07 +00:00
CalamitousFelicitousness
656e86a962 refactor(caption): consolidate interrogate settings into Caption Tab UI
Hide all CLiP, VLM, and Tagger settings from Settings > Interrogate page
while keeping them in shared.opts for persistence. Caption Tab UI becomes
the single control point with change handlers that save directly to config.

Changes:
- Hide OpenCLiP, VLM, and Tagger settings with visible=False
- Add change handlers to save settings when UI controls change
- Rename "Booru Tags" tab to "Tagger", update choice labels
- Update interrogate.py to use unified tagger interface with all settings
2026-01-21 11:56:07 +00:00
CalamitousFelicitousness
09b8fe9761 feat(caption): integrate DeepBooru into unified Booru Tagger UI
Add DeepBooru as a model option alongside WD14 models in the Booru Tags
tab, with dynamic UI that disables inapplicable controls.

Changes:
- Create modules/interrogate/tagger.py as unified adapter module
- Add batch, load/unload, get_models functions to deepbooru.py
- Update ui_caption.py to use unified tagger interface
- Consolidate shared tagger settings in shared.py
- Add implementation plan for future settings consolidation

UI behavior:
- Model dropdown shows DeepBooru + all WD14 models
- Character threshold and include rating disabled for DeepBooru
- All controls re-enable when WD14 model selected
2026-01-21 11:56:07 +00:00
CalamitousFelicitousness
db97c42320 feat(caption): add WD14 tagger with Booru Tags tab
Add SmilingWolf's WD14/WaifuDiffusion tagger models for anime/illustration
tagging as a new "Booru Tags" tab in the Caption panel.

- Support 9 models (v2 and v3 variants) via HuggingFace
- ONNX backend chosen due to safetensors v3 variants exhibiting
  unacceptable accuracy loss
- Separate thresholds for general/character tags
- Batch processing with progress bar
- Consolidate debug env var to SD_INTERROGATE_DEBUG
2026-01-21 11:56:07 +00:00
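As a rough illustration of "separate thresholds for general/character tags" in db97c42320, a filter over scored tags could be written as below. The function name `select_tags`, the `(tag, category, score)` tuple layout, the category strings, and the default values are all hypothetical; the actual tagger's data structures are not shown in the commit message.

```python
def select_tags(scores, general_threshold=0.35, character_threshold=0.85):
    """Keep tags whose score meets the threshold for their category.

    scores: iterable of (tag, category, score) tuples, where category is
    "general" or "character" (an assumed layout, for illustration only).
    """
    thresholds = {"general": general_threshold, "character": character_threshold}
    kept = [
        (tag, score)
        for tag, category, score in scores
        if category in thresholds and score >= thresholds[category]
    ]
    # Highest-confidence tags first
    kept.sort(key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in kept]
```

A higher character threshold reflects the usual motivation for splitting the two: a wrong character name is typically more harmful in a caption than a wrong general tag.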
awsr
0faabffc14 Simplify options init/save/load 2026-01-10 13:27:38 -08:00
vladmandic
a72b98848c cleanup
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-10 10:17:37 +01:00
CalamitousFelicitousness
d277392103 feat(ui): caption tab label styling and CLIP analysis text output
Add clip_labels_text component for CLIP analysis results and standardize
label capitalization across VLM and CLiP sections for consistency.
2025-12-09 18:54:44 +00:00
CalamitousFelicitousness
5193285bc7 refactor(vqa): convert to class-based singleton
Refactor VQA module from module-level globals to a VQA class singleton
  pattern with self-contained per-model loading methods.

Changes:
- Add VQA class with model/processor state and detection data storage
- Extract load methods for clean model pre-loading via UI
- Interrogate to return string only; store detection data on instance
- Add vqa_draw.py for bounding box/point annotation utilities
  (stub; further transfer of drawing functions to follow)
- Update moondream3.py to store detection data on VQA singleton
- Update endpoints.py and ui_caption.py for new return type
2025-12-05 20:53:18 +00:00
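The refactor in 5193285bc7 (module-level globals replaced by a class-based singleton, with `interrogate` returning a string and detection data kept on the instance) can be sketched as follows. Only the class name `VQA` comes from the commit message; the attribute names, method signatures, `get_instance`, and the placeholder loading logic are assumptions, not the project's implementation.

```python
class VQA:
    """Holds model/processor state plus the most recent detection data."""

    def __init__(self):
        self.model = None
        self.processor = None
        self.loaded = None               # name of the currently loaded model
        self.last_detection_data = None  # filled by interrogate(), read by the UI

    def load(self, name):
        # Placeholder for real model loading; skipped if already loaded
        if self.loaded != name:
            self.model = f"model:{name}"
            self.processor = f"processor:{name}"
            self.loaded = name

    def unload(self):
        self.model = None
        self.processor = None
        self.loaded = None

    def interrogate(self, question, name):
        self.load(name)
        # Detection results are stored on the instance, not returned
        self.last_detection_data = {"boxes": []}
        return f"answer to {question!r}"  # callers receive a string only


_instance = None


def get_instance():
    """Return the shared VQA instance, creating it on first use."""
    global _instance
    if _instance is None:
        _instance = VQA()
    return _instance
```

Keeping the return type a plain string means API endpoints and the UI can share one call, with only the callers that need annotations reading `last_detection_data` afterwards.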
CalamitousFelicitousness
2b6226b62b feat(vqa): persist thinking mode and improve reasoning output formatting
- Add interrogate_vlm_thinking_mode setting to save checkbox state
- Update ui_caption to restore Thinking Mode preference on load
- Add blank line before 'Answer:' label for visual separation
- Remove '\n\n' replacement in clean() that stripped blank lines
- Fix Qwen reasoning detection when <think> tag is in prompt, not response
- Add reasoning icon to Moondream 2 and 3 model names
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness
506515b018 feat(vqa): add load/unload model buttons to Caption tab
- Add load_model() function to pre-load VLM into memory
- Add unload_model() function to free VLM from memory
- Add Load/Unload buttons to Caption tab UI
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness
a90d85ddfd feat(ui): add dynamic task selection based on VLM model
- Rename "Predefined question" to "Task"
- Task dropdown updates choices when model changes
- Prompt placeholder updates based on selected task
- Model-specific tasks: Florence-2 gets detection tasks, Moondream gets point/detect
2025-12-05 00:00:25 +00:00
CalamitousFelicitousness
4df6aa7944 fix(ui): set prefill text to empty by default 2025-12-05 00:00:25 +00:00
CalamitousFelicitousness
0d88fcd396 feat(ui): add prefill and thinking controls to Caption tab
Add minimal UI controls to expose new VQA functionality:
- Prefill Text input for guiding VLM responses
- Thinking Mode checkbox for reasoning models
- Keep Thinking Trace checkbox for output retention
- Keep Prefill checkbox for output retention
- Annotated Image output panel for detection visualization
- Updated button handlers to pass new parameters
2025-12-05 00:00:24 +00:00
CalamitousFelicitousness
78711fb1d4 Merge branch 'dev' into patch-2 2025-10-01 20:58:58 +01:00
CalamitousFelicitousness
78820a14dc Allow VLM temp setting temperature to 0
2025-10-01 20:52:04 +01:00
Vladimir Mandic
cd79f92dff add opts models_not_to_offload
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-19 11:21:54 -04:00
Vladimir Mandic
05dd0096c9 set default vqa model
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-04 08:38:29 -04:00
Vladimir Mandic
b2dbef53e5 restyled all toolbuttons to be modernui native
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-08-31 15:01:50 -04:00
Vladimir Mandic
8473bae0fc 1000 papercuts
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-05-13 21:51:33 -04:00
Vladimir Mandic
9bf6838962 update video tab
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-20 14:39:38 -04:00
Vladimir Mandic
dbfd59434f add gemma3
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-03-15 15:30:57 -04:00
Vladimir Mandic
b6990151c4 caption tab modernui support
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-17 10:59:22 -05:00
Vladimir Mandic
a4b3dc269e modernize clip interrogate
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-16 19:37:09 -05:00
Vladimir Mandic
f3dd9b9646 vlm advanced settings and batch processing
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-15 14:34:28 -05:00
Vladimir Mandic
e95bd93f67 caption ui redesign
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-02-15 12:57:19 -05:00