- Initialize KV caches before moving model to device
- Disable flex_attention decoding to avoid torch.compile hang
- Remove unused compile step (controlled by cuda_compile setting)
flex_attention's create_block_mask triggers a torch.compile pass,
which can hang the system when called during model preload; the
preload ordering is sketched below.
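A minimal sketch of that ordering, assuming hypothetical helper and flag names
(the real handler may spell these differently):

```python
import torch

def preload(model, device: torch.device):
    # Build KV caches while the model is still on its original device so that
    # flex_attention's create_block_mask (and its torch.compile pass) never
    # runs mid-transfer.
    if hasattr(model, "_setup_caches"):            # hypothetical cache-init helper
        model._setup_caches()
    # Assumed config flag: fall back to the eager decode path instead of flex_attention.
    if hasattr(model.config, "use_flex_decoding"):
        model.config.use_flex_decoding = False
    model.to(device)                               # device move happens last
    return model
```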
Make external VQA handlers (moondream3, joytag, joycaption, deepseek)
compatible with VQA load/unload mechanism for consistent model lifecycle.
- Add vqa_detection.py with shared detection helpers
- Add load and unload functions to all external handlers
- Replace device_map="auto" with sd_models.move_model in joycaption
- Update dispatcher and moondream handlers to use shared helpers
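A rough sketch of the load/unload contract the external handlers now follow;
the repo id, dtype, and helper names here are assumptions, with model.to()
standing in for sd_models.move_model:

```python
import torch
import transformers

model = None
processor = None

def load(repo_id: str, device: torch.device, dtype: torch.dtype = torch.float16):
    # Load once, keep module-level references, and move the model explicitly
    # instead of relying on device_map="auto".
    global model, processor
    if model is None:
        processor = transformers.AutoProcessor.from_pretrained(repo_id)
        model = transformers.AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=dtype)
        model.to(device)
    return model, processor

def unload():
    # Drop references and let torch reclaim the memory.
    global model, processor
    model = None
    processor = None
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```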
Refactor VQA module from module-level globals to a VQA class singleton
pattern with self-contained per-model loading methods.
Changes:
- Add VQA class with model/processor state and detection data storage
- Extract load methods for clean model pre-loading via UI
- Change interrogate() to return a string only; store detection data on the instance
- Add vqa_draw.py for bounding box/point annotation utilities
  (stub; further transfer of drawing functions to follow)
- Update moondream3.py to store detection data on VQA singleton
- Update endpoints.py and ui_caption.py for new return type
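The refactor roughly amounts to the shape below; attribute and method names
are illustrative rather than the exact repo API:

```python
class VQA:
    def __init__(self):
        self.model = None
        self.processor = None
        self.detection = None   # boxes/points from the last interrogate call

    def load_moondream3(self):
        # One self-contained loader per model family, callable from the UI.
        raise NotImplementedError

    def _run(self, image, question):
        # Model-specific generation lives in the handlers; stubbed here.
        raise NotImplementedError

    def interrogate(self, image, question: str) -> str:
        answer, self.detection = self._run(image, question)
        return answer           # string only; UI and endpoints read vqa.detection separately

vqa = VQA()                     # module-level singleton
```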
Replace the silent fallback to "Describe the image" with an explicit error
when the user selects "Use Prompt" but leaves the prompt field empty.
This follows the same pattern as the missing-image validation; the check
itself is sketched below.
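Something along these lines, with the exact message and exception type assumed:

```python
if task == "Use Prompt" and not (prompt or "").strip():
    raise ValueError("VQA: prompt is empty")  # no silent default prompt
```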
- Add interrogate_vlm_thinking_mode setting to save checkbox state
- Update ui_caption to restore Thinking Mode preference on load
- Add blank line before 'Answer:' label for visual separation
- Remove '\n\n' replacement in clean() that stripped blank lines
- Fix Qwen reasoning detection when <think> tag is in prompt, not response
- Add reasoning icon to Moondream 2 and 3 model names
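A hedged sketch of the reasoning split after these fixes: the tag is detected
in the model response rather than in the prompt, which may itself contain <think>:

```python
def split_reasoning(response: str):
    # Returns (reasoning, answer); reasoning is None when no <think> block was emitted.
    if "</think>" in response:
        reasoning, _, answer = response.partition("</think>")
        return reasoning.replace("<think>", "").strip(), answer.strip()
    return None, response.strip()
```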
- Add thinking_mode/reasoning parameter to enable reasoning mode
- Add Detect Gaze task with placeholder hint
- Parse point/detect results to return annotation data for visualization
- Handle keep_thinking setting: format as "Reasoning:\n...\nAnswer:\n..." or discard
- Add comprehensive debug logging throughout handler
- Add load_model() function to pre-load VLM into memory
- Add unload_model() function to free VLM from memory
- Add Load/Unload buttons to Caption tab UI
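Using a splitter like the one sketched earlier, the keep_thinking handling
comes down to the following (only the output layout is taken from the bullets
above; the function boundary is an assumption):

```python
def format_answer(raw_output: str, keep_thinking: bool) -> str:
    reasoning, answer = split_reasoning(raw_output)
    if reasoning and keep_thinking:
        # Layout per the bullets above, including the blank line before 'Answer:'.
        return f"Reasoning:\n{reasoning}\n\nAnswer:\n{answer}"
    return answer  # discard the reasoning block
```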
Comprehensive overhaul of the VQA interrogation system including:
- Prefill text support for guiding VLM responses
- Thinking mode support with tag cleanup/retention
- Dynamic prompt/task selection based on model type
- Bounding box visualization for detection results
- Debug infrastructure (SD_VQA_DEBUG env var)
- New model support: MiMo-VL, Nidum Gemma, Allura Gemma
- Model-specific prompt lists (Florence, Moondream)
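For the debug infrastructure, the sketch below uses a plain env-gated logger;
the project's own logger object and call style may differ:

```python
import logging
import os

log = logging.getLogger("vqa")
debug_enabled = os.environ.get("SD_VQA_DEBUG") is not None

def debug(msg: str):
    if debug_enabled:
        log.debug(msg)

debug("VQA interrogate: prefill and thinking mode applied")
```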
Add support for Moondream 3 Preview VLM with:
- Text query, caption, point, and detect capabilities
- Bounding box visualization for object detection
- Max pixels setting for resolution control
- Device offloading support
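Task dispatch in the handler likely follows the published Moondream remote-code
API (query/caption/point/detect); the signatures below match the Moondream 2
documentation and are assumed to carry over to the 3 Preview checkpoint:

```python
def run_task(model, image, task: str, prompt: str = ""):
    if task == "caption":
        return model.caption(image, length="normal")["caption"]
    if task == "query":
        return model.query(image, prompt)["answer"]
    if task == "detect":
        return model.detect(image, prompt)["objects"]  # bounding boxes for visualization
    if task == "point":
        return model.point(image, prompt)["points"]    # x/y points for visualization
    raise ValueError(f"unknown task: {task}")
```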
Improvements to the OpenCLIP interrogation:
- Sort all ranking dicts by similarity score (descending)
- Add format_category() helper for text formatting
- Add formatted text output for CLIP labels textbox
- Return additional text update in analyze_image()
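A minimal version of the sorting and formatting helpers (format_category is
named above; its signature here is an assumption):

```python
def sort_ranks(ranks: dict[str, float]) -> dict[str, float]:
    # Highest similarity first.
    return dict(sorted(ranks.items(), key=lambda kv: kv[1], reverse=True))

def format_category(name: str, ranks: dict[str, float], top: int = 5) -> str:
    ordered = list(sort_ranks(ranks).items())[:top]
    return f"{name}: " + ", ".join(f"{label} ({score:.2f})" for label, score in ordered)
```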
Two fixes for the JoyCaption handler:
- Only offload model if shared.opts.interrogate_offload is True
- Add max_pixels=1024*1024 to AutoProcessor for consistent image handling
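Both fixes are small; an illustrative form, with the max_pixels kwarg taken
from the bullet above and the offload guard reduced to a plain boolean:

```python
import torch
import transformers

def load_processor(repo_id: str):
    # Cap preprocessing resolution for consistent image handling.
    return transformers.AutoProcessor.from_pretrained(repo_id, max_pixels=1024 * 1024)

def maybe_offload(model, offload_enabled: bool):
    # Only move the model off the GPU when interrogate_offload is enabled.
    if offload_enabled:
        model.to(torch.device("cpu"))
    return model
```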