1
0
mirror of https://github.com/vladmandic/sdnext.git synced 2026-01-27 15:02:48 +03:00

5 Commits

Author SHA1 Message Date
vladmandic
3f161b5532 lint moondream
Signed-off-by: vladmandic <mandic00@live.com>
2025-12-08 18:16:00 +01:00
CalamitousFelicitousness
a51e1501d6 fix(vqa): no moondream3 compile during explicit load
- Initialize KV caches before moving model to device
- Disable flex_attention decoding to avoid torch.compile hang
- Remove unused compile step (controlled by cuda_compile setting)

The flex_attention's create_block_mask triggers torch compilation
which can hang the system when called during model preload.
2025-12-06 02:26:34 +00:00
CalamitousFelicitousness
7714f71994 feat(vqa): un/load support and extract detection
Make external VQA handlers (moondream3, joytag, joycaption, deepseek)
compatible with VQA load/unload mechanism for consistent model lifecycle.

- Added vqa_detection.py, add shared detection helpers
- Add load and unload functions to all external handlers
- Replace device_map="auto" with sd_models.move_model in joycaption
- Update dispatcher and moondream handlers to use shared helpers
2025-12-05 23:52:02 +00:00
CalamitousFelicitousness
5193285bc7 refactor(vqa): convert to class-based singleton
Refactor VQA module from module-level globals to a VQA class singleton
  pattern with self-contained per-model loading methods.

Changes:
- Add VQA class with model/processor state and detection data storage
- Extract load methods for clean model pre-loading via UI
- Interrogate to return string only; store detection data on instance
- Add vqa_draw.py for bounding box/point annotation utilities
    Stub, further transfer of drawing functions to follow
- Update moondream3.py to store detection data on VQA singleton
- Update endpoints.py and ui_caption.py for new return type
2025-12-05 20:53:18 +00:00
CalamitousFelicitousness
0a322c0faf feat(vqa): add Moondream 3 Preview handler
Add support for Moondream 3 Preview VLM with:
- Text query, caption, point, and detect capabilities
- Bounding box visualization for object detection
- Max pixels setting for resolution control
- Device offloading support
2025-12-05 00:00:24 +00:00