A Python script that uses a local [Ollama](https://ollama.com/) multimodal model to generate captions for your images. It features a rich, interactive terminal user interface (TUI) for easy operation, configuration, and live progress tracking. It is mainly a tool for preparing image datasets for training with FLUX. It produces natural-language captions rather than keyword tags because, unlike Stable Diffusion, FLUX relies on natural language processing rather than keyword processing.
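To make the core idea concrete, here is a minimal sketch of how one image could be captioned through Ollama's local REST API (`/api/generate` on the default port 11434). It is not this script's actual code: the helper names `build_payload` and `caption_image`, the `llava` model name, and the prompt text are all assumptions chosen for illustration, and the TUI and progress tracking are left out.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt; a natural-language sentence suits FLUX-style captions.
DEFAULT_PROMPT = "Describe this image in one natural-language sentence."


def build_payload(image_bytes: bytes, model: str = "llava",
                  prompt: str = DEFAULT_PROMPT) -> dict:
    """Build the JSON body for a non-streaming /api/generate request."""
    return {
        "model": model,  # assumed: any multimodal model you have pulled
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply
    }


def caption_image(path: Path, model: str = "llava") -> str:
    """Send one image to the local Ollama server and return its caption."""
    payload = build_payload(path.read_bytes(), model=model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        # The generated text is returned in the "response" field.
        return json.loads(response.read())["response"].strip()
```

With an Ollama server running and a multimodal model pulled, calling `caption_image(Path("photo.jpg"))` would return the caption as a string. Keeping `build_payload` separate from the network call makes the request format easy to inspect without a server.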