|
|
-A Python script that uses a local [Ollama](https://ollama.com/) multimodal model to generate captions for your images. It features a rich, interactive terminal user interface (TUI) for easy operation, configuration, and live progress tracking. It is mainly a tool for preparing image datasets for training with FLUX. The output is full-sentence captions rather than keyword tags because, unlike Stable Diffusion, FLUX relies on natural language processing over keyword processing.
|
|
|
+A Python script that uses a local [Ollama](https://ollama.com/) multimodal model to generate captions for your images in bulk. You can use the prompt to guide the vision model, for example to include certain keywords or to refer to a specific person by name. It features a rich, interactive terminal user interface (TUI) for easy operation, configuration, and live progress tracking. It is mostly a helper tool for preparing image datasets for training with FLUX. The output is full-sentence captions rather than keyword tags because, unlike Stable Diffusion, FLUX relies on natural language processing over keyword processing.
|
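
To make the captioning flow in the description concrete, here is a minimal sketch of sending one image and a guiding prompt to a local Ollama server through its `/api/generate` endpoint. It is not this script's actual implementation: the helper names (`build_payload`, `caption_image`, `caption_folder`), the `llava` model name, the extension list, and the choice to write each caption to a `.txt` file next to the image are all assumptions made for illustration.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Assumed set of image extensions; adjust to your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def build_payload(image_bytes: bytes, prompt: str, model: str = "llava") -> dict:
    """Build the JSON body for /api/generate (hypothetical helper).

    Ollama expects images as base64-encoded strings in the "images" list.
    """
    return {
        "model": model,  # assumed model name; use any vision model you have pulled
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON response instead of chunks
    }


def caption_image(path: Path, prompt: str, model: str = "llava") -> str:
    """Send one image to the local server and return the generated caption."""
    payload = build_payload(path.read_bytes(), prompt, model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()


def caption_folder(folder: Path, prompt: str, model: str = "llava") -> None:
    """Caption every image in a folder, writing a .txt file beside each one."""
    for image in sorted(folder.iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            caption = caption_image(image, prompt, model)
            image.with_suffix(".txt").write_text(caption, encoding="utf-8")
```

With Ollama running and a vision model pulled, a call such as `caption_folder(Path("dataset"), "Describe this photo in one sentence. The woman in it is named Anna.")` shows how the prompt can steer the model toward a name or keyword, as described above.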