
Japanese VTT Translator

A Python-based tool for translating Japanese WebVTT subtitle files to English using an Ollama server and the TranslationGemma model.

Features

  • Intelligent Chunking: Automatically chunks VTT files to fit within the translategemma:12b context window (~32k tokens)
  • Accurate Translation: Uses the official TranslationGemma model for professional-quality Japanese-to-English translation
  • Beautiful TUI: Terminal-aware progress display that adapts to your terminal width
  • Quality Assurance: Automatic sanity checks verify that translations contain no Japanese characters and no empty subtitles
  • Preservation: Maintains exact timestamp formatting - critical for video synchronization
  • Retry Logic: Automatically retries failed translations once
  • Complete Reassembly: Combines all translated chunks back into a single, complete VTT file

Requirements

  • Python 3.7+
  • An Ollama server running the translategemma:12b model
  • Network access to the Ollama server (local or remote)

Installation

  1. Clone or download this project
  2. Install Python dependencies:

    pip install -r requirements.txt

Configuration

The tool uses environment variables for configuration:

# Ollama server base URL (default: http://localhost:11434/)
export OLLAMA_BASE_URL="http://localhost:11434/"

# Ollama model name (default: translategemma:12b)
export OLLAMA_MODEL="translategemma:12b"

If these aren't set, the script will use the default values above.
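The fallback behavior above can be sketched as follows. This is a minimal illustration, not the script's exact code; the helper name get_config is hypothetical.

```python
import os

def get_config(env=None):
    """Return (base_url, model), falling back to the documented defaults.

    `env` defaults to os.environ; a dict can be passed in for testing.
    """
    if env is None:
        env = os.environ
    base_url = env.get("OLLAMA_BASE_URL", "http://localhost:11434/")
    model = env.get("OLLAMA_MODEL", "translategemma:12b")
    return base_url, model
```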

Usage

Run the main script:

python3 translate_vtt.py

The script will:

  1. Prompt you to select a Japanese VTT file
  2. Validate the Ollama server connection
  3. Load and analyze the input file
  4. Display file duration and estimated chunk count
  5. Chunk the file respecting token limits
  6. Translate each chunk via Ollama
  7. Verify translations with sanity checks
  8. Reassemble into a final -EN.vtt file

Example Workflow

$ python3 translate_vtt.py

╔════════════════════════════════════════════════════════════╗
║                  Japanese VTT Translator                  ║
╚════════════════════════════════════════════════════════════╝

ℹ Enter the path to your Japanese VTT file:
  > /path/to/episode.vtt

✓ Selected: /path/to/episode.vtt

[2/6] Validate Ollama Connection
ℹ Server URL: http://ai-house:11434/
ℹ Model: translategemma:12b
✓ Connected to Ollama
✓ Model 'translategemma:12b' is available

[3/6] Load and Analyze VTT File
ℹ Loading VTT file...
✓ Loaded 1511 subtitles
📄 File: episode.vtt
⏱  Duration: 118 minutes (1.97 hours)
📦 Chunks: 1 (estimated based on 32k token limit)

[4/6] Chunk VTT File
ℹ Chunking file respecting token limits...
✓ Created 1 chunks
ℹ Average tokens per chunk: 1900

[5/6] Translate Chunks
ℹ Translating 1 chunks via Ollama (this may take several minutes)...
  Chunk   1/1: ⏳ Processing... - 1511 subtitles
  Chunk   1/1: ✓ Translated
✓ All 1 chunks translated successfully

[6/6] Reassemble and Finalize
ℹ Reassembling translated chunks...
✓ Reassembled into single file

╔════════════════════════════════════════════════════════════╗
║                   Translation Complete!                   ║
╚════════════════════════════════════════════════════════════╝

ℹ Output file: /path/to/episode-EN.vtt
✓ Translation pipeline completed successfully!

Output

The translated file is saved with the same name as the input, but with -EN appended before the file extension.

Example:

  • Input: episode.vtt
  • Output: episode-EN.vtt
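The naming rule above amounts to inserting -EN before the extension. A minimal sketch (the function name output_path is illustrative, not necessarily the tool's):

```python
from pathlib import Path

def output_path(input_path):
    """Derive the output filename: episode.vtt -> episode-EN.vtt."""
    p = Path(input_path)
    # stem is the name without extension; suffix is the extension itself
    return str(p.with_name(p.stem + "-EN" + p.suffix))
```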

File Structure

  • translate_vtt.py - Main orchestration script (run this)
  • vtt_utils.py - VTT file parsing and utilities
  • chunker.py - Intelligent chunking logic
  • ollama_client.py - Ollama API communication
  • translator.py - Translation and sanity checking
  • reassembler.py - Chunk reassembly
  • tui.py - Terminal UI components
  • requirements.txt - Python dependencies

Technical Details

Chunking Strategy

The tool estimates tokens conservatively so that no request overflows the model's context window:

  • Max tokens per chunk (input): 15,000
  • Overhead reserved for the prompt and instructions: 300 tokens
  • Capacity left for translated output: ~17,000 tokens
  • Total budget: ~32,000 tokens

These margins keep each request within the context window even when the token estimates are imprecise.
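The chunking strategy can be sketched as a greedy packer that never splits a subtitle across chunks. The token heuristic below is an assumption for illustration, not the tool's exact formula:

```python
def estimate_tokens(text):
    # Conservative heuristic (an assumption, not the tool's exact formula):
    # Japanese often tokenizes near one token per character, so count
    # characters and add a ~10% safety margin.
    return int(len(text) * 1.1) + 1

def chunk_subtitles(subtitles, max_tokens=15_000):
    """Greedily pack whole subtitles into chunks under the token budget.

    Subtitles are never split mid-cue, so each chunk reassembles cleanly.
    """
    chunks, current, used = [], [], 0
    for sub in subtitles:
        cost = estimate_tokens(sub)
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(sub)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```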

Translation Quality

Each translation undergoes sanity checks:

  1. Non-empty verification: All subtitles must contain text
  2. Language verification: No Japanese characters allowed in output
  3. Retry logic: Failed chunks are retried once
  4. Explicit failure: If a chunk fails twice, it is marked as failed and reported
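The first two checks can be sketched as shown below. The Unicode ranges (Hiragana, Katakana, and common CJK ideographs) are an assumed implementation detail; the tool's actual ranges may differ.

```python
import re

# Hiragana + Katakana (U+3040-U+30FF) and common CJK ideographs
# (U+4E00-U+9FFF) -- an assumed check, not necessarily the tool's exact one.
JAPANESE_RE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")

def passes_sanity_checks(subtitles):
    """A chunk passes only if every subtitle is non-empty (after stripping
    whitespace) and contains no Japanese characters."""
    return all(s.strip() and not JAPANESE_RE.search(s) for s in subtitles)
```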

Timestamp Preservation

All timestamps are preserved exactly as they appear in the original file. This is critical for video synchronization.
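One way to guarantee this is to copy cue timing lines through verbatim and only ever replace subtitle text. A sketch of identifying timing lines (the regex is an assumption based on the WebVTT format, where the hours field is optional):

```python
import re

# WebVTT cue timing line, e.g. "00:01:02.500 --> 00:01:05.000".
# Hours are optional in WebVTT, so both HH:MM:SS.mmm and MM:SS.mmm match.
CUE_TIMING_RE = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s-->\s(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)

def is_timing_line(line):
    """True for cue timing lines, which must be copied through unchanged."""
    return CUE_TIMING_RE.match(line) is not None
```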

Troubleshooting

"Cannot connect to Ollama server"

  • Verify Ollama is running: curl http://ai-house:11434/api/tags
  • Check the URL matches your setup
  • Ensure network connectivity to the Ollama host

"Could not verify model availability"

  • Make sure the TranslationGemma model is pulled: ollama pull translategemma:12b
  • Verify the model name is correct

Translation fails or produces Japanese output

  • The model may be overwhelmed - try splitting into smaller files manually
  • Verify the Ollama server has sufficient resources
  • Check the console output for specific error messages

Empty output file

  • All chunks failed translation - check Ollama logs
  • Verify the model is properly loaded and responsive

Limitations

  • Large files (>3 hours) may need to be split manually before processing
  • Translation quality depends on the TranslationGemma model's performance
  • Processing time scales with video duration (typically 30-60 minutes for 2-hour videos)

Performance Notes

  • Translation time depends on video length, system resources, and Ollama server performance
  • Typical speed: roughly 2-4 minutes of video per minute of processing time
  • Chunks are processed sequentially

License

This project is provided as-is for personal use.