A production-ready Python script for translating Android string resources (strings.xml and arrays.xml) using the Google Gemini API.
- Format Preservation: Ensures comments, spacing (blank lines), and structure match the source file exactly.
- Placeholder & Markup Safety: Freezes placeholders (e.g., %s, %1$d) and markup tags (e.g., <b>, <xliff:g>) before translating to guarantee they are preserved and kept in the correct order.
- Source Attribute Propagation: Copies attributes like formatted, product, and tools:* to the translated strings.
- Robust Error Handling: Includes batch translation with individual string fallback on failure, and automatic retry mechanisms for rate limits (429) or model overloads (503).
- Change Detection: Tracks source strings through a simple hash-based snapshot mechanism (.translation_snapshots/). Only new strings and strings whose source text was modified are re-translated, saving time and tokens.
- Advanced Resource Support: Translates single <string>, ordered <string-array>, and <plurals> resources out of the box.
- Character Compatibility: Manages HTML entity conversions and robust Android special character escaping.
- AAPT2 Compatibility: Implements proper xliff namespace handling to prevent build errors.
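The placeholder and markup freezing described in the list above can be sketched roughly as follows. This is a minimal illustration, not the script's actual implementation: the `freeze` and `thaw` names, the `__PHn__` marker format, and the regular expression are all assumptions, and the check only verifies that every marker survives exactly once (it does not enforce ordering).

```python
import re

# Hypothetical pattern: printf-style placeholders (%s, %1$d), a literal %%,
# and XML-like tags such as <b> or <xliff:g id="x">.
_TOKEN_RE = re.compile(r"%(?:\d+\$)?[a-zA-Z]|%%|</?[A-Za-z][\w:.-]*[^<>]*>")
_MARKER_RE = re.compile(r"__PH(\d+)__")


def freeze(text):
    """Swap each placeholder or tag for an opaque marker.

    Returns the frozen text and the list of original tokens.
    """
    tokens = []

    def _swap(match):
        tokens.append(match.group(0))
        return f"__PH{len(tokens) - 1}__"

    return _TOKEN_RE.sub(_swap, text), tokens


def thaw(text, tokens):
    """Restore the original tokens after translation.

    Raises ValueError if a marker was dropped or duplicated by the model.
    """
    found = sorted(int(i) for i in _MARKER_RE.findall(text))
    if found != list(range(len(tokens))):
        raise ValueError("placeholder markers were lost or duplicated")
    return _MARKER_RE.sub(lambda m: tokens[int(m.group(1))], text)
```

Only the frozen text would be sent for translation; if `thaw` raises, a caller could fall back to translating that string individually or keep the source text.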
- Python 3.8+
- Required packages:
pip install google-genai lxml
- A Google Gemini API Key
Set your Google Gemini API key as an environment variable:
export GEMINI_API_KEY=your_api_key_here

(You can customize the environment variable name via the --api-key-env flag.)
Run the script in apply mode to fetch missing strings and write translated files directly to their respective values-{locale} folders.
# Basic usage
python translate.py --mode apply --locales es,de,fr
# Using a specific model and fine-tuning batch parameters
python translate.py \
--mode apply \
--repo-root . \
--locales ar \
--model gemma-3-27b-it \
--batch-size 15 \
--request-delay 4.0

Run the script in check mode inside CI/CD workflows to simply verify whether all strings are translated without making any actual API calls or file modifications.
python translate.py --mode check --locales es,de,fr

In check mode, the script exits with code 2 if translations are missing.
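As a rough illustration of what a check of this kind involves, the sketch below compares resource names between a source file and a localized file and returns the exit code 2 described above when something is missing. The function names and the message format are hypothetical, and the real script may apply additional rules that are not shown here.

```python
import sys
import xml.etree.ElementTree as ET

# Resource types the README lists as supported.
TRANSLATABLE_TAGS = ("string", "string-array", "plurals")


def resource_names(xml_text):
    """Collect the name attribute of every supported resource element."""
    root = ET.fromstring(xml_text)
    return {el.get("name") for el in root if el.tag in TRANSLATABLE_TAGS}


def check(source_xml, localized_xml):
    """Return 2 if any source resource lacks a translation, else 0."""
    missing = resource_names(source_xml) - resource_names(localized_xml)
    for name in sorted(missing):
        print(f"missing translation: {name}", file=sys.stderr)
    return 2 if missing else 0
```

A CI job can then fail the build on any non-zero exit code without needing an API key.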
- --mode (Required): apply (to translate and write XML) or check (to only check for missing keys).
- --locales: A comma-separated list of target Android language/region codes (e.g. es,fr,de,ar). Default is es,de.
- --repo-root: The path to the root of the Android project (where to search for src/*/res/values/strings.xml or Compose Multiplatform equivalent). Default is . (the current directory).
- --model: The Gemini API model to use. Default is gemini-2.0-flash.
- --batch-size: Number of strings to send in a single Gemini API request. Default is 20 (capped at 15 for Gemma models).
- --request-delay: Delay in seconds between API requests to prevent immediate rate-limiting. Default is 2.0 (forced to 4.0 for Gemma models).
- --api-key-env: Name of the environment variable used to retrieve the API key. Default is GEMINI_API_KEY.
- --no-validate: Disable automatic malformed XML checks after writing translations.
- --verbose / -v: Enable debug-level logging.
When strings are translated successfully, the script saves a JSON file of source-text hashes in .translation_snapshots/ within the source module. Subsequent runs compare the current source text against these hashes, so translate.py re-translates previously translated strings whenever you change the original English wording.
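The snapshot comparison can be pictured with a sketch like the one below. The SHA-256 choice, the JSON layout (resource name mapped to hash), and every function name are assumptions made for illustration; the script's actual snapshot format may differ.

```python
import hashlib
import json
from pathlib import Path


def text_hash(text):
    """Hash the source text so changes can be detected without storing it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def load_snapshot(path):
    """Read a {name: hash} mapping, or return an empty one on first run."""
    path = Path(path)
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}


def save_snapshot(path, source):
    """Write the hashes of the current source strings."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    hashes = {name: text_hash(text) for name, text in source.items()}
    path.write_text(json.dumps(hashes, indent=2, sort_keys=True), encoding="utf-8")


def strings_to_translate(source, snapshot, translated_names):
    """Return names that are untranslated or whose source text changed."""
    pending = []
    for name, text in source.items():
        is_new = name not in translated_names
        changed = name in snapshot and snapshot[name] != text_hash(text)
        if is_new or changed:
            pending.append(name)
    return pending
```

In this sketch a string that is already translated but absent from the snapshot is left alone, which avoids re-translating everything the first time snapshots are introduced.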
In apply mode, if a developer deletes a string or an array item from the English source, the script detects the orphaned translation and removes it from all localized strings files, so unused strings do not accumulate.
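A simplified view of orphan removal for whole resources is sketched below. It uses the standard library's ElementTree for brevity, which drops comments and does not preserve spacing, so it is only a model of the idea; the `strip_orphans` name is hypothetical, and removal of individual array items is not covered.

```python
import xml.etree.ElementTree as ET


def strip_orphans(source_xml, localized_xml):
    """Remove localized resources whose name no longer exists in the source.

    Returns the cleaned localized XML and the list of removed names.
    """
    source_names = {el.get("name") for el in ET.fromstring(source_xml)}
    localized_root = ET.fromstring(localized_xml)
    removed = []
    # Iterate over a copy because elements are removed during the loop.
    for el in list(localized_root):
        if el.get("name") not in source_names:
            localized_root.remove(el)
            removed.append(el.get("name"))
    return ET.tostring(localized_root, encoding="unicode"), removed
```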
If the Google Gemini backend responds with 429 (rate limited) or 503 (service unavailable), translate.py automatically backs off and retries according to --max-retries and the wait times embedded in the API responses.
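The retry behavior can be approximated by a loop like the following. `ApiError`, `call_with_retries`, and the exponential fallback delay are illustrative assumptions rather than the script's or the Gemini SDK's real types; the sketch only shows a server-suggested wait taking precedence over a computed one.

```python
import time

RETRYABLE_STATUS = {429, 503}


class ApiError(Exception):
    """Hypothetical error carrying an HTTP status and an optional suggested wait."""

    def __init__(self, status, retry_after=None):
        super().__init__(f"API error {status}")
        self.status = status
        self.retry_after = retry_after


def call_with_retries(request, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call request(), retrying on 429/503 up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ApiError as err:
            if err.status not in RETRYABLE_STATUS or attempt == max_retries:
                raise
            # Prefer the wait suggested by the server; otherwise back off exponentially.
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.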