Releases: jegly/Box
### Box v1.0.6
New: On-Device Image Generation
First stable release of on-device image generation. Previous builds had crashes and instability in the Stable Diffusion
pipeline — v1.0.6 resolves these and is the first version we'd consider ready for daily use.
- Image generation time dropped from ~27 minutes to under 4 minutes. Earlier builds needed 20 steps with base SD 1.5; LCM-SSD-1B (recommended) produces quality results in just 4 steps thanks to consistency distillation, combined with the CPU optimisations below.
- Added Image Gen — generate images from text prompts fully on-device using Stable Diffusion
- 6 models available to download, powered by stable-diffusion.cpp:
  - LCM-SSD-1B Q4_K (~2.2 GB) — recommended, fast SDXL-class results in 4 steps
  - SDXL-Lightning 4-step Q4_0 (~2.8 GB) — ByteDance distillation, high quality in 4 steps
  - SDXL-Turbo Q4_0 (~4.2 GB) — vivid results in 1–4 steps
  - SDXL Base Q4_0 (~3.9 GB) — full SDXL at native 1024² resolution
  - SD 1.5 Q4_0 (~2.1 GB) — classic reliable all-rounder
  - SD 1.5 Q8_0 (~4.0 GB) — higher precision SD 1.5
- Adjustable steps, CFG scale, negative prompt, and image size (256² up to 1024²)
- Save generated images directly to your gallery
- Import your own GGUF model files from device storage
- Fixed crash (SIGSEGV) on second generation — was a use-after-free in the sd.cpp context; context is now reloaded before each
generation
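The shape of that fix can be sketched as follows (illustrative Python pseudocode only; the class and loader names are stand-ins, not the app's actual sd.cpp bindings): tear down and recreate the native context before every generation, instead of reusing one the previous run may already have freed.

```python
class DiffusionRunner:
    """Sketch of the reload-before-generate pattern that avoids the
    use-after-free: never reuse a context across generations."""

    def __init__(self, model_path):
        self.model_path = model_path
        self.ctx = None

    def _load_context(self):
        # Hypothetical loader standing in for the native sd.cpp context setup.
        return {"model": self.model_path, "loaded": True}

    def generate(self, prompt, steps=4, cfg_scale=1.0):
        # Reload before each generation; the old context may have been
        # freed on the native side after the previous run finished.
        self.ctx = self._load_context()
        result = f"image({prompt}, steps={steps}, cfg={cfg_scale})"
        self.ctx = None  # release the context once the run completes
        return result
```

Reloading per run costs a little startup time but makes the second and subsequent generations safe by construction.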
### Performance
- Enabled ARM dotprod + FP16 CPU kernels for all ggml-based inference (Stable Diffusion, SmolLM, llama.cpp) — these were silently disabled at build time due to cross-compilation defaults
- Upgraded ggml release builds from -O2 to -O3 optimisation
- Enabled flash attention in Stable Diffusion — reduces memory pressure during sampling
- Compiled in experimental Vulkan GPU backend for Stable Diffusion (auto-falls back to CPU if unsupported)
### Voice Input / Audio Scribe
- Removed Whisper Tiny — was hanging on transcription; Whisper Base is now the smallest option
### General
- Model cards now show descriptions when expanded — tap any model card to see details and recommended settings
- All Image Gen model cards include plain-English descriptions and recommended CFG/step settings
Two variants available:
- Box_v1.0.6_Main_Signed_Release.apk — stock Android 15+
- Box_v1.0.6_custom-rom-support_Signed_Release.apk — GrapheneOS / custom ROMs without Google Play Services
APK size reduced from ~890 MB to ~550 MB — native libraries are now compiled for arm64 only, dropping unused x86 and 32-bit
ARM builds. All devices that can run Box (Android 15+) are arm64.
### Box v1.0.5 — Main
New Models
- Gemma 4 E2B & E4B model files refreshed — updated commit hashes from HuggingFace.
New Features
Audio Scribe
Record audio directly in the app or import a WAV file. Whisper transcribes it on-device, optional speaker diarization labels each speaker, then
the LLM can summarise, analyse, or answer questions about the content. Entirely offline.
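The stages above chain together as a simple pipeline. A minimal sketch (every function argument here is a stand-in for the app's internal components, not its real API):

```python
def audio_scribe(audio, transcribe, diarize, summarise):
    """Illustrative Audio Scribe pipeline: transcribe on-device,
    optionally label speakers, then hand the transcript to the LLM."""
    segments = transcribe(audio)          # Whisper: audio -> list of segments
    if diarize is not None:
        segments = diarize(segments)      # optional: tag each segment's speaker
    transcript = "\n".join(
        f"{seg.get('speaker', 'SPK')}: {seg['text']}" for seg in segments
    )
    return summarise(transcript)          # LLM: summary / analysis / Q&A
```

Because every stage runs locally, the audio and transcript never leave the device.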
Ask Audio (multimodal chat)
Send a recorded audio clip or WAV file directly into AI Chat with Gemma 4 E2B / E4B. The model hears and responds to the content.
Text-to-Speech screen
Dedicated TTS tab for on-device voice synthesis using system TTS or an imported Piper/ONNX model.
Real-time voice reply updated
Enable in Settings to have the AI speak its reply sentence by sentence as it generates, rather than waiting for the full response. Works with
Android system TTS or an imported Piper model.
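The sentence-by-sentence behaviour amounts to buffering streamed tokens and flushing each complete sentence to the TTS engine as soon as it ends. A sketch of that logic (`speak` stands in for the system TTS or Piper call; this is not the app's actual implementation):

```python
import re

def stream_sentences(token_stream, speak):
    """Buffer streamed LLM tokens and speak each sentence as soon as it
    is complete, rather than waiting for the full response."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Flush on sentence-ending punctuation followed by whitespace or end,
        # so dots inside URLs or code do not trigger a premature flush.
        while (m := re.search(r"[.!?](\s|$)", buffer)):
            sentence, buffer = buffer[: m.end()].strip(), buffer[m.end():]
            speak(sentence)
    if buffer.strip():
        speak(buffer.strip())  # flush trailing text with no punctuation
```

The trailing flush matters: replies ending in code blocks or URLs have no sentence punctuation and would otherwise never be spoken.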
TTS voice picker
Settings → TTS Voice — choose which installed offline system voice is used for AI replies.
AI Chat shortcut
Long-press the Box icon → AI Chat. Navigates directly into chat even from a cold start.
In-app update checker
Settings → Check for updates. Fetches the latest release from GitHub and shows a direct download button if a newer version is available.
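The core of such a check is comparing the fetched release tag against the installed version segment by segment. A minimal sketch (the comparison logic is illustrative and assumes simple `vX.Y.Z` tags; it is not the app's exact code):

```python
def is_newer(latest_tag, installed):
    """Compare a release tag like 'v1.0.6' against the installed
    version numerically, segment by segment."""
    parse = lambda v: [int(x) for x in v.lstrip("v").split(".")]
    return parse(latest_tag) > parse(installed)
```

Numeric comparison avoids the classic string-comparison trap where "1.10.0" would sort below "1.9.9".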
Model import from list
Whisper and TTS models can now be imported directly from the model list screen without needing a model already downloaded.
Performance
- Speculative decoding / MTP working — capability is now checked against the model file itself, not the allowlist. Gemma 4 E2B reaches 66–91 tok/s with GPU + speculative decoding on real conversation text vs ~52 tok/s plain GPU (Galaxy S26 Ultra).
- Sustained Performance Mode — CPU/GPU clocks locked during inference. No more thermal throttling mid-conversation.
- Benchmark speculative decoding toggle — benchmark screen now exposes the toggle for models that support it.
Bug Fixes
- Fixed app shortcut navigating to home screen instead of chat on cold start.
### custom-rom-support — v1.0.5
All changes from Main above, plus:
New Models
Same model set as Main. Accelerator order includes NPU (gpu,npu,cpu) for Tensor devices on supported ROMs.
Performance
- Speculative decoding and Performance Mode as above.
- GPU is the recommended accelerator on Snapdragon. NPU path on Snapdragon is untested on custom ROMs — falls back to GPU automatically.
custom-rom-support specific
- Piper TTS (Amy) available as a built-in download in the TTS tab — no third-party TTS app needed on de-Googled ROMs.
- No Google Play Services dependency — AICore/Firebase paths are bypassed.
- TTS voice reply works out of the box with the bundled Piper engine.
Notes
This build is for GrapheneOS, LineageOS, CalyxOS, and other custom ROMs without Google Play Services. If you are on stock Android (Pixel,
Samsung, etc.) use the Main APK. If you previously installed Main, uninstall it before switching — the package ID is the same and you will
get a signature conflict.
### custom-rom-support — v1.0.4
Bug Fixes
- Fixed missing GPU option on some devices — GPU is now available as an accelerator alongside TPU and CPU
- Fixed accelerator setting resetting to GPU after the app is fully closed — your chosen accelerator (TPU, GPU, or CPU) now persists across app restarts

### custom-rom-support — v1.0.3
Non-Stock Android (GrapheneOS / LineageOS / CalyxOS)
Bug Fixes
1. Images now persist across sessions — photos sent in a conversation are saved to local storage and correctly restored when returning to or reopening a conversation. Previously, images would disappear after navigating away or restarting the app.
2. Benchmark no longer crashes when NPU is unavailable — running the benchmark on devices without NPU/AICore support (e.g. older Tensor chips on GrapheneOS/LineageOS) no longer causes a crash. The backend now correctly falls back to CPU in this case.
3. Accelerator setting no longer resets after navigation — changing the accelerator (CPU / GPU / NPU) in model config is now saved correctly. Previously the setting would silently revert to its default value after navigating away and returning.
Huge thanks to aryoda for pointing out these bugs!
### custom-rom-support — v1.0.2
I have created a separate branch called custom-rom-support, along with a corresponding release section, specifically for users on third-party Android operating systems, including but not limited to LineageOS, GrapheneOS, and CalyxOS. If you are using a custom ROM, please use the custom-rom-support branch/release instead of main; expect broken features if you run the main release or branch on a custom ROM. This branch supports TPU/NPU acceleration on Tensor devices; Snapdragon acceleration remains untested.
Note:
The primary reason for these limitations is that third-party operating
systems typically lack AICore and system-level Text-to-Speech (TTS)
components. As a result, features such as voice-to-voice mode and
NPU/GPU acceleration are either unavailable or significantly impaired on
these ROMs.
- Stock Android users should continue using the standard main branch build — this build is not needed and adds no benefit on stock Android
- Amy (Piper TTS) must be downloaded from the Voice tab before voice reply works offline
### custom-rom-support — Release Notes
Custom ROM Support (GrapheneOS · LineageOS · CalyxOS)
This release brings full voice-to-voice AI chat support for custom Android ROMs that ship
without Google services (AICore, Google TTS, Google STT).
What's New
Voice-to-Voice on Custom ROMs
- Microphone button now works on GrapheneOS and other de-Googled ROMs using the built-in Whisper speech recognition engine
- Voice Activity Detection (VAD) automatically sends your message after 1.5 seconds of silence — no need to press a button to stop recording
- Speaker button now works on custom ROMs using the Amy (Piper) TTS engine — auto-downloads and initialises on first use
- Real-time voice reply now streams speech sentence-by-sentence as the AI generates, matching stock Android behaviour
- Amy TTS pre-loads when AI Chat opens so real-time playback is ready from your first message
- Microphone automatically restarts after the AI finishes speaking, enabling continuous hands-free conversation
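The 1.5-second silence cut-off can be sketched as a simple frame counter (illustrative only; the real VAD, frame sizes, and thresholds are internal to the app):

```python
def vad_autosend(frames, is_speech, silence_frames=50):
    """Capture audio frames until speech has started and then stopped for
    `silence_frames` consecutive frames (e.g. 50 x 30 ms = 1.5 s).
    `is_speech` stands in for the real voice-activity detector."""
    captured, silent, started = [], 0, False
    for frame in frames:
        captured.append(frame)
        if is_speech(frame):
            started, silent = True, 0   # any speech resets the silence timer
        elif started:
            silent += 1
            if silent >= silence_frames:
                break                   # enough silence: auto-send the message
    return captured
```

Note that silence before the user starts speaking does not count, so the recording does not auto-send while the user is still thinking.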
Performance
- AI Chat inference is now fast on Tensor G5 devices running GrapheneOS — routes through
Android NNAPI to reach the TPU without requiring AICore
Bug Fixes
- Fixed Amy TTS model download failing on GrapheneOS due to a HuggingFace redirect URL bug
- Fixed mic not auto-restarting after AI response on custom ROMs
Settings
- Added "Non-Stock Android" notice in Settings clarifying this build is intended for custom
ROM users
### Box v1.0.2 — Voice Mode & Vision Talk / NPU/TPU Acceleration
New Features
Free Talk — Hands-Free Voice Conversation
- Continuous speech-to-speech AI conversation with no button press between turns
- Tap the mic button to start, speak naturally, AI replies and listens again automatically
- Real-time TTS: each sentence is spoken as it generates rather than waiting for the full response
- Enable Real-time voice reply in Settings to turn it on
Vision Talk — Live Camera + Voice AI
- Stream your back camera live to the AI during a voice conversation
- Point at anything and ask about it out loud — AI sees the current frame and speaks the answer back
- Fully hands-free: mic → AI sees scene → AI speaks → mic restarts automatically
- Requires a vision-capable model (Gemma 4 E2B or E4B)
TTS Voice Picker
- Choose your preferred offline system voice in Settings
- Only locally installed voices shown (network-dependent voices filtered out)
- GrapheneOS / de-Googled ROM users: install RHVoice or eSpeak NG from F-Droid
LaTeX Math Rendering
- Greek letters, operators, fractions, integrals, summations rendered inline as Unicode
- Works automatically in assistant messages
Improvements
- GGUF model context size now scales with file size to prevent hangs on Android 17 Beta 4 (MemoryLimiter) — large models use a reduced KV cache automatically
- Removed QNN Hexagon skeleton .so libraries (will be added back next release to support Snapdragon acceleration)
- Removed interactive map skill
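The context-scaling idea above can be sketched as a memory-budget heuristic. Every number in this sketch (the budget, the per-token cost, the clamps) is an assumption for illustration, not the app's actual values:

```python
def scaled_context_size(model_bytes, max_ctx=4096, min_ctx=512,
                        budget_bytes=6 * 1024**3):
    """Shrink the KV cache as the model file grows, so model weights
    plus cache stay under a fixed memory budget. All constants here
    are illustrative assumptions."""
    headroom = max(budget_bytes - model_bytes, 0)
    # Assume (for illustration) each context token costs ~512 KiB of KV cache.
    ctx = headroom // (512 * 1024)
    return max(min_ctx, min(max_ctx, ctx))
```

The effect is that small models keep the full context window while large models automatically get a reduced KV cache instead of tripping the OS memory limiter.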
Bug Fixes
- Fixed model initialization hanging indefinitely — now times out after 30 seconds with a clear error
- Fixed app crash on invalid task ID (force unwrap replaced with safe exit)
- Fixed real-time TTS stalling on responses with no sentence punctuation (code blocks, URLs now flush correctly)
- Fixed TTS audio playback wiring in AI Chat
### Box
New Inference Engines
Local Diffusion (stable-diffusion.cpp)
On-device image generation powered by stable-diffusion.cpp. Runs Stable Diffusion 1.5 in GGUF format, fully offline. Configurable steps, CFG scale, seed, and image size presets. Import
your own GGUF diffusion models.
Voice (whisper.cpp)
On-device speech-to-text powered by whisper.cpp. Record audio or pick a WAV file, transcribe offline. Supports Whisper Tiny through Small in multiple languages. Audio never leaves the
device.
NPU / TPU Acceleration
All Qualcomm Hexagon NPU variants (Snapdragon 8 Gen 2 / 8 Gen 3 / 8 Elite / newer), Google Tensor TPU (Pixel 8–10), and MediaTek NPU bundled in a single APK. Select NPU in the model
accelerator dropdown — Box auto-detects the chip at runtime.
Security
- Biometric App Lock — Optional lock via BiometricPrompt, re-prompts on every foreground
- Encrypted Chat History — All conversations persisted to SQLCipher-encrypted Room database (AES-256 at rest)
- Passphrase isolation — BiometricEncryptionManager + PassphraseHolder keep the DB key in memory only while authenticated
- Hard Offline Mode — Toggle in Settings forces full airgap; all download attempts are blocked
- Security Audit Log — On-device append-only log of security-relevant events
- Prompt sanitisation — SecurityUtils.sanitizePrompt() strips control characters before inference and persistence
- Tapjacking protection — filterTouchesWhenObscured on the chat scaffold
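The prompt-sanitisation step amounts to dropping control characters while preserving legitimate whitespace. A sketch of the idea (the real SecurityUtils.sanitizePrompt() is in Kotlin and may differ in detail):

```python
def sanitize_prompt(text):
    """Strip C0/C1 control characters (including ANSI escape bytes)
    before inference and persistence, keeping newlines and tabs."""
    return "".join(
        ch for ch in text
        if ch in "\n\t" or not (ord(ch) < 0x20 or 0x7f <= ord(ch) <= 0x9f)
    )
```

Stripping these characters closes off terminal-escape and null-byte tricks in pasted prompts without touching ordinary multi-line input.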
Chat
- Conversations persist across sessions and resume where you left off
- Chat History screen — browse, resume, or delete past conversations
- Multimodal input in AI Chat with Gemma 4 E2B / E4B — attach documents, record audio, take photos
- Improved message input and scroll behaviour
Agent Skills
New built-in skills:
- Budget Tracker
- Password Generator
- Translator
Updated skills: Calculate Hash, QR Code
Removed: Kitchen Adventure, Text Spinner
Model Management
- Import any GGUF file from local storage at runtime
- Set display name and accelerator (CPU / GPU / NPU) at import time
- Fixed: models appearing multiple times after screen rotation (deduplication guard in loadModelAllowlist)
- Fixed: Voice and Local Diffusion showing 0 models after allowlist reload
- Gradle 9.4.1 compatibility
- stable-diffusion.cpp and whisper.cpp added as native submodules
- llama.cpp updated
- Plus a whole lot more!
### Box v1.0.0 — Initial Release
Box is a security-hardened Android app for running large language models fully on-device. Forked from Google AI Edge Gallery with significant privacy and usability improvements.
Features
- On-device LLM inference via LiteRT and llama.cpp
- GGUF model support — import any compatible GGUF model from local storage
- Accelerator selection — choose CPU, GPU, or NPU per model
- Encrypted chat history — all conversations stored with SQLCipher encryption
- Biometric lock — optional fingerprint/face authentication to access the app
- Offline mode — block all network requests, run entirely air-gapped
- Chat persistence — conversations are saved and automatically resumed per model
- Copy to clipboard — long-press any AI response to copy it
Privacy
- No telemetry or data collection
- All inference runs locally on-device
- Chat history never leaves the device
- FLAG_SECURE prevents screenshots and screen recording
Notes
- Gemma models are subject to the Gemma Terms of Use
- GGUF import requires models compatible with llama.cpp



