- Introduction
- Core Concepts
- Installation
- Basic Usage
- Advanced Usage
- Working with Models
- Device Management
- Performance Optimization
- Analyzing Results
- Best Practices
- FAQ
## Introduction
OVMobileBench is a comprehensive benchmarking pipeline for OpenVINO inference on mobile devices. It automates the entire workflow from building the runtime to generating performance reports.
- End-to-end automation: From build to report in one command
- Multi-device support: Android (primary), Linux ARM, iOS (planned)
- Flexible configuration: YAML-based experiment definitions
- Rich metrics: Throughput, latency, device utilization
- Reproducible results: Full provenance tracking
## Core Concepts

The pipeline consists of six stages:
- Build: Compile OpenVINO runtime for target platform
- Package: Bundle runtime, libraries, and models
- Deploy: Transfer bundle to target device(s)
- Run: Execute benchmarks with specified parameters
- Parse: Extract metrics from benchmark output
- Report: Generate structured reports
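The six stages above run strictly in sequence, each consuming the previous stage's output. A minimal sketch of that flow (the stage names come from this guide; the function body is a placeholder, not the real OVMobileBench internals):

```python
# Sketch of the stage ordering; each stage is a placeholder, not the
# actual OVMobileBench implementation.
STAGES = ("build", "package", "deploy", "run", "parse", "report")

def run_pipeline(config: dict) -> dict:
    """Thread a state dict through every stage in order."""
    state = {"config": config}
    for stage in STAGES:
        # A real stage would do work here; we only record that it ran.
        state[stage] = f"{stage}: ok"
    return state

state = run_pipeline({"project": {"name": "demo"}})
print(list(state))  # ['config', 'build', 'package', 'deploy', 'run', 'parse', 'report']
```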
All experiments are defined in YAML files with these sections:
- `project`: Experiment metadata
- `build`: OpenVINO build configuration
- `device`: Target device settings
- `models`: Neural network models to benchmark
- `run`: Benchmark execution parameters
- `report`: Output format and destinations
The run matrix defines parameter combinations to test:
- `niter`: Number of iterations
- `api`: Sync or async execution
- `nireq`: Number of inference requests
- `nstreams`: Number of parallel streams
- `threads`: CPU thread count
- `device`: Target plugin (CPU, GPU, etc.)
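Each benchmark run is one combination drawn from the cross-product of these lists, so matrices grow quickly. A quick way to see how large a matrix gets (the values are a subset of the example configuration later in this guide):

```python
from itertools import product

# A subset of the example run matrix from this guide.
matrix = {
    "niter": [100, 200, 500],
    "api": ["sync", "async"],
    "nireq": [1, 2, 4, 8],
    "threads": [1, 2, 4, 8, 16],
}

# Cross-product: every combination of one value per key.
combos = [dict(zip(matrix, values)) for values in product(*matrix.values())]
print(len(combos))  # 3 * 2 * 4 * 5 = 120 runs (before repeats)
```

Multiply by the `repeats` setting to get the total number of benchmark executions.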
## Installation

Prerequisites:

- Python 3.11+
- Git
- CMake 3.24+
- Ninja 1.11+
- Android NDK r26d+ (for Android targets)
- Android SDK Platform Tools
Clone the repository and install in development mode:

```bash
git clone https://github.com/embedded-dev-research/OVMobileBench.git
cd OVMobileBench
pip install -e ".[dev]"
```

Or install only the runtime dependencies:

```bash
pip install -r requirements.txt
pip install -e .
```

Set the required environment variables:

```bash
# Android development
export ANDROID_NDK_HOME=/path/to/android-ndk-r26d
export ANDROID_HOME=/path/to/android-sdk
export PATH=$ANDROID_HOME/platform-tools:$PATH

# OpenVINO (if using prebuilt)
export INTEL_OPENVINO_DIR=/opt/intel/openvino
source $INTEL_OPENVINO_DIR/setupvars.sh
```

## Basic Usage

Run the entire pipeline with a single command:

```bash
ovmobilebench all -c experiments/config.yaml --verbose
```

Or run the stages individually:

```bash
# Build OpenVINO
ovmobilebench build -c experiments/config.yaml

# Create deployment package
ovmobilebench package -c experiments/config.yaml

# Deploy to devices
ovmobilebench deploy -c experiments/config.yaml

# Run benchmarks
ovmobilebench run -c experiments/config.yaml

# Generate reports
ovmobilebench report -c experiments/config.yaml
```

Common CLI invocations:

```bash
ovmobilebench --help                          # Show help
ovmobilebench all --help                      # Show help for 'all' command
ovmobilebench all -c config.yaml              # Run with config
ovmobilebench all -c config.yaml -v           # Verbose output
ovmobilebench all -c config.yaml --dry-run    # Preview without execution
```

## Advanced Usage

Use a prebuilt OpenVINO runtime:

```yaml
build:
  enabled: false
  openvino_repo: "/path/to/prebuilt/openvino"
```

Or build OpenVINO from source:

```yaml
build:
  enabled: true
  openvino_repo: "/path/to/openvino/source"
  build_type: "Release"
  options:
    ENABLE_INTEL_GPU: "OFF"
    ENABLE_ONEDNN_FOR_ARM: "ON"
    ENABLE_PYTHON: "OFF"
    CMAKE_CXX_FLAGS: "-march=armv8.2-a"
```

Target several devices at once:

```yaml
device:
  kind: "android"
  serials: ["device1", "device2", "device3"]
  push_dir: "/data/local/tmp/ovmobilebench"
```

Define the run matrix and repeats:

```yaml
run:
  repeats: 5
  cooldown_sec: 30
  matrix:
    niter: [100, 200, 500]
    api: ["sync", "async"]
    nireq: [1, 2, 4, 8]
    nstreams: ["1", "2", "AUTO"]
    device: ["CPU", "GPU"]
    threads: [1, 2, 4, 8, 16]
    infer_precision: ["FP32", "FP16", "INT8"]
```

Tag reports for provenance:

```yaml
report:
  tags:
    branch: "feature/optimization"
    experiment: "thread-scaling"
    hardware: "snapdragon-888"
    owner: "maintainer"
```

## Working with Models

Download and convert a model from the Open Model Zoo:

```bash
# Download model
omz_downloader --name resnet-50-tf -o models/

# Convert to IR format
omz_converter --name resnet-50-tf --precision FP16 -d models/
```

Convert an ONNX model directly:

```bash
# Convert ONNX model
mo --input_model model.onnx --output_dir models/ --data_type FP16
```

Declare models in the configuration:

```yaml
models:
  - name: "resnet50"
    path: "models/resnet50_fp16.xml"
    precision: "FP16"
    tags:
      dataset: "imagenet"
      accuracy: "76.1%"
  - name: "mobilenet_v2"
    path: "models/mobilenet_v2_int8.xml"
    precision: "INT8"
    tags:
      dataset: "imagenet"
      compressed: true
```

Verify model files:

```bash
# List available models
ls -la models/

# Verify model files
ovmobilebench validate-models -c experiments/config.yaml

# Calculate checksums
sha256sum models/*.xml models/*.bin > models/checksums.txt
```

## Device Management

Set up an Android device:

```bash
# Enable Developer Options and USB Debugging
# Connect device via USB

# Verify connection
adb devices

# Get device information
adb shell getprop ro.product.model
adb shell getprop ro.board.platform
```

Android device configuration:

```yaml
device:
  kind: "android"
  serials: ["R3CN30XXXX"]
  push_dir: "/data/local/tmp/ovmobilebench"
  use_root: false
```

Set up a Linux device over SSH:

```bash
# Set up SSH key authentication
ssh-copy-id user@device.local

# Test connection
ssh user@device.local uname -a
```

Linux SSH device configuration:

```yaml
device:
  kind: "linux_ssh"
  host: "192.168.1.100"
  user: "ubuntu"
  key_path: "~/.ssh/id_rsa"
  push_dir: "/home/ubuntu/ovmobilebench"
```

## Performance Optimization

Reduce measurement noise on Android:

```bash
# Android: Disable animations
adb shell settings put global window_animation_scale 0
adb shell settings put global transition_animation_scale 0
adb shell settings put global animator_duration_scale 0

# Turn screen off
adb shell input keyevent 26

# Set CPU governor (requires root)
adb shell "echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"
```

Let the device cool between runs:

```yaml
run:
  cooldown_sec: 60   # Wait between runs
  warmup_runs: 2     # Discard initial runs
```

Pin execution to the big cores:

```bash
# Pin to big cores (device-specific)
export TASKSET_MASK="0xF0"  # Cores 4-7
```

Enable build-time optimizations:

```yaml
build:
  options:
    ENABLE_LTO: "ON"    # Link-time optimization
    CMAKE_BUILD_TYPE: "Release"
```

Sweep threading configurations:

```yaml
run:
  matrix:
    threads: [1, 2, 4, 8]   # Test different thread counts
    nstreams: ["AUTO"]      # Let OpenVINO optimize
```

## Analyzing Results

Key metrics:

- Throughput (FPS): Inferences per second
- Latency (ms): Time per inference
- Average: Mean across all iterations
- Median: Middle value (robust to outliers)
- Min/Max: Range of values
- Efficiency: FPS per thread
- Scaling: Performance vs. resource usage
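Given per-iteration latencies, the summary statistics above can be computed with the standard library. The sample values here are made up for illustration:

```python
import statistics

# Hypothetical per-iteration latencies in milliseconds; the 30.5 ms
# value simulates an outlier (e.g. a thermal or scheduling hiccup).
latencies_ms = [12.1, 11.8, 12.4, 30.5, 11.9, 12.0]

avg = statistics.mean(latencies_ms)     # pulled upward by the outlier
med = statistics.median(latencies_ms)   # robust to the outlier
lo, hi = min(latencies_ms), max(latencies_ms)
fps = 1000.0 / med                      # throughput implied by median latency

print(f"avg={avg:.2f} ms  median={med:.2f} ms  min/max=[{lo}, {hi}] ms  ~{fps:.0f} FPS")
```

Note how the mean (15.12 ms) is dragged far from the median (12.05 ms) by a single outlier, which is why the median is usually the better headline number.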
Inspect results from the command line:

```bash
# Quick view
cat experiments/results/output.csv

# Pretty JSON
python -m json.tool experiments/results/output.json

# Import to pandas
python -c "import pandas as pd; df = pd.read_csv('output.csv'); print(df.describe())"
```

Plot results with pandas and matplotlib:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Load results
df = pd.read_csv('results.csv')

# Group by configuration
grouped = df.groupby(['threads', 'nstreams'])['throughput_fps'].median()

# Plot
grouped.plot(kind='bar')
plt.ylabel('Throughput (FPS)')
plt.title('Performance by Configuration')
plt.show()
```

Compare against a baseline to catch regressions:

```python
import pandas as pd

# Compare with baseline (assumes both files list the same
# configurations in the same row order)
baseline = pd.read_csv('baseline.csv')
current = pd.read_csv('current.csv')

# Calculate relative change in percent
regression = (current['throughput_fps'] - baseline['throughput_fps']) / baseline['throughput_fps'] * 100

# Flag regressions > 5%
regressions = regression[regression < -5]
if not regressions.empty:
    print(f"Performance regressions detected: {regressions}")
```

## Best Practices

- Start simple: Single model, single device
- Isolate variables: Change one parameter at a time
- Multiple runs: Use repeats ≥ 3 for statistical validity
- Document context: Record temperature, battery, background apps
- Version control: Track configurations in git
- Record metadata: Build flags, device info, timestamps
- Use checksums: Verify model integrity
- Archive results: Keep raw outputs
- Warm up: Discard initial runs
- Cool down: Prevent thermal throttling
- Stable power: Use consistent charging state
- Minimize noise: Disable unnecessary services
- No secrets in configs: Use environment variables
- Validate inputs: Check model sources
- Limit permissions: Avoid root when possible
- Clean up: Remove temporary files
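The "use checksums" point above can be automated on the host. A sketch that re-verifies a `sha256sum`-style manifest like the `models/checksums.txt` generated earlier (the base-directory parameter is an assumption of this sketch; manifest paths are resolved against it):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Stream the file so large .bin weight files never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(manifest: Path, base_dir: Path) -> list[str]:
    """Return names of files whose hash no longer matches the manifest.

    Each manifest line uses the `sha256sum` format: '<hex digest>  <path>'.
    """
    mismatches = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        if sha256_of(base_dir / name) != expected:
            mismatches.append(name)
    return mismatches
```

For example, `verify_manifest(Path("models/checksums.txt"), Path("."))` returns an empty list when every model file still matches its recorded hash.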
## FAQ

### Can I use a prebuilt OpenVINO instead of building from source?

Yes, set `build.enabled: false` and point to your prebuilt OpenVINO directory.
### How do I benchmark on multiple devices?

List multiple device serials in the configuration:

```yaml
device:
  serials: ["device1", "device2", "device3"]
```

### What is the difference between the sync and async APIs?

- Sync: Blocking inference; simpler, good for single-request scenarios
- Async: Non-blocking; allows parallel requests, better throughput
### Should I optimize for latency or throughput?

- Latency: Use the sync API with `nireq=1`; optimize single-thread performance
- Throughput: Use the async API with multiple `nireq`; optimize parallelism
### Can I benchmark models with custom layers?

Yes, ensure your custom layers are built into OpenVINO and the model uses them correctly.
### What can I do about thermal throttling?

- Increase cooldown time between runs
- Use external cooling if available
- Monitor temperature during benchmarks
- Run shorter iterations
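Temperature can be monitored from the host between runs. A sketch that reads an Android thermal zone over `adb shell` (thermal zone paths vary by device; `thermal_zone0` and the millidegree format are assumptions to verify on your hardware):

```python
import subprocess

def millideg_to_c(raw: str) -> float:
    """Convert the sysfs millidegree string (e.g. '45000') to degrees Celsius."""
    return int(raw.strip()) / 1000.0

def read_temp_c(serial: str, zone: int = 0) -> float:
    """Read one thermal zone via adb; requires a connected device."""
    out = subprocess.run(
        ["adb", "-s", serial, "shell",
         f"cat /sys/class/thermal/thermal_zone{zone}/temp"],
        capture_output=True, text=True, check=True,
    ).stdout
    return millideg_to_c(out)
```

Polling this between matrix runs and waiting until the reading drops below a threshold is a simple way to make cooldowns adaptive instead of fixed.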
### What configuration should I start with?

Start with a minimal matrix:

```yaml
run:
  matrix:
    niter: [100]
    api: ["sync"]
    nireq: [1]
    nstreams: ["1"]
    threads: [4]
```

### What should I check when a run fails?

- Use the `--verbose` flag
- Check logs in the `artifacts/` directory
- Run individual stages separately
- Verify device connectivity
- Check available disk space
### Can I export results to Excel?

Yes, the CSV output can be opened directly in Excel or converted:

```python
import pandas as pd

df = pd.read_csv('results.csv')
df.to_excel('results.xlsx', index=False)  # requires openpyxl
```

### How do I benchmark INT8 models?

- Quantize your model to INT8
- Specify precision in model config
- Ensure CPU plugin supports INT8 (ARM may have limitations)
### Where can I get help?

- GitHub Issues - Bug reports and feature requests
- Documentation - This guide and API reference
- Discussions - Project discussions
- Email: nesterov.alexander@outlook.com