flatbuffer_direct Migration Guide

Goal

Operate with the default flatbuffer_direct backend while preserving production stability and diagnosability, and keep tf_converter available only as an explicit compatibility path when needed.

Backend differences (quick view)

| Item | tf_converter | flatbuffer_direct |
| --- | --- | --- |
| Default | No (explicit fallback) | Yes |
| Final generation path | TensorFlow Lite Converter | Direct FlatBuffer builder |
| Optimization behavior | TF-path accumulated rewrites/heuristics | Direct preprocess + strict dispatch constraints |
| Failure model | Many patterns absorbed by TF conversion | Explicit failure with reason_code |
| Custom op path | Implicitly minimized by TF path | Explicit opt-in + allowlist |
| Fallback | N/A | N/A (no fallback) |

Recommended rollout

  1. Keep the default CI lane on flatbuffer_direct and enable --report_op_coverage there.
  2. Add an explicit compatibility lane with --tflite_backend tf_converter if you still need to monitor legacy behavior.
  3. Resolve direct-path failures by reason_code and adjust model/export options.
  4. Enable quantization and split evaluation only after float32/float16 conversion is stable.
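
The two CI lanes from steps 1 and 2 can be sketched as a small command builder. This is a minimal sketch, not part of onnx2tf: the helper name `build_lane_command` and the lane labels `direct` and `compat` are hypothetical, and only flags that appear in this guide are used.

```python
import sys


def build_lane_command(model_path, out_dir, lane="direct"):
    """Return the argv list for one CI lane (hypothetical helper)."""
    cmd = [sys.executable, "-m", "onnx2tf.onnx2tf", "-i", model_path, "-o", out_dir]
    if lane == "direct":
        # Default backend; op coverage reporting enabled as in step 1.
        cmd.append("--report_op_coverage")
    elif lane == "compat":
        # Explicit compatibility lane as in step 2.
        cmd += ["--tflite_backend", "tf_converter"]
    else:
        raise ValueError(f"unknown lane: {lane}")
    return cmd
```

A CI job could pass the returned list to `subprocess.run(cmd, check=True)` so that a non-zero exit code from the converter fails the lane.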

Stage-by-stage commands

Stage 0: Baseline direct export + diagnostics

python -m onnx2tf.onnx2tf \
  -i model.onnx \
  -o out \
  --report_op_coverage

Stage 1: Quantization + ONNX-based accuracy check

python -m onnx2tf.onnx2tf \
  -i model.onnx \
  -o out \
  -odrqt -oiqt \
  --eval_with_onnx \
  --eval_target_tflite full_integer_quant \
  --eval_compare_mode dequant \
  --report_op_coverage

Stage 2: Split generation + split accuracy check

python -m onnx2tf.onnx2tf \
  -i model.onnx \
  -o out \
  --auto_split_tflite_by_size \
  --tflite_split_target_bytes 1060000000 \
  --tflite_split_max_bytes 1073741824 \
  --eval_split_models \
  --report_op_coverage

Stage 3: Production strict-fail operation

python -m onnx2tf.onnx2tf \
  -i model.onnx \
  -o out

When direct export fails, conversion stops with an explicit error. Use tf_converter explicitly if the legacy TensorFlow Lite Converter path is still required operationally.

Preprocess scope in direct path

flatbuffer_direct applies staged preprocess rules before lowering:

  1. pattern_fusion_wave2
    • ReLU/Clip chain normalization
    • GELU chain fusion
    • SpaceToDepth chain fusion
  2. pseudo_ops_wave1
    • HardSwish / LeakyRelu / PRelu / Gelu / limited Pow rewrites
  3. constant_fold_a5
    • Limited constant folding for shape/axes and arithmetic helper chains
  4. normalize_attrs_a5
    • perm/axes normalization and softmax-axis bridge insertion

Use preprocess_report.applied_rules in *_op_coverage_report.json to inspect actual rewrites.
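
Reading the applied rules out of a report might look like the sketch below. It assumes that `preprocess_report` is a top-level object in the JSON file and that `applied_rules` is a list inside it; the guide names the key path but not the exact layout, so adjust the lookups if your report differs. The function name is hypothetical.

```python
import json


def applied_rules(report_path):
    """Return the preprocess rules recorded in an op coverage report."""
    with open(report_path, encoding="utf-8") as f:
        report = json.load(f)
    # Assumption: "preprocess_report" is a top-level object and
    # "applied_rules" is a list inside it. Missing keys yield [].
    return report.get("preprocess_report", {}).get("applied_rules", [])
```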

Custom OP policy

Use custom-op lowering only when builtin mapping is not feasible.

python -m onnx2tf.onnx2tf \
  -i model.onnx \
  -o out \
  --tflite_backend flatbuffer_direct \
  --flatbuffer_direct_allow_custom_ops \
  --flatbuffer_direct_custom_op_allowlist Einsum,TopK \
  --report_op_coverage

Behavior:

  1. Without custom-op enablement, custom candidates fail with reason_code=custom_op_candidate_disabled.
  2. If an allowlist is specified and the op is not in it, conversion fails with reason_code=custom_op_not_in_allowlist.
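
The two rules above can be modeled as a small decision function. This is an illustrative sketch, not onnx2tf's implementation: the function name and parameters are hypothetical, and the case where custom ops are enabled without any allowlist is assumed to accept the candidate, which the guide does not state.

```python
def custom_op_reason_code(op_type, allow_custom_ops=False, allowlist=None):
    """Return the reason_code for a custom-op candidate, or None if accepted."""
    if not allow_custom_ops:
        # Rule 1: custom-op lowering was not enabled.
        return "custom_op_candidate_disabled"
    # Rule 2: an allowlist only restricts candidates when one was given.
    # Assumption: with no allowlist, an enabled candidate is accepted.
    if allowlist is not None and op_type not in allowlist:
        return "custom_op_not_in_allowlist"
    return None
```

With the allowlist from the command above, `custom_op_reason_code("Einsum", True, {"Einsum", "TopK"})` returns `None`, while an op outside the set returns `custom_op_not_in_allowlist`.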

Known limitations and mitigation

| Symptom (reason_code) | Cause | Mitigation |
| --- | --- | --- |
| unsupported_onnx_op | No direct builtin/custom path | Use tf_converter or model rewrite |
| requires_constant_input | Dynamic axes/perm/shape where constants are required | Pre-fold graph (onnxsim) or rewrite to constants |
| unsupported_attribute_value | Direct constraints unmet (axis/rank/mode) | Adjust exporter flags or rewrite subgraph |
| custom_op_candidate_disabled | Custom candidate encountered while custom mode disabled | Enable custom ops only if runtime supports them |
| custom_op_not_in_allowlist | Candidate op not in allowlist | Add to allowlist explicitly |
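
For triage scripts, the table above can be encoded as a lookup from reason_code to mitigation. A minimal sketch: the dictionary contents come from the table, while the names `MITIGATIONS` and `suggest_mitigation` and the fallback message are hypothetical.

```python
# Mitigations copied from the limitations table, keyed by reason_code.
MITIGATIONS = {
    "unsupported_onnx_op": "Use tf_converter or model rewrite",
    "requires_constant_input": "Pre-fold graph (onnxsim) or rewrite to constants",
    "unsupported_attribute_value": "Adjust exporter flags or rewrite subgraph",
    "custom_op_candidate_disabled": "Enable custom ops only if runtime supports them",
    "custom_op_not_in_allowlist": "Add to allowlist explicitly",
}


def suggest_mitigation(reason_code):
    """Return the documented mitigation, or a generic hint for unknown codes."""
    return MITIGATIONS.get(reason_code, "No documented mitigation; inspect the op coverage report")
```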

Report files

  1. Accuracy report: *_accuracy_report.json
  2. Split plan: *_split_plan.json
  3. Split manifest: *_split_manifest.json
  4. Split accuracy: *_split_accuracy_report.json
  5. OP coverage: *_op_coverage_report.json
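
A CI step that collects these files could group them by suffix, as in the sketch below. It assumes the reports are written directly into the output directory, which this guide does not state; the helper name and the kind labels are hypothetical. Longer suffixes are checked first because `*_accuracy_report.json` would otherwise also match split accuracy reports.

```python
from pathlib import Path

# File name suffixes taken from the list above.
REPORT_SUFFIXES = {
    "accuracy": "_accuracy_report.json",
    "split_plan": "_split_plan.json",
    "split_manifest": "_split_manifest.json",
    "split_accuracy": "_split_accuracy_report.json",
    "op_coverage": "_op_coverage_report.json",
}


def classify_reports(out_dir):
    """Group JSON report files found in out_dir by report kind."""
    found = {kind: [] for kind in REPORT_SUFFIXES}
    # Longest suffix first, so "_split_accuracy_report.json" is not
    # misfiled under "_accuracy_report.json".
    ordered = sorted(REPORT_SUFFIXES.items(), key=lambda kv: len(kv[1]), reverse=True)
    for path in sorted(Path(out_dir).glob("*.json")):
        for kind, suffix in ordered:
            if path.name.endswith(suffix):
                found[kind].append(path)
                break
    return found
```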

Operational checklist

  1. Keep the default flatbuffer_direct lane green at all times.
  2. Keep an explicit tf_converter lane only if you still rely on that compatibility path.
  3. Gate flatbuffer_direct rollout by model family (small -> medium -> large).
  4. Require --report_op_coverage in CI for the direct lane.
  5. Review unsupported_reason_counts and custom_op_policy for every failure.
  6. Avoid custom-op expansion unless runtime/serving side is ready.
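
Checklist item 5 can be partly automated by summarizing `unsupported_reason_counts` from a loaded report. This is a sketch under assumptions: the key is taken to sit at the top level of the op coverage report and to map each reason_code to an integer count, neither of which is confirmed by this guide. The function name is hypothetical.

```python
def coverage_failures(report):
    """Return (reason_code, count) pairs with a non-zero count, highest first."""
    # Assumption: "unsupported_reason_counts" is a top-level mapping of
    # reason_code -> integer count in the op coverage report.
    counts = report.get("unsupported_reason_counts", {})
    nonzero = [(code, n) for code, n in counts.items() if n > 0]
    # Sort by descending count, then by code for a stable order.
    return sorted(nonzero, key=lambda item: (-item[1], item[0]))
```

A CI gate could load the report with `json.load`, call this function, and fail the job when the returned list is not empty.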