Feat/abdul/yolo #27
Open
AbdelrahmanKatkat wants to merge 74 commits into
…files, and tests
- Introduced YOLOv8-v1 and YOLOv8-v2 for building footprint segmentation.
- Added ZenML pipelines for training and inference.
- Created Dockerfiles for isolated runtime environments.
- Implemented comprehensive smoke tests to validate functionality.
- Updated .gitignore to include new sample data directories.
…from STAC
- Added functions to load model weights and hyperparameters from STAC Item JSON files.
- Updated preprocess and training_pipeline functions to utilize loaded hyperparameters.
- Enhanced stac-item.json files for both YOLOv8-v1 and YOLOv8-v2 with additional metadata and structure.
- Improved documentation for clarity on model configuration and usage.
- Deleted YOLOv8-v1 Dockerfile, pipeline, README, STAC item, and tests to streamline the model repository.
- Updated .gitignore to exclude new directories for runs and weights.
- Consolidated focus on YOLOv8-v2 for building footprint segmentation.
Feature/yolo
…artifact tracking
- Modified the run_preprocessing function to return a list of tuples containing image data and corresponding label data.
- Enhanced error handling in train_model to raise an error if the data loader is empty.
- Updated training_pipeline to accommodate the new data loader structure.
…into feature/yolo
- Removed unnecessary line breaks and consolidated code for better clarity.
- Updated error message formatting for consistency.
- Minor adjustments in the test file for improved readability.
refactor(pipeline): streamline code formatting and improve readability
- Introduced a new step to split an existing YOLO dataset into train and validation sets.
- Implemented shuffling and validation fraction control via hyperparameters.
- Ensured proper directory structure and error handling for dataset integrity.
- Updated the training pipeline to include the dataset splitting step.
- Updated the `run_preprocessing` function to return a list of tuples containing image data and corresponding label data for ZenML artifact tracking.
- Added error handling to ensure the data loader is not empty before proceeding with model training.
- Adjusted the training pipeline to utilize the new data loader structure.
- Added a new function to resolve input directories for local and remote datasets.
- Updated the `preprocess` function to return the preprocessed directory path.
- Refactored the `split_dataset` function to generate YOLO train/val splits and return split metadata.
- Adjusted the smoke test to validate the new preprocessing and dataset splitting workflow.
- Added `split_seed` parameter to the configuration for reproducibility.
- Added a missing comma in the metadata dictionary returned by the split_dataset function for improved syntax correctness.
Merge : Master
kshitijrajsharma
requested changes
Apr 12, 2026
Member
- Rename yolov8v2 to yolo_v8_segmentation, as we will no longer have multiple YOLO versions.
- Get rid of the bash scripts in the tests and move the tests to production pytest test cases as defined in the instructions; see https://hotosm.github.io/fAIr-models/contributing/model/#testing and the example at https://github.com/hotosm/fAIr-models/tree/master/models/yolo11n_detection/tests . Tests should validate each function defined in the pipeline.
- Split the Dockerfile into 3 stages (builder, runtime, and test); see https://hotosm.github.io/fAIr-models/contributing/model/#dockerfile, with an example at https://github.com/hotosm/fAIr-models/blob/master/models/yolo11n_detection/Dockerfile
- Slim down README.md into a user-friendly model card rather than a record of development decisions; those can live in the PR description.
- I haven't reviewed the pipeline yet. The CI will validate the pipeline first, and then I will have a look in the near future.
…ing and update p_val parameter
…related documentation
…rovider info to stac-item.json
… into feat/abdul/yolo
… and improve GeoJSON processing
…rained_model_artifact
…eoJSON/JSON files and improved error handling
…rties in label processing
…rror handling for EPSG:4326 normalization
…n and enhance logging in training processes
… evaluation functions; update CI workflow to prevent cancellation of long-running builds
…ter adjustments; update CI workflow to allow cancellation of in-progress runs
…est script with dynamic model URI and increased predict timeout
… for training module
…for YOLO segmentation
…o improve performance
…mes in image preparation
…mance and correct is_identity property usage in image preparation
…eturn GeoJSON features. Update prediction logic to utilize the new postprocess function for improved clarity and maintainability.
…n pipeline to streamline the codebase.
… improved code readability.
What does this PR do?
Adds YOLOv8 building-footprint segmentation (`yolo_v8_segmentation`): instance building-footprint segmentation for very high resolution RGB aerial imagery, built on an Ultralytics YOLOv8-Seg architecture (PAN-FPN neck with detection and mask-prototype heads). Output is GeoJSON polygons in EPSG:4326, one per detected building, with a per-polygon confidence score.
Summary
`FeatureCollection` of `Polygon` features in EPSG:4326, each with `class` and `confidence`
When to pick this model
`confidence_threshold`, `iou_threshold`) provide practical control over precision vs recall.
Intended use
Direct inference on OpenAerialMap tiles, and optional fine-tuning on small downstream labelled sets (~100–200 chips) when local imagery differs from the base training distribution.
Typical workflow
Where it works best
Where to use caution
How to use
Live inference
A long-running container exposes `POST /predict` on port 8080.

    {
      "model_uri": "https://huggingface.co/hotosm/yolo/resolve/ff7f436c881a3fa02ce574f9e4cab6ac2f0a16da/yolov8s-seg.onnx",
      "image_uri": "https://tiles.openaerialmap.org/.../{z}/{x}/{y}",
      "bbox": [west, south, east, north],
      "zoom": 18,
      "params": {"confidence_threshold": 0.5, "iou_threshold": 0.3}
    }

The server fetches tiles for the requested bbox, runs inference, and returns a GeoJSON `FeatureCollection`. Each feature has `properties.class = 1` and `properties.confidence` in `[0, 1]`.
Fine-tuning on a local area
Drop a directory of RGB OAM chips and a single `labels.geojson` of OSM building polygons into the platform, then trigger `training_pipeline`. The pipeline produces a fine-tuned ONNX and a metrics report.
Inference parameters
Catalog defaults come from the model STAC item.
`confidence_threshold`
`iou_threshold`
Inputs and outputs
Input contract
A directory of georeferenced RGB GeoTIFF chips (`.tif` / `.tiff` / `.png` with `.aux.xml` sidecars). The platform tile downloader produces this layout automatically for any TMS URL plus a bounding box.
Output contract
A GeoJSON `FeatureCollection` in EPSG:4326. Each feature:

    {
      "type": "Feature",
      "properties": {"class": 1, "confidence": 0.84},
      "geometry": {"type": "Polygon", "coordinates": [...]}
    }

The confidence is the per-instance detection score from the YOLOv8-Seg head after thresholding and NMS.
Compute footprint
Compute footprint
Model size
.pt)
Reference inference benchmark
Standardised CPU-only baseline for capacity planning. Single-threaded ONNX Runtime, cold session, synthetic RGB input. Measured on Intel Core i7-14650HX.
Workload: one OAM chip per forward pass (256×256 chip resized internally to the exported ONNX input 640×640).
Note: training and dataset chips use a 256×256 contract (`training.imgsz=256`); the exported baseline ONNX tensor is 640×640, so inference resizes each chip before the forward pass.
Estimating larger AOIs
For a bbox that requires N OAM tiles at a given zoom:
Examples on this baseline (single-thread, session loaded once):
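As a rough aid for this kind of estimate, N can be counted with standard XYZ (slippy-map) tile arithmetic. The sketch below is not taken from the pipeline: it assumes the tile source follows the usual Web Mercator scheme, and `count_tiles`, `estimate_seconds`, `per_tile_s`, and `load_s` are hypothetical names. The latency figures must come from your own benchmark; none are supplied here.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the XYZ tile index containing a lon/lat point (Web Mercator)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern/southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def count_tiles(bbox: tuple[float, float, float, float], zoom: int) -> int:
    """Count tiles covering bbox = (west, south, east, north) at a zoom level."""
    west, south, east, north = bbox
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left tile
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)


def estimate_seconds(n_tiles: int, per_tile_s: float, load_s: float = 0.0) -> float:
    """Assumed linear model: one session load plus one forward pass per tile."""
    return load_s + n_tiles * per_tile_s
```

The linear cost model is an assumption that ignores tile download time and postprocessing, so treat the result as a lower bound for capacity planning.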
Architecture
Training data and recipe
Base weights (published checkpoint)
`yolov8s_v2-seg.pt`: YOLOv8s-seg initialized for single-class building footprint segmentation
`yolov8s-seg.onnx` (baseline inference artifact)
Per-area fine-tune (quality Banepa run)
`buildings-banepa-instance-segmentation`
`val_ratio=0.2`, `split_seed=42`)
Normalisation
`imgsz=256` (chip PNG/TIF → model input)
Loss (Ultralytics YOLOv8 segmentation)
Composite loss from the utilities `HYPERPARAM_CHANGES` recipe:
`box = 7.48109`
`cls = 0.775`
`dfl = 1.5`
`yolov8*-seg`)
Optimiser and schedule
`auto`
`auto` (Ultralytics selects AdamW, effective lr ≈ 0.002)
0.00854
`learning_rate` when optimizer ≠ `auto`
0.01232
0.95275
0.00058
3.82177
0.81423
fAIr-models patches only `optimizer` and `lr0` from STAC; all other hyperparameters remain the utilities recipe.
Regularisation and augmentation
true0(off)015.750.5/0.255
hsv_h=0.01269, hsv_s=0.68143, hsv_v=0.270
falsetrue
pc) 3.0 (STAC; applied via `YOLOSegWithPosWeight`)
Batch size and schedule (quality Banepa finetune)
Evaluation
Object-level metrics use polymetrics (IoU@0.5, Hungarian matching).
`confidence_threshold=0.5`, `iou_threshold=0.3`
`yolov8s-seg.onnx`)
`pc=3.0`)
Banepa, Nepal (standard test patch, 2720 OSM GT polygons)
The fine-tune improves object-level footprint matching on this patch: +11.6 pp F1 vs baseline (0.526 vs 0.410), driven mainly by higher recall (+13.5 pp) with improved precision (+7.2 pp).
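For readers checking the arithmetic, object-level precision, recall, and F1 follow from matched counts, and a percentage-point (pp) change is the plain difference of two scores times 100. The sketch below is illustrative only: `prf1` and `pp_delta` are hypothetical helpers, not part of polymetrics. The only figures taken from this report are the two F1 values; TP/FP/FN counts would come from the IoU@0.5 matching step.

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 from matched (TP) and unmatched (FP, FN) counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def pp_delta(new: float, old: float) -> float:
    """Difference between two scores in percentage points, rounded to 0.1 pp."""
    return round((new - old) * 100, 1)


# F1 values reported above: fine-tuned 0.526 vs baseline 0.410.
f1_gain_pp = pp_delta(0.526, 0.410)
```

Here `f1_gain_pp` reproduces the +11.6 pp figure quoted for F1.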
Chip-level validation (training signal, not polymetrics)
`best.pt`)
`best.pt`)
Train/val split (for the local finetune step)
`val_ratio`
`split_seed`
Fine-tuning details
The `train_model` ZenML step expects a directory of OAM RGB chips plus a single GeoJSON of OSM building polygons. Labels are preprocessed into the YOLO training layout; training runs through `hot_fair_utilities.training.yolo_v8.train` with the `HYPERPARAM_CHANGES` recipe.
Default fine-tuning budget (STAC catalog): 30 epochs, batch 16, `pc=3.0`, optimizer `auto`, learning rate 0.002, `imgsz=256`, `val_ratio=0.2`, `split_seed=42`.
License
Apache-2.0
Citation