██████╗ ██╗██╗  ██╗ █████╗ ██╗   
 ██╔══██╗██║╚██╗██╔╝██╔══██╗██║     
 ██████╔╝██║ ╚███╔╝ ███████║██║     
 ██╔═══╝ ██║ ██╔██╗ ██╔══██║██║     
 ██║     ██║██╔╝ ██╗██║  ██║███████╗ 
 ╚═╝     ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝
  PIXAL – PIXel-based Anomaly Locator

PIXAL (PIXel-based Anomaly Locator) is a modular deep learning framework designed for image-based anomaly detection in high-resolution scientific data. Currently applied to identifying defects in detector hardware components for the ATLAS experiment, PIXAL supports training and validation of deep neural networks, with a focus on Autoencoder-based architectures.

The framework includes tools for:

Image preprocessing, including background removal, alignment, zero-pruning, and ML input processing
Flexible training with optional one-hot labels and configurable architectures
Modular validation and anomaly visualization (heatmaps, ROC, loss histograms)
Metadata tracking and reproducibility for experimental pipelines

PIXAL is highly extensible — other model types and preprocessing pipelines can be added with minimal changes.

Setup

PIXAL is tested and works best with Python 3.10.9. For consistent results, we recommend creating a clean virtual environment with this version.

1. Clone the Repository

git clone https://github.com/OSU-HEP-HDL/pixal.git
cd pixal

2. Setup the Environment

source setup.sh

This script will:

Detect your platform (Linux, Windows via WSL or Git Bash, or macOS)
Create a Python virtual environment in .venv/
Activate the environment
Install required packages from requirements.txt or requirements-cpu.txt (macOS fallback)
Set up base configuration files

Note

For GPU training, ensure you have a compatible NVIDIA driver and CUDA/cuDNN stack installed. The framework is tested with TensorFlow 2.15+.

Important

Note for Windows users: Native Windows is not officially supported. Use WSL2 (Windows Subsystem for Linux) or Git Bash for best results.

Warning

Note for macOS users: Due to hardware and driver limitations, TensorFlow and related tools will run in CPU-only mode. Training and inference will be slower, but fully functional.

3. Verify the Environment

Confirm that the PIXAL framework was set up correctly by running the help command:

pixal -h

Input Data Formatting

Because a component can have several types of images, place each type in its own, clearly labeled directory. The framework parses nested folders and reuses their names for the output.
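
As a rough illustration of this layout, the sketch below walks a component directory and groups image files by the folder they sit in. The helper name `discover_image_types`, the folder names, and the list of accepted extensions are assumptions made for illustration; this is not PIXAL's actual parser.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; PIXAL's real list may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def discover_image_types(component_dir):
    """Map each sub-folder (relative to the component root) to its image names."""
    root = Path(component_dir)
    types = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            # The folder path relative to the root acts as the type label.
            label = path.parent.relative_to(root).as_posix()
            types.setdefault(label, []).append(path.name)
    return types


# Build a tiny throwaway dataset with two hypothetical image types.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("front/img_0.png", "front/img_1.png", "back/img_0.png"):
        target = Path(tmp, "my_component", rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = discover_image_types(Path(tmp, "my_component"))

print(found)
```

Each key in the result is a folder label that could be reused when naming outputs.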

Configuration System and Parameters

PIXAL uses modular YAML-based configuration files to define preprocessing steps, model training parameters, and all path resolutions. This design enables reproducibility, clarity, and easy experimentation. There are two main configuration files in the /configs folder: parameters.yaml and paths.yaml.

Parameters

The parameters.yaml file contains all high-level control flags. It is split into three sections: preprocessing, model_training, and plotting.

Preprocessing

Defines how images are cleaned and transformed:

remove_background: max_workers sets the number of threads used for parallel background removal.
alignment: parameters for KNN and RANSAC-based image alignment. Includes additional metric and image flags.
preprocessor: controls pooling, zero pruning, color channels, and .npz output.
rename_images: optionally renames images to folder-consistent names.

Model Training

Covers everything needed to build and train the neural network:

Memory handling: GPU/CPU flags, threading, memory growth, and hybrid options.
Architecture: latent layer size, encoder/decoder depth, label encoding, one-hot encoding flag.
Training control: batch size, learning rate, optimizer settings, loss functions.
Regularization: supports l1, l2, or combined with tunable coefficients.
Early stopping: using patience and min_delta.
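
To make the early-stopping entry concrete, here is a minimal sketch of the usual patience/min_delta rule: training stops once the validation loss has failed to improve by more than min_delta for patience consecutive epochs. The function name and the loss values are hypothetical, and whether PIXAL implements the rule itself or delegates it to a Keras callback is not stated here.

```python
def early_stopping_epoch(val_losses, patience, min_delta):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement larger than min_delta: remember it and reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical validation-loss history that plateaus after the third epoch.
history = [0.90, 0.70, 0.65, 0.649, 0.648, 0.647, 0.60]
stop_epoch = early_stopping_epoch(history, patience=3, min_delta=0.01)
print(stop_epoch)
```

With a larger patience the same history runs to completion, because the final epoch improves on the best loss before the counter runs out.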

Plotting

Choose what diagnostic plots to generate after training:

ROC/Recall, pixel-wise MSE/MAE, distribution comparisons, confusion matrix, etc.
Log-based vs absolute loss plotting.
Loss cut that sets the anomaly threshold.

Paths

PIXAL resolves all data inputs and outputs relative to a few base directories. There are two main base paths: all preprocessing and model training outputs go to /out, and all validation and detection outputs go to /validate. This YAML allows centralized control of:

component_model_path: where trained models and logs are saved.
component_validate_path: path used during validation and detection.

These two entries are the only names the user should alter. Every other section (such as remove_background_path or aligned_images_path) defines a name and a base, which are combined at runtime by PIXAL’s recursive path resolution system.

Example

aligned_images_path:
  aligned_images: "aligned_images"
  base: *preprocessed_images_path

This lets PIXAL dynamically build:

out/R0_Triplet_Data_Flex_F1_pink_prune_2pool_rgb/preprocessed_images/aligned_images
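
The name-plus-base idea can be sketched in a few lines. The dictionary below is a hypothetical, simplified stand-in for paths.yaml in which each section names its base section by key (the real file uses YAML anchors and aliases), and `resolve_path` is an illustrative helper rather than PIXAL's resolver.

```python
from pathlib import PurePosixPath

# Hypothetical, simplified view of paths.yaml: each section has a folder
# name and the key of its base section (None marks a root directory).
PATHS = {
    "component_model_path": {
        "name": "out/R0_Triplet_Data_Flex_F1_pink_prune_2pool_rgb",
        "base": None,
    },
    "preprocessed_images_path": {
        "name": "preprocessed_images",
        "base": "component_model_path",
    },
    "aligned_images_path": {
        "name": "aligned_images",
        "base": "preprocessed_images_path",
    },
}


def resolve_path(section, paths=PATHS):
    """Recursively join a section's name onto its resolved base path."""
    entry = paths[section]
    if entry["base"] is None:
        return PurePosixPath(entry["name"])
    return resolve_path(entry["base"], paths) / entry["name"]


resolved = resolve_path("aligned_images_path")
print(resolved)
```

Because every section only knows its own name and its base, renaming the root entry changes every derived path at once.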

Advanced Behavior

Hierarchical Namespacing: All configurations are parsed into nested Python namespaces (config.preprocessing.preprocessor.pool_size, etc.) for intuitive access.
Metadata: PIXAL automatically stores and saves parameters, including bounding box crop data from zero-pruning as metadata for use in validation.
Multi-file Merging: PIXAL merges multiple metadata YAMLs in a directory into one logical config object. This lets users keep separate, reusable preprocessing.yaml, model_training.yaml, and plotting.yaml files while still combining them at runtime.
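
The namespacing and merging behavior can be approximated with the standard library alone. In this sketch the three dictionaries stand in for already-parsed YAML files, and `merge_configs` and `to_namespace` are hypothetical helper names; PIXAL's own loader may differ in detail.

```python
from types import SimpleNamespace


def merge_configs(*configs):
    """Merge dictionaries left to right, combining nested sections recursively."""
    merged = {}
    for config in configs:
        for key, value in config.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_configs(merged[key], value)
            else:
                merged[key] = value
    return merged


def to_namespace(value):
    """Convert nested dictionaries into attribute-style namespaces."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    if isinstance(value, list):
        return [to_namespace(v) for v in value]
    return value


# Stand-ins for the parsed contents of three separate YAML files.
preprocessing_yaml = {"preprocessing": {"preprocessor": {"pool_size": 2, "channels": ["R", "G", "B"]}}}
model_training_yaml = {"model_training": {"one_hot_encoding": False}}
plotting_yaml = {"plotting": {"loss_cut": 0.7}}

config = to_namespace(merge_configs(preprocessing_yaml, model_training_yaml, plotting_yaml))
print(config.preprocessing.preprocessor.pool_size)
```

Attribute access such as config.preprocessing.preprocessor.pool_size then works on the merged object regardless of which file a section came from.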

Preprocessing Pipeline

PIXAL includes a modular and efficient preprocessing pipeline designed to prepare image data for machine learning-based anomaly detection. The running example for this pipeline is the front of the R0 Triplet Data Flex Flavor 1, imaged with a Tagarno microscope. The key stages are described below:

Background Removal

Removes the background from each input image to isolate the object of interest. This is done using the rembg library with optional multithreaded support.

Purpose: Reduce noise and standardize input for feature extraction.

Config settings:

preprocessing:
  remove_background:
    max_workers: 8
  rename_images: true

Output: component/preprocessed_images/background_removed/

Image Alignment

Aligns each background-removed image to a reference using feature matching (KNN, RANSAC). Ensures consistent orientation and spatial scale.

Purpose: Standardize object placement across the dataset.

Config settings:

preprocessing:
  alignment:
    knn_ratio: 0.8
    number_of_points: 5
    ransac_threshold: 7.0
    MIN_SCORE_THRESHOLD: 0.5
    MAX_MSE_THRESHOLD: 10.0
    MIN_GOOD_MATCHES: 20
  draw_matches: true
  save_metrics: true
  save_overlays: true

Output: preprocessed_images/aligned_images/ and figures/aligned_metrics/
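
One plausible reading of knn_ratio and MIN_GOOD_MATCHES is the standard ratio test used with k-nearest-neighbour feature matching: a match is kept only when its best candidate is clearly closer than the second best, and alignment proceeds only if enough matches survive. The sketch below assumes that interpretation; the function names and distance values are hypothetical.

```python
def filter_good_matches(knn_distances, knn_ratio=0.8):
    """Keep indices whose best match is clearly closer than the second best."""
    good = []
    for index, (best, second_best) in enumerate(knn_distances):
        if best < knn_ratio * second_best:
            good.append(index)
    return good


def has_enough_matches(good_matches, min_good_matches=20):
    """Check whether enough matches survived to attempt an alignment."""
    return len(good_matches) >= min_good_matches


# Hypothetical (best, second-best) descriptor distances for four keypoints.
distances = [(10.0, 50.0), (30.0, 32.0), (5.0, 6.0), (12.0, 40.0)]
good = filter_good_matches(distances, knn_ratio=0.8)
print(good, has_enough_matches(good, min_good_matches=20))
```

A lower ratio is stricter and discards more ambiguous matches, at the cost of leaving fewer points for the RANSAC step.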

Zero Pruning (Optional)

Cropping step that removes zero-valued background pixels after alignment. The system finds the tightest bounding box around the non-zero pixels (with configurable padding) and crops all images to the same region.

Purpose: Reduce input dimensionality while preserving relevant information.

Config settings:

preprocessing:
  preprocessor:
    zero_pruning: true
    zero_pruning_padding: 5

Output: images are processed internally; the cropping dimensions are saved in metadata/preprocessing.yaml
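
The cropping logic described above can be sketched as follows. For simplicity the example treats each image as a single-channel nested list and takes the union of the per-image bounding boxes so that every image is cropped to the same region. The helper names are hypothetical, and PIXAL most likely operates on arrays rather than lists.

```python
def nonzero_bounding_box(image):
    """Return (top, left, bottom, right) of the non-zero pixels, end-exclusive."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return None  # the image is entirely zero
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def common_crop_box(images, padding=0):
    """Smallest padded box that contains the non-zero pixels of every image."""
    height, width = len(images[0]), len(images[0][0])
    boxes = [box for box in map(nonzero_bounding_box, images) if box is not None]
    if not boxes:
        return 0, 0, height, width
    top = max(min(b[0] for b in boxes) - padding, 0)
    left = max(min(b[1] for b in boxes) - padding, 0)
    bottom = min(max(b[2] for b in boxes) + padding, height)
    right = min(max(b[3] for b in boxes) + padding, width)
    return top, left, bottom, right


def crop(image, box):
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def blank(size=6):
    return [[0] * size for _ in range(size)]


# Two hypothetical 6x6 images with a few non-zero pixels each.
image_a = blank()
image_a[2][2] = 5
image_a[3][3] = 7
image_b = blank()
image_b[1][3] = 4
image_b[2][4] = 9

box = common_crop_box([image_a, image_b], padding=1)
cropped = [crop(image, box) for image in (image_a, image_b)]
print(box)
```

Saving the resulting box is what allows validation images to be cropped to exactly the same region later.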

Preprocessor -> ML Input Conversion

Converts aligned (and optionally pruned) images into normalized ML-ready inputs. This includes:

Channel selection from any combination of the supported variables, e.g. R, G, B, H, S, V (the full list is given under Available Variables for ML Input)
Average pooling to reduce resolution
Per-channel normalization
.npz output containing data, labels (if applicable), and shape

preprocessing:
  preprocessor:
    file_name: "out.npz"
    pool_size: 2
    channels: ["R", "G", "B"]

Output: out/<component>/<type>/out.npz
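
The pooling, normalization, and flattening steps can be illustrated on a single channel. This sketch assumes 8-bit pixel values scaled by 255 and non-overlapping pooling windows; both are assumptions, since the exact normalization PIXAL applies is not spelled out here, and the function names are hypothetical.

```python
def average_pool(image, pool_size):
    """Average non-overlapping pool_size x pool_size blocks of a 2D channel."""
    out_h = len(image) // pool_size
    out_w = len(image[0]) // pool_size
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            block = [
                image[i * pool_size + di][j * pool_size + dj]
                for di in range(pool_size)
                for dj in range(pool_size)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


def normalize_channel(channel, max_value=255.0):
    """Scale a channel to [0, 1], assuming values range from 0 to max_value."""
    return [[pixel / max_value for pixel in row] for row in channel]


def flatten(channel):
    return [pixel for row in channel for pixel in row]


# Hypothetical 4x4 red channel with 8-bit values.
red = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [51, 51, 102, 102],
    [51, 51, 102, 102],
]
pooled = average_pool(red, pool_size=2)
vector = flatten(normalize_channel(pooled))
shape = (len(pooled), len(pooled[0]))
print(shape, vector)
```

With pool_size 2 the 4x4 channel becomes 2x2, so the flattened vector is a quarter of the original length; the stored shape is what lets the vector be turned back into an image.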

Metadata Output

Important parameters like crop_box, input_dim, and processing shapes are saved to: out/<component>/<type>/metadata/preprocessing.yaml

Model Training

PIXAL supports flexible and modular training of deep learning models (currently autoencoders) for anomaly detection in pixel-aligned image data.

The Autoencoder Architecture

An Autoencoder is a type of neural network that learns to compress and reconstruct its input. It's structured into three parts:

Encoder: Compresses the input image into a smaller latent representation. This part captures the most essential features of the data.
Latent Space: The compressed representation. It’s the "bottleneck" that forces the network to learn meaningful features.
Decoder: Attempts to reconstruct the original image from the latent representation.

In the context of PIXAL, this model learns to reproduce defect-free components. During validation, poor reconstruction (i.e., higher pixel-wise loss) indicates anomalous or defective regions.

Input Format

Before training, images must be preprocessed and converted into .npz files using the preprocessing pipeline (see previous section). Each .npz file contains:

data: flattened, normalized image vectors
labels: (only if using one-hot encoding)
shape: original image shape post-pooling or zero-pruning

Training Modes

PIXAL supports two training modes:

1. Per-Type Model (default)

Trains a separate model for each image type (e.g. component variant or class). Each .npz file corresponds to a single type.

model_training:
  one_hot_encoding: False

Benefits: Higher performance, more specific models
Model Output: The model is saved as a .keras file, and its weights are saved separately as <model_name>.weights.h5 in out/<component>/<type>/model/. For validation, models are currently rebuilt and loaded from <model_name>.weights.h5.

2. One-Hot Encoding Mode

Trains a single model on all types of images, with one-hot encoded class labels appended to the latent space.

model_training:
  one_hot_encoding: True

Benefits: Generalized model across types
Model Output: As in per-type mode, the model is saved as a .keras file, and its weights are saved separately as <model_name>.weights.h5 in out/<component>/model/. For validation, models are currently rebuilt and loaded from <model_name>.weights.h5.
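
To show what appending a one-hot label to the latent space means, the sketch below concatenates a class indicator vector onto a latent vector. The type names, the latent values, and the helper names are hypothetical; in the real model this concatenation would happen inside the network between encoder and decoder.

```python
def one_hot(index, num_classes):
    """Return a vector of zeros with a single 1.0 at the given index."""
    vector = [0.0] * num_classes
    vector[index] = 1.0
    return vector


def append_label(latent, type_name, type_names):
    """Concatenate the one-hot code of type_name onto the latent vector."""
    return list(latent) + one_hot(type_names.index(type_name), len(type_names))


type_names = ["front", "back", "side"]  # hypothetical image types
latent = [0.12, -0.40, 0.88, 0.05]      # hypothetical 4-dimensional latent vector
conditioned = append_label(latent, "back", type_names)
print(conditioned)
```

The decoder then sees both the compressed image features and which type it is expected to reconstruct.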

Validation and Anomaly Detection

Once a model is trained, PIXAL performs validation and anomaly detection by comparing reconstructed images to their input counterparts. Deviations between the input and reconstruction indicate potential anomalies (e.g., damaged hardware regions).

Validation Workflow

The validation process mirrors the preprocessing and training workflow:

1. New Image Set

A new directory of unseen images (e.g., from a production batch) is passed into the validation routine.
These images are organized in per-type folders (if one_hot_encoding=False) or as a flat directory (if True).

2. Preprocessing

Background removal
Image alignment (using previously saved reference images)
Zero pruning using pre-saved crop box metadata
Normalization & pooling
Conversion into .npz format

3. Model Selection

Each .npz file is paired with its trained model and metadata (architecture, crop box, etc.).
Model is rebuilt and weights are loaded.

4. Prediction

The model reconstructs the input image(s).
The reconstruction is compared to the original input to compute pixel-wise reconstruction errors.

Detection Logic

PIXAL uses the Mean Squared Error (MSE) between input and reconstruction to assess anomalies.

Low MSE → normal reconstruction
High MSE → possible anomaly

You can configure:

plotting:
  loss_cut: 0.7              # Threshold for anomaly
  use_log_loss: False        # Use log-scale loss when computing anomaly mask
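
A minimal sketch of the detection rule follows. It treats loss_cut as a direct threshold on the per-pixel squared error of normalized values, which is an assumption about how PIXAL applies it; the use_log_loss option is not modeled, and the function names and pixel values are hypothetical.

```python
def pixel_squared_error(original, reconstruction):
    """Squared error for each pixel of two flattened, normalized images."""
    return [(o - r) ** 2 for o, r in zip(original, reconstruction)]


def anomaly_mask(errors, loss_cut):
    """Flag pixels whose error exceeds the threshold."""
    return [error > loss_cut for error in errors]


# Hypothetical flattened input and reconstruction, both normalized to [0, 1].
original = [0.10, 0.50, 0.90, 0.20]
reconstruction = [0.12, 0.48, 0.05, 0.21]

errors = pixel_squared_error(original, reconstruction)
mse = sum(errors) / len(errors)
mask = anomaly_mask(errors, loss_cut=0.7)
print(mse, mask)
```

Only the third pixel, which the model failed to reconstruct, is flagged; the image-level MSE summarizes the same information as a single number.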

Detection Output

For each validated image type, PIXAL saves:

validate/
  └── <component>/
      └── <type>/
          ├── logs/
          ├── metadata/
          ├── figures/
          │   ├── anomaly_overlay_*.png
          │   ├── pixel_loss_histogram.png
          │   └── ...
          └── aligned_metrics/

Visual outputs include:

Output	Description
`anomaly_overlay_*.png`	Heatmap of pixel-wise anomaly regions
`pixel_loss_histogram.png`	Histogram of MSE across all pixels
`combined_distribution_log.png`	Overlay of predicted and true pixel values
`roc_curve`, `pr_curve`	ROC/PR curve using pixel-wise MSE scores
`confusion_matrix.png`	Optional confusion matrix (if thresholds used)

How to Run PIXAL

PIXAL's commands are designed to need little input from the user. Arguments can be passed manually; when they are omitted, PIXAL uses the paths.yaml configuration file to locate the files needed for the process.

Important

Prior to preprocessing your dataset, alter the section component_model_path: &component_model_path in the paths.yaml file to match your component name.

The commands included in the PIXAL framework can be listed with the -h flag:

Pixel-based Anomaly Detection CLI

positional arguments:
  {preprocess,remove_bg,align,make_input,train,validate,detect}
    preprocess          Run all preprocessing steps on input images
    remove_bg           Remove background from images
    align               Align images
    make_input          Uses ImagePreprocessor to make ML input
    train               Train autoencoder model(s)
    validate            Run validation (preprocess + detect) on new images
    detect              Run anomaly detection on new images

options:
  -h, --help            show this help message and exit

Preprocessing

The full preprocessing pipeline runs with a single command, but each step can also be run separately if needed. Ensure the dataset and its nested directories are properly named before running. To run the entire pipeline:

pixal preprocess -i /path/to/component/

Loading bars are shown for each preprocessing step.

To run a single step, pass the input that step expects as its argument. For example, align takes the background-removed images:

pixal align -i /path/to/remove_bg/images/

Training

The train command accepts an explicit input path. Without one, it trains on the preprocessed data located through the paths.yaml configuration file. To use that preprocessed data, run:

pixal train

Otherwise,

pixal train -i /path/to/preprocessed/data/

Validation

Validation preprocesses the images to be validated, then runs detection and produces the detection plots.

Important

Prior to validating your image, alter the section component_validate_path: &component_validate_path in the paths.yaml file to match your component name.

To run the validation pipeline, run:

pixal validate -i /path/to/image/

Model saving & MLflow integration

PIXAL currently writes trained models to the out/.../model/ directory in two forms:

Full Keras checkpoint file (.keras) — this is the checkpoint file written by the Keras ModelCheckpoint callback during training. It is suitable for restoring model weights via model.load_weights(...) or for Keras to read back a saved model depending on how it was written.
Weights file (<model_name>.weights.h5) — the project currently also calls save_weights(...) after training; validation and downstream code rebuilds the model architecture and then calls load_weights(...) to restore weights.

Example file locations:

out/<component>/<type>/model/<model_name>.keras
out/<component>/<type>/model/<model_name>.weights.h5

Loading the model for validation (current behavior)

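from pixal.train_model.autoencoder import Autoencoder

# `params` must be the same dict that was used for training, and `weights_path`
# must point to <model_name>.weights.h5; both are defined by the caller.
model = Autoencoder(params)
model.build_model(input_dim=params['input_dim'])  # rebuild the architecture first
model.load_weights(str(weights_path))             # then restore the trained weights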

MLflow quick-start (optional)

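MLflow is installed by setup.sh.

By default, MLflow creates an mlruns/ directory in the current working directory. To use a remote tracking server instead, set the MLFLOW_TRACKING_URI environment variable to the server's URL (for example, http://your-mlflow-server:5000).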

PIXAL includes a small, best-effort integration helper at pixal/mlflow_utils.py. When mlflow is present, training runs (via the pixal train entrypoints) will:

Start an MLflow run and log the params dictionary as run parameters
Log per-epoch metrics (loss/val_loss) to MLflow
After training, log the saved model/weights and the metadata YAML as artifacts
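
The best-effort pattern described above can be sketched as follows: MLflow is imported lazily, and logging is skipped when the package is missing. This is an illustrative sketch, not the contents of pixal/mlflow_utils.py. The function names and the example params are hypothetical, while mlflow.start_run, mlflow.log_params, and mlflow.log_metric are standard MLflow calls.

```python
def flatten_params(params, prefix=""):
    """Flatten nested dicts into dotted keys, since MLflow params are flat."""
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_training_run(params, history):
    """Log params and per-epoch losses to MLflow; do nothing if it is missing."""
    try:
        import mlflow
    except ImportError:
        return False  # best effort: training continues without tracking
    with mlflow.start_run():
        mlflow.log_params(flatten_params(params))
        for epoch, (loss, val_loss) in enumerate(zip(history["loss"], history["val_loss"])):
            mlflow.log_metric("loss", loss, step=epoch)
            mlflow.log_metric("val_loss", val_loss, step=epoch)
    return True


# Hypothetical nested training parameters.
example_params = {"model_training": {"batch_size": 32, "optimizer": {"learning_rate": 0.001}}}
flat_params = flatten_params(example_params)
print(flat_params)
```

Calling log_training_run(example_params, {"loss": [...], "val_loss": [...]}) after training would record one run; the return value tells the caller whether anything was logged.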

To start a local server, run:

mlflow ui

This starts an MLflow server on port 5000 by default.

Open http://your-mlflow-server:5000 in a browser, or use whatever address exposes port 5000 on your machine.

Available Variables for ML Input

The preprocessing framework supports multiple representations of image pixel values. You can specify a subset (e.g., ["H","S","V"]) or request all variables by passing channels="ALL". Each variable is normalized to a consistent range (typically [0,1]) to simplify downstream training.

Below is a detailed description of the currently available feature variables:

R, G, B — Red, Green, Blue channels (linear or gamma-corrected depending on preprocessing). Each channel contains per-pixel intensity values normalized to [0, 1].

H, S, V — Hue, Saturation, Value from HSV colorspace. H is represented as degrees mapped to [0,1] (or optionally as sin/cos pairs in derived channels), S and V are normalized to [0,1].

Y, Cr, Cb — YCbCr colorspace channels: luminance (Y) and chroma components (Cr, Cb). Useful for separating brightness from color information.

LAB_L, LAB_a, LAB_b — CIE LAB color space channels: lightness (L) and opponent color axes (a, b). These are perceptually uniform channels useful when color differences should match human perception.

LCh_C, LCh_sinH, LCh_cosH — Polar form of Lab: chroma (C) and hue angle encoded as sin(H) and cos(H) for continuous, wrap-safe representation of hue.

r_chroma, g_chroma — Simple chroma-derived channels computed relative to the R and G channels (e.g., R / (R+G+B + eps)). These emphasize the contribution of a single color channel relative to total intensity.

Opp_O1, Opp_O2, Opp_O3 — Color-opponent channels (e.g., variants of R-G, R-B, G-B or other opponent transforms). Opponent channels help highlight color contrasts that may indicate defects.

GradMag — Gradient magnitude of the luminance or chosen channel (e.g., Sobel magnitude). Useful for edge and texture information.

Laplacian — Laplacian filter response (second derivative) capturing blob-like intensity variations and helping detect local defects or spots.

LocalStd — Local (neighborhood) standard deviation of intensity (texture measure). Useful to capture local texture variance and noise.

Notes and tips

You can request specific channels using the channels preprocessing option, e.g. channels: ["H","S","V","GradMag"].
For hue information, prefer LCh_sinH/LCh_cosH or H wrapped as sin/cos to avoid discontinuities near 0/360 degrees.
If using multiple chroma/opponent channels, normalize each channel independently (already performed by PIXAL) so the network can weigh them fairly.
Derived channels (GradMag, Laplacian, LocalStd) add texture/structure information and are especially helpful for detecting small or low-contrast defects.
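
The hue-wrapping tip can be checked numerically. The sketch below uses the standard-library colorsys module to compute the HSV hue of two nearly identical reds that fall on either side of the 0/360 degree boundary, then compares the raw hue with its sin/cos encoding. It also includes the r_chroma formula quoted above. The function names are hypothetical, and the sketch wraps the HSV hue rather than the Lab hue used by LCh_sinH and LCh_cosH.

```python
import colorsys
import math


def hue_sin_cos(r, g, b):
    """Encode the HSV hue of an RGB pixel (values in [0, 1]) as (sin, cos)."""
    hue, _, _ = colorsys.rgb_to_hsv(r, g, b)  # hue is returned in [0, 1)
    angle = 2.0 * math.pi * hue
    return math.sin(angle), math.cos(angle)


def r_chroma(r, g, b, eps=1e-8):
    """Share of the red channel in the total intensity, R / (R + G + B + eps)."""
    return r / (r + g + b + eps)


# Two nearly identical reds on opposite sides of the hue wrap-around.
red_a = (1.0, 0.02, 0.0)
red_b = (1.0, 0.0, 0.02)

raw_gap = abs(colorsys.rgb_to_hsv(*red_a)[0] - colorsys.rgb_to_hsv(*red_b)[0])
wrapped_gap = math.dist(hue_sin_cos(*red_a), hue_sin_cos(*red_b))
print(raw_gap, wrapped_gap)
```

The raw hue values differ by almost the full range even though the colors are visually the same, while the sin/cos pairs stay close together, which is why the wrapped form is easier for a network to learn from.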
