This repository contains three modeling pipelines for pedestrian trajectory prediction and crossing-intention prediction:
- Baseline LSTM (JAAD)
- JAAD-only Transformer
- SUMO-pretrained → JAAD-finetuned Transformer (Single-Agent + Multi-Agent)
We model pedestrian behavior in urban traffic scenes using:
- Trajectory prediction (future pedestrian path)
- Intention prediction (Crossing vs Non-Crossing)
Core idea: learn motion and interaction patterns from real-world JAAD data, and improve generalization by pretraining on synthetic SUMO simulations before fine-tuning on JAAD.
JAAD provides real-world driving videos with pedestrian bounding-box annotations and behavior labels (e.g., crossing/non-crossing).
Baseline LSTM (JAAD):
- Input: observed pedestrian motion sequence (relative displacements, normalized)
- Backbone: LSTM encoder
- Heads:
  - Trajectory head (future relative displacements)
  - Intention head (crossing vs non-crossing)
- Evaluation:
  - Trajectory: ADE / FDE
  - Intention: Accuracy, Precision/Recall/F1, Balanced Accuracy, AUC
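As a rough illustration of the trajectory metrics above, the sketch below computes ADE (mean Euclidean error over all predicted steps) and FDE (error at the final step). It assumes the predicted relative displacements are first accumulated into absolute positions starting from the last observed point. The function names `displacements_to_positions` and `ade_fde` are hypothetical and not taken from this repository.

```python
import math

def displacements_to_positions(last_obs, rel):
    """Accumulate relative displacements into absolute (x, y) positions.

    Assumption: each displacement is relative to the previous time step,
    starting from the last observed position.
    """
    x, y = last_obs
    positions = []
    for dx, dy in rel:
        x += dx
        y += dy
        positions.append((x, y))
    return positions

def ade_fde(pred, target):
    """Return (ADE, FDE) for two equally long lists of (x, y) positions."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and the same length")
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, target)]
    ade = sum(dists) / len(dists)  # average displacement error
    fde = dists[-1]                # final displacement error
    return ade, fde

pred_positions = displacements_to_positions((0.0, 0.0), [(1.0, 0.0), (1.0, 0.0)])
ade, fde = ade_fde(pred_positions, [(1.0, 0.0), (2.0, 2.0)])
print(ade, fde)
```

If the models are evaluated in normalized units, the errors would need to be rescaled before being compared with pixel or metric values.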
JAAD-only Transformer:
- Input embedding: motion features (REL + NORM) + optional context features
- Encoder: Transformer encoder (multi-head self-attention)
- Heads:
  - CLS token → Intention head
  - Last observed token → Trajectory head
- Loss: Multi-task learning (Trajectory regression + Intention classification)
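The multi-task objective can be sketched as a weighted sum of a regression term and a classification term. The example below is a minimal, framework-free illustration. It assumes mean squared error for the trajectory head and binary cross-entropy on a single logit for the intention head. The specific losses and the weights `w_traj` and `w_int` are assumptions, not confirmed by this repository.

```python
import math

def trajectory_mse(pred, target):
    """Mean squared Euclidean error over predicted (dx, dy) steps."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and the same length")
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(pred)

def bce_with_logit(logit, label):
    """Numerically stable binary cross-entropy on a raw logit (label is 0 or 1)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def multitask_loss(pred_traj, true_traj, logit, label, w_traj=1.0, w_int=1.0):
    """Weighted sum of the trajectory and intention losses (weights are hypothetical)."""
    return (w_traj * trajectory_mse(pred_traj, true_traj)
            + w_int * bce_with_logit(logit, label))

loss = multitask_loss([(0.1, 0.0), (0.2, 0.1)], [(0.1, 0.0), (0.1, 0.1)],
                      logit=1.5, label=1)
print(loss)
```

The relative weighting of the two terms is a tuning choice. If crossing samples are rare, a class-weighted classification term may be worth considering.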
We generate synthetic pedestrian–vehicle interaction sequences using SUMO to cover controlled, diverse traffic conditions.
- Run SUMO simulation with configured routes, signal phases, pedestrian flows, and vehicle types
- Export FCD / trajectory logs
- Convert to windowed sequences (T_obs, T_pred) for:
  - Single-Agent (one pedestrian track)
  - Multi-Agent (pedestrian + nearby agents, depending on your implementation)
- Apply the same feature format used by JAAD models (REL + NORM) for compatibility
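The conversion steps above can be sketched for the single-agent case as follows. The example reads pedestrian positions from an FCD-style XML string, slices each track into (T_obs, T_pred) windows, and encodes them as normalized relative displacements. It assumes the export contains `timestep` elements with `person` children carrying `id`, `x`, and `y` attributes, so verify this against your own export. The helper names and the fixed `scale` used for normalization are hypothetical, since the repository's exact NORM scheme is not described here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny synthetic FCD-like document: one pedestrian walking along x, one vehicle.
SAMPLE_FCD = "<fcd-export>" + "".join(
    f'<timestep time="{t}.00"><person id="p0" x="{t}.0" y="0.0"/>'
    f'<vehicle id="v0" x="10.0" y="{t}.0"/></timestep>'
    for t in range(6)
) + "</fcd-export>"

def person_tracks(fcd_xml):
    """Group pedestrian (x, y) positions by person id, in time order."""
    tracks = defaultdict(list)
    root = ET.fromstring(fcd_xml)
    for step in root.iter("timestep"):
        for person in step.iter("person"):
            tracks[person.get("id")].append(
                (float(person.get("x")), float(person.get("y"))))
    return dict(tracks)

def sliding_windows(track, t_obs, t_pred, stride=1):
    """Split one track into (observed, future) position windows."""
    total = t_obs + t_pred
    return [(track[s:s + t_obs], track[s + t_obs:s + total])
            for s in range(0, len(track) - total + 1, stride)]

def to_relative(points):
    """Step-to-step displacements (REL); yields len(points) - 1 items."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]

def encode_window(obs, fut, scale=1.0):
    """REL + NORM sketch: displacements divided by an assumed fixed scale."""
    obs_rel = [(dx / scale, dy / scale) for dx, dy in to_relative(obs)]
    # Future displacements are taken relative to the last observed position.
    fut_rel = [(dx / scale, dy / scale) for dx, dy in to_relative([obs[-1]] + fut)]
    return obs_rel, fut_rel

tracks = person_tracks(SAMPLE_FCD)
windows = sliding_windows(tracks["p0"], t_obs=3, t_pred=2)
print(len(windows), encode_window(*windows[0], scale=2.0))
```

A multi-agent variant would additionally need to collect nearby agents per time step, which is left out of this sketch.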
SUMO-pretrained → JAAD-finetuned Transformer:
- Pretraining: Train Transformer on SUMO synthetic windows (learn motion/interaction priors)
- Finetuning: Adapt the pretrained model on JAAD windows
- Tracks supported:
  - Single-Agent Transformer
  - Multi-Agent Transformer
- Outputs (both tracks):
  - Trajectory prediction (ADE/FDE)
  - Intention prediction (Acc/Precision/Recall/F1, Balanced Acc, AUC)
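For reference, the threshold-based intention metrics listed above can all be derived from the confusion counts. The sketch below is a minimal illustration that assumes binary labels with 1 = crossing and 0 = non-crossing. The function name `intention_metrics` is hypothetical, and AUC is omitted because it requires predicted scores rather than hard labels.

```python
def intention_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and balanced accuracy for binary labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of failing when a class is absent.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)        # true positive rate
    specificity = safe_div(tn, tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
    }

print(intention_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

Balanced accuracy averages recall and specificity, which makes it less sensitive than plain accuracy to an uneven split between crossing and non-crossing samples.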






