Implementation of:
- The Wisdom of a Crowd of Brains: A Universal Brain Encoder. Roman Beliy*, Navve Wasserman*, Amit Zalcher, Michal Irani. arXiv:2406.12179
- Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer. Roman Beliy*, Amit Zalcher*, Jonathan Kogman, Navve Wasserman, Michal Irani. Accepted at ICLR 2026. arXiv:2510.25976
* denotes equal contribution.
Environment requirements are in env.yml. To create the conda environment:
conda env create -f env.yml
conda activate brain-it

This repository implements the Universal Brain Encoder (image-to-fMRI encoding) and Brain-IT (fMRI-to-image reconstruction with the Brain-Interaction Transformer), as described in the papers above.
- NSD data preparation: end-to-end scripts and documentation for preparing inputs from Natural Scenes Dataset (NSD) data
- Model checkpoints: published pretrained weights and instructions for where to place them under results/saved_models/
- External models: links and setup for third-party models
- Transfer learning: full training code and configs for adapting Brain-IT to new subjects or datasets
For inference on pretrained models, please run:
python inference/full_inference.py

Brain-IT/
├── data/
│   ├── nsd_data/          # NSD dataset files
│   ├── derived_data/      # Derived data (clusters, embeddings)
│   ├── external_models/   # External pretrained models
│   └── scripts/           # Data processing scripts
├── models/                # Model architectures
├── train/                 # Training scripts
├── inference/             # Inference scripts
├── utils/                 # Utility functions
└── results/               # Output directory
    ├── saved_models/      # Trained model checkpoints
    └── reconstructions/   # Inference outputs
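
Before running any of the scripts, it can help to confirm that the directories in the tree above are in place. The following is a minimal sketch, not part of the repository: `EXPECTED_DIRS` and `missing_dirs` are hypothetical names, and the list covers only the directories shown in the tree.

```python
from pathlib import Path

# Directories taken from the tree above; nothing else is assumed to exist.
EXPECTED_DIRS = [
    "data/nsd_data",
    "data/derived_data",
    "data/external_models",
    "data/scripts",
    "models",
    "train",
    "inference",
    "utils",
    "results/saved_models",
    "results/reconstructions",
]


def missing_dirs(repo_root):
    """Return the expected directories that are absent under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Intended to be run from the repository root (the Brain-IT/ directory).
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print("  " + d)
    else:
        print("All expected directories are present.")
```

An empty result means every directory from the tree was found. It does not check that the NSD files, external models, or checkpoints inside them are complete.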
python data/scripts/prepare_imgs.py
python data/scripts/prepare_clip.py
python train/train_encoder.py

After training the encoder, map voxels to clusters and generate synthetic fMRI:
# Map voxels to clusters
python data/scripts/get_clusters.py
# Generate synthetic fMRI
python data/scripts/pred_fmri_ext.py

# VGG-based decoder with contrastive loss
python train/train_decoder.py --VGG --CONT --EXT --SAVE
# CLIP-guided decoder (stage 1)
python train/train_decoder.py --CLIPG --EXT --SAVE

python train/train_decoder_stage2.py --EXT

Note: Stage 2 training requires significant GPU memory:
- 2x H200 GPUs, or 4x H100 GPUs (with 4x H100, set the batch size to 4 by changing BATCH_SIZE = 4 in the script)
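
The data preparation and training commands above can be chained into a single driver. This is a minimal sketch, assuming the steps are meant to run in the order this README lists them and that both decoder variants are wanted. `PIPELINE`, `run_pipeline`, and the `--run` switch are hypothetical and are not part of the repository. The sketch uses `sys.executable` in place of `python` so that the scripts run under the currently active interpreter.

```python
import subprocess
import sys

# Commands copied from this README, in the order they are listed.
PIPELINE = [
    ["data/scripts/prepare_imgs.py"],
    ["data/scripts/prepare_clip.py"],
    ["train/train_encoder.py"],
    ["data/scripts/get_clusters.py"],
    ["data/scripts/pred_fmri_ext.py"],
    ["train/train_decoder.py", "--VGG", "--CONT", "--EXT", "--SAVE"],
    ["train/train_decoder.py", "--CLIPG", "--EXT", "--SAVE"],
    ["train/train_decoder_stage2.py", "--EXT"],
]


def run_pipeline(steps=PIPELINE, dry_run=True):
    """Run each step in order and stop at the first failure.

    With dry_run=True nothing is executed; the commands are only
    printed and returned.
    """
    commands = [[sys.executable] + step for step in steps]
    for cmd in commands:
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical switch: pass --run to execute; the default only prints.
    run_pipeline(dry_run="--run" not in sys.argv[1:])
```

The dry run is the default because the training steps are long and stage 2 has the GPU requirements noted above. Printing the commands first makes it easy to check the order before committing hardware time.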
Run the full inference pipeline:
python inference/full_inference.py --run_name my_experiment

Results are saved in results/reconstructions/{run_name}/:
results/reconstructions/{run_name}/
└── subject_{n}/
    ├── low_level/              # Low-level VGG reconstructions
    │   └── img_*.png
    ├── semantic/               # Semantic diffusion reconstructions
    │   └── img_*.png
    ├── enhanced/               # Enhanced SDXL reconstructions (224x224)
    │   └── img_*.png
    ├── low_level_recons.npy    # Full arrays (uint8)
    ├── semantic_recons.npy
    └── enhanced_recons.npy
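
To work with these outputs programmatically, the layout above can be turned into paths. The sketch below is an illustration only: `reconstruction_paths` and `list_pngs` are hypothetical helpers, and the subject identifier is inserted exactly as given, since this README does not say how `{n}` is formatted.

```python
from pathlib import Path

# Stage names taken from the output tree above.
STAGES = ("low_level", "semantic", "enhanced")


def reconstruction_paths(run_name, subject, root="results/reconstructions"):
    """Map each stage to its PNG directory and its .npy array file."""
    subject_dir = Path(root) / run_name / f"subject_{subject}"
    return {
        stage: {
            "png_dir": subject_dir / stage,
            "array": subject_dir / f"{stage}_recons.npy",
        }
        for stage in STAGES
    }


def list_pngs(png_dir):
    """Return the img_*.png files in a stage directory, sorted by name."""
    return sorted(Path(png_dir).glob("img_*.png"))


if __name__ == "__main__":
    # "my_experiment" matches the --run_name example; subject 1 is an assumption.
    for stage, entry in reconstruction_paths("my_experiment", 1).items():
        print(stage, entry["array"], len(list_pngs(entry["png_dir"])))
```

The `.npy` files can then be read with `numpy.load`. Note that sorting by name is lexicographic, so `img_10.png` sorts before `img_2.png` unless the indices are zero-padded.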
This code accompanies the arXiv preprints linked at the top of this README. The PDFs are shared under the license stated on each arXiv record (see the “license” icon on the abstract pages). If you use this code, please cite those papers. Third-party or vendored code may have its own terms—check the relevant subdirectories (e.g. model bundles under src/).
For questions or inquiries: roman.beliy@weizmann.ac.il.
