GENET is a computational workflow designed for the processing, analysis, and visualization of biomedical entities and relations in scientific literature. It integrates multiple components to support tasks that involve trait-gene association discovery, literature mining, knowledge graph construction, and interactive visualizations.
This repository includes the following modules:
- Snp2TraitNet – Predicts associations between SNPs and traits using a dual-encoder architecture.
- LitMiner – Extracts biomedical entities and relations from literature using in-context learning.
- Emb2KG – Generates embedding representations of biomedical entities and relations and converts them into structured knowledge graphs.
- GENETViz – Provides interactive visualizations for exploring biomedical entity networks.
To get started, install all four modules (snp2traitnet, litminer, emb2kg, and genetviz), then download the model weights and data files and place them in the expected locations. The following commands do both:
git clone https://github.com/BiomedSciAI/genet.git
cd genet
chmod +x install.sh
./install.sh
To enable local inference via Ollama, follow these steps:
- Download and install Ollama from https://ollama.com
- Pull a desired model:
ollama pull granite4:tiny-h
- Create a .env file in litminer (litminer/.env) and set the model:
OLLAMA_MODEL=granite4:tiny-h
- Create apikey.js in GENETViz (genetviz/GENETViz/static/apikey/apikey.js) and set the endpoint and model:
const API_CONFIG = {
  'ollama': {
    'API_END_POINT': 'http://localhost:11434/api/generate',
    'MODEL': 'granite4:tiny-h'
  }
}
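The two configuration files described above can also be written from the shell. The following is a minimal sketch, assuming it is run from the repository root and that overwriting any existing litminer/.env and apikey.js is acceptable. The MODEL variable is introduced here for illustration only.

```shell
# Assumption: the current directory is the repository root (the cloned genet folder).
# MODEL is a hypothetical helper variable; change it to the model you pulled.
MODEL="granite4:tiny-h"

# Make sure the target directories exist (harmless if they already do).
mkdir -p litminer genetviz/GENETViz/static/apikey

# LitMiner reads the model name from litminer/.env.
printf 'OLLAMA_MODEL=%s\n' "$MODEL" > litminer/.env

# GENETViz reads the endpoint and model from apikey.js.
# The unquoted heredoc delimiter lets the shell expand $MODEL.
cat > genetviz/GENETViz/static/apikey/apikey.js <<EOF
const API_CONFIG = {
  'ollama': {
    'API_END_POINT': 'http://localhost:11434/api/generate',
    'MODEL': '$MODEL'
  }
}
EOF
```

Keeping the model name in one variable avoids the two files drifting apart when you switch models.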
Snp2TraitNet enables the discovery of associations between traits, genes, and SNPs using a dual-encoder model trained on curated biomedical datasets.
Identify genes potentially associated with a given disease or trait. For example, to discover genes linked to Marfan syndrome, run:
run_snp2trait --mode trait2gene --keyword "marfan syndrome" --ckpt_path snp2traitnet/Snp2TraitNet/output/snp2trait/checkpoints/snp2trait-checkpoint.ckpt --data_path snp2traitnet/Snp2TraitNet/datasets/snp2trait.csv --output_path snp2trait.txt
Note: If your keyword contains quotes, escape them (\", \') so the shell passes them through as part of the keyword.
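As a short illustration of the quoting note, the sketch below shows three common ways to write a keyword that contains quote characters in a POSIX shell. The keywords are illustrative only and are not guaranteed to appear in the dataset.

```shell
# Inside double quotes, an apostrophe needs no escaping.
kw_double="Crohn's disease"

# Inside single quotes, close the string, add an escaped apostrophe, and reopen.
kw_single='Crohn'\''s disease'

# Inside double quotes, a literal double quote is escaped with a backslash.
kw_quoted="\"long QT\" syndrome"

# Print each keyword exactly as a command would receive it.
printf '%s\n' "$kw_double" "$kw_single" "$kw_quoted"
```

Any of these values can then be passed as --keyword "$kw_double" (and so on), always inside double quotes so that the spaces are preserved.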
Retrieve diseases or traits associated with a specific gene. For example, to find traits linked to PCSK9:
run_snp2trait --mode gene2trait --keyword PCSK9 --ckpt_path snp2traitnet/Snp2TraitNet/output/snp2trait/checkpoints/snp2trait-checkpoint.ckpt --data_path snp2traitnet/Snp2TraitNet/datasets/snp2trait.csv --output_path snp2trait.txt
Discover traits associated with a specific SNP (RS ID). For example, to query rs362307:
run_snp2trait --mode snp2trait --keyword rs362307 --ckpt_path snp2traitnet/Snp2TraitNet/output/snp2trait/checkpoints/snp2trait-checkpoint.ckpt --data_path snp2traitnet/Snp2TraitNet/datasets/snp2trait.csv --output_path snp2trait.txt
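The three commands above share the same checkpoint and dataset paths, so they can be wrapped in a small helper. This is only a sketch: the function name snp2trait_query and the variables CKPT_PATH and DATA_PATH are hypothetical conveniences, not part of GENET. The flags are the ones shown above.

```shell
# Paths copied from the commands above; adjust if your layout differs.
CKPT_PATH=snp2traitnet/Snp2TraitNet/output/snp2trait/checkpoints/snp2trait-checkpoint.ckpt
DATA_PATH=snp2traitnet/Snp2TraitNet/datasets/snp2trait.csv

# Hypothetical helper: snp2trait_query MODE KEYWORD OUTPUT_FILE
snp2trait_query() {
  # Rebuild the positional parameters as the full argument list.
  set -- --mode "$1" --keyword "$2" \
         --ckpt_path "$CKPT_PATH" --data_path "$DATA_PATH" \
         --output_path "$3"
  if command -v run_snp2trait >/dev/null 2>&1; then
    run_snp2trait "$@"
  else
    # Diagnostic only: "$*" flattens the arguments, so quoting is not shown.
    echo "run_snp2trait not found on PATH; would run: run_snp2trait $*" >&2
    return 127
  fi
}
```

For example, `snp2trait_query trait2gene "marfan syndrome" trait2gene.txt` runs the first query above and writes to trait2gene.txt, an output name chosen here purely for illustration.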
LitMiner performs literature-based mining to extract biomedical entities and relationships from PubMed abstracts using keyword-driven queries and LLM-based inference.
To search for articles and extract entities and relations using a specific keyword (e.g., Alkaptonuria):
nohup run_litminer --query "Alkaptonuria" --retmax 100 --backend ollama > output.log 2>&1 &
To use the output from Snp2TraitNet as input for literature mining:
nohup run_litminer --query snp2trait.txt --retmax 100 --backend ollama > output.log 2>&1 &
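Because both commands run in the background, it helps to keep the process ID so you can check on the job later. The helper below is a minimal sketch of that pattern. run_in_background and BG_PID are hypothetical names introduced here, and the helper works with any command, not only run_litminer.

```shell
# Hypothetical helper: run_in_background LOG_FILE COMMAND [ARGS...]
# Starts COMMAND under nohup, sends stdout and stderr to LOG_FILE,
# and stores the background process ID in BG_PID.
run_in_background() {
  log_file=$1
  shift
  nohup "$@" > "$log_file" 2>&1 &
  BG_PID=$!
}
```

For example, `run_in_background output.log run_litminer --query "Alkaptonuria" --retmax 100 --backend ollama` starts the first job above. You can then follow progress with `tail -f output.log`, check that the job is still alive with `kill -0 "$BG_PID"`, or block until it finishes with `wait "$BG_PID"`.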
Emb2KG transforms extracted entities and relations into structured knowledge graphs, generates embeddings, and performs clustering analysis.
run_emb2kg --input_path litminer/LitMiner/output/msx --output_path genetviz/GENETViz/static/data --n_clusters 5
To launch GENETViz, run:
run_genet
Then open your browser and navigate to http://127.0.0.1:11748
If you find our work useful for your research, please cite:
@misc{kwon2025genet,
  title = {GENET: AI-Powered Interactive Visualization Workflows to Explore Biomedical Entity Networks},
  author = {Bum Chul Kwon and Natasha Mulligan and Joao Bettencourt-Silva and Ta-Hsin Li and Bharath Dandala and Feng Lin and Ching-Huei Tsou and Pablo Meyer},
  year = {2025},
  doi = {10.64898/2025.12.12.694029},
  eprint = {https://www.biorxiv.org/content/early/2025/12/16/2025.12.12.694029.full.pdf},
  journal = {bioRxiv}
}