CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Repository Overview

This repository contains three machine learning research projects, each targeting a different domain:

  1. ST-LLM+ - Graph Enhanced Spatio-Temporal Large Language Models for Traffic Prediction
  2. ST-LLM - Original Spatial-Temporal Large Language Model for Traffic Prediction
  3. SeisMoLLM - Seismic Motion Large Language Model for earthquake/seismic data analysis

Environment Setup

Each project has its own environment requirements:

ST-LLM and ST-LLM+

  • Python 3.8.19, PyTorch 2.4.1, CUDA 11.7, torchvision 0.19.1
  • Setup: conda env create -f env_ubuntu.yaml (in respective project directories)
  • Dependencies: transformers, torch-geometric, pandas, numpy, scipy

SeisMoLLM

  • Python environment (specific version not explicitly documented)
  • PyTorch-based with seismic data processing capabilities

Training Commands

ST-LLM+

CUDA_VISIBLE_DEVICES=0 nohup python train_plus.py --data taxi_pick > your_log_name.log &

ST-LLM

CUDA_VISIBLE_DEVICES=0 nohup python train.py --data taxi_pick > your_log_name.log &

SeisMoLLM

# Distributed training example
torchrun --nnodes 1 --nproc_per_node 4 --master_port 10000 main.py \
    --seed 0 --mode "train" --model-name "SeisGPT_baz" \
    --data "/datasets/DiTing330km" --dataset-name "diting_light" \
    --epochs 200 --batch-size 128 --base-lr 0.0005
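torchrun launches one worker process per GPU and passes each worker its rank through environment variables (RANK, LOCAL_RANK, WORLD_SIZE). As a minimal, stdlib-only sketch of how a script can read them (the variable names follow torchrun's documented behavior; SeisMoLLM's actual startup code may organize this differently):

```python
import os

def torchrun_context():
    """Read the rank variables that torchrun exports to every worker.

    Defaulting to a single-process layout lets the same script also run
    without torchrun during local debugging.
    """
    return {
        "rank": int(os.environ.get("RANK", 0)),
        "local_rank": int(os.environ.get("LOCAL_RANK", 0)),
        "world_size": int(os.environ.get("WORLD_SIZE", 1)),
    }

ctx = torchrun_context()
```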

Key Architecture Components

ST-LLM+

  • Partially Frozen Graph Attention (PFGA) module for capturing localized dependencies
  • LoRA-augmented training strategy for efficient fine-tuning
  • Proximity-based adjacency matrix integration for traffic network modeling
  • Core model: model_ST_LLM_plus.py
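The exact LoRA ranks used by ST-LLM+ are not documented here, but the reason LoRA makes fine-tuning cheap is easy to show: a frozen weight W of shape (d_out, d_in) gets a trainable low-rank update B @ A, so the trainable count drops from d_out * d_in to r * (d_out + d_in). A small arithmetic sketch (function name and the 768/rank-8 example are illustrative, not taken from the repo):

```python
def lora_param_counts(d_in: int, d_out: int, r: int):
    """Compare trainable parameters for one weight matrix W (d_out x d_in):
    full fine-tuning trains W itself; LoRA freezes W and trains only
    B (d_out x r) and A (r x d_in)."""
    full = d_out * d_in
    lora = r * (d_out + d_in)
    return full, lora

# A GPT-2-sized 768 x 768 projection with rank 8:
full, lora = lora_param_counts(768, 768, 8)
```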

ST-LLM

  • Spatial-temporal embedding for learning location and temporal patterns
  • Fusion convolution for unified spatial-temporal representation
  • Partially frozen attention strategy for LLM adaptation
  • Core model: model_ST_LLM.py

SeisMoLLM

  • Multi-task seismic analysis including phase picking, magnitude estimation, azimuth prediction
  • Configurable model architecture via config.py with regex-based model matching
  • Multiple loss functions (BCE, Focal, Huber, Combination losses)
  • Distributed training support with torchrun
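A "combination" loss in a multi-task setting like this typically means a weighted sum of per-task terms, e.g. a classification term for phase picking plus a regression term for magnitude. One plausible shape, in plain Python (the actual definitions and weights live in the repo's loss code, not here):

```python
import math

def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one probability/label pair."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic near zero, linear for large errors."""
    err = abs(pred - target)
    return 0.5 * err ** 2 if err <= delta else delta * (err - 0.5 * delta)

def combination_loss(pick_prob, pick_label, mag_pred, mag_true, w=(1.0, 1.0)):
    """Weighted sum of a picking (BCE) term and a magnitude (Huber) term."""
    return w[0] * bce(pick_prob, pick_label) + w[1] * huber(mag_pred, mag_true)
```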

Data Structure

  • ST-LLM/ST-LLM+: Traffic datasets (taxi_pick, taxi_drop, bike_pick, bike_drop) with adjacency matrices
  • SeisMoLLM: Seismic datasets (DiTing, STEAD) with various input channels (z, n, e components)

Model Configuration (SeisMoLLM)

The config.py file contains comprehensive model configurations using regex patterns for model names:

  • Input/output channel definitions
  • Loss function assignments
  • Evaluation metrics specification
  • Transform functions for data preprocessing
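The regex-based matching works by scanning a table of (pattern, config) pairs and returning the first config whose pattern matches the model name, so a whole family of model names (e.g. versioned variants) shares one entry. A sketch of that lookup; the pattern strings and config fields below are hypothetical (only SeisGPT_baz appears in the training command above), and config.py's real table is richer:

```python
import re

# Hypothetical pattern table in the spirit of config.py: each regex
# maps a family of model names to shared settings.
MODEL_CONFIGS = [
    (r"^SeisGPT_baz.*", {"out": "back-azimuth", "loss": "huber"}),
    (r"^SeisGPT_mag.*", {"out": "magnitude", "loss": "huber"}),
    (r"^SeisGPT_pick.*", {"out": "phase-pick", "loss": "bce"}),
]

def lookup_config(model_name):
    """Return the first config whose regex matches the model name."""
    for pattern, cfg in MODEL_CONFIGS:
        if re.match(pattern, model_name):
            return cfg
    raise KeyError(f"no config matches {model_name!r}")
```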

Training Parameters

Common patterns across projects:

  • Learning rates: 1e-3 to 1e-4 range
  • Batch sizes: 64-128
  • Early stopping: patience around 30-100 epochs
  • Weight decay: 1e-4
  • CUDA memory optimization: various allocator split-size settings (commonly tuned via PyTorch's PYTORCH_CUDA_ALLOC_CONF, e.g. max_split_size_mb)
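Patience-based early stopping, as used across these projects, just counts epochs since the best validation loss and halts when the count reaches the patience threshold. A minimal, framework-free sketch of that bookkeeping (class name is illustrative; each project implements its own variant):

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for
    `patience` consecutive epochs (these projects use values in the
    30-100 range)."""

    def __init__(self, patience=30):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```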

Log Management

All projects generate training logs in ./logs/ directories with timestamped filenames and model-specific naming conventions.
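A timestamped, model-specific log filename of that kind can be built like this; the helper name and exact filename layout are assumptions, since each project defines its own convention:

```python
from datetime import datetime

def log_path(model_name, log_dir="./logs"):
    """Build a timestamped log filename under ./logs/, in the style of
    <model_name>_<YYYYmmdd_HHMMSS>.log."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{log_dir}/{model_name}_{stamp}.log"
```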