⚠️ 100% Burn Period Active: The Covenant72B training phase is complete. Validators are burning 100% of emissions (see `neurons/burn.py`) and are not evaluating miner gradients. Templar: Crusades information coming soon. See the main README for current status.
This document provides a comprehensive guide on how to set up and run a validator using `neurons/validator.py`. Validators are crucial components of τemplar, responsible for evaluating miners' contributions by assessing their uploaded gradients.
- Validator Setup
This guide will help you set up and run a validator for τemplar. Validators play a critical role in maintaining the integrity of the network by evaluating miners' contributions and updating weights accordingly.
- NVIDIA GPU with CUDA support
- Minimum required: 4x H200 GPUs
- Ubuntu (or Ubuntu-based Linux distribution)
- Docker and Docker Compose
- Git
- Python 3.12+ (for manual installation)
- Hugging Face Authentication:
- Create a Hugging Face account and generate a token at https://huggingface.co/settings/tokens
- Accept the Gemma model terms at https://huggingface.co/google/gemma-3-270m (required for tokenizer access)
- Set the `HF_TOKEN` environment variable with your token
- Cloudflare R2 Bucket Configuration:
- Dataset Setup: Please refer to Shared Sharded Dataset Documentation for complete dataset setup instructions, including:
- R2 bucket settings
- Dataset download process
- Gradient Bucket Setup:
- Create a Bucket: Name it the same as your account ID and set the region to ENAM.
- Generate Tokens:
- Read Token: Admin Read permissions.
- Write Token: Admin Read & Write permissions.
- Store Credentials: You'll need these for the `.env` file.
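The account ID you name the bucket after also forms the host part of R2's S3-compatible endpoint, which is what S3 tooling needs to reach the bucket. A small sketch — the `r2_endpoint` helper is illustrative, not part of the repo:

```shell
# Hypothetical helper: build the S3-compatible endpoint URL that R2 tooling
# expects from the account ID configured above.
r2_endpoint() {
  echo "https://$1.r2.cloudflarestorage.com"
}

# With the AWS CLI installed, you could then verify read access, e.g.:
#   aws s3 ls "s3://$R2_GRADIENTS_BUCKET_NAME" \
#     --endpoint-url "$(r2_endpoint "$R2_GRADIENTS_ACCOUNT_ID")"
```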
- Install Docker and Docker Compose:
  Follow the same steps as in the Miner Setup section.
- Enable Docker GPU Support:
  Follow the official NVIDIA Container Toolkit installation guide:

  ```bash
  # 1. Configure the production repository
  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
    && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

  # 2. Update package listings
  sudo apt-get update

  # 3. Install the NVIDIA Container Toolkit
  sudo apt-get install -y nvidia-container-toolkit

  # 4. Configure Docker runtime
  sudo nvidia-ctk runtime configure --runtime=docker

  # 5. Restart Docker daemon
  sudo systemctl restart docker

  # 6. Test GPU support
  docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
  ```

  If you see the `nvidia-smi` output, GPU support is working correctly. For detailed instructions and other Linux distributions, refer to the official NVIDIA Container Toolkit installation guide.
- Clone the Repository:

  ```bash
  git clone https://github.com/one-covenant/templar.git
  cd templar
  ```
- Navigate to the Docker Directory:

  ```bash
  cd docker
  ```
- Create and Populate the `.env` File:

  Create a `.env` file in the `docker` directory by copying `.env.example`:

  ```bash
  cp .env.example .env
  ```

  Populate the `.env` file with your configuration. Variables to set:

  ```bash
  # Required: Hugging Face token for tokenizer access
  HF_TOKEN=<your_huggingface_token>

  # Add your Weights & Biases API key
  WANDB_API_KEY=<your_wandb_api_key>

  # Cloudflare R2 Credentials - Add your R2 credentials below
  R2_GRADIENTS_ACCOUNT_ID=<your_r2_account_id>
  R2_GRADIENTS_BUCKET_NAME=<your_r2_bucket_name>
  R2_GRADIENTS_READ_ACCESS_KEY_ID=<your_r2_read_access_key_id>
  R2_GRADIENTS_READ_SECRET_ACCESS_KEY=<your_r2_read_secret_access_key>
  R2_GRADIENTS_WRITE_ACCESS_KEY_ID=<your_r2_write_access_key_id>
  R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY=<your_r2_write_secret_access_key>

  # Dataset R2 credentials - See docs/shared_sharded_dataset.md for instructions
  R2_DATASET_ACCOUNT_ID=<your_dataset_account_id>
  R2_DATASET_BUCKET_NAME=<your_dataset_bucket_name>
  R2_DATASET_READ_ACCESS_KEY_ID=<your_dataset_read_access_key_id>
  R2_DATASET_READ_SECRET_ACCESS_KEY=<your_dataset_read_secret_access_key>
  DATASET_BINS_PATH="anneal/"

  # Aggregator R2 credentials
  R2_AGGREGATOR_ACCOUNT_ID=8af7f92a8a0661cf7f1ac0420c932980
  R2_AGGREGATOR_BUCKET_NAME=aggregator
  R2_AGGREGATOR_READ_ACCESS_KEY_ID=bb4b9f02a64dacead181786b8f353b67
  R2_AGGREGATOR_READ_SECRET_ACCESS_KEY=f50761d0fbb0773c55f61debdf87439735c32c096fe4b1ab6aa6bfb7f52aa30b

  # Wallet Configuration
  WALLET_NAME=<your_wallet_name>
  WALLET_HOTKEY=<your_wallet_hotkey>

  # Network Configuration
  NETWORK=finney
  NETUID=3

  # GPU Configuration (automatically handled by Docker)
  # Validator service uses GPUs 0, 1, and 2 from the host

  # Node Type
  NODE_TYPE=validator

  # Additional Settings
  DEBUG=false
  ```

  Note: Set `NODE_TYPE` to `validator`.
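Before bringing the container up, it can save a restart cycle to confirm that the required variables actually resolve in your environment. A minimal pre-flight sketch — the `check_required_vars` function is illustrative, not part of the repo, and the list covers only a subset of the variables above:

```shell
# Illustrative pre-flight check (bash): fail fast if any required variable
# from docker/.env is missing or empty.
check_required_vars() {
  local missing=0 var
  for var in HF_TOKEN WANDB_API_KEY \
             R2_GRADIENTS_ACCOUNT_ID R2_GRADIENTS_BUCKET_NAME \
             WALLET_NAME WALLET_HOTKEY NETWORK NETUID NODE_TYPE; do
    if [ -z "${!var}" ]; then
      echo "Missing required variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```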
- Update `docker-compose.yml`:
  Ensure that the `docker-compose.yml` file is correctly configured for your setup.
- Run Docker Compose:
  Start the validator using Docker Compose:

  ```bash
  docker compose -f docker/compose.yml up -d
  ```
If you prefer to run the validator without Docker, follow the instructions in the Running Without Docker section.
After completing the installation steps, your validator should be running. Check it with:
```bash
docker ps
```

You should see a container named `templar-validator-<WALLET_HOTKEY>`.
- Install System Dependencies:
  Same as in the miner setup.
- Install NVIDIA CUDA Drivers:
  Install the appropriate NVIDIA CUDA drivers.
- Clone the Repository:

  ```bash
  git clone https://github.com/one-covenant/templar.git
  cd templar
  ```
- Set Up Python Environment:

  ```bash
  export HF_TOKEN=your_huggingface_token  # Required for tokenizer access
  export WANDB_API_KEY=your_wandb_api_key
  export NODE_TYPE=your_node_type
  export WALLET_NAME=your_wallet_name
  export WALLET_HOTKEY=your_wallet_hotkey
  # GPU is automatically assigned by Docker (GPUs 0,1,2 for validator)
  export NETWORK=your_network
  export NETUID=your_netuid
  export DEBUG=your_debug_setting

  # Gradients R2 credentials
  export R2_GRADIENTS_ACCOUNT_ID=your_r2_account_id
  export R2_GRADIENTS_BUCKET_NAME=your_r2_bucket_name
  export R2_GRADIENTS_READ_ACCESS_KEY_ID=your_r2_read_access_key_id
  export R2_GRADIENTS_READ_SECRET_ACCESS_KEY=your_r2_read_secret_access_key
  export R2_GRADIENTS_WRITE_ACCESS_KEY_ID=your_r2_write_access_key_id
  export R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY=your_r2_write_secret_access_key

  # Dataset R2 credentials - See docs/shared_sharded_dataset.md for instructions
  export R2_DATASET_ACCOUNT_ID=your_dataset_account_id
  export R2_DATASET_BUCKET_NAME=your_dataset_bucket_name
  export R2_DATASET_READ_ACCESS_KEY_ID=your_dataset_read_access_key_id
  export R2_DATASET_READ_SECRET_ACCESS_KEY=your_dataset_read_secret_access_key
  export DATASET_BINS_PATH="anneal/"

  # Aggregator R2 credentials
  export R2_AGGREGATOR_ACCOUNT_ID=8af7f92a8a0661cf7f1ac0420c932980
  export R2_AGGREGATOR_BUCKET_NAME=aggregator
  export R2_AGGREGATOR_READ_ACCESS_KEY_ID=bb4b9f02a64dacead181786b8f353b67
  export R2_AGGREGATOR_READ_SECRET_ACCESS_KEY=f50761d0fbb0773c55f61debdf87439735c32c096fe4b1ab6aa6bfb7f52aa30b

  export GITHUB_USER=your_github_username
  ```
- Create and Register Validator Wallet:

  ```bash
  # Create coldkey if not already created
  btcli wallet new_coldkey --wallet.name default --n-words 12

  # Create and register validator hotkey
  btcli wallet new_hotkey --wallet.name default --wallet.hotkey validator --n-words 12
  btcli subnet pow_register --wallet.name default --wallet.hotkey validator --netuid <netuid> --subtensor.network <network>
  ```
- Log into Weights & Biases (WandB):

  ```bash
  wandb login your_wandb_api_key
  ```
- Set Environment Variables:
  Export the necessary environment variables as in the miner setup.
- Run the Validator:

  ```bash
  torchrun --standalone --nnodes 1 --nproc_per_node 4 \
    neurons/validator.py \
    --wallet.name <wallet_name> \
    --wallet.hotkey <hotkey> \
    --device cuda \
    --netuid 3 \
    --subtensor.network <network> \
    --use_wandb
  ```
Set the following in the `docker/.env` file when using Docker Compose:

```bash
# Required: Hugging Face token for tokenizer access
HF_TOKEN=your_huggingface_token
WANDB_API_KEY=your_wandb_api_key
INFLUXDB_TOKEN=your_influxdb_token

# Cloudflare R2 Credentials
R2_ACCOUNT_ID=your_r2_account_id
R2_READ_ACCESS_KEY_ID=your_r2_read_access_key_id
R2_READ_SECRET_ACCESS_KEY=your_r2_read_secret_access_key
R2_WRITE_ACCESS_KEY_ID=your_r2_write_access_key_id
R2_WRITE_SECRET_ACCESS_KEY=your_r2_write_secret_access_key

# Additional Gradient R2 credentials
R2_GRADIENTS_ACCOUNT_ID=your_r2_gradients_account_id
R2_GRADIENTS_BUCKET_NAME=your_r2_gradients_bucket_name
R2_GRADIENTS_READ_ACCESS_KEY_ID=your_r2_gradients_read_access_key_id
R2_GRADIENTS_READ_SECRET_ACCESS_KEY=your_r2_gradients_read_secret_access_key
R2_GRADIENTS_WRITE_ACCESS_KEY_ID=your_r2_gradients_write_access_key_id
R2_GRADIENTS_WRITE_SECRET_ACCESS_KEY=your_r2_gradients_write_secret_access_key

# Dataset R2 credentials - See docs/shared_sharded_dataset.md for instructions
R2_DATASET_ACCOUNT_ID=your_dataset_account_id
R2_DATASET_BUCKET_NAME=your_dataset_bucket_name
R2_DATASET_READ_ACCESS_KEY_ID=your_dataset_read_access_key_id
R2_DATASET_READ_SECRET_ACCESS_KEY=your_dataset_read_secret_access_key
DATASET_BINS_PATH="anneal/"

# Aggregator R2 credentials
R2_AGGREGATOR_ACCOUNT_ID=8af7f92a8a0661cf7f1ac0420c932980
R2_AGGREGATOR_BUCKET_NAME=aggregator
R2_AGGREGATOR_READ_ACCESS_KEY_ID=bb4b9f02a64dacead181786b8f353b67
R2_AGGREGATOR_READ_SECRET_ACCESS_KEY=f50761d0fbb0773c55f61debdf87439735c32c096fe4b1ab6aa6bfb7f52aa30b

# Wallet Configuration
WALLET_NAME=default
WALLET_HOTKEY=your_validator_hotkey_name

# Network Configuration
NETWORK=finney
NETUID=3

# GPU Configuration (automatically handled by Docker)
# Validator service uses GPUs 0, 1, and 2 from the host

# Node Type
NODE_TYPE=validator

# Additional Settings
DEBUG=false
```

Note: The R2 permissions remain unchanged.
- Hardware Requirements:
- Minimum required: 4x H200 GPUs (as defined in min_compute.yml)
- Minimum CPU: 64 cores, 3.5 GHz (recommended: 128 cores, 4.0 GHz)
- Minimum RAM: 1200 GB (recommended: 1500 GB)
- Minimum Network: 1024 Mbps download/upload bandwidth
- Storage: 1000GB minimum, 2000GB recommended for model and evaluation data
- Network: High-bandwidth, stable connection for state synchronization
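The GPU and RAM minimums above are easy to check on the host before registering. A hedged sketch — the `meets_minimums` function is mine, not part of the repo; on a live machine you would feed it real values as shown in the comment:

```shell
# Illustrative check against the stated minimums: 4 GPUs and 1200 GB of RAM.
meets_minimums() {
  local gpus=$1 ram_gb=$2
  [ "$gpus" -ge 4 ] && [ "$ram_gb" -ge 1200 ]
}

# On a live host (assuming nvidia-smi and free are available):
#   meets_minimums "$(nvidia-smi -L | wc -l)" "$(free -g | awk '/^Mem:/{print $2}')"
```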
- Mainnet (Finney):
  - Network: `finney`
  - Netuid: `3`
- Testnet:
  - Network: `test`
  - Netuid: `223`
- Local:
  - Network: `local`
  - Netuid: `1`
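For example, pointing an otherwise identical setup at testnet instead of mainnet changes only these two variables:

```shell
# Testnet instead of mainnet: same configuration, different network target.
export NETWORK=test
export NETUID=223
```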
Optional InfluxDB configuration variables include:
- `INFLUXDB_TOKEN`: Authentication token
- `INFLUXDB_HOST`: Custom host address
- `INFLUXDB_PORT`: Connection port (default 8086)
- `INFLUXDB_DATABASE`: Database name
- `INFLUXDB_ORG`: Organization identifier
Example configuration:

```bash
INFLUXDB_HOST=custom-influxdb-host.example.com
INFLUXDB_PORT=8086
INFLUXDB_DATABASE=custom-database
INFLUXDB_ORG=custom-org
INFLUXDB_TOKEN=your-influxdb-token
```
These settings are optional and will fall back to default values if not provided.
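The fallback behaviour amounts to a default-on-unset lookup. Only the port default (8086) is stated here, so treat the helper below as an illustration of the pattern rather than the validator's actual resolution logic:

```shell
# Illustrative: use the configured port when set, otherwise the stated
# default of 8086.
influx_port() {
  echo "${INFLUXDB_PORT:-8086}"
}
```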
- Docker Logs:

  ```bash
  docker logs -f templar-validator-${WALLET_HOTKEY}
  ```
- Weights & Biases:
  - Ensure `--use_wandb` is enabled
  - Monitor evaluation metrics and network statistics
Key metrics to monitor:
- GPU utilization
- Memory usage
- Network bandwidth
- Evaluation throughput
- Weight setting frequency
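The first two metrics are straightforward to sample from `nvidia-smi`'s CSV query output; the parsing helper below is a hypothetical convenience for alerting scripts, not part of the repo:

```shell
# Sample GPU utilization and memory on a live host (nvidia-smi required):
#   nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
#     --format=csv,noheader,nounits -l 5
#
# Hypothetical helper: pull the utilization figure out of one CSV line.
gpu_util_pct() {
  echo "$1" | awk -F', ' '{print $1}'
}
```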
- State Synchronization Failures: Check network settings and ensure the validator is properly registered and connected.
- Out of Memory Errors: Reduce `--actual_batch_size`.
- Network Connectivity Issues: Verify firewall settings and network configurations.
- The validator synchronizes its model with the latest global state.
- It gathers and applies gradients from miners to maintain consistency.
- Collect Miner Gradients: Gathers compressed gradients submitted by miners.
- Evaluate Contributions: Assesses the impact of each miner's gradient on model performance.
- Compute Scores: Calculates scores based on loss improvement.
- Update Weights: Adjusts miners' weights on the blockchain accordingly.
- Scoring Mechanism: Based on the performance improvement contributed by miners.
- Update Frequency: Weights are periodically updated on the blockchain.
- Impact: Influences reward distribution and miner reputation in the network.