A high-performance Rust tool for generating GEDCOM files with millions of people records using customizable rulesets.
Available as both a CLI tool and web application with REST API! 🚀
See INSTALL.md for detailed installation instructions for all platforms.
Quick install: Download the binary for your platform from the latest release.
- 52 Language Presets: Built-in support for European (including English USA & UK), Asian, Middle Eastern, Pacific, African, and Latin American languages with culturally appropriate names and locations
- Ruleset-Based Generation: Define custom rules for names, dates, locations, relationships, and LDS ordinances
- Fast Generation: Optimized for creating millions of records efficiently
- Family Relationships: Generate realistic multi-generational families with marriages, divorces, and children
- GEDCOM Parser: Parse and validate existing GEDCOM 5.5.1 files with strict/lenient modes
- IOUS Generator: Create "Individuals of Unusual Size" - highly connected people with multiple marriages and extensive descendants
- Unicode Support: Full UTF-8 support for non-Latin scripts (Arabic, Chinese, Japanese, Korean, etc.)
- LDS Ordinances: Optional support for baptism, endowment, sealing, and other LDS temple ordinances
- Streaming Output: Writes directly to file without loading everything into memory
- Progress Tracking: Real-time progress bar with ETA
- Highly Configurable: Customize every aspect through JSON ruleset files
- Single Binary: All 52 presets embedded - no external files needed
- REST API: Full-featured REST API with 6 endpoints for preset management and GEDCOM generation
- Swagger Documentation: Interactive API documentation at /api/docs
- Web Interface: User-friendly web UI for generating GEDCOM files
- Preview Mode: Generate small samples (10-100 records) for testing
- Batch Generation: Create files with up to 10M individuals
- Real-time Statistics: View generation metrics (individuals, families, time)
See README_WEB.md for web application documentation.
See INSTALL.md for detailed installation instructions.
Download: Get the binary for your platform from the latest release.
git clone https://github.com/yourusername/Rfamily.git
cd Rfamily
./install.sh
The install script will:
- Build the optimized release binary
- Let you choose installation location (system-wide or user)
- Optionally copy to your PATH for easy access
Clone this repository and build:
git clone https://github.com/yourusername/Rfamily.git
cd Rfamily
cargo build --release
The compiled binary will be in target/release/rfamily
After building, the standalone binary can be copied anywhere and run independently:
# Copy binary to a directory in your PATH
cp target/release/rfamily /usr/local/bin/
# Or run directly from build directory
./target/release/rfamily --help
The binary is completely self-contained with all 52 language presets embedded at compile time.
Binary Size: ~1.5 MB (includes all presets and dependencies)
Start the web server:
# Using cargo
cargo run -p rfamily-web
# Or run the binary directly
./target/release/rfamily-web
Then visit:
- Web Interface: http://localhost:3000
- API Documentation: http://localhost:3000/api/docs
See README_WEB.md and API_DOCUMENTATION.md for complete web application documentation.
# List all available language presets
rfamily --list-presets
# Generate with a specific language preset
rfamily --preset japanese --count 100000 --output japan.ged
# Generate with custom count and output file
rfamily -p german -c 50000 -o germany.ged
# List all available language presets
cargo run --release -- --list-presets
# Generate with Japanese names and locations
cargo run --release -- --preset japanese --count 100000 --output japan.ged
# Generate with Arabic names (UTF-8 encoded)
cargo run --release -- --preset arabic --count 50000 --output arabic.ged
# Generate with German names
cargo run --release -- --preset german --count 75000 --output germany.ged
# Generate with the English preset (the default)
cargo run --release -- --preset english --count 100000 --output family.ged
# or simply:
cargo run --release -- --count 100000 --output family.ged
# Generate with the LDS preset (includes temple ordinances)
cargo run --release -- --preset lds --count 50000 --output lds-family.ged
Create highly connected individuals with multiple marriages and extensive descendants:
# Generate IOUS with default settings (3 marriages, 5 siblings, 5 generations)
rfamily generate-ious --preset english --output ious.ged
# Customize IOUS generation
rfamily generate-ious \
--preset japanese \
--output ious-japan.ged \
--marriages 4 \
--children-per-marriage 3.5 \
--siblings 6 \
--descendant-gens 4 \
--total-descendants 500
# Minimal IOUS (1 marriage, no siblings, 2 generations)
rfamily generate-ious \
--preset spanish \
--output ious-minimal.ged \
--marriages 1 \
--children-per-marriage 2.0 \
--siblings 0 \
--descendant-gens 2
IOUS Parameters:
- --marriages: Number of marriages (1-10, default: 3)
- --children-per-marriage: Mean children per marriage (0-15, default: 4.0)
- --siblings: Number of siblings for IOUS (0-20, default: 5)
- --descendant-gens: Generations of descendants (1-10, default: 5)
- --total-descendants: Optional limit on total individuals
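The documented ranges and defaults can be captured in a small validation routine. The following is a minimal sketch only: the `IousParams` type and its `validate` method are hypothetical names introduced for illustration and are not taken from rfamily's source. Only the ranges and defaults come from the parameter list above.

```rust
/// Hypothetical mirror of the IOUS command-line options (not the crate's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct IousParams {
    pub marriages: u32,
    pub children_per_marriage: f64,
    pub siblings: u32,
    pub descendant_gens: u32,
    pub total_descendants: Option<u32>,
}

impl Default for IousParams {
    /// Defaults as documented: 3 marriages, 4.0 children, 5 siblings, 5 generations.
    fn default() -> Self {
        Self {
            marriages: 3,
            children_per_marriage: 4.0,
            siblings: 5,
            descendant_gens: 5,
            total_descendants: None,
        }
    }
}

impl IousParams {
    /// Checks each value against its documented range.
    pub fn validate(&self) -> Result<(), String> {
        if !(1..=10).contains(&self.marriages) {
            return Err(format!("marriages must be 1-10, got {}", self.marriages));
        }
        // A NaN value fails this check too, because NaN is never inside a range.
        if !(0.0..=15.0).contains(&self.children_per_marriage) {
            return Err(format!(
                "children-per-marriage must be 0-15, got {}",
                self.children_per_marriage
            ));
        }
        if self.siblings > 20 {
            return Err(format!("siblings must be 0-20, got {}", self.siblings));
        }
        if !(1..=10).contains(&self.descendant_gens) {
            return Err(format!(
                "descendant-gens must be 1-10, got {}",
                self.descendant_gens
            ));
        }
        Ok(())
    }
}
```

Calling `IousParams::default().validate()` returns `Ok(())`, while a value such as `marriages: 11` yields an error message naming the offending option.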
Use the library API to parse existing GEDCOM files:
use rfamily_core::gedcom::{GedcomParser, ParseMode};
// Parse in lenient mode (accepts real-world GEDCOM quirks)
let mut parser = GedcomParser::new(ParseMode::Lenient);
let gedcom = parser.parse_file("family.ged")?;
println!("Parsed {} individuals", gedcom.individuals.len());
println!("Parsed {} families", gedcom.families.len());
// Access parsed data
for (xref, individual) in &gedcom.individuals {
    // Individuals may lack a NAME record, so avoid unwrap() here
    if let Some(name) = &individual.name {
        println!("{}: {}", xref, name);
    }
}
// Check for warnings
for warning in parser.warnings() {
println!("Warning: {}", warning);
}
Three working examples are provided in rfamily-core/examples/:
# Example 1: Parse an existing GEDCOM file
cargo run -p rfamily-core --example parse_gedcom -- path/to/file.ged
# Example 2: Generate an IOUS (Individual of Unusual Size)
cargo run -p rfamily-core --example generate_ious
# Example 3: Round-trip test (generate → parse → verify)
cargo run -p rfamily-core --example round_trip
Example Output:
- parse_gedcom: Parses GEDCOM files, shows individuals/families, validates references
- generate_ious: Creates a 200-person IOUS family tree with 3 marriages, 5 siblings, 4 generations
- round_trip: Generates 100 individuals, parses them back, verifies data integrity
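For readers unfamiliar with the format these examples work on, each GEDCOM line has the shape `level [xref] tag [value]`. The sketch below splits one line into those parts. It is an illustration under stated assumptions, not rfamily-core's parser: `GedcomLine` and `parse_line` are hypothetical names, and the sketch assumes single-space delimiters and does no tag validation.

```rust
/// One GEDCOM line split into its parts (illustrative type, not from rfamily-core).
#[derive(Debug, PartialEq)]
pub struct GedcomLine<'a> {
    pub level: u8,
    pub xref: Option<&'a str>,
    pub tag: &'a str,
    pub value: Option<&'a str>,
}

/// Splits a line such as `0 @I1@ INDI` or `1 NAME James /Smith/`.
/// Returns `None` when the level is missing or not a number, or no tag follows.
pub fn parse_line(line: &str) -> Option<GedcomLine<'_>> {
    // Drop trailing line terminators, then leading indentation.
    let line = line
        .trim_end_matches(|c| c == '\r' || c == '\n')
        .trim_start();
    let (level, rest) = line.split_once(' ')?;
    let level: u8 = level.parse().ok()?;

    // An optional cross-reference identifier such as @I1@ precedes the tag.
    let (xref, rest) = if rest.starts_with('@') {
        let (x, r) = rest.split_once(' ')?;
        (Some(x), r)
    } else {
        (None, rest)
    };

    // Everything after the tag is the line value, if present.
    let (tag, value) = match rest.split_once(' ') {
        Some((t, v)) => (t, Some(v)),
        None => (rest, None),
    };
    if tag.is_empty() {
        return None;
    }
    Some(GedcomLine { level, xref, tag, value })
}
```

A strict mode could reject any line for which `parse_line` returns `None`, while a lenient mode could record a warning and continue; how rfamily-core actually distinguishes the two modes is not shown here.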
European Languages (30): Albanian, Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Icelandic, Italian, Latvian, Lithuanian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian
Asian Languages (7): Chinese (Traditional), Japanese, Korean, Khmer (Cambodian), Mongolian, Thai, Vietnamese
Middle Eastern Languages (3): Arabic, Armenian, Farsi (Persian)
Pacific Languages (6): Fijian, Malagasy (Madagascar), Malay, Samoan, Tongan, Tagalog (Filipino)
African Languages (1): Swahili
Caribbean & Latin American Languages (3): Haitian Creole, Guarani (Paraguayan), Cebuano (Filipino)
Special Presets (1): LDS (Latter-day Saints with temple ordinances)
cargo run --release -- --generate-ruleset my-ruleset.json
- Names: Male/female given names, surnames, naming conventions (Western, Eastern, Patronymic, Icelandic)
- Dates: Birth year ranges, marriage ages, life expectancy, parent age ranges
- Locations: Countries, cities, languages with probability weights
- Demographics: Sex ratio, twin/triplet rates
- Relationships: Marriage probability, divorce rates, children distribution, multi-generational families
- Ordinances: LDS temple ordinance settings (baptism, endowment, sealing, etc.)
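To make the idea of "probability weights" concrete, the following sketch picks one entry from a weighted list using cumulative weights. It is an assumption about how such weights might be consumed, not rfamily's implementation: `pick_weighted` is a hypothetical helper, and the caller supplies the uniform random draw `u` so the example needs no external crate.

```rust
/// Picks an item from `(value, weight)` pairs using a uniform draw `u` in [0, 1).
/// Weights need not sum to 1; negative weights are treated as 0.
/// Returns `None` if the list is empty or has no positive weight.
pub fn pick_weighted<T>(items: &[(T, f64)], u: f64) -> Option<&T> {
    let total: f64 = items.iter().map(|&(_, w)| w.max(0.0)).sum();
    if total <= 0.0 {
        return None;
    }
    // Scale the draw to the total weight, then walk the cumulative ranges.
    let mut threshold = u.clamp(0.0, 1.0) * total;
    for (item, w) in items {
        let w = (*w).max(0.0);
        if threshold < w {
            return Some(item);
        }
        threshold -= w;
    }
    // Rounding (or u == 1.0) can fall through; use the last positive-weight item.
    items
        .iter()
        .rev()
        .find(|(_, w)| *w > 0.0)
        .map(|(item, _)| item)
}
```

With the illustrative list `[("Tokyo", 2.0), ("Osaka", 1.0), ("Kyoto", 1.0)]`, draws below 0.5 select Tokyo, draws in [0.5, 0.75) select Osaka, and the rest select Kyoto.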
cargo run --release -- --ruleset my-ruleset.json --count 200000 --output custom.ged
Options:
  -c, --count <COUNT>              Number of individuals to generate [default: 100000]
  -o, --output <OUTPUT>            Output file path [default: output.ged]
  -p, --preset <PRESET>            Language preset to use (see --list-presets)
      --list-presets               List all available language presets
  -r, --ruleset <RULESET>          Custom ruleset configuration file (JSON)
      --generate-ruleset <FILE>    Generate example ruleset file
  -h, --help                       Print help
  -V, --version                    Print version
Deprecated options (still supported for backward compatibility):
- --lds, --icelandic, --spanish, --french, --italian: Use --preset <name> instead
See the generated example-ruleset.json file for complete configuration options. Key sections include:
Names, Dates, Locations, Demographics, Relationships, and Ordinances.
Refer to the full documentation in the source code for detailed parameter descriptions.
On a typical modern machine, this tool can generate:
- 100,000 records with families in ~5-10 seconds
- 1 million records in ~30-60 seconds
- 10 million records in ~5-10 minutes
Actual performance depends on your CPU, disk I/O speed, and complexity of family relationships.
The generated GEDCOM file includes:
- Standard GEDCOM 5.5.1 header
- Individual records (INDI) with:
  - Full names (given name and surname)
  - Sex (based on demographic rules)
  - Birth date and place
  - Death date and place (optional)
  - Language
  - Family relationships (parents and spouses)
  - LDS ordinances (optional)
- Family records (FAM) with:
  - Husband and wife references
  - Children references
  - Marriage date and place
  - Divorce date (if applicable)
- Proper GEDCOM trailer
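The structure listed above lends itself to streaming: each record can be written as soon as it is generated, so memory use does not grow with the record count. The sketch below illustrates that idea with the standard `Write` trait. It is not rfamily's writer: `Individual`, `write_individual`, and `write_gedcom` are hypothetical names, the header is abbreviated, and only a few INDI fields are emitted.

```rust
use std::io::{self, Write};

/// Illustrative individual record; field names are assumptions, not rfamily-core types.
pub struct Individual<'a> {
    pub id: u64,
    pub given: &'a str,
    pub surname: &'a str,
    pub sex: char,
    pub birth_date: Option<&'a str>,
}

/// Writes one INDI record directly to `out`.
pub fn write_individual<W: Write>(out: &mut W, p: &Individual<'_>) -> io::Result<()> {
    writeln!(out, "0 @I{}@ INDI", p.id)?;
    writeln!(out, "1 NAME {} /{}/", p.given, p.surname)?;
    writeln!(out, "2 GIVN {}", p.given)?;
    writeln!(out, "2 SURN {}", p.surname)?;
    writeln!(out, "1 SEX {}", p.sex)?;
    // The birth event is optional in this sketch.
    if let Some(date) = p.birth_date {
        writeln!(out, "1 BIRT")?;
        writeln!(out, "2 DATE {}", date)?;
    }
    Ok(())
}

/// Writes header, records, and trailer; `people` is consumed lazily,
/// so the full data set never has to be held in memory.
pub fn write_gedcom<'a, W, I>(out: &mut W, people: I) -> io::Result<()>
where
    W: Write,
    I: IntoIterator<Item = Individual<'a>>,
{
    // Abbreviated header, following the sample output in this README.
    writeln!(out, "0 HEAD")?;
    writeln!(out, "1 SOUR Rfamily")?;
    writeln!(out, "1 GEDC")?;
    writeln!(out, "2 VERS 5.5.1")?;
    for person in people {
        write_individual(out, &person)?;
    }
    writeln!(out, "0 TRLR")
}
```

In practice `out` would typically be a `std::io::BufWriter` wrapped around a `std::fs::File`, which batches the many small writes into larger disk operations.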
0 HEAD
1 SOUR Rfamily
2 VERS 0.2.0
1 GEDC
2 VERS 5.5.1
0 @I1@ INDI
1 NAME James /Smith/
2 GIVN James
2 SURN Smith
1 SEX M
1 BIRT
2 DATE 15 MAR 1985
...
0 TRLR
- Genealogy Software Testing: Generate realistic test data for genealogy applications in 52 different languages
- Performance Testing: Test how software handles large GEDCOM files with millions of records
- Data Analysis: Create datasets for studying genealogical patterns across different cultures
- LDS Family History: Generate data with temple ordinances for testing FamilySearch integrations
- Cultural Studies: Generate families following specific cultural naming conventions and demographics
- Internationalization Testing: Test genealogy software with Unicode names and non-Latin scripts
- Database Population: Quickly populate databases with realistic multi-generational family data
- Format: GEDCOM 5.5.1 standard
- Encoding: UTF-8 with full Unicode support
- Language: Rust for optimal performance and memory safety
- Architecture: Streaming output for minimal memory footprint
- Distribution: Single self-contained binary with all 52 presets embedded
- Binary Size: ~1.5 MB (includes all presets and dependencies)
- Platform: macOS (Apple Silicon/Intel), Linux, Windows (via cross-compilation)
The compiled binary is completely standalone and can be distributed without any dependencies:
Included in Binary:
- ✅ All 52 language presets (embedded at compile time)
- ✅ Complete GEDCOM generation engine
- ✅ UTF-8 support for all character sets
- ✅ No external files required
- ✅ No runtime dependencies
To distribute:
- Build the release binary: cargo build --release
- The binary is located at target/release/rfamily (~1.5 MB)
- Copy to any location - it works standalone
- Optional: Copy to PATH for system-wide access:
cp target/release/rfamily /usr/local/bin/
Binary Information:
- macOS: Mach-O 64-bit executable (Apple Silicon: arm64, Intel: x86_64)
- All 52 language presets embedded (204 KB preset data)
- Self-contained - no external files or dependencies needed
Cross-compilation for other platforms:
# For Linux from macOS
rustup target add x86_64-unknown-linux-gnu
cargo build --release --target x86_64-unknown-linux-gnu
# For Windows from macOS
rustup target add x86_64-pc-windows-gnu
cargo build --release --target x86_64-pc-windows-gnu
Potential features to add:
- More language presets (Hindi, Tamil, Telugu, Urdu, etc.)
- More sophisticated relationship modeling
- Historical accuracy improvements
- DNA/genetic relationship modeling
- Import/merge with existing GEDCOM files
- Custom name frequency distributions
- Migration patterns across locations
MIT
Contributions are welcome! Please feel free to submit a Pull Request.