Arjunheregeek/-GenAI-Sports-Tournament-Discovery-System

🏆 GenAI Tournament Calendar System

Advanced AI-Powered Tournament Discovery & Management Platform

Python 3.12+ | UV Package Manager | Flask API | License: MIT

🚀 Project Overview

GenAI Tournament Calendar System is an intelligent tournament discovery platform that automatically finds, extracts, and organizes tournament information from across the web. Built with modern Python technologies and AI integration, it supports 13+ sports with comprehensive tournament data collection.

🎯 Key Features

  • 🤖 AI-Powered Query Generation - Generates smart search queries using OpenAI GPT
  • 🌐 Multi-Sport Support - Cricket, Football, Tennis, Badminton, and 9+ more sports
  • 📊 Comprehensive Data Collection - 6 queries per sport with 8 results each (48 results per sport)
  • 🏢 Official Governing Bodies - Targets ICC, FIFA, ITF, BWF and other official sources
  • 🔄 Automated Pipeline - From query generation to data export in one workflow
  • 🚀 Modern Architecture - UV package management, Flask API, React-ready frontend
  • 📈 Real-time Processing - Live tournament discovery and processing
  • 💾 Multiple Export Formats - CSV, JSON, and database integration

🏗️ System Architecture

graph TB
    A[Query Generator] --> B[Search Collector]
    B --> C[Content Extractor]
    C --> D[Data Processor]
    D --> E[Tournament Filter]
    E --> F[Data Exporter]
    
    G[Flask API] --> H[Frontend Interface]
    G --> I[Processing Engine]
    I --> A
    
    J[UV Environment] --> K[50+ Dependencies]
    L[OpenAI GPT] --> A
    M[Serper API] --> B
    N[Firecrawl API] --> C

🔧 Core Components

| Component | Purpose | Technology |
| --- | --- | --- |
| Query Generator | AI-powered search query generation | OpenAI GPT-3.5 |
| Search Collector | Web search results collection | Serper API |
| Content Extractor | Tournament data extraction | Firecrawl API |
| Data Processor | Deduplication & validation | Python/Pandas |
| Flask API | RESTful backend services | Flask 3.0+ |
| Frontend | Tournament search interface | HTML5/CSS3/JS |

📋 Prerequisites

System Requirements

  • Python 3.12+ (Required)
  • UV Package Manager (Recommended)
  • Git (For version control)
  • 4GB+ RAM (For processing)
  • Internet Connection (For API calls)

API Keys Required

  • Serper API - Web search functionality
  • OpenAI API - AI query generation
  • Firecrawl API - Content extraction

🛠️ Installation & Setup

1️⃣ Quick Setup (Recommended)

# Clone the repository
git clone https://github.com/Arjunheregeek/Gen-AI.git
cd Gen-AI

# Switch to the enhanced UV branch
git checkout uv

# Install UV package manager (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or on Windows: powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Install all dependencies with UV
uv sync

2️⃣ Environment Configuration

# Copy environment template
cp .env.example .env

# Edit .env file with your API keys
nano .env  # or code .env

Required API Keys in .env:

SERPER_API_KEY=your_serper_api_key_here
OPENAI_API_KEY=your_openai_api_key_here  
FIRECRAWL_API_KEY=your_firecrawl_api_key_here
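A startup check can catch a missing key before any API call is made. The following is a minimal sketch, not the project's own code: the helper names are hypothetical, and it assumes the values from .env have already been loaded into the process environment (for example by a dotenv loader), which this README does not confirm.

```python
import os

# Key names are taken from the .env listing above.
REQUIRED_KEYS = ("SERPER_API_KEY", "OPENAI_API_KEY", "FIRECRAWL_API_KEY")


def missing_api_keys(env=None):
    """Return the required key names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


def require_api_keys(env=None):
    """Raise a readable error listing every missing key (hypothetical helper)."""
    missing = missing_api_keys(env)
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
```

Calling `require_api_keys()` once at startup reports all missing keys in a single message instead of failing on the first API request.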

3️⃣ Verify Installation

# Run a test import inside the UV-managed environment
uv run python -c "import tournament_system; print('✅ Installation successful!')"

🚀 Running the System

Option 1: Development Mode (Both Servers)

# Start both frontend and backend together
uv run python run.py dev

Option 2: API Server Only

# Start backend API server
uv run python run.py server --port 8000

# Or with debug mode
uv run python run.py server --port 8000 --debug

Option 3: Batch Processing

# Run complete tournament discovery pipeline
uv run python run.py batch --sport Cricket

# Limit results
uv run python run.py batch --sport Tennis --max 50

🔄 Tournament Discovery Pipeline

Complete Workflow Overview

1. Query Generation → 2. Web Search → 3. Content Extraction → 4. Data Processing → 5. Export
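The five-stage flow can be pictured as a chain of functions, each consuming the previous stage's output. This is only an illustrative sketch: every function below is a hypothetical stand-in that returns fixed data, and none of them calls OpenAI, Serper, or Firecrawl.

```python
import json


def generate_queries(sport):
    # Stand-in for the AI query generator: two fixed query shapes.
    return [f"{sport} tournament schedule", f"{sport} championship dates"]


def search(queries):
    # Stand-in for the search collector: one fake hit per query.
    return [{"query": q, "url": "https://example.org/fixtures"} for q in queries]


def extract(hits):
    # Stand-in for the content extractor: one fake record per hit.
    return [{"tournament_name": "Example Open", "source": h["url"]} for h in hits]


def process(records):
    # Keep the first record seen for each lowercased tournament name.
    unique = {}
    for record in records:
        unique.setdefault(record["tournament_name"].lower(), record)
    return list(unique.values())


def export(records):
    return json.dumps(records, indent=2)


def run_pipeline(sport):
    """Feed each stage's output into the next, in pipeline order."""
    data = sport
    for stage in (generate_queries, search, extract, process, export):
        data = stage(data)
    return data
```

Keeping each stage a plain function with one input and one output makes it easy to swap a stand-in for a real API call, or to test a single stage in isolation.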

🔍 Step-by-Step Pipeline

Step 1: AI Query Generation

  • Generates 6 smart queries per sport
  • Targets official governing bodies (ICC, FIFA, ITF, etc.)
  • Uses OpenAI GPT for enhanced query optimization
  • Covers men's and women's tournaments
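One way to arrive at 6 queries per sport is to combine a few query templates with the men's and women's categories. The sketch below assumes three templates and two categories; the template wording is invented for illustration and the GPT-based optimization step is left out entirely. The governing bodies come from the Supported Sports table.

```python
# International and national (India) bodies, as listed in Supported Sports.
GOVERNING_BODIES = {
    "Cricket": ("ICC", "BCCI"),
    "Tennis": ("ITF", "AITA"),
}

# Hypothetical templates: 3 shapes x 2 categories = 6 queries per sport.
TEMPLATES = (
    "{body} {category} {sport} tournament schedule {year}",
    "{body} {category} {sport} championship dates and venues {year}",
    "{national} {category} {sport} national tournament calendar {year}",
)
CATEGORIES = ("men's", "women's")


def generate_queries(sport, year):
    """Build one query per (template, category) pair for the given sport."""
    international, national = GOVERNING_BODIES[sport]
    return [
        template.format(
            body=international,
            national=national,
            category=category,
            sport=sport.lower(),
            year=year,
        )
        for template in TEMPLATES
        for category in CATEGORIES
    ]
```

For example, `generate_queries("Cricket", 2025)` starts with a query for the ICC men's schedule and alternates between men's and women's variants.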

Step 2: Web Search Collection

  • Retrieves 8 search results per query (48 total per sport)
  • Uses Serper API for comprehensive web search
  • Filters for official tournament sources
  • Implements rate limiting and retry logic
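Rate limiting and retry logic can take many forms; a common one is a fixed pause between requests plus exponential backoff on failure. The sketch below shows that pattern under stated assumptions: `fetch` is a hypothetical callable that takes a query and returns a list of results, and the delay values are placeholders rather than the project's settings. No Serper API call is shown.

```python
import time


def fetch_with_retry(fetch, query, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(query); after a failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return fetch(query)
        except Exception:  # broad on purpose in this sketch; narrow it in real code
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


def collect(fetch, queries, per_query=8, min_interval=0.5, sleep=time.sleep):
    """Run every query in turn, keeping at most `per_query` results from each."""
    results = []
    for index, query in enumerate(queries):
        if index:
            sleep(min_interval)  # crude rate limit between consecutive requests
        results.extend(fetch_with_retry(fetch, query, sleep=sleep)[:per_query])
    return results
```

Passing `sleep` as a parameter keeps the functions testable without real waiting.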

Step 3: Content Extraction

  • Extracts structured tournament data using Firecrawl
  • Identifies tournament names, dates, venues, levels
  • Validates data quality and relevance
  • Handles multiple content formats
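Validation of an extracted record might look like the sketch below, which turns a raw dictionary into a typed record and rejects incomplete or inconsistent ones. The field names follow the API response example in this README; the `Tournament` class and `parse_record` function are hypothetical, and the sketch assumes dates arrive in ISO format, which real extractions may not guarantee.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Tournament:
    tournament_name: str
    level: str
    start_date: date
    end_date: date
    venue: str


def parse_record(raw: dict) -> Optional[Tournament]:
    """Return a Tournament, or None when the record is incomplete or inconsistent."""
    try:
        name = raw["tournament_name"].strip()
        start = date.fromisoformat(raw["start_date"])
        end = date.fromisoformat(raw["end_date"])
    except (KeyError, AttributeError, TypeError, ValueError):
        return None  # missing field, wrong type, or unparseable date
    if not name or end < start:
        return None  # blank name, or a tournament that ends before it starts
    return Tournament(name, raw.get("level", "Unknown"), start, end, raw.get("venue", ""))
```

Returning `None` instead of raising lets the caller drop bad records and keep processing the rest of a batch.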

Step 4: Data Processing

  • Deduplicates tournaments using intelligent matching
  • Standardizes date formats and venue information
  • Applies confidence scoring
  • Filters for relevance and quality
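The README does not say how matching, date standardization, or scoring are implemented, so the following is only one plausible sketch. It assumes name-based matching after normalization, a small list of accepted date formats, and a confidence score equal to the share of key fields that are filled in. All function names and the format list are assumptions.

```python
import re
from datetime import datetime

# Assumed input formats; extend as needed.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y", "%B %d, %Y")
SCORED_FIELDS = ("tournament_name", "start_date", "end_date", "venue", "governing_body")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def match_key(name):
    """Lowercase the name, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", name.lower()).split())


def confidence(record):
    """Share of SCORED_FIELDS that have a non-empty value (0.0 to 1.0)."""
    return sum(1 for field in SCORED_FIELDS if record.get(field)) / len(SCORED_FIELDS)


def deduplicate(records):
    """For records sharing a match key, keep the one with the highest confidence."""
    best = {}
    for record in records:
        key = match_key(record["tournament_name"])
        if key not in best or confidence(record) > confidence(best[key]):
            best[key] = record
    return list(best.values())
```

Exact-key matching misses near-duplicates such as abbreviated names; a fuzzy comparison would catch more at the cost of occasional false merges.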

Step 5: Data Export

  • Exports to CSV and JSON formats
  • Generates processing statistics
  • Creates data validation reports
  • Saves to organized file structure
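As a rough illustration of the export step, the sketch below serializes a list of record dictionaries to CSV and JSON text with the standard library and computes a tiny statistics summary. The function names and the choice of statistics are assumptions; writing the text to the project's output folders is left to the caller.

```python
import csv
import io
import json


def to_csv_text(records):
    """Render records as CSV; the header is the sorted union of all keys."""
    fields = sorted({key for record in records for key in record})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, restval="")
    writer.writeheader()
    writer.writerows(records)  # missing keys are written as empty cells
    return buffer.getvalue()


def to_json_text(records):
    return json.dumps(records, indent=2, ensure_ascii=False)


def export_stats(records):
    """Minimal processing statistics (hypothetical choice of metrics)."""
    return {
        "total": len(records),
        "missing_venue": sum(1 for record in records if not record.get("venue")),
    }
```

Producing text first and writing files afterwards keeps the formatting logic independent of any particular directory layout.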

🏃 Quick Start Examples

Basic Tournament Search

# Search Cricket tournaments
curl "http://localhost:8000/search?sport=Cricket"

# Search Tennis tournaments  
curl "http://localhost:8000/search?sport=Tennis"

Comprehensive Processing

# Full pipeline for Football
uv run python run.py batch --sport Football

# Multiple sports processing
for sport in Cricket Tennis Football; do
    uv run python run.py batch --sport "$sport" --max 25
done

API Health Check

# Check system status
curl "http://localhost:8000/health"

🎯 Supported Sports

13 Sports with Official Governing Bodies

| Sport | International Body | National Body (India) | Queries Generated |
| --- | --- | --- | --- |
| 🏏 Cricket | ICC | BCCI | 6 queries |
| ⚽ Football | FIFA | AIFF | 6 queries |
| 🏸 Badminton | BWF | BAI | 6 queries |
| 🎾 Tennis | ITF | AITA | 6 queries |
| 🏃 Running | World Athletics | AFI | 6 queries |
| 🚴 Cycling | UCI | CFI | 6 queries |
| 🏊 Swimming | World Aquatics | SFI | 6 queries |
| 🏀 Basketball | FIBA | BFI | 6 queries |
| ♟️ Chess | FIDE | AICF | 6 queries |
| 🏓 Table Tennis | ITTF | TTFI | 6 queries |
| 🤼 Kabaddi | IKF | KFI | 6 queries |
| 🧘 Yoga | IYF | Ministry of AYUSH | 6 queries |
| 🏋️ Gym | IWF | IWF India | 6 queries |

Total: 78 queries across all sports


📊 API Documentation

Core Endpoints

🔍 Tournament Search

GET /search?sport={sport}&level={level}

Parameters:

  • sport (required): Sport name (Cricket, Tennis, etc.)
  • level (optional): Tournament level (International, National)
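From Python, the same request can be built with the standard library. This is a minimal client sketch: the base URL reuses the port from the curl examples, the function names are hypothetical, and the server is assumed to return JSON as in the response example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # port used in the curl examples


def search_url(sport, level=None, base_url=BASE_URL):
    """Build the /search URL; `level` is added only when provided."""
    params = {"sport": sport}
    if level:
        params["level"] = level
    return f"{base_url}/search?{urlencode(params)}"


def search(sport, level=None, timeout=30):
    """Call the running API server and decode its JSON response."""
    with urlopen(search_url(sport, level), timeout=timeout) as response:
        return json.load(response)
```

`urlencode` takes care of escaping, so a sport name with a space such as "Table Tennis" is sent as `sport=Table+Tennis`.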

Response Example:

{
  "success": true,
  "sport": "Cricket",
  "count": 25,
  "tournaments": [
    {
      "tournament_name": "ICC World Cup 2025",
      "level": "International", 
      "start_date": "2025-10-01",
      "end_date": "2025-11-15",
      "venue": "India",
      "governing_body": "ICC",
      "registration_url": "https://icc-cricket.com/worldcup2025"
    }
  ]
}
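A client would typically filter and sort the returned list. The sketch below assumes the response shape shown in the example (a `success` flag, a `tournaments` array, ISO-formatted `start_date` values) and keeps only tournaments that start on or after a given day; `upcoming` is a hypothetical helper name.

```python
from datetime import date


def upcoming(payload, today):
    """Return tournaments starting on or after `today`, earliest first."""
    if not payload.get("success"):
        return []
    rows = [
        t for t in payload.get("tournaments", [])
        if date.fromisoformat(t["start_date"]) >= today
    ]
    # ISO dates sort correctly as plain strings.
    return sorted(rows, key=lambda t: t["start_date"])
```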

📊 System Health

GET /health

📋 Supported Sports

GET /sports

🗂️ Project Structure

tournament_system/
├── 🔧 api/                     # Flask API Backend
│   ├── routes/                 # API route handlers
│   ├── services/               # Business logic services
│   └── utils/                  # API utilities
├── 🧠 core/                    # Core Processing Engine
│   ├── query_generator.py      # AI-powered query generation
│   ├── search_collector.py     # Web search collection
│   ├── content_extractor.py    # Tournament data extraction
│   └── data_processor.py       # Data processing & validation
├── 📤 exporters/               # Data Export Modules
├── 🗃️ database/                # Database integration
├── 🛠️ utils/                   # Shared utilities
└── 📁 final_output/            # Generated tournament data

frontend/                       # Web Interface
├── index.html                  # Tournament search interface
└── assets/                     # CSS, JS, images

🔧 Configuration Files
├── pyproject.toml              # UV package configuration
├── uv.lock                     # Dependency lock file
├── .env.example                # Environment template
└── run.py                      # Main entry point

🧪 Testing

Run Test Suite

# Run all tests
uv run python -m pytest

# Test specific components
uv run python -m pytest tests/test_query_generator.py

Manual Testing

# Test query generation
uv run python test_query_generator.py

# Test API endpoints
curl -X GET "http://localhost:8000/health"
curl -X GET "http://localhost:8000/search?sport=Cricket"

🚀 Performance & Scaling

Current Capacity

  • 78 queries across 13 sports
  • 624 search results per full run (78 × 8)
  • ~100-300 tournaments discovered per sport
  • Processing time: 5-15 minutes per sport
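The capacity figures follow directly from the per-sport settings, as this short calculation shows. The full-run time range is an extrapolation that assumes sports are processed one after another, which the README does not state.

```python
SPORTS = 13
QUERIES_PER_SPORT = 6
RESULTS_PER_QUERY = 8
MINUTES_PER_SPORT = (5, 15)  # processing time range quoted above

total_queries = SPORTS * QUERIES_PER_SPORT            # 13 x 6
results_per_sport = QUERIES_PER_SPORT * RESULTS_PER_QUERY  # 6 x 8
total_results = total_queries * RESULTS_PER_QUERY      # 78 x 8

# Assumption: sports run sequentially, so per-sport times simply add up.
full_run_minutes = tuple(SPORTS * minutes for minutes in MINUTES_PER_SPORT)

print(total_queries, results_per_sport, total_results, full_run_minutes)
```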

Optimization Features

  • ⚡ UV Package Manager - 2x faster dependency resolution
  • 🔄 Batch Processing - Efficient bulk operations
  • 🎯 Smart Filtering - Reduces irrelevant results by 70%
  • 💾 Caching - Avoids duplicate API calls
  • 🔀 Parallel Processing - Multi-threaded extraction
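Caching and multi-threaded extraction can be combined in a few lines. The sketch below is an assumption about how such a step could look, not the project's implementation: `extract` is a hypothetical callable that takes a URL and returns extracted data, the cache is a plain in-memory dictionary, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(urls, extract, cache=None, workers=4):
    """Extract each distinct uncached URL once, in parallel; return results in input order."""
    cache = {} if cache is None else cache
    # dict.fromkeys removes duplicates while preserving order.
    pending = [url for url in dict.fromkeys(urls) if url not in cache]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for url, result in zip(pending, pool.map(extract, pending)):
            cache[url] = result
    return [cache[url] for url in urls]
```

Passing the same `cache` dictionary across calls avoids repeating an extraction for a URL that was already processed earlier in the run.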

🛠️ Development

Adding New Sports

# In tournament_system/core/query_generator.py
self.governing_bodies["New Sport"] = {
    "international": "International Body",
    "national": "National Body", 
    "website": "official-website.com"
}

Customizing Queries

# Modify query templates
self.official_query_templates = [
    'your custom query template here',
    # Add more templates...
]

Contributing

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/new-feature
  3. Commit changes: git commit -m 'Add new feature'
  4. Push to branch: git push origin feature/new-feature
  5. Submit Pull Request

🔧 Troubleshooting

Common Issues

UV Installation Issues

# Reinstall UV
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc  # or restart terminal

API Key Issues

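# Verify that the .env file exists and lists all three keys
cat .env
# Check that a key is visible to Python (prints True or False, not the key itself).
# uv run reads .env only when --env-file is passed; this confirms presence, not validity.
uv run --env-file .env python -c "import os; print(bool(os.getenv('SERPER_API_KEY')))"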

Port Already in Use

# Use different port
uv run python run.py server --port 8001

Import Errors

# Reinstall dependencies
uv sync --reinstall

🤝 Credits & Acknowledgments

👨‍💻 Developer

Arjun - Project Creator & Lead Developer

🛠️ Technologies Used

  • Python 3.12+ - Core programming language
  • UV Package Manager - Modern dependency management
  • Flask 3.0+ - Web framework and API development
  • OpenAI GPT-3.5 - AI-powered query generation
  • Serper API - Web search functionality
  • Firecrawl API - Content extraction and processing

🏆 Sports Organizations

Special recognition to the governing bodies that make tournament data accessible:

  • ICC (International Cricket Council)
  • FIFA (Fédération Internationale de Football Association)
  • ITF (International Tennis Federation)
  • BWF (Badminton World Federation)
  • And all other international sports federations

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🚀 What's Next?

Planned Features

  • 🌍 Multi-language Support - Tournament discovery in multiple languages
  • 📱 Mobile App - React Native mobile application
  • 🔗 Calendar Integration - Google Calendar, Outlook sync
  • 🤖 Advanced AI - GPT-4 integration for better query generation
  • 📊 Analytics Dashboard - Tournament trends and statistics
  • 🔔 Real-time Notifications - Tournament deadline alerts

📞 Support

Quick Commands Reference

# Start development servers
uv run python run.py dev

# Run batch processing
uv run python run.py batch --sport Cricket

# Check system health  
curl http://localhost:8000/health

# Run tests
uv run python -m pytest

🏆 Built with ❤️ by Arjun | Powering the future of tournament discovery

About

An AI-powered pipeline to discover and structure sports tournament data using Serper and FireCrawl. This branch is managed with the uv package manager.
