Get up and running with Clipz in 5 minutes! This guide covers installation, first run, and basic usage.
📖 For detailed documentation, see the full README
- Python 3.11 installed
- FFmpeg installed (download from ffmpeg.org)
- OpenRouter API key (get a free key at openrouter.ai)
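To confirm the prerequisites are in place before installing, a quick environment check can be run from Python (a minimal sketch; it only inspects your local setup and makes no Clipz-specific assumptions):

```python
import os
import shutil
import sys

# Python 3.11+ is required
ok = sys.version_info >= (3, 11)
print(f"Python {sys.version.split()[0]}: {'OK' if ok else 'upgrade needed'}")

# FFmpeg must be discoverable on the system PATH
print(f"FFmpeg: {shutil.which('ffmpeg') or 'NOT FOUND - install and add to PATH'}")

# The OpenRouter key is normally read from .env, but an exported variable works too
print(f"OPENROUTER_API_KEY set: {bool(os.environ.get('OPENROUTER_API_KEY'))}")
```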
```bash
git clone https://github.com/Jit-Roy/Clipz.git
cd Clipz

# Create virtual environment
python -m venv venv

# Activate it
# Windows:
venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

This will download all required packages, including PyTorch, Whisper, YOLO, and CLIP models.
Create a `.env` file in the project root:

```
OPENROUTER_API_KEY=your_api_key_here
```

💡 **Tip:** Copy from `.env.example` if available.
```bash
python main.py path/to/your/video.mp4
```

What happens:
- ✅ Extracts audio from video
- ✅ Analyzes audio for excitement (loudness, rhythm, laughter, etc.)
- ✅ Analyzes video for visual interest (motion, faces, composition)
- ✅ Transcribes speech with timestamps
- ✅ Uses AI (GPT-4o-mini) to intelligently select and merge clips
- ✅ Exports clips to `output/clips_<timestamp>/`
First run takes longer (downloads models, no cache). Subsequent runs are much faster!
```bash
# Natural-language query
python main.py video.mp4 --query "give me 5 funny clips"

# Emphasize audio (good for podcasts)
python main.py podcast.mp4 --audio-weight 0.7 --video-weight 0.3

# Emphasize video (good for sports)
python main.py sports.mp4 --audio-weight 0.3 --video-weight 0.7

# Constrain clip length
python main.py video.mp4 --min-duration 5 --max-duration 15 --query "give me 10 viral moments"

# Sample fewer frames per second
python main.py video.mp4 --fps 1
```

Lower FPS = faster analysis (less accurate).
If you want to integrate Clipz into your own Python scripts:

```python
from main import ViralClipExtractor

# Initialize
extractor = ViralClipExtractor(
    audio_weight=0.5,
    video_weight=0.5,
    output_dir="output"
)

# Process video
results = extractor.process(
    video_path="video.mp4",
    user_query="give me 10 interesting clips",
    export=True
)

# Access results
for clip in results["clips"]:
    print(f"⏱️ {clip['start']:.1f}s - {clip['end']:.1f}s")
    print(f"📝 {clip['transcript'][:100]}...")
    print(f"⭐ Score: {clip['llm_interest_score']}/10")
    print(f"💡 {clip['reason']}")
    print()
```

📖 For full API documentation, see README.md
After processing, your files are organized as follows:

📁 `output/clips_<timestamp>/` - your exported video clips (ready to upload!)

```
clip_001.mp4
clip_002.mp4
clip_003.mp4
...
```
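Since each run writes to a new timestamped folder, a small helper can locate the most recent export (a sketch; it assumes the default `output/` directory and the `clips_<timestamp>` naming shown above):

```python
from pathlib import Path

# Each run exports to output/clips_<timestamp>/ - pick the newest one
output_root = Path("output")
folders = sorted(output_root.glob("clips_*"), key=lambda p: p.stat().st_mtime)
if folders:
    latest = folders[-1]
    print(f"Latest export: {latest}")
    for clip in sorted(latest.glob("clip_*.mp4")):
        print(f"  {clip.name}")
else:
    print("No exports yet - run main.py first")
```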
📁 `.cache/` - hidden folder with metadata

JSON files for each clip with details:

```json
{
  "clip_number": 1,
  "start_time": 45.2,
  "end_time": 58.7,
  "duration": 13.5,
  "transcript": "...",
  "interest_score": 9.5,
  "reason": "Emotional storytelling with dramatic pause",
  "tags": ["emotional", "dramatic"]
}
```

It also holds a complete analysis report for the entire video and cached features for faster re-processing.

💡 **Tip:** The `.cache/` folder makes re-runs much faster! Delete it to force fresh analysis.
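The per-clip JSON files can also be inspected programmatically, e.g. to review why each clip was selected (a sketch assuming the metadata lives in `.cache/metadata/` and follows the schema shown above):

```python
import json
from pathlib import Path

# Per-clip metadata written during export (assumed location: .cache/metadata/)
for meta_file in sorted(Path(".cache/metadata").glob("*.json")):
    clip = json.loads(meta_file.read_text())
    print(f"Clip {clip['clip_number']}: "
          f"{clip['start_time']:.1f}s-{clip['end_time']:.1f}s "
          f"(score {clip['interest_score']}/10)")
    print(f"  {clip['reason']}")
```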
| Problem | Solution |
|---|---|
| Models downloading on first run | Normal! YOLO (~6MB) and CLIP models download automatically. |
| FFmpeg not found | Install FFmpeg and add it to your system PATH. |
| OpenRouter API error | Verify your `.env` file has a valid `OPENROUTER_API_KEY`. |
| Out of memory | Use `--fps 1` or process shorter videos. |
| Slow processing | First run is slow (downloads models). Use caching for re-runs. |
Now that you've run your first extraction, explore more:
- 📖 Full Documentation - API reference, architecture, advanced features
- 🤝 Contributing Guide - Help improve Clipz
- 🔬 Test Individual Modules:
```bash
python Audio/audio.py audio.wav               # Test audio analysis
python video/video.py video.mp4               # Test video analysis
python Transcription/transcribe.py audio.wav  # Test transcription
```
✅ First run is slow - Models download, no cache (~10-15 min for 10-min video)
✅ Re-runs are fast - Cached features make it 3-5x faster
✅ GPU recommended - Install CUDA PyTorch for 2-3x speedup
✅ Adjust weights - Podcasts need high audio weight, sports need high video weight
✅ Natural language queries - "Extract emotional moments", "Give me exciting gameplay", etc.
✅ Check .cache/metadata/ - See why each clip was selected
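To confirm whether the CUDA build of PyTorch is active before relying on the GPU speedup, a one-liner check works (standard PyTorch calls, not Clipz-specific):

```python
import torch

# True only when a CUDA-enabled PyTorch build can see a GPU
if torch.cuda.is_available():
    print(f"GPU detected: {torch.cuda.get_device_name(0)}")
else:
    print("No CUDA GPU - running on CPU (expect slower processing)")
```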
Happy clipping!
Questions? Check the README or open an issue on GitHub.