feat: async engine with aiohttp for ~2x performance improvement #2798

Open
matthew6s wants to merge 1 commit into sherlock-project:master from matthew6s:feat/async-engine
@matthew6s
Closes #2797

Summary

Adds an async engine using aiohttp as a drop-in replacement for the synchronous requests-futures ThreadPoolExecutor approach.
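As a rough illustration of the approach described above, the sketch below fans out one coroutine per site and caps the number in flight with a semaphore. It is not the code in `async_engine.py`: `check_sites`, `fake_fetch`, and the example URLs are hypothetical names, and the network call is simulated with `asyncio.sleep` so the snippet runs without aiohttp installed. In a real engine the `fetch` callable would presumably wrap an aiohttp session request.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Tuple


async def check_sites(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[int]],
    max_workers: int = 100,
) -> Dict[str, int]:
    """Run `fetch` for every URL with at most `max_workers` requests in flight."""
    semaphore = asyncio.Semaphore(max_workers)

    async def bounded(url: str) -> Tuple[str, int]:
        # Each task waits here until one of the `max_workers` slots is free.
        async with semaphore:
            return url, await fetch(url)

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    return dict(pairs)


async def fake_fetch(url: str) -> int:
    """Hypothetical stand-in for an HTTP request; returns a status code."""
    await asyncio.sleep(0.01)
    return 200 if "found" in url else 404


if __name__ == "__main__":
    demo_urls = ["https://a.example/found", "https://b.example/missing"]
    print(asyncio.run(check_sites(demo_urls, fake_fetch, max_workers=2)))
```

Because all requests share one event loop, a single slow site only holds one semaphore slot rather than a whole thread, which is the usual argument for this pattern over a thread pool.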

Benchmark Results (real-world, 478 sites)

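| Engine         | Timeout | Time  | Results Found |
|----------------|---------|-------|---------------|
| Sync (current) | 15s     | 39.0s | 13            |
| Async (new)    | 15s     | 21.4s | 14            |
| Sync (current) | 60s     | 65.8s | 13            |
| Async (new)    | 60s     | 63.6s | 14            |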

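With the 15s timeout the async engine is ~1.8x faster (21.4s vs. 39.0s). With the 60s timeout the two engines are nearly equal (63.6s vs. 65.8s, about 3% faster). In both runs the async engine found one additional result that the sync engine missed (likely due to more efficient connection handling).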
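For reference, the speedup per timeout setting can be recomputed directly from the benchmark timings above. The snippet is only arithmetic on the reported numbers; the variable names are illustrative.

```python
# Wall-clock times in seconds from the benchmark: timeout -> (sync, async).
timings = {15: (39.0, 21.4), 60: (65.8, 63.6)}

# Speedup is sync time divided by async time for the same timeout.
speedups = {
    timeout: sync_s / async_s for timeout, (sync_s, async_s) in timings.items()
}

for timeout, factor in speedups.items():
    print(f"{timeout}s timeout: {factor:.2f}x")
# 15s timeout: 1.82x
# 60s timeout: 1.03x
```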

Changes

| File                             | Change                           |
|----------------------------------|----------------------------------|
| sherlock_project/async_engine.py | New: async engine module         |
| sherlock_project/sherlock.py     | Import + CLI flags + call routing |
| pyproject.toml                   | Add aiohttp dependency           |

New CLI Flags

  • --workers N, -w N — max concurrent requests (default: 100)
  • --sync — use legacy synchronous engine
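A minimal sketch of how the two flags might be declared and used to pick an engine, assuming an `argparse`-based CLI. `build_parser` and `pick_engine` are hypothetical helpers written for this example; the actual wiring in `sherlock.py` may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare only the two new flags; the existing flags are omitted here."""
    parser = argparse.ArgumentParser(prog="sherlock")
    parser.add_argument(
        "--workers", "-w",
        type=int,
        default=100,
        metavar="N",
        help="max concurrent requests (default: 100)",
    )
    parser.add_argument(
        "--sync",
        action="store_true",
        help="use legacy synchronous engine",
    )
    return parser


def pick_engine(args: argparse.Namespace) -> str:
    """Async is the default; --sync opts back into the legacy engine."""
    return "sync" if args.sync else "async"


if __name__ == "__main__":
    args = build_parser().parse_args(["-w", "20"])
    print(pick_engine(args), args.workers)
```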

Backwards Compatibility

  • Default behavior switches to async
  • --sync preserves the existing behavior exactly
  • Return value is identical (same dict structure, same QueryResult objects)
  • All existing CLI flags work unchanged

New Dependency

  • aiohttp ^3.9.0


Development

Successfully merging this pull request may close these issues.

feat: async engine with aiohttp for 3-5x performance improvement