A Python library for formal regular expression manipulation using Brzozowski derivatives.
Unlike Python's re module, which treats regexes as opaque matchers, revex treats
them as first-class mathematical objects that support intersection, complement,
equivalence checking, and uniform random string generation.
import revex
# Check if two regexes match the same language
revex.equivalent(r'(ab)*', r'(ab)*(ab)*') # True
# Check if two regexes can match any common string
revex.intersects(r'[a-m]+', r'[g-z]+') # True
# Find a concrete string matching both regexes
revex.find_example(r'[a-m]+', r'[g-z]+') # e.g. 'g'
# Find a string matching one regex but not the other
revex.find_difference(r'[a-z]+', r'admin.*') # e.g. 'a'
# Check if one regex's language is a subset of another
revex.is_subset(r'aaa', r'a+') # True
# Generate a random string of exactly length N
revex.sample(r'[a-z]{3,8}', 5) # e.g. 'qxmtf'
# Compute the set difference of two regex languages
r = revex.subtract(r'[a-z]+', r'admin.*')
r.match('user') # True
r.match('admin') # False- Security: Find inputs that bypass frontend validation but reach the backend. Detect overlapping firewall rules. Verify PII redaction coverage.
- Testing: Generate strings of specific lengths uniformly at random. Create negative test cases from regex complements. Test regex intersections.
- Static analysis: Detect overlapping URL routes in web frameworks. Find dead regex branches. Validate schema constraints.
- Formal verification: Check regex equivalence after refactoring. Verify protocol message format coverage.
See the full documentation and ROADMAP.md for more.
pip install revexRequires Python 3.10+. Dependencies: networkx, numpy.
- Parse the regex using Python's own
re._parserinto an AST. - Compile the AST into
RegularExpressionobjects supporting the Brzozowski derivative. - Construct a DFA by iteratively computing derivatives with respect to each character in the alphabet.
- Analyze the DFA: minimization, equivalence checking, language finiteness, longest string computation.
- Generate strings uniformly at random using path weights (Bernardi & Giménez 2012).
The key insight is that intersection and complement have trivial derivative rules, making operations that are exponentially expensive with NFA-based approaches simple and direct.
pip install -e ".[docs]"
sphinx-build -b html docs docs/_build/html
# Open docs/_build/html/index.htmlRequires Graphviz for diagrams: apt install graphviz / brew install graphviz.
pip install -e ".[dev]"
pytest revex/tests/Apache 2.0