Performance Tips

Optimize fuzzy-regex for your use case.

Pattern Design

1. Use Specific Edit Limits

fn main() {
    // Good: Specific limit
    let _ = fuzzy_regex::FuzzyRegex::new("(?:hello){e<=1}").unwrap();

    // Less efficient: Higher limit
    let _ = fuzzy_regex::FuzzyRegex::new("(?:hello){e<=5}").unwrap();
    
    println!("Done");
}

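Lower edit limits mean faster matching: every additional allowed edit widens the set of candidate alignments the matcher has to consider.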
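To make the cost of a higher limit concrete, here is a minimal, generic sketch of a bounded edit-distance check. It is not fuzzy-regex's implementation; `bounded_edit_distance` is a hypothetical helper that fills only a diagonal band of the dynamic-programming table and reports how many cells it evaluated. The band is `2k + 1` cells wide, so the work grows with the limit `k`.

```rust
/// Generic banded Levenshtein check (illustration only, not fuzzy-regex internals).
/// Returns the distance if it is at most `k`, plus the number of table cells evaluated.
fn bounded_edit_distance(a: &str, b: &str, k: usize) -> (Option<usize>, usize) {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (n, m) = (a.len(), b.len());
    // A length difference larger than k can never be repaired with k edits.
    if n.abs_diff(m) > k {
        return (None, 0);
    }
    const INF: usize = usize::MAX / 2;
    let mut prev = vec![INF; m + 1];
    for j in 0..=m.min(k) {
        prev[j] = j;
    }
    let mut cells = 0;
    for i in 1..=n {
        let mut cur = vec![INF; m + 1];
        // Only columns within k of the diagonal can hold a distance <= k.
        let lo = i.saturating_sub(k);
        let hi = (i + k).min(m);
        for j in lo..=hi {
            cells += 1;
            cur[j] = if j == 0 {
                i
            } else {
                let cost = usize::from(a[i - 1] != b[j - 1]);
                (prev[j - 1] + cost).min(prev[j] + 1).min(cur[j - 1] + 1)
            };
        }
        prev = cur;
    }
    let d = prev[m];
    (if d <= k { Some(d) } else { None }, cells)
}

fn main() {
    for k in [1, 5] {
        let (dist, cells) = bounded_edit_distance("hello", "hallo", k);
        println!("k = {k}: distance = {dist:?}, cells evaluated = {cells}");
    }
}
```

The same pair of strings costs more cells at `k = 5` than at `k = 1`, even though the reported distance is identical.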

2. Prefer Shorter Patterns

fn main() {
    // Bitap (fast): ≤64 chars
    let _ = fuzzy_regex::FuzzyRegex::new("(?:short){e<=1}").unwrap();

    // NFA (slower): >64 chars (the pattern below is 76 chars long)
    let _ = fuzzy_regex::FuzzyRegex::new("(?:this_pattern_is_deliberately_made_longer_than_sixty_four_characters_in_total){e<=1}").unwrap();
    
    println!("Done");
}
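The 64-character boundary follows from how Bitap-style matchers typically work: they keep one bit of state per pattern position in a single machine word. The sketch below is a textbook Wu-Manber style Bitap over bytes, written under that assumption. It is not the crate's code, and `bitap_find_end` is a hypothetical name.

```rust
/// Textbook Bitap (shift-and) with up to `k` errors; illustration only.
/// Returns the end offset (exclusive) of the first approximate match,
/// or None if there is no match or the pattern does not fit in a u64.
fn bitap_find_end(pattern: &[u8], text: &[u8], k: usize) -> Option<usize> {
    let m = pattern.len();
    if m == 0 || m > 64 {
        return None; // one bit per pattern byte: a real engine would fall back here
    }
    if k >= m {
        return Some(0); // deleting the whole pattern is already within budget
    }
    // masks[b] has bit i set when pattern[i] == b.
    let mut masks = [0u64; 256];
    for (i, &b) in pattern.iter().enumerate() {
        masks[b as usize] |= 1u64 << i;
    }
    let accept = 1u64 << (m - 1);
    // state[d] bit i: pattern[0..=i] matches a text suffix with at most d errors.
    let mut state: Vec<u64> = (0..=k).map(|d| (1u64 << d) - 1).collect();
    for (pos, &b) in text.iter().enumerate() {
        let mask = masks[b as usize];
        let mut prev_old = state[0];
        state[0] = ((state[0] << 1) | 1) & mask;
        for d in 1..=k {
            let old = state[d];
            state[d] = (((old << 1) | 1) & mask) // exact step
                | prev_old                       // extra text byte
                | (prev_old << 1)                // substituted byte
                | (state[d - 1] << 1)            // missing pattern byte
                | 1;
            prev_old = old;
        }
        if state[k] & accept != 0 {
            return Some(pos + 1);
        }
    }
    None
}

fn main() {
    println!("{:?}", bitap_find_end(b"hello", b"say hallo there", 1));
    println!("{:?}", bitap_find_end(&[b'a'; 65], b"aaa", 1));
}
```

Each text byte costs a handful of word-sized operations per error level, which is why patterns that fit in one word are cheap and longer ones need a different strategy.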

3. Extract Exact Parts

fn main() {
    // Good: Exact prefix and suffix help prefilter
    let _ = fuzzy_regex::FuzzyRegex::new("exact_prefix (?:fuzzy){e<=1} exact_suffix").unwrap();

    // Slower: Entirely fuzzy
    let _ = fuzzy_regex::FuzzyRegex::new("(?:entirely_fuzzy){e<=1}").unwrap();
    
    println!("Done");
}
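A prefilter pays off because a fast literal search can rule out most positions before the expensive fuzzy check runs. The sketch below shows the idea with the standard library only. `find_with_prefilter` and `starts_with_one_substitution` are hypothetical stand-ins, not fuzzy-regex functions, and the sketch only considers non-overlapping occurrences of the literal.

```rust
/// Runs `verify` only where the exact `literal` occurs (illustration only).
/// Returns the start of the first verified match and how often `verify` ran.
fn find_with_prefilter(
    text: &str,
    literal: &str,
    verify: impl Fn(&str) -> bool,
) -> (Option<usize>, usize) {
    let mut calls = 0;
    for (start, _) in text.match_indices(literal) {
        calls += 1;
        // Hand only the text after the literal to the expensive check.
        if verify(&text[start + literal.len()..]) {
            return (Some(start), calls);
        }
    }
    (None, calls)
}

/// Stand-in for a fuzzy matcher: `rest` starts with `word`, allowing one substituted byte.
fn starts_with_one_substitution(rest: &str, word: &str) -> bool {
    let (r, w) = (rest.as_bytes(), word.as_bytes());
    r.len() >= w.len() && r.iter().zip(w).filter(|(a, b)| a != b).count() <= 1
}

fn main() {
    let text = "noise noise id: fuzxy end";
    let (hit, calls) =
        find_with_prefilter(text, "id: ", |rest| starts_with_one_substitution(rest, "fuzzy"));
    println!("match at {hit:?} after {calls} verification call(s)");
}
```

Without the literal, every position in the text would be a candidate for the fuzzy check.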

4. Use Greedy Suffix Patterns

fn main() {
    // Good: .*SUFFIX is optimized with reverse search
    let _ = fuzzy_regex::FuzzyRegex::new(".*test").unwrap();
    let _ = fuzzy_regex::FuzzyRegex::new(".*test~1").unwrap();
    
    // Also works with anchors
    let _ = fuzzy_regex::FuzzyRegex::new("^.*test$").unwrap();
    
    println!("Done");
}

Patterns like .*test automatically use reverse search to find the suffix first, then match everything before it. This is O(n) instead of O(n²).
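The reverse-search idea can be sketched with the standard library for the exact case. This is an illustration of the technique, not the crate's implementation: `greedy_suffix_match` is a hypothetical helper, and it assumes a single-line text so that `.` can match every character.

```rust
/// Exact-match model of `.*SUFFIX` on a single line (illustration only).
/// Greedy `.*` ends the match at the last occurrence of the suffix,
/// so one backward search finds the end and the match starts at offset 0.
fn greedy_suffix_match<'a>(text: &'a str, suffix: &str) -> Option<&'a str> {
    text.rfind(suffix).map(|start| &text[..start + suffix.len()])
}

fn main() {
    let text = "a test, another test, done";
    println!("{:?}", greedy_suffix_match(text, "test"));
    println!("{:?}", greedy_suffix_match(text, "missing"));
}
```

A forward scan would instead have to try candidate end positions for `.*` one by one, which is where the quadratic behavior comes from.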

Builder Options

1. Set Similarity Threshold

fn main() {
    use fuzzy_regex::FuzzyRegexBuilder;

    // Skip low-quality matches early
    let _ = FuzzyRegexBuilder::new("(?:hello){e<=2}")
        .similarity(0.8)
        .build();
    
    println!("Done");
}

2. Use Case Insensitive at Builder

fn main() {
    use fuzzy_regex::FuzzyRegexBuilder;

    // More efficient than inline (?i)
    let _ = FuzzyRegexBuilder::new("(?:hello)")
        .case_insensitive(true)
        .build();
    
    println!("Done");
}

API Usage

1. Use Streaming for Large Data

fn main() {
    use fuzzy_regex::FuzzyRegex;

    let re = FuzzyRegex::new("(?:hello){e<=1}").unwrap();
    
    // Good: Process in chunks
    let mut stream = re.stream();
    let data = b"hello world";
    for chunk in data.chunks(8) {
        // Process chunk
    }

    // Bad: Load all into memory
    let large_text = "hello world";
    let _matches: Vec<_> = re.find_iter(&large_text).collect();
}
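Chunked processing keeps memory bounded, but a match may straddle two chunks. The sketch below shows one common way to handle that for exact byte matching: carry the last `needle.len() - 1` bytes over to the next chunk. It is a generic illustration, not the crate's streaming API, and `ChunkSearcher` and `feed` are hypothetical names.

```rust
/// Exact search across chunk boundaries (illustration only).
struct ChunkSearcher<'n> {
    needle: &'n [u8],
    tail: Vec<u8>,   // last needle.len() - 1 bytes seen so far
    consumed: usize, // total number of bytes fed so far
}

impl<'n> ChunkSearcher<'n> {
    fn new(needle: &'n [u8]) -> Self {
        Self { needle, tail: Vec::new(), consumed: 0 }
    }

    /// Feeds one chunk and returns absolute start offsets of matches completed by it.
    fn feed(&mut self, chunk: &[u8]) -> Vec<usize> {
        let n = self.needle.len();
        if n == 0 {
            self.consumed += chunk.len();
            return Vec::new();
        }
        // Absolute offset of the first byte in the working buffer.
        let base = self.consumed - self.tail.len();
        let mut buf = std::mem::take(&mut self.tail);
        buf.extend_from_slice(chunk);
        let hits: Vec<usize> = buf
            .windows(n)
            .enumerate()
            .filter(|(_, w)| *w == self.needle)
            .map(|(i, _)| base + i)
            .collect();
        // Keep just enough bytes to complete a match that starts in this chunk.
        let keep = (n - 1).min(buf.len());
        self.tail = buf[buf.len() - keep..].to_vec();
        self.consumed += chunk.len();
        hits
    }
}

fn main() {
    let mut searcher = ChunkSearcher::new(b"hello");
    for chunk in b"say hello world hello".chunks(8) {
        println!("{:?}", searcher.feed(chunk));
    }
}
```

At any time the searcher holds one chunk plus a short tail, never the whole input. A fuzzy matcher would need a longer carry-over, since a match may be longer than the pattern.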

2. Use find() for First Match

fn main() {
    use fuzzy_regex::FuzzyRegex;

    let re = FuzzyRegex::new("(?:hello){e<=1}").unwrap();
    let text = "hello world";

    // Good: Stop after first match
    if let Some(m) = re.find(text) {
        println!("Found: {}", m.as_str());
    }

    // Unnecessary: Find all when only first needed
    let _all: Vec<_> = re.find_iter(text).collect();
}
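The difference comes from iterator laziness: asking for one item stops the search, while collecting drives it to the end of the text. The sketch below demonstrates this with the standard library's `str::match_indices` as a stand-in for a regex iterator. `matches_produced` is a hypothetical helper, and it is an assumption that the crate's iterator is lazy in the same way.

```rust
/// Counts how many matches the iterator actually had to produce (illustration only).
fn matches_produced(text: &str, needle: &str, first_only: bool) -> usize {
    let mut produced = 0;
    {
        let mut it = text.match_indices(needle).inspect(|_| produced += 1);
        if first_only {
            let _first = it.next(); // stops after the first match
        } else {
            let _all: Vec<_> = it.collect(); // walks the whole text
        }
    }
    produced
}

fn main() {
    let text = "ab ".repeat(1000);
    println!("first only: {}", matches_produced(&text, "ab", true));
    println!("collect all: {}", matches_produced(&text, "ab", false));
}
```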

3. Check supports_streaming()

fn main() {
    use fuzzy_regex::FuzzyRegex;

    let re = FuzzyRegex::new("(?:hello){e<=1}").unwrap();
    
    if re.supports_streaming() {
        // Use streaming API for best performance
        let mut stream = re.stream();
        println!("Streaming supported");
    }
}

Build Configuration

1. Release Mode

cargo build --release

2. LTO

[profile.release]
lto = true
codegen-units = 1

3. SIMD

Enabled by default. Ensure target CPU supports it.
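One way to check what the current CPU offers is the standard library's runtime feature detection. The sketch below is only a diagnostic aid: which instruction sets fuzzy-regex actually uses is not stated here, so the feature names `sse2`, `avx2`, and `neon` are assumptions chosen as common examples, and `detected_simd_features` is a hypothetical helper.

```rust
/// Lists a few common SIMD features available on the running CPU.
/// The feature names are examples, not a statement about what fuzzy-regex requires.
fn detected_simd_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if std::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
    }
    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("neon");
        }
    }
    found
}

fn main() {
    println!("SIMD features detected: {:?}", detected_simd_features());
}
```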

Common Pitfalls

| Issue                | Solution         |
|----------------------|------------------|
| Slow with high edits | Lower edit limit |
| High memory usage    | Use streaming    |
| Slow on long text    | Use exact prefix |
| Slow compilation     | Enable LTO       |

Pathological Patterns

Some regex patterns can cause O(n²) behavior in naive implementations:

fn main() {
    // Pattern: .*a|b on a text consisting only of 'b's
    // Each 'b' is its own match, so there are O(n) matches
    // Naive: O(n) matches × O(n) scan per match = O(n²)
}

When It Happens

  • Alternation with wildcards: .*a|b, (a|b)+
  • Overlapping matches: Many ways to match the same text
  • Backtracking patterns: Complex alternation
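The quadratic cost can be made visible with a toy model of `.*a|b` that counts scan steps. This is a hand-written illustration, not the crate's engine and not a description of how its hardened mode works. `naive_find_all`, `linear_find_all`, and `emit_match` are hypothetical names. The naive version rescans the rest of the text for an `a` at every start position, while the linear version looks up the last `a` once.

```rust
/// Emits the leftmost-first match of `.*a|b` at `start` and returns the next start.
fn emit_match(text: &[u8], start: usize, last_a: Option<usize>, out: &mut Vec<(usize, usize)>) -> usize {
    match last_a {
        // `.*a` wins: greedy `.*` runs to the last 'a' at or after `start`.
        Some(i) if i >= start => {
            out.push((start, i + 1));
            i + 1
        }
        // Otherwise fall back to the single-byte alternative `b`.
        _ if text[start] == b'b' => {
            out.push((start, start + 1));
            start + 1
        }
        _ => start + 1,
    }
}

/// Rescans the remainder of the text at every start position.
fn naive_find_all(text: &[u8]) -> (Vec<(usize, usize)>, usize) {
    let mut matches = Vec::new();
    let mut steps = 0;
    let mut start = 0;
    while start < text.len() {
        let mut last_a = None;
        for i in start..text.len() {
            steps += 1;
            if text[i] == b'a' {
                last_a = Some(i);
            }
        }
        start = emit_match(text, start, last_a, &mut matches);
    }
    (matches, steps)
}

/// Finds the last 'a' once, then decides each start position in constant time.
fn linear_find_all(text: &[u8]) -> (Vec<(usize, usize)>, usize) {
    let last_a = text.iter().rposition(|&c| c == b'a');
    let mut steps = text.len(); // the one-time backward pass
    let mut matches = Vec::new();
    let mut start = 0;
    while start < text.len() {
        steps += 1;
        start = emit_match(text, start, last_a, &mut matches);
    }
    (matches, steps)
}

fn main() {
    let text = vec![b'b'; 1000];
    let (naive, naive_steps) = naive_find_all(&text);
    let (linear, linear_steps) = linear_find_all(&text);
    println!("same matches: {}", naive == linear);
    println!("naive steps: {naive_steps}, linear steps: {linear_steps}");
}
```

Both functions report the same matches. Only the amount of scanning differs.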

Solution: Hardened Mode

Use find_all_hardened() for O(n) guaranteed performance:

fn main() {
    use fuzzy_regex::FuzzyRegex;

    let re = FuzzyRegex::new(".*a|b").unwrap();
    let text = "bbbbbbbbbbbbbbbb";

    // Hardened mode: O(n) guaranteed
    let _matches = re.find_all_hardened(text);
}

Performance Comparison

| Text Size    | Standard | Hardened |
|--------------|----------|----------|
| 1,000 bytes  | 1.08 s   | 69 ms    |
| 10,000 bytes | 10.76 s  | 69 ms    |

In this comparison, hardened mode took 69 ms at both text sizes, while standard mode slowed from 1.08 s to 10.76 s as the text grew tenfold.

Trade-offs

Hardened mode may be slightly slower for well-behaved patterns (where O(n²) doesn’t occur), but it’s the safest choice when:

  • Pattern behavior is unknown
  • Text comes from untrusted sources
  • Worst-case performance is critical