Cost-Based Matching

Assign different costs to edit operations for fine-grained control.

Simple Cost Constraint

Use {c<=N} to limit total cost (all operations cost 1):

fn main() {
    use fuzzy_regex::FuzzyRegex;

    // Total cost ≤ 2 (any combination of edits, each costing 1)
    let re = FuzzyRegex::new("(?:hello){c<=2}").unwrap();

    assert!(re.is_match("hello")); // 0 edits, cost=0
    assert!(re.is_match("hallo")); // 1 sub, cost=1
    assert!(re.is_match("helo"));  // 1 del, cost=1
    assert!(re.is_match("hhello")); // 1 ins, cost=1
}
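
To make the idea of "total cost" concrete, here is a minimal standalone sketch that counts unit-cost edits between two whole strings and compares the count against a budget. It is not how fuzzy_regex is implemented: `unit_cost` and `within_budget` are hypothetical helper names, and comparing whole strings (rather than searching inside a longer text) is an assumption made for illustration.

```rust
/// Unit-cost edit distance (insertions, deletions, substitutions) between
/// two whole strings. Standalone illustration, not fuzzy_regex code.
fn unit_cost(pattern: &str, text: &str) -> usize {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();

    // prev[j] = cost of turning the first i-1 pattern chars into the first j text chars.
    let mut prev: Vec<usize> = (0..=t.len()).collect();
    for i in 1..=p.len() {
        // curr[0] = i: deleting all i pattern chars to reach an empty text.
        let mut curr = vec![i; t.len() + 1];
        for j in 1..=t.len() {
            let sub = prev[j - 1] + usize::from(p[i - 1] != t[j - 1]);
            let del = prev[j] + 1; // pattern char missing from text
            let ins = curr[j - 1] + 1; // extra char present in text
            curr[j] = sub.min(del).min(ins);
        }
        prev = curr;
    }
    prev[t.len()]
}

/// Hypothetical helper mirroring the idea behind `{c<=N}`.
fn within_budget(pattern: &str, text: &str, max_cost: usize) -> bool {
    unit_cost(pattern, text) <= max_cost
}

fn main() {
    for text in ["hello", "hallo", "helo", "hhello", "help me"] {
        let cost = unit_cost("hello", text);
        let ok = within_budget("hello", text, 2);
        println!("{:?}: cost = {}, within budget of 2: {}", text, cost, ok);
    }
}
```

The first four inputs need at most one edit each, so they fit a budget of 2, while "help me" needs four edits and does not.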

Weighted Costs

Assign different costs to different edit types:

fn main() {
    use fuzzy_regex::FuzzyRegex;

    // Insertions cost 2, others cost 1, total ≤ 3
    let re = FuzzyRegex::new("(?:ab){2i+1d+1s+1t<=3}").unwrap();

    assert!(re.is_match("abc")); // 1 insertion (cost=2)
    assert!(re.is_match("a"));   // 1 deletion (cost=1)
    assert!(re.is_match("ba"));  // 1 transposition (cost=1)
    assert!(!re.is_match("aabc")); // 2 insertions (cost=4) - exceeds limit
}
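
The arithmetic in the comments above can be reproduced with a small weighted edit-distance sketch. This is an illustration only, assuming whole-string comparison and the optimal-string-alignment variant of Damerau-Levenshtein distance; the `Costs` struct and `weighted_cost` function are hypothetical names and not part of the fuzzy_regex API.

```rust
/// Per-operation costs. Field names are illustrative, not a library API.
#[derive(Clone, Copy)]
struct Costs {
    ins: u32,
    del: u32,
    sub: u32,
    swap: u32, // transposition of two adjacent characters
}

/// Minimum total cost to edit `pattern` into `text` (whole strings),
/// using optimal string alignment with weighted operations.
fn weighted_cost(pattern: &str, text: &str, c: Costs) -> u32 {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (n, m) = (p.len(), t.len());

    // d[i][j] = cheapest way to turn p[..i] into t[..j].
    let mut d = vec![vec![0u32; m + 1]; n + 1];
    for i in 1..=n {
        d[i][0] = d[i - 1][0] + c.del;
    }
    for j in 1..=m {
        d[0][j] = d[0][j - 1] + c.ins;
    }
    for i in 1..=n {
        for j in 1..=m {
            let same = p[i - 1] == t[j - 1];
            let mut best = d[i - 1][j - 1] + if same { 0 } else { c.sub };
            best = best.min(d[i - 1][j] + c.del);
            best = best.min(d[i][j - 1] + c.ins);
            // Adjacent characters swapped: "ab" versus "ba".
            if i > 1 && j > 1 && p[i - 1] == t[j - 2] && p[i - 2] == t[j - 1] {
                best = best.min(d[i - 2][j - 2] + c.swap);
            }
            d[i][j] = best;
        }
    }
    d[n][m]
}

fn main() {
    // Same weights as {2i+1d+1s+1t<=3}.
    let costs = Costs { ins: 2, del: 1, sub: 1, swap: 1 };
    for text in ["abc", "a", "ba", "aabc"] {
        let cost = weighted_cost("ab", text, costs);
        println!("{:?}: cost = {}, within 3: {}", text, cost, cost <= 3);
    }
}
```

With these weights the sketch yields costs of 2, 1, 1, and 4 for the four inputs, which matches the comments in the example: only "aabc" exceeds the limit of 3.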

Cost Syntax

Syntax              Description
{c<=N}              Total cost ≤ N (every operation costs 1)
{c<N}               Total cost < N
{Ii+Dd+Ss+Tt<=N}    Custom costs: I per insertion, D per deletion, S per substitution, T per transposition
{0i+...}            Free insertions
{Ni...}             Insertions cost N

fn main() {
    use fuzzy_regex::FuzzyRegex;

    // All operations cost 1: {1i+1d+1s+1t<=N} = {c<=N}
    let re1 = FuzzyRegex::new("(?:test){c<=2}").unwrap();

    // Insertions are expensive (typing errors rarely add extra characters)
    let re2 = FuzzyRegex::new("(?:test){3i+1d+1s<=3}").unwrap();

    // Insertions are free (OCR output often contains extra characters)
    let re3 = FuzzyRegex::new("(?:test){0i+1d+1s<=2}").unwrap();

    println!("re1: {}", re1.is_match("test"));
    println!("re2: {}", re2.is_match("test"));
    println!("re3: {}", re3.is_match("tset"));
}
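
To show how the pieces of a cost expression fit together, the following sketch splits a constraint (with the braces removed) into per-operation weights and a limit. It is a hypothetical parser written for illustration, not the crate's own. `CostSpec` and `parse_cost_spec` are assumed names, and an operation that is not listed is recorded as `None` because this page does not say how the library treats it.

```rust
/// Parsed form of a cost constraint such as `2i+1d+1s+1t<=3` (braces removed).
/// `None` means the operation was not listed in the expression.
#[derive(Debug, PartialEq, Default)]
struct CostSpec {
    ins: Option<u32>,
    del: Option<u32>,
    sub: Option<u32>,
    swap: Option<u32>, // transposition
    limit: u32,
    strict: bool, // true for `<`, false for `<=`
}

fn parse_cost_spec(spec: &str) -> Option<CostSpec> {
    // Try `<=` first, because a plain `<` search would also hit it.
    let (lhs, rhs, strict) = if let Some((l, r)) = spec.split_once("<=") {
        (l, r, false)
    } else {
        let (l, r) = spec.split_once('<')?;
        (l, r, true)
    };
    let mut out = CostSpec {
        limit: rhs.trim().parse().ok()?,
        strict,
        ..Default::default()
    };

    if lhs.trim() == "c" {
        // `c` is shorthand for every operation costing 1.
        out.ins = Some(1);
        out.del = Some(1);
        out.sub = Some(1);
        out.swap = Some(1);
        return Some(out);
    }
    for term in lhs.split('+') {
        let term = term.trim();
        let op = term.chars().last()?;
        let weight: u32 = term[..term.len() - op.len_utf8()].parse().ok()?;
        match op {
            'i' => out.ins = Some(weight),
            'd' => out.del = Some(weight),
            's' => out.sub = Some(weight),
            't' => out.swap = Some(weight),
            _ => return None, // unknown operation letter
        }
    }
    Some(out)
}

fn main() {
    for spec in ["c<=2", "c<2", "2i+1d+1s+1t<=3", "0i+1d+1s<=2"] {
        println!("{:<16} -> {:?}", spec, parse_cost_spec(spec));
    }
}
```

Keeping unlisted operations as `None` avoids guessing a default, so a caller has to decide explicitly what a missing weight means.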

Use Cases

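  • Typing errors: substitutions and deletions are more common than insertions, so make insertions more expensive
  • OCR errors: insertions (extra characters) are more common, so make insertions cheap or free
  • Genetic sequences: weight each edit type differently, in the spirit of a scoring matrix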