
Mastering Regular Expression. Part 10: Custom Regex Classes and Engine Internals in C#

C# goes beyond standard regex usage: with RegexCompilationInfo you can precompile patterns into strongly typed classes, which removes pattern-parsing cost at run time and lets you bundle regex logic in a reusable assembly.

In this part, you’ll learn:

  • How the .NET regex engine works under the hood
  • How to create precompiled regex classes using RegexCompilationInfo
  • Caching strategies
  • Performance considerations
  • When to compile vs. interpret

10.1 How the .NET Regex Engine Works

Interpreted vs Compiled Mode

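  • Interpreted: the default. The pattern is parsed once, when the Regex object is constructed, into internal opcodes that are then interpreted on every match.
  • Compiled: the pattern is additionally turned into IL (intermediate language) code, which the JIT compiles to machine code.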

Key Options:

RegexOptions.Compiled
RegexOptions.IgnoreCase
RegexOptions.Singleline
RegexOptions.ExplicitCapture

Compiled trades longer construction time and more memory for faster matching, so it suits regexes that are used frequently.
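To make the two modes concrete, here is a minimal sketch that builds the same pattern both ways. The class name EngineModes and the date pattern are placeholders chosen for illustration; only Regex and RegexOptions.Compiled come from the .NET API.

```csharp
using System.Text.RegularExpressions;

public static class EngineModes
{
    // Placeholder pattern, used only for illustration.
    private const string DatePattern = @"^\d{4}-\d{2}-\d{2}$";

    // Interpreted: parsed once here, then the opcodes are interpreted per match.
    public static readonly Regex Interpreted = new Regex(DatePattern);

    // Compiled: parsed and also emitted as IL, so construction costs more
    // but repeated matching is usually faster.
    public static readonly Regex Compiled =
        new Regex(DatePattern, RegexOptions.Compiled);

    // Both modes implement the same semantics, so they should always agree.
    public static bool SameAnswer(string input) =>
        Interpreted.IsMatch(input) == Compiled.IsMatch(input);
}
```

Calling EngineModes.SameAnswer with any input should return true: the option changes how the match is executed, not what it matches.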


10.2 Regex Compilation via RegexCompilationInfo

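You can precompile regexes into standalone classes and generate a DLL that is fast and reusable. This relies on Regex.CompileToAssembly, which is supported only on .NET Framework; on .NET Core and .NET 5+ it throws PlatformNotSupportedException.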

🔧 Step 1: Define Your Patterns

var regexes = new RegexCompilationInfo[]
{
    new RegexCompilationInfo(
        @"^\d{4}-\d{2}-\d{2}$",         // pattern
        RegexOptions.None,              // options
        "DateRegex",                    // class name
        "MyRegexLib",                   // namespace
        true                            // isPublic
    ),
    new RegexCompilationInfo(
        @"^[a-zA-Z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$",
        RegexOptions.IgnoreCase,
        "EmailRegex",
        "MyRegexLib",
        true
    )
};

🔧 Step 2: Generate Assembly

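using System.Reflection;                  // AssemblyName
using System.Text.RegularExpressions;     // Regex, RegexCompilationInfo

RegexCompilationInfo[] patterns = { /* as above */ };

// The assembly name becomes the file name of the generated DLL.
var assemblyName = new AssemblyName("MyRegexLibrary");
Regex.CompileToAssembly(patterns, assemblyName);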

This generates a DLL (MyRegexLibrary.dll) with your compiled regexes.


🔧 Step 3: Use Precompiled Regex

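using MyRegexLib;

// Each generated class derives from Regex, so create an instance and reuse it.
var dateRegex = new DateRegex();
var emailRegex = new EmailRegex();

bool isDate = dateRegex.IsMatch("2025-08-04");
bool isEmail = emailRegex.IsMatch("john@example.com");

✔️ The pattern is already compiled to IL inside the DLL, so nothing is parsed or compiled at run time.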


10.3 Benefits of Precompiling Regex

Feature             | Benefit
--------------------+---------------------------------
Speed               | Skips parsing regex at runtime
Reusability         | Centralizes regex logic
Compile-time checks | Errors during build, not runtime
Cleaner Code        | Strongly-typed matchers


10.4 Caching Regex for Repeated Use

Even without precompiling, regex caching is essential for performance.

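Risky (static calls rely on a small internal cache, 15 patterns by default, and build a new Regex on every cache miss):

bool match = Regex.IsMatch(text, pattern);

Good (reuse a static Regex):

private static readonly Regex PhoneRegex = new Regex(@"\d{3}-\d{3}-\d{4}", RegexOptions.Compiled);

bool match = PhoneRegex.IsMatch(input);

Avoid creating Regex instances inside loops, especially with RegexOptions.Compiled, where every construction generates IL again.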
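When the set of patterns is not known up front, one option is a small cache of your own. The sketch below is an assumption about how such a cache might look, not a built-in API: RegexCache and Get are hypothetical names, and the cache is unbounded, so it only suits a limited set of trusted patterns.

```csharp
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

public static class RegexCache
{
    // Keyed by pattern and options, because the same pattern with
    // different options is a different regex.
    private static readonly ConcurrentDictionary<(string Pattern, RegexOptions Options), Regex> Cache =
        new ConcurrentDictionary<(string Pattern, RegexOptions Options), Regex>();

    // Returns the cached Regex, constructing it on first use.
    // Under contention the factory may run more than once,
    // but only one instance is stored and returned.
    public static Regex Get(string pattern, RegexOptions options = RegexOptions.None) =>
        Cache.GetOrAdd((pattern, options), key => new Regex(key.Pattern, key.Options));
}
```

Usage would look like RegexCache.Get(@"\d{3}-\d{3}-\d{4}").IsMatch(input). If you only use the static Regex methods, raising Regex.CacheSize above its default of 15 is a simpler alternative.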


10.5 Regex Class Generator Template

Let’s build a C# Regex Factory class:

public static class RegexFactory
{
    public static readonly Regex Email = new(
        @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        RegexOptions.Compiled | RegexOptions.IgnoreCase
    );

    public static readonly Regex Date = new(
        @"^\d{4}-\d{2}-\d{2}$",
        RegexOptions.Compiled
    );

    public static readonly Regex IPv4 = new(
        @"^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$",
        RegexOptions.Compiled
    );
}

Usage:

if (RegexFactory.Email.IsMatch(userInput)) { ... }

10.6 Regex Compilation: When to Use It

Scenario                       | Recommendation
-------------------------------+------------------
Regex runs 1000s+ times/sec    | Use Compiled
Runs occasionally              | Skip Compiled
Regex logic reused across apps | Precompile to DLL
Regex built from user input    | Avoid compiling
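For the user-input row, here is a minimal sketch of one cautious approach: keep the regex interpreted, set a match timeout, and treat invalid patterns as a normal outcome. UserPatternMatcher, TryIsMatch and the 250 ms limit are illustrative assumptions; the Regex constructor overload with a TimeSpan and RegexMatchTimeoutException are part of .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UserPatternMatcher
{
    // Returns true/false for a completed match, or null when the pattern
    // is invalid or matching exceeded the timeout.
    public static bool? TryIsMatch(string userPattern, string input)
    {
        try
        {
            // Interpreted on purpose: compiling a one-off pattern costs more
            // than it saves. The 250 ms timeout is an arbitrary example value.
            var regex = new Regex(userPattern, RegexOptions.None, TimeSpan.FromMilliseconds(250));
            return regex.IsMatch(input);
        }
        catch (ArgumentException)
        {
            return null; // null or syntactically invalid pattern
        }
        catch (RegexMatchTimeoutException)
        {
            return null; // pattern took too long on this input
        }
    }
}
```

The timeout matters because a user-supplied pattern can trigger heavy backtracking on some inputs.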


10.7 Benchmarking Regex Performance

Use Stopwatch to measure regex speed:

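using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

// Pattern under test (the email pattern from section 10.5).
string pattern = @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$";

var sw = Stopwatch.StartNew();
for (int i = 0; i < 100000; i++)
{
    Regex.IsMatch("test@example.com", pattern);
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);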
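A single timing says little on its own, so it helps to compare the modes side by side. The sketch below is one way to do that under simple assumptions: RegexBenchmark, TimeMs and Run are hypothetical names, and one warm-up call is made before timing so JIT and lazy initialization are not counted.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexBenchmark
{
    // Times `iterations` calls to IsMatch on an already constructed Regex.
    public static long TimeMs(Regex regex, string input, int iterations)
    {
        regex.IsMatch(input); // warm-up, excluded from the measurement

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.IsMatch(input);
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Prints interpreted vs. compiled timings for the same pattern and input.
    public static void Run()
    {
        const string pattern = @"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$";
        const string input = "test@example.com";
        const int iterations = 100_000;

        Console.WriteLine($"Interpreted: {TimeMs(new Regex(pattern), input, iterations)} ms");
        Console.WriteLine($"Compiled:    {TimeMs(new Regex(pattern, RegexOptions.Compiled), input, iterations)} ms");
    }
}
```

Numbers vary by machine and runtime, so treat the output as a relative comparison. Construction time is deliberately left out here; measure it separately if startup cost matters to you.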

10.8 Limitations of Regex Compilation

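  • Slower startup if you construct many RegexOptions.Compiled regexes, since each one generates IL when it is created
  • Patterns known only at run time can't be precompiled into a DLL
  • Regex.CompileToAssembly is supported only on .NET Framework, so precompiled regex DLLs can't be produced on .NET Core or .NET 5+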
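On .NET 7 and later, the regex source generator covers much of the same ground as precompiling to a DLL: the matching code is generated as C# at build time, and invalid patterns are reported during the build. The sketch below assumes a project targeting .NET 7+; GeneratedPatterns and its method names are placeholders, while the GeneratedRegex attribute is part of System.Text.RegularExpressions.

```csharp
using System.Text.RegularExpressions;

public static partial class GeneratedPatterns
{
    // The source generator supplies the body of each partial method at build time.
    [GeneratedRegex(@"^\d{4}-\d{2}-\d{2}$")]
    public static partial Regex DateRegex();

    [GeneratedRegex(@"^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$", RegexOptions.IgnoreCase)]
    public static partial Regex EmailRegex();
}
```

Usage: GeneratedPatterns.DateRegex().IsMatch("2025-08-04"). Both the class and the methods must be declared partial for the generator to fill them in.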

10.9 Organizing Regex Libraries in C#

Structure your regex codebase like this:

/RegexLib/
    EmailRegex.cs
    DateRegex.cs
    InputValidators.cs
    RegexFactory.cs

Keep patterns:

  • Well-named
  • Tested
  • Version-controlled
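As an illustration of what a file such as InputValidators.cs might contain, here is a minimal sketch that hides a pattern behind a well-named method, which also makes it easy to unit test. The method name IsIPv4 is an assumption; the pattern is the IPv4 one from section 10.5.

```csharp
using System.Text.RegularExpressions;

public static class InputValidators
{
    // Kept private so callers depend on the method name, not on the pattern.
    private static readonly Regex Ipv4 = new Regex(
        @"^((25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)$",
        RegexOptions.Compiled);

    // Returns false for null instead of throwing, so callers can pass raw input.
    public static bool IsIPv4(string input) =>
        input != null && Ipv4.IsMatch(input);
}
```

A handful of table-style tests (valid addresses, out-of-range octets, missing octets, leading zeros) is usually enough to pin down the intended behavior of a validator like this.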

Summary

By precompiling regex:

  • You gain performance
  • You reduce runtime bugs
  • You encapsulate and organize regex logic cleanly

You now know how to:

  • Use RegexCompilationInfo
  • Generate regex DLLs
  • Cache regex safely
  • Write reusable regex classes


Coming Up Next

In Part 11, we’ll wrap up the series with:

“Regex Testing, Debugging, and Tooling for C# Developers”

We’ll explore:

  • Online testers
  • Regex debuggers
  • Visualizers
  • Unit test strategies
  • Common libraries and extensions
