Regex for Beginners: How to Test Regular Expressions Online
Learn regex from scratch. Patterns, quantifiers, groups, and how to test them in real-time with our free tester.

Regex Basics: A Practical Guide for Developers
Regular expressions (regex) are one of those skills that look intimidating at first but become indispensable once you master them. Whether you're validating email addresses, extracting data from logs, or performing complex search-and-replace operations, regex gives you superpowers in text processing. This guide covers the fundamentals you need to start writing effective patterns today.
What Is a Regular Expression?
A regular expression is a sequence of characters that defines a search pattern. Think of it as a mini-programming language designed specifically for matching and manipulating text. Most modern programming languages — JavaScript, Python, Ruby, Java, Go, and many others — support regex natively or through standard libraries.
The core idea is simple: you define a pattern, and the regex engine scans your input text to find matches. Patterns can range from a literal word like hello to complex expressions that match email addresses, URLs, or structured log lines.
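As a minimal sketch of that idea, here is a literal pattern run through Python's standard re module. The sample text and variable names are made up for illustration.

```python
import re

# Sample text invented for this example.
text = "hello world, hello regex"

# A literal pattern: the engine scans the text for the word "hello".
first = re.search(r"hello", text)
print(first.start(), first.group())  # position and text of the first match

# findall returns every non-overlapping match as a list of strings.
all_matches = re.findall(r"hello", text)
print(all_matches)
```

re.search stops at the first match, while re.findall keeps scanning to the end of the input.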
You can experiment with patterns interactively using our regex tester tool, which provides real-time matching against sample text.
Common Patterns and Building Blocks
Most regex patterns are built from a small set of primitives. Here's a cheat sheet of the most frequently used building blocks:
| Pattern | Meaning | Example | Matches |
|---|---|---|---|
| `.` | Any character except newline | `c.t` | cat, cot, cut |
| `\d` | Any digit (0-9) | `\d{3}` | 123, 456, 000 |
| `\w` | Word character (a-z, A-Z, 0-9, _) | `\w+` | hello, test_123 |
| `\s` | Whitespace (space, tab, newline) | `\s+` | " ", "\t\n" |
| `^` | Start of string | `^Hello` | "Hello world" |
| `$` | End of string | `world$` | "Hello world" |
| `*` | Zero or more of preceding | `ab*c` | ac, abc, abbc |
| `+` | One or more of preceding | `ab+c` | abc, abbc (not ac) |
| `?` | Zero or one of preceding | `colou?r` | color, colour |
| `{n,m}` | Between n and m repetitions | `\d{2,4}` | 12, 123, 1234 |
| `[abc]` | Character class (any listed) | `[aeiou]` | Any vowel |
| `[^abc]` | Negated character class | `[^0-9]` | Any non-digit |
| `(x\|y)` | Alternation (x or y) | `cat\|dog` | cat or dog |
Master these, and you can construct patterns for most everyday use cases. For example, a US phone number pattern might look like \d{3}-\d{3}-\d{4}: three digits, a hyphen, three digits, another hyphen, and four digits.
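The phone number pattern can be tried out with a short Python sketch. The helper name find_phone_numbers and the sample sentence are hypothetical, chosen only for this example.

```python
import re

# The pattern from the paragraph above, compiled once for reuse.
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")

def find_phone_numbers(text):
    """Return every substring shaped like 555-123-4567."""
    return PHONE.findall(text)

# Made-up sample text.
sample = "Call 555-123-4567 or 555-987-6543, not 55-1234."
numbers = find_phone_numbers(sample)
print(numbers)

# A few building blocks from the table, checked with fullmatch,
# which succeeds only when the whole string fits the pattern.
optional_u = re.fullmatch(r"colou?r", "colour") is not None
star_allows_zero = re.fullmatch(r"ab*c", "ac") is not None
plus_needs_one = re.fullmatch(r"ab+c", "ac") is not None
```

Note that the pattern has no anchors or boundaries, so it would also match inside a longer run of digits; adding \b on each side is one way to tighten it.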
Regex Flags: Controlling the Engine
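Flags modify how the regex engine interprets and applies your pattern. The most commonly used ones are:
| Flag | Name | Effect |
|---|---|---|
| `g` | Global | Find all matches instead of stopping at the first (JavaScript) |
| `i` | Ignore case | Match letters regardless of case |
| `m` | Multiline | `^` and `$` match at line boundaries |
| `s` | Dotall | `.` also matches newline characters |
| `x` | Verbose | Ignore whitespace in the pattern and allow comments (not available in JavaScript) |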
Flags are combined in different ways depending on the language. In JavaScript: /pattern/gi. In Python: re.findall(pattern, text, re.IGNORECASE | re.DOTALL). In most online tools, they're available as toggle buttons.
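A small Python sketch of combining flags follows; the sample text is invented. Python has no g flag, because re.findall already returns every match.

```python
import re

# Made-up sample text with three spellings of the same word.
text = "Error: disk full\nerror: retry\nERROR: giving up"

# Without flags, matching is case-sensitive.
case_sensitive = re.findall(r"error", text)

# re.IGNORECASE corresponds to the i flag.
any_case = re.findall(r"error", text, re.IGNORECASE)

# Flags are combined with |. re.MULTILINE lets ^ match at each line start.
line_starts = re.findall(r"^error", text, re.IGNORECASE | re.MULTILINE)

print(case_sensitive)
print(any_case)
print(line_starts)
```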
Greedy vs Lazy Quantifiers
One of the most common regex pitfalls is the difference between greedy and lazy matching. By default, quantifiers like *, +, and {n,m} are greedy — they match as much text as possible.
Consider a string that contains several HTML tags and the pattern <.+>. A greedy match runs from the first < all the way to the last >, swallowing every tag and all the text in between. That's rarely what you want.
Adding a ? after a quantifier makes it lazy (also called non-greedy or reluctant). The pattern <.+?> matches as little as possible, stopping at the first > it reaches, so each tag is matched separately.
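The difference is easy to see in a short Python sketch. The sample string below is an assumption chosen for illustration.

```python
import re

# Sample input chosen for this sketch.
html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last > in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first > it can reach.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```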
| Pattern | Behavior | Match on "abc123" |
|---|---|---|
| `\d+` | Greedy — grabs all digits | `123` |
| `\d+?` | Lazy — grabs one digit | `1`, then `2`, then `3` |
| `.*` | Greedy — matches everything | `abc123` |
| `.*?` | Lazy — matches nothing (zero-length) | `""` (empty match) |
Use greedy by default and switch to lazy when you need minimal matching — for example, when extracting content between HTML tags.
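The rows of the table above can be checked with a few lines of Python. The variable names are only for this sketch.

```python
import re

s = "abc123"

# Greedy \d+ takes the whole run of digits in one match.
greedy_digits = re.search(r"\d+", s).group()

# Lazy \d+? takes one digit per match, so findall returns three matches.
lazy_digits = re.findall(r"\d+?", s)

# Greedy .* consumes the entire string.
greedy_any = re.match(r".*", s).group()

# Lazy .*? is satisfied with zero characters, giving an empty match.
lazy_any = re.match(r".*?", s).group()

print(greedy_digits, lazy_digits, repr(greedy_any), repr(lazy_any))
```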
Real-World Examples
Let's look at some practical patterns you can use today:
Email validation: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ is a simplified but practical pattern that accepts most everyday addresses. Full RFC 5322 compliance requires a much more complex expression, so treat a match as a plausibility check rather than proof that the address exists.
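A possible way to wrap the email pattern in Python is sketched below. The function name looks_like_email and the test addresses are hypothetical.

```python
import re

# The simplified pattern from the paragraph above.
EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")

def looks_like_email(value):
    """Return True if value has the general shape of an email address."""
    return EMAIL.match(value) is not None

# Made-up addresses for illustration.
for candidate in ["user@example.com", "first.last+tag@sub.example.org",
                  "no-at-sign.example.com", "user@localhost"]:
    print(candidate, looks_like_email(candidate))
```

The last two candidates are rejected: one has no @, and the other has no dot followed by a top-level domain of at least two letters.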
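URL extraction: https?://[^\s]+ finds HTTP and HTTPS URLs in a block of text. It matches the protocol followed by any run of non-whitespace characters, so trailing punctuation such as a sentence-ending period is captured too and may need trimming.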
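Here is a rough Python sketch of URL extraction with that pattern. The sample text is invented, and the rstrip cleanup is one simple assumption about which trailing characters to drop, not a complete solution.

```python
import re

URL = re.compile(r"https?://[^\s]+")

# Made-up sample text.
text = "Docs at https://example.com/docs and http://example.org/a?b=1 (mirror)."
urls = URL.findall(text)
print(urls)

# A URL at the end of a sentence picks up the final period.
raw = URL.findall("See https://example.com.")

# Assumed cleanup: strip common trailing punctuation.
cleaned = [u.rstrip(".,;:!?)") for u in raw]
print(raw, cleaned)
```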
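Date parsing (YYYY-MM-DD): ^\d{4}-\d{2}-\d{2}$ matches strings laid out as ISO 8601 calendar dates. It checks the shape only, so an impossible date such as 2024-13-45 also matches. To label the parts, wrap each one in a named capture group: Python and PCRE accept the (?P<name>...) syntax, while JavaScript, Java, and .NET use (?<name>...).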
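A sketch of named groups in Python follows. The group names year, month, and day and the helper parse_iso_date are assumptions made for this example, and the range check is delegated to datetime.date.

```python
import re
from datetime import date

# Group names are chosen for this sketch.
DATE = re.compile(r"^(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$")

m = DATE.match("2024-03-15")
parts = m.groupdict()
print(parts)  # {'year': '2024', 'month': '03', 'day': '15'}

def parse_iso_date(value):
    """Return a date, or None if the shape or the values are invalid."""
    m = DATE.match(value)
    if m is None:
        return None
    try:
        return date(int(m["year"]), int(m["month"]), int(m["day"]))
    except ValueError:
        # The regex accepted the shape, but the calendar rejected the values.
        return None

print(parse_iso_date("2024-03-15"), parse_iso_date("2024-13-45"))
```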
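Log file parsing: for an Apache/Nginx access log line in Common Log Format, ^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+)$ extracts the IP address, identity, user, timestamp, request, status code, and byte size. Apache writes the size as - when no bytes are sent; replacing the last group with (\d+|-) handles that case.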
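The log pattern could be used in Python roughly as follows. The sample line, the function name parse_log_line, and the dictionary keys are all made up for this sketch.

```python
import re

# Single-quoted raw string, because the pattern contains double quotes.
LOG_LINE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+)$'
)

def parse_log_line(line):
    """Return the seven fields as a dict, or None if the line does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    ip, identity, user, timestamp, request, status, size = m.groups()
    return {
        "ip": ip, "identity": identity, "user": user,
        "timestamp": timestamp, "request": request,
        "status": int(status), "size": int(size),
    }

# Invented sample line in Common Log Format.
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_log_line(line)
print(entry)
```

Returning None for non-matching lines lets the caller count or log malformed entries instead of crashing on them.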
Try these patterns in our regex tester against your own data. For more advanced operations like search-and-replace with backreferences, check out our string utilities and text tools pages.
FAQ
Q: What's the difference between literal characters and metacharacters?
A: Literal characters match themselves (like a matching "a"). Metacharacters like ., *, +, ?, [, ], (, ), {, }, ^, $, |, and \ have special meaning. To match a metacharacter literally, escape it with a backslash — \. matches a literal period.
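A short Python sketch of escaping, with made-up inputs. re.escape is the standard-library helper for escaping a whole string that should be matched literally.

```python
import re

# Unescaped, the dot is a metacharacter and matches any character.
dot_unescaped = re.fullmatch(r"3.14", "3x14") is not None

# Escaped, it matches only a literal period.
dot_escaped = re.fullmatch(r"3\.14", "3x14") is not None
dot_literal = re.fullmatch(r"3\.14", "3.14") is not None

# re.escape handles text containing several metacharacters at once.
literal = "1+1=2?"
found = re.search(re.escape(literal), "Is 1+1=2? Yes.") is not None

print(dot_unescaped, dot_escaped, dot_literal, found)
```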
Q: Why does my regex work in one tool but not in another?
A: Different regex engines have subtle differences. JavaScript, Python, and PCRE (PHP) implement different flavors. The most common differences involve backreferences, lookahead/lookbehind support, and Unicode handling. Always test in the same engine you'll use in production.
Q: What are capture groups and how do I use them?
A: Parentheses () create capture groups that store matched substrings for later use. For example, (\d{3})-(\d{4}) applied to 555-1234 captures 555 and 1234 separately. Use backreferences like \1 or $1 (depending on the engine) to refer to captured groups in replacements.
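In Python this might look like the sketch below; the sample strings are invented. Python's re.sub uses \1 in the replacement, whereas JavaScript's replace uses $1.

```python
import re

# Groups are numbered from 1 in the order their opening parentheses appear.
m = re.search(r"(\d{3})-(\d{4})", "Call 555-1234 today")
first_group, second_group = m.group(1), m.group(2)

# Backreferences in a replacement string: swap the two groups.
swapped = re.sub(r"(\d{3})-(\d{4})", r"\2-\1", "555-1234")

# A backreference inside the pattern itself: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

print(first_group, second_group, swapped, doubled)
```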
Q: How do I match across multiple lines?
A: Use the multiline flag (m) so ^ and $ match line boundaries. Use the dotall flag (s) if you want . to match newline characters. Without these flags, . stops at newlines and ^/$ only match the start/end of the entire string.
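The effect of both flags can be seen in a small Python sketch with a made-up three-line string.

```python
import re

text = "first line\nsecond line\nthird line"

# Default: ^ matches only at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# re.MULTILINE (the m flag): ^ matches at the start of every line.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Default: . does not cross a newline, so this finds nothing.
no_dotall = re.search(r"first.*third", text)

# re.DOTALL (the s flag): . matches newlines too.
with_dotall = re.search(r"first.*third", text, re.DOTALL)

print(default_starts, multiline_starts, no_dotall, with_dotall is not None)
```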
Q: What does the `\b` word boundary do?
A: \b matches the position between a word character (\w) and a non-word character (\W), or between a word character and the start or end of the string. It's useful for whole-word matching: \bcat\b matches "cat" but not "catalog" or "concatenate".
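A quick Python check of whole-word matching, using an invented sentence:

```python
import re

text = "The cat sat near a catalog; concatenate cat-like words."

# Without boundaries, "cat" is also found inside longer words.
loose = re.findall(r"cat", text)

# With \b on both sides, only standalone occurrences match.
whole = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(len(loose), len(whole))  # 4 2
```

The second whole-word match comes from cat-like: a hyphen is not a word character, so there is a boundary between t and the hyphen.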
Q: Is regex the best tool for parsing HTML?
A: No. HTML is not a regular language — it has nested structures that regex cannot reliably parse. Use a proper DOM parser or HTML parser library instead. Regex works well for extracting simple patterns from HTML (like all href values), but not for parsing the document structure.
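As one illustration of the parser route, here is a sketch using Python's standard html.parser module to collect href values. The class name LinkCollector and the sample markup are hypothetical.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag, however deeply nested."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

# Made-up markup with nesting that a regex would struggle to track.
html_doc = '<div><p>See <a href="/docs">docs</a> and <a href="https://example.com">site</a>.</p></div>'

collector = LinkCollector()
collector.feed(html_doc)
links = collector.links
print(links)
```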
Q: How can I debug a complex regex pattern?
A: Use our regex tester with your sample data. Break the pattern into smaller pieces and test each one. Where the engine supports it, enable verbose mode (x flag) to add comments and whitespace; JavaScript has no such flag. Many tools also show visual diagrams of how the engine matches your pattern.
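In Python, verbose mode is re.VERBOSE. The sketch below rewrites a YYYY-MM-DD date pattern with one piece per line; the comments and sample input are illustrative.

```python
import re

# In verbose mode, whitespace in the pattern is ignored and # starts a comment.
DATE_VERBOSE = re.compile(r"""
    ^
    (\d{4})   # year: four digits
    -
    (\d{2})   # month: two digits
    -
    (\d{2})   # day: two digits
    $
""", re.VERBOSE)

groups = DATE_VERBOSE.match("2024-03-15").groups()
print(groups)  # ('2024', '03', '15')

# Testing one piece on its own narrows down where a pattern fails.
year_only = re.match(r"^(\d{4})", "2024-03-15").group(1)
```

To match a literal space or # in verbose mode, escape it or put it inside a character class.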