package core:text/regex

Overview

package regex implements a complete suite for using Regular Expressions to match and capture text.

A regular expression describes, in a compact pattern language, what a piece of text must look like in order to match.

Odin's regex library implements the following features:

Alternation:           `apple|cherry`
Classes:               `[0-9_]`
Classes, negated:      `[^0-9_]`
Shorthands:            `\d\s\w`
Shorthands, negated:   `\D\S\W`
Wildcards:             `.`
Repeat, optional:      `a*`
Repeat, at least once: `a+`
Repetition:            `a{1,2}`
Optional:              `a?`
Group, capture:        `([0-9])`
Group, non-capture:    `(?:[0-9])`
Start & End Anchors:   `^hello$`
Word Boundaries:       `\bhello\b`
Non-Word Boundaries:   `hello\B`

These specifiers can be composed together, such as an optional group: `(?:hello)?`

This package also supports the non-greedy variants of the repeating and optional specifiers by appending a `?` to them.
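As a rough illustration of how the specifiers above compose, the sketch below combines anchors, a non-capture group with alternation, an optional comma, and one capture group. The helper name `greeted_name` and the sample strings are made up for this example, and it assumes that `groups[0]` holds the whole match with capture groups following it.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: extracts the name from greetings such as "hello, world".
greeted_name :: proc(text: string) -> (name: string, found: bool) {
	// The raw string literal keeps the backslash in `\w` intact.
	rx, err := regex.create(`^(?:hello|hi),? (\w+)$`)
	if err != nil {
		return "", false
	}
	defer regex.destroy(rx)

	capture, ok := regex.match(rx, text)
	defer regex.destroy(capture)
	if !ok || len(capture.groups) < 2 {
		return "", false
	}
	// Assumption: groups[0] is the whole match and groups[1] is the first
	// capture group. The result is a slice of `text`, so it stays valid
	// after the capture itself has been destroyed.
	return capture.groups[1], true
}

main :: proc() {
	texts := []string{"hello, world", "hi odin", "bye odin"}
	for text in texts {
		name, found := greeted_name(text)
		fmt.println(text, "->", name, found)
	}
}
```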

All of the supported shorthand classes are ASCII-based, even when compiling in Unicode mode. This is for the sake of general performance and simplicity: thousands of Unicode codepoints would qualify as a digit, space, or word character, and many of them could be irrelevant depending on what is being matched.

Here are the shorthand class equivalencies:

\d: [0-9]
\s: [\t\n\f\r ]
\w: [0-9A-Z_a-z]

If you need your own shorthands, you can compose strings together like so:

MY_HEX :: "[0-9A-Fa-f]"
PATTERN :: MY_HEX + "-" + MY_HEX

The compiler will handle turning multiple identical classes into references to the same set of matching runes, so there's no penalty for doing it like this.
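A minimal sketch of putting such a composed pattern to use follows. The `^` and `$` anchors and the helper name `is_hex_pair` are additions for the sake of the example, not part of the package.

```odin
package main

import "core:fmt"
import "core:text/regex"

MY_HEX :: "[0-9A-Fa-f]"

// Constant strings are concatenated at compile time with `+`.
// The anchors are an addition for this example so the whole string must match.
PATTERN :: "^" + MY_HEX + "-" + MY_HEX + "$"

// Hypothetical helper: reports whether `text` is two hex digits joined by a dash.
is_hex_pair :: proc(text: string) -> bool {
	rx, err := regex.create(PATTERN)
	if err != nil {
		return false
	}
	defer regex.destroy(rx)

	capture, ok := regex.match(rx, text)
	defer regex.destroy(capture)
	return ok
}

main :: proc() {
	fmt.println(is_hex_pair("A-f"))
	fmt.println(is_hex_pair("G-1"))
}
```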

``Some people, when confronted with a problem, think
  "I know, I'll use regular expressions." Now they have two problems.''

     - Jamie Zawinski

Regular expressions have gathered a reputation over the decades for often being chosen as the wrong tool for the job. Here, we will clarify a few cases in which RegEx might be good or bad.

When is it a good time to use RegEx?

- You don't know at compile-time what patterns of text the program will need to match when it's running. As an example, you are making a client which can be configured by the user to trigger on certain text patterns received from a server. For another example, you need a way for users of a text editor to compose matching strings that are more intricate than a simple substring lookup.
- The text you're matching against is small (< 64 KiB) and your patterns aren't overly complicated with branches (alternations, repeats, and optionals).
- If none of the above general impressions apply but your project doesn't warrant long-term maintenance.

When is it a bad time to use RegEx?

- You know at compile-time the grammar you're parsing; a hand-made parser has the potential to be more maintainable and readable.
- The grammar you're parsing has certain validation steps that lend themselves to forming complicated expressions, such as e-mail addresses, URIs, dates, postal codes, credit cards, et cetera. Using RegEx to validate these structures is almost always a bad sign.
- The text you're matching against is big (> 1 MiB); you would be better served by first dividing the text into manageable chunks and using some heuristic to locate the most likely location of a match before applying RegEx against it.
- You value high performance and low memory usage; RegEx will always have a certain overhead which increases with the complexity of the pattern.

The implementation of this package has been optimized, but it will never be as thoroughly performant as a hand-made parser. In comparison, there are just too many intermediate steps, assumptions, and generalizations in what it takes to handle a regular expression.

Index

Types (9)

Capture
Compiler_Error
Creation_Error
Error
Flag
Flags
Match_Iterator
Parser_Error
Regular_Expression

Constants (0)

This section is empty.

Variables (0)

This section is empty.

Procedures (11)

create
create_by_user
create_iterator
destroy_capture
destroy_iterator
destroy_regex
match_and_allocate_capture
match_iterator
match_with_preallocated_capture
preallocate_capture
reset

Procedure Groups (2)

destroy
match

Types

Capture ¶

Capture :: struct {
	pos:    [][2]int,
	groups: []string,
}

This struct corresponds to a set of string captures from a RegEx match.

pos will contain the start and end positions for each string in groups, such that str[pos[0][0]:pos[0][1]] == groups[0].
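The relationship between pos and groups can be checked with a short sketch like the one below. The helper name `capture_is_consistent` and the sample pattern are hypothetical; the check itself only restates the invariant described above for every entry rather than just the first.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: verifies that every group is the slice of `text`
// described by the matching entry in `pos`.
capture_is_consistent :: proc(pattern, text: string) -> bool {
	rx, err := regex.create(pattern)
	if err != nil {
		return false
	}
	defer regex.destroy(rx)

	capture, ok := regex.match(rx, text)
	defer regex.destroy(capture)
	if !ok {
		return false
	}

	for group, i in capture.groups {
		span := capture.pos[i] // span[0] is the start, span[1] is the end
		if text[span[0]:span[1]] != group {
			return false
		}
	}
	return true
}

main :: proc() {
	fmt.println(capture_is_consistent(`(\w+)@(\w+)`, "mail: user@example"))
}
```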

Related Procedures With Parameters

destroy_capture
match_with_preallocated_capture
destroy (procedure groups)
match (procedure groups)

Related Procedures With Returns

match_and_allocate_capture
match_iterator
preallocate_capture

Compiler_Error ¶

Compiler_Error :: regex_compiler.Error

Creation_Error ¶

Creation_Error :: enum int {
	None, 
	// A `\` was supplied as the delimiter to `create_by_user`.
	Bad_Delimiter, 
	// A pair of delimiters for `create_by_user` was not found.
	Expected_Delimiter, 
	// An unknown letter was supplied to `create_by_user` after the last delimiter.
	Unknown_Flag, 
}

Error ¶

Error :: union {
	regex_parser.Error, 
	regex_compiler.Error, 
	Creation_Error, 
}
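
Because Error is a union, a type switch can tell its variants apart. The following is only a sketch: the helper name `describe_error` and its message wording are invented for the example, and a nil value is taken to mean that no error occurred.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: turns a regex.Error into a short readable message.
// The returned string is built with the temporary allocator.
describe_error :: proc(err: regex.Error) -> string {
	switch e in err {
	case regex.Parser_Error:
		return fmt.tprintf("parser error: %v", e)
	case regex.Compiler_Error:
		return fmt.tprintf("compiler error: %v", e)
	case regex.Creation_Error:
		return fmt.tprintf("creation error: %v", e)
	}
	// No variant is set, so the union is nil.
	return "no error"
}

main :: proc() {
	// A backslash delimiter is documented as producing Bad_Delimiter.
	rx, err := regex.create_by_user(`\hellope\`)
	defer regex.destroy(rx)
	fmt.println(describe_error(err))
}
```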

Related Procedures With Returns

create
create_by_user
create_iterator

Flag ¶

Flag :: regex_common.Flag

Flags ¶

Flags :: bit_set[regex_common.Flag; u8]

Match_Iterator ¶

Match_Iterator :: struct {
	regex:   Regular_Expression,
	capture: Capture,
	vm:      regex_vm.Machine,
	idx:     int,
	temp:    runtime.Allocator,
	threads: int,
	done:    bool,
}

An iterator to repeatedly match a pattern against a string, to be used with *_iterator procedures.

Related Procedures With Parameters

destroy_iterator
match_iterator
reset
destroy (procedure groups)
match (procedure groups)

Related Procedures With Returns

create_iterator

Parser_Error ¶

Parser_Error :: regex_parser.Error

Regular_Expression ¶

Regular_Expression :: struct {
	flags:      bit_set[regex_common.Flag; u8] `fmt:"-"`,
	class_data: []regex_vm.Rune_Class_Data `fmt:"-"`,
	program:    []regex_vm.Opcode `fmt:"-"`,
}

A compiled Regular Expression value, to be used with the match_* procedures.

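Related Procedures With Parameters

destroy_regex
match_and_allocate_capture
match_with_preallocated_capture
destroy (procedure groups)
match (procedure groups)

Related Procedures With Returns

create
create_by_user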

Constants

This section is empty.

Variables

This section is empty.

Procedures

create ¶

create :: proc(pattern: string, flags: bit_set[regex_common.Flag; u8] = {}, permanent_allocator := context.allocator, temporary_allocator := context.temp_allocator) -> (result: Regular_Expression, err: Error) {…}

Create a regular expression from a string pattern and a set of flags.

Allocates Using Provided Allocators

Inputs:
- pattern: The pattern to compile.
- flags: A bit_set of RegEx flags.
- permanent_allocator: The allocator to use for the final regular expression. (default: context.allocator)
- temporary_allocator: The allocator to use for the intermediate compilation stages. (default: context.temp_allocator)

Returns:
- result: The regular expression.
- err: An error, if one occurred.
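
A minimal sketch of calling create with a flag set is shown below. The helper name `mentions_odin` and the pattern are invented for the example; the default allocators are used for both the compiled expression and the intermediate stages.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: reports whether `text` contains the standalone word
// "odin" in any letter case.
mentions_odin :: proc(text: string) -> bool {
	// The second argument is a bit_set of flags; `{}` would mean no flags.
	rx, err := regex.create(`\bodin\b`, {.Case_Insensitive})
	if err != nil {
		return false
	}
	defer regex.destroy(rx)

	capture, ok := regex.match(rx, text)
	defer regex.destroy(capture)
	return ok
}

main :: proc() {
	fmt.println(mentions_odin("I like ODIN a lot"))
	fmt.println(mentions_odin("iodine"))
}
```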

create_by_user ¶

create_by_user :: proc(pattern: string, permanent_allocator := context.allocator, temporary_allocator := context.temp_allocator) -> (result: Regular_Expression, err: Error) {…}

Create a regular expression from a delimited string pattern, such as one provided by users of a program or those found in a configuration file.

They are in the form of:

[DELIMITER] [regular expression] [DELIMITER] [flags]

For example, the following strings are valid:

/hellope/i
#hellope#i
•hellope•i
つhellopeつi

The delimiter is determined by the very first rune in the string. The only restriction is that the delimiter cannot be \, as that rune is used to escape the delimiter if found in the middle of the string.

All runes after the closing delimiter will be parsed as flags:

- 'm': Multiline
- 'i': Case_Insensitive
- 'x': Ignore_Whitespace
- 'u': Unicode
- 'n': No_Capture
- '-': No_Optimization

Allocates Using Provided Allocators

Inputs:
- pattern: The delimited pattern with optional flags to compile.
- permanent_allocator: The allocator to use for the final regular expression. (default: context.allocator)
- temporary_allocator: The allocator to use for the intermediate compilation stages. (default: context.temp_allocator)

Returns:
- result: The regular expression.
- err: An error, if one occurred.
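
One way to use create_by_user with a pattern taken from a user or a configuration file is sketched below. The helper name `user_pattern_matches` is hypothetical, and any error is simply passed back to the caller with `or_return`.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: compiles a delimited pattern such as "/hellope/i"
// and reports whether it matches `text`.
user_pattern_matches :: proc(delimited, text: string) -> (matched: bool, err: regex.Error) {
	// On failure, `or_return` returns early with `err` set.
	rx := regex.create_by_user(delimited) or_return
	defer regex.destroy(rx)

	capture, ok := regex.match(rx, text)
	defer regex.destroy(capture)
	return ok, nil
}

main :: proc() {
	matched, err := user_pattern_matches("/hellope/i", "HELLOPE, world")
	fmt.println(matched, err)

	// A backslash cannot be used as the delimiter.
	_, bad := user_pattern_matches(`\hellope\`, "hellope")
	fmt.println(bad)
}
```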

create_iterator ¶

create_iterator :: proc(str: string, pattern: string, flags: bit_set[regex_common.Flag; u8] = {}, permanent_allocator := context.allocator, temporary_allocator := context.temp_allocator) -> (result: Match_Iterator, err: Error) {…}

Create a Match_Iterator using a string to search, a regular expression to match against it, and a set of flags.

Allocates Using Provided Allocators

Inputs:
- str: The string to iterate over.
- pattern: The pattern to match.
- flags: A bit_set of RegEx flags.
- permanent_allocator: The allocator to use for the compiled regular expression. (default: context.allocator)
- temporary_allocator: The allocator to use for the intermediate compilation and iteration stages. (default: context.temp_allocator)

Returns:
- result: The Match_Iterator.
- err: An error, if one occurred.

destroy_capture ¶

destroy_capture :: proc(capture: Capture, allocator := context.allocator) {…}

Free all data allocated by the match_and_allocate_capture procedure.

Frees Using Provided Allocator

Inputs:
- capture: A Capture.
- allocator: (default: context.allocator)

destroy_iterator ¶

destroy_iterator :: proc(it: Match_Iterator, allocator := context.allocator) {…}

Free all data allocated by the create_iterator procedure.

Frees Using Provided Allocator

Inputs:
- it: A Match_Iterator.
- allocator: (default: context.allocator)

destroy_regex ¶

destroy_regex :: proc(regex: Regular_Expression, allocator := context.allocator) {…}

Free all data allocated by the create* procedures.

Frees Using Provided Allocator

Inputs:
- regex: A regular expression.
- allocator: (default: context.allocator)

match_and_allocate_capture ¶

match_and_allocate_capture :: proc(regex: Regular_Expression, str: string, permanent_allocator := context.allocator, temporary_allocator := context.temp_allocator) -> (capture: Capture, success: bool) {…}

Match a regular expression against a string and allocate the results into the returned capture structure.

The resulting capture strings will be slices to the string str, not wholly copied strings, so they won't need to be individually deleted.

Allocates Using Provided Allocators

Inputs:
- regex: The regular expression.
- str: The string to match against.
- permanent_allocator: The allocator to use for the capture results. (default: context.allocator)
- temporary_allocator: The allocator to use for the virtual machine. (default: context.temp_allocator)

Returns:
- capture: The capture groups found in the string.
- success: True if the regex matched the string.

match_iterator ¶

match_iterator :: proc(it: ^Match_Iterator) -> (result: Capture, index: int, ok: bool) {…}

Iterate over a Match_Iterator and return successive captures.

Inputs:
it: Pointer to the Match_Iterator to iterate over.

Returns:
- result: Capture for this iteration.
- ok: A bool indicating if there was a match, stopping the iteration on false.
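
Since match_iterator returns a value, an index, and a bool, it can drive a `for ... in` loop. The sketch below counts single digits in a string; the helper name `count_digits` is invented for the example, and it assumes that each successful iteration yields at least one group.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: counts how many ASCII digits appear in `text`.
count_digits :: proc(text: string) -> (count: int) {
	it, err := regex.create_iterator(text, `\d`)
	if err != nil {
		return 0
	}
	defer regex.destroy(it)

	// The loop ends when match_iterator reports `ok == false`.
	for capture, index in regex.match_iterator(&it) {
		if len(capture.groups) > 0 {
			fmt.println(index, capture.groups[0], capture.pos[0])
		}
		count += 1
	}
	return
}

main :: proc() {
	fmt.println(count_digits("a1b22c"))
}
```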

match_with_preallocated_capture ¶

match_with_preallocated_capture :: proc(regex: Regular_Expression, str: string, capture: ^Capture, temporary_allocator := context.temp_allocator) -> (num_groups: int, success: bool) {…}

Match a regular expression against a string and save the capture results into the provided capture structure.

The resulting capture strings will be slices to the string str, not wholly copied strings, so they won't need to be individually deleted.

Allocates Using Provided Allocator

Inputs:
- regex: The regular expression.
- str: The string to match against.
- capture: A pointer to a Capture structure with groups and pos already allocated.
- temporary_allocator: The allocator to use for the virtual machine. (default: context.temp_allocator)

Returns:
- num_groups: The number of capture groups set into capture.
- success: True if the regex matched the string.

preallocate_capture ¶

preallocate_capture :: proc(allocator := context.allocator) -> (result: Capture) {…}

Allocate a Capture in advance for use with match. This can save some time if you plan on performing several matches at once and only need the results between matches.

Inputs:
allocator: (default: context.allocator)

Returns:
result: The Capture with the maximum number of groups allocated.
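
A sketch of reusing one preallocated Capture across many matches follows. The helper name `count_matching_lines` and the sample data are hypothetical. Passing a pointer to the Capture selects match_with_preallocated_capture from the match procedure group.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: counts the lines that match `pattern`, reusing a
// single Capture instead of allocating one per match.
count_matching_lines :: proc(lines: []string, pattern: string) -> (count: int) {
	rx, err := regex.create(pattern)
	if err != nil {
		return 0
	}
	defer regex.destroy(rx)

	capture := regex.preallocate_capture()
	defer regex.destroy(capture)

	for line in lines {
		// The capture is overwritten on every call, so read it before
		// the next match if its contents are needed.
		_, ok := regex.match(rx, line, &capture)
		if ok {
			count += 1
		}
	}
	return
}

main :: proc() {
	lines := []string{"error: disk full", "ok", "error: timeout"}
	fmt.println(count_matching_lines(lines, "^error"))
}
```

The virtual machine still draws on the temporary allocator for each match, so a long-running loop may want to release that memory periodically.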

reset ¶

reset :: proc(it: ^Match_Iterator) {…}

Reset an iterator, allowing it to be run again as if new.

Inputs:
it: The iterator to reset.
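
As a small sketch of what resetting allows, the example below runs the same iterator twice and counts the matches on each pass. The helper names `count_matches` and `count_twice` are invented for the example.

```odin
package main

import "core:fmt"
import "core:text/regex"

// Hypothetical helper: drains the iterator and returns the number of matches.
count_matches :: proc(it: ^regex.Match_Iterator) -> (count: int) {
	for {
		_, _, ok := regex.match_iterator(it)
		if !ok {
			break
		}
		count += 1
	}
	return
}

// Hypothetical helper: counts matches, resets the iterator, and counts again.
count_twice :: proc(text, pattern: string) -> (first, second: int) {
	it, err := regex.create_iterator(text, pattern)
	if err != nil {
		return 0, 0
	}
	defer regex.destroy(it)

	first = count_matches(&it)
	regex.reset(&it) // the iterator starts over from the beginning of `text`
	second = count_matches(&it)
	return
}

main :: proc() {
	fmt.println(count_twice("a1b22c", `\d`))
}
```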

Procedure Groups

destroy ¶

destroy :: proc{
	destroy_regex,
	destroy_capture,
	destroy_iterator,
}

match ¶

match :: proc{
	match_and_allocate_capture,
	match_with_preallocated_capture,
	match_iterator,
}

Source Files

Generation Information

Generated with odin version dev-2025-07 (vendor "odin") Windows_amd64 @ 2025-07-12 21:13:19.050261000 +0000 UTC
