package core:unicode/utf8
Index
Types (2)
Variables (2)
Procedures (21)
- decode_grapheme_clusters
- decode_last_rune_in_bytes
- decode_last_rune_in_string
- decode_rune_in_bytes
- decode_rune_in_string
- encode_rune
- full_rune_in_bytes
- full_rune_in_string
- grapheme_count
- rune_at
- rune_at_pos
- rune_count_in_bytes
- rune_count_in_string
- rune_offset
- rune_size
- rune_start
- rune_string_at_pos
- runes_to_string
- string_to_runes
- valid_rune
- valid_string
Procedure Groups (4)
Types
Constants
MAX_RUNE ¶
MAX_RUNE :: '\U0010ffff'
RUNE1_MAX ¶
RUNE1_MAX :: 1 << 7 - 1
RUNE2_MAX ¶
RUNE2_MAX :: 1 << 11 - 1
RUNE3_MAX ¶
RUNE3_MAX :: 1 << 16 - 1
RUNE_BOM ¶
RUNE_BOM :: 0xfeff
RUNE_EOF ¶
RUNE_EOF: rune : ~rune(0)
RUNE_ERROR ¶
RUNE_ERROR :: '\ufffd'
RUNE_SELF ¶
RUNE_SELF :: 0x80
SURROGATE_HIGH_MAX ¶
SURROGATE_HIGH_MAX :: 0xdbff
A high/leading surrogate is in the range SURROGATE_MIN..SURROGATE_HIGH_MAX; a low/trailing surrogate is in the range SURROGATE_LOW_MIN..SURROGATE_MAX.
SURROGATE_LOW_MIN ¶
SURROGATE_LOW_MIN :: 0xdc00
SURROGATE_MAX ¶
SURROGATE_MAX :: 0xdfff
SURROGATE_MIN ¶
SURROGATE_MIN :: 0xd800
ZERO_WIDTH_JOINER ¶
ZERO_WIDTH_JOINER :: unicode.ZERO_WIDTH_JOINER
Variables
accept_ranges ¶
accept_ranges: [5]Accept_Range = …
accept_sizes ¶
accept_sizes: [256]u8 = …
Procedures
decode_grapheme_clusters ¶
decode_grapheme_clusters :: proc(str: string, track_graphemes: bool = true, allocator := context.allocator) -> (graphemes: [dynamic]Grapheme, grapheme_count: int, rune_count: int, width: int) {…}
Decode the individual graphemes in a UTF-8 string.
Allocates Using Provided Allocator
Inputs:
str: The input string.
track_graphemes: Whether to allocate and return graphemes, with extra data about each grapheme.
allocator: (default: context.allocator)
Returns:
graphemes: Extra data about each grapheme.
grapheme_count: The number of graphemes in the string.
rune_count: The number of runes in the string.
width: The width of the string in number of monospace cells.
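A minimal usage sketch for the signature above. The sample string is an assumption chosen for illustration: "e" followed by U+0301 (combining acute accent), which is two runes but is normally rendered as one character. The fields of Grapheme are not shown on this page, so the sketch only looks at the length of the returned array.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	// "e" + U+0301 (combining acute accent), then "!".
	s := "e\u0301!"

	// track_graphemes defaults to true, so the dynamic array is allocated
	// with context.allocator and must be freed by the caller.
	graphemes, n_graphemes, n_runes, width := utf8.decode_grapheme_clusters(s)
	defer delete(graphemes)

	fmt.println("bytes:            ", len(s))
	fmt.println("runes:            ", n_runes)
	fmt.println("graphemes:        ", n_graphemes)
	fmt.println("monospace cells:  ", width)
	fmt.println("tracked graphemes:", len(graphemes))
}
```

If only the counts are needed, passing false for track_graphemes (or calling grapheme_count) should avoid the allocation.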
full_rune_in_bytes ¶
full_rune_in_bytes reports whether the bytes in b begin with a full UTF-8 encoding of a rune. An invalid encoding is considered a full rune, since it will be converted to an error rune (RUNE_ERROR) of width 1.
full_rune_in_string ¶
full_rune_in_string reports whether the bytes in s begin with a full UTF-8 encoding of a rune. An invalid encoding is considered a full rune, since it will be converted to an error rune (RUNE_ERROR) of width 1.
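The distinction between "incomplete" and "invalid" matters when reading UTF-8 from a stream in chunks. The sketch below is an illustration under assumed inputs: a complete three-byte encoding, the same encoding truncated to two bytes, and a single byte (0xff) that can never start a valid sequence.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	euro := "€" // U+20AC, encoded as three bytes in UTF-8

	// Complete encoding: the string starts with a full rune.
	fmt.println("complete: ", utf8.full_rune_in_string(euro))

	// Truncated encoding: only the first two of three bytes are present,
	// so more input is needed before the rune can be decoded.
	fmt.println("truncated:", utf8.full_rune_in_string(euro[:2]))

	// Invalid lead byte: per the description above, this counts as "full"
	// because decoding it would yield RUNE_ERROR with a width of 1.
	bad := []u8{0xff}
	fmt.println("invalid:  ", utf8.full_rune_in_bytes(bad))
}
```

A chunked reader would typically keep buffering while the check reports an incomplete rune, and decode as soon as it reports a full one.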
grapheme_count ¶
Count the individual graphemes in a UTF-8 string.
Inputs:
str: The input string.
Returns:
graphemes: The number of graphemes in the string.
runes: The number of runes in the string.
width: The width of the string in number of monospace cells.
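As a rough illustration of how the three return values differ from the byte length, the sketch below compares them for one assumed sample string ("e" followed by the combining acute accent U+0301). It uses only the inputs and returns listed above.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	// One user-perceived character built from two runes.
	s := "e\u0301"

	graphemes, runes, width := utf8.grapheme_count(s)

	fmt.println("len(s) in bytes:", len(s))
	fmt.println("runes:          ", runes)
	fmt.println("graphemes:      ", graphemes)
	fmt.println("width in cells: ", width)
}
```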
rune_offset ¶
Returns the byte position of the rune at rune index pos in s, counting from an optional start byte position. Returns -1 if pos lies beyond the end of the string.
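A small sketch of converting a rune index into a byte index. The signature is not shown on this page, so the argument order used here (string first, then the rune position, with the optional start position omitted) is an assumption based on the description above, as is the sample string.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	// 'é' occupies two bytes, so rune positions and byte positions diverge after it.
	s := "héllo"

	// Byte position of the rune at rune position 2.
	off := utf8.rune_offset(s, 2)
	if off >= 0 {
		fmt.println("byte offset:", off, "remainder:", s[off:])
	}

	// A position past the end of the string is reported as -1.
	fmt.println("out of range:", utf8.rune_offset(s, 100))
}
```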
runes_to_string ¶
runes_to_string :: proc(runes: []rune, allocator := context.allocator) -> string {…}
string_to_runes ¶
string_to_runes :: proc(s: string, allocator := context.allocator) -> (runes: []rune) {…}
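The two procedures above pair naturally for edits that are easier on runes than on bytes. The following is a minimal round-trip sketch with an assumed sample string; both calls allocate with context.allocator by default, so both results are freed here.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	s := "héllo"

	// Decode into a slice of runes so characters can be indexed directly.
	runes := utf8.string_to_runes(s)
	defer delete(runes)

	// Replace the second rune; on the byte level this would change the length.
	runes[1] = 'e'

	// Encode the runes back into a newly allocated UTF-8 string.
	edited := utf8.runes_to_string(runes)
	defer delete(edited)

	fmt.println(s, "->", edited)
}
```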
Procedure Groups
decode_last_rune ¶
decode_last_rune :: proc{ decode_last_rune_in_string, decode_last_rune_in_bytes, }
decode_rune ¶
decode_rune :: proc{ decode_rune_in_string, decode_rune_in_bytes, }
full_rune ¶
full_rune :: proc{ full_rune_in_bytes, full_rune_in_string, }
full_rune reports whether its argument (a byte slice or a string) begins with a full UTF-8 encoding of a rune. An invalid encoding is considered a full rune, since it will be converted to an error rune (RUNE_ERROR) of width 1.
rune_count ¶
rune_count :: proc{ rune_count_in_string, rune_count_in_bytes, }
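The procedure groups let the same name be used for both strings and byte slices. The sketch below is illustrative only: it calls rune_count with each argument type and then walks a string with decode_rune, assuming decode_rune returns the rune and its size in bytes.

```odin
package main

import "core:fmt"
import "core:unicode/utf8"

main :: proc() {
	s := "aé€"
	bytes := transmute([]u8)s // view the same data as a byte slice

	// The group picks rune_count_in_string or rune_count_in_bytes by argument type.
	fmt.println("runes (string):", utf8.rune_count(s))
	fmt.println("runes (bytes): ", utf8.rune_count(bytes))

	// Decode one rune at a time, advancing by the size of each decoded rune.
	rest := s
	for len(rest) > 0 {
		r, size := utf8.decode_rune(rest)
		fmt.println(r, "uses", size, "byte(s)")
		rest = rest[size:]
	}
}
```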
Source Files
Generation Information
Generated with odin version dev-2024-12 (vendor "odin") Windows_amd64 @ 2024-12-17 21:11:02.096687600 +0000 UTC