package core:unicode


    Overview

    Data and procedures to test properties of Unicode code points.

    Index

    Types (1)
    Variables (123)
    Procedure Groups (0)

    This section is empty.

    Types

    Range ¶

    Range :: struct {
    	single_16: []u16,
    	ranges_16: []u16,
    	single_32: []i32,
    	ranges_32: []i32,
    }
    Related Procedures With Parameters

    in_range

    Constants

    MAX_ASCII ¶

    MAX_ASCII :: '\u007f'
     

    Maximum ASCII value

    MAX_LATIN1 ¶

    MAX_LATIN1 :: '\u00ff'
     

    Maximum Latin-1 value

    MAX_RUNE ¶

    MAX_RUNE :: '\U0010ffff'
     

    Maximum valid Unicode code point

    REPLACEMENT_CHAR ¶

    REPLACEMENT_CHAR :: '\ufffd'
     

    Represents an invalid code point
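
    As an illustration of how these constants can be used, here is a minimal sketch. The helpers `is_ascii` and `sanitize` are hypothetical names that are not part of the package, and the surrogate bounds 0xD800 to 0xDFFF come from the Unicode standard rather than from this page.

```odin
package main

import "core:fmt"
import "core:unicode"

// is_ascii is a hypothetical helper: true for code points up to MAX_ASCII.
is_ascii :: proc(r: rune) -> bool {
	return r >= 0 && r <= unicode.MAX_ASCII
}

// sanitize is a hypothetical helper: it maps out-of-range values and
// UTF-16 surrogates to REPLACEMENT_CHAR and passes everything else through.
sanitize :: proc(r: rune) -> rune {
	if r < 0 || r > unicode.MAX_RUNE {
		return unicode.REPLACEMENT_CHAR
	}
	if r >= 0xD800 && r <= 0xDFFF {
		return unicode.REPLACEMENT_CHAR
	}
	return r
}

main :: proc() {
	fmt.println(is_ascii('A'))
	fmt.println(is_ascii('\u00e9'))
	fmt.println(sanitize(rune(0x110000)) == unicode.REPLACEMENT_CHAR)
}
```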

    WORD_JOINER ¶

    WORD_JOINER :: '\u2060'

    ZERO_WIDTH_JOINER ¶

    ZERO_WIDTH_JOINER :: '\u200D'

    ZERO_WIDTH_NON_JOINER ¶

    ZERO_WIDTH_NON_JOINER :: '\u200C'

    ZERO_WIDTH_SPACE ¶

    ZERO_WIDTH_SPACE :: '\u200B'
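
    One possible use of the four zero-width constants is filtering invisible formatting characters out of text. The sketch below assumes the `core:strings` builder API; `is_zero_width_format` and `write_without_zero_width` are hypothetical helpers, and they cover only the constants listed here, not every invisible code point Unicode defines.

```odin
package main

import "core:fmt"
import "core:strings"
import "core:unicode"

// is_zero_width_format is a hypothetical helper that matches only the
// four zero-width constants exported by core:unicode.
is_zero_width_format :: proc(r: rune) -> bool {
	switch r {
	case unicode.ZERO_WIDTH_SPACE,
	     unicode.ZERO_WIDTH_NON_JOINER,
	     unicode.ZERO_WIDTH_JOINER,
	     unicode.WORD_JOINER:
		return true
	}
	return false
}

// write_without_zero_width copies s into the builder, skipping those runes.
write_without_zero_width :: proc(b: ^strings.Builder, s: string) {
	for r in s {
		if !is_zero_width_format(r) {
			strings.write_rune(b, r)
		}
	}
}

main :: proc() {
	b := strings.builder_make()
	defer strings.builder_destroy(&b)

	write_without_zero_width(&b, "zero\u200Bwidth\u2060text")
	fmt.println(strings.to_string(b)) // prints "zerowidthtext"
}
```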

    Variables

    alpha_ranges ¶

    @(rodata)
    alpha_ranges: [304]i32 = …

    alpha_singlets ¶

    @(rodata)
    alpha_singlets: [32]i32 = …

    char_properties ¶

    @(rodata)
    char_properties: [256]u8 = …

    emoji_extended_pictographic_ranges ¶

    @(rodata)
    emoji_extended_pictographic_ranges: [1022]i32 = …

    extra_digits_ranges ¶

    extra_digits_ranges: Range = …

    extra_digits_ranges16 ¶

    @(rodata)
    extra_digits_ranges16: [22]u16 = …

    extra_digits_ranges32 ¶

    @(rodata)
    extra_digits_ranges32: [8]i32 = …

    extra_digits_singles16 ¶

    @(rodata)
    extra_digits_singles16: [5]u16 = …

    grapheme_extend_ranges ¶

    @(rodata)
    grapheme_extend_ranges: [752]i32 = …

    hangul_syllable_lv_singlets ¶

    @(rodata)
    hangul_syllable_lv_singlets: [399]i32 = …

    hangul_syllable_lvt_ranges ¶

    @(rodata)
    hangul_syllable_lvt_ranges: [798]i32 = …

    indic_conjunct_break_consonant_ranges ¶

    @(rodata)
    indic_conjunct_break_consonant_ranges: [52]i32 = …

    indic_conjunct_break_extend_ranges ¶

    @(rodata)
    indic_conjunct_break_extend_ranges: [340]i32 = …

    ll_ranges ¶

    ll_ranges: Range = …

    ll_ranges16 ¶

    @(rodata)
    ll_ranges16: [144]u16 = …

    ll_ranges32 ¶

    @(rodata)
    ll_ranges32: [82]i32 = …

    ll_singles16 ¶

    @(rodata)
    ll_singles16: [549]u16 = …

    ll_singles32 ¶

    @(rodata)
    ll_singles32: [2]i32 = …

    lm_ranges ¶

    lm_ranges: Range = …

    lm_ranges16 ¶

    @(rodata)
    lm_ranges16: [42]u16 = …

    lm_ranges32 ¶

    @(rodata)
    lm_ranges32: [28]i32 = …

    lm_singles16 ¶

    @(rodata)
    lm_singles16: [36]u16 = …

    lm_singles32 ¶

    @(rodata)
    lm_singles32: [8]i32 = …

    lo_ranges ¶

    lo_ranges: Range = …

    lo_ranges16 ¶

    @(rodata)
    lo_ranges16: [472]u16 = …

    lo_ranges32 ¶

    @(rodata)
    lo_ranges32: [382]i32 = …

    lo_singles16 ¶

    @(rodata)
    lo_singles16: [55]u16 = …

    lo_singles32 ¶

    @(rodata)
    lo_singles32: [60]i32 = …

    lt_ranges ¶

    lt_ranges: Range = …

    lt_ranges16 ¶

    @(rodata)
    lt_ranges16: [6]u16 = …

    lt_singles16 ¶

    @(rodata)
    lt_singles16: [7]u16 = …

    lu_ranges ¶

    lu_ranges: Range = …

    lu_ranges16 ¶

    @(rodata)
    lu_ranges16: [120]u16 = …

    lu_ranges32 ¶

    @(rodata)
    lu_ranges32: [78]i32 = …

    lu_singles16 ¶

    @(rodata)
    lu_singles16: [552]u16 = …

    lu_singles32 ¶

    @(rodata)
    lu_singles32: [4]i32 = …

    mc_ranges ¶

    mc_ranges: Range = …

    mc_ranges16 ¶

    @(rodata)
    mc_ranges16: [142]u16 = …

    mc_ranges32 ¶

    @(rodata)
    mc_ranges32: [86]i32 = …

    mc_singles16 ¶

    @(rodata)
    mc_singles16: [41]u16 = …

    mc_singles32 ¶

    @(rodata)
    mc_singles32: [38]i32 = …

    me_ranges ¶

    me_ranges: Range = …

    me_ranges16 ¶

    @(rodata)
    me_ranges16: [8]u16 = …

    me_singles16 ¶

    @(rodata)
    me_singles16: [1]u16 = …

    mn_ranges ¶

    mn_ranges: Range = …

    mn_ranges16 ¶

    @(rodata)
    mn_ranges16: [264]u16 = …

    mn_ranges32 ¶

    @(rodata)
    mn_ranges32: [206]i32 = …

    mn_singles16 ¶

    @(rodata)
    mn_singles16: [81]u16 = …

    mn_singles32 ¶

    @(rodata)
    mn_singles32: [49]i32 = …

    nd_ranges ¶

    nd_ranges: Range = …

    nd_ranges16 ¶

    @(rodata)
    nd_ranges16: [74]u16 = …

    nd_ranges32 ¶

    @(rodata)
    nd_ranges32: [70]i32 = …

    nl_ranges ¶

    nl_ranges: Range = …

    nl_ranges16 ¶

    @(rodata)
    nl_ranges16: [12]u16 = …

    nl_ranges32 ¶

    @(rodata)
    nl_ranges32: [8]i32 = …

    nl_singles16 ¶

    @(rodata)
    nl_singles16: [1]u16 = …

    nl_singles32 ¶

    @(rodata)
    nl_singles32: [2]i32 = …

    no_ranges ¶

    no_ranges: Range = …

    no_ranges16 ¶

    @(rodata)
    no_ranges16: [48]u16 = …

    no_ranges32 ¶

    @(rodata)
    no_ranges32: [86]i32 = …

    no_singles16 ¶

    @(rodata)
    no_singles16: [5]u16 = …

    nonspacing_mark_ranges ¶

    @(rodata)
    nonspacing_mark_ranges: [692]i32 = …

    normalized_east_asian_width_ranges ¶

    @(rodata)
    normalized_east_asian_width_ranges: [489]i32 = …
     

    Fullwidth (F) and Wide (W) are counted as 2. Everything else is 1.

    Derived from: https://unicode.org/Public/15.1.0/ucd/EastAsianWidth.txt

    other_lowercase_ranges ¶

    other_lowercase_ranges: Range = …

    other_lowercase_ranges16 ¶

    @(rodata)
    other_lowercase_ranges16: [26]u16 = …

    other_lowercase_ranges32 ¶

    @(rodata)
    other_lowercase_ranges32: [8]i32 = …

    other_lowercase_singles16 ¶

    @(rodata)
    other_lowercase_singles16: [10]u16 = …

    other_lowercase_singles32 ¶

    @(rodata)
    other_lowercase_singles32: [1]i32 = …

    other_uppercase_ranges ¶

    other_uppercase_ranges: Range = …

    other_uppercase_ranges16 ¶

    @(rodata)
    other_uppercase_ranges16: [4]u16 = …

    other_uppercase_ranges32 ¶

    @(rodata)
    other_uppercase_ranges32: [6]i32 = …

    pc_ranges ¶

    pc_ranges: Range = …

    pc_ranges16 ¶

    @(rodata)
    pc_ranges16: [6]u16 = …

    pc_singles16 ¶

    @(rodata)
    pc_singles16: [3]u16 = …

    pd_ranges ¶

    pd_ranges: Range = …

    pd_ranges16 ¶

    @(rodata)
    pd_ranges16: [6]u16 = …

    pd_singles16 ¶

    @(rodata)
    pd_singles16: [15]u16 = …

    pd_singles32 ¶

    @(rodata)
    pd_singles32: [2]i32 = …

    pe_ranges ¶

    pe_ranges: Range = …

    pe_ranges16 ¶

    @(rodata)
    pe_ranges16: [2]u16 = …

    pe_singles16 ¶

    @(rodata)
    pe_singles16: [75]u16 = …

    pf_ranges ¶

    pf_ranges: Range = …

    pf_singles16 ¶

    @(rodata)
    pf_singles16: [10]u16 = …

    pi_ranges ¶

    pi_ranges: Range = …

    pi_ranges16 ¶

    @(rodata)
    pi_ranges16: [2]u16 = …

    pi_singles16 ¶

    @(rodata)
    pi_singles16: [10]u16 = …

    po_ranges ¶

    po_ranges: Range = …

    po_ranges16 ¶

    @(rodata)
    po_ranges16: [168]u16 = …

    po_ranges32 ¶

    @(rodata)
    po_ranges32: [80]i32 = …

    po_singles16 ¶

    @(rodata)
    po_singles16: [47]u16 = …

    po_singles32 ¶

    @(rodata)
    po_singles32: [23]i32 = …

    ps_ranges ¶

    ps_ranges: Range = …

    ps_singles16 ¶

    @(rodata)
    ps_singles16: [79]u16 = …

    sc_ranges ¶

    sc_ranges: Range = …

    sc_ranges16 ¶

    @(rodata)
    sc_ranges16: [12]u16 = …

    sc_ranges32 ¶

    @(rodata)
    sc_ranges32: [2]i32 = …

    sc_singles16 ¶

    @(rodata)
    sc_singles16: [12]u16 = …

    sc_singles32 ¶

    @(rodata)
    sc_singles32: [2]i32 = …

    sk_ranges ¶

    sk_ranges: Range = …

    sk_ranges16 ¶

    @(rodata)
    sk_ranges16: [32]u16 = …

    sk_ranges32 ¶

    @(rodata)
    sk_ranges32: [2]i32 = …

    sk_singles16 ¶

    @(rodata)
    sk_singles16: [14]u16 = …

    sm_ranges ¶

    sm_ranges: Range = …

    sm_ranges16 ¶

    @(rodata)
    sm_ranges16: [50]u16 = …

    sm_ranges32 ¶

    @(rodata)
    sm_ranges32: [6]i32 = …

    sm_singles16 ¶

    @(rodata)
    sm_singles16: [28]u16 = …

    sm_singles32 ¶

    @(rodata)
    sm_singles32: [11]i32 = …

    so_ranges ¶

    so_ranges: Range = …

    so_ranges16 ¶

    @(rodata)
    so_ranges16: [162]u16 = …

    so_ranges32 ¶

    @(rodata)
    so_ranges32: [130]i32 = …

    so_singles16 ¶

    @(rodata)
    so_singles16: [35]u16 = …

    so_singles32 ¶

    @(rodata)
    so_singles32: [12]i32 = …

    space_ranges ¶

    @(rodata)
    space_ranges: [26]i32 = …

    spacing_mark_ranges ¶

    @(rodata)
    spacing_mark_ranges: [364]i32 = …

    to_lower_ranges ¶

    @(rodata)
    to_lower_ranges: [108]i32 = …

    to_lower_singlets ¶

    @(rodata)
    to_lower_singlets: [666]i32 = …

    to_title_singlets ¶

    @(rodata)
    to_title_singlets: [16]i32 = …

    to_upper_ranges ¶

    @(rodata)
    to_upper_ranges: [105]i32 = …

    to_upper_singlets ¶

    @(rodata)
    to_upper_singlets: [680]i32 = …

    unicode_spaces ¶

    @(rodata)
    unicode_spaces: [18]i32 = …

    zs_ranges ¶

    zs_ranges: Range = …

    zs_ranges16 ¶

    @(rodata)
    zs_ranges16: [2]u16 = …

    zs_singles16 ¶

    @(rodata)
    zs_singles16: [6]u16 = …

    Procedures

    binary_search ¶

    binary_search :: proc(c: $T, table: []$T, length, stride: int, loc := #caller_location) -> int {…}

    in_range ¶

    in_range :: proc(r: rune, range: Range) -> bool {…}
     

    Checks whether the rune r is in range.
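
    A minimal sketch of calling in_range with one of the package-level Range variables. It assumes, from its name only, that nd_ranges is the table for the general category Nd; the sample runes are arbitrary.

```odin
package main

import "core:fmt"
import "core:unicode"

main :: proc() {
	// '7' is an ASCII digit, U+0667 is ARABIC-INDIC DIGIT SEVEN, 'x' is a letter.
	samples := [?]rune{'7', '\u0667', 'x'}
	for r in samples {
		fmt.printf("U+%04X in nd_ranges: %v\n", i32(r), unicode.in_range(r, unicode.nd_ranges))
	}
}
```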

    is_alpha ¶

    is_alpha :: is_letter
     

    Returns true if the rune r is a letter. Being a letter means that the rune has the Unicode general category property of L. In practice, the character will have a general category property of Ll, Lm, Lo, Lt, or Lu.

    Inputs:
    r: The rune which will be checked for having the property of being a letter.

    Returns:
    true when the rune r is a letter. false will be returned in all other cases.

    is_combining ¶

    is_combining :: proc(r: rune) -> bool {…}

    is_control ¶

    is_control :: proc(r: rune) -> bool {…}

    is_decimal ¶

    is_decimal :: proc(r: rune) -> bool {…}
     

    Returns true if the rune r is in the general category Nd.

    Inputs:
    r: The rune to check if it is in the general category Nd.

    Returns:
    true if the rune is in the general category Nd and false otherwise.

    is_digit ¶

    is_digit :: proc(r: rune) -> bool {…}
     

    Determines whether a rune is a digit. To be a digit, the character must have a Numeric_Type of either Digit or Decimal.

    Inputs:
    r: The rune to check if it is a digit.

    Returns:
    true if the rune r is a digit, false in all other cases.
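
    To see how is_decimal, is_digit, and is_number relate, the sketch below prints all three for a few sample runes. The samples are chosen from the Unicode data, not from this page: U+00B2 SUPERSCRIPT TWO has general category No with Numeric_Type=Digit, and U+2167 ROMAN NUMERAL EIGHT has general category Nl. No output is asserted here.

```odin
package main

import "core:fmt"
import "core:unicode"

main :: proc() {
	// '7' (Nd), U+00B2 (No, Numeric_Type=Digit), U+2167 (Nl), 'x' (a letter).
	samples := [?]rune{'7', '\u00b2', '\u2167', 'x'}
	for r in samples {
		fmt.printf(
			"U+%04X decimal=%v digit=%v number=%v\n",
			i32(r),
			unicode.is_decimal(r),
			unicode.is_digit(r),
			unicode.is_number(r),
		)
	}
}
```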

    is_emoji_extended_pictographic ¶

    is_emoji_extended_pictographic :: proc(r: rune) -> bool {…}
     

    Extended_Pictographic

    is_emoji_modifier ¶

    is_emoji_modifier :: proc(r: rune) -> bool {…}
     

    Emoji_Modifier

    is_enclosing_mark ¶

    is_enclosing_mark :: proc(r: rune) -> bool {…}
     

    General_Category=Enclosing_Mark

    is_gcb_extend_class ¶

    is_gcb_extend_class :: proc(r: rune) -> bool {…}
     

    For grapheme text segmentation, from Unicode TR 29 Rev 43:

    Grapheme_Extend = Yes, or
    Emoji_Modifier = Yes

    This includes:
    General_Category = Nonspacing_Mark
    General_Category = Enclosing_Mark
    U+200C ZERO WIDTH NON-JOINER

    plus a few General_Category = Spacing_Mark needed for canonical equivalence.
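
    As a rough illustration of what the Extend class is for, the sketch below counts the runes in a string that are not in that class. This is only an approximation of the number of user-perceived characters, since a real segmenter applies all of the TR 29 rules (Hangul, emoji sequences, regional indicators, and so on). `count_non_extend` is a hypothetical helper.

```odin
package main

import "core:fmt"
import "core:unicode"

// count_non_extend is a hypothetical helper: it counts runes that do not
// belong to the grapheme-cluster-break Extend class.
count_non_extend :: proc(s: string) -> int {
	n := 0
	for r in s {
		if !unicode.is_gcb_extend_class(r) {
			n += 1
		}
	}
	return n
}

main :: proc() {
	// 'e' followed by U+0301 COMBINING ACUTE ACCENT (a nonspacing mark).
	fmt.println(count_non_extend("e\u0301"))
}
```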

    is_gcb_prepend_class ¶

    is_gcb_prepend_class :: proc(r: rune) -> bool {…}
     

    For grapheme text segmentation, from Unicode TR 29 Rev 43:

    Indic_Syllabic_Category = Consonant_Preceding_Repha, or
    Indic_Syllabic_Category = Consonant_Prefixed, or
    Prepended_Concatenation_Mark = Yes

    is_grapheme_extend ¶

    is_grapheme_extend :: proc(r: rune) -> bool {…}
     

    Grapheme_Extend

    is_graphic ¶

    is_graphic :: proc(r: rune) -> bool {…}

    is_hangul_syllable_leading ¶

    is_hangul_syllable_leading :: proc(r: rune) -> bool {…}
     

    Hangul_Syllable_Type=Leading_Jamo

    is_hangul_syllable_lv ¶

    is_hangul_syllable_lv :: proc(r: rune) -> bool {…}
     

    Hangul_Syllable_Type=LV_Syllable

    is_hangul_syllable_lvt ¶

    is_hangul_syllable_lvt :: proc(r: rune) -> bool {…}
     

    Hangul_Syllable_Type=LVT_Syllable

    is_hangul_syllable_trailing ¶

    is_hangul_syllable_trailing :: proc(r: rune) -> bool {…}
     

    Hangul_Syllable_Type=Trailing_Jamo

    is_hangul_syllable_vowel ¶

    is_hangul_syllable_vowel :: proc(r: rune) -> bool {…}
     

    Hangul_Syllable_Type=Vowel_Jamo
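
    The five Hangul_Syllable_Type predicates can be folded into a single classification. The sketch below is one way to do that; the `Hangul_Kind` enum and `hangul_kind` procedure are hypothetical names introduced for illustration, and the sample code points are taken from the Unicode Hangul Jamo and Hangul Syllables blocks.

```odin
package main

import "core:fmt"
import "core:unicode"

// Hangul_Kind is a hypothetical enum mirroring Hangul_Syllable_Type values.
Hangul_Kind :: enum {
	None,
	Leading,
	Vowel,
	Trailing,
	LV,
	LVT,
}

// hangul_kind is a hypothetical helper built on the package predicates.
hangul_kind :: proc(r: rune) -> Hangul_Kind {
	switch {
	case unicode.is_hangul_syllable_leading(r):  return .Leading
	case unicode.is_hangul_syllable_vowel(r):    return .Vowel
	case unicode.is_hangul_syllable_trailing(r): return .Trailing
	case unicode.is_hangul_syllable_lv(r):       return .LV
	case unicode.is_hangul_syllable_lvt(r):      return .LVT
	}
	return .None
}

main :: proc() {
	// Leading jamo, vowel jamo, trailing jamo, two precomposed syllables, and 'a'.
	samples := [?]rune{'\u1100', '\u1161', '\u11a8', '\uac00', '\uac01', 'a'}
	for r in samples {
		fmt.printf("U+%04X: %v\n", i32(r), hangul_kind(r))
	}
}
```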

    is_indic_conjunct_break_consonant ¶

    is_indic_conjunct_break_consonant :: proc(r: rune) -> bool {…}
     

    Indic_Conjunct_Break=Consonant

    is_indic_conjunct_break_extend ¶

    is_indic_conjunct_break_extend :: proc(r: rune) -> bool {…}
     

    Indic_Conjunct_Break=Extend

    is_indic_conjunct_break_linker ¶

    is_indic_conjunct_break_linker :: proc(r: rune) -> bool {…}
     

    Indic_Conjunct_Break=Linker

    is_indic_consonant_preceding_repha ¶

    is_indic_consonant_preceding_repha :: proc(r: rune) -> bool {…}
     

    Indic_Syllabic_Category=Consonant_Preceding_Repha

    is_indic_consonant_prefixed ¶

    is_indic_consonant_prefixed :: proc(r: rune) -> bool {…}
     

    Indic_Syllabic_Category=Consonant_Prefixed

    is_letter ¶

    is_letter :: proc(r: rune) -> bool {…}
     

    Returns true if the rune r is a letter. Being a letter means that the rune has the Unicode general category property of L. In practice, the character will have a general category property of Ll, Lm, Lo, Lt, or Lu.

    Inputs:
    r: The rune which will be checked for having the property of being a letter.

    Returns:
    true when the rune r is a letter. false will be returned in all other cases.
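
    A short usage sketch: counting the letters in a UTF-8 string by iterating over its runes. `count_letters` is a hypothetical helper, not part of the package.

```odin
package main

import "core:fmt"
import "core:unicode"

// count_letters is a hypothetical helper that counts runes for which
// unicode.is_letter reports true.
count_letters :: proc(s: string) -> int {
	n := 0
	for r in s {
		if unicode.is_letter(r) {
			n += 1
		}
	}
	return n
}

main :: proc() {
	// Four letters in "Odin", five in "h\u00e9llo"; digits, comma, and spaces are skipped.
	fmt.println(count_letters("Odin 2026, h\u00e9llo")) // prints 9
}
```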

    is_lower ¶

    is_lower :: proc(r: rune) -> bool {…}

    is_nonspacing_mark ¶

    is_nonspacing_mark :: proc(r: rune) -> bool {…}
     

    General_Category=Nonspacing_Mark

    is_number ¶

    is_number :: proc(r: rune) -> bool {…}
     

    Checks whether the rune r is a number, meaning that it is a member of the general category Nd, Nl, or No.

    Inputs:
    r: The rune to check if it is a number.

    Returns:
    true if the rune belongs to the general category Nd, Nl, or No. false is returned in all other cases.

    is_prepended_concatenation_mark ¶

    is_prepended_concatenation_mark :: proc(r: rune) -> bool {…}
     

    Prepended_Concatenation_Mark

    is_print ¶

    is_print :: proc(r: rune) -> bool {…}

    is_punct ¶

    is_punct :: proc(r: rune) -> bool {…}

    is_regional_indicator ¶

    is_regional_indicator :: proc(r: rune) -> bool {…}
     

    Regional_Indicator

    is_space ¶

    is_space :: proc(r: rune) -> bool {…}
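
    A small usage sketch for is_space: counting whitespace-separated words. `count_words` is a hypothetical helper, and the example relies only on space, tab, and newline being treated as whitespace.

```odin
package main

import "core:fmt"
import "core:unicode"

// count_words is a hypothetical helper: a word is a maximal run of
// runes for which unicode.is_space reports false.
count_words :: proc(s: string) -> int {
	n := 0
	in_word := false
	for r in s {
		if unicode.is_space(r) {
			in_word = false
		} else if !in_word {
			in_word = true
			n += 1
		}
	}
	return n
}

main :: proc() {
	fmt.println(count_words("  data and\tprocedures\n")) // prints 3
}
```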

    is_spacing_mark ¶

    is_spacing_mark :: proc(r: rune) -> bool {…}
     

    General_Category=Spacing_Mark

    is_symbol ¶

    is_symbol :: proc(r: rune) -> bool {…}

    is_title ¶

    is_title :: proc(r: rune) -> bool {…}

    is_upper ¶

    is_upper :: proc(r: rune) -> bool {…}

    normalized_east_asian_width ¶

    normalized_east_asian_width :: proc(r: rune) -> int {…}
     

    Return values:

    2 if East_Asian_Width=F or W,
    0 if non-printable / zero-width,
    1 in all other cases.
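
    A typical use is estimating how many terminal columns a string occupies. The sketch below sums the per-rune widths; `display_width` is a hypothetical helper, and summing per rune ignores grapheme clusters, so treat the result as an estimate.

```odin
package main

import "core:fmt"
import "core:unicode"

// display_width is a hypothetical helper that adds up the normalized
// East Asian width of every rune in s.
display_width :: proc(s: string) -> int {
	w := 0
	for r in s {
		w += unicode.normalized_east_asian_width(r)
	}
	return w
}

main :: proc() {
	fmt.println(display_width("abc"))
	fmt.println(display_width("\u65e5\u672c\u8a9e")) // three CJK ideographs
}
```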

    simple_fold ¶

    simple_fold :: proc(r: rune) -> rune {…}
     

    simple_fold iterates over the Unicode code points that are equivalent under the Unicode-defined simple case folding. Among the code points equivalent to r (including r itself), simple_fold returns the smallest rune > r if one exists, or else the smallest rune >= 0. If r is not a valid Unicode code point, r is returned.

    Example:
    simple_fold('A')      == 'a'
    simple_fold('a')      == 'A'
    simple_fold('Z')      == 'z'
    simple_fold('z')      == 'Z'
    simple_fold('7')      == '7'
    simple_fold('k')      == '\u212a' (Kelvin symbol, K)
    simple_fold('\u212a') == 'k'
    simple_fold(-3)       == -3
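
    Because simple_fold steps from one equivalent code point to the next, it can be used to compare two runes without regard to case. The sketch below is one way to do this; `runes_equal_fold` is a hypothetical helper, and the `MAX_STEPS` bound is a defensive assumption so the loop terminates even if the walk does not return to its starting rune.

```odin
package main

import "core:fmt"
import "core:unicode"

// runes_equal_fold is a hypothetical helper: it walks the code points
// reachable from a via simple_fold and reports whether b is among them.
runes_equal_fold :: proc(a, b: rune) -> bool {
	if a == b {
		return true
	}
	MAX_STEPS :: 8 // assumed upper bound on the number of steps
	f := unicode.simple_fold(a)
	for _ in 0..<MAX_STEPS {
		if f == b {
			return true
		}
		if f == a {
			break // back at the start without finding b
		}
		f = unicode.simple_fold(f)
	}
	return false
}

main :: proc() {
	fmt.println(runes_equal_fold('A', 'a'))
	fmt.println(runes_equal_fold('7', '8'))
}
```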
    

    to_lower ¶

    to_lower :: proc(r: rune) -> rune {…}

    to_title ¶

    to_title :: proc(r: rune) -> rune {…}

    to_upper ¶

    to_upper :: proc(r: rune) -> rune {…}
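
    The case-mapping procedures work on one rune at a time, so converting a string means mapping each rune and re-encoding the result. The sketch below assumes the `core:strings` builder API; `write_upper` is a hypothetical helper. Since each call maps one rune to one rune, mappings that expand to several characters cannot be expressed this way.

```odin
package main

import "core:fmt"
import "core:strings"
import "core:unicode"

// write_upper is a hypothetical helper that appends the upper-cased
// form of every rune in s to the builder.
write_upper :: proc(b: ^strings.Builder, s: string) {
	for r in s {
		strings.write_rune(b, unicode.to_upper(r))
	}
}

main :: proc() {
	b := strings.builder_make()
	defer strings.builder_destroy(&b)

	write_upper(&b, "odin")
	fmt.println(strings.to_string(b)) // prints "ODIN"
}
```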

    Procedure Groups

    This section is empty.

    Source Files

    Generation Information

    Generated with odin version dev-2026-03 (vendor "odin") Windows_amd64 @ 2026-03-22 21:18:17.000998000 +0000 UTC