package base:intrinsics
Overview
package intrinsics provides documentation for Odin's compiler-level intrinsics.
Index
Constants (0)
This section is empty.
Procedures (225)
- alloca
- atomic_add
- atomic_add_explicit
- atomic_and
- atomic_and_explicit
- atomic_compare_exchange_strong
- atomic_compare_exchange_strong_explicit
- atomic_compare_exchange_weak
- atomic_compare_exchange_weak_explicit
- atomic_exchange
- atomic_exchange_explicit
- atomic_load
- atomic_load_explicit
- atomic_nand
- atomic_nand_explicit
- atomic_or
- atomic_or_explicit
- atomic_signal_fence
- atomic_store
- atomic_store_explicit
- atomic_sub
- atomic_sub_explicit
- atomic_thread_fence
- atomic_type_is_lock_free
- atomic_xor
- atomic_xor_explicit
- byte_swap
- concatenate
- constant_log2
- constant_utf16_cstring
- count_leading_zeros
- count_ones
- count_trailing_zeros
- count_zeros
- cpu_relax
- debug_trap
- expect
- fixed_point_div
- fixed_point_div_sat
- fixed_point_mul
- fixed_point_mul_sat
- fused_mul_add
- hadamard_product
- has_target_feature
- matrix_flatten
- mem_copy
- mem_copy_non_overlapping
- mem_zero
- mem_zero_volatile
- non_temporal_load
- non_temporal_store
- objc_block
- objc_find_class
- objc_find_selector
- objc_ivar_get
- objc_register_class
- objc_register_selector
- objc_super
- outer_product
- overflow_add
- overflow_mul
- overflow_sub
- prefetch_read_data
- prefetch_read_instruction
- prefetch_write_data
- prefetch_write_instruction
- procedure_of
- ptr_offset
- ptr_sub
- read_cycle_counter
- read_cycle_counter_frequency
- reverse_bits
- saturating_add
- saturating_sub
- simd_abs
- simd_add
- simd_bit_and
- simd_bit_and_not
- simd_bit_or
- simd_bit_xor
- simd_ceil
- simd_clamp
- simd_div
- simd_extract
- simd_extract_lsbs
- simd_extract_msbs
- simd_floor
- simd_gather
- simd_lanes_eq
- simd_lanes_ge
- simd_lanes_gt
- simd_lanes_le
- simd_lanes_lt
- simd_lanes_ne
- simd_lanes_reverse
- simd_lanes_rotate_left
- simd_lanes_rotate_right
- simd_masked_compress_store
- simd_masked_expand_load
- simd_masked_load
- simd_masked_store
- simd_max
- simd_min
- simd_mul
- simd_nearest
- simd_neg
- simd_reduce_add_ordered
- simd_reduce_all
- simd_reduce_and
- simd_reduce_any
- simd_reduce_max
- simd_reduce_min
- simd_reduce_mul_ordered
- simd_reduce_or
- simd_reduce_xor
- simd_replace
- simd_runtime_swizzle
- simd_saturating_add
- simd_saturating_sub
- simd_scatter
- simd_select
- simd_shl
- simd_shl_masked
- simd_shr
- simd_shr_masked
- simd_shuffle
- simd_sub
- simd_to_bits
- simd_trunc
- soa_struct
- sqrt
- syscall
- syscall_bsd
- transpose
- trap
- type_base_type
- type_bit_set_elem_type
- type_bit_set_underlying_type
- type_convert_variants_to_pointers
- type_core_type
- type_elem_type
- type_equal_proc
- type_field_index_of
- type_field_type
- type_has_field
- type_has_nil
- type_hasher_proc
- type_is_any
- type_is_array
- type_is_bit_set
- type_is_boolean
- type_is_comparable
- type_is_complex
- type_is_cstring
- type_is_cstring16
- type_is_dereferenceable
- type_is_dynamic_array
- type_is_endian_big
- type_is_endian_little
- type_is_endian_platform
- type_is_enum
- type_is_enumerated_array
- type_is_float
- type_is_indexable
- type_is_integer
- type_is_map
- type_is_matrix
- type_is_matrix_column_major
- type_is_matrix_row_major
- type_is_multi_pointer
- type_is_named
- type_is_nearly_simple_compare
- type_is_numeric
- type_is_ordered
- type_is_ordered_numeric
- type_is_pointer
- type_is_proc
- type_is_quaternion
- type_is_raw_union
- type_is_rune
- type_is_simd_vector
- type_is_simple_compare
- type_is_slice
- type_is_sliceable
- type_is_specialization_of
- type_is_specialized_polymorphic_record
- type_is_string
- type_is_string16
- type_is_struct
- type_is_subtype_of
- type_is_typeid
- type_is_union
- type_is_unsigned
- type_is_unspecialized_polymorphic_record
- type_is_valid_map_key
- type_is_valid_matrix_elements
- type_is_variant_of
- type_map_cell_info
- type_map_info
- type_merge
- type_polymorphic_record_parameter_count
- type_polymorphic_record_parameter_value
- type_proc_parameter_count
- type_proc_parameter_type
- type_proc_return_count
- type_proc_return_type
- type_struct_field_count
- type_struct_has_implicit_padding
- type_union_base_tag_value
- type_union_tag_offset
- type_union_tag_type
- type_union_variant_count
- type_variant_index_of
- type_variant_type_of
- unaligned_load
- unaligned_store
- valgrind_client_request
- volatile_load
- volatile_store
- wasm_memory_atomic_notify32
- wasm_memory_atomic_wait32
- wasm_memory_grow
- wasm_memory_size
- x86_cpuid
- x86_xgetbv
Procedure Groups (0)
This section is empty.
Constants
Types
Atomic_Memory_Order ¶
Atomic_Memory_Order :: enum {
	Relaxed = 0, // Unordered
	Consume = 1, // Monotonic
	Acquire = 2,
	Release = 3,
	Acq_Rel = 4,
	Seq_Cst = 5,
}
Describes memory ordering for an atomic operation.
Modern CPUs contain multiple cores, with caches specific to those cores. When a core performs a write to memory, the value is written to its cache first. The issue is that a core typically doesn't see what's inside the caches of other cores. To make operations consistent, CPUs implement mechanisms that synchronize memory operations across cores, either by asking other cores or by pushing data about writes to them.
Due to how these mechanisms are implemented, the stores and loads performed by one core may appear to happen in a different order to another core. A core may also reorder stores and loads itself (independently of how the compiler put them into the machine code). This can cause issues when trying to synchronize multiple memory locations between two cores, which is why CPUs allow for stronger memory ordering guarantees when certain instructions or instruction variants are used.
In Odin there are 6 different memory ordering guarantees that can be provided to an atomic operation:
Relaxed: The memory access (load or store) is unordered with respect to
other memory accesses. This can be used to implement an atomic counter.
Multiple threads access a single variable, but it doesn't matter when
exactly it gets incremented, because it will become eventually consistent.
Consume: No loads or stores dependent on a memory location can be
reordered before a load with consume memory order. If other threads released
the same memory, it becomes visible.
Acquire: No loads or stores on a memory location can be reordered before a
load of that memory location with acquire memory ordering. If other threads
release the same memory, it becomes visible.
Release: No loads or stores on a memory location can be reordered after a
store of that memory location with release memory ordering. All threads that
acquire the same memory location will see all writes done by the current
thread.
Acq_Rel: Acquire-release memory ordering: combines acquire and release
memory orderings in the same operation.
Seq_Cst: Sequential consistency. The strongest memory ordering. A load will
always be an acquire operation, a store will always be a release operation,
and in addition to that all threads observe the same order of writes.
Non-explicit atomics will always be sequentially consistent.
Atomic_Memory_Order :: enum {
Relaxed = 0, // Unordered
Consume = 1, // Monotonic
Acquire = 2,
Release = 3,
Acq_Rel = 4,
Seq_Cst = 5,
}
Note(i386, x64): x86 has a very strong memory model by default. It
guarantees that all writes are ordered, and stores and loads aren't reordered. In
a sense, all operations are at least acquire and release operations. If the lock
prefix is used, all operations are sequentially consistent. If you use explicit
atomics, make sure you have the correct atomic memory order, because bugs will likely
not show up on x86, but may show up on e.g. ARM. More on x86 memory
ordering can be found
here
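As an illustration of how the explicit orderings compose (this example is not part of the original documentation), here is a minimal release/acquire handoff between two threads; the variable names and the use of a plain boolean flag are assumptions made for the sketch:
import "base:intrinsics"

shared_value: int
ready:        bool

producer :: proc() {
	shared_value = 42
	// Publish: no earlier write may be reordered after this store.
	intrinsics.atomic_store_explicit(&ready, true, .Release)
}

consumer :: proc() {
	// Spin until the flag becomes visible; the acquire load guarantees the
	// producer's write to shared_value is visible before we read it.
	for !intrinsics.atomic_load_explicit(&ready, .Acquire) {
		intrinsics.cpu_relax()
	}
	assert(shared_value == 42)
}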
objc_selector ¶
objc_selector :: struct {}
Darwin targets only
Represents an Objective-C selector type.
Procedures
alloca ¶
A procedure that allocates size bytes of space in the stack frame of the caller, aligned to align bytes. This temporary space is automatically freed when the procedure that called alloca returns to its caller.
atomic_add ¶
atomic_add :: proc(dst: ^$T, val: T) -> T {…}
Atomically add a value to the value stored in memory.
This procedure loads a value from memory, adds the specified value to it, and stores it back as an atomic operation. This operation is an atomic equivalent of the following:
dst^ += val
The memory ordering of this operation is sequentially-consistent.
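A minimal usage sketch (not from the original documentation; the counter variable is an assumption): several threads can increment a shared counter, and the intrinsic returns the value stored before the addition.
import "base:intrinsics"

counter: int

bump :: proc() -> int {
	// Returns the value the counter held before this increment.
	return intrinsics.atomic_add(&counter, 1)
}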
atomic_add_explicit ¶
atomic_add_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically add a value to the value stored in memory.
This procedure loads a value from memory, adds the specified value to it, and stores it back as an atomic operation. This operation is an atomic equivalent of the following:
dst^ += val
The memory ordering of this operation is as specified by the order parameter.
atomic_and ¶
atomic_and :: proc(dst: ^$T, val: T) -> T {…}
Atomically replace the memory location with the result of an AND operation with the specified value.
This procedure loads a value from memory, computes the bitwise AND of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ &= val
The memory ordering of this operation is sequentially-consistent.
atomic_and_explicit ¶
atomic_and_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically replace the memory location with the result of an AND operation with the specified value.
This procedure loads a value from memory, computes the bitwise AND of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ &= val
The memory ordering of this operation is as specified by the order parameter.
atomic_compare_exchange_strong ¶
atomic_compare_exchange_strong :: proc(dst: ^$T, old, new: T) -> (T, bool) #optional_ok {…}
Atomically compare and exchange the value with a memory location.
This procedure checks if the value pointed to by the dst parameter is equal
to old, and if they are equal, it stores the value new into the memory location,
all done in a single atomic operation. This procedure returns the value that was
stored in the memory location and a boolean value signifying whether that value
was equal to old (i.e. whether the exchange took place).
This procedure is an atomic equivalent of the following operation:
old_dst := dst^
if old_dst == old {
dst^ = new
return old_dst, true
} else {
return old_dst, false
}
The strong version of compare exchange always returns true when the returned
value (the value that was stored in the location pointed to by dst) and the old
parameter are equal.
Atomic compare exchange has two memory orderings: one is for the read-modify-write operation, if the comparison succeeds, and the other is for the load operation, if the comparison fails. The memory ordering for both of these operations is sequentially-consistent.
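A small sketch (the names are assumptions, not from the original documentation) of using the returned pair for a one-time initialization:
import "base:intrinsics"

state: i32 // 0 = uninitialized, 1 = initialized

try_init :: proc() -> bool {
	// Only the thread that observes the value 0 succeeds.
	_, ok := intrinsics.atomic_compare_exchange_strong(&state, 0, 1)
	return ok
}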
atomic_compare_exchange_strong_explicit ¶
atomic_compare_exchange_strong_explicit :: proc(dst: ^$T, old, new: T, success, failure: Atomic_Memory_Order) -> (T, bool) #optional_ok {…}
Atomically compare and exchange the value with a memory location.
This procedure checks if the value pointed to by the dst parameter is equal
to old, and if they are equal, it stores the value new into the memory location,
all done in a single atomic operation. This procedure returns the value that was
stored in the memory location and a boolean value signifying whether that value
was equal to old (i.e. whether the exchange took place).
This procedure is an atomic equivalent of the following operation:
old_dst := dst^
if old_dst == old {
dst^ = new
return old_dst, true
} else {
return old_dst, false
}
The strong version of compare exchange always returns true when the returned
value (the value that was stored in the location pointed to by dst) and the old
parameter are equal.
Atomic compare exchange has two memory orderings: One is for the
read-modify-write operation, if the comparison succeeds, and the other is for
the load operation, if the comparison fails. The memory ordering for these
operations is as specified by success and failure parameters respectively.
atomic_compare_exchange_weak ¶
atomic_compare_exchange_weak :: proc(dst: ^$T, old, new: T) -> (T, bool) #optional_ok {…}
Atomically compare and exchange the value with a memory location.
This procedure checks if the value pointed to by the dst parameter is equal
to old, and if they are equal, it stores the value new into the memory location,
all done in a single atomic operation. This procedure returns the value that was
stored in the memory location and a boolean value signifying whether that value
was equal to old (i.e. whether the exchange took place).
This procedure is an atomic equivalent of the following operation:
old_dst := dst^
if old_dst == old {
// may return false here
dst^ = new
return old_dst, true
} else {
return old_dst, false
}
The weak version of compare exchange may return false, even if dst^ == old.
On some platforms running weak compare exchange in a loop is faster than a
strong version.
Atomic compare exchange has two memory orderings: One is for the read-modify-write operation, if the comparison succeeds, and the other is for the load operation, if the comparison fails. The memory ordering for both of these operations is sequentially-consistent.
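Because the weak variant may fail spuriously, it is normally used in a retry loop. A minimal sketch (the procedure and names are assumptions) of an atomic maximum built on top of it:
import "base:intrinsics"

atomic_max :: proc(dst: ^int, val: int) -> int {
	cur := intrinsics.atomic_load(dst)
	for cur < val {
		old, ok := intrinsics.atomic_compare_exchange_weak(dst, cur, val)
		if ok {
			break
		}
		cur = old // retry with the freshly observed value
	}
	return cur
}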
atomic_compare_exchange_weak_explicit ¶
atomic_compare_exchange_weak_explicit :: proc(dst: ^$T, old, new: T, success, failure: Atomic_Memory_Order) -> (T, bool) #optional_ok {…}
Atomically compare and exchange the value with a memory location.
This procedure checks if the value pointed to by the dst parameter is equal
to old, and if they are equal, it stores the value new into the memory location,
all done in a single atomic operation. This procedure returns the value that was
stored in the memory location and a boolean value signifying whether that value
was equal to old (i.e. whether the exchange took place).
This procedure is an atomic equivalent of the following operation:
old_dst := dst^
if old_dst == old {
// may return false here
dst^ = new
return old_dst, true
} else {
return old_dst, false
}
The weak version of compare exchange may return false, even if dst^ == old.
On some platforms running weak compare exchange in a loop is faster than a
strong version.
Atomic compare exchange has two memory orderings: One is for the
read-modify-write operation, if the comparison succeeds, and the other is for
the load operation, if the comparison fails. The memory ordering for these
operations is as specified by the success and failure parameters
respectively.
atomic_exchange ¶
atomic_exchange :: proc(dst: ^$T, val: T) -> T {…}
Atomically exchange the value in a memory location, with the specified value.
This procedure loads a value from the specified memory location, and stores the specified value into that memory location. Then the loaded value is returned, all done in a single atomic operation. This operation is an atomic equivalent of the following:
tmp := dst^
dst^ = val
return tmp
The memory ordering of this operation is sequentially-consistent.
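A classic use is a test-and-set spinlock; the sketch below (the names are assumptions, not from the original documentation) relies on the non-explicit intrinsics being sequentially consistent:
import "base:intrinsics"

Spinlock :: struct {
	held: bool,
}

lock :: proc(l: ^Spinlock) {
	// Keep swapping `true` in until the previously stored value was `false`.
	for intrinsics.atomic_exchange(&l.held, true) {
		intrinsics.cpu_relax()
	}
}

unlock :: proc(l: ^Spinlock) {
	intrinsics.atomic_store(&l.held, false)
}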
atomic_exchange_explicit ¶
atomic_exchange_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically exchange the value in a memory location, with the specified value.
This procedure loads a value from the specified memory location, and stores the specified value into that memory location. Then the loaded value is returned, all done in a single atomic operation. This operation is an atomic equivalent of the following:
tmp := dst^
dst^ = val
return tmp
The memory ordering of this operation is as specified by the order parameter.
atomic_load ¶
atomic_load :: proc(dst: ^$T) -> T {…}
Atomically load a value from memory.
This procedure loads a value from a memory location in such a way that the received value is not a partial read. The memory ordering of this operation is sequentially-consistent.
atomic_load_explicit ¶
atomic_load_explicit :: proc(dst: ^$T, order: Atomic_Memory_Order) -> T {…}
Atomically load a value from memory with explicit memory ordering.
This procedure loads a value from a memory location in such a way that the
received value is not a partial read. The memory ordering of this operation
is as specified by the order parameter.
atomic_nand ¶
atomic_nand :: proc(dst: ^$T, val: T) -> T {…}
Atomically replace the memory location with the result of a NAND operation with the specified value.
This procedure loads a value from memory, computes the bitwise NAND of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ = ~(dst^ & val)
The memory ordering of this operation is sequentially-consistent.
atomic_nand_explicit ¶
atomic_nand_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically replace the memory location with the result of a NAND operation with the specified value.
This procedure loads a value from memory, computes the bitwise NAND of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ = ~(dst^ & val)
The memory ordering of this operation is as specified by the order parameter.
atomic_or ¶
atomic_or :: proc(dst: ^$T, val: T) -> T {…}
Atomically replace the memory location with the result of an OR operation with the specified value.
This procedure loads a value from memory, computes the bitwise OR of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ |= val
The memory ordering of this operation is sequentially-consistent.
atomic_or_explicit ¶
atomic_or_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically replace the memory location with the result of an OR operation with the specified value.
This procedure loads a value from memory, computes the bitwise OR of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ |= val
The memory ordering of this operation is as specified by the order parameter.
atomic_signal_fence ¶
atomic_signal_fence :: proc(order: Atomic_Memory_Order) {…}
Adds a "fence" to introduce a "happens-before" edges between operations.
Establish memory ordering between a current thread and a signal handler.
This procedure establishes memory ordering between a thread and a signal
handler, that run on the same thread, without an associated atomic operation.
This procedure is equivalent to atomic_thread_fence, except it doesn't
issue any CPU instructions for memory ordering.
atomic_store ¶
atomic_store :: proc(dst: ^$T, val: T) {…}
Atomically store a value into memory.
This procedure stores a value to a memory location in such a way that no other thread can observe a partially-written value. This operation is sequentially-consistent.
atomic_store_explicit ¶
atomic_store_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) {…}
Atomically store a value into memory with explicit memory ordering.
This procedure stores a value to a memory location in such a way that no other
thread can observe a partially-written value. The memory ordering of this operation is
as specified by the order parameter.
atomic_sub ¶
atomic_sub :: proc(dst: ^$T, val: T) -> T {…}
Atomically subtract a value from the value stored in memory.
This procedure loads a value from memory, subtracts the specified value from it, and stores the result back as an atomic operation. This operation is an atomic equivalent of the following:
dst^ -= val
The memory ordering of this operation is sequentially-consistent.
atomic_sub_explicit ¶
atomic_sub_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically subtract a value from the value stored in memory.
This procedure loads a value from memory, subtracts the specified value from it, and stores the result back as an atomic operation. This operation is an atomic equivalent of the following:
dst^ -= val
The memory ordering of this operation is as specified by the order parameter.
atomic_thread_fence ¶
atomic_thread_fence :: proc(order: Atomic_Memory_Order) {…}
Adds a "fence" to introduce a "happens-before" edges between operations.
Establish memory ordering.
This procedure establishes memory ordering, without an associated atomic operation.
atomic_xor ¶
atomic_xor :: proc(dst: ^$T, val: T) -> T {…}
Atomically replace the memory location with the result of an XOR operation with the specified value.
This procedure loads a value from memory, computes the bitwise XOR of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ ~= val
The memory ordering of this operation is sequentially-consistent.
atomic_xor_explicit ¶
atomic_xor_explicit :: proc(dst: ^$T, val: T, order: Atomic_Memory_Order) -> T {…}
Atomically replace the memory location with the result of an XOR operation with the specified value.
This procedure loads a value from memory, computes the bitwise XOR of the loaded value and the specified value, and stores the result back into the same memory location as an atomic operation. This operation is an atomic equivalent of the following:
dst^ ~= val
The memory ordering of this operation is as specified by the order parameter.
byte_swap ¶
byte_swap :: proc(x: $T) -> T where type_is_integer(T) || type_is_float(T) {…}
Reverses the bytes from ascending order to descending order e.g. 0xfe_ed_01_12 -> 0x12_01_ed_fe
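A short usage sketch (the value is an assumption), mirroring the example above:
import "base:intrinsics"

byte_swap_example :: proc() {
	x: u32 = 0xfe_ed_01_12
	assert(intrinsics.byte_swap(x) == 0x12_01_ed_fe)
}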
concatenate ¶
concatenate :: proc(x, y: $T, z: ..T) -> T where type_is_array(T) || type_is_slice(T) {…}
Concatenates two or more constant slice or array values together to form a new one.
Example:
x :: intrinsics.concatenate([]int{1, 2, 3}, []int{4, 5, 6}, {1, 1, -1})
#assert(type_of(x) == []int)
y :: intrinsics.concatenate([3]int{1, 2, 3}, [?]int{4, 5}, [?]int{6, 1, 1, -1})
#assert(type_of(y) == [9]int)
constant_log2 ¶
constant_log2 :: proc($v: $T) -> T where type_is_integer(T) {…}
Returns the log2 value of the given constant integer.
constant_utf16_cstring ¶
Returns a runtime value of a constant UTF-8 string value, encoded as a NUL-terminated UTF-16 string; useful for interfacing with UTF-16 procedures such as the Windows API.
Important Note: This will be deprecated soon as UTF-16 string types and literals are supported natively.
count_leading_zeros ¶
count_leading_zeros :: proc(x: $T) -> T where type_is_integer(T) || type_is_simd_vector(T) {…}
Counts the number of leading unset bits (0s) until a set bit (1) is seen or all bits have been counted.
count_ones ¶
count_ones :: proc(x: $T) -> T where type_is_integer(T) || type_is_simd_vector(T) {…}
Counts the number of set bits (1s).
count_trailing_zeros ¶
count_trailing_zeros :: proc(x: $T) -> T where type_is_integer(T) || type_is_simd_vector(T) {…}
Counts the number of trailing unset bits (0s) until a set bit (1) is seen or all bits have been counted.
count_zeros ¶
count_zeros :: proc(x: $T) -> T where type_is_integer(T) || type_is_simd_vector(T) {…}
Counts the number of unset bits (0s).
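A quick sketch (the value is an assumption) tying the four bit-counting intrinsics together on one value:
import "base:intrinsics"

bit_count_example :: proc() {
	x: u8 = 0b0001_0100
	assert(intrinsics.count_ones(x)           == 2)
	assert(intrinsics.count_zeros(x)          == 6)
	assert(intrinsics.count_leading_zeros(x)  == 3)
	assert(intrinsics.count_trailing_zeros(x) == 2)
}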
cpu_relax ¶
cpu_relax :: proc() {…}
On i386/amd64, it should map to the pause instruction. On arm64, it should map to the isb instruction (see https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8258604 for more information).
debug_trap ¶
debug_trap :: proc() {…}
A call intended to cause an execution trap with the intention of requesting a debugger's attention.
expect ¶
expect :: proc(val, expected_val: $T) -> T {…}
Provides a hint about the expected (most probable) value of val, which can be used by optimizing backends.
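A usage sketch (the procedure and names are assumptions): hinting that an error path is unlikely, so the backend lays out the common path first.
import "base:intrinsics"

process :: proc(err: bool) {
	// Tell the backend we expect err to be false.
	if intrinsics.expect(err, false) {
		// cold path
		return
	}
	// hot path
}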
fixed_point_div ¶
fixed_point_div :: proc(lhs, rhs: $T, #const scale: uint) -> T where type_is_integer(T) {…}
A fixed point number represents a real data type for a number that has a fixed number of digits after a radix point. The number of digits after the radix point is referred to as scale.
fixed_point_div_sat ¶
fixed_point_div_sat :: proc(lhs, rhs: $T, #const scale: uint) -> T where type_is_integer(T) {…}
A fixed point number represents a real data type for a number that has a fixed number of digits after a radix point. The number of digits after the radix point is referred to as scale.
fixed_point_mul ¶
fixed_point_mul :: proc(lhs, rhs: $T, #const scale: uint) -> T where type_is_integer(T) {…}
A fixed point number represents a real data type for a number that has a fixed number of digits after a radix point. The number of digits after the radix point is referred to as scale.
fixed_point_mul_sat ¶
fixed_point_mul_sat :: proc(lhs, rhs: $T, #const scale: uint) -> T where type_is_integer(T) {…}
A fixed point number represents a real data type for a number that has a fixed number of digits after a radix point. The number of digits after the radix point is referred to as scale.
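For instance, with scale = 16 the operands are interpreted as Q16.16 values. A minimal sketch (the values are assumptions, not from the original documentation):
import "base:intrinsics"

fixed_point_example :: proc() {
	// 1.5 and 2.25 in Q16.16 representation.
	a: i32 = 3 << 15 // 1.5  * 2^16
	b: i32 = 9 << 14 // 2.25 * 2^16
	// The multiplication is performed at double width internally and then
	// shifted right by scale, so the result is 3.375 * 2^16.
	p := intrinsics.fixed_point_mul(a, b, 16)
	assert(p == 221184) // 3.375 * 65536
}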
fused_mul_add ¶
fused_mul_add :: proc(a, b, c: $T) -> T where type_is_float(T) || (type_is_simd_vector(T) && type_is_float(type_elem_type(T))) {…}
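Computes a*b + c as a fused multiply-add, i.e. with a single rounding step where the target supports it. A minimal sketch (the values are assumptions):
import "base:intrinsics"

fma_example :: proc() {
	a, b, c: f64 = 2.0, 3.0, 1.0
	r := intrinsics.fused_mul_add(a, b, c) // 2*3 + 1
	assert(r == 7.0)
}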
has_target_feature ¶
has_target_feature :: proc($test: $T) -> bool where type_is_string(T) || type_is_proc(T) {…}
Checks if the current target supports the given target features.
Takes either a constant comma-separated string (e.g. "sha512,sse4.1") or a procedure type that has @(require_target_feature) or @(enable_target_feature) applied to it, and returns a boolean indicating whether all listed features are supported.
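A small sketch (the feature string and names are assumptions) of selecting a code path based on the compilation target's features:
import "base:intrinsics"

choose_path :: proc() {
	if intrinsics.has_target_feature("sse4.1") {
		// fast path relying on SSE4.1
	} else {
		// portable fallback
	}
}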
mem_copy ¶
Copies a block of memory from the src location to the dst location, assuming that the memory ranges may overlap. It is equivalent to C's memmove, but unlike the libc procedure, it does not return a value.
mem_copy_non_overlapping ¶
Copies a block of memory from the src location to the dst location, assuming that the memory ranges do not overlap. It is equivalent to C's memcpy, but unlike the libc procedure, it does not return a value.
mem_zero ¶
Zeroes a block of memory at the ptr location for len bytes.
mem_zero_volatile ¶
Zeroes a block of memory at the ptr location for len bytes with volatile semantics.
non_temporal_load ¶
non_temporal_load :: proc(dst: ^$T) -> T {…}
Tells the code generator of the compiler that this operation is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth (e.g. on x86, movnt instructions might be used).
non_temporal_store ¶
non_temporal_store :: proc(dst: ^$T, val: T) {…}
Tells the code generator of the compiler that this operation is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth (e.g. on x86, movnt instructions might be used).
objc_block ¶
objc_block :: proc(invoke: $T, ..any) -> ^Objc_Block(T) where type_is_proc(T) {…}
Darwin targets only
objc_find_class ¶
objc_find_class :: proc($name: string) -> objc_Class {…}
Darwin targets only. Will return a run-time cached class value for the given constant string value.
objc_find_selector ¶
Darwin targets only. Will return a run-time cached selector value for the given constant string value.
objc_register_class ¶
objc_register_class :: proc($name: string) -> objc_Class {…}
Darwin targets only. Will register a class value at run-time for the given constant string value.
objc_register_selector ¶
Darwin targets only. Will register a selector value at run-time for the given constant string value.
objc_super ¶
objc_super :: proc(obj: ^$T) -> ^$U where type_is_subtype_of(T, objc_object), type_is_subtype_of(U, objc_object) {…}
Darwin targets only
overflow_add ¶
overflow_add :: proc(lhs, rhs: $T) -> (T, bool) where type_is_integer(T) #optional_ok {…}
Performs an "add" operation with an overflow check. The second return value will be true if an overflow occurs.
overflow_mul ¶
overflow_mul :: proc(lhs, rhs: $T) -> (T, bool) where type_is_integer(T) #optional_ok {…}
Performs a "multiply" operation with an overflow check. The second return value will be true if an overflow occurs.
overflow_sub ¶
overflow_sub :: proc(lhs, rhs: $T) -> (T, bool) where type_is_integer(T) #optional_ok {…}
Performs a "subtract" operation with an overflow check. The second return value will be true if an overflow occurs.
prefetch_read_data ¶
The prefetch_* intrinsics are a hint to the code generator to insert a prefetch instruction if supported; otherwise, it is a no-op. Prefetches have no effect on the behaviour of the program but can change its performance characteristics.
The locality parameter must be a constant integer, and its temporal locality value ranges from 0 (no locality) to 3 (extremely local, keep in cache).
prefetch_read_instruction ¶
The prefetch_* intrinsics are a hint to the code generator to insert a prefetch instruction if supported; otherwise, it is a no-op. Prefetches have no effect on the behaviour of the program but can change its performance characteristics.
The locality parameter must be a constant integer, and its temporal locality value ranges from 0 (no locality) to 3 (extremely local, keep in cache).
prefetch_write_data ¶
The prefetch_* intrinsics are a hint to the code generator to insert a prefetch instruction if supported; otherwise, it is a no-op. Prefetches have no effect on the behaviour of the program but can change its performance characteristics.
The locality parameter must be a constant integer, and its temporal locality value ranges from 0 (no locality) to 3 (extremely local, keep in cache).
prefetch_write_instruction ¶
The prefetch_* intrinsics are a hint to the code generator to insert a prefetch instruction if supported; otherwise, it is a no-op. Prefetches have no effect on the behaviour of the program but can change its performance characteristics.
The locality parameter must be a constant integer, and its temporal locality value ranges from 0 (no locality) to 3 (extremely local, keep in cache).
procedure_of ¶
procedure_of :: proc(x: $T) -> T where type_is_proc(T) {…}
Returns the value of the procedure where x must be a call expression.
ptr_offset ¶
ptr_offset :: proc(ptr: ^$T, offset: int) -> ^T {…}
Prefer using [^]T operations if possible. e.g. ptr[offset:]
ptr_sub ¶
ptr_sub :: proc(a, b: ^$T) -> int {…}
Equivalent to int(uintptr(a) - uintptr(b)) / size_of(T)
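A small sketch (the data is an assumption) showing the relationship between ptr_offset and ptr_sub:
import "base:intrinsics"

ptr_arith_example :: proc() {
	xs := [4]int{10, 20, 30, 40}
	p  := &xs[0]
	q  := intrinsics.ptr_offset(p, 3) // points at xs[3]
	assert(q^ == 40)
	assert(intrinsics.ptr_sub(q, p) == 3)
}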
read_cycle_counter ¶
read_cycle_counter :: proc() -> i64 {…}
This provides access to the cycle counter register (or similar low latency, high accuracy clocks) on the targets that support it. On i386/amd64, it should map to the rdtsc instruction. On arm64, it should map to the cntvct_el0 instruction.
read_cycle_counter_frequency ¶
read_cycle_counter_frequency :: proc() -> i64 {…}
This provides access to the frequency that the cycle counter register (or similar low latency, high accuracy clocks) uses on the targets that support it.
reverse_bits ¶
reverse_bits :: proc(x: $T) -> T where type_is_integer(T) || type_is_simd_vector(T) {…}
Reverses the bits from ascending order to descending order e.g. 0b01110101 -> 0b10101110
saturating_add ¶
saturating_add :: proc(lhs, rhs: $T) -> T where type_is_integer(T) {…}
Performs a saturating "add" operation, where the return value is clamped between min(T) and max(T).
saturating_sub ¶
saturating_sub :: proc(lhs, rhs: $T) -> T where type_is_integer(T) {…}
Performs a saturating "subtract" operation, where the return value is clamped between min(T) and max(T).
simd_abs ¶
simd_abs :: proc(a: #simd[N]T) -> #simd[N]T {…}
Absolute value of a SIMD vector.
This procedure returns a vector where each lane has the absolute value of the
corresponding lane in the vector a.
Inputs:a: An integer or a float vector to take the absolute value of
Returns:
The absolute value of a vector.
Operation:
for i in 0 ..< len(res) {
switch {
case a[i] < 0: res[i] = -a[i]
case a[i] > 0: res[i] = a[i]
case a[i] == 0: res[i] = 0
}
}
return res
Example:
+------+------+------+------+
a: | 0 | -1 | 2 | -3 |
+------+------+------+------+
res:
+------+------+------+------+
| 0 | 1 | 2 | 3 |
+------+------+------+------+
simd_add ¶
simd_add :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Add SIMD vectors.
This procedure returns a vector, where each lane holds the sum of the
corresponding a and b vectors' lanes.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector that is the sum of two input vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] + b[i]
}
return res
Example:
+-----+-----+-----+-----+
a: | 0 | 1 | 2 | 3 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | 1 | 2 | -1 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 0 | 2 | 4 | 2 |
+-----+-----+-----+-----+
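A runnable sketch of the diagram above, using the intrinsic directly (the lane values mirror the diagram):
import "base:intrinsics"
import "core:fmt"

simd_add_example :: proc() {
	a := #simd [4]i32 { 0, 1, 2, 3 }
	b := #simd [4]i32 { 0, 1, 2, -1 }
	fmt.println(intrinsics.simd_add(a, b)) // <0, 2, 4, 2>
}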
simd_bit_and ¶
simd_bit_and :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Bitwise AND of vectors.
This procedure returns a vector, such that each lane has the result of a bitwise
AND operation between the corresponding lanes of the vectors a and b.
Inputs:a: An integer or a boolean vector.
b: An integer or a boolean vector.
Returns:
A vector that is the result of the bitwise AND operation between two vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] & b[i]
}
return res
Example:
+------+------+------+------+
a: | 0x11 | 0x33 | 0x55 | 0xaa |
+------+------+------+------+
+------+------+------+------+
b: | 0xff | 0xf0 | 0x0f | 0x00 |
+------+------+------+------+
res:
+------+------+------+------+
| 0x11 | 0x30 | 0x05 | 0x00 |
+------+------+------+------+
simd_bit_and_not ¶
simd_bit_and_not :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Bitwise AND NOT of vectors.
This procedure returns a vector, such that each lane has the result of a bitwise
AND NOT operation between the corresponding lanes of the vectors a and b.
Inputs:a: An integer or a boolean vector.
b: An integer or a boolean vector.
Returns:
A vector that is the result of the bitwise AND NOT operation between two vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] &~ b[i]
}
return res
Example:
+------+------+------+------+
a: | 0x11 | 0x33 | 0x55 | 0xaa |
+------+------+------+------+
+------+------+------+------+
b: | 0xff | 0xf0 | 0x0f | 0x00 |
+------+------+------+------+
res:
+------+------+------+------+
| 0x00 | 0x03 | 0x50 | 0xaa |
+------+------+------+------+
simd_bit_or ¶
simd_bit_or :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Bitwise OR of vectors.
This procedure returns a vector, such that each lane has the result of a bitwise
OR operation between the corresponding lanes of the vectors a and b.
Inputs:a: An integer or a boolean vector.
b: An integer or a boolean vector.
Returns:
A vector that is the result of the bitwise OR operation between two vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] | b[i]
}
return res
Example:
+------+------+------+------+
a: | 0x11 | 0x33 | 0x55 | 0xaa |
+------+------+------+------+
+------+------+------+------+
b: | 0xff | 0xf0 | 0x0f | 0x00 |
+------+------+------+------+
res:
+------+------+------+------+
| 0xff | 0xf3 | 0x5f | 0xaa |
+------+------+------+------+
simd_bit_xor ¶
simd_bit_xor :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Bitwise XOR of vectors.
This procedure returns a vector, such that each lane has the result of a bitwise
XOR operation between the corresponding lanes of the vectors a and b.
Inputs:a: An integer or a boolean vector.
b: An integer or a boolean vector.
Returns:
A vector that is the result of the bitwise XOR operation between two vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] ~ b[i]
}
return res
Example:
+------+------+------+------+
a: | 0x11 | 0x33 | 0x55 | 0xaa |
+------+------+------+------+
+------+------+------+------+
b: | 0xff | 0xf0 | 0x0f | 0x00 |
+------+------+------+------+
res:
+------+------+------+------+
| 0xee | 0xc3 | 0x5a | 0xaa |
+------+------+------+------+
simd_ceil ¶
simd_ceil :: proc(a: #simd[N]any_float) -> #simd[N]any_float {…}
lane-wise ceil
Ceil each lane in a SIMD vector.
simd_clamp ¶
Clamp lanes of vector.
This procedure returns a vector, where each lane is the result of the
clamping of the lane from the vector v between the values in the corresponding
lanes of vectors min and max.
Inputs:v: An integer or a float vector with values to be clamped.
min: An integer or a float vector with minimum bounds.
max: An integer or a float vector with maximum bounds.
Returns:
A vector containing clamped values in each lane.
Operation:
for i in 0 ..< len(res) {
val := v[i]
switch {
case val < min[i]: val = min[i]
case val > max[i]: val = max[i]
}
res[i] = val
}
return res
Example:
+-------+-------+-------+-------+
v: | -1 | 0.3 | 1.2 | 1 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
min: | 0 | 0 | 0 | 0 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
max: | 1 | 1 | 1 | 1 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0 | 0.3 | 1 | 1 |
+-------+-------+-------+-------+
simd_div ¶
simd_div :: proc(a, b: #simd[N]T) -> #simd[N]T where type_is_float(T) {…}
Divide SIMD vectors.
This procedure returns a vector, where each lane holds the quotient (result
of division) between the corresponding lanes of the vectors a and b. Each
lane of the vector a is divided by the corresponding lane of the vector b.
This operation performs a standard floating-point division for each lane.
Inputs:a: A float vector.
b: A float vector to divide by.
Returns:
A vector that is the quotient of two vectors, a / b.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] / b[i]
}
return res
Example:
+-----+-----+-----+-----+
a: | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | -1 | 2 | -3 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+------+
| +∞ | -2 | 1 | -2/3 |
+-----+-----+-----+------+
simd_extract ¶
simd_extract :: proc(a: #simd[N]T, idx: uint) -> T {…}
Extracts a single scalar element from a #simd vector at a specified index.
Extract scalar from a vector's lane.
This procedure returns the scalar from the lane at the specified index of the vector.
Inputs:a: The vector to extract from.
idx: The lane index.
Returns:
The value of the lane at the specified index.
Operation:
return a[idx]
simd_extract_lsbs ¶
simd_extract_lsbs :: proc(a: #simd[N]T) -> bit_set[0..<N] where type_is_integer(T) || type_is_boolean(T) {…}
Extracts the least significant bit of each element of the given vector into a bit_set.
Extract the set of least-significant bits of a SIMD vector.
This procedure checks the least-significant bit (LSB) of each lane of the vector
and returns the set of lanes whose least-significant bit is set. This procedure
can be used in conjunction with lanes_eq (and other similar procedures) to
count the number of matched lanes by computing the cardinality of the resulting
set.
Inputs:a: An input vector.
Result: A bitset of integers, corresponding to the indexes of the lanes, whose LSBs are set.
Operation:
res = {}
for i in 0 ..< len(a) {
if a[i] & 1 != 0 {
res |= {i}
}
}
return res
Example:
// Since lanes 0, 2, 4, 6 contain odd integers, the least significant bits
// for these lanes are set.
import "core:fmt"
import "core:simd"
simd_extract_lsbs_example :: proc() {
v := #simd [8]i32 { -1, -2, +3, +4, -5, +6, +7, -8 }
fmt.println(simd.extract_lsbs(v))
}
bit_set[0..=7]{0, 2, 4, 6}
simd_extract_msbs ¶
simd_extract_msbs :: proc(a: #simd[N]T) -> bit_set[0..<N] where type_is_integer(T) || type_is_boolean(T) {…}
Extracts the most significant bit of each element of the given vector into a bit_set.
Extract the set of most-significant bits of a SIMD vector.
This procedure checks the most-significant bit (MSB) of each lane of the vector
and returns the set of lanes whose most-significant bit is set. This procedure
can be used in conjunction with lanes_eq (and other similar procedures) to
count the number of matched lanes by computing the cardinality of the resulting
set.
Inputs:a: An input vector.
Result: A bitset of integers, corresponding to the indexes of the lanes, whose MSBs are set.
Operation:
bits_per_lane = 8*size_of(a[0])
res = {}
for i in 0 ..< len(a) {
if a[i] & 1<<(bits_per_lane-1) != 0 {
res |= {i}
}
}
return res
Example:
// Since lanes 0, 1, 4, 7 contain negative numbers, the most significant
// bits for them will be set.
import "core:fmt"
import "core:simd"
simd_extract_msbs_example :: proc() {
v := #simd [8]i32 { -1, -2, +3, +4, -5, +6, +7, -8 }
fmt.println(simd.extract_msbs(v))
}
bit_set[0..=7]{0, 1, 4, 7}
simd_floor ¶
simd_floor :: proc(a: #simd[N]any_float) -> #simd[N]any_float {…}
lane-wise floor
Floor each lane in a SIMD vector.
simd_gather ¶
simd_gather :: proc(ptr: #simd[N]rawptr, val: #simd[N]T, mask: #simd[N]U) -> #simd[N]T where type_is_integer(U) || type_is_boolean(U) {…}
Perform a gather load into a vector.
A gather operation is a memory load operation that loads values from a vector of addresses into a single value vector. This can be used to achieve the following results:
- Accessing every N'th element of an array (strided access).
- Accessing elements according to some computed offsets (indexed access).
- Accessing elements in a different order (shuffling access).
It is typically used alongside other SIMD procedures that compute the offsets
for the ptr and mask parameters.
Inputs:ptr: A vector of memory locations. Each pointer points to a single value,
of a SIMD vector's lane type that will be loaded into the vector. Pointer in this vector can be `nil` or any other invalid value, if the corresponding value in the `mask` parameter is zero.
val: A vector of values that will be used at corresponding positions
of the result vector, if the corresponding memory location has been masked out.
mask: A vector of booleans or unsigned integers that determines which memory
locations to read from. If the value at an index has the value true (lowest bit set), the value at that index will be loaded into the result vector from the corresponding memory location in the `ptr` vector. Otherwise the value will be loaded from the `val` vector.
Returns:
A vector with all values from unmasked indices
loaded from the pointer vector ptr, and all values from masked indices loaded
from the value vector val.
Operation:
for i in 0 ..< len(res) {
if mask[i]&1 == 1 {
res[i] = ptr[i]^
} else {
res[i] = val[i]
}
}
return res
Example:
// Example below loads 2 lanes of values from 2 lanes of float vectors, `v1` and
// `v2`. From each of these vectors we're loading the second value, into the first
// and the third position of the result vector.
// Therefore the `ptrs` argument is initialized such that the first and the third
// value are the addresses of the values that we want to load into the result
// vector, and we'll fill in `nil` for the rest of them. To prevent CPU from
// dereferencing those `nil` addresses we provide the mask that only allows us
// to load valid positions of the `ptrs` array, and the array of defaults which
// will have `127` in each position as the default value.
import "core:fmt"
import "core:simd"
simd_gather_example :: proc() {
v1 := [4] f32 {1, 2, 3, 4};
v2 := [4] f32 {9, 10,11,12};
ptrs := #simd [4]rawptr { &v1[1], nil, &v2[1], nil }
mask := #simd [4]bool { true, false, true, false }
defaults := #simd [4]f32 { 0x7f, 0x7f, 0x7f, 0x7f }
res := simd.gather(ptrs, defaults, mask)
fmt.println(res)
}
<2, 127, 10, 127>
The first and the third positions came from the ptrs array, and the other
2 lanes came from the defaults vector. The graphic below shows how the values of
the result are decided based on the mask:
+-------------------------------+
mask: | 1 | 0 | 1 | 0 |
+-------------------------------+
| | | `----------------------------.
| | | |
| `---- | ------------------------. |
v v v v
+-------------------------------+ +-------------------+
ptrs: | &m0 | nil | &m2 | nil | vals: | d0 | d1 | d2 | d3 |
+-------------------------------+ +-------------------+
| | | |
| .--- | -------------------------' |
| | | ,-------------------------'
v v v v
+-------------------------------+
result: | m0 | d1 | m2 | d3 |
+-------------------------------+
simd_lanes_eq ¶
simd_lanes_eq :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of vectors are equal.
This procedure checks each pair of lanes from vectors a and b for whether
they are equal, and if they are, the corresponding lane of the result vector
will have a value with all bits set (0xff..ff). Otherwise the lane of the
result vector will have the value 0.
Inputs:a: An integer, a float or a boolean vector.
b: An integer, a float or a boolean vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] == b[i] {
res[i] = max(T)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0xff | 0x00 | 0xff | 0x00 |
+-------+-------+-------+-------+
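A sketch (the values are assumptions) of combining the comparison with simd_extract_msbs to count matching lanes, as suggested in the simd_extract_* descriptions:
import "base:intrinsics"
import "core:fmt"

count_equal_lanes :: proc() {
	a := #simd [4]i32 { 0, 1, 2, 3 }
	b := #simd [4]i32 { 0, 2, 2, 2 }
	eq := intrinsics.simd_lanes_eq(a, b)     // all bits set in lanes 0 and 2
	hits := intrinsics.simd_extract_msbs(eq) // bit_set containing 0 and 2
	fmt.println(card(hits))                  // 2
}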
simd_lanes_ge ¶
simd_lanes_ge :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of a vector are greater than or equal to the lanes of another SIMD vector.
This procedure checks each pair of lanes from vectors a and b for whether the
lane of a is greater than or equal to the lane of b, and if so, the
corresponding lane of the result vector will have a value with all bits set
(0xff..ff). Otherwise the lane of the result vector will have the value 0.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] >= b[i] {
res[i] = unsigned(-1)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0xff | 0x00 | 0xff | 0xff |
+-------+-------+-------+-------+
simd_lanes_gt ¶
simd_lanes_gt :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of a vector are greater than the lanes of another vector.
This procedure checks each pair of lanes from vectors a and b for whether the
lane of a is greater than the lane of b, and if so, the corresponding
lane of the result vector will have a value with all bits set (0xff..ff).
Otherwise the lane of the result vector will have the value 0.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] > b[i] {
res[i] = unsigned(-1)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0x00 | 0x00 | 0x00 | 0xff |
+-------+-------+-------+-------+
simd_lanes_le ¶
simd_lanes_le :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of a vector are less than or equal to the lanes of another SIMD vector.
This procedure checks each pair of lanes from vectors a and b for whether the
lane of a is less than or equal to the lane of b, and if so, the
corresponding lane of the result vector will have a value with all bits set
(0xff..ff). Otherwise the lane of the result vector will have the value 0.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] <= b[i] {
res[i] = unsigned(-1)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0xff | 0xff | 0xff | 0x00 |
+-------+-------+-------+-------+
simd_lanes_lt ¶
simd_lanes_lt :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of a vector are less than another.
This procedure checks each pair of lanes from vectors a and b for whether
the lane of a is less than the lane of b, and if so, the corresponding lane
of the result vector will have a value with all bits set (0xff..ff). Otherwise
the lane of the result vector will have the value 0.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] < b[i] {
res[i] = unsigned(-1)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
r: | 0x00 | 0xff | 0x00 | 0x00 |
+-------+-------+-------+-------+
simd_lanes_ne ¶
simd_lanes_ne :: proc(a, b: #simd[N]T) -> #simd[N]Integer {…}
Return an unsigned integer of the same size as the input type, NOT A BOOLEAN. element-wise: false => 0x00...00, true => 0xff...ff
Check if lanes of vectors are not equal.
This procedure checks each pair of lanes from vectors a and b for whether
they are not equal, and if they are, the corresponding lane of the result
vector will have a value with all bits set (0xff..ff). Otherwise the lane of
the result vector will have the value 0.
Inputs:a: An integer, a float or a boolean vector.
b: An integer, a float or a boolean vector.
Returns:
A vector of unsigned integers of the same size as the input vector's lanes,
containing the comparison results for each lane.
Operation:
for i in 0 ..< len(res) {
if a[i] != b[i] {
res[i] = unsigned(-1)
} else {
res[i] = 0
}
}
return res
Example:
+-------+-------+-------+-------+
a: | 0 | 1 | 2 | 3 |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 0 | 2 | 2 | 2 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+-------+
| 0x00 | 0xff | 0x00 | 0xff |
+-------+-------+-------+-------+
simd_lanes_reverse ¶
simd_lanes_reverse :: proc(a: #simd[N]T) -> #simd[N]T {…}
equivalent to a swizzle with descending indices, e.g. swizzle(a, 3, 2, 1, 0)
Reverse the lanes of a SIMD vector.
This procedure reverses the lanes of a vector, putting the last lane in the first spot, etc. This procedure is equivalent to the following call (for 4-element vectors):
swizzle(a, 3, 2, 1, 0)
simd_lanes_rotate_left ¶
simd_lanes_rotate_left :: proc(a: #simd[N]T, $offset: int) -> #simd[N]T {…}
Rotate the lanes of a SIMD vector left.
This procedure rotates the lanes of a vector, putting the first lane in the last spot, the second lane in the first spot, the third lane in the second spot, etc. For 4-element vectors, this procedure is equivalent to the following:
swizzle(a, 1, 2, 3, 0)
simd_lanes_rotate_right ¶
simd_lanes_rotate_right :: proc(a: #simd[N]T, $offset: int) -> #simd[N]T {…}
Rotate the lanes of a SIMD vector right.
This procedure rotates the lanes of a SIMD vector, putting the first lane in the second spot, the second lane in the third spot, etc. For 4-element vectors, this procedure is equivalent to the following:
swizzle(a, 3, 0, 1, 2)
simd_masked_compress_store ¶
simd_masked_compress_store :: proc(ptr: rawptr, val: #simd[N]T, mask: #simd[N]U) where type_is_integer(U) || type_is_boolean(U) {…}
Store masked values to consecutive memory locations.
This procedure stores values from masked lanes of a vector val consecutively
into memory. This operation is the opposite of masked_expand_load. The number
of items stored into memory is the number of set bits in the mask. If the value
in a lane of a mask is true, that lane is stored into memory. Otherwise
nothing is stored.
Inputs:ptr: The pointer to the memory of a store.
val: The vector to store into memory.
mask: The mask that selects which values to store into memory.
Operation:
mem_idx := 0
for i in 0 ..< len(mask) {
if mask[i]&1 == 1 {
ptr[mem_idx] = val[i]
mem_idx += 1
}
}
Example:
// The code below fills the vector `v` with two values from a 4-element SIMD
// vector, the first and the third value. The items in the mask are set to `true`
// in those lanes.
import "core:fmt"
import "core:simd"
simd_masked_compress_store_example :: proc() {
v := [2] f64 { };
mask := #simd [4]bool { true, false, true, false }
vals := #simd [4]f64 { 1, 2, 3, 4 }
simd.masked_compress_store(&v, vals, mask)
fmt.println(v)
}
[1, 3]
Graphical representation of the operation:
+-------------------+
mask: | 1 | 0 | 1 | 0 |
+-------------------+
| |
v v
+-------------------+
vals: | v0 | v1 | v2 | v3 |
+-------------------+
| ,--'
ptr v v
+--->+-----------------
| v0 | v2 | ...
+-----------------
simd_masked_expand_load ¶
simd_masked_expand_load :: proc(ptr: rawptr, val: #simd[N]T, mask: #simd[N]U) -> #simd[N]T where type_is_integer(U) || type_is_boolean(U) {…}
Load consecutive scalar values and expand into a vector.
This procedure loads a number of consecutive scalar values from an address,
specified by the ptr parameter, and stores them in a result vector, according
to the mask. The number of values read from memory is the number of set bits
in the mask. The lanes for which the mask has the value true get the next
consecutive value from memory, otherwise if the mask is false for the
lane, its value is filled from the corresponding lane of the val parameter.
This procedure acts like masked_load, except the values from memory are
read consecutively, and not according to the lanes. The memory values are read
and assigned to the result vector's masked lanes in order of increasing
addresses.
Inputs:ptr: The pointer to the memory to read from.
vals: The default values for masked-off entries.
mask: The mask that determines which lanes get consecutive memory values.
Returns:
The result vector, holding memory values in the masked lanes and default values in the unmasked lanes.
Operation:
mem_idx := 0
for i in 0 ..< len(mask) {
if mask[i]&1 == 1 {
res[i] = ptr[mem_idx]
mem_idx += 1
} else {
res[i] = val[i]
}
}
return res
Example:
// The example below loads two values from memory of the vector `v`. Two values in
// the mask are set to `true`, meaning only two memory items will be loaded into
// the result vector. The mask is set to `true` in the first and the third
// position, which specifies that the first memory item will be read into the
// first lane of the result vector, and the second memory item will be read into
// the third lane of the result vector. All the other lanes of the result vector
// will be initialized to the default value `127`.
import "core:fmt"
import "core:simd"
simd_masked_expand_load_example :: proc() {
v := [2] f64 {1, 2};
mask := #simd [4]bool { true, false, true, false }
vals := #simd [4]f64 { 0x7f, 0x7f, 0x7f, 0x7f }
res := simd.masked_expand_load(&v, vals, mask)
fmt.println(res)
}
<1, 127, 2, 127>
Graphical representation of the operation:
ptr --->+-----------+-----
| m0 | m1 | ...
+-----------+-----
| `--.
v v
+-------------------+ +-------------------+
mask: | 1 | 0 | 1 | 0 | vals: | v0 | v1 | v2 | v3 |
+-------------------+ +-------------------+
| | | |
| .-- | -----------------------' |
| | | ,----------------------------'
v v v v
+-------------------+
result: | m0 | v1 | m1 | v3 |
+-------------------+
simd_masked_load ¶
simd_masked_load :: proc(ptr: rawptr, val: #simd[N]T, mask: #simd[N]U) -> #simd[N]T where type_is_integer(U) || type_is_boolean(U) {…}
Perform a masked load into the vector.
This procedure performs a masked load from memory, into the vector. The ptr
argument specifies the base address from which the values of the vector
will be loaded. The mask selects the source for the result vector's lanes. If
the mask for the corresponding lane has the value true (lowest bit set), the
result lane is loaded from memory. Otherwise the result lane is loaded from the
corresponding lane of the val vector.
Inputs:ptr: The address of the vector values to load. Masked-off values are not
accessed.
val: The vector of values that will be loaded into the masked slots of the
result vector.
mask: The mask that selects where to load the values from.
Returns:
The loaded vector. The lanes for which the mask was set are loaded from
memory, and the other lanes are loaded from the val vector.
Operation:
for i in 0 ..< len(res) {
if mask[i]&1 == 1 {
res[i] = ptr[i]
} else {
res[i] = val[i]
}
}
return res
Example:
// The following code loads two values from the `src` vector, the first and the
// third value (selected by the mask). The masked-off values are given the value
// of 127 (`0x7f`).
import "core:fmt"
import "core:simd"
simd_masked_load_example :: proc() {
src := [4] f32 {1, 2, 3, 4};
mask := #simd [4]bool { true, false, true, false }
vals := #simd [4]f32 { 0x7f, 0x7f, 0x7f, 0x7f }
res := simd.masked_load(&src, vals, mask)
fmt.println(res)
}
<1, 127, 3, 127>
The graphic below demonstrates the flow of lanes.
+-------------------------------+
mask: | 1 | 0 | 1 | 0 |
+-------------------------------+
| | | `----------------------------.
| | | |
| `---- | ------------------------. |
ptr v v v v
+---->+-------------------------------+ +-------------------+
| v1 | v2 | v3 | v4 | vals: | d0 | d1 | d2 | d3 |
+-------------------------------+ +-------------------+
| | | |
| .--- | -------------------------' |
| | | ,-------------------------'
v v v v
+-------------------------------+
result: | v1 | d1 | v3 | d3 |
+-------------------------------+
simd_masked_store ¶
simd_masked_store :: proc(ptr: rawptr, val: #simd[N]T, mask: #simd[N]U) where type_is_integer(U) || type_is_boolean(U) {…}
Perform a masked store to memory.
This procedure performs a masked store from a vector val, into memory at
address ptr, with the mask deciding which lanes are going to be stored,
and which aren't. If the mask at a corresponding lane has the value true
(lowest bit set), the lane is stored into memory. Otherwise the lane is not
stored into memory.
Inputs:ptr: The base address of the store.
val: The vector to store.
mask: The mask, selecting which lanes of the vector to store into memory.
Operation:
for i in 0 ..< len(val) {
if mask[i]&1 == 1 {
ptr[i] = val[i]
}
}
Example:
// The example below stores the value 127 into the first and the third slot of the
// array `v`.
import "core:fmt"
import "core:simd"
simd_masked_store_example :: proc() {
v := [4] f32 {1, 2, 3, 4};
mask := #simd [4]bool { true, false, true, false }
vals := #simd [4]f32 { 0x7f, 0x7f, 0x7f, 0x7f }
simd.masked_store(&v, vals, mask)
fmt.println(v)
}
[127, 2, 127, 4]
The graphic below shows the flow of lanes:
+-------------------+
mask: | 1 | 0 | 1 | 0 |
+-------------------+
| | | |
v X v X
+-------------------+
vals: | v0 | v1 | v2 | v3 |
+-------------------+
| \
ptr v v
+--->+-----------------------+
| v0 | ... | v2 | ... |
+-----------------------+
simd_max ¶
simd_max :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Maximum of each lane of vectors.
This procedure returns a vector, such that each lane has the maximum value
between the corresponding lanes in vectors a and b.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector containing the maximum values from the corresponding lanes of a and b.
Operation:
for i in 0 ..< len(res) {
if a[i] > b[i] {
res[i] = a[i]
} else {
res[i] = b[i]
}
}
return res
Example:
+-----+-----+-----+-----+
a: | 0 | 1 | 2 | 3 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | 2 | 1 | -1 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 0 | 2 | 2 | 3 |
+-----+-----+-----+-----+
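The same operation expressed as a minimal, runnable sketch (the procedure name is illustrative, not part of the package):
import "base:intrinsics"
import "core:fmt"
simd_max_example :: proc() {
    a := #simd [4]i32 { 0, 1, 2,  3 }
    b := #simd [4]i32 { 0, 2, 1, -1 }
    fmt.println(intrinsics.simd_max(a, b)) // <0, 2, 2, 3>
}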
simd_min ¶
simd_min :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Minimum of each lane of vectors.
This procedure returns a vector, such that each lane has the minimum value
between the corresponding lanes in vectors a and b.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector containing the minimum values from the corresponding lanes of a and b.
Operation:
for i in 0 ..< len(res) {
if a[i] < b[i] {
res[i] = a[i]
} else {
res[i] = b[i]
}
}
return res
Example:
+-----+-----+-----+-----+
a: | 0 | 1 | 2 | 3 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | 2 | 1 | -1 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 0 | 1 | 1 | -1 |
+-----+-----+-----+-----+
simd_mul ¶
simd_mul :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Multiply (component-wise) SIMD vectors.
This procedure returns a vector, where each lane holds the product of the
corresponding lanes of the vectors a and b.
Inputs:a: An integer or a float vector.
b: An integer or a float vector.
Returns:
A vector that is the product of two vectors.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] * b[i]
}
return res
Example:
+-----+-----+-----+-----+
a: | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | -1 | 2 | -3 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 0 | -2 | 4 | -6 |
+-----+-----+-----+-----+
simd_nearest ¶
simd_nearest :: proc(a: #simd[N]any_float) -> #simd[N]any_float {…}
rounding to the nearest integral value; if two values are equally near, rounds to the even one
Compute the nearest integer of each lane in a SIMD vector.
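A minimal usage sketch (the procedure name is illustrative); ties round to the nearest even value:
import "base:intrinsics"
import "core:fmt"
simd_nearest_example :: proc() {
    a := #simd [4]f32 { 1.25, 1.5, 2.5, -1.5 }
    fmt.println(intrinsics.simd_nearest(a)) // lanes become 1, 2, 2, -2
}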
simd_neg ¶
simd_neg :: proc(a: #simd[N]T) -> #simd[N]T {…}
Negation of a SIMD vector.
This procedure returns a vector where each lane is the negation of the
corresponding lane in the vector a.
Inputs:a: An integer or a float vector to negate.
Returns:
The negated version of the vector a.
Operation:
for i in 0 ..< len(res) {
res[i] = -a[i]
}
return res
Example:
+------+------+------+------+
a: | 0 | 1 | 2 | 3 |
+------+------+------+------+
res:
+------+------+------+------+
| 0 | -1 | -2 | -3 |
+------+------+------+------+
simd_reduce_add_ordered ¶
simd_reduce_add_ordered :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_add_ordered :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = result + e
}
return result
}
Reduce a vector to a scalar by adding up all the lanes in an ordered fashion.
This procedure returns a scalar that is the ordered sum of all lanes. The
ordered sum may be important for accounting for precision errors in
floating-point computation, as floating-point addition is not associative,
that is (a+b)+c may not be equal to a+(b+c).
Inputs:a: The vector to reduce.
Result: Sum of all lanes, as a scalar.
Operation:
res := 0
for i in 0 ..< len(a) {
res += a[i]
}
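A minimal usage sketch (the procedure name is illustrative); the ordered reduction simply walks the lanes from lane 0 upwards:
import "base:intrinsics"
import "core:fmt"
reduce_add_ordered_example :: proc() {
    v := #simd [4]i32 { 1, 2, 3, 4 }
    fmt.println(intrinsics.simd_reduce_add_ordered(v)) // 1 + 2 + 3 + 4 = 10
}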
simd_reduce_all ¶
simd_reduce_all :: proc(a: #simd[N]T) -> T where type_is_boolean(T) {…}
Reduce SIMD vector to a scalar by performing bitwise AND of all of the lanes.
This procedure returns a scalar that is the result of the bitwise AND operation between all of the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Bitwise AND of all lanes, as a scalar.
Operation:
res := a[0]
for i in 1 ..< len(a) {
    res &= a[i]
}
return res
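A minimal usage sketch (the procedure name is illustrative):
import "base:intrinsics"
import "core:fmt"
reduce_all_example :: proc() {
    a := #simd [4]bool { true, true, false, true }
    b := #simd [4]bool { true, true, true,  true }
    fmt.println(intrinsics.simd_reduce_all(a)) // false: not every lane is true
    fmt.println(intrinsics.simd_reduce_all(b)) // true: every lane is true
}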
simd_reduce_and ¶
simd_reduce_and :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_and :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = result & e
}
return result
}
Reduce a vector to a scalar by performing bitwise AND of all of the lanes.
This procedure returns a scalar that is the result of the bitwise AND operation between all of the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Bitwise AND of all lanes, as a scalar.
Operation:
res := a[0]
for i in 1 ..< len(a) {
    res &= a[i]
}
return res
simd_reduce_any ¶
simd_reduce_any :: proc(a: #simd[N]T) -> T where type_is_boolean(T) {…}
Reduce SIMD vector to a scalar by performing bitwise OR of all of the lanes.
This procedure returns a scalar that is the result of the bitwise OR operation between all of the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Bitwise OR of all lanes, as a scalar.
Operation:
res := 0
for i in 0 ..< len(a) {
res |= a[i]
}
simd_reduce_max ¶
simd_reduce_max :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_max :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = max(result, e)
}
return result
}
Reduce a vector to a scalar by finding the maximum value between all of the lanes.
This procedure returns a scalar that is the maximum value of all the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Maximum value of all lanes, as a scalar.
Operation:
res := a[0]
for i in 1 ..< len(a) {
    res = max(res, a[i])
}
return res
simd_reduce_min ¶
simd_reduce_min :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_min :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = min(result, e)
}
return result
}
Reduce a vector to a scalar by finding the minimum value between all of the lanes.
This procedure returns a scalar that is the minimum value of all the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Minimum value of all lanes, as a scalar.
Operation:
res := a[0]
for i in 1 ..< len(a) {
    res = min(res, a[i])
}
return res
simd_reduce_mul_ordered ¶
simd_reduce_mul_ordered :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_mul_ordered :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = result * e
}
return result
}
Reduce a vector to a scalar by multiplying all the lanes in an ordered fashion.
This procedure returns a scalar that is the ordered product of all lanes.
The ordered product may be important for accounting for precision errors in
floating-point computation, as floating-point multiplication is not associative,
that is (a*b)*c may not be equal to a*(b*c).
Inputs:a: The vector to reduce.
Result: Product of all lanes, as a scalar.
Operation:
res := 1
for i in 0 ..< len(a) {
res *= a[i]
}
simd_reduce_or ¶
simd_reduce_or :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_or :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = result | e
}
return result
}
Reduce a vector to a scalar by performing bitwise OR of all of the lanes.
This procedure returns a scalar that is the result of the bitwise OR operation between all of the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Bitwise OR of all lanes, as a scalar.
Operation:
res := 0
for i in 0 ..< len(a) {
res |= a[i]
}
simd_reduce_xor ¶
simd_reduce_xor :: proc(a: #simd[N]T) -> T {…}
Performs a reduction of a #simd vector a, returning the result as a scalar. The return type matches the element-type T of the #simd vector input. See the following pseudocode:
simd_reduce_xor :: proc(v: #simd[N]T) -> T {
result := simd_extract(v, 0)
for i in 1..<N {
e := simd_extract(v, i)
result = result ~ e
}
return result
}
Reduce SIMD vector to a scalar by performing bitwise XOR of all of the lanes.
This procedure returns a scalar that is the result of the bitwise XOR operation between all of the lanes in a vector.
Inputs:a: The vector to reduce.
Result: Bitwise XOR of all lanes, as a scalar.
Operation:
res := 0
for i in 0 ..< len(a) {
res ~= a[i]
}
simd_replace ¶
simd_replace :: proc(a: #simd[N]T, idx: uint, elem: T) -> #simd[N]T {…}
Replaces a single scalar element from a #simd vector and returns a new vector.
Replace the value in a vector's lane.
This procedure places a scalar value at the lane corresponding to the given index of the vector.
Inputs:a: The vector to replace a lane in.
idx: The lane index.
elem: The scalar to place.
Returns:
Vector with the specified lane replaced.
Operation:
res := a
res[idx] = elem
return res
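A minimal usage sketch (the procedure name is illustrative):
import "base:intrinsics"
import "core:fmt"
simd_replace_example :: proc() {
    a := #simd [4]i32 { 1, 2, 3, 4 }
    // Lane 2 is replaced with 9; a new vector is returned and `a` is unchanged.
    fmt.println(intrinsics.simd_replace(a, 2, 9)) // <1, 2, 9, 4>
}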
simd_runtime_swizzle ¶
simd_runtime_swizzle :: proc(table: #simd[N]T, indices: #simd[N]T) -> #simd[N]T where type_is_integer(T) {…}
Runtime Equivalent to Shuffle.
Performs element-wise table lookups using runtime indices. Each element in the indices vector selects an element from the table vector. The indices are automatically masked to prevent out-of-bounds access.
This operation is hardware-accelerated on most platforms when using 8-bit integer vectors. For other element types or unsupported vector sizes, it falls back to software emulation.
Inputs:table: The lookup table vector (should be power-of-2 size for correct masking).
indices: The indices vector (automatically masked to valid range).
Returns:
A vector where result[i] = table[indices[i] & (table_size-1)].
Operation:
for i in 0 ..< len(indices) {
masked_index := indices[i] & (len(table) - 1)
result[i] = table[masked_index]
}
return result
Implementation:
| Platform    | Lane Size                                 | Implementation      |
|-------------|-------------------------------------------|---------------------|
| x86-64      | pshufb (16B), vpshufb (32B), AVX512 (64B) | Single vector       |
| ARM64       | tbl1 (16B), tbl2 (32B), tbl4 (64B)        | Automatic splitting |
| ARM32       | vtbl1 (8B), vtbl2 (16B), vtbl4 (32B)      | Automatic splitting |
| WebAssembly | i8x16.swizzle (16B), Emulation (>16B)     | Mixed               |
| Other       | Emulation                                 | Software            |
Example:
import "core:simd"
import "core:fmt"
runtime_swizzle_example :: proc() {
table := simd.u8x16{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
indices := simd.u8x16{15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
result := simd.runtime_swizzle(table, indices)
fmt.println(result) // Expected: {15, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
}
simd_saturating_add ¶
simd_saturating_add :: proc(a, b: #simd[N]T) -> #simd[N]T where type_is_integer(T) {…}
Saturated addition of SIMD vectors.
The saturated sum is just like a normal sum, except for the treatment of the result upon overflow or underflow. In saturated operations, the result is not wrapped to the bit-width of the lane, and instead is kept clamped between the minimum and the maximum values of the lane type.
This procedure returns a vector where each lane is the saturated sum of the
corresponding lanes of vectors a and b.
Inputs:a: An integer vector.
b: An integer vector.
Returns:
The saturated sum of the two vectors.
Operation:
for i in 0 ..< len(res) {
    switch {
    case b[i] > max(type_of(a[i])) - a[i]: // a[i] + b[i] overflows
        res[i] = max(type_of(a[i]))
    case b[i] < min(type_of(a[i])) - a[i]: // a[i] + b[i] underflows
        res[i] = min(type_of(a[i]))
    case:
        res[i] = a[i] + b[i]
    }
}
return res
Example:
// An example for a 4-lane vector `a` of 8-bit signed integers.
+-----+-----+-----+-----+
a: | 0 | 255 | 2 | 3 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 1 | 3 | 2 | -1 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 1 | 255 | 4 | 2 |
+-----+-----+-----+-----+
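A minimal, runnable sketch with unsigned 8-bit lanes (the procedure name is illustrative):
import "base:intrinsics"
import "core:fmt"
saturating_add_example :: proc() {
    a := #simd [4]u8 { 0, 250, 2, 3 }
    b := #simd [4]u8 { 1,  10, 2, 3 }
    // 250 + 10 would wrap around with simd_add; the saturated sum clamps at 255.
    fmt.println(intrinsics.simd_saturating_add(a, b)) // <1, 255, 4, 6>
}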
simd_saturating_sub ¶
simd_saturating_sub :: proc(a, b: #simd[N]T) -> #simd[N]T where type_is_integer(T) {…}
Saturated subtraction of 2 lanes of vectors.
The saturated difference is just like a normal difference, except for the treatment of the result upon overflow or underflow. In saturated operations, the result is not wrapped to the bit-width of the lane, and instead is kept clamped between the minimum and the maximum values of the lane type.
This procedure returns a vector where each lane is the saturated difference of
the corresponding lanes of vectors a and b.
Inputs:a: An integer vector to subtract from.
b: An integer vector.
Returns:
The saturated difference of the two vectors.
Operation:
for i in 0 ..< len(res) {
    switch {
    case b[i] < a[i] - max(type_of(a[i])): // a[i] - b[i] overflows
        res[i] = max(type_of(a[i]))
    case b[i] > a[i] - min(type_of(a[i])): // a[i] - b[i] underflows
        res[i] = min(type_of(a[i]))
    case:
        res[i] = a[i] - b[i]
    }
}
return res
Example:
// An example for a 4-lane vector `a` of 8-bit signed integers.
+-----+-----+-----+-----+
a: | 0 | 255 | 2 | 3 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 3 | 3 | 2 | -1 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 0 | 252 | 0 | 4 |
+-----+-----+-----+-----+
simd_scatter ¶
simd_scatter :: proc(ptr: rawptr, val: #simd[N]T, mask: #simd[N]U) where type_is_integer(U) || type_is_boolean(U) {…}
Perform a scatter store from a vector.
A scatter operation is a memory store operation that stores values from a vector into multiple memory locations. This operation is effectively the opposite of the gather operation.
Inputs:ptr: A vector of memory locations. Each masked location will be written
to with a value from the `val` vector. Pointers in this vector can be `nil` or any other invalid value if the corresponding value in the `mask` parameter is zero.
val: A vector of values to write to the memory locations.
mask: A vector of booleans or unsigned integers that decides which lanes
get written to memory. If the value of the mask is `true` (the lowest bit set), the corresponding lane is written into memory. Otherwise it's not written into memory.
Operation:
for i in 0 ..< len(ptr) {
if mask[i]&1 == 1 {
ptr[i]^ = val[i]
}
}
Example:
// The example below writes the value `127` into the second element of two different
// arrays. The addresses of the store destinations are placed in the first and the
// third lane of the `ptrs` vector, and the `mask` is set accordingly.
import "core:fmt"
import "core:simd"
simd_scatter_example :: proc() {
v1 := [4] f32 {1, 2, 3, 4};
v2 := [4] f32 {5, 6, 7, 8};
ptrs := #simd [4]rawptr { &v1[1], nil, &v2[1], nil }
mask := #simd [4]bool { true, false, true, false }
vals := #simd [4]f32 { 0x7f, 0x7f, 0x7f, 0x7f }
simd.scatter(ptrs, vals, mask)
fmt.println(v1)
fmt.println(v2)
}
[1, 127, 3, 4]
[5, 127, 7, 8]
The graphic below shows how the data gets written into memory.
+-------------------+
mask: | 1 | 0 | 1 | 0 |
+-------------------+
| | | |
v X v X
+-------------------+
vals: | d0 | d1 | d2 | d3 |
+-------------------+
| \
v v
+-----------------------+
ptrs: | &m0 | nil | &m2 | nil |
+-----------------------+
simd_select ¶
Select values from one of the two vectors.
This procedure returns a vector where each lane takes its value from the
corresponding lane of one of the two input vectors, based on the cond
parameter. On each lane, if the value of the cond parameter is true (or
non-zero), the result lane takes the value from the true input vector;
otherwise the result lane takes the value from the false input vector.
Inputs:cond: The condition vector.
true: The first input vector.
false: The second input vector.
Result: The result of selecting values from the two input vectors.
Operation:
res = {}
for i in 0 ..< len(cond) {
if cond[i] {
res[i] = true[i]
} else {
res[i] = false[i]
}
}
return res
Example:
// The following example selects values from the two input vectors, `a` and `b`
// into a single vector.
import "core:fmt"
import "core:simd"
simd_select_example :: proc() {
a := #simd [4] f64 { 1,2,3,4 }
b := #simd [4] f64 { 5,6,7,8 }
cond := #simd[4] int { 1, 0, 1, 0 }
fmt.println(simd.select(cond,a,b))
}
<1, 6, 3, 8>
Graphically, the operation looks as follows. The t and f represent the
true and false vectors respectively:
0 1 2 3 0 1 2 3
+-----+-----+-----+-----+ +-----+-----+-----+-----+
t: | 1 | 2 | 3 | 4 | f: | 5 | 6 | 7 | 8 |
+-----+-----+-----+-----+ +-----+-----+-----+-----+
^ ^ ^ ^
| | | |
| | | |
| .--- | ----------------------' |
| | | .-----------------------------'
+-----+-----+-----+-----+
cond: | 1 | 0 | 1 | 0 |
+-----+-----+-----+-----+
^ ^ ^ ^
| | | |
+-----+-----+-----+-----+
res: | 1 | 6 | 3 | 8 |
+-----+-----+-----+-----+
simd_shl ¶
simd_shl :: proc(a: #simd[N]T, b: #simd[N]Unsigned_Integer) -> #simd[N]T {…}
Keeps Odin's behaviour: (x << y) if y <= mask else 0
Shift left lanes of a vector.
This procedure returns a vector, such that each lane holds the result of a
shift-left (aka shift-up) operation of the corresponding lane from vector a by the shift
amount from the corresponding lane of the vector b.
If the shift amount is greater than the bit-width of a lane, the result is 0
in the corresponding positions of the result.
Inputs:a: An integer vector of values to shift.
b: An unsigned integer vector of the shift amounts.
Result:
A vector, where each lane is the lane from a shifted left by the amount
specified in the corresponding lane of the vector b.
Operation:
for i in 0 ..< len(res) {
if b[i] < 8*size_of(a[i]) {
res[i] = a[i] << b[i]
} else {
res[i] = 0
}
}
return res
Example:
// An example for a 4-lane 8-bit signed integer vector `a`.
+-------+-------+-------+-------+
a: | 0x11 | 0x55 | 0x03 | 0xff |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 2 | 1 | 33 | 1 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+--------+
| 0x44 | 0xaa | 0 | 0xfe |
+-------+-------+-------+--------+
simd_shl_masked ¶
simd_shl_masked :: proc(a: #simd[N]T, b: #simd[N]Unsigned_Integer) -> #simd[N]T {…}
Similar to C's behaviour: x << (y & mask)
Shift left lanes of a vector (masked).
This procedure returns a vector, such that each lane holds the result of a
shift-left (aka shift-up) operation of the corresponding lane of the vector a by the shift
amount from the corresponding lane of the vector b.
The shift amount is wrapped (masked) to the bit-width of the lane.
Inputs:a: An integer vector of values to shift.
b: An unsigned integer vector of the shift amounts.
Result:
A vector, where each lane is the lane from a shifted left by the amount
specified in the corresponding lane of the vector b.
Operation:
for i in 0 ..< len(res) {
mask := 8*size_of(a[i]) - 1
res[i] = a[i] << (b[i] & mask)
}
return res
Example:
// An example for a 4-lane vector `a` of 8-bit signed integers.
+-------+-------+-------+-------+
a: | 0x11 | 0x55 | 0x03 | 0xff |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 2 | 1 | 33 | 1 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+--------+
| 0x44 | 0xaa | 0x06 | 0xfe |
+-------+-------+-------+--------+
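A minimal sketch contrasting simd_shl and simd_shl_masked on the same inputs (the procedure name is illustrative):
import "base:intrinsics"
import "core:fmt"
shl_comparison_example :: proc() {
    a := #simd [4]u8 { 0x11, 0x55, 0x03, 0xff }
    b := #simd [4]u8 {    2,    1,   33,    1 }
    fmt.println(intrinsics.simd_shl(a, b))        // <68, 170, 0, 254>   out-of-range shift gives 0
    fmt.println(intrinsics.simd_shl_masked(a, b)) // <68, 170, 6, 254>   shift amount wrapped: 33 & 7 == 1
}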
simd_shr ¶
simd_shr :: proc(a: #simd[N]T, b: #simd[N]Unsigned_Integer) -> #simd[N]T {…}
Keeps Odin's behaviour: (x >> y) if y <= mask else 0
Shift right lanes of a vector.
This procedure returns a vector, such that each lane holds the result of a
shift-right (aka shift-down) operation of the corresponding lane of the vector a by the shift
amount from the corresponding lane of the vector b.
If the shift amount is greater than the bit-width of a lane, the result is 0
in the corresponding positions of the result.
If the first vector is a vector of signed integers, the arithmetic shift operation is performed. Otherwise, if the first vector is a vector of unsigned integers, a logical shift is performed.
Inputs:a: An integer vector of values to shift.
b: An unsigned integer vector of the shift amounts.
Result:
A vector, where each lane is the lane from a shifted right by the amount
specified in the corresponding lane of the vector b.
Operation:
for i in 0 ..< len(res) {
if b[i] < 8*size_of(a[i]) {
res[i] = a[i] >> b[i]
} else {
res[i] = 0
}
}
return res
Example:
// An example for a 4-lane 8-bit signed integer vector `a`.
+-------+-------+-------+-------+
a: | 0x11 | 0x55 | 0x03 | 0xff |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 2 | 1 | 33 | 1 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+--------+
| 0x04 | 0x2a | 0 | 0xff |
+-------+-------+-------+--------+
simd_shr_masked ¶
simd_shr_masked :: proc(a: #simd[N]T, b: #simd[N]Unsigned_Integer) -> #simd[N]T {…}
Similar to C's behaviour: x >> (y & mask)
Shift right lanes of a vector (masked).
This procedure returns a vector, such that each lane holds the result of a
shift-right (aka shift-down) operation of the corresponding lane of the vector a by the shift
amount from the corresponding lane of the vector b.
The shift amount is wrapped (masked) to the bit-width of the lane.
If the first vector is a vector of signed integers, the arithmetic shift operation is performed. Otherwise, if the first vector is a vector of unsigned integers, a logical shift is performed.
Inputs:a: An integer vector of values to shift.
b: An unsigned integer vector of the shift amounts.
Result:
A vector, where each lane is the lane from a shifted right by the amount
specified in the corresponding lane of the vector b.
Operation:
for i in 0 ..< len(res) {
mask := 8*size_of(a[i]) - 1
res[i] = a[i] >> (b[i] & mask)
}
return res
Example:
// An example for a 4-lane vector `a` of 8-bit signed integers.
+-------+-------+-------+-------+
a: | 0x11 | 0x55 | 0x03 | 0xff |
+-------+-------+-------+-------+
+-------+-------+-------+-------+
b: | 2 | 1 | 33 | 1 |
+-------+-------+-------+-------+
res:
+-------+-------+-------+--------+
| 0x04 | 0x2a | 0x01 | 0xff |
+-------+-------+-------+--------+
simd_shuffle ¶
Reorder the lanes of two SIMD vectors.
This procedure returns a vector containing scalars from the lanes of two vectors, according to the provided indices vector. Each index in the indices vector specifies the lane of the scalar, from one of the two input vectors, that will be written at the corresponding position of the result vector. If the index is within the bounds 0 ..< len(a), it selects that lane of the first input vector. Otherwise it selects lane index - len(a) of the second input vector.
Inputs:a: The first input vector.
b: The second input vector.
indices: The indices.
Result: Input vectors, shuffled according to the indices.
Operation:
res = {}
for i in 0 ..< len(indices) {
idx := indices[i]
if idx < len(a) {
res[i] = a[idx]
} else {
res[i] = b[idx - len(a)]
}
}
return res
Example:
// The example below shows how the indices are used to determine lanes of the
// input vector that are shuffled into the result vector.
import "core:fmt"
import "core:simd"
simd_shuffle_example :: proc() {
a := #simd [4]f32 { 1, 2, 3, 4 }
b := #simd [4]f32 { 5, 6, 7, 8 }
res := simd.shuffle(a, b, 0, 4, 2, 5)
fmt.println(res)
}
<1, 5, 3, 6>
The graphical representation of the operation is as follows. The idx vector in
the picture represents the indices parameter:
0 1 2 3 4 5 6 7
+-----+-----+-----+-----+ +-----+-----+-----+-----+
a: | 1 | 2 | 3 | 4 | b: | 5 | 6 | 7 | 8 |
+-----+-----+-----+-----+ +-----+-----+-----+-----+
^ ^ ^ ^
| | | |
| | | |
| .--- | ----------------' |
| | | .-----------------'
+-----+-----+-----+-----+
idx: | 0 | 4 | 2 | 5 |
+-----+-----+-----+-----+
^ ^ ^ ^
| | | |
+-----+-----+-----+-----+
res: | 1 | 5 | 3 | 6 |
+-----+-----+-----+-----+
simd_sub ¶
simd_sub :: proc(a, b: #simd[N]T) -> #simd[N]T {…}
Subtract SIMD vectors.
This procedure returns a vector, where each lane holds the difference between
the corresponding lanes of the vectors a and b. The lanes from the vector
b are subtracted from the corresponding lanes of the vector a.
Inputs:a: An integer or a float vector to subtract from.
b: An integer or a float vector.
Returns:
A vector that is the difference of two vectors, a - b.
Operation:
for i in 0 ..< len(res) {
res[i] = a[i] - b[i]
}
return res
Example:
+-----+-----+-----+-----+
a: | 2 | 2 | 2 | 2 |
+-----+-----+-----+-----+
+-----+-----+-----+-----+
b: | 0 | 1 | 2 | 3 |
+-----+-----+-----+-----+
res:
+-----+-----+-----+-----+
| 2 | 1 | 0 | -1 |
+-----+-----+-----+-----+
simd_to_bits ¶
simd_to_bits :: proc(v: #simd[N]T) -> #simd[N]Integer where size_of(T) == size_of(Integer), type_is_unsigned(Integer) {…}
Transmute a SIMD vector into an integer vector.
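A minimal usage sketch (the procedure name is illustrative); each f32 lane is reinterpreted as its u32 bit pattern:
import "base:intrinsics"
import "core:fmt"
simd_to_bits_example :: proc() {
    v := #simd [4]f32 { 1, -1, 0, 2 }
    fmt.println(intrinsics.simd_to_bits(v)) // <1065353216, 3212836864, 0, 1073741824>
}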
simd_trunc ¶
simd_trunc :: proc(a: #simd[N]any_float) -> #simd[N]any_float {…}
lane-wise trunc
Truncate each lane in a SIMD vector.
soa_struct ¶
A call-like way to construct an #soa struct. Possibly to be deprecated in the future.
sqrt ¶
sqrt :: proc(x: $T) -> T where type_is_float(T) || (type_is_simd_vector(T) && type_is_float(type_elem_type(T))) {…}
Returns the square root of a value. If the input value is negative, the behaviour is platform-defined.
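A minimal usage sketch (the procedure name is illustrative); sqrt works on scalar floats and on SIMD vectors of floats:
import "base:intrinsics"
import "core:fmt"
sqrt_example :: proc() {
    fmt.println(intrinsics.sqrt(f32(9)))                       // the square root of 9, i.e. 3
    fmt.println(intrinsics.sqrt(#simd [4]f32 { 1, 4, 9, 16 })) // lanes become 1, 2, 3, 4
}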
syscall ¶
system call for Linux and Darwin only
syscall_bsd ¶
system call for FreeBSD, NetBSD, etc.
type_base_type ¶
type_base_type :: proc($T: typeid) -> type {…}
Returns the type without any distinct indirection e.g. Foo :: distinct int, type_base_type(Foo) == int
type_bit_set_elem_type ¶
type_bit_set_elem_type :: proc($T: typeid) -> typeid where type_is_bit_set(T) {…}
Returns the element type of a bit_set T.
type_bit_set_underlying_type ¶
type_bit_set_underlying_type :: proc($T: typeid) -> typeid where type_is_bit_set(T) {…}
Returns the underlying/backing type of a bit_set T rather than the element type.
Example:
assert(type_bit_set_underlying_type(bit_set[0..<8]) == u8)
assert(type_bit_set_underlying_type(bit_set[Enum; int]) == int)
type_convert_variants_to_pointers ¶
type_convert_variants_to_pointers :: proc($T: typeid) -> typeid where type_is_union(T) {…}
Returns a type which converts all of the variants of a union to be pointer types of those variants.
Example:
Foo :: union {A, B, C}
type_convert_variants_to_pointers(Foo) == union {^A, ^B, ^C}
type_core_type ¶
type_core_type :: proc($T: typeid) -> type {…}
Returns the type without any distinct indirection and the underlying integer type for an enum or bit_set e.g. Foo :: distinct int, type_core_type(Foo) == int, or Bar :: enum u8 {A}, type_core_type(Bar) == u8, or Baz :: bit_set[Bar; u32], type_core_type(Baz) == u32
type_elem_type ¶
type_elem_type :: proc($T: typeid) -> type {…}
Returns the element type of a compound type.
Complex number: the underlying float type (e.g. complex64 -> f32)
Quaternion: the underlying float type (e.g. quaternion256 -> f64)
Pointer: the base type (e.g. ^T -> T)
Array: the element type (e.g. [N]T -> T)
Enumerated Array: the element type (e.g. [Enum]T -> T)
Slice: the element type (e.g. []T -> T)
Dynamic Array: the element type (e.g. [dynamic]T -> T)
type_equal_proc ¶
type_equal_proc :: proc($T: typeid) -> (equal: proc "contextless" (rawptr, rawptr) -> bool) where type_is_comparable(T) {…}
Returns the underlying procedure that is used to compare pointers to two values of the same type together. This is used by the map type and general complicated comparisons.
type_field_type ¶
Returns the type of the field name on the type T. Note: the field must exist, otherwise this will not compile.
type_has_field ¶
Returns true if the field name exists on the type T.
type_has_nil ¶
Types that support nil:
rawptr
any
cstring
cstring16
typeid
enum
bit_set
Slices
proc values
Pointers
#soa Pointers
Multi-Pointers
Dynamic Arrays
map
union without the #no_nil directive
#soa slices
#soa dynamic arrays
type_hasher_proc ¶
type_hasher_proc :: proc($T: typeid) -> (hasher: proc "contextless" (data: rawptr, seed: uintptr) -> uintptr) where type_is_comparable(T) {…}
Returns the underlying procedure that is used to hash a pointer to a value used by the map type.
type_is_any ¶
Return true if the type is derived from any
type_is_array ¶
Returns true if the base-type is a fixed-length array, i.e. [N]T
type_is_bit_set ¶
Returns true if the base-type is a bit_set
type_is_boolean ¶
Return true if the type is derived from any boolean type: bool, b8, b16, b32, b64
type_is_comparable ¶
Returns true if the type is comparable, which allows for the use of == and != binary operators.
One of the following non-compound types (as well as any distinct forms): rune, string, cstring, string16, cstring16, typeid, pointer, #soa related pointer, multi-pointer, enum, procedure, matrix, bit_set, #simd vector.
One of the following compound types (as well as any distinct forms): any array or enumerated array where its element type is also comparable; any struct where all of its fields are comparable; any struct #raw_union where all of its fields are simply comparable (see type_is_simple_compare); any union where all of its variants are comparable.
type_is_complex ¶
Return true if the type is derived from any complex type: complex32, complex64, complex128
type_is_cstring ¶
Returns true if the type is derived from the cstring type
type_is_cstring16 ¶
Returns true if the type is derived from the cstring16 type
type_is_dereferenceable ¶
Must be a pointer type ^T (not rawptr) or an #soa related pointer type.
type_is_dynamic_array ¶
Returns true if the base-type is a dynamic array, i.e. [dynamic]T
type_is_endian_big ¶
Returns true if the type is big endian specific or it is a platform native layout which is also big endian. Example: type_is_endian_big(u32be) == true, type_is_endian_big(u32le) == false, type_is_endian_big(u32) == (ODIN_ENDIAN == .Big)
type_is_endian_little ¶
Returns true if the type is little endian specific or it is a platform native layout which is also little endian. Example: type_is_endian_little(u32le) == true, type_is_endian_little(u32be) == false, type_is_endian_little(u32) == (ODIN_ENDIAN == .Little)
type_is_endian_platform ¶
Returns true if the type uses the platform native layout rather than a specific layout. Example: type_is_endian_platform(u32) == true, type_is_endian_platform(u32le) == false, type_is_endian_platform(u32be) == false
type_is_enumerated_array ¶
Returns true if the base-type is an enumerated array, i.e. [Some_Enum]T
type_is_float ¶
Return true if the type is derived from any float type
type_is_indexable ¶
Returns true if a value of this type can be indexed:
string or string16
Any fixed-length array
Any slice
Any dynamic array
Any map
Any multi-pointer
Any enumerated array
Any matrix
type_is_integer ¶
Return true if the type is derived from any integer type
type_is_map ¶
Returns true if the base-type is a map, i.e. map[K]V
type_is_matrix ¶
Returns true if the base-type is a matrix
type_is_matrix_column_major ¶
type_is_matrix_column_major :: proc($T: typeid) -> bool where type_is_matrix(T) {…}
Returns true if the type passed is a matrix using #column_major ordering. This intrinsic only allows matrices and will not compile otherwise. Note: The default matrix layout is #column_major.
type_is_matrix_row_major ¶
type_is_matrix_row_major :: proc($T: typeid) -> bool where type_is_matrix(T) {…}
Returns true if the type passed is a matrix using #row_major ordering. This intrinsic only allows matrices and will not compile otherwise. Note: The default matrix layout is #column_major.
type_is_multi_pointer ¶
Returns true if the base-type is a multi pointer, i.e. [^]T
type_is_nearly_simple_compare ¶
easily compared using memcmp (== and !=) (including floats)
type_is_numeric ¶
Returns true if it is a "numeric" type in nature:
Any integer
Any float
Any complex number
Any quaternion
Any enum
Any fixed-length array of a numeric type
type_is_ordered ¶
Returns true if the type is an integer, float, rune, any string, pointer, or multi-pointer
type_is_ordered_numeric ¶
Returns true if the type is an integer, float, or rune
type_is_pointer ¶
Returns true if the base-type is a pointer, i.e. ^T or rawptr
type_is_quaternion ¶
Return true if the type is derived from any quaternion type: quaternion64, quaternion128, quaternion256
type_is_raw_union ¶
Returns true if the base-type is a struct #raw_union
type_is_rune ¶
Return true if the type is derived from the rune type
type_is_simd_vector ¶
Returns true if the base-type is a simd vector, i.e. #simd[N]T
type_is_simple_compare ¶
easily compared using memcmp (== and !=) (not including floats)
type_is_slice ¶
Returns true if the base-type is a slice, i.e. []T
type_is_sliceable ¶
Returns true if a value of this type can be sliced:
string or string16
Any fixed-length array
Any slice
Any dynamic array
Any multi-pointer
type_is_specialization_of ¶
Returns true if the type passed is a specialization of a parametric polymorphic type.
Example:
Foo :: struct($T: typeid) {x: T}
assert(type_is_specialization_of(Foo(int)) == true)
assert(type_is_specialization_of(Foo) == false)
assert(type_is_specialization_of(i32) == false)
type_is_specialized_polymorphic_record ¶
Returns true if the record type (struct or union) passed is a specialized polymorphic record. Returns false when the type is not polymorphic in the first place.
type_is_string ¶
Returns true if the type is derived from any string type: string, cstring, string16, cstring16
type_is_string16 ¶
Returns true if the type is derived from the string16 type AND not cstring16
type_is_struct ¶
Returns true if the base-type is a struct
type_is_subtype_of ¶
Returns true if T is a subtype of type U (i.e. using was applied on a field of type U).
type_is_typeid ¶
Return true if the type is derived from typeid
type_is_union ¶
Returns true if the base-type is a union, but not struct #raw_union
type_is_unsigned ¶
Returns true if the type is an unsigned integer or an enum backed by an unsigned integer, and false otherwise for any other type
type_is_unspecialized_polymorphic_record ¶
Returns true if the record type (struct or union) passed is an unspecialized polymorphic record. Returns false when the type is not polymorphic in the first place.
type_is_valid_map_key ¶
Any comparable type which is not-untyped nor generic.
type_is_valid_matrix_elements ¶
Any integer, float, or complex number type (not-untyped).
type_is_variant_of ¶
type_is_variant_of :: proc($U, $V: typeid) -> bool where type_is_union(U) {…}
Returns true if V is a variant of the union type U.
Example:
Foo :: union {i32, f32}
assert(type_is_variant_of(Foo, i32) == true)
assert(type_is_variant_of(Foo, f32) == true)
assert(type_is_variant_of(Foo, string) == false)
type_merge ¶
type_merge :: proc($U, $V: typeid) -> typeid where type_is_union(U), type_is_union(V) {…}
Merges two unions' variants into one bigger union.
Note: the merging is done in order and duplicate variant types are ignored.
Example:
A :: union{i32, f32, string}
B :: union{bool, complex64}
C :: union{string, bool, i32}
type_merge(A, B) == union{i32, f32, string, bool, complex64}
type_merge(A, C) == union{i32, f32, string, bool}
type_merge(B, C) == union{bool, complex64, string, i32}
type_merge(C, A) == union{string, bool, i32, f32}
type_polymorphic_record_parameter_count ¶
Returns the number of parametric polymorphic parameters to a parametric polymorphic record type (struct or union). Fails if the type is not such a type.
type_polymorphic_record_parameter_value ¶
Returns the value of a specialized parametric polymorphic record type (struct or union) at a specified index. Fails if the type is not such a type.
type_proc_parameter_count ¶
type_proc_parameter_count :: proc($T: typeid) -> int where type_is_proc(T) {…}
Returns the number of parameters a procedure type has.
Example:
assert(type_proc_parameter_count(proc(i32, f32) -> bool) == 2)
type_proc_parameter_type ¶
type_proc_parameter_type :: proc($T: typeid, index: int) -> typeid where type_is_proc(T) {…}
Returns the type of a parameter of a procedure type at the specified index.
Example:
assert(type_proc_parameter_type(proc(i32, f32) -> bool, 1) == f32)
type_proc_return_count ¶
type_proc_return_count :: proc($T: typeid) -> int where type_is_proc(T) {…}
Returns the number of return values a procedure type has.
Example:
assert(type_proc_return_count(proc(i32, f32) -> bool) == 1)
type_proc_return_type ¶
type_proc_return_type :: proc($T: typeid, index: int) -> typeid where type_is_proc(T) {…}
Returns the type of a return value of a procedure type at the specified index.
Example:
assert(type_proc_return_type(proc(i32, f32) -> bool, 0) == bool)
type_struct_field_count ¶
type_struct_field_count :: proc($T: typeid) -> int where type_is_struct(T) {…}
Returns the number of fields in a struct type.
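A minimal usage sketch in the assert style of the other examples (Foo is a hypothetical struct):
Foo :: struct {x: u8, y: u32}
assert(type_struct_field_count(Foo) == 2)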
type_struct_has_implicit_padding ¶
type_struct_has_implicit_padding :: proc($T: typeid) -> bool where type_is_struct(T) {…}
Returns whether the struct has any implicit padding to ensure correct alignment for the fields.
Example:
Foo :: struct {x: u8, y: u32}
assert(type_struct_has_implicit_padding(Foo) == true)
type_union_base_tag_value ¶
type_union_base_tag_value :: proc($T: typeid) -> int where type_is_union(T) {…}
Returns the first valid tag value for the first variant. If #no_nil is used, the returned value will be 0, otherwise 1 will be returned.
Example:
assert(type_union_base_tag_value(union {i32, f32}) == 1)
assert(type_union_base_tag_value(union #no_nil {i32, f32}) == 0)
assert(type_union_base_tag_value(Maybe(rawptr)) == 1)
type_union_tag_offset ¶
type_union_tag_offset :: proc($T: typeid) -> uintptr where type_is_union(T) {…}
Returns the offset to the tag in bytes from the start of the union. If no tag is used (e.g. Maybe(Pointer_Like_Type)), then the size of the variant block space is returned.
Note: unions store the tag after the variant block space.
type_union_tag_type ¶
type_union_tag_type :: proc($T: typeid) -> typeid where type_is_union(T) {…}
Returns the type used to store the tag for a union. If no tag is used (e.g. Maybe(Pointer_Like_Type)), then u8 is returned.
Possible tag types: u8, u16, u32, u64
type_union_variant_count ¶
type_union_variant_count :: proc($T: typeid) -> int where type_is_union(T) {…}
Returns the number of possible variants a union can be (excluding a possible nil state).
Example:
assert(type_union_variant_count(union {i32, f32}) == 2)
assert(type_union_variant_count(union {i32, f32, b32}) == 3)
assert(type_union_variant_count(union {}) == 0)
type_variant_index_of ¶
type_variant_index_of :: proc($U, $V: typeid) -> int where type_is_union(U) {…}
Returns the index of a variant V of a union U.
Example:
Foo :: union{i32, f32, string}
assert(type_variant_index_of(Foo, i32) == 0)
assert(type_variant_index_of(Foo, f32) == 1)
assert(type_variant_index_of(Foo, string) == 2)
type_variant_type_of ¶
type_variant_type_of :: proc($T: typeid, $index: int) -> typeid where type_is_union(T) {…}
Returns the type of a union T's variant at a specified index.
Example:
Foo :: union{i32, f32, string}
assert(type_variant_type_of(Foo, 0) == i32)
assert(type_variant_type_of(Foo, 1) == f32)
assert(type_variant_type_of(Foo, 2) == string)
unaligned_load ¶
unaligned_load :: proc(src: ^$T) -> T {…}
Performs a load on an unaligned value src.
unaligned_store ¶
unaligned_store :: proc(dst: ^$T, val: T) -> T {…}
Performs a store on an unaligned value dst.
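A minimal sketch covering both unaligned_load and unaligned_store, reading and writing a u32 at an arbitrary byte offset of a buffer (the helper names are illustrative):
import "base:intrinsics"
read_u32_unaligned :: proc(buf: []u8, offset: int) -> u32 {
    // The address &buf[offset] may not be 4-byte aligned.
    return intrinsics.unaligned_load(cast(^u32)&buf[offset])
}
write_u32_unaligned :: proc(buf: []u8, offset: int, val: u32) {
    intrinsics.unaligned_store(cast(^u32)&buf[offset], val)
}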
valgrind_client_request ¶
valgrind_client_request :: proc(default: uintptr, request: uintptr, a0, a1, a2, a3, a4: uintptr) -> uintptr {…}
volatile_load ¶
volatile_load :: proc(dst: ^$T) -> T {…}
Tells the optimizing backend of a compiler to not change the number of 'volatile' operations nor change their order of execution relative to other 'volatile' operations. Optimizers are allowed to change the order of volatile operations relative to non-volatile operations.
Note: This has nothing to do with Java's 'volatile' and has no cross-thread synchronization behaviour. Use atomics if this behaviour is wanted.
volatile_store ¶
volatile_store :: proc(dst: ^$T, val: T) {…}
Tells the optimizing backend of a compiler to not change the number of 'volatile' operations nor change their order of execution relative to other 'volatile' operations. Optimizers are allowed to change the order of volatile operations relative to non-volatile operations.
Note: This has nothing to do with Java's 'volatile' and has no cross-thread synchronization behaviour. Use atomics if this behaviour is wanted.
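A minimal sketch of how volatile_load and volatile_store are typically used together, polling a memory-mapped device register; the register type and bit layout are hypothetical:
import "base:intrinsics"
Status_Reg :: ^u32
poll_until_ready :: proc(reg: Status_Reg) {
    // The volatile load cannot be elided or hoisted out of the loop by the optimizer.
    for intrinsics.volatile_load(reg) & 1 == 0 {
        intrinsics.cpu_relax()
    }
    // Acknowledge by clearing the register with a volatile store.
    intrinsics.volatile_store(reg, 0)
}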
wasm_memory_atomic_notify32 ¶
Wakes threads waiting on the address indicated by ptr, up to the given maximum (waiters). If waiters is zero, no threads are woken up. Threads previously blocked with wasm_memory_atomic_wait32 will be woken up.
Returns:
The number of threads woken up.
wasm_memory_atomic_wait32 ¶
Blocks the calling thread for a given duration if the value pointed to by ptr is equal to the value of expected.
timeout_ns is the maximum number of nanoseconds the calling thread will be blocked for. If timeout_ns is negative, the calling thread will be blocked forever.
Returns:
0: the thread blocked and then was woken up
1: the loaded value from ptr did not match expected, the thread did not block
2: the thread blocked, but the timeout expired
x86_cpuid ¶
x86 Targets Only (i386, amd64)
Implements the cpuid instruction.
x86_xgetbv ¶
x86 Targets Only (i386, amd64)
Implements the xgetbv instruction.