SIMD API Reference

CPU SIMD intrinsics for vector operations (SSE2/AVX2/NEON)

Implementation: Requires the C runtime (simd_runtime.c) and platform-specific SIMD support: SSE2/AVX2 on x86_64, NEON on ARM64. The functions are extern declarations that link to the platform intrinsics.

Import

U std/simd

Overview

The simd module provides wrappers for CPU SIMD (Single Instruction, Multiple Data) intrinsics, supporting the x86_64 SSE2/AVX2 and ARM NEON instruction sets. It falls back to scalar operations when no SIMD instruction set is available.
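The fallback described above can be illustrated in C with compiler-defined feature macros. This is a minimal sketch, not the contents of simd_runtime.c: the helper name `simd_compiled_width` is hypothetical, and the check happens at compile time only (the actual runtime may detect features differently, for example at run time).

```c
/* Hypothetical helper: returns the widest vector width (in bits) that the
 * compiler was allowed to target, or 0 when only scalar code is available.
 * The macros tested here are predefined by GCC, Clang, and MSVC. */
static int simd_compiled_width(void) {
#if defined(__AVX512F__)
    return 512;   /* AVX-512 Foundation */
#elif defined(__AVX2__)
    return 256;   /* AVX2 */
#elif defined(__SSE2__) || defined(_M_X64) || defined(__ARM_NEON) || defined(__aarch64__)
    return 128;   /* SSE2 on x86_64, NEON on ARM64 */
#else
    return 0;     /* no SIMD: use the scalar path */
#endif
}
```

The returned values line up with the SIMD_128, SIMD_256, and SIMD_512 constants listed under Vector Widths.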

Constants

Vector Widths

Constant    Value (bits)    Description
SIMD_128    128             SSE2 / NEON
SIMD_256    256             AVX2
SIMD_512    512             AVX-512

Element Counts

Constant      Value    Description
F32X4_SIZE    4        128-bit float vector
F32X8_SIZE    8        256-bit float vector
F64X2_SIZE    2        128-bit double vector
F64X4_SIZE    4        256-bit double vector
I32X4_SIZE    4        128-bit int vector
I32X8_SIZE    8        256-bit int vector
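Each element count is the vector width in bits divided by the element size in bits. The C sketch below reproduces that arithmetic; `simd_lanes` is a hypothetical name used only for illustration and is not part of the module.

```c
#include <stddef.h>

/* Hypothetical helper: number of elements (lanes) that fit in one vector
 * register, given the register width in bits and the element size in bytes. */
static size_t simd_lanes(size_t width_bits, size_t elem_size_bytes) {
    if (elem_size_bytes == 0) {
        return 0; /* avoid division by zero on invalid input */
    }
    return width_bits / (elem_size_bytes * 8);
}
```

For example, `simd_lanes(256, 4)` gives 8, matching F32X8_SIZE, and `simd_lanes(128, 8)` gives 2, matching F64X2_SIZE.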

Struct

SimdVec

S SimdVec {
    data: i64,      # Pointer to aligned memory
    len: i64,       # Number of elements
    elem_size: i64, # Size of each element in bytes (4 for f32/i32, 8 for f64)
    width: i64      # SIMD width (128, 256, 512)
}

A SIMD-friendly vector that stores elements in aligned memory for vectorized operations.
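How the aligned memory is obtained is not specified here. As a sketch under stated assumptions, a C-side mirror of the struct could allocate its buffer with C11 `aligned_alloc`, using the vector width in bytes as the alignment. The names `SimdVecC`, `simd_vec_alloc`, and `simd_vec_free` are hypothetical and may not match simd_runtime.c.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical C mirror of SimdVec: four 64-bit fields in the same order. */
typedef struct {
    int64_t data;      /* pointer to aligned memory, stored as an integer */
    int64_t len;       /* number of elements */
    int64_t elem_size; /* size of each element in bytes */
    int64_t width;     /* SIMD width in bits (128, 256, 512) */
} SimdVecC;

/* Allocates len * elem_size bytes aligned to width_bits / 8.
 * Returns 0 on success, -1 on invalid arguments or allocation failure. */
static int simd_vec_alloc(SimdVecC *v, int64_t len, int64_t elem_size,
                          int64_t width_bits) {
    if (!v || len < 0 || elem_size <= 0) return -1;
    if (width_bits != 128 && width_bits != 256 && width_bits != 512) return -1;

    size_t align = (size_t)width_bits / 8;          /* 16, 32, or 64 bytes */
    size_t bytes = (size_t)len * (size_t)elem_size;
    /* aligned_alloc requires the size to be a multiple of the alignment. */
    size_t padded = (bytes + align - 1) / align * align;
    if (padded == 0) padded = align;

    void *p = aligned_alloc(align, padded);
    if (!p) return -1;

    v->data = (int64_t)(intptr_t)p;
    v->len = len;
    v->elem_size = elem_size;
    v->width = width_bits;
    return 0;
}

/* Releases the buffer and clears the pointer field. */
static void simd_vec_free(SimdVecC *v) {
    free((void *)(intptr_t)v->data);
    v->data = 0;
}
```

Rounding the size up to a multiple of the alignment also means a full-width load on the last partial group of elements stays inside the allocation. `aligned_alloc` is not available on MSVC, where `_aligned_malloc` and `_aligned_free` would be the usual substitute.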

Key Operations

The module provides vectorized arithmetic operations (add, sub, mul, div), dot product, distance calculations, and reduction operations that automatically use the best available SIMD instruction set.
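The function names and signatures of these operations are not listed in this section, so the following is only an illustration of the pattern in C: an element-wise add and a dot product over f32 data that use SSE2 intrinsics when the compiler targets them and a scalar loop otherwise. `add_f32` and `dot_f32` are hypothetical names, and unaligned loads are used so the sketch works on any buffer.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics (also pulls in the SSE float ops) */
#endif

/* out[i] = a[i] + b[i] for i in [0, n). */
static void add_f32(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
#if defined(__SSE2__)
    /* Process four floats per iteration in a 128-bit register. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail, or the whole loop when SSE2 is unavailable. */
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Returns the sum of a[i] * b[i] for i in [0, n). */
static float dot_f32(const float *a, const float *b, size_t n) {
    size_t i = 0;
    float sum = 0.0f;
#if defined(__SSE2__)
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 prod = _mm_mul_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i));
        acc = _mm_add_ps(acc, prod);
    }
    /* Horizontal reduction: spill the four partial sums and add them. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    for (; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the vector path sums in a different order than a plain loop, floating-point results can differ in the last bits between the SIMD and scalar paths for general inputs.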

Example

U std/simd

F main() {
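    # Create SIMD vectors: 4 f64 elements (elem_size 8) in 256-bit registers.
    # ptr_a and ptr_b are assumed to be pointers to aligned memory
    # allocated elsewhere (not shown).
    a := SimdVec { data: ptr_a, len: 4, elem_size: 8, width: SIMD_256 }
    b := SimdVec { data: ptr_b, len: 4, elem_size: 8, width: SIMD_256 }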
}