Tensor

c_tensor.h provides the generic, type-erased N-dimensional array that underpins every typed wrapper in csalt for one-dimensional arrays, two-dimensional matrices, and higher-rank tensors. It stores elements as raw bytes in a contiguous allocator-managed buffer and resolves element size, type identity, shape, strides, and allocator at initialisation time, caching all of them on the struct so that no registry lookups or virtual dispatch overhead occurs during normal operations.

Runtime behaviour is governed by a tensor_mode_t field set at initialisation:

TENSOR_STRUCT — all slots are zero-initialised at construction and considered live for the lifetime of the tensor. len equals alloc equals the product of all shape dimensions. Element access is bounded by alloc and remains valid after clear_tensor. This mode is used for matrices and higher-rank tensors.
ARRAY_STRUCT — len tracks the populated element count; slots in [len, alloc) are inaccessible until exposed by a push operation. Element access is bounded by len. Automatic reallocation on push is optionally enabled via the growth flag. This mode is used for one-dimensional dynamic arrays.

The generic API requires a dtype_id_t argument on every call that reads or writes element data. This type tag is checked against the tag fixed at initialisation and any mismatch is returned as TYPE_MISMATCH before any data is touched.

All memory is managed through an allocator_vtable_t supplied by the caller. csalt does not assume a default allocator — the caller must explicitly provide one at every call site that allocates memory (initialisation and copy). See C Allocator Overview for the full allocator API, the list of available implementations, and guidance on writing your own.

Error handling follows the expected value pattern: operations that produce a new tensor_t return a tensor_expect_t whose has_value field distinguishes success from failure; in-place mutations return an error_code_t directly. Callers must always check the result before using the value.

#include "c_tensor.h"

/* Choose an allocator; see C Allocator Overview for all options. */
allocator_vtable_t alloc = heap_allocator();

/* Create a 3x4 matrix of floats (TENSOR_STRUCT). */
size_t shape[] = { 3, 4 };
tensor_expect_t r = init_tensor(2, shape, FLOAT_TYPE, alloc);
if (!r.has_value) { /* handle r.u.error */ }
tensor_t* t = r.u.value;

/* Write to element (1, 2) using an N-dimensional index. */
float val = 1.5f;
size_t idx[] = { 1, 2 };
set_tensor_nd_index(t, idx, &val, FLOAT_TYPE);

/* Read it back. */
float out = 0.0f;
get_tensor_nd_index(t, idx, &out, FLOAT_TYPE);  /* out == 1.5f */

return_tensor(t);

Note

For day-to-day use with a single element type, prefer a typed wrapper. The generic API is intended for code that must operate on tensors of arbitrary or runtime-determined dtype, such as container libraries, serialisers, or generic algorithms.

Recommended Allocators

Different allocators are appropriate depending on how tensors are constructed, resized, and accessed:

HeapAllocator

Best for:
- general-purpose dynamic arrays and matrices
- frequent resizing of ARRAY_STRUCT tensors
- applications where simplicity and correctness are priorities

The heap allocator provides flexible and predictable behaviour for both fixed-shape tensors and growing arrays.

ArenaAllocator

Best for:
- fixed-shape tensors built once and discarded together
- batch construction of many small tensors within a single lifetime
- intermediate tensors in algorithms

Arena allocation is efficient when tensors are not resized after construction. ARRAY_STRUCT tensors that grow frequently may leave unused memory in the arena.

BuddyAllocator

Best for:
- large fixed-shape tensors or matrices
- systems where memory fragmentation must be controlled
- long-lived tensors with periodic growth

The buddy allocator handles large contiguous allocations well and reduces fragmentation compared to a general heap.

PoolAllocator / SlabAllocator

Best for:
- fixed-capacity tensors of uniform shape
- performance-critical paths with predictable memory use
- many small tensors of identical rank and shape

These allocators are not well suited for ARRAY_STRUCT tensors that grow, but can be very efficient when shape and capacity are known in advance.

Structs

struct tensor_t

struct tensor_expect_t

enum tensor_mode_t

Values:

enumerator TENSOR_STRUCT

enumerator ARRAY_STRUCT

Initialisation and Teardown

tensor_expect_t init_tensor(uint8_t ndim, const size_t *shape, dtype_id_t dtype, allocator_vtable_t alloc_v)

Initialize a new fixed-shape tensor.

Allocates the tensor_t struct (including the inline FAM block holding shape and strides) and its data buffer through the provided allocator vtable. The dtype must be registered in the dtype registry before calling this function. data_size is resolved from the registry once at init and cached on the struct — no further registry lookups are performed during the lifetime of the tensor.

The tensor is created in TENSOR_STRUCT mode. All allocated slots are considered live at construction time, so the data buffer is zero-initialised. len is set equal to alloc, which equals the product of all shape dimensions. growth is set to false. To create an ARRAY_STRUCT tensor, use init_tensor_array or a typed wrapper init function (e.g. init_array).

Strides are computed in C-order (row-major):

strides[ndim - 1] = data_size
strides[i] = strides[i + 1] * shape[i + 1] for i < ndim - 1
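The stride recurrence can be sketched as a standalone helper. This is a minimal illustration that does not use c_tensor.h; the function name sketch_row_major_strides and its parameter layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of c_tensor.h: fill `strides` with
 * C-order (row-major) byte strides for the given shape. */
void sketch_row_major_strides(uint8_t ndim, const size_t *shape,
                              size_t data_size, size_t *strides)
{
    assert(ndim > 0 && shape != NULL && strides != NULL);

    /* The innermost dimension advances by exactly one element. */
    strides[ndim - 1] = data_size;

    /* Walk outward: each stride spans one full run of the next inner
     * dimension. The loop visits i = ndim - 2 down to 0. */
    for (size_t i = (size_t)ndim - 1; i-- > 0; ) {
        strides[i] = strides[i + 1] * shape[i + 1];
    }
}
```

For a 3x4 tensor of 4-byte elements this yields strides of 16 and 4 bytes, so moving one row forward skips four elements.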

Parameters:

ndim – Number of dimensions. Must be > 0. Values above a few dozen are unusual in practice; the field is uint8_t so the maximum is 255.
shape – Array of ndim dimension sizes. Must not be NULL. Each shape[i] must be > 0. The product of all shape dimensions must not overflow size_t.
dtype – Type identifier. Must not be UNKNOWN_TYPE and must be registered in the dtype registry before this call.
alloc_v – Allocator vtable used for all memory operations. alloc_v.allocate must not be NULL.

Returns:

tensor_expect_t with has_value true and a valid tensor_t* on success. On failure, has_value is false and u.error is one of:

NULL_POINTER if alloc_v.allocate is NULL or shape is NULL
INVALID_ARG if ndim is 0 or any shape[i] is 0
INVALID_ARG if dtype is UNKNOWN_TYPE
ILLEGAL_STATE if the dtype registry could not be initialised
TYPE_MISMATCH if dtype is not registered in the dtype registry
LENGTH_OVERFLOW if the product of shape dimensions overflows size_t, or if len * data_size overflows size_t
BAD_ALLOC if the allocator fails to allocate the struct and FAM metadata block
OUT_OF_MEMORY if the allocator fails to allocate the data buffer
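The shape validation and overflow conditions listed above can be illustrated with a checked product. The sketch below is an assumption about how such a check might look, not the library's code; sketch_shape_product is a hypothetical name and it reports failure as a plain bool rather than an error_code_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply all shape dimensions into *out_len.
 * Returns false if ndim is 0, any dimension is 0, or the product would
 * exceed SIZE_MAX. *out_len is written only on success. */
bool sketch_shape_product(uint8_t ndim, const size_t *shape, size_t *out_len)
{
    assert(shape != NULL && out_len != NULL);
    if (ndim == 0) return false;

    size_t len = 1;
    for (uint8_t i = 0; i < ndim; ++i) {
        if (shape[i] == 0) return false;
        /* len * shape[i] overflows exactly when len > SIZE_MAX / shape[i]. */
        if (len > SIZE_MAX / shape[i]) return false;
        len *= shape[i];
    }
    *out_len = len;
    return true;
}
```

Testing the division before multiplying avoids relying on wrap-around, which would silently produce a too-small buffer size.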

tensor_expect_t init_tensor_array(size_t capacity, dtype_id_t dtype, bool growth, allocator_vtable_t alloc_v)

Initialize a new dynamic 1-D tensor operating in ARRAY_STRUCT mode.

Convenience wrapper around init_tensor that constructs a single-dimension tensor and configures it for dynamic array semantics. The underlying allocation is identical to a rank-1 init_tensor call — one header+FAM block and one data buffer — but the mode, growth flag, and len are set to reflect ARRAY_STRUCT invariants:

mode = ARRAY_STRUCT (1-D dynamic array semantics)
len = 0 (no elements populated at construction)
alloc = capacity (total allocated capacity in elements)
growth controls whether push operations may trigger reallocation when len reaches alloc

The data buffer is zero-initialised at construction. The dtype must be registered in the dtype registry before calling this function. For type-safe access prefer a typed wrapper such as init_float_array which fixes the dtype at compile time and hides it from every call site.

Parameters:

capacity – Number of elements to allocate. Must be > 0.
dtype – Type identifier. Must not be UNKNOWN_TYPE and must be registered in the dtype registry before this call.
growth – If true, push operations will automatically reallocate the data buffer when len reaches alloc. If false, pushing onto a full array returns CAPACITY_OVERFLOW.
alloc_v – Allocator vtable used for all memory operations. alloc_v.allocate must not be NULL.

Returns:

tensor_expect_t with has_value true and a valid tensor_t* on success. On failure, has_value is false and u.error is one of:

NULL_POINTER if alloc_v.allocate is NULL
INVALID_ARG if capacity is 0 or dtype is UNKNOWN_TYPE
ILLEGAL_STATE if the dtype registry could not be initialised
TYPE_MISMATCH if dtype is not registered in the dtype registry
BAD_ALLOC if the allocator fails to allocate the struct and FAM metadata block
OUT_OF_MEMORY if the allocator fails to allocate the data buffer

void return_tensor(tensor_t *t)

Return a tensor’s memory back to its allocator.

Returns the data buffer and the tensor_t struct (including the inline FAM block holding shape and strides) to the allocator via return_element. Because shape and strides point into the FAM which is contiguous with the struct header, both are freed by the single return_element call on the struct — there is no separate call needed for shape or strides.

This does NOT free or destroy the allocator itself. After this call the pointer must not be used.

Parameters:

t – Pointer to the tensor to return. Silently ignored if NULL.

Array Operations

The following operations are only valid for ARRAY_STRUCT tensors. Calling them on a TENSOR_STRUCT tensor returns PRECONDITION_FAIL. All push and pop variants accept a dtype argument that must match the dtype fixed at initialisation; a mismatch returns TYPE_MISMATCH without modifying the tensor.

error_code_t push_back_tensor(tensor_t *t, const void *data, dtype_id_t dtype)

Append one element to the back of a dynamic 1-D tensor.

Copies exactly data_size bytes from data into the next available slot at index len, then increments len by one. If the tensor is full and growth is true, the data buffer is reallocated before the copy using the tiered growth strategy (_compute_new_alloc). If growth is false and the tensor is full, no data is written and CAPACITY_OVERFLOW is returned.

This is an O(1) amortised operation when growth is enabled.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
data – Pointer to the value to append. Must not be NULL. Must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or data is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
CAPACITY_OVERFLOW if the tensor is full and growth is false, or if the allocator does not support reallocation
LENGTH_OVERFLOW if the new capacity would overflow size_t
OUT_OF_MEMORY if reallocation fails
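The append-with-growth behaviour can be sketched on a plain malloc-backed buffer. Everything here is hypothetical: sketch_array_t only mirrors the fields the description mentions, the integer return codes stand in for error_code_t, and plain doubling is an assumption standing in for the tiered growth strategy, whose details are not given here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the fields a push operation relies on. */
typedef struct {
    uint8_t *data;      /* contiguous element storage */
    size_t   len;       /* populated elements */
    size_t   alloc;     /* capacity in elements */
    size_t   data_size; /* bytes per element */
    bool     growth;    /* may the buffer be reallocated? */
} sketch_array_t;

/* Returns 0 on success, -1 when full with growth disabled,
 * -2 on size overflow, -3 when realloc fails. */
int sketch_push_back(sketch_array_t *a, const void *value)
{
    assert(a != NULL && value != NULL && a->data_size > 0);

    if (a->len == a->alloc) {
        if (!a->growth) return -1;
        /* Doubling is an assumption made for this sketch only. */
        if (a->alloc > SIZE_MAX / 2 / a->data_size) return -2;
        size_t new_alloc = a->alloc ? a->alloc * 2 : 1;
        uint8_t *p = realloc(a->data, new_alloc * a->data_size);
        if (p == NULL) return -3;   /* the old buffer is still valid */
        a->data  = p;
        a->alloc = new_alloc;
    }

    /* Copy exactly one element into the slot at index len. */
    memcpy(a->data + a->len * a->data_size, value, a->data_size);
    a->len += 1;
    return 0;
}
```

Because the capacity grows geometrically, n appends trigger only O(log n) reallocations, which is what makes the operation O(1) amortised.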

error_code_t push_front_tensor(tensor_t *t, const void *data, dtype_id_t dtype)

Prepend one element to the front of a dynamic 1-D tensor.

Shifts all existing elements one slot toward the back via memmove, then copies exactly data_size bytes from data into slot 0 and increments len by one. If the tensor is full and growth is true, the data buffer is reallocated before the shift. The reallocation is performed first so that the buffer pointer is stable during the memmove.

This is an O(n) operation due to the element shift.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
data – Pointer to the value to prepend. Must not be NULL. Must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or data is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
CAPACITY_OVERFLOW if the tensor is full and growth is false, or if the allocator does not support reallocation
LENGTH_OVERFLOW if the new capacity would overflow size_t
OUT_OF_MEMORY if reallocation fails

error_code_t push_at_tensor(tensor_t *t, const void *data, size_t index, dtype_id_t dtype)

Insert one element at an arbitrary index in a dynamic 1-D tensor.

Inserts data at the given index by shifting all elements at positions [index, len) one slot toward the back via memmove, then copying exactly data_size bytes from data into the vacated slot and incrementing len by one. If the tensor is full and growth is true, the data buffer is reallocated before the shift so that the buffer pointer is stable during the memmove.

As fast paths, index == 0 delegates to push_front_tensor and index == len delegates to push_back_tensor, avoiding an unnecessary memmove in both cases.

Valid index range is [0, len] inclusive. Passing index == len is equivalent to push_back_tensor. Passing index > len returns OUT_OF_BOUNDS without modifying the tensor.

This is an O(n) operation due to the element shift.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
data – Pointer to the value to insert. Must not be NULL. Must point to at least t->data_size bytes.
index – Zero-based position at which to insert. Must be <= t->len.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or data is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
OUT_OF_BOUNDS if index > t->len
CAPACITY_OVERFLOW if the tensor is full and growth is false, or if the allocator does not support reallocation
LENGTH_OVERFLOW if the new capacity would overflow size_t
OUT_OF_MEMORY if reallocation fails
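The shift-then-copy step of an insertion can be sketched on a raw byte buffer. This is an illustration under stated assumptions rather than the library's implementation: sketch_insert_at is a hypothetical name, the buffer has a fixed capacity (no reallocation), and integer codes replace error_code_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: insert one element of elem_size bytes at `index`
 * into a buffer holding *len elements with room for `cap`.
 * Returns 0 on success, -1 if index > *len, -2 if the buffer is full. */
int sketch_insert_at(uint8_t *buf, size_t *len, size_t cap, size_t elem_size,
                     size_t index, const void *value)
{
    assert(buf != NULL && len != NULL && value != NULL && elem_size > 0);
    if (index > *len) return -1;   /* valid range is [0, len] inclusive */
    if (*len == cap)  return -2;

    uint8_t *slot = buf + index * elem_size;

    /* Open a gap by moving [index, len) one slot toward the back.
     * memmove is required because source and destination overlap. */
    memmove(slot + elem_size, slot, (*len - index) * elem_size);

    memcpy(slot, value, elem_size);
    *len += 1;
    return 0;
}
```

When index equals len the memmove length is zero, so the call degenerates to a plain append.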

error_code_t pop_back_tensor(tensor_t *t, void *out, dtype_id_t dtype)

Remove and optionally retrieve the last element of a dynamic 1-D tensor.

Decrements len by one, exposing the vacated slot as spare capacity. If out is non-NULL, copies exactly data_size bytes from the removed slot into the caller-provided buffer before the slot is considered free. The element bytes remain in the buffer until overwritten by a subsequent push — the caller must not rely on them after this call.

This is an O(1) operation.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
out – Caller-provided buffer to receive the removed element. May be NULL if the value is not needed. When non-NULL, must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
EMPTY if t->len == 0

error_code_t pop_front_tensor(tensor_t *t, void *out, dtype_id_t dtype)

Remove and optionally retrieve the first element of a dynamic 1-D tensor.

Copies exactly data_size bytes from slot 0 into out (if non-NULL), then shifts all remaining elements one slot toward the front via memmove and decrements len by one. The shift is skipped when len reaches zero after the removal.

This is an O(n) operation due to the element shift.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
out – Caller-provided buffer to receive the removed element. May be NULL if the value is not needed. When non-NULL, must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
EMPTY if t->len == 0

error_code_t pop_at_tensor(tensor_t *t, void *out, size_t index, dtype_id_t dtype)

Remove and optionally retrieve the element at an arbitrary index in a dynamic 1-D tensor.

Copies exactly data_size bytes from the slot at index into out (if non-NULL), then shifts all elements at positions (index, len) one slot toward the front via memmove and decrements len by one.

As fast paths, index == 0 delegates to pop_front_tensor and index == len - 1 delegates to pop_back_tensor, avoiding an unnecessary memmove in both cases.

Valid index range is [0, len). Passing index >= len returns OUT_OF_BOUNDS without modifying the tensor.

This is an O(n) operation due to the element shift.

Parameters:

t – Pointer to the target tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
out – Caller-provided buffer to receive the removed element. May be NULL if the value is not needed. When non-NULL, must point to at least t->data_size bytes.
index – Zero-based position of the element to remove. Must be < t->len.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t is NULL
PRECONDITION_FAIL if t->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != t->dtype
EMPTY if t->len == 0
OUT_OF_BOUNDS if index >= t->len
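Removal is the mirror image of insertion: copy the element out, then close the gap. The following sketch works on a raw byte buffer and is not taken from the library; sketch_remove_at and its integer return codes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: remove the element at `index` from a buffer holding
 * *len elements of elem_size bytes, optionally copying it to `out`.
 * Returns 0 on success, -1 if the buffer is empty, -2 if index >= *len. */
int sketch_remove_at(uint8_t *buf, size_t *len, size_t elem_size,
                     size_t index, void *out)
{
    assert(buf != NULL && len != NULL && elem_size > 0);
    if (*len == 0)     return -1;
    if (index >= *len) return -2;

    uint8_t *slot = buf + index * elem_size;

    /* Hand the value to the caller before it is overwritten. */
    if (out != NULL) memcpy(out, slot, elem_size);

    /* Close the gap by moving (index, len) one slot toward the front.
     * The length is zero when the last element is removed. */
    memmove(slot, slot + elem_size, (*len - index - 1) * elem_size);
    *len -= 1;
    return 0;
}
```

Passing NULL for out discards the value, matching the optional-retrieval behaviour described for the pop functions.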

Get and Set

Flat-index access is valid for both modes. The effective bound is alloc for TENSOR_STRUCT and len for ARRAY_STRUCT. N-dimensional index access is only valid for TENSOR_STRUCT tensors; calling either _nd variant on an ARRAY_STRUCT tensor returns ILLEGAL_STATE.

error_code_t set_tensor_index(tensor_t *t, size_t index, const void *data, dtype_id_t dtype)

Overwrite the element at a flat index without changing len or alloc.

Copies exactly data_size bytes from data into the slot at index in the tensor’s internal buffer. The flat index addresses elements in the order they are laid out in memory — for a C-order tensor this is row-major order.

The bound check depends on mode:

TENSOR_STRUCT: index must be < t->alloc. All allocated slots are always live regardless of len, so this function remains valid after clear_tensor.
ARRAY_STRUCT: index must be < t->len. Only populated slots are addressable; slots in [len, alloc) are off-limits until exposed by a push operation.

Parameters:

t – Pointer to the target tensor. Must not be NULL.
index – Zero-based flat index of the slot to overwrite. TENSOR_STRUCT: must be < t->alloc. ARRAY_STRUCT: must be < t->len.
data – Pointer to the replacement value. Must not be NULL. Must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or data is NULL
TYPE_MISMATCH if dtype != t->dtype
OUT_OF_BOUNDS if index >= t->alloc (TENSOR_STRUCT) or index >= t->len (ARRAY_STRUCT)

error_code_t set_tensor_nd_index(tensor_t *t, const size_t *idx, const void *data, dtype_id_t dtype)

Overwrite the element at an N-dimensional index without changing len or alloc.

Resolves the N-dimensional index into a flat byte offset using the tensor’s stride array, then copies exactly data_size bytes from data into that slot. This is the natural access pattern for matrix and higher-dimensional tensor operations and is consistent with the C-order (row-major) strides computed at init time.

This function is only valid for TENSOR_STRUCT tensors. An ARRAY_STRUCT tensor has no row or column structure and must be accessed via set_tensor_index instead.

Parameters:

t – Pointer to the target tensor. Must not be NULL.
idx – Array of t->ndim indices, one per dimension. Must not be NULL. idx[i] must be < t->shape[i] for all i in [0, t->ndim).
data – Pointer to the replacement value. Must not be NULL. Must point to at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t, idx, or data is NULL
TYPE_MISMATCH if dtype != t->dtype
ILLEGAL_STATE if t->mode != TENSOR_STRUCT
OUT_OF_BOUNDS if any idx[i] >= t->shape[i]
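Resolving an N-dimensional index into a byte offset amounts to a bounds-checked dot product of the index with the stride array. The sketch below shows that calculation in isolation; sketch_nd_offset is a hypothetical helper, and the way the library itself orders its checks is not specified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the byte offset of idx[] in a tensor with
 * the given shape and byte strides. Returns false if any idx[i] is out of
 * range; *out_offset is written only on success. */
bool sketch_nd_offset(uint8_t ndim, const size_t *shape, const size_t *strides,
                      const size_t *idx, size_t *out_offset)
{
    assert(shape != NULL && strides != NULL && idx != NULL && out_offset != NULL);

    size_t offset = 0;
    for (uint8_t i = 0; i < ndim; ++i) {
        if (idx[i] >= shape[i]) return false;
        /* Each index contributes idx[i] steps along dimension i. */
        offset += idx[i] * strides[i];
    }
    *out_offset = offset;
    return true;
}
```

For a 3x4 tensor of 4-byte elements (strides 16 and 4), index (1, 2) resolves to byte offset 24, which is flat element 6.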

error_code_t get_tensor_index(const tensor_t *t, size_t index, void *out, dtype_id_t dtype)

Copy one element out of the tensor by flat index.

Copies exactly t->data_size bytes from the tensor’s internal buffer at the given flat index into the caller-provided output buffer. The flat index addresses elements in the order they are laid out in memory — for a C-order tensor this is row-major order. The tensor is not modified.

The bound check depends on mode:

TENSOR_STRUCT: index must be < t->alloc. All allocated slots are always live regardless of len, so this function remains valid after clear_tensor.
ARRAY_STRUCT: index must be < t->len. Only populated slots are addressable; slots in [len, alloc) are off-limits until exposed by a push operation.

Parameters:

t – Pointer to the source tensor. Must not be NULL.
index – Zero-based flat index of the element to retrieve. TENSOR_STRUCT: must be < t->alloc. ARRAY_STRUCT: must be < t->len.
out – Caller-provided buffer to copy the element into. Must not be NULL. Must be at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or out is NULL
TYPE_MISMATCH if dtype != t->dtype
OUT_OF_BOUNDS if index >= t->alloc (TENSOR_STRUCT) or index >= t->len (ARRAY_STRUCT)

error_code_t get_tensor_nd_index(const tensor_t *t, const size_t *idx, void *out, dtype_id_t dtype)

Copy one element out of the tensor by N-dimensional index.

Resolves the N-dimensional index into a flat byte offset using the tensor’s stride array, then copies exactly data_size bytes from that slot into the caller-provided output buffer. This is the natural access pattern for matrix and higher-dimensional tensor operations and is consistent with the C-order (row-major) strides computed at init time. The tensor is not modified.

This function is only valid for TENSOR_STRUCT tensors. An ARRAY_STRUCT tensor has no row or column structure and must be accessed via get_tensor_index instead.

Parameters:

t – Pointer to the source tensor. Must not be NULL.
idx – Array of t->ndim indices, one per dimension. Must not be NULL. idx[i] must be < t->shape[i] for all i in [0, t->ndim).
out – Caller-provided buffer to copy the element into. Must not be NULL. Must be at least t->data_size bytes.
dtype – Type identifier. Must match t->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t, idx, or out is NULL
TYPE_MISMATCH if dtype != t->dtype
ILLEGAL_STATE if t->mode != TENSOR_STRUCT
OUT_OF_BOUNDS if any idx[i] >= t->shape[i]

Utility Operations

error_code_t clear_tensor(tensor_t *t)

Reset the tensor to empty without releasing its allocated buffer.

Sets len to zero so the tensor can be reused from scratch. The allocated capacity, data buffer, shape, strides, and ndim are all preserved. All previously stored element bytes are zeroed so no stale data remains readable in the buffer.

Parameters:

t – Pointer to the target tensor. Must not be NULL.

Returns:

NO_ERROR on success, or:

NULL_POINTER if t is NULL

tensor_expect_t copy_tensor(const tensor_t *src, allocator_vtable_t *alloc_v)

Create a deep copy of src into a newly initialized tensor.

Allocates a new tensor_t (including its FAM metadata block) and a new data buffer through alloc_v, then copies all len elements and the full shape and strides arrays from src. The copy has the same dtype, ndim, shape, strides, data_size, and growth setting as src. The caller is responsible for calling return_tensor on the returned tensor when done. The copy may use a different allocator than src.

Parameters:

src – Pointer to the source tensor. Must not be NULL.
alloc_v – Allocator vtable to use for the new tensor’s memory. alloc_v.allocate must not be NULL.

Returns:

tensor_expect_t with has_value true and a valid tensor_t* on success. On failure, has_value is false and u.error is one of:

NULL_POINTER if src is NULL
BAD_ALLOC if the allocator fails to allocate the new header
OUT_OF_MEMORY if the allocator fails to allocate the new data buffer

error_code_t concat_tensor_array(tensor_t *dst, const tensor_t *src, dtype_id_t dtype)

Append all elements of src to the back of dst.

Copies exactly src->len * src->data_size bytes from src’s data buffer into the first available slot in dst’s data buffer and increments dst->len by src->len. Both tensors must be in ARRAY_STRUCT mode and must share the same dtype.

Rather than appending elements one at a time, the function computes the total capacity needed upfront and performs at most one reallocation before the copy. When growth is required the new allocation is the larger of the tiered growth target (_compute_new_alloc) and the exact capacity needed, so that small concatenations still benefit from the tiered growth headroom and large concatenations are never undersized.

If src is empty (src->len == 0) the function returns NO_ERROR without modifying dst.

Parameters:

dst – Pointer to the destination tensor. Must not be NULL. Must have mode == ARRAY_STRUCT. Must have the same dtype as src. If dst does not have room for all of src’s elements and growth is true, its data buffer will be reallocated to accommodate them.
src – Pointer to the source tensor. Must not be NULL. Must have mode == ARRAY_STRUCT. Must have the same dtype as dst. src and dst must not refer to the same tensor.
dtype – Type identifier. Must match dst->dtype and src->dtype.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if dst or src is NULL
PRECONDITION_FAIL if dst->mode != ARRAY_STRUCT or src->mode != ARRAY_STRUCT
TYPE_MISMATCH if dtype != dst->dtype or dst->dtype != src->dtype
LENGTH_OVERFLOW if dst->len + src->len overflows size_t
CAPACITY_OVERFLOW if the combined length exceeds dst->alloc and growth is false, or if the allocator does not support reallocation
OUT_OF_MEMORY if reallocation fails
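The capacity decision described for concatenation can be sketched as a pure function. This is an assumption-laden illustration: sketch_concat_capacity is a hypothetical name, and the growth target is passed in by the caller because the policy behind _compute_new_alloc is not documented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decide the capacity dst needs after appending
 * src_len elements. `growth_target` is whatever the growth policy would
 * propose. Returns false if dst_len + src_len overflows size_t. */
bool sketch_concat_capacity(size_t dst_len, size_t dst_alloc, size_t src_len,
                            size_t growth_target, size_t *out_alloc)
{
    assert(out_alloc != NULL);

    /* Detect overflow of dst_len + src_len without performing the addition. */
    if (src_len > SIZE_MAX - dst_len) return false;
    size_t needed = dst_len + src_len;

    if (needed <= dst_alloc) {
        *out_alloc = dst_alloc;   /* fits already, no reallocation */
        return true;
    }

    /* Take the larger of the policy's proposal and the exact requirement. */
    *out_alloc = needed > growth_target ? needed : growth_target;
    return true;
}
```

Taking the maximum means a small append still gains headroom from the growth policy, while a large append is sized exactly and needs only one reallocation.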

tensor_expect_t slice_tensor_array(const tensor_t *src, size_t start, size_t end, allocator_vtable_t *alloc_v)

Return a new dynamic 1-D tensor containing a copy of the elements in src at positions [start, end).

Allocates a new tensor_t and data buffer, copies exactly (end - start) * src->data_size bytes from src’s buffer starting at position start, and returns the result as an independent ARRAY_STRUCT tensor. The slice is a deep copy — mutations to the slice do not affect src and vice versa.

The slice is always returned with growth == false and alloc == len. The caller may enable growth after construction if dynamic behaviour is needed.

The range [start, end) is half-open: start is inclusive, end is exclusive. Both start and end must be <= src->len, and start must be strictly less than end. A zero-length slice (start == end) is considered invalid and returns INVALID_ARG.

If alloc_v.allocate is NULL the allocator is inherited from src, so the slice is managed by the same allocator as the source tensor. The caller is responsible for calling return_tensor on the returned tensor when it is no longer needed.

Parameters:

src – Pointer to the source tensor. Must not be NULL. Must have mode == ARRAY_STRUCT.
start – Zero-based index of the first element to include. Must be < end and <= src->len.
end – Zero-based index one past the last element to include. Must be > start and <= src->len.
alloc_v – Allocator vtable for the new tensor’s memory. If alloc_v.allocate is NULL the allocator is inherited from src.

Returns:

tensor_expect_t with has_value true and a valid tensor_t* on success. On failure, has_value is false and u.error is one of:

NULL_POINTER if src is NULL
PRECONDITION_FAIL if src->mode != ARRAY_STRUCT
OUT_OF_BOUNDS if start > src->len or end > src->len
INVALID_ARG if start >= end
BAD_ALLOC if the allocator fails to allocate the struct and FAM metadata block
OUT_OF_MEMORY if the allocator fails to allocate the data buffer
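The half-open range rules can be made concrete with a small validator. The sketch is hypothetical: sketch_check_slice and its status enum are illustrative names, and the order in which the two conditions are tested is an assumption that simply follows the order of the error list above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for error_code_t values. */
typedef enum {
    SKETCH_OK,
    SKETCH_OUT_OF_BOUNDS,
    SKETCH_INVALID_ARG
} sketch_status_t;

/* Validate the half-open range [start, end) against a source of `len`
 * elements. On success *count receives the number of elements to copy. */
sketch_status_t sketch_check_slice(size_t len, size_t start, size_t end,
                                   size_t *count)
{
    assert(count != NULL);

    if (start > len || end > len) return SKETCH_OUT_OF_BOUNDS;
    if (start >= end)             return SKETCH_INVALID_ARG; /* empty or inverted */

    *count = end - start;
    return SKETCH_OK;
}
```

With len equal to 5, the range [1, 4) yields three elements, [2, 2) is rejected as a zero-length slice, and [0, 6) is out of bounds.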

error_code_t reverse_tensor(tensor_t *t)

Reverse the populated elements of a tensor in place.

Reverses the order of all elements in the live region of the tensor’s data buffer using a SIMD-accelerated byte-swap routine. The operation acts on the flat element sequence regardless of the tensor’s shape or strides — for a TENSOR_STRUCT tensor this reverses all alloc elements in row-major order, and for an ARRAY_STRUCT tensor this reverses the len populated elements.

The tensor is modified in place; no allocation is performed and the allocator is not consulted.

Returns EMPTY rather than NO_ERROR when len < 2 because a zero or single-element sequence has no meaningful reversal. The tensor is left untouched in this case.

// arr = [10, 20, 30, 40, 50]
error_code_t err = reverse_tensor(arr->base);
// arr = [50, 40, 30, 20, 10]

Parameters:

t – Pointer to the tensor to reverse. Must not be NULL.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t is NULL
EMPTY if t->len < 2
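An element-wise in-place reversal over a raw byte buffer can be sketched as follows. This is a plain scalar version written for illustration, not the SIMD-accelerated routine the library describes; sketch_reverse is a hypothetical name and the bool result loosely mirrors the EMPTY case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reverse `len` elements of elem_size bytes in place.
 * Returns false (buffer untouched) when there are fewer than two elements. */
bool sketch_reverse(uint8_t *buf, size_t len, size_t elem_size)
{
    assert(buf != NULL && elem_size > 0);
    if (len < 2) return false;

    uint8_t *lo = buf;
    uint8_t *hi = buf + (len - 1) * elem_size;

    /* Swap whole elements from both ends, moving toward the middle. */
    while (lo < hi) {
        for (size_t b = 0; b < elem_size; ++b) {
            uint8_t tmp = lo[b];
            lo[b] = hi[b];
            hi[b] = tmp;
        }
        lo += elem_size;
        hi -= elem_size;
    }
    return true;
}
```

The bytes inside each element keep their order; only the positions of whole elements are exchanged, so the routine is independent of the element type.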

error_code_t sort_tensor(tensor_t *array, int (*cmp)(const void*, const void*), direction_t dir)

Sort the populated elements of a tensor in place.

Sorts the len elements in the tensor’s data buffer using an iterative quicksort with median-of-three pivot selection and an insertion sort fallback for partitions of fewer than 10 elements. Tail-call optimisation keeps worst-case stack depth at O(log n) regardless of input distribution.

The sort operates on the flat element sequence in memory — for a TENSOR_STRUCT tensor this is row-major order across all alloc elements, and for an ARRAY_STRUCT tensor this covers only the len populated elements. Shape and stride metadata are not modified, so the logical N-dimensional structure of a TENSOR_STRUCT tensor is reinterpreted after the sort rather than preserved. If preserving N-dimensional structure is required, copy the tensor first via copy_tensor.

The comparator follows the qsort(3) convention:

returns a negative value if *a should precede *b
returns zero if *a and *b are equivalent
returns a positive value if *a should follow *b

The direction argument inverts the comparator result when set to REVERSE, producing a descending sort without requiring a separate comparator function.

// Comparator for int32_t elements
static int cmp_int32(const void* a, const void* b) {
    int32_t ia = *(const int32_t*)a;
    int32_t ib = *(const int32_t*)b;
    return (ia > ib) - (ia < ib);
}

// Sort an int32 array in ascending order
error_code_t err = sort_tensor(t, cmp_int32, FORWARD);

// Sort in descending order using the same comparator
err = sort_tensor(t, cmp_int32, REVERSE);

Parameters:

t – Pointer to the tensor to sort. Must not be NULL.
cmp – Comparator function following the qsort(3) convention. Both pointer arguments point to complete elements of t->data_size bytes. Must not be NULL.
dir – FORWARD for ascending order, REVERSE for descending order.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or cmp is NULL
EMPTY if t->len < 2 (zero or one element — no sort performed, tensor is left untouched)
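The effect of the direction argument on a qsort-style comparator can be shown with a small generic sort. The sketch below uses a plain insertion sort over raw bytes, not the quicksort described above, and every name in it (sketch_direction_t, sketch_insertion_sort, sketch_cmp_i32) is hypothetical. It also assumes elements are at most 64 bytes so a fixed scratch buffer suffices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { SKETCH_FORWARD, SKETCH_REVERSE } sketch_direction_t;

/* Example comparator for int32_t, following the qsort(3) convention. */
int sketch_cmp_i32(const void *a, const void *b)
{
    int32_t ia = *(const int32_t *)a;
    int32_t ib = *(const int32_t *)b;
    return (ia > ib) - (ia < ib);
}

/* Insertion sort over `len` elements of elem_size bytes. */
void sketch_insertion_sort(void *base, size_t len, size_t elem_size,
                           int (*cmp)(const void *, const void *),
                           sketch_direction_t dir)
{
    uint8_t *b = base;
    uint8_t tmp[64];   /* scratch copy of the element being placed */
    assert(b != NULL && cmp != NULL);
    assert(elem_size > 0 && elem_size <= sizeof tmp);

    for (size_t i = 1; i < len; ++i) {
        memcpy(tmp, b + i * elem_size, elem_size);
        size_t j = i;
        while (j > 0) {
            int c = cmp(b + (j - 1) * elem_size, tmp);
            /* Invert the sign for a descending sort; this form avoids
             * negating INT_MIN. */
            if (dir == SKETCH_REVERSE) c = (c < 0) - (c > 0);
            if (c <= 0) break;
            memcpy(b + j * elem_size, b + (j - 1) * elem_size, elem_size);
            --j;
        }
        memcpy(b + j * elem_size, tmp, elem_size);
    }
}
```

Inverting the comparator result rather than reversing the output afterwards lets one comparator serve both orders in a single pass.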

Type Query

static inline dtype_id_t tensor_dtype(const tensor_t *t)

Return the dtype_id_t of the tensor.

Convenience getter for the type identifier fixed at init time. Equivalent to reading t->dtype directly but consistent with the rest of the API.

Parameters:

t – Pointer to the tensor.

Returns:

The dtype_id_t of the tensor, or UNKNOWN_TYPE if t is NULL.

Introspection

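All introspection functions are O(1) read-only operations. None dereferences a NULL tensor pointer: the getters return 0 or NULL, is_tensor_ptr returns false, is_tensor_empty and is_tensor_full return true, and the functions returning error_code_t report NULL_POINTER.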

static inline size_t tensor_size(const tensor_t *t)

Return the total number of elements currently stored in the tensor.

This is the product of all dimension sizes, not the allocated capacity.

Parameters:

t – Pointer to the tensor.

Returns:

Total element count, or 0 if t is NULL.

static inline size_t tensor_alloc(const tensor_t *t)

Return the allocated capacity of the tensor’s data buffer in elements.

Parameters:

t – Pointer to the tensor.

Returns:

Allocated capacity in elements, or 0 if t is NULL.

static inline size_t tensor_data_size(const tensor_t *t)

Return the size of one element in bytes.

Parameters:

t – Pointer to the tensor.

Returns:

Element size in bytes, or 0 if t is NULL.

static inline uint8_t tensor_ndim(const tensor_t *t)

Return the number of dimensions of the tensor.

Parameters:

t – Pointer to the tensor.

Returns:

Number of dimensions, or 0 if t is NULL.

static inline size_t tensor_shape_dim(const tensor_t *t, uint8_t dim)

Return the size of a single dimension.

The most common access pattern — equivalent to asking “how many rows” or “how many columns” without retrieving the full shape array.

Parameters:

t – Pointer to the tensor.
dim – Zero-based dimension index. Must be < t->ndim.

Returns:

Size of the requested dimension, or 0 if t is NULL or dim is out of range.

error_code_t tensor_shape(const tensor_t *t, size_t *out, uint8_t count)

Copy the shape of the tensor into a caller-provided buffer.

Copies min(t->ndim, count) dimension sizes into out. If count < t->ndim only the first count dimensions are copied and INVALID_ARG is returned so the caller knows the output was truncated.

Parameters:

t – Pointer to the tensor.
out – Caller-provided buffer to receive the shape. Must not be NULL. Must hold at least count elements.
count – Number of elements out can hold. Should be >= t->ndim.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or out is NULL
INVALID_ARG if count < t->ndim (output truncated)

static inline const size_t *tensor_shape_ptr(const tensor_t *t)

Return a read-only pointer directly into the tensor’s internal shape array.

The returned pointer is valid only for the lifetime of the tensor and must not be modified or freed by the caller. Prefer tensor_shape() when the caller needs to store or modify the shape independently.

Parameters:

t – Pointer to the tensor.

Returns:

Pointer to the first element of the shape array, or NULL if t is NULL.

static inline const size_t *tensor_strides_ptr(const tensor_t *t)

Return a read-only pointer directly into the tensor’s internal strides array.

Strides are in bytes. strides[i] is the number of bytes to advance in the data buffer to move one step along dimension i. For a contiguous C-order tensor, strides[ndim-1] == data_size and each outer stride is the product of all inner dimension sizes times data_size.

The returned pointer is valid only for the lifetime of the tensor and must not be modified or freed by the caller.

Parameters:

t – Pointer to the tensor.

Returns:

Pointer to the first element of the strides array, or NULL if t is NULL.

error_code_t tensor_shape_str(const tensor_t *t, char *buf, size_t buf_len)

Write a human-readable shape string into a caller-provided buffer.

Formats the shape as “(d0, d1, …, dN)” into buf. Intended for debugging and logging only — do not parse the result programmatically, use tensor_shape() or tensor_shape_dim() instead.

If the formatted string would exceed buf_len bytes (including the null terminator) the output is truncated and CAPACITY_OVERFLOW is returned.

Parameters:

t – Pointer to the tensor.
buf – Caller-provided character buffer. Must not be NULL.
buf_len – Size of buf in bytes. Should be large enough to hold the formatted string.

Returns:

NO_ERROR on success, or one of:

NULL_POINTER if t or buf is NULL
CAPACITY_OVERFLOW if the formatted string was truncated
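Formatting a shape with truncation detection can be sketched with snprintf, whose return value reports how many characters the full output would have needed. The helper name sketch_shape_str is hypothetical and it returns a bool instead of an error_code_t; the exact separator and truncation behaviour of the real function are only assumed to match the "(d0, d1, …, dN)" description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write "(d0, d1, ..., dN)" into buf.
 * Returns true if the whole string fit, false if it was truncated.
 * The output is always null-terminated when buf_len > 0. */
bool sketch_shape_str(uint8_t ndim, const size_t *shape,
                      char *buf, size_t buf_len)
{
    assert(shape != NULL && buf != NULL);
    if (buf_len == 0) return false;

    size_t used = 0;
    int n = snprintf(buf, buf_len, "(");
    if (n < 0 || (size_t)n >= buf_len) return false;
    used = (size_t)n;

    for (uint8_t i = 0; i < ndim; ++i) {
        /* Prefix every dimension except the first with a separator. */
        n = snprintf(buf + used, buf_len - used, "%s%zu",
                     i == 0 ? "" : ", ", shape[i]);
        /* A return value >= the remaining space means truncation. */
        if (n < 0 || (size_t)n >= buf_len - used) return false;
        used += (size_t)n;
    }

    n = snprintf(buf + used, buf_len - used, ")");
    return n >= 0 && (size_t)n < buf_len - used;
}
```

Checking each snprintf result against the remaining space keeps `used` strictly below buf_len, so the pointer arithmetic never steps past the buffer.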

static inline bool is_tensor_empty(const tensor_t *t)

Return true if the tensor contains no elements.

Parameters:

t – Pointer to the tensor.

Returns:

true if len == 0 or t is NULL, false otherwise.

static inline bool is_tensor_full(const tensor_t *t)

Return true if the tensor is at full capacity.

Parameters:

t – Pointer to the tensor.

Returns:

true if len == alloc or t is NULL, false otherwise.

bool is_tensor_ptr(const tensor_t *t, const void *ptr)

Return true if ptr points to a valid element-aligned address within the live region of the tensor’s data buffer.

Conditions for validity:

t and ptr are non-NULL.
ptr >= t->data.
ptr < t->data + t->len * t->data_size.
(ptr - t->data) is an exact multiple of t->data_size.

Parameters:

t – Pointer to the tensor.
ptr – Pointer to test.

Returns:

true if ptr is valid, false otherwise.
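The four validity conditions can be sketched as a standalone check on a raw buffer. This is an illustration only: sketch_is_element_ptr is a hypothetical name, and comparing addresses as uintptr_t values is a choice made for this sketch because relational comparison of pointers into unrelated objects is undefined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if ptr addresses the start of one of the `len`
 * live elements of elem_size bytes stored at `data`. */
bool sketch_is_element_ptr(const uint8_t *data, size_t len, size_t elem_size,
                           const void *ptr)
{
    if (data == NULL || ptr == NULL || elem_size == 0) return false;

    /* Compare as integers so pointers from other objects are handled safely. */
    uintptr_t base = (uintptr_t)data;
    uintptr_t p    = (uintptr_t)ptr;
    if (p < base) return false;

    size_t offset = (size_t)(p - base);
    if (offset >= len * elem_size) return false;   /* past the live region */

    /* Only addresses on an element boundary qualify. */
    return offset % elem_size == 0;
}
```

A pointer into spare capacity beyond len, or one that lands in the middle of an element, is rejected even though it lies inside the allocation.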