Welcome to CSalt documentation!

Working with real and integer values in C presents several challenges in modern software development:

  • Manual memory management for arrays can lead to memory leaks and buffer overflows

  • Static arrays require compile-time size decisions that may not suit runtime needs

  • No built-in support for dynamic resizing of collections

  • Lack of associative containers for values requires complex custom implementations

  • Error handling for operations often requires boilerplate code

The csalt library addresses these challenges by providing four primary containers for all basic c data types:

  • A set of custom allocator types to support ease of memory management with a reduction in overhead calls to malloc, calloc, realloc and free

  • A dynamically allocated vector that automatically manages memory and resizing

  • A safe wrapper for static arrays with bounds checking

  • A dictionary implementation for mapping strings to primitive data types

  • A dictionary implementation for mapping strings to arrays of primitive data types

  • A dictionary implementation for dense and sparse storage methods

  • Build time config flags STATIC_ONLY and NO_FUNCTION_MACROS to support MISRA compliant builds

Performance Acceleration

Dynamically allocated vectors in csalt are optimized with hardware acceleration using SIMD (Single Instruction, Multiple Data) instruction sets where available. The library automatically dispatches to the best implementation supported by the target platform, including:

  • AVX-512, AVX2, and AVX on modern x86 processors

  • SSE4.1, SSE3, and SSE2 on legacy x86 processors

  • SVE2 and SVE on ARMv9/ARMv8-A processors

  • NEON on ARMv8-A (AArch64) processors

When to Use This Library

All of the functionality in this library can be accessed from the c_float.h, c_double.h, c_ldouble.h, c_int.h, c_string.h, c_bool.h, and c_binary.h files. Other data types will be included to this library at a later date.

This library is particularly useful when:

  • Working with arrays of unknown or varying size

  • Requiring safe bounds-checked access to arrays

  • Managing collections of named values

  • Performing numerical computations with dynamic data sets

  • Implementing algorithms that require flexible storage

The library’s encapsulated design prevents common array manipulation errors while maintaining the performance characteristics expected in C programs.

This project encapsulates its functionality, which is wrapped in the header guard #ifdef __cplusplus to allow compilation with both C and C++.

Implementation Details

The library provides these main container types:

Allocators (allocator_vtable_t)

  • Fast linear bump-pointer allocation (arena_t)

  • Fixed-size block allocator for small, uniform objects (pool_t)

  • Variable-sized returnable free list allocator (free_list_t)

  • Power-of-two buddy allocator (buddy_t)

  • High-throughput slab allocator for fixed-size objects (slab_t)

String (string_t)

  • Utilizes allocators from the c_allocator.h file or user developed custom allocators to manage memory.

  • Automatically manages memory during string operations

  • Provides functions for standard string operations

  • Provides an iterator for access methods

Tensor (tensor_t)

  • Utilizes allocators from the c_allocator.h file or user developed custom allocators to manage memory.

  • Serves as a unified container for one-dimensional arrays, two-dimensional matrices, and arbitrary N-dimensional tensors up to rank 255 using a single struct and allocator pipeline.

  • Runtime behaviour is governed by a tensor_mode_t field:

    • TENSOR_FIXED_SHAPE — all slots are zero-initialised at construction; len equals alloc equals the product of all shape dimensions at all times; used for matrices and higher-rank tensors.

    • TENSOR_DYNAMIC_1Dlen tracks the populated element count; automatic reallocation is optionally enabled via the growth flag; used for one-dimensional array operations.

  • Shape and strides are stored in a Flexible Array Member (FAM) contiguous with the struct header, keeping the entire tensor descriptor in a single allocator-managed block. Strides are computed in C-order (row-major) at initialisation.

  • Element types are resolved through the dtype registry (c_dtypes.h); user-defined types can be registered via ensure_dtype_registered. Every access function validates the caller-supplied dtype_id_t at the call site, providing runtime type safety over the type-erased byte buffer.

  • Typed wrappers embed tensor_t as their first member, enabling safe upcasting to tensor_t* so any generic tensor function can be called on a typed wrapper without an intermediate copy.

  • Supports array-specific operations (push, pop, insert, remove), matrix and tensor operations (N-dimensional indexed access via strides), shared operations (flat-index access, clear, deep copy), and standard introspection (size, capacity, shape, strides, element size).

Dictionary (dict_t)

  • Utilizes allocators from the c_allocator.h file or user developed custom allocators to manage memory.

  • Hash table using MurmurHash3-inspired byte-span hashing over arbitrary key lengths, with power-of-two bucket counts and separate chaining for collision resolution.

  • Supports any fixed-size value type via a generic data_size parameter; typed wrappers (uint8_dict_t, int32_dict_t, float_dict_t, etc.) fix the value type at initialisation and enforce it through the API.

  • Keys are null-terminated C-strings copied into internal allocator-managed storage on every insert; the caller may free or reuse key memory immediately.

  • Both plain (null-terminated) and _n (explicit-length) variants are provided for every key-taking function.

  • Automatically resizes when the load factor exceeds 0.75, provided the dict was initialised with growth = true; fixed-capacity mode is also supported.

  • Supports insert, pop, update, value retrieval by copy, direct pointer access, membership test, clear, deep copy, merge (with or without overwrite), and bucket-order iteration via a typed callback.

Linked List (slist_t)

  • Utilizes allocators from the c_allocator.h file or user-developed custom allocators to manage memory for the list header, node slab, and any overflow allocations.

  • Stores values as fixed-size byte buffers with a user-specified data_size and associated dtype_id_t for type identification.

  • Nodes store data inline (no secondary allocation), improving cache locality and reducing pointer indirection during traversal.

  • Uses a hybrid allocation strategy: - Pre-allocated contiguous node slab for fast, cache-friendly access - Optional overflow allocation for additional nodes when capacity is exceeded

  • Provides O(1) insertion at the front and back, with O(n) traversal-based access and insertion at arbitrary positions.

  • Supports safe element access and retrieval via copy semantics (no direct exposure of internal node storage).

  • Designed as a generic container — users are expected to define their own typed wrappers for domain-specific data structures using c_dtypes.h.

  • Provides iteration support via callback functions for traversal without exposing internal node structure.

  • Supports standard list operations including push (front, back, index), pop (front, back, index), search, and introspection.

Binary Heap (heap_t)

  • Utilizes allocators from the c_allocator.h file or user-developed custom allocators to manage memory for both the heap_t struct and its backing array_t element buffer.

  • Implemented as a generic binary heap backed by a contiguous array_t buffer; all elements are stored inline as fixed-size byte buffers identified by a dtype_id_t.

  • Heap ordering is fully determined by a caller-supplied comparator following the qsort(3) convention — the same comparator serves as both a max-heap and a min-heap depending on its sign convention, with no hardcoded ordering in the implementation.

  • Uses the same tiered growth strategy as array_t: 2× below 1 024 elements, 1.5× up to 8 192, 1.25× up to 65 536, and a fixed increment of 256 beyond that; fixed-capacity mode is also supported via growth = false.

  • Supports push (with sift-up), pop (with sift-down), non-destructive root peek, unordered foreach iteration via a typed callback, deep copy, and standard introspection (size, allocated capacity).

  • Ordered traversal is achieved by copying the heap with copy_heap and draining the copy via repeated heap_pop calls, preserving the original intact.

  • Designed as a generic container — users are expected to register their own dtype_id_t, supply a comparator, and define thin typed wrappers for domain-specific priority structures using c_dtypes.h.

AVL Tree (avl_t)

  • Utilizes allocators from the c_allocator.h file or user-developed custom allocators to manage memory for the tree header, node slab, free list, and any overflow allocations.

  • Implements a self-balancing binary search tree (AVL) that maintains strict height balance, ensuring O(log n) insertion, removal, and lookup operations.

  • Stores values as fixed-size byte buffers with a user-specified data_size derived from dtype_id_t, enabling generic storage of arbitrary types.

  • Nodes store data inline (no secondary allocation), improving cache locality and eliminating pointer indirection for element access.

  • Uses a hybrid allocation strategy: - Reuses removed nodes through an internal free list - Carves new nodes from a contiguous slab for cache-friendly allocation - Optionally performs overflow allocation when slab capacity is exceeded

  • Tree ordering is fully determined by a caller-supplied comparator following the qsort(3) convention; duplicate handling is configurable at initialisation.

  • Automatically maintains AVL balance through single and double rotations during insertion and removal operations.

  • Supports ordered traversal via in-order iteration, as well as range-based traversal with branch pruning for efficient subset queries.

  • Provides safe element access and retrieval via copy semantics (no direct exposure of internal node storage).

  • Designed as a generic container — users are expected to define their own typed wrappers for domain-specific data structures using c_dtypes.h.

  • Supports standard tree operations including insert, remove, contains, find, min/max retrieval, ordered traversal, range traversal, copy, and introspection.

Work Forward

The following are areas for future improvement in the code base

  • Test on a wider array of platforms and compilers to exercise all SIMD instruction sets

  • Add basic matrix solvers for Dense and Sparse (i.e. COO, CSR, CSC) matrices in lin_alg.h file

  • Add root solvers, numerical integration, and numerical differentiation in numerical.h file.

  • Add more advance matrix solvers like JFNK methods

Indices and tables

Requirements

  • Git

  • CMake (version 3.31.3 or later)

  • C compiler (GCC, Clang, or MSVC)

For unit testing: - Linux: valgrind (optional, for memory leak checking) - All platforms: cmocka testing framework

Getting the Code

Clone the repository:

git clone https://github.com/Jon-Webb-79/csalt.git
cd csalt

Debug Build (with tests)

Use the appropriate script for your platform:

Linux/macOS (bash):

cd scripts/bash
./debug.sh

Linux/macOS (zsh):

cd scripts/zsh
./debug.zsh

Windows:

cd scripts\Windows
debug.bat

Run tests:

Linux (with valgrind):

cd build/debug
valgrind ./unit_tests

macOS/Windows:

cd build/debug
./unit_tests

Static Library Build

Creates a static library without tests:

Linux/macOS (bash):

cd scripts/bash
./static.sh

Linux/macOS (zsh):

cd scripts/zsh
./static.zsh

Windows:

cd scripts\Windows
static.bat

System Installation

Installs library files to system directories for use in other projects:

Linux/macOS (requires sudo):

cd scripts/bash  # or scripts/zsh
sudo ./install.sh  # or sudo ./install.zsh

Windows (requires Administrator):

  1. Right-click scripts\Windows\install.bat

  2. Select “Run as Administrator”

Usage in Projects

After installation, you can use the library in three ways:

  1. As System Library:

After installation, include in your C files:

#include <c_float.h> // Or whichever header file you wish to use
  1. As Static Library:

Link against the static library created in the build/static directory.

  1. Direct Integration:

Copy any files you wish to your project and compile directly. Ensure that you have the .h and .c files. Each file requires that the c_string.h and c_string.c file also be present.

Troubleshooting

  • If tests fail, ensure all dependencies are properly installed

  • For Windows builds, ensure you’re using an appropriate Visual Studio version

  • For installation issues, verify you have appropriate system permissions

Contribute to Code Base

  1. Establish a pull request with the git repository owner.

  2. Once the package has been downloade, you will also need to install Python3.10 or later version to support documentation with Sphinx.

  3. Navigate to the csalt/docs/doxygen directory.

  4. Create a Python virtual environment with the following command.

    python -m venv .venv
    
  5. Activate the virtual environment with the following command.

Activation Commands for Virtual Environments

Platform

Shell

Command to activate virtual environment

POSIX

bash/zsh

$ source <venv>/bin/activate

fish

$ source <venv>/bin/activate.fish

csh/tcsh

$ source <venv>/bin/activate.csh

Powershell

$ <venv>/bin/Activate.ps1

Windows

cmd.exe

C:\> <venv>\\Scripts\\activate.bat

PowerShell

PS C:\\> <venv>\\Scripts\\Activate.ps1

  1. Install packages to virtual environments from requirements.txt file

    pip install -r requirements.txt
    
  2. At this point you can build the files in the same way described in the previous section and contribute to documentation.