.. Core Utilities documentation master file, created by sphinx-quickstart You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. Welcome to CSalt documentation! =============================== Working with real and integer values in C presents several challenges in modern software development: * Manual memory management for arrays can lead to memory leaks and buffer overflows * Static arrays require compile-time size decisions that may not suit runtime needs * No built-in support for dynamic resizing of collections * Lack of associative containers for values requires complex custom implementations * Error handling for operations often requires boilerplate code The ``csalt`` library addresses these challenges by providing four primary containers for all basic c data types: * A set of custom allocator types to support ease of memory management with a reduction in overhead calls to ``malloc``, ``calloc``, ``realloc`` and ``free`` * A dynamically allocated vector that automatically manages memory and resizing * A safe wrapper for static arrays with bounds checking * A dictionary implementation for mapping strings to primitive data types * A dictionary implementation for mapping strings to arrays of primitive data types * A dictionary implementation for dense and sparse storage methods * Build time config flags ``STATIC_ONLY`` and ``NO_FUNCTION_MACROS`` to support MISRA compliant builds Performance Acceleration ------------------------ Dynamically allocated vectors in ``csalt`` are optimized with hardware acceleration using SIMD (Single Instruction, Multiple Data) instruction sets where available. The library automatically dispatches to the best implementation supported by the target platform, including: * **AVX-512**, **AVX2**, and **AVX** on modern x86 processors * **SSE4.1**, **SSE3**, and **SSE2** on legacy x86 processors * **SVE2** and **SVE** on ARMv9/ARMv8-A processors * **NEON** on ARMv8-A (AArch64) processors When to Use This Library ######################## All of the functionality in this library can be accessed from the ``c_float.h``, ``c_double.h``, ``c_ldouble.h``, ``c_int.h``, ``c_string.h``, ``c_bool.h``, and ``c_binary.h`` files. Other data types will be included to this library at a later date. This library is particularly useful when: * Working with arrays of unknown or varying size * Requiring safe bounds-checked access to arrays * Managing collections of named values * Performing numerical computations with dynamic data sets * Implementing algorithms that require flexible storage The library's encapsulated design prevents common array manipulation errors while maintaining the performance characteristics expected in C programs. This project encapsulates its functionality, which is wrapped in the header guard ``#ifdef __cplusplus`` to allow compilation with both C and C++. Implementation Details ###################### The library provides these main container types: Allocators (``allocator_vtable_t``) ----------------------------------- * Fast linear bump-pointer allocation (``arena_t``) * Fixed-size block allocator for small, uniform objects (``pool_t``) * Variable-sized returnable free list allocator (``free_list_t``) * Power-of-two buddy allocator (``buddy_t``) * High-throughput slab allocator for fixed-size objects (``slab_t``) String (``string_t``) ---------------------- * Utilizes allocators from the ``c_allocator.h`` file or user developed custom allocators to manage memory. * Automatically manages memory during string operations * Provides functions for standard string operations * Provides an iterator for access methods Tensor (``tensor_t``) --------------------- * Utilizes allocators from the ``c_allocator.h`` file or user developed custom allocators to manage memory. * Serves as a unified container for one-dimensional arrays, two-dimensional matrices, and arbitrary N-dimensional tensors up to rank 255 using a single struct and allocator pipeline. * Runtime behaviour is governed by a ``tensor_mode_t`` field: - ``TENSOR_FIXED_SHAPE`` — all slots are zero-initialised at construction; ``len`` equals ``alloc`` equals the product of all shape dimensions at all times; used for matrices and higher-rank tensors. - ``TENSOR_DYNAMIC_1D`` — ``len`` tracks the populated element count; automatic reallocation is optionally enabled via the ``growth`` flag; used for one-dimensional array operations. * Shape and strides are stored in a Flexible Array Member (FAM) contiguous with the struct header, keeping the entire tensor descriptor in a single allocator-managed block. Strides are computed in C-order (row-major) at initialisation. * Element types are resolved through the dtype registry (``c_dtypes.h``); user-defined types can be registered via ``ensure_dtype_registered``. Every access function validates the caller-supplied ``dtype_id_t`` at the call site, providing runtime type safety over the type-erased byte buffer. * Typed wrappers embed ``tensor_t`` as their first member, enabling safe upcasting to ``tensor_t*`` so any generic tensor function can be called on a typed wrapper without an intermediate copy. * Supports array-specific operations (push, pop, insert, remove), matrix and tensor operations (N-dimensional indexed access via strides), shared operations (flat-index access, clear, deep copy), and standard introspection (size, capacity, shape, strides, element size). Dictionary (``dict_t``) ----------------------- * Utilizes allocators from the ``c_allocator.h`` file or user developed custom allocators to manage memory. * Hash table using MurmurHash3-inspired byte-span hashing over arbitrary key lengths, with power-of-two bucket counts and separate chaining for collision resolution. * Supports any fixed-size value type via a generic ``data_size`` parameter; typed wrappers (``uint8_dict_t``, ``int32_dict_t``, ``float_dict_t``, etc.) fix the value type at initialisation and enforce it through the API. * Keys are null-terminated C-strings copied into internal allocator-managed storage on every insert; the caller may free or reuse key memory immediately. * Both plain (null-terminated) and ``_n`` (explicit-length) variants are provided for every key-taking function. * Automatically resizes when the load factor exceeds 0.75, provided the dict was initialised with ``growth = true``; fixed-capacity mode is also supported. * Supports insert, pop, update, value retrieval by copy, direct pointer access, membership test, clear, deep copy, merge (with or without overwrite), and bucket-order iteration via a typed callback. Linked List (``slist_t``) ------------------------- * Utilizes allocators from the ``c_allocator.h`` file or user-developed custom allocators to manage memory for the list header, node slab, and any overflow allocations. * Stores values as fixed-size byte buffers with a user-specified ``data_size`` and associated ``dtype_id_t`` for type identification. * Nodes store data inline (no secondary allocation), improving cache locality and reducing pointer indirection during traversal. * Uses a hybrid allocation strategy: - Pre-allocated contiguous node slab for fast, cache-friendly access - Optional overflow allocation for additional nodes when capacity is exceeded * Provides O(1) insertion at the front and back, with O(n) traversal-based access and insertion at arbitrary positions. * Supports safe element access and retrieval via copy semantics (no direct exposure of internal node storage). * Designed as a generic container — users are expected to define their own typed wrappers for domain-specific data structures using ``c_dtypes.h``. * Provides iteration support via callback functions for traversal without exposing internal node structure. * Supports standard list operations including push (front, back, index), pop (front, back, index), search, and introspection. Binary Heap (``heap_t``) ------------------------ * Utilizes allocators from the ``c_allocator.h`` file or user-developed custom allocators to manage memory for both the ``heap_t`` struct and its backing ``array_t`` element buffer. * Implemented as a generic binary heap backed by a contiguous ``array_t`` buffer; all elements are stored inline as fixed-size byte buffers identified by a ``dtype_id_t``. * Heap ordering is fully determined by a caller-supplied comparator following the ``qsort(3)`` convention — the same comparator serves as both a max-heap and a min-heap depending on its sign convention, with no hardcoded ordering in the implementation. * Uses the same tiered growth strategy as ``array_t``: 2× below 1 024 elements, 1.5× up to 8 192, 1.25× up to 65 536, and a fixed increment of 256 beyond that; fixed-capacity mode is also supported via ``growth = false``. * Supports push (with sift-up), pop (with sift-down), non-destructive root peek, unordered foreach iteration via a typed callback, deep copy, and standard introspection (size, allocated capacity). * Ordered traversal is achieved by copying the heap with ``copy_heap`` and draining the copy via repeated ``heap_pop`` calls, preserving the original intact. * Designed as a generic container — users are expected to register their own ``dtype_id_t``, supply a comparator, and define thin typed wrappers for domain-specific priority structures using ``c_dtypes.h``. AVL Tree (``avl_t``) --------------------- * Utilizes allocators from the ``c_allocator.h`` file or user-developed custom allocators to manage memory for the tree header, node slab, free list, and any overflow allocations. * Implements a self-balancing binary search tree (AVL) that maintains strict height balance, ensuring O(log n) insertion, removal, and lookup operations. * Stores values as fixed-size byte buffers with a user-specified ``data_size`` derived from ``dtype_id_t``, enabling generic storage of arbitrary types. * Nodes store data inline (no secondary allocation), improving cache locality and eliminating pointer indirection for element access. * Uses a hybrid allocation strategy: - Reuses removed nodes through an internal free list - Carves new nodes from a contiguous slab for cache-friendly allocation - Optionally performs overflow allocation when slab capacity is exceeded * Tree ordering is fully determined by a caller-supplied comparator following the ``qsort(3)`` convention; duplicate handling is configurable at initialisation. * Automatically maintains AVL balance through single and double rotations during insertion and removal operations. * Supports ordered traversal via in-order iteration, as well as range-based traversal with branch pruning for efficient subset queries. * Provides safe element access and retrieval via copy semantics (no direct exposure of internal node storage). * Designed as a generic container — users are expected to define their own typed wrappers for domain-specific data structures using ``c_dtypes.h``. * Supports standard tree operations including insert, remove, contains, find, min/max retrieval, ordered traversal, range traversal, copy, and introspection. Work Forward ------------ The following are areas for future improvement in the code base * Test on a wider array of platforms and compilers to exercise all SIMD instruction sets * Add basic matrix solvers for Dense and Sparse (i.e. COO, CSR, CSC) matrices in ``lin_alg.h`` file * Add root solvers, numerical integration, and numerical differentiation in ``numerical.h`` file. * Add more advance matrix solvers like JFNK methods .. toctree:: :maxdepth: 1 :caption: Modules: c_types c_error c_allocator c_string c_tensor c_uint8 c_int8 c_uint16 c_int16 c_uint32 c_int32 c_uint64 c_int64 c_float c_double c_ldouble c_dict c_list c_tree Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search` Installation and Build Guide ############################ Requirements ------------ - Git - CMake (version 3.31.3 or later) - C compiler (GCC, Clang, or MSVC) For unit testing: - Linux: valgrind (optional, for memory leak checking) - All platforms: cmocka testing framework Getting the Code ---------------- Clone the repository: .. code-block:: bash git clone https://github.com/Jon-Webb-79/csalt.git cd csalt Debug Build (with tests) ------------------------ Use the appropriate script for your platform: **Linux/macOS (bash)**: .. code-block:: bash cd scripts/bash ./debug.sh **Linux/macOS (zsh)**: .. code-block:: bash cd scripts/zsh ./debug.zsh **Windows**: .. code-block:: batch cd scripts\Windows debug.bat Run tests: **Linux (with valgrind)**: .. code-block:: bash cd build/debug valgrind ./unit_tests **macOS/Windows**: .. code-block:: bash cd build/debug ./unit_tests Static Library Build -------------------- Creates a static library without tests: **Linux/macOS (bash)**: .. code-block:: bash cd scripts/bash ./static.sh **Linux/macOS (zsh)**: .. code-block:: bash cd scripts/zsh ./static.zsh **Windows**: .. code-block:: batch cd scripts\Windows static.bat System Installation ------------------- Installs library files to system directories for use in other projects: **Linux/macOS (requires sudo)**: .. code-block:: bash cd scripts/bash # or scripts/zsh sudo ./install.sh # or sudo ./install.zsh **Windows (requires Administrator)**: 1. Right-click ``scripts\Windows\install.bat`` 2. Select "Run as Administrator" Usage in Projects ----------------- After installation, you can use the library in three ways: 1. **As System Library**: After installation, include in your C files: .. code-block:: c #include // Or whichever header file you wish to use 2. **As Static Library**: Link against the static library created in the build/static directory. 3. **Direct Integration**: Copy any files you wish to your project and compile directly. Ensure that you have the ``.h`` and ``.c`` files. Each file requires that the ``c_string.h`` and ``c_string.c`` file also be present. Troubleshooting --------------- - If tests fail, ensure all dependencies are properly installed - For Windows builds, ensure you're using an appropriate Visual Studio version - For installation issues, verify you have appropriate system permissions Contribute to Code Base ----------------------- #. Establish a pull request with the git repository owner. #. Once the package has been downloade, you will also need to install Python3.10 or later version to support documentation with Sphinx. #. Navigate to the ``csalt/docs/doxygen`` directory. #. Create a Python virtual environment with the following command. .. code-block:: bash python -m venv .venv #. Activate the virtual environment with the following command. .. table:: Activation Commands for Virtual Environments +----------------------+------------------+-------------------------------------------+ | Platform | Shell | Command to activate virtual environment | +======================+==================+===========================================+ | POSIX | bash/zsh | ``$ source /bin/activate`` | + +------------------+-------------------------------------------+ | | fish | ``$ source /bin/activate.fish`` | + +------------------+-------------------------------------------+ | | csh/tcsh | ``$ source /bin/activate.csh`` | + +------------------+-------------------------------------------+ | | Powershell | ``$ /bin/Activate.ps1`` | +----------------------+------------------+-------------------------------------------+ | Windows | cmd.exe | ``C:\> \\Scripts\\activate.bat`` | + +------------------+-------------------------------------------+ | | PowerShell | ``PS C:\\> \\Scripts\\Activate.ps1``| +----------------------+------------------+-------------------------------------------+ #. Install packages to virtual environments from ``requirements.txt`` file .. code-block:: bash pip install -r requirements.txt #. At this point you can build the files in the same way described in the previous section and contribute to documentation.