C String Overview
String handling in C is traditionally performed using raw character arrays
and functions from string.h. While efficient, this approach requires
manual management of memory allocation, buffer sizing, and null termination,
which can introduce risks such as buffer overflows, memory leaks, and
undefined behavior in large or safety-critical systems.
The C String module in this library provides a lightweight, allocator-aware
string container implemented in pure C and declared in c_string.h.
Unlike conventional C strings, this container explicitly tracks:
The current string length (excluding the null terminator)
The total allocated buffer size (including space for the terminator)
The allocator used to create and release the string memory
By integrating directly with the allocator abstraction defined in
c_allocator.h, the string container supports heap, arena, pool, slab,
or custom allocation strategies without changing the public API.
This enables deterministic memory behavior suitable for embedded,
real-time, or safety-regulated environments.
String construction follows a capacity-driven model:
A requested capacity of 0 allocates exactly enough memory to store the full input string plus the null terminator.
A non-zero capacity allocates space for the requested number of characters plus one byte for the null terminator.
If the requested capacity is smaller than the input string length, the stored string is safely truncated and always null-terminated.
If the requested capacity is larger, unused buffer space remains available for future operations.
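The capacity rules above can be sketched as a small sizing helper. This is only an illustration of the arithmetic, not the library's implementation; sketch_plan_t and sketch_plan_capacity are hypothetical names that do not appear in c_string.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: the sizes implied by the capacity rules. */
typedef struct {
    size_t alloc; /* bytes to allocate, including the terminator */
    size_t len;   /* characters actually stored */
} sketch_plan_t;

/* Hypothetical helper, not part of c_string.h. */
static sketch_plan_t sketch_plan_capacity(size_t src_len, size_t capacity)
{
    sketch_plan_t plan;
    /* A capacity of 0 means "fit the input exactly". */
    size_t payload = (capacity == 0u) ? src_len : capacity;

    assert(payload < SIZE_MAX); /* one more byte is needed for '\0' */
    plan.alloc = payload + 1u;
    /* Truncate when the requested capacity is smaller than the input. */
    plan.len = (src_len < payload) ? src_len : payload;
    return plan;
}
```

For a five-character input, a capacity of 0 yields 6 allocated bytes, a capacity of 3 yields 4 bytes with 3 stored characters, and a capacity of 10 yields 11 bytes with 5 stored characters.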
All functions return explicit success/error state using the
*_expect_t pattern defined in c_error.h, avoiding implicit failure
modes common in traditional C string handling.
These characteristics make the C String module appropriate for:
Deterministic and allocator-controlled memory management
Embedded or real-time software requiring bounded behavior
Safety-critical systems emphasizing explicit error handling
Large applications seeking consistent container abstractions in C
Data Types
The following data structures and derived data types are defined in
c_string.h and implemented in c_string.c to support the allocator-aware
string container.
string_t
string_t is a public, non-opaque data structure that represents a
dynamically managed C-style string.
Unlike opaque container types used elsewhere in the library, this structure is
intentionally visible so that users may:
Inspect internal metadata directly when appropriate
Integrate the container with custom utilities or serialization logic
Extend behavior through user-defined helper functions or wrappers
The structure stores both the string data and the metadata required for safe
memory management through the allocator abstraction defined in
c_allocator.h.
Key properties maintained by string_t include:
A pointer to a null-terminated character buffer
The current logical length of the string (excluding the terminator)
The total allocated buffer size in bytes (including space for the terminator)
The allocator instance responsible for allocation and release of memory
These fields allow deterministic control over memory usage while preserving compatibility with standard C string operations.
typedef struct {
char* str; // Pointer to null-terminated character buffer
size_t len; // Logical string length (excludes '\0')
size_t alloc; // Total allocated bytes (includes space for '\0')
allocator_vtable_t allocator; // Allocator used for memory management
} string_t;
The following invariants are guaranteed for any valid string_t instance:
str always points to a null-terminated character sequence.
alloc >= len + 1, to ensure space for the terminator.
Memory ownership and release are handled exclusively through the stored allocator.
Because the structure is public, users must preserve these invariants when manipulating fields directly. Violating them may result in undefined behavior or allocator misuse.
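Code that edits the fields directly may want to re-check the invariants afterwards. The following is a minimal sketch of such a check, assuming a simplified mirror of the structure; sketch_string_t and sketch_string_is_valid are hypothetical names, and the allocator member is left out so the example compiles without c_allocator.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the public fields (allocator omitted). */
typedef struct {
    char  *str;
    size_t len;
    size_t alloc;
} sketch_string_t;

/* Returns true when the documented invariants hold. */
static bool sketch_string_is_valid(const sketch_string_t *s)
{
    if (s == NULL || s->str == NULL) {
        return false;
    }
    if (s->alloc == 0u || s->len > s->alloc - 1u) {
        return false;              /* violates alloc >= len + 1 */
    }
    return s->str[s->len] == '\0'; /* terminator sits at the logical end */
}
```

The bounds test runs before the terminator is read, so a corrupted len never causes an out-of-range access in the check itself.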
string_expect_t
string_expect_t is a lightweight result container used for explicit error
handling during string construction and operations.
This follows the *_expect_t convention defined in c_error.h and avoids
implicit failure modes such as returning NULL without context.
typedef struct {
bool has_value;
union {
string_t* value;
error_code_t error;
} u;
} string_expect_t;
When has_value is true, the value field contains a valid
string_t pointer that must eventually be released using
return_string().
When has_value is false, the error field contains the associated
error code describing the failure condition.
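The same has_value / union shape can be illustrated with a self-contained example. The sketch below is not taken from c_error.h: sketch_error_t, sketch_size_expect_t, and sketch_alloc_size are hypothetical stand-ins that only show how a caller branches on has_value before touching the union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes; the real error_code_t lives in c_error.h. */
typedef enum { SKETCH_OK = 0, SKETCH_OVERFLOW = 1 } sketch_error_t;

/* Hypothetical result type with the same layout idea as string_expect_t. */
typedef struct {
    bool has_value;
    union {
        size_t         value;
        sketch_error_t error;
    } u;
} sketch_size_expect_t;

/* Computes len + 1 (payload plus terminator), reporting overflow explicitly. */
static sketch_size_expect_t sketch_alloc_size(size_t len)
{
    sketch_size_expect_t r;
    if (len == SIZE_MAX) {
        r.has_value = false;
        r.u.error = SKETCH_OVERFLOW;
    } else {
        r.has_value = true;
        r.u.value = len + 1u;
    }
    return r;
}
```

Only one union member is meaningful at a time, so callers should read u.value when has_value is true and u.error otherwise.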
Recommended Allocators
Different allocators are appropriate depending on how strings are created, resized, and destroyed:
HeapAllocator
Best for:
general-purpose string manipulation
variable-length strings with frequent resizing
host-side applications and utilities
debugging string behavior independent of allocator complexity
Strings often reallocate and copy memory during growth. A heap allocator provides predictable behavior and is the easiest to reason about during development and testing.
ArenaAllocator
Best for:
short-lived string collections (e.g., parsing, tokenization)
append-only or immutable string usage
batch construction followed by bulk discard
This is highly efficient when strings are not individually freed. However, repeated resizing may lead to unused memory if older buffers cannot be reclaimed.
BuddyAllocator
Best for:
dynamically sized strings with controlled fragmentation
systems requiring deterministic allocation patterns
workloads with frequent allocation and release
Since strings often grow geometrically, a buddy allocator can help reduce fragmentation compared to a general heap in long-running systems.
PoolAllocator / SlabAllocator
Best for:
fixed-capacity strings
preallocated buffers
embedded systems with strict memory layouts
These allocators are less suitable for dynamically growing strings, but can be effective when string sizes are bounded or known in advance.
String Functions
Creation and Teardown
init_string
-
string_expect_t init_string(const char *cstr, size_t capacity_bytes, allocator_vtable_t allocator)
Initialize an allocator-backed string container.
Constructs a new string_t instance using the provided C-string input, requested payload capacity, and allocator vtable.
Capacity semantics:
If capacity_bytes is 0, the allocation defaults to exactly the length of cstr plus space for the null terminator.
If capacity_bytes is non-zero, the container allocates (capacity_bytes + 1) bytes to guarantee space for the terminator.
If the requested capacity is smaller than the source string length, the stored string is truncated to fit and always null-terminated.
Memory is obtained exclusively through the supplied allocator and must later be released with return_string().
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("hello", 0, a);
if (r.has_value) {
    printf("%s\n", const_string(r.u.value)); // prints "hello"
    return_string(r.u.value);
}
- Parameters:
cstr – [in] Null-terminated source C string.
capacity_bytes – [in] Requested payload capacity in characters (excluding the null terminator).
allocator – [in] Allocator vtable used for memory management.
- Returns:
string_expect_t
.has_value = true → valid string_t pointer in .u.value
.has_value = false → error code in .u.error
return_string
-
void return_string(string_t *s)
Release a string and its associated memory.
Frees the internal character buffer and the string_t structure using the allocator stored within the container. For some allocators, such as an arena_t, this function may be a no-op. Passing NULL is safe and performs no action.
After this call, the pointer must not be used.
string_expect_t r = init_string("example", 0, heap_allocator());
if (r.has_value) {
    return_string(r.u.value); // safe cleanup
}
- Parameters:
s – [in] Pointer to string_t instance or NULL.
Utility Functions
get_string_index
-
static inline char get_string_index(const string_t *s, size_t index)
Safely retrieve a character from a string at a given index.
Returns the character at position index within the logical contents of the string s. The index must satisfy:
0 <= index < s->len
If the index is out of bounds, or if s or s->str is NULL, this function returns the null character ('\0').
The logical string length (s->len) is authoritative. The null terminator stored at s->str[s->len] is considered an implementation detail and is not treated as a valid character.
This function does not distinguish between an out-of-bounds access and a valid embedded '\0' character.
For applications requiring explicit error signaling, consider using a boolean-return variant with an output parameter.
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("hello", 0u, a);
if (r.has_value) {
    string_t* s = r.u.value;
    char c0 = get_string_index(s, 0); // 'h'
    char c4 = get_string_index(s, 4); // 'o'
    char c5 = get_string_index(s, 5); // '\0' (out of bounds)
    return_string(s);
}
- Parameters:
s – Pointer to the source string_t.
index – Zero-based character index.
- Returns:
The character at the specified index if valid; otherwise the null character ('\0').
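The boolean-return variant suggested above could look like the following. This is a sketch only: sketch_try_get_index is a hypothetical name, and it takes a raw buffer and length rather than a string_t so that it stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variant: reports success separately from the character,
   so an embedded '\0' can be told apart from an out-of-bounds index. */
static bool sketch_try_get_index(const char *buf, size_t len,
                                 size_t index, char *out)
{
    if (buf == NULL || out == NULL || index >= len) {
        return false; /* *out is left untouched on failure */
    }
    *out = buf[index];
    return true;
}
```

With this shape, reading index 1 of the three-byte sequence 'a', '\0', 'b' succeeds and stores '\0', while reading index 3 fails.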
const_string
-
static inline const char *const_string(const string_t *s)
Retrieve the internal null-terminated C string.
Returns a pointer to the underlying character buffer owned by the string_t container. The returned pointer remains valid until the string is released with return_string().
Passing NULL is safe and returns NULL.
const char* text = const_string(str);
if (text) {
    puts(text);
}
- Parameters:
s – [in] Pointer to string_t instance or NULL.
- Returns:
Pointer to null-terminated character buffer, or NULL.
string_size
-
static inline size_t string_size(const string_t *s)
Get the logical length of the string.
Returns the number of characters stored in the container, excluding the null terminator.
Passing NULL is safe and returns 0.
size_t n = string_size(str);
printf("length = %zu\n", n);
- Parameters:
s – [in] Pointer to string_t instance or NULL.
- Returns:
Character count excluding the null terminator.
string_alloc
-
static inline size_t string_alloc(const string_t *s)
Get the total allocated buffer size in bytes.
Returns the number of bytes allocated for the internal buffer, including space reserved for the null terminator.
Passing NULL is safe and returns 0.
size_t cap = string_alloc(str);
printf("capacity = %zu bytes\n", cap);
- Parameters:
s – [in] Pointer to string_t instance or NULL.
- Returns:
Total allocated bytes for the string buffer.
str_compare
-
int8_t str_compare(const string_t *s, const char *str)
Compare a bounded string_t against a C string.
Performs a lexicographical comparison between the contents of s and the null-terminated C string str. The comparison is bounded by s->len, meaning the function never reads beyond the initialized region of the string_t buffer.
This function uses a scalar implementation to guarantee:
Deterministic execution
Strict bounds safety
MISRA-compatible control flow
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("alpha", 0, a);
if (r.has_value) {
    string_t* s = r.u.value;
    int8_t cmp = str_compare(s, "alphabet");
    // cmp == -1 ("alpha" < "alphabet")
    return_string(s);
}
Note
Comparison stops at s->len or the first differing character.
The function does not read beyond the bounds of s.
- Parameters:
s – Pointer to the source string_t to compare.
str – Pointer to a null-terminated C string.
- Return values:
INT8_MIN – Invalid argument (NULL pointer or corrupt state).
0 – Strings are equal within the bounded region.
-1 – s is lexicographically less than str.
1 – s is lexicographically greater than str.
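A bounded scalar comparison with these return values might be written as below. This is an illustrative sketch under the assumption that a string_t which is a strict prefix of str compares as less (as in the "alpha" versus "alphabet" example); sketch_compare is a hypothetical name and works on a raw buffer plus length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounded comparison: buf holds len bytes, cstr is
   NUL-terminated. Never reads buf[len] or past the end of cstr. */
static int8_t sketch_compare(const char *buf, size_t len, const char *cstr)
{
    if (buf == NULL || cstr == NULL) {
        return INT8_MIN;
    }
    for (size_t i = 0u; i < len; ++i) {
        unsigned char a = (unsigned char)buf[i];
        unsigned char b = (unsigned char)cstr[i];
        if (b == '\0') {
            return 1;                /* cstr ended first */
        }
        if (a != b) {
            return (a < b) ? -1 : 1; /* first differing byte decides */
        }
    }
    /* All len bytes matched: equal only if cstr also ends here. */
    return (cstr[len] == '\0') ? 0 : -1;
}
```

Each cstr[i] is read only after every earlier byte was found to be non-NUL, which is what keeps the loop inside the C string.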
string_compare
-
int8_t string_compare(const string_t *s, const string_t *str)
Compare two bounded string_t objects.
Performs a lexicographical comparison between s and str, both of which are bounded string_t instances.
This function may use SIMD acceleration when supported by the target architecture and enabled at compile time:
AVX / AVX2 / AVX-512 on x86
SSE2 / SSE3 / SSE4.1 on x86
NEON on ARM
SVE / SVE2 on ARM
When SIMD is unavailable, the implementation falls back to a fully scalar, MISRA-safe comparison with identical semantics.
allocator_vtable_t a = heap_allocator();
string_expect_t r1 = init_string("delta", 0, a);
string_expect_t r2 = init_string("gamma", 0, a);
if (r1.has_value && r2.has_value) {
    string_t* s1 = r1.u.value;
    string_t* s2 = r2.u.value;
    int8_t cmp = string_compare(s1, s2);
    // cmp == -1 ("delta" < "gamma")
    return_string(s1);
    return_string(s2);
}
Note
SIMD is used only for bounded byte comparison and never reads past either string’s initialized region.
Return values are architecture-independent.
- Parameters:
s – Pointer to the first string_t.
str – Pointer to the second string_t.
- Return values:
INT8_MIN – Invalid argument (NULL pointer or corrupt state).
0 – Strings are equal.
-1 – s is lexicographically less than str.
1 – s is lexicographically greater than str.
is_string_ptr
-
bool is_string_ptr(const string_t *s, const void *ptr)
Checks whether a pointer lies within a string’s allocated buffer.
Determines if ptr points to a memory location inside the character storage owned by the string_t instance s.
The valid range is:
[s->str, s->str + string_alloc(s))
This function does not verify:
Whether the pointer references initialized characters
Whether the pointer aligns to a character boundary
Whether the allocator globally owns the pointer
It only checks containment within the string’s allocation.
string_expect_t r = init_string("hello", 0, heap_allocator());
if (r.has_value) {
    string_t* s = r.u.value;
    char* p = s->str + 1;
    if (is_string_ptr(s, p)) {
        *p = 'a'; // safe mutation
    }
    return_string(s);
}
Note
Safe for defensive validation in low-level APIs.
Useful before performing pointer arithmetic or in-place mutation.
- Parameters:
s – Pointer to the string instance.
ptr – Pointer to test.
- Return values:
true – Pointer lies within the string’s allocated buffer.
false – Pointer is NULL, string is invalid, or outside range.
is_string_ptr_sized
-
bool is_string_ptr_sized(const string_t *s, const void *ptr, size_t bytes)
Checks whether a sized memory range lies fully within a string’s allocated buffer.
Determines whether the range [ptr, ptr + bytes) is entirely contained within the character storage owned by s.
The string’s valid allocation range is:
[s->str, s->str + s->alloc)
This function is useful for validating that an object, sub-buffer, or typed view fits completely within a string before performing operations such as parsing, casting, or in-place mutation.
Note
This checks allocation containment, not “used characters” containment. If you want containment within initialized characters, bound against s->len + 1 (or s->len) instead of s->alloc.
bytes == 0 returns false to avoid “vacuously true” ranges.
- Parameters:
s – Pointer to the string instance.
ptr – Pointer to the start of the candidate region.
bytes – Size (in bytes) of the candidate region.
- Return values:
true – The entire range lies within the string’s allocated buffer.
false – Invalid inputs, overflow, or the range extends outside the buffer.
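One way to perform this kind of containment test without overflowing ptr + bytes is sketched below. It is not the library's code: sketch_range_in_buffer is a hypothetical name, and comparing addresses through uintptr_t is a common but implementation-defined technique assumed here for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: is [ptr, ptr + bytes) inside [base, base + alloc)? */
static bool sketch_range_in_buffer(const void *base, size_t alloc,
                                   const void *ptr, size_t bytes)
{
    if (base == NULL || ptr == NULL || bytes == 0u) {
        return false;                /* empty ranges are rejected */
    }
    uintptr_t lo = (uintptr_t)base;
    uintptr_t p  = (uintptr_t)ptr;
    if (alloc > UINTPTR_MAX - lo) {
        return false;                /* buffer end would wrap around */
    }
    if (p < lo || p >= lo + alloc) {
        return false;                /* start lies outside the buffer */
    }
    /* Compare against the remaining space instead of forming p + bytes. */
    return bytes <= (lo + alloc) - p;
}
```

For a 6-byte buffer, the 5-byte range starting at offset 1 is accepted, while a 6-byte range at the same offset is rejected.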
find_substr
-
size_t find_substr(const string_t *haystack, const string_t *needle, const uint8_t *begin, const uint8_t *end, direction_t dir)
Finds the first occurrence of a substring within a bounded region.
Searches for the string needle inside the character data of haystack, restricted to the memory range [begin, end).
The search direction is controlled by dir:
FORWARD – scans from begin toward end and returns the earliest match.
REVERSE – scans from end toward begin and returns the latest match within the region.
If begin or end is NULL, the search defaults to the used character region of the string:
[haystack->str, haystack->str + haystack->len)
This function is implemented to take advantage of SIMD (Single Instruction, Multiple Data) instructions when supported by the target architecture.
At compile time, architecture-specific vectorized implementations may be selected, including:
AVX-512, AVX2, AVX
SSE4.1, SSE3, SSE2
ARM NEON
ARM SVE / SVE2
When SIMD is available, the search compares multiple characters in parallel, significantly improving performance for:
long haystacks
repeated substring searches
forward and reverse scans over large buffers
If no SIMD capability is detected, the function safely falls back to a fully portable scalar implementation with identical semantics.
SIMD usage is completely transparent to the caller:
No API differences
No alignment requirements
No behavioral changes
The return value is a 1-based offset relative to begin:
Return value    Meaning
0               Not found or invalid arguments
1               Match begins exactly at begin
k + 1           Match begins k bytes after begin
This convention avoids ambiguity between:
“not found”, and
“match at index 0”.
Region pointers must lie within the allocated buffer of haystack; otherwise the function returns 0.
The search is limited to the used string length, not slack allocation beyond haystack->len.
An empty needle is treated as found at the beginning of the region and returns 1.
SIMD acceleration is optional and architecture-dependent but never changes correctness.
- Example: Forward search
string_expect_t h = init_string("bananana", 0, heap_allocator());
string_expect_t n = init_string("ana", 0, heap_allocator());
if (h.has_value && n.has_value) {
    size_t pos = find_substr(h.u.value, n.u.value, NULL, NULL, FORWARD);
    // "ana" first appears at index 1 → return value = 2 (1-based)
}
- Example: Reverse search
size_t pos = find_substr(h.u.value, n.u.value, NULL, NULL, REVERSE);
// Last occurrence at index 5 → return value = 6
- Example: Bounded window search
const uint8_t* base = (const uint8_t*)h.u.value->str;
size_t pos = find_substr(
    h.u.value,
    n.u.value,
    base + 3,               // begin search inside string
    base + h.u.value->len,  // end of used region
    FORWARD);
// Position is relative to begin, not start of string
- Parameters:
haystack – String being searched.
needle – Substring to locate.
begin – Pointer to start of searchable region inside haystack->str (may be NULL).
end – Pointer to one-past-end of searchable region inside haystack->str (may be NULL).
dir – Search direction: FORWARD or REVERSE.
- Return values:
0 – Not found or invalid input.
>0 – 1-based offset of first match relative to begin.
- Returns:
size_t
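The 1-based return convention and the two scan directions can be illustrated with a portable scalar search. The sketch below is an assumption-laden stand-in for the scalar fallback, not the library's engine: sketch_dir_t and sketch_find are hypothetical names, and the region is passed as a pointer plus length instead of a begin/end pair.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical direction type mirroring FORWARD / REVERSE. */
typedef enum { SKETCH_FORWARD, SKETCH_REVERSE } sketch_dir_t;

/* Returns 0 when not found, otherwise (offset of match) + 1. */
static size_t sketch_find(const char *hay, size_t hay_len,
                          const char *needle, size_t needle_len,
                          sketch_dir_t dir)
{
    if (hay == NULL || needle == NULL) {
        return 0u;
    }
    if (needle_len == 0u) {
        return 1u;                      /* empty needle: found at region start */
    }
    if (needle_len > hay_len) {
        return 0u;
    }
    size_t last = hay_len - needle_len; /* last valid start offset */
    for (size_t step = 0u; step <= last; ++step) {
        /* Forward visits 0..last, reverse visits last..0. */
        size_t k = (dir == SKETCH_FORWARD) ? step : last - step;
        if (memcmp(hay + k, needle, needle_len) == 0) {
            return k + 1u;
        }
    }
    return 0u;
}
```

Searching "bananana" for "ana" gives 2 in the forward direction and 6 in reverse, matching the examples above. Restricting the region to the last five bytes gives 1, because the offset is relative to the start of the region.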
find_substr_lit
-
size_t find_substr_lit(const string_t *haystack, const char *needle_lit, const uint8_t *begin, const uint8_t *end, direction_t dir)
Find the first occurrence of a literal substring within a string range.
This function searches for the first case-sensitive occurrence of the NUL-terminated C string needle_lit inside the string haystack, optionally constrained to the byte range [begin, end).
The search semantics match find_substr, but the needle is provided as a string literal instead of a string_t object. Internally, the literal length is determined via strlen, and the search is delegated to the same SIMD/scalar substring engine used by find_substr.
- Example
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("Hello world, hello again", 0u, a);
if (!r.has_value) {
    // handle error
}
string_t* text = r.u.value;
// Find first lowercase "hello"
size_t pos = find_substr_lit(text, "hello", NULL, NULL, DIR_FWD);
// pos == 13
return_string(text);
Note
Matching is case-sensitive and substring-based (not word-delimited).
The search range [begin, end) is validated against the underlying allocation and clamped to the used length of the string.
An empty literal ("") is defined as found at the start of the search region and returns the offset corresponding to begin.
The literal is not copied; only its length is computed before searching.
- Parameters:
haystack – Pointer to the source string object to be searched.
needle_lit – Pointer to a NUL-terminated C string literal representing the substring to locate.
begin – Optional pointer to the beginning of the search region within haystack->str. If NULL, the search begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the search region. If NULL, the search continues to the end of the used string length.
dir – Search direction (implementation-defined; typically forward or reverse).
- Return values:
SIZE_MAX – Returned if:
haystack == NULL
haystack->str == NULL
needle_lit == NULL
the search range is invalid or outside the allocation
the literal is not found within the specified region
- Returns:
Offset in bytes from the beginning of haystack->str to the first matching occurrence of needle_lit.
print_string
-
void print_string(const string_t *s, FILE *stream)
Print a string_t to an output stream with fixed-width line wrapping.
Writes the logical contents of s to stream without adding quotes, brackets, braces, or a trailing newline. The output is wrapped every 70 columns: once 70 characters have been written on the current line, printing continues on the next line until the full string has been emitted.
Wrapping is performed strictly by character count and does not attempt to preserve words or break at whitespace boundaries.
This function does not modify the string and performs no allocations.
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string(
    "This is a long string that may need to wrap across multiple lines "
    "when printed to the console.",
    0u,
    a
);
if (r.has_value) {
    string_t* s = r.u.value;
    print_string(s, stdout);
    putchar('\n');
    return_string(s);
}
Note
If s is NULL, s->str is NULL, or stream is NULL, the function performs a silent no-op.
No trailing newline is appended automatically.
The function prints at most s->len bytes and does not rely on the presence of a terminating null byte beyond the logical length.
- Parameters:
s – Pointer to the source string_t to print.
stream – Output stream to write to, such as stdout, stderr, or an open file.
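Fixed-width wrapping by character count can be sketched as follows. This is an illustration rather than the library's implementation: sketch_print_wrapped is a hypothetical name, the width is a parameter instead of the fixed 70 columns, and the returned break count exists only to make the behavior easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes len bytes of buf to stream, starting a new
   line before every width-th byte. Returns the number of breaks inserted. */
static size_t sketch_print_wrapped(const char *buf, size_t len,
                                   size_t width, FILE *stream)
{
    size_t breaks = 0u;
    if (buf == NULL || stream == NULL || width == 0u) {
        return 0u;                        /* silent no-op on bad input */
    }
    for (size_t i = 0u; i < len; ++i) {
        if (i > 0u && (i % width) == 0u) {
            fputc('\n', stream);          /* break only when more text follows */
            ++breaks;
        }
        fputc((unsigned char)buf[i], stream);
    }
    return breaks;
}
```

Because the newline is written before the next character and not after the last one, a string whose length is an exact multiple of the width ends without a trailing newline.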
String Manipulation
str_concat
-
bool str_concat(string_t *s, const char *str)
Concatenate a C string onto a CSalt string.
Appends the null-terminated string str to the end of the destination string s. If additional capacity is required, the function attempts to grow the underlying buffer using the allocator associated with s.
This function is safe for overlapping memory regions. If the source pointer lies within the destination buffer and reallocation is required, a temporary copy is created before growth to preserve correctness.
string_expect_t r = init_string("Hello", 0, heap_allocator());
if (r.has_value) {
    string_t* s = r.u.value;
    if (str_concat(s, ", world!")) {
        printf("%s\n", const_string(s)); // "Hello, world!"
    }
    return_string(s);
}
Note
The allocator stored in s determines growth behavior.
Arena allocators may not reclaim intermediate buffers until the arena itself is reset or destroyed.
The resulting string is always null-terminated on success.
- Parameters:
s – Destination string to be extended.
str – Null-terminated C string to append.
- Return values:
true – Concatenation succeeded.
false – Invalid arguments, allocation failure, or size overflow.
string_concat
-
bool string_concat(string_t *s, const string_t *str)
Concatenate one CSalt string onto another.
Appends the contents of str to the destination string s. This function behaves identically to str_concat but obtains the source characters from another managed string_t instance.
The operation respects allocator semantics and may trigger buffer growth using the destination string’s allocator.
string_expect_t a = init_string("CSalt", 0, heap_allocator());
string_expect_t b = init_string(" Library", 0, heap_allocator());
if (a.has_value && b.has_value) {
    string_t* s1 = a.u.value;
    string_t* s2 = b.u.value;
    if (string_concat(s1, s2)) {
        printf("%s\n", const_string(s1)); // "CSalt Library"
    }
    return_string(s1);
    return_string(s2);
}
Note
Source and destination may reference the same underlying buffer. Overlap is handled safely.
The destination string remains null-terminated on success.
- Parameters:
s – Destination string to be extended.
str – Source string whose contents will be appended.
- Return values:
true – Concatenation succeeded.
false – Invalid arguments, allocation failure, or size overflow.
reset_string
-
static inline void reset_string(string_t *str)
Reset a string to the empty state without releasing memory.
Sets the logical length of the string to zero and, when a backing buffer exists, writes a null terminator at the first character position.
This allows subsequent concatenation operations (e.g., str_concat or string_concat) to begin writing from the start of the buffer while preserving the previously allocated capacity.
This operation is O(1) and does not invoke the allocator.
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("hello", 0u, a);
if (r.has_value) {
    string_t* s = r.u.value;
    reset_string(s);
    // String is now empty but reusable
    str_concat(s, "world");
    printf("%s\n", const_string(s)); // prints "world"
    return_string(s);
}
Note
If str is NULL, the function performs no action.
If the string has no backing buffer (str->str == NULL), the function performs no action.
Capacity remains unchanged; only the logical contents are cleared.
- Parameters:
str – Pointer to the string_t instance to reset.
copy_string
-
string_expect_t copy_string(const string_t *s, allocator_vtable_t allocator)
Creates a deep value copy of an existing string.
Allocates a new string_t instance using the supplied allocator and copies the character data from the source string s.
The copied string:
Contains identical character contents to s
Has independent storage (no shared buffer)
Uses the provided allocator for memory management
Allocates the minimal required capacity of string_size(s) + 1 bytes to store the characters and null terminator
This function performs a value copy, not a structural clone of the original allocation. Any unused capacity in the source string is not preserved in the copy.
allocator_vtable_t a = heap_allocator();
string_expect_t r1 = init_string("hello", 0, a);
if (!r1.has_value) {
    return;
}
string_t* original = r1.u.value;
string_expect_t r2 = copy_string(original, a);
if (r2.has_value) {
    string_t* copy = r2.u.value;
    // Independent modification
    str_concat(copy, " world");
    printf("%s\n", const_string(original)); // "hello"
    printf("%s\n", const_string(copy));     // "hello world"
    return_string(copy);
}
return_string(original);
Note
The returned string must be released with return_string.
Modifying the copy does not affect the source.
Suitable for transferring string ownership between allocators or subsystems.
- Parameters:
s – Source string to copy.
allocator – Allocator used to create the new string.
- Return values:
has_value = true – .u.value points to a newly allocated deep copy.
has_value = false – .u.error contains:
NULL_POINTER if s or s->str is NULL
Any error propagated from init_string
- Returns:
string_expect_t
word_count
-
size_t word_count(const string_t *s, const string_t *word, const uint8_t *start, const uint8_t *end)
Count case-sensitive occurrences of a substring within a string range.
This function counts the number of non-overlapping, case-sensitive occurrences of word inside the string s, optionally constrained to the byte range [start, end).
Internally, this function repeatedly calls find_substr() and advances the search cursor past each successful match, ensuring forward progress and preventing infinite loops.
- Example
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("Hello world thisHello is hello again Hello", 45, a);
if (!r.has_value) {
    // handle error
}
string_t* text = r.u.value;
r = init_string("Hello", 5, a);
if (!r.has_value) {
    // handle error
}
string_t* word = r.u.value;
size_t count = word_count(text, word, NULL, NULL);
// count == 3 because matching is case-sensitive:
// "Hello"
// "thisHello"
// "Hello"
return_string(word);
return_string(text);
Note
Matching is case-sensitive.
Matches are substring-based, not whole-word delimited. For example, searching "hello" will match "jonhello".
Occurrences are counted non-overlapping. To count overlapping matches, advance the cursor by +1 instead of +word->len after each match.
- Parameters:
s – Pointer to the source string object to be searched.
word – Pointer to the substring to search for.
start – Optional pointer to the beginning of the search region within s->str. If NULL, the search begins at the start of the string.
end – Optional pointer to one-past-the-last byte of the search region. If NULL, the search continues to the end of the used string length.
- Return values:
0 – Returned if:
s == NULL
s->str == NULL
word == NULL
word->str == NULL
word->len == 0
no matches are found
- Returns:
The number of non-overlapping occurrences of word found within the specified region.
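Non-overlapping counting with guaranteed forward progress can be sketched in portable C. The example is a stand-in for the find-and-advance loop described above, not the library's code: sketch_count is a hypothetical name, and it compares bytes directly with memcmp instead of calling find_substr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: non-overlapping, case-sensitive occurrences of
   word (word_len bytes) inside hay (hay_len bytes). */
static size_t sketch_count(const char *hay, size_t hay_len,
                           const char *word, size_t word_len)
{
    size_t count = 0u;
    size_t pos = 0u;
    if (hay == NULL || word == NULL || word_len == 0u) {
        return 0u;
    }
    /* pos never exceeds hay_len, so the subtraction cannot wrap. */
    while (word_len <= hay_len - pos) {
        if (memcmp(hay + pos, word, word_len) == 0) {
            ++count;
            pos += word_len; /* skip the whole match: no overlaps */
        } else {
            ++pos;           /* advance one byte and retry */
        }
    }
    return count;
}
```

The cursor moves by at least one byte on every iteration, which is what rules out an infinite loop. Replacing pos += word_len with ++pos would count overlapping matches instead.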
word_count_lit
-
size_t word_count_lit(const string_t *s, const char *word, const uint8_t *start, const uint8_t *end)
Count case-sensitive occurrences of a literal substring within a string range.
This function counts the number of non-overlapping, case-sensitive occurrences of the C string literal word inside the string s, optionally constrained to the byte range [start, end).
Internally, this function constructs a temporary non-owning substring view and repeatedly calls find_substr(), advancing the search cursor past each successful match to ensure forward progress and prevent infinite loops.
- Example
allocator_vtable_t a = heap_allocator();
string_expect_t r = init_string("Hello world thisHello is hello again Hello", 45, a);
if (!r.has_value) {
    // handle error
}
string_t* text = r.u.value;
size_t count = word_count_lit(text, "Hello", NULL, NULL);
// count == 3 because matching is case-sensitive:
// "Hello"
// "thisHello"
// "Hello"
return_string(text);
Note
Matching is case-sensitive.
Matches are substring-based, not whole-word delimited. For example, searching "Hello" will match "thisHello".
Occurrences are counted non-overlapping. To count overlapping matches, advance the cursor by +1 instead of +strlen(word) after each match.
The literal word is not copied; only a temporary non-owning view is created for the duration of the search.
- Parameters:
s – Pointer to the source string object to be searched.
word – Pointer to a NUL-terminated C string literal representing the substring to search for.
start – Optional pointer to the beginning of the search region within s->str. If NULL, the search begins at the start of the string.
end – Optional pointer to one-past-the-last byte of the search region. If NULL, the search continues to the end of the used string length.
- Return values:
0 – Returned if:
s == NULL
s->str == NULL
word == NULL
word is an empty string ("")
no matches are found
- Returns:
The number of non-overlapping occurrences of word found within the specified region.
token_count
-
size_t token_count(const string_t *s, const string_t *delim, const uint8_t *begin, const uint8_t *end)
Count tokens in a string using a string_t delimiter set.
Counts the number of non-empty tokens within the specified byte range of s, where tokens are sequences of bytes not contained in the delimiter set stored in delim.
A token start is defined as a transition from:
delimiter → non-delimiter
The beginning of the search window is treated as if it were preceded by a delimiter, ensuring that a leading non-delimiter byte forms a token.
- Example
allocator_vtable_t a = heap_allocator();
string_expect_t r1 = init_string("one|two||three", 0u, a);
string_expect_t r2 = init_string("|", 0u, a);
if (!r1.has_value || !r2.has_value) {
    // handle error
}
string_t* text = r1.u.value;
string_t* delim = r2.u.value;
size_t count = token_count(text, delim, NULL, NULL);
// Tokens: "one", "two", "three"
// count == 3
return_string(delim);
return_string(text);
Note
Matching is byte-wise and case-sensitive.
The window [begin, end) is validated against the allocation and clamped to the used string length.
If delim->len == 0, the entire non-empty window is treated as a single token.
SIMD acceleration may be used internally depending on the build configuration and target architecture.
- Parameters:
s – Pointer to the source string_t to analyze.
delim – Pointer to a string_t containing delimiter bytes. The delimiter set consists of the first delim->len bytes.
begin – Optional pointer to the first byte of the search window within s->str. If NULL, the search begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the search window. If NULL, the search ends at the used length of the string.
- Return values:
SIZE_MAX – Returned if:
s == NULL
s->str == NULL
delim == NULL
delim->str == NULL
[begin, end) lies outside the string allocation
0 – Returned if:
the window is empty
the window contains only delimiter bytes
- Returns:
Number of tokens found in the specified window.
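The delimiter → non-delimiter transition rule can be sketched in scalar form as follows. This is an illustrative reference on a plain byte buffer, not the library's (possibly SIMD-accelerated) implementation; sketch_token_count is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar reference (not part of c_string.h): counts token
 * starts in buf[0..len), where each byte of `delims` is a delimiter. */
static size_t sketch_token_count(const char *buf, size_t len, const char *delims)
{
    if (buf == NULL || delims == NULL) {
        return 0u;
    }
    size_t ndelim = strlen(delims);
    size_t count = 0u;
    bool prev_is_delim = true;  /* window start acts as if preceded by a delimiter */

    for (size_t i = 0u; i < len; ++i) {
        bool is_delim = (memchr(delims, buf[i], ndelim) != NULL);
        if (prev_is_delim && !is_delim) {
            ++count;            /* delimiter -> non-delimiter transition */
        }
        prev_is_delim = is_delim;
    }
    return count;
}
```

Because only transitions are counted, consecutive delimiters (as in "one|two||three") never produce empty tokens, and an empty delimiter set makes any non-empty window a single token.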
token_count_lit
-
size_t token_count_lit(const string_t *s, const char *delim, const uint8_t *begin, const uint8_t *end)
Count tokens in a string using a C-string delimiter set.
Counts the number of non-empty tokens within the specified byte range of s, where tokens are sequences of bytes not contained in the delimiter set delim.
A token start is defined as a transition from:
delimiter → non-delimiter
The beginning of the search window is treated as if it were preceded by a delimiter, ensuring that a leading non-delimiter byte forms a token.
The delimiter set is interpreted as the first strlen(delim) bytes of the NUL-terminated C string delim.
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string(" alpha, beta;gamma ", 0u, a);
    if (!r.has_value) {
        // handle error
    }
    string_t* text = r.u.value;
    // Delimiters: space, comma, semicolon
    size_t count = token_count_lit(text, " ,;", NULL, NULL);
    // Tokens: "alpha", "beta", "gamma"
    // count == 3
    return_string(text);
Note
Matching is byte-wise and case-sensitive.
The window [begin, end) is validated against the allocation and clamped to the used string length.
If delim is an empty string (""), the entire non-empty window is treated as a single token.
SIMD acceleration may be used internally depending on the build configuration and target architecture.
- Parameters:
s – Pointer to the source string_t to analyze.
delim – Pointer to a NUL-terminated C string containing delimiter bytes. Each byte in this string is treated as an independent delimiter.
begin – Optional pointer to the first byte of the search window within s->str. If NULL, the search begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the search window. If NULL, the search ends at the used length of the string.
- Return values:
SIZE_MAX – Returned if:
s == NULL
s->str == NULL
delim == NULL
[begin, end) lies outside the string allocation
0 – Returned if:
the window is empty
the window contains only delimiter bytes
- Returns:
Number of tokens found in the specified window.
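The window handling described in the notes (validate against the allocation, then clamp to the used length) might look roughly like the sketch below. The helper name, its parameters, and the use of integer address comparison are all assumptions made for illustration; the actual validation code in the library is not shown in this document.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of c_string.h): resolves an optional
 * [begin, end) window against a buffer of `alloc` bytes with `len` used.
 * On success stores the window offset and byte count and returns true. */
static bool sketch_resolve_window(const uint8_t *base, size_t len, size_t alloc,
                                  const uint8_t *begin, const uint8_t *end,
                                  size_t *out_off, size_t *out_n)
{
    if (base == NULL || out_off == NULL || out_n == NULL || len > alloc) {
        return false;
    }
    /* Compare addresses as integers (assumes a flat address space). */
    uintptr_t b  = (uintptr_t)base;
    uintptr_t lo = (begin != NULL) ? (uintptr_t)begin : b;
    uintptr_t hi = (end != NULL) ? (uintptr_t)end : b + len;

    if (lo < b || hi > b + alloc || lo > hi) {
        return false;           /* outside the allocation or inverted */
    }
    if (hi > b + len) {
        hi = b + len;           /* clamp to the used length */
    }
    if (lo > hi) {
        lo = hi;                /* window lies in unused space: empty */
    }
    *out_off = (size_t)(lo - b);
    *out_n   = (size_t)(hi - lo);
    return true;
}
```

A caller following the documented contract would map a false result to SIZE_MAX and a zero-length window to a count of 0.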
to_uppercase
-
void to_uppercase(string_t *s, uint8_t *start, uint8_t *end)
Convert ASCII lowercase characters to uppercase in-place.
Converts all bytes in the specified window of s from 'a'..'z' to 'A'..'Z' using ASCII-only rules. Bytes outside this range are left unchanged.
The conversion is performed in-place and may be internally accelerated using SIMD instructions depending on the build configuration and target architecture.
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("Hello world", 0u, a);
    if (!r.has_value) {
        // handle allocation failure
    }
    string_t* s = r.u.value;
    to_uppercase(s, NULL, NULL);
    // s->str == "HELLO WORLD"
    return_string(s);
Note
The window [start, end) must lie within the string allocation.
If end extends beyond the used length, it is clamped to s->len.
If arguments are invalid or the window is empty, the function performs a silent no-op.
Only ASCII case conversion is performed.
UTF-8 multibyte sequences and locale-dependent characters are not modified.
- Parameters:
s – Pointer to the string_t to modify.
start – Optional pointer to the first byte of the conversion window within s->str. If NULL, conversion begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the conversion window. If NULL, conversion continues to the end of the used string.
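A scalar sketch of the ASCII-only rule is shown below. It operates directly on a byte window and is meant only to illustrate why UTF-8 multibyte sequences pass through untouched; sketch_ascii_upper is a hypothetical name, and the library's own routine may be vectorized.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of c_string.h): uppercases ASCII letters
 * in the byte window [begin, end). Invalid or empty windows are a no-op. */
static void sketch_ascii_upper(uint8_t *begin, uint8_t *end)
{
    if (begin == NULL || end == NULL || begin >= end) {
        return;
    }
    for (uint8_t *p = begin; p < end; ++p) {
        /* Only bytes 'a'..'z' change; bytes >= 0x80 (UTF-8 lead and
         * continuation bytes) never fall in this range. */
        if (*p >= (uint8_t)'a' && *p <= (uint8_t)'z') {
            *p = (uint8_t)(*p - ((uint8_t)'a' - (uint8_t)'A'));
        }
    }
}
```

The lowercase direction is the mirror image: test for 'A'..'Z' and add the same offset instead of subtracting it.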
to_lowercase
-
void to_lowercase(string_t *s, uint8_t *start, uint8_t *end)
Convert ASCII uppercase characters to lowercase in-place.
Converts all bytes in the specified window of s from 'A'..'Z' to 'a'..'z' using ASCII-only rules. Bytes outside this range are left unchanged.
The conversion is performed in-place and may be internally accelerated using SIMD instructions depending on the build configuration and target architecture.
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("HELLO WORLD", 0u, a);
    if (!r.has_value) {
        // handle allocation failure
    }
    string_t* s = r.u.value;
    to_lowercase(s, NULL, NULL);
    // s->str == "hello world"
    return_string(s);
Note
The window [start, end) must lie within the string allocation.
If end extends beyond the used length, it is clamped to s->len.
If arguments are invalid or the window is empty, the function performs a silent no-op.
Only ASCII case conversion is performed.
UTF-8 multibyte sequences and locale-dependent characters are not modified.
- Parameters:
s – Pointer to the string_t to modify.
start – Optional pointer to the first byte of the conversion window within s->str. If NULL, conversion begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the conversion window. If NULL, conversion continues to the end of the used string.
drop_substr
-
void drop_substr(string_t *s, const string_t *substring, uint8_t *min_ptr, uint8_t *max_ptr)
Remove all non-overlapping occurrences of a substring within a window.
This function removes every non-overlapping occurrence of substring from the string s that lies within the byte range [begin, end).
Removal is performed in-place by shifting the remaining suffix of the string left using memmove, preserving the terminating NUL byte and maintaining valid C-string semantics.
To minimize data movement, matches are located using a reverse search strategy so that shrinking operations occur from right to left.
After each removal, if a single ASCII space ' ' immediately follows the removed substring, that space is also removed. (This helps avoid leaving double spaces when removing words.)
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r1 = init_string("alpha beta beta gamma", 0u, a);
    string_expect_t r2 = init_string("beta", 0u, a);
    if (r1.has_value && r2.has_value) {
        string_t* text = r1.u.value;
        string_t* word = r2.u.value;
        drop_substr(text, word, NULL, NULL);
        // Result: "alpha gamma"
        return_string(word);
        return_string(text);
    }
Note
The window [begin, end) must lie within the string allocation.
If end exceeds the used length, it is clamped to s->len.
If arguments are invalid, the function performs a silent no-op.
Matches are substring-based, not word-delimited.
Only one trailing ASCII space is removed per match.
- Parameters:
s – Pointer to the destination string_t to modify.
substring – Pointer to the substring to remove.
begin – Optional pointer to the first byte of the search window within s->str. If NULL, the window begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the search window. If NULL, the window extends to the end of the used string.
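The right-to-left removal with memmove, including the optional trailing-space cleanup, can be sketched on a plain NUL-terminated buffer as below. This is an illustration under the assumption of a whole-string window; sketch_drop is a hypothetical name and the code is not taken from the library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of c_string.h): removes every
 * non-overlapping occurrence of `word` from buf[0..len), scanning from
 * right to left, and returns the new length. buf must hold len + 1 bytes. */
static size_t sketch_drop(char *buf, size_t len, const char *word)
{
    if (buf == NULL || word == NULL) {
        return len;
    }
    size_t wlen = strlen(word);
    if (wlen == 0u || wlen > len) {
        return len;
    }
    size_t i = len - wlen + 1u;          /* one past the last candidate start */
    while (i > 0u) {
        --i;
        if (memcmp(buf + i, word, wlen) != 0) {
            continue;
        }
        size_t cut = wlen;
        if (i + cut < len && buf[i + cut] == ' ') {
            ++cut;                       /* also drop one trailing space */
        }
        /* Shift the suffix, including the terminating NUL, to the left. */
        memmove(buf + i, buf + i + cut, len - (i + cut) + 1u);
        len -= cut;
        /* Next candidate must end at or before this match's start. */
        i = (i >= wlen) ? i - wlen + 1u : 0u;
    }
    return len;
}
```

Scanning from the right means each memmove only shifts the part of the suffix that has already been shortened by earlier removals.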
drop_substr_lit
-
void drop_substr_lit(string_t *s, const char *substring, uint8_t *min_ptr, uint8_t *max_ptr)
Remove all non-overlapping occurrences of a C-string literal substring.
This function behaves identically to drop_substr, except the substring is provided as a NUL-terminated C string literal rather than a string_t object.
Each non-overlapping occurrence of substring found within the window [begin, end) of s is removed in-place by shifting the remaining suffix left, preserving the terminating NUL byte.
Matches are processed using a reverse search strategy to minimize the total amount of memory movement required.
If a single ASCII space ' ' immediately follows a removed occurrence, that space is also removed.
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("one two two three", 0u, a);
    if (r.has_value) {
        string_t* text = r.u.value;
        drop_substr_lit(text, "two", NULL, NULL);
        // Result: "one three"
        return_string(text);
    }
Note
The window [begin, end) must lie within the string allocation.
If end exceeds the used length, it is clamped to s->len.
If arguments are invalid, the function performs a silent no-op.
Matches are case-sensitive and substring-based.
Only one trailing ASCII space is removed per match.
- Parameters:
s – Pointer to the destination string_t to modify.
substring – NUL-terminated C string containing the substring to remove.
begin – Optional pointer to the first byte of the search window within s->str. If NULL, the window begins at the start of the used string.
end – Optional pointer to one-past-the-last byte of the search window. If NULL, the window extends to the end of the used string.
replace_substr
-
bool replace_substr(string_t *string, const string_t *pattern, const string_t *replace_string, char *min_ptr, char *max_ptr)
Replace all non-overlapping occurrences of a substring in-place.
Replaces every case-sensitive, non-overlapping occurrence of pattern with replace_string inside the byte window [min_ptr, max_ptr) of string.
This function is the string_t-based counterpart to replace_substr_lit and follows the same allocator-aware algorithm:
Match count determined using word_count.
Final length computed before modification.
Buffer resized once via the string's allocator if required.
Replacement performed using reverse search (find_substr with REVERSE) to minimize memory movement.
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r1 = init_string("one two two three", 0u, a);
    string_expect_t r2 = init_string("two", 0u, a);
    string_expect_t r3 = init_string("four", 0u, a);
    if (r1.has_value && r2.has_value && r3.has_value) {
        string_t* s = r1.u.value;
        string_t* pat = r2.u.value;
        string_t* rep = r3.u.value;
        replace_substr(s, pat, rep, NULL, NULL);
        // Result: "one four four three"
        return_string(rep);
        return_string(pat);
        return_string(s);
    }
Note
Matching is case-sensitive and substring-based.
Replacements are non-overlapping.
The window is interpreted as [min_ptr, max_ptr) (end exclusive).
The terminating NUL byte is preserved.
On failure, the original string contents remain unchanged.
- Parameters:
string – Pointer to the destination string_t to modify.
pattern – Substring to search for.
replace_string – Replacement substring.
min_ptr – Optional pointer to the first byte of the replacement window within string->str. If NULL, the window begins at the start of the used string.
max_ptr – Optional pointer to one-past-the-last byte of the replacement window. If NULL, the window extends to the end of the used string.
- Returns:
true if the operation completed successfully or no replacements were required.
false if:
any argument is invalid
the window lies outside the string allocation
memory reallocation fails
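The four-step algorithm (count, compute the final length, resize once, replace from right to left) can be sketched as follows. The struct demo_buf_t and the function sketch_replace are hypothetical stand-ins, plain realloc is used in place of the allocator abstraction, the window is assumed to cover the whole string, and integer-overflow checks are omitted for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string object; the real string_t layout
 * is not shown in this document. */
typedef struct {
    char  *str;    /* NUL-terminated heap buffer */
    size_t len;    /* used length, excluding the NUL */
    size_t alloc;  /* allocated bytes, including the NUL */
} demo_buf_t;

static bool sketch_replace(demo_buf_t *s, const char *pat, const char *rep)
{
    if (s == NULL || s->str == NULL || pat == NULL || rep == NULL) {
        return false;
    }
    size_t plen = strlen(pat), rlen = strlen(rep);
    if (plen == 0u || plen > s->len) {
        return true;                         /* nothing to replace */
    }
    /* Step 1: count non-overlapping matches. */
    size_t count = 0u;
    for (size_t k = 0u; k + plen <= s->len; ) {
        if (memcmp(s->str + k, pat, plen) == 0) { ++count; k += plen; }
        else { ++k; }
    }
    if (count == 0u) {
        return true;
    }
    /* Step 2: compute the final length before modifying anything. */
    size_t new_len = s->len - count * plen + count * rlen;

    /* Step 3: grow the buffer once if required. */
    if (new_len + 1u > s->alloc) {
        char *p = realloc(s->str, new_len + 1u);
        if (p == NULL) {
            return false;                    /* contents left unchanged */
        }
        s->str = p;
        s->alloc = new_len + 1u;
    }
    /* Step 4: replace from right to left. */
    size_t len = s->len;
    size_t i = len - plen + 1u;
    while (i > 0u) {
        --i;
        if (memcmp(s->str + i, pat, plen) != 0) {
            continue;
        }
        /* Move the suffix (with its NUL) to its new place, then copy. */
        memmove(s->str + i + rlen, s->str + i + plen, len - (i + plen) + 1u);
        memcpy(s->str + i, rep, rlen);
        len = len - plen + rlen;
        i = (i >= plen) ? i - plen + 1u : 0u;
    }
    s->len = len;
    return true;
}
```

Doing the only fallible step (the reallocation) before any byte is moved is what allows the original contents to remain unchanged on failure.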
replace_substr_lit
-
bool replace_substr_lit(string_t *string, const char *pattern, const char *replace_string, uint8_t *min_ptr, uint8_t *max_ptr)
Replace all non-overlapping occurrences of a literal substring in-place.
Replaces every case-sensitive, non-overlapping occurrence of the NUL-terminated C string pattern with replace_string inside the byte window [min_ptr, max_ptr) of string.
The operation is performed in-place using allocator-aware resizing:
The number of matches is determined using word_count_lit.
The final required string length is computed before modification.
If necessary, the buffer is reallocated once via the string’s associated allocator.
Matches are processed using reverse search (find_substr_lit with REVERSE) to minimize the total memmove cost.
See also
replace_substr
See also
find_substr_lit
See also
word_count_lit
- Example
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("red green red blue", 0u, a);
    if (!r.has_value) {
        // handle allocation failure
    }
    string_t* s = r.u.value;
    replace_substr_lit(s, "red", "yellow", NULL, NULL);
    // Result: "yellow green yellow blue"
    return_string(s);
Note
Matching is case-sensitive and substring-based.
Replacements are non-overlapping.
The window is interpreted as [min_ptr, max_ptr) (end exclusive).
The terminating NUL byte is always preserved.
On failure, the original string contents remain unchanged.
- Parameters:
string – Pointer to the destination string_t to modify.
pattern – NUL-terminated substring to search for.
replace_string – NUL-terminated replacement substring.
min_ptr – Optional pointer to the first byte of the replacement window within string->str. If NULL, the window begins at the start of the used string.
max_ptr – Optional pointer to one-past-the-last byte of the replacement window. If NULL, the window extends to the end of the used string.
- Returns:
true if the operation completed successfully or no replacements were required.
false if:
any argument is invalid
the window lies outside the string allocation
memory reallocation fails
pop_str_token_lit
-
string_expect_t pop_str_token_lit(string_t *s, const char *token, allocator_vtable_t allocator)
Pop the substring to the right of the last literal token occurrence.
Searches for the last (reverse) occurrence of the C string literal token within the used portion of s. If found, all characters strictly to the right of the token are:
Copied into a newly allocated string_t (using the supplied allocator), and
Removed from s by shrinking its logical length and resetting the null terminator.
The token itself is also removed from s.
Example:
Input string: "alpha/beta/gamma"
Token: "/"
Result:
Returned string -> "gamma"
Modified input -> "alpha/beta"
Matching is case-sensitive.
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("red/green/blue", 0u, a);
    assert_true(r.has_value);
    string_t* s = r.u.value;
    string_expect_t popped = pop_str_token_lit(s, "/", a);
    assert_true(popped.has_value);
    // popped.u.value->str == "blue"
    // s->str == "red/green"
    return_string(popped.u.value);
    return_string(s);
Note
The original string is modified only if the token is found.
The returned string is independent and must be released by the caller.
The search is performed using find_substr_lit in REVERSE mode.
- Parameters:
s – Pointer to the source string_t to modify.
token – Null-terminated C string literal representing the token. Must not be NULL or empty.
allocator – Allocator used to construct the returned string.
- Return values:
has_value == true – A newly allocated string containing the substring to the right of the last token occurrence.
has_value == false – If:
s == NULL
s->str == NULL
token == NULL
token is empty
token is not found in s
allocation fails
- Returns:
A string_expect_t holding either the newly allocated string or an error state, as listed under Return values.
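As a rough illustration of the pop operation described in this entry, the sketch below works on a plain NUL-terminated buffer and uses malloc in place of the allocator argument. The name sketch_pop_token is hypothetical, and a NULL return stands in for the has_value == false case.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of c_string.h): finds the last occurrence
 * of `token` in `s`, returns a malloc'd copy of everything to its right,
 * and truncates `s` at the start of the token. Returns NULL (leaving `s`
 * unchanged) if the token is missing or allocation fails. */
static char *sketch_pop_token(char *s, const char *token)
{
    if (s == NULL || token == NULL || token[0] == '\0') {
        return NULL;
    }
    size_t len = strlen(s), tlen = strlen(token);
    if (tlen > len) {
        return NULL;
    }
    /* Reverse scan: i runs from the last candidate start down to 0. */
    for (size_t i = len - tlen + 1u; i-- > 0u; ) {
        if (memcmp(s + i, token, tlen) != 0) {
            continue;
        }
        size_t tail_len = len - (i + tlen);
        char *out = malloc(tail_len + 1u);
        if (out == NULL) {
            return NULL;                 /* s has not been modified yet */
        }
        memcpy(out, s + i + tlen, tail_len + 1u);   /* copies the NUL too */
        s[i] = '\0';                     /* drop the token and the tail */
        return out;
    }
    return NULL;
}
```

The copy is made before the source is truncated, so a failed allocation leaves the input intact, matching the note that the original string is modified only when the pop succeeds.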
pop_str_token
-
string_expect_t pop_str_token(string_t *s, const string_t *token, allocator_vtable_t allocator)
Pop the substring to the right of the last string_t token occurrence.
Searches for the last (reverse) occurrence of the substring specified by token within the used portion of s.
If found:
The substring strictly to the right of the token is copied into a new string_t using the supplied allocator.
The original string s is truncated at the beginning of the token.
The token itself is removed from s.
Example:
Input string: "one::two::three"
Token: "::"
Result:
Returned string -> "three"
Modified input -> "one::two"
Matching is case-sensitive.
    allocator_vtable_t a = heap_allocator();
    string_expect_t rs = init_string("path/to/file.txt", 0u, a);
    string_expect_t rt = init_string("/", 0u, a);
    if (rs.has_value && rt.has_value) {
        string_t* s = rs.u.value;
        string_t* t = rt.u.value;
        string_expect_t out = pop_str_token(s, t, a);
        assert_true(out.has_value);
        // out.u.value->str == "file.txt"
        // s->str == "path/to"
        return_string(out.u.value);
        return_string(t);
        return_string(s);
    }
Note
The original string is modified only if the token is found.
The returned string must be released by the caller.
The search is performed using find_substr in REVERSE mode.
- Parameters:
s – Pointer to the source string_t to modify.
token – Pointer to a string_t representing the token substring. Must not be NULL, and must have non-zero length.
allocator – Allocator used to construct the returned string.
- Return values:
has_value == true – A newly allocated string containing the substring to the right of the last token occurrence.
has_value == false – If:
s == NULL
s->str == NULL
token == NULL
token->str == NULL
token->len == 0
token is not found in s
allocation fails
- Returns:
A string_expect_t holding either the newly allocated string or an error state, as listed under Return values.
Generic Macros
The generic macros described in this section are only available when
ARENA_USE_CONVENIENCE_MACROS is enabled and the code is not
compiled with NO_FUNCTION_MACROS.
concat_string
-
concat_string(dst, src)
Type-safe generic string concatenation convenience macro.
Dispatches to the correct concatenation routine based on the type of the source argument src using the C11 _Generic selection mechanism.
Supported source types:
const char* → str_concat
char* → str_concat
const string_t* → string_concat
string_t* → string_concat
Any unsupported source type triggers a compile-time error in C11 builds via _concat_string_type_error, ensuring strong type safety without runtime overhead.
This macro is available only when:
ARENA_USE_CONVENIENCE_MACROS is defined, and
NO_FUNCTION_MACROS is not defined (to preserve MISRA-style builds).
    string_expect_t r = init_string("Answer: ", 0, heap_allocator());
    if (r.has_value) {
        string_t* s = r.u.value;
        concat_string(s, "42");
        printf("%s\n", const_string(s)); // "Answer: 42"
        return_string(s);
    }
Note
The destination string’s allocator controls any required buffer growth.
For arena allocators, intermediate buffers may persist until the arena is reset or destroyed.
No runtime type checks are performed; dispatch occurs entirely at compile time.
- Parameters:
dst – Destination string_t instance to be extended.
src – Source data to append (const char* or string_t*).
- Return values:
true – Concatenation succeeded.
false – Concatenation failed (allocation error, overflow, or invalid arguments). This return value originates from the selected function.
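The compile-time dispatch pattern used by the generic macros in this section can be sketched with a small stand-alone example. Everything here (demo_str_t, demo_concat_cstr, demo_concat_str, demo_concat) is hypothetical and exists only to show the shape of a _Generic selection that picks a function by the type of its second argument; it is not the library's macro definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity string used only for this sketch. */
typedef struct {
    char   buf[64];
    size_t len;      /* invariant: len < sizeof buf */
} demo_str_t;

static bool demo_append_bytes(demo_str_t *dst, const char *src, size_t n)
{
    /* Reject when the bytes plus the terminating NUL would not fit. */
    if (dst == NULL || src == NULL || n >= sizeof dst->buf - dst->len) {
        return false;
    }
    memcpy(dst->buf + dst->len, src, n);
    dst->len += n;
    dst->buf[dst->len] = '\0';
    return true;
}

static bool demo_concat_cstr(demo_str_t *dst, const char *src)
{
    return (src != NULL) && demo_append_bytes(dst, src, strlen(src));
}

static bool demo_concat_str(demo_str_t *dst, const demo_str_t *src)
{
    return (src != NULL) && demo_append_bytes(dst, src->buf, src->len);
}

/* _Generic selects a function designator from the static type of `src`;
 * the call itself is applied afterwards, so only the chosen function is
 * ever type-checked against the arguments. */
#define demo_concat(dst, src)                      \
    _Generic((src),                                \
        const char *:       demo_concat_cstr,      \
        char *:             demo_concat_cstr,      \
        const demo_str_t *: demo_concat_str,       \
        demo_str_t *:       demo_concat_str)((dst), (src))
```

A string literal argument decays to char* in the controlling expression, which is why both char* and const char* need their own association. Any other argument type has no matching association and is rejected by the compiler.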
compare_string
-
compare_string(lhs, rhs)
Type-safe generic string comparison convenience macro.
compare_string(lhs, rhs) provides a single comparison interface that selects the correct implementation at compile time using the C11 _Generic operator.
Compile-time dispatch rules:
If rhs is a C string (const char* or char*), this macro expands to: str_compare((const string_t*)lhs, (const char*)rhs)
If rhs is a string object (const string_t* or string_t*), this macro expands to: string_compare((const string_t*)lhs, (const string_t*)rhs)
In other words, the macro performs zero runtime type checks and adds no dispatch overhead; selection happens entirely at compile time.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined (to support MISRA-style builds).
    allocator_vtable_t a = heap_allocator();
    string_expect_t r1 = init_string("alpha", 0u, a);
    string_expect_t r2 = init_string("alphabet", 0u, a);
    if (r1.has_value && r2.has_value) {
        string_t* s1 = r1.u.value;
        string_t* s2 = r2.u.value;
        // Dispatches to str_compare(s1, "alphabet")
        int8_t c1 = compare_string(s1, "alphabet"); // -> -1
        // Dispatches to string_compare(s1, s2)
        int8_t c2 = compare_string(s1, s2); // -> -1
        (void)c1;
        (void)c2;
        return_string(s1);
        return_string(s2);
    }
Note
If rhs is not one of the supported types, this macro triggers a compile-time error in C11 builds via COMPARE_STRING_TYPECHECK_.
Note
When dispatching to str_compare, comparison is bounded by lhs->len.
When dispatching to string_compare, the implementation may use SIMD acceleration internally (depending on build/architecture), but the return semantics remain identical across platforms.
- Parameters:
lhs – Pointer to the left-hand string_t (treated as const string_t*).
rhs – Right-hand operand. Must be one of: const char*, char*, const string_t*, string_t*.
- Return values:
INT8_MIN – Invalid argument / error sentinel (e.g., NULL input).
-1 – lhs is lexicographically less than rhs.
0 – lhs is equal to rhs.
1 – lhs is lexicographically greater than rhs.
- Returns:
int8_t using the semantics of the selected function.
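The four-valued return convention (INT8_MIN, -1, 0, 1) can be illustrated with a byte-wise three-way comparison. This sketch assumes that a string which is a strict prefix of the other compares as less, which agrees with the "alpha" versus "alphabet" example; sketch_compare is a hypothetical name and the library's exact rules are defined by str_compare and string_compare.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of c_string.h): byte-wise three-way
 * comparison normalized to -1 / 0 / 1, with INT8_MIN as the error
 * sentinel for NULL inputs. */
static int8_t sketch_compare(const char *lhs, size_t lhs_len,
                             const char *rhs, size_t rhs_len)
{
    if (lhs == NULL || rhs == NULL) {
        return INT8_MIN;
    }
    size_t n = (lhs_len < rhs_len) ? lhs_len : rhs_len;
    int c = memcmp(lhs, rhs, n);         /* compare the common prefix */
    if (c != 0) {
        return (int8_t)((c < 0) ? -1 : 1);
    }
    if (lhs_len == rhs_len) {
        return 0;
    }
    /* Equal prefix: the shorter string orders first (assumption). */
    return (int8_t)((lhs_len < rhs_len) ? -1 : 1);
}
```

Normalizing the memcmp result matters because memcmp only guarantees the sign of its return value, not its magnitude.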
count_words
-
count_words(s, word, start, end)
Type-safe generic substring occurrence counting convenience macro.
count_words(s, word, start, end) provides a single counting interface that selects the correct implementation at compile time using the C11 _Generic operator.
Compile-time dispatch rules:
If word is a C string (const char* or char*), this macro expands to: word_count_lit((const string_t*)s, (const char*)word, start, end)
If word is a string object (const string_t* or string_t*), this macro expands to: word_count((const string_t*)s, (const string_t*)word, start, end)
In other words, the macro performs zero runtime type checks and adds no dispatch overhead; selection happens entirely at compile time.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined (to support MISRA-style builds).
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("Hello world thisHello is hello again Hello", 45u, a);
    if (!r.has_value) {
        // handle error
    }
    string_t* text = r.u.value;
    // Dispatches to word_count_lit(text, "Hello", NULL, NULL)
    size_t c1 = count_words(text, "Hello", NULL, NULL); // -> 3
    // Dispatches to word_count(text, word_obj, NULL, NULL)
    string_expect_t r2 = init_string("hello", 0u, a);
    if (r2.has_value) {
        string_t* w = r2.u.value;
        size_t c2 = count_words(text, w, NULL, NULL); // -> 1
        return_string(w);
        (void)c2;
    }
    (void)c1;
    return_string(text);
Note
If word is not one of the supported types, this macro triggers a compile-time error in C11 builds via COUNT_WORDS_TYPECHECK_.
Note
Matching is case-sensitive. Occurrences are counted without overlap, as defined by the selected function.
- Parameters:
s – Pointer to the source string_t (treated as const string_t*).
word – Substring to search for. Must be one of: const char*, char*, const string_t*, string_t*.
start – Optional pointer to the beginning of the search region within s->str. If NULL, the search begins at the start of the string.
end – Optional pointer to one-past-the-last byte of the search region. If NULL, the search continues to the end of the used string length.
- Return values:
0 – Returned if the selected implementation considers the arguments invalid (e.g., s == NULL, s->str == NULL, word == NULL, empty word, etc.) or if no matches are found.
- Returns:
size_t count using the semantics of the selected function.
find_substring
-
find_substring(haystack, needle, begin, end, dir)
Type-safe generic substring search convenience macro.
find_substring(haystack, needle, begin, end, dir) selects the correct substring search implementation at compile time using the C11 _Generic operator.
Compile-time dispatch rules:
If needle is a C string (const char* or char*), this macro expands to: find_substr_lit((const string_t*)haystack, (const char*)needle, begin, end, dir)
If needle is a string object (const string_t* or string_t*), this macro expands to: find_substr((const string_t*)haystack, (const string_t*)needle, begin, end, dir)
This macro performs zero runtime type checks and adds no dispatch overhead; selection happens entirely at compile time.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined (to support MISRA-style builds).
Note
If needle is not one of the supported types, this macro triggers a compile-time error in C11 builds via FIND_SUBSTR_TYPECHECK_.
- Parameters:
haystack – Pointer to the source string_t (treated as const string_t*).
needle – Substring to search for. Supported types: const char*, char*, const string_t*, string_t*.
begin – Optional start pointer within haystack->str (or NULL).
end – Optional end pointer within haystack->str (or NULL).
dir – Search direction (implementation-defined by underlying functions).
- Returns:
size_t offset from the beginning of haystack, or SIZE_MAX if not found or if arguments are invalid (per the selected implementation).
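The offset-or-SIZE_MAX convention and the effect of the search direction can be shown with a naive scalar search. The enum demo_direction_t and the function sketch_find are hypothetical; the real direction type is defined by the library and is not shown in this section, and treating an empty needle as "not found" is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical direction flag for this sketch only. */
typedef enum { DEMO_FORWARD, DEMO_REVERSE } demo_direction_t;

/* Hypothetical helper (not part of c_string.h): returns the byte offset
 * of the first (forward) or last (reverse) occurrence of `needle` in
 * hay[0..hlen), or SIZE_MAX if there is none or the input is invalid. */
static size_t sketch_find(const char *hay, size_t hlen,
                          const char *needle, demo_direction_t dir)
{
    if (hay == NULL || needle == NULL) {
        return SIZE_MAX;
    }
    size_t nlen = strlen(needle);
    if (nlen == 0u || nlen > hlen) {
        return SIZE_MAX;
    }
    size_t last = hlen - nlen;           /* last valid start offset */
    for (size_t k = 0u; k <= last; ++k) {
        /* Visit candidates left-to-right or right-to-left. */
        size_t i = (dir == DEMO_FORWARD) ? k : last - k;
        if (memcmp(hay + i, needle, nlen) == 0) {
            return i;
        }
    }
    return SIZE_MAX;
}
```

Because offset 0 is a valid match position, callers must compare the result against SIZE_MAX rather than testing it for zero.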
count_tokens
-
count_tokens(s, delim, begin, end)
Type-safe generic token counting convenience macro.
count_tokens(s, delim, begin, end) selects the appropriate token counting implementation at compile time using the C11 _Generic operator.
Dispatch rules:
If delim is a C string (const char* or char*), expands to: token_count_lit((const string_t*)s, (const char*)delim, begin, end)
If delim is a string object (const string_t* or string_t*), expands to: token_count((const string_t*)s, (const string_t*)delim, begin, end)
No runtime type checks are performed; selection occurs entirely at compile time with zero dispatch overhead.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined
Disabled when NO_FUNCTION_MACROS is defined
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("alpha,beta gamma", 0u, a);
    assert_true(r.has_value);
    string_t* text = r.u.value;
    // Dispatch → token_count_lit
    // Delimiter set: comma and space, so the tokens are "alpha", "beta", "gamma"
    size_t c1 = count_tokens(text, ", ", NULL, NULL); // == 3
    string_expect_t d = init_string(", ", 0u, a);
    assert_true(d.has_value);
    // Dispatch → token_count
    size_t c2 = count_tokens(text, d.u.value, NULL, NULL); // == 3
    return_string(d.u.value);
    return_string(text);
Note
Passing an unsupported delimiter type triggers a compile-time error via TOKEN_COUNT_TYPECHECK_.
- Parameters:
s – Pointer to the source string_t.
delim – Delimiter specification. Must be one of: const char*, char*, const string_t*, string_t*.
begin – Optional pointer to the beginning of the search window.
end – Optional pointer to one-past-the-last byte of the search window.
- Return values:
SIZE_MAX – Invalid argument or out-of-range window.
- Returns:
Number of tokens detected in the specified range.
drop_substring
-
drop_substring(s, needle, begin, end)
Type-safe generic substring removal convenience macro.
drop_substring(s, needle, begin, end) provides a single interface for removing all occurrences of a substring from a string_t within the byte window [begin, end). The correct implementation is selected at compile time using the C11 _Generic operator.
Dispatch rules:
If needle is a C string (const char* or char*), this macro expands to: drop_substr_lit((string_t*)s, (const char*)needle, begin, end)
If needle is a string object (const string_t* or string_t*), this macro expands to: drop_substr((string_t*)s, (const string_t*)needle, begin, end)
In other words, the macro performs zero runtime type checks and adds no dispatch overhead; selection happens entirely at compile time.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined (to support MISRA-style builds).
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("alpha beta beta gamma", 0u, a);
    assert_true(r.has_value);
    string_t* text = r.u.value;
    // Dispatches to drop_substr_lit(text, "beta", NULL, NULL)
    drop_substring(text, "beta", NULL, NULL);
    // text->str == "alpha gamma"
    // Rebuild input for the string_t needle example:
    return_string(text);
    r = init_string("alpha beta beta gamma", 0u, a);
    assert_true(r.has_value);
    text = r.u.value;
    string_expect_t rn = init_string("beta", 0u, a);
    assert_true(rn.has_value);
    string_t* needle = rn.u.value;
    // Dispatches to drop_substr(text, needle, NULL, NULL)
    drop_substring(text, needle, NULL, NULL);
    // text->str == "alpha gamma"
    return_string(needle);
    return_string(text);
Note
If needle is not one of the supported types, this macro triggers a compile-time error in C11 builds via DROP_SUBSTRING_TYPECHECK_.
Note
The behavior (non-overlapping removal, reverse search optimization, optional removal of a single trailing ASCII space after each match, window clamping) is defined by the selected underlying function:
drop_substr
drop_substr_lit
- Parameters:
s – Pointer to the destination string_t to modify.
needle – Substring to remove. Must be one of: const char*, char*, const string_t*, string_t*.
begin – Optional pointer to the first byte of the removal window within s->str. Pass NULL to start at the beginning of the used string.
end – Optional pointer to one-past-the-last byte of the removal window. Pass NULL to end at the used length of the string.
replace_substring
-
replace_substring(s, pattern, replacement, min_ptr, max_ptr)
Type-safe generic substring replacement convenience macro.
replace_substring(s, pattern, replacement, min_ptr, max_ptr) provides a single replacement interface that selects the correct implementation at compile time using the C11 _Generic operator.
Dispatch rules:
If pattern (and replacement) are C strings (const char* or char*), this macro expands to: replace_substr_lit((string_t*)s, (const char*)pattern, (const char*)replacement, (uint8_t*)min_ptr, (uint8_t*)max_ptr)
If pattern (and replacement) are string objects (const string_t* or string_t*), this macro expands to: replace_substr((string_t*)s, (const string_t*)pattern, (const string_t*)replacement, (char*)min_ptr, (char*)max_ptr)
The macro enforces at compile time that:
pattern is one of: const char*, char*, const string_t*, string_t*
replacement is one of the same supported types
pattern and replacement belong to the same category (both literal, or both string objects)
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined (to support MISRA-style builds).
    allocator_vtable_t a = heap_allocator();
    // Literal version
    string_expect_t r = init_string("red green red", 0u, a);
    assert_true(r.has_value);
    string_t* s = r.u.value;
    // Dispatches to replace_substr_lit(...)
    (void)replace_substring(s, "red", "blue", NULL, NULL);
    // s->str == "blue green blue"
    return_string(s);
    // string_t version
    string_expect_t r1 = init_string("one two two", 0u, a);
    string_expect_t r2 = init_string("two", 0u, a);
    string_expect_t r3 = init_string("four", 0u, a);
    if (r1.has_value && r2.has_value && r3.has_value) {
        string_t* t = r1.u.value;
        string_t* pat = r2.u.value;
        string_t* rep = r3.u.value;
        // Dispatches to replace_substr(...)
        (void)replace_substring(t, pat, rep, NULL, NULL);
        // t->str == "one four four"
        return_string(rep);
        return_string(pat);
        return_string(t);
    }
Note
The window [min_ptr, max_ptr) is interpreted as end-exclusive. Both underlying implementations validate that the window lies within the string allocation and clamp to the used length.
- Parameters:
s – Pointer to the destination string_t to modify.
pattern – Substring pattern to search for (literal or string_t).
replacement – Replacement substring (must match the category of pattern).
min_ptr – Optional pointer to the first byte of the replacement window within s->str. Pass NULL to start at the beginning of the used string.
max_ptr – Optional pointer to one-past-the-last byte of the replacement window. Pass NULL to end at the used length of the string.
- Returns:
true/false as returned by the selected underlying function.
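The end-exclusive window with NULL defaults and clamping can be sketched on a plain character buffer. This is an illustration only: demo_resolve_window and demo_count_in_window are hypothetical helpers, not library functions, and the choice to reject a reversed window (min_ptr after max_ptr) is an assumption of the sketch rather than documented library behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Resolve an optional [min_ptr, max_ptr) window against a buffer holding
   `len` used bytes out of `alloc` allocated bytes. Addresses are compared
   as integers so that out-of-range pointers can be rejected safely. */
bool demo_resolve_window(const char* buf, size_t len, size_t alloc,
                         const char* min_ptr, const char* max_ptr,
                         const char** begin, const char** end) {
    uintptr_t base = (uintptr_t)buf;
    uintptr_t lo = (min_ptr != NULL) ? (uintptr_t)min_ptr : base;       /* NULL: start */
    uintptr_t hi = (max_ptr != NULL) ? (uintptr_t)max_ptr : base + len; /* NULL: used length */

    if (lo < base || hi > base + alloc || lo > hi) {
        return false;                       /* outside the allocation, or reversed */
    }
    if (hi > base + len) { hi = base + len; } /* clamp to the used length */
    if (lo > hi) { lo = hi; }

    *begin = buf + (lo - base);
    *end = buf + (hi - base);
    return true;
}

/* Count non-overlapping matches of `pat` lying entirely inside [begin, end). */
size_t demo_count_in_window(const char* begin, const char* end, const char* pat) {
    size_t n = strlen(pat);
    size_t count = 0;
    if (n == 0) { return 0; }
    while ((size_t)(end - begin) >= n) {
        if (memcmp(begin, pat, n) == 0) { count++; begin += n; }
        else { begin++; }
    }
    return count;
}
```

With the buffer "red green red", the window [buf, buf + 3) contains one match of "red", while [buf, buf + 2) contains none, because the byte at max_ptr itself is excluded.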
pop_string_token
pop_string_token(s, token, allocator)
Type-safe generic token pop convenience macro.
pop_string_token(s, token, allocator) selects the correct implementation at compile time using the C11 _Generic operator.
Dispatch rules:
If token is a C string (const char* or char*), dispatch to pop_str_token_lit.
If token is a string object (const string_t* or string_t*), dispatch to pop_str_token.
This macro performs no runtime type checks and adds no dispatch overhead.
Availability:
Enabled only when ARENA_USE_CONVENIENCE_MACROS is defined, and
Disabled when NO_FUNCTION_MACROS is defined.
    allocator_vtable_t a = heap_allocator();
    string_expect_t r = init_string("a/b/c", 0u, a);
    assert_true(r.has_value);
    string_t* s = r.u.value;

    // Dispatches to pop_str_token_lit(s, "/", a)
    string_expect_t out1 = pop_string_token(s, "/", a);
    assert_true(out1.has_value);
    // out1.u.value->str == "c"
    // s->str == "a/b"
    return_string(out1.u.value);

    string_expect_t rt = init_string("/", 0u, a);
    assert_true(rt.has_value);
    string_t* tok = rt.u.value;

    // Dispatches to pop_str_token(s, tok, a)
    string_expect_t out2 = pop_string_token(s, tok, a);
    if (out2.has_value) {
        // Release the popped string so it is not leaked
        return_string(out2.u.value);
    }

    return_string(tok);
    return_string(s);
Note
If token is not a supported type, this macro triggers a compile-time error in C11 builds.
- Parameters:
s – Pointer to the source string_t to modify.
token – Token to search for (literal or string_t). Must be one of: const char*, char*, const string_t*, string_t*.
allocator – Allocator used to construct the returned string.
- Returns:
A string_expect_t as returned by the selected underlying function.
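The pop behavior shown in the example ("a/b/c" becomes "a/b" and yields "c") can be sketched on plain character buffers. This is a minimal illustration inferred from that example, not the library's implementation: demo_pop_token is a hypothetical helper, it writes into a caller-supplied buffer instead of allocating a new string_t, and returning false while leaving the source untouched when the token is absent is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Find the last occurrence of `token` in `s`, copy the text after it into
   `out`, and truncate `s` just before the token. Returns false and leaves
   `s` unchanged if the token is empty or absent, or if `out` is too small. */
bool demo_pop_token(char* s, const char* token, char* out, size_t out_size) {
    size_t tlen = strlen(token);
    size_t slen = strlen(s);
    if (tlen == 0 || tlen > slen) {
        return false;
    }
    /* Scan candidate start positions from right to left. */
    for (size_t i = slen - tlen + 1; i-- > 0; ) {
        if (memcmp(s + i, token, tlen) == 0) {
            size_t tail = slen - (i + tlen);      /* bytes after the token */
            if (tail + 1 > out_size) {
                return false;
            }
            memcpy(out, s + i + tlen, tail + 1);  /* copy tail and terminator */
            s[i] = '\0';                          /* drop token and tail from s */
            return true;
        }
    }
    return false;
}
```

Applied twice to "a/b/c" with the token "/", this sketch yields "c" and then "b", leaving "a" in the source buffer.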