Dynamic data structures

Lists, trees, strings and hash tables in C


Description

This library contains C headers for manipulation of doubly-linked lists, binary trees, binary heaps, hash tables and dynamically allocated strings. There is also a C++ doubly-linked list implementation. Headers defined:

  • <ddslib/bheap.h>
  • <ddslib/btree.h>
  • <ddslib/dllist.h>
  • <ddslib/dllist.hh>
  • <ddslib/htab.h>
  • <ddslib/vstr.h>
  • <ddslib/vwcs.h>

Doubly linked lists

The headers <ddslib/dllist.h> and <ddslib/dllist.hh> provide some macros for managing structures in multiple doubly linked lists. Rather than defining a container type, you just get a macro to declare a structure member to maintain the structure's position in a list. For example, if you want to build two independent lists of struct foo objects:

struct foo {
  dllist_elem(struct foo) by_size, by_date;

  unsigned size;
  time_t date;
};

Another macro defines a list header pointing to first and last members:

typedef dllist_hdr(struct foo) foo_list_t;

foo_list_t size_list = dllist_HDRINIT, date_list = dllist_HDRINIT;

In a non-dynamic context, dllist_HDRINIT may be used to initialize to an empty list. In any context, dllist_init(&size_list) will do the same.

To add an element at the beginning or end of a list, use dllist_prepend or dllist_append, specifying the structure member to use to record the element in the list:

struct foo *elem = malloc(sizeof *elem);
dllist_append(&size_list, by_size, elem);

You can insert before or after another element with dllist_insertbefore or dllist_insertafter:

struct foo *previous; // already exists
struct foo *elem = malloc(sizeof *elem);
dllist_insertafter(&size_list, by_size, previous, elem);

These work even if the reference pointer is null.

The macros do no memory (de-)allocation, only linking, so you can use them with statically allocated elements too.

To iterate, use dllist_first or dllist_last to initialize, and dllist_next or dllist_prev with the name of the member to increment:

for (struct foo *ptr = dllist_first(&size_list);
     ptr != NULL;
     ptr = dllist_next(by_size, ptr)) {
  ...
}

To remove from a list:

dllist_unlink(&size_list, by_size, elem);

There are similar macros to work with closed lists:

typedef dllist_loophdr(struct foo) foo_list_t;

foo_list_t size_list = dllist_LOOPHDRINIT;

dllist_loopinit(&size_list);

dllist_loopinsert(&size_list, by_size, elem);
dllist_loopinsertafter(by_size, elem, previous);
dllist_loopinsertbefore(by_size, elem, next);

dllist_loopunlink(&size_list, by_size, elem);

For closed lists, there are also macros to split and join lists:

dllist_loopsplit(&l1, &l2, by_size, elem1, elem2);

dllist_loopjoin(&l1, &l2, by_size, elem1, elem2);

To split, l2 must be an empty list, and elem1 and elem2 must be members of l1. Everything from elem1 upto but excluding elem2 will be removed from l1, and become the new contents of l2.

To join, elem1 must belong to l1, and elem2 to l2. elem1's previous element becomes the head of the list, and elem2 becomes its next element. elem1 becomes the next element of elem2's previous element. Finally, the two elements swap previous elements, and l2 is left empty.

Hash tables

To create a hash table:

#include <ddslib/htab.h>

size_t hash_func(void *ctxt, htab_const),
int cmp(void *ctxt, htab_const, htab_const),
htab_obj copy_key(void *ctxt, htab_const),
htab_obj copy_value(void *ctxt, htab_const),
void release_key(void *ctxt, htab_obj),
void release_value(void *ctxt, htab_obj);

htab my_table = htab_open(79, ctxt,
                          &hash_fn,
                          &cmp,
                          &copy_key, &copy_value,
                          &release_key, &release_value);

79 is the number of buckets in the table. ctxt is passed as the first argument of the supplied functions. hash_func(ctxt, key) will be called to generate a hash code for a key. cmp(ctxt, key1, key2) is called to compare two keys, and should return 0 if equal.

The other functions are optional. copy_key(ctxt, key) is called to copy a key. copy_value(ctxt, value) is called to copy a value. release_key(ctxt, key) is called to deallocate a key when it is removed from the table. release_value(ctxt, value) is called to deallocate a value when its key is removed from the table.

To insert:

htab_const key = &my_key;
htab_const value = allocate_something();
if (htab_put(my_table, key, value)) {
  // worked
} else {
  // failed
}

To remove an entry, with automatic de-allocation, use either of these:

htab_pop(my_table, key, NULL);
htab_del(my_table, key);

To remove a key and retain its value:

htab_obj res;
if (htab_pop(my_table, key, &res)) {
  // res contains the old value
}

To get the value for a key without removal:

htab_obj res;
htab_get(my_table, key, &res);

To check for a key:

htab_obj res;
htab_tst(my_table, key);

Those four functions return true if the key was found.

To iterate over all entries in the table in an arbitrary order:

htab_apprc my_app(void *ctxt, htab_const key, htab_obj value);

htab_apply(my_table, ctxt, &my_app);

my_app is invoked on each key-value pair. ctxt is passed as the first argument to my_app on each call. The function should return a flag set, which may contain htab_REMOVE to indicate that the entry is to be removed, and htab_STOP to terminate the iteration.

To empty a table:

htab_clear(my_table);

To clear and destroy a table:

htab_close(my_table);

Binary heaps

To create a binary heap of elements of type struct foo, declare it as follows:

struct foo {
  ...
  bheap_elem others;
  ...
};

To initialize a heap:

int my_cmp(void *ctxt, const void *, const void *);

bheap my_heap;
bheap_init(&my_heap, struct foo, others, ctxt, &my_cmp);

The specified comparison function will be used to order elements, with ctxt passed as its first argument.

To add an element to the heap:

struct foo *elem = ...;
bheap_insert(&myheap, elem);

To remove an element from the heap:

struct foo *elem = ...;
bheap_remove(&myheap, elem);

To remove the first element from the heap:

struct foo *elem = bheap_pop(&myheap);

To see the first element without removing it:

struct foo *elem = bheap_peek(&myheap);

In both cases, elem will be NULL if the heap is empty.

Dynamic strings

<ddslib/vstr.h> and <ddslib/vwcs.h> provide dynamically allocated strings, something like std::string and std::wstring from C++. These strings are arrays of characters which can be automatically resized on demand. A string has a capacity in characters, and a length which is never bigger than the capacity. Changes to the contents which do not require the capacity to be changed will avoid memory management.

You must include <stdarg.h> before <ddslib/vstr.h> or <ddslib/vwcs.h> to use the functions that accept a va_list.

For wide-character strings, use the type vwcs and the functions beginning with vwcs_.

Initialization and reset

For a string in an initial null state, which uses no dynamic allocation:

vstr s = vstr_NULL;
void vstr_reset(vstr *);
void vstr_clear(vstr *);
int vstr_empty(vstr *);

vstr_reset(&s) will reset it to the null state, and you should do this to de-allocate before discarding. vstr_clear(&s) sets the length of a non-null string to zero, but safely leaves a null string unchanged. vstr_empty(&s) sets s to a non-null zero-length string, returning zero on success, and negative on failure to allocate memory.

Extraction

char *vstr_get(const vstr *);
size_t vstr_len(const vstr *);
char *vstr_extract(vstr *);

vstr_get(&s) yields the address of the first character in the string, or NULL if it's in the null/reset state. This address becomes invalid if a subsequent call might change the capacity of the string. vstr_extract(&s) resets the string, but also returns the contents as a dynamically allocated array, which you should de-allocate with free. vstr_len(&s) gets the length of a non-null string, but also returns zero for null strings.

Null termination

int vstr_term(vstr *);
int vstr_unterm(vstr *);

vstr_term(&s) ensures the last character is a null (unless the string itself it null), so that strings returned by vstr_get and vstr_extract can be used as regular strings. The function returns zero on success and negative on error. vstr_unterm(&s) will safely remove the last null character if present. It should always return zero.

Optimization

void vstr_compact(vstr *);
void vstr_shorten(vstr *);
int vstr_ensure(vstr *, size_t cap);

vstr_compact(&s) will make the capacity fit the length exactly. vstr_shorten(&s) will reduce an excessive capacity nearer to the length. vstr_ensure(&s, cap) will ensure that the string has space for cap characters without further allocation, returning zero on success and negative on failure.

Assignment

int vstr_set(vstr *p, const char *s);
int vstr_set0(vstr *p, const char *s);
int vstr_setn(vstr *p, const char *s, size_t n);
int vstr_setc(vstr *p, int c, size_t n);
int vstr_setf(vstr *p, const char *fmt, ...);
int vstr_vsetf(vstr *p, const char *fmt, va_list ap);
int vstr_setv(vstr *p, const vstr *q);
int vstr_setvi(vstr *p, const vstr *q, size_t qx);
int vstr_setvn(vstr *p, const vstr *q, size_t qn);
int vstr_setvin(vstr *p, const vstr *q, size_t qx, size_t qn);
int vstr_setvr(vstr *p, const vstr *q, size_t qx);
int vstr_setvrn(vstr *p, const vstr *q, size_t qx, size_t qn);

These functions overwrite the current contents of *p, with contents defined by the remaining arguments. set and set0 accept a null-terminated string, and copy it to *p, the latter also copying the terminating null character. setn copies the contents of an array with a known size. setc sets the string to a sequence of n copies of the character c. setf takes a null-terminated format string at fmt, and processes it and the trailing arguments just as printf would do, writing the result into *p. vsetf works the same, but takes a va_list instead of trailing arguments.

Functions of the setv family take another dynamic string *q, and copy its contents to *p. If *q is a null string, then *p will be also.

qx specifies an initial position within *q, counting from the start for setvi and setvin, or from the end for setvr and setvrn. Other functions implicitly use the start of the string as the initial position.

qn specifies the number of characters to copy, forward from the initial position. Other functions implicitly copy characters until the end of *q.

The functions setf and vsetf return the number of characters written. Other functions return zero on success. All functions return negative on failure.

Insertion

int vstr_insert(vstr *p, size_t x, const char *s);
int vstr_insert0(vstr *p, size_t x, const char *s);
int vstr_insertc(vstr *p, size_t x, int, size_t);
int vstr_insertn(vstr *p, size_t x, const char *, size_t);
int vstr_insertf(vstr *p, size_t x, const char *fmt, ...);
int vstr_vinsertf(vstr *p, size_t x, const char *fmt, va_list ap);
int vstr_insertv(vstr *p, size_t x, const vstr *q);
int vstr_insertvi(vstr *p, size_t x, const vstr *q, size_t qx);
int vstr_insertvn(vstr *p, size_t x, const vstr *q, size_t qn);
int vstr_insertvin(vstr *p, size_t x,
                   const vstr *q, size_t qx, size_t qn);
int vstr_insertvr(vstr *p, size_t x, const vstr *q, size_t qx);
int vstr_insertvrn(vstr *p, size_t x,
                   const vstr *q, size_t qx, size_t qn);

These functions insert characters into *p at position x from the start. The remaining arguments of each function specify what characters to insert, and take the same form as those of the corresponding set function.

Appending

int vstr_append(vstr *p, const char *s);
int vstr_append0(vstr *p, const char *s);
int vstr_appendn(vstr *p, const char *s, size_t n);
int vstr_appendc(vstr *p, int c, size_t n);
int vstr_appendf(vstr *p, const char *fmt, ...);
int vstr_vappendf(vstr *p, const char *fmt, va_list ap);
int vstr_appendv(vstr *p, const vstr *q);
int vstr_appendvi(vstr *p, const vstr *q, size_t qx);
int vstr_appendvn(vstr *p, const vstr *q, size_t qn);
int vstr_appendvin(vstr *p, const vstr *q, size_t qx, size_t qn);
int vstr_appendvr(vstr *p, const vstr *q, size_t qx);
int vstr_appendvrn(vstr *p, const vstr *q, size_t qx, size_t qn);

These functions forego the insertion index, and always insert at the end.

Writing wide-character strings into multibyte-character strings

Suppose you want to use one of the varargs functions to write a wide-character string into a normal string. It is not null-terminated, but you do have the length:

const wchar_t *msg;
size_t msglen;

You might expect this to work robustly:

vstr_setf(&s, "Message was: %.*ls", (int) len, msg);

However, the length parameter that you must provide is a measure of how many bytes to write, not how many characters to read. You need something to separately count the number of bytes generated from the wide characters:

int vstr_wcsmblen(const wchar_t *s, size_t len);

vstr_setf(&s, "Message was: %.*ls", vstr_wcsmblen(msg, len), msg);

Deletion

void vstr_elide(vstr *p, size_t x, size_t n);
void vstr_relide(vstr *p, size_t x, size_t n);
void vstr_elider(vstr *p, size_t x, size_t n);
void vstr_relider(vstr *p, size_t x, size_t n);
void vstr_elect(vstr *p, size_t x, size_t n);
void vstr_relect(vstr *p, size_t x, size_t n);
void vstr_electr(vstr *p, size_t x, size_t n);
void vstr_relectr(vstr *p, size_t x, size_t n);

vstr_elect(&str, x, n) retains n characters starting at x, deleting the others before and after.

relect and relectr count the initial position backwards from the end of the string. electr and relect count n characters towards the start.

vstr_elide(&str, x, n) removes n characters at position x. All elide functions work as their elect counterparts, but remove the selected region, rather than removing the surrounding regions.

void vstr_truncate(vstr *p, size_t n);
void vstr_rtruncate(vstr *p, size_t n);
void vstr_neck(vstr *p, size_t n);
void vstr_rneck(vstr *p, size_t n);

vstr_truncate(&str, n) retains the first n characters. vstr_rtruncate(&str, n) removes the last n characters. vstr_neck(&str, n) removes the first n characters. vstr_rneck(&str, n) retains the last n characters.


Files

File Size Last modified Description Requirements and recommendations
Source (SVN) GNU Make ISO C99 Binodeps