Vstr documentation -- overview

About

This is a string library designed so that you can work optimally with readv()/writev() for input/output. This means that, for instance, you can readv() data to the end of the string and writev() data from the beginning of the string without having to allocate or move memory. It also means that the library is completely happy with data that has multiple zero bytes in it.

This design constraint means that, unlike most string libraries, Vstr doesn't have an internal representation of the string where everything can be accessed from a single (char *) pointer in C. Instead, the internal representation is of multiple "blocks" or nodes, each carrying some of the data for the string. This model of representing the data also means that as a string gets bigger the Vstr memory usage only goes up linearly and has no inherent copying (other string libraries typically increase space for the string via realloc(), so their memory usage can almost double and a complete copy of the string can be required).
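To make the block model concrete, here is a minimal sketch of writing a chain of blocks with a single writev() call. It uses plain POSIX only; `struct block`, `blocks_to_iovec()` and `blocks_write()` are hypothetical names invented for this illustration and are not Vstr's internal layout or API.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical block type, for illustration only (not Vstr's node layout). */
struct block {
    const char *data;    /* bytes held by this block, may contain zero bytes */
    size_t len;          /* number of bytes in data */
    struct block *next;  /* next block in the string, or NULL */
};

#define BLOCKS_MAX_IOV 16

/* Point one iovec entry at each block; no string data is copied.
 * Returns the number of entries filled in (at most max). */
static int blocks_to_iovec(const struct block *b, struct iovec *iov, int max)
{
    int n = 0;

    assert(iov != NULL && max > 0);
    for (; b != NULL && n < max; b = b->next) {
        iov[n].iov_base = (void *)b->data;
        iov[n].iov_len = b->len;
        n++;
    }
    return n;
}

/* Hand the chain to the kernel with one writev() call.
 * Returns what writev() returns: bytes written, or -1 on error. */
static ssize_t blocks_write(int fd, const struct block *head)
{
    struct iovec iov[BLOCKS_MAX_IOV];
    int n = blocks_to_iovec(head, iov, BLOCKS_MAX_IOV);

    return writev(fd, iov, n);
}
```

The blocks are never joined into one contiguous buffer, and because each block carries an explicit length, embedded zero bytes need no special handling.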

It also means that adding, substituting or moving data anywhere in the string can be optimised to require O(1) copying instead of O(n). Speaking of O(1), it's worth remembering that if you have a Vstr string with caching enabled a writev() call will take constant time as well (the cat example below demonstrates this: the write call is always constant time).
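The O(1) claim is easiest to see with a linked chain of nodes. The sketch below is an assumption about the general technique, not Vstr's actual code: `struct node`, `node_insert_after()` and `node_total_len()` are hypothetical names, and it only covers the simple case of inserting at a node boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type, for illustration only. */
struct node {
    const char *data;   /* bytes held by this node */
    size_t len;         /* number of bytes in data */
    struct node *next;  /* next node in the string, or NULL */
};

/* Link `fresh` in directly after `pos`. Only two pointers change;
 * none of the bytes already in the string are copied or moved. */
static void node_insert_after(struct node *pos, struct node *fresh)
{
    assert(pos != NULL && fresh != NULL);
    fresh->next = pos->next;
    pos->next = fresh;
}

/* Total length of the string, found by walking the chain. */
static size_t node_total_len(const struct node *n)
{
    size_t total = 0;

    for (; n != NULL; n = n->next)
        total += n->len;
    return total;
}
```

With a single contiguous buffer the same insertion would mean shifting every byte after the insertion point, which is where the O(n) cost comes from.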

As well as having features directly related to doing IO well, it contains functions for:
  • a printf-like function that is fully ISO C 9899:1999 (C99) compliant, also having %m as standard and POSIX i18n parameter number modifiers. It also allows gcc warning compatible custom format specifiers (and includes pre-written custom format specifiers for ipv4 and ipv6 addresses, Vstr strings and more)
  • splitting of strings into parameter/record chunks (a la perl).
  • substituting data in a Vstr string
  • moving data from one Vstr string to another (or within a Vstr string).
  • comparing strings (without regard for case, or taking into account version information)
  • searching for data in strings (with or without regard for case).
  • counting spans of data in a string (the equivalent of strspn() in ISO C).
  • converting data in a Vstr (i.e. deleting/substituting unprintable characters or making a Vstr string lowercase/uppercase).
  • parsing data from a Vstr string (i.e. numbers, or ipv4 addresses).
  • easily parsing and wrapping outgoing data in netstrings, for fast and simple (and hence less error prone) network communication
  • the ability to cache aspects of data about a Vstr string, to both simplify and speed up use of the string.
It also has a number of functions for exporting data from a Vstr string so you can easily use data generated with the Vstr outside of the library.
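For readers who haven't met netstrings, the wire format is simply the payload length in decimal, a colon, the payload bytes and a trailing comma. The following is a minimal sketch of the encoding in plain ISO C; `netstring_encode()` is a hypothetical helper written for this illustration and is not one of the Vstr functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encode len bytes of data as "<len>:<data>," into out.
 * Returns the number of bytes written, or 0 if out is too small.
 * The output is not NUL terminated: a netstring carries its own length. */
static size_t netstring_encode(char *out, size_t out_size,
                               const void *data, size_t len)
{
    int hdr;

    assert(out != NULL && (data != NULL || len == 0));

    /* Write the "<len>:" header; snprintf reports the length it needs. */
    hdr = snprintf(out, out_size, "%zu:", len);
    if (hdr < 0 || (size_t)hdr + len + 1 > out_size)
        return 0;

    if (len > 0)
        memcpy(out + hdr, data, len);  /* payload may contain zero bytes */
    out[(size_t)hdr + len] = ',';      /* trailing delimiter */
    return (size_t)hdr + len + 1;
}
```

Encoding the five bytes of "hello" this way produces the eight bytes `5:hello,`. Because the reader knows the length up front, it never has to scan the payload for a terminator, which is what makes the format simple to parse.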

The other unusual aspect of the Vstr string library is that it attaches a notion of a locale to the string configuration and not globally (as POSIX, and pretty much everything else, does). This means that you can do network I/O in the C locale and user I/O in the user's locale.

A last point that shouldn't be unusual, but is: the Vstr string library comes with a "make check" test suite with over four thousand lines of code in it, so although you can never say something is bug free you have an assurance that most things will work as advertised.

For a comparison with other string libraries and printf() like implementations, see this page.

A simple introduction to the API

At first glance the vstr API looks huge, as there are over 140 functions. However, the API was designed so that you can mentally build functions from a template in your head ... so instead of having to remember 140 functions you just need to remember 10 to 20 pieces of a template.

All vstr functions obey one of the following template rules...

"vstr_" <verb>
"vstr_" <verb> <noun>
"vstr_" <verb> <noun> <verb>

...a good example is searching for data in a vstr. Here is a list of the functions that you can use to search for data in a vstr...

vstr_srch_chr_fwd()
vstr_srch_chr_rev()
vstr_srch_buf_fwd()
vstr_srch_buf_rev()
vstr_srch_chrs_fwd()
vstr_srch_chrs_rev()
vstr_srch_vstr_fwd()
vstr_srch_vstr_rev()
vstr_srch_case_chr_fwd()
vstr_srch_case_chr_rev()
vstr_srch_case_buf_fwd()
vstr_srch_case_buf_rev()
vstr_srch_case_vstr_fwd()
vstr_srch_case_vstr_rev()
vstr_csrch_chrs_fwd()
vstr_csrch_chrs_rev()
vstr_spn_chrs_fwd()
vstr_spn_chrs_rev()
vstr_cspn_chrs_fwd()
vstr_cspn_chrs_rev()

...which is a lot of functions (and that doesn't even include the CSTR macro function variants) just to search for some data. However that can be broken up into...
"vstr_"   <verb>      <noun>   <verb>
"vstr_"   srch        chr      fwd
          srch_case   buf      rev
          csrch       vstr
          spn         chrs
          cspn

...which is much less information to remember.
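As a small illustration of the template idea, the sketch below builds a function name from its three pieces. `compose_name()` is a hypothetical helper for this page only; note that the pieces are a memory aid, and not every combination of them names a function that exists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "vstr_<verb>_<noun>_<verb>" from its pieces.
 * Returns 1 if the whole name fitted in out, 0 if it was truncated. */
static int compose_name(char *out, size_t out_size,
                        const char *verb, const char *noun, const char *dir)
{
    int n;

    assert(out != NULL && verb != NULL && noun != NULL && dir != NULL);
    n = snprintf(out, out_size, "vstr_%s_%s_%s", verb, noun, dir);
    return n >= 0 && (size_t)n < out_size;
}
```

For example, the pieces "srch_case", "buf" and "rev" give vstr_srch_case_buf_rev, which appears in the list above.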

Reference Documentation

All functions are documented in functions.html, all constants are in constants.html, all public members of structures are in structs.html and the namespace rules are in namespace.html. Note that if you install the library the functions reference is available as the vstr man page.

A simple and heavily commented example

To get a rough overview of how to use the library you can see heavily commented examples of a simple version of a Unix cat program or a simple version of a hostname lookup program.

All the example programs are listed from the directory HERE and for the truly adventurous the "make check" test suite root is HERE (note that the test suite is written to try and break the Vstr string library, so although it uses all of the APIs it may not be code you want to copy and paste into your programs/libraries).


James Antill
Last modified: Thu Nov 14 01:33:58 EST 2002