This page is a rough comparison between the Vstr string library and other string libraries that I've seen. This may be somewhat biased, due to the fact that I wrote the Vstr library ... then again it is better :).
These are libraries that provide a string API.
Obviously this is in C++, and not C. This library tries to do what
Vstr does, as well as it does it, however it is "compatible" with the
normal C++ string API and so is constrained in that way. It also
doesn't really seem to have a
Also it doesn't seem possible to act on a substring of a Rope directly; you can only make a copy of a substring using the substr call. Vstr doesn't have a true distinction between the entire Vstr and a portion thereof.
This is probably the most used C string library, and comes with the glib utility library. It works on a simple start pointer and length model. This makes it much more memory efficient for small strings. This model also makes it pretty much impossible to do IO into the strings or share data between strings, and it kills performance on substitutions. Failure to allocate memory calls abort().
Also note that the printf implementation just calls the host implementation of sprintf/asprintf/etc. directly. So it's impossible to have custom formatters (the most annoying fallout of this is that you can't add a GString to a GString as part of a printf() call) and portability is a problem (currently [2002-09-02] it doesn't even tell you that you'll crash your program in certain cases, and loses data in certain cases when you add a '\0' byte to the string -- I've submitted a patch for the last problem).
There is no substitution API in glib, probably because you can't share data so you just do a memcpy() and an overwrite. It's also worth noting that your program may crash if you try and add data in a GString to the GString itself.
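The self-append problem is easy to reproduce with any start pointer and length string, so here is a minimal sketch of it in plain C. The `struct buf` type and `buf_append()` are hypothetical names made up for illustration; this is not the GString layout or the glib code, just one way to make the operation safe by remembering the source offset before the buffer can move.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pointer-and-length string; not the GString layout. */
struct buf {
    char  *ptr;  /* start of the data, kept '\0' terminated */
    size_t len;  /* bytes in use, not counting the terminator */
};

/* Append n bytes from src. Returns 0 on success, -1 if realloc() fails.
 * src may point inside b->ptr: its offset is saved first, because
 * realloc() may move the block and leave the old src pointer dangling. */
static int buf_append(struct buf *b, const char *src, size_t n)
{
    uintptr_t lo = (uintptr_t)b->ptr;
    uintptr_t p  = (uintptr_t)src;
    int aliased  = 0;
    size_t off   = 0;
    char *tmp;

    if (b->ptr && p >= lo && p - lo <= b->len) {
        aliased = 1;
        off = (size_t)(p - lo);
        assert(n <= b->len - off);  /* source must lie within the string */
    }

    tmp = realloc(b->ptr, b->len + n + 1);
    if (!tmp)
        return -1;                  /* report failure instead of abort() */
    b->ptr = tmp;

    if (aliased)
        src = b->ptr + off;         /* re-derive src from the new block */

    memmove(b->ptr + b->len, src, n);
    b->len += n;
    b->ptr[b->len] = '\0';
    return 0;
}
```

Without the offset fix-up, appending a string to itself reads from freed memory whenever realloc() relocates the data, which is the kind of crash described above.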
Note that the equivalents of vstr_split_chrs() and other such functions are in glib as part of the C string helper functions, rather than as part of the GString API (e.g. g_strsplit() in the case of vstr_split_chrs()). There is also limited support in glib for doing things in ASCII regardless of the current user locale.
This library is a slightly saner version of the ISO C str* functions, with a few extensions. It works with (char *) as the native type, and doesn't do automatic allocation ... so buffer overflows are still a concern.
The printf implementation is internal and based on the Apache snprintf() function. '\'' (thousand modifiers), 'a', 'F', 'Lf', 'lld', 'td', 'zd', 'hhd', etc. and i18n format parameter modifiers are all completely missing. Unspecified precision is broken, as are corner cases for octal etc., and infinity/nan output is not correct with regard to case. Buffer overflows are possible in the integer formatting paths. You can have custom modifiers, but only triggered on the system '%' character ... so gcc will currently spam warnings. It also looks like the ISO C std. is completely ignored for certain corner cases. Also note that because the strings cannot be resized by the library, the printf implementation uses an snprintf() interface; this means that data can be lost using the interface if the programmer isn't careful.
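The data loss with an snprintf() style interface can be sketched using the host C99 snprintf() (not this library's own function). `format_id()` is a hypothetical helper; the only point is that the return value has to be compared against the buffer size, otherwise truncation goes unnoticed.

```c
#include <assert.h>
#include <stdio.h>

/* Format "id=<n>" into dst.
 * Returns 1 when the whole text fitted, 0 when it was truncated
 * (or the conversion failed). */
static int format_id(char *dst, size_t cap, long n)
{
    int need;

    assert(dst != NULL && cap > 0);

    /* C99 snprintf() returns the length the full output would have had,
     * not the number of bytes actually stored. */
    need = snprintf(dst, cap, "id=%ld", n);
    return need >= 0 && (size_t)need < cap;
}
```

With a 6 byte buffer and the value 123456 the buffer ends up holding "id=12" and the function returns 0; a caller that ignores the result silently loses the rest.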
This library works with (char *) as the base type, although all allocator functions are also passed a "pool" that the (char *) comes from, and so resizing can be done by the library.
There is no printf like function. However it does declare a "vector" type that is roughly equivalent to a Vstr_sects type however it only contains a "ptr" to the data ... so doing a split on a string involves at least doing a memdup() (it actually does a strndup()).
Major namespace corruption.
This is currently just a C++ wrapper class around a (char *), with a couple of extra functions not available in std. C. It does do automatic resizing. It also looks like the implementation could be changed to fix a lot of problems with this (e.g. the length is calculated using strlen(), so embedded '\0' characters corrupt data) without affecting source compatibility. The printf like function calls the host implementation.
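To make the strlen() problem concrete, here is a small sketch in plain C (the two function names are hypothetical, and this is not the wrapper class's code). One copy rediscovers the length with strlen(), the other carries the length alongside the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, finding the length with strlen().
 * Everything after the first '\0' byte is silently dropped. */
static size_t copy_by_strlen(char *dst, const char *src)
{
    size_t n;

    assert(dst && src);
    n = strlen(src);
    memcpy(dst, src, n);
    return n;
}

/* Copy src into dst using an explicitly tracked length,
 * so embedded '\0' bytes are ordinary data. */
static size_t copy_by_length(char *dst, const char *src, size_t len)
{
    assert(dst && src);
    memcpy(dst, src, len);
    return len;
}
```

For the five bytes 'a', 'b', '\0', 'c', 'd' the first function copies two bytes and the second copies all five.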
This is a bunch of add on functions to the std. C string functions, inspired by PHP.
This library doesn't allow "binary" (!isgraph && !isspace -- so it depends on the global locale) characters. The printf like function calls the host implementation. Has a large API for just add, delete and substitute.
Has a start pointer and length model, but it doesn't allocate/reallocate memory itself. So the user has to do half the work of the library to pre-allocate the right amount of data before using the functions. The printf like function has almost no resemblance to the ISO printf function, which is bound to confuse the user. It provides a function to do a read, which is nice. Also has a simple "parse configuration file" function.
This is a C++ library designed to be compatible with the string library that comes with the Microsoft Foundation Classes. Each character in the string is actually a class itself. This could probably be fixed without changing source compatibility.
This library is a slightly saner version of the ISO C str* functions, with a few extensions. It works with (char *) as the native type, and doesn't do automatic allocation ... but almost all functions are size limited so buffer overflows aren't a big concern, but losing data is. The library has some vector functions, but it uses (char **) as the vector type and alters the original string in its version of split.
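A split that fills a (char **) vector and alters the original string typically looks something like the sketch below. `split_in_place()` is a hypothetical function written for illustration, not this library's code; it overwrites each separator with '\0' in the same way strtok() does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split s on sep, storing up to max piece pointers in vec.
 * Each separator found is overwritten with '\0', so the caller's
 * original string is destroyed. When vec fills up, the last slot
 * keeps the unsplit remainder. Returns the number of pieces. */
static size_t split_in_place(char *s, char sep, char **vec, size_t max)
{
    size_t n = 0;

    assert(s && vec && sep != '\0');

    while (n < max) {
        char *end;

        vec[n++] = s;
        if (n == max)
            break;
        end = strchr(s, sep);
        if (!end)
            break;
        *end = '\0';   /* this write is what alters the input */
        s = end + 1;
    }
    return n;
}
```

After splitting "a:b:c" on ':' the vector points at "a", "b" and "c", but the input itself now reads as just "a".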
An interesting library, it uses an opaque type for the string which is suitably non simplistic internally to allow quite a few optimisations. The function names all obey the 6 character C89 identifier limit (a limitation Vstr completely ignores so as to be more consistent, and hopefully more readable). It's not obvious if it would use more or less memory than Vstr ... my guess is more, but then I'm biased.
It uses (void *) in most places and takes either a C string or an (sz *) [the internal opaque type]. It distinguishes between these by a 2 character magic constant, so if you try and use a C string with that constant life becomes interesting. There is no printf like function.
This has a fairly well abstracted namespace, esp. considering it is
only bundled with the
It is somewhat amusing that even though this isn't a string library, it is much better than most of the other string libraries here.
This is the string implementation that comes with the Boehm Garbage Collector (and so is included in gcc etc.). It pretty much requires a GC as you can't "alter" a string, only make a new string with the alterations in it. Sharing data is a main point of this implementation, however again it isn't possible to do things directly on a "substring"; you must first create that substring as a first class string.
Although the basic APIs are there, add/del/sub/etc., there are few added functions to help you deal with the strings (although it does provide something equivalent to vstr_sc_read_len_file(), but it uses stdio and there isn't any good way to deal with IO errors).
Also note that the printf implementation just calls the host implementation of sprintf/asprintf/etc. directly for anything that isn't one of the 's', 'r', 'c' or 'n' format specifiers. The custom format specifier of 'r' is the only one possible and will make gcc barf warnings if you use it. It also doesn't allow i18n argument number specifiers. This is an "old" implementation though, with the last copyright from 1994 so some of these problems probably stem from that.
This is a reimplementation of the functions in qmail, but under a GPL license. It does have a stralloc set of calls that operate on a start pointer and length model; they do dynamically reallocate memory and pass memory failure back to the caller. As with all DJB code though, the API is written as a set of small atomic operations. For instance printf like functionality is implemented over 12 different functions named fmt_* (which don't check for overflows, but some are also reimplemented as stralloc functions). This design makes using the API much more clumsy, for a minor speedup, makes doing i18n almost impossible and goes directly against "premature optimization is bad".
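To show what "small atomic operations" means in practice, here is a sketch of a formatter written in that style. `sketch_fmt_ulong()` is a hypothetical stand-in, not the library's source, and I'm assuming the usual convention that passing a NULL destination only counts the bytes needed.

```c
#include <stddef.h>

/* Write the decimal digits of n to dst, with no '\0' terminator,
 * and return the number of bytes written. With dst == NULL nothing
 * is written and only the length is returned.
 * There is no bounds check: the caller must size the buffer. */
static size_t sketch_fmt_ulong(char *dst, unsigned long n)
{
    size_t len = 1;
    size_t i;
    unsigned long tmp = n;

    while (tmp > 9) {       /* count the digits */
        ++len;
        tmp /= 10;
    }
    if (dst) {
        for (i = len; i > 0; --i) {   /* fill from the last digit back */
            dst[i - 1] = (char)('0' + n % 10);
            n /= 10;
        }
    }
    return len;
}
```

Building even "n=1234" this way takes several calls plus manual position tracking and termination, where a printf like call would need one line, and nothing stops the caller from writing past the end of the buffer.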
It's worth noting that although the stralloc functions deal with dynamic memory, a lot of the other functions ignore bounds checking and/or assume things are terminated with a '\0' character. Even more so than the DJB functions, although this is probably just bad implementation rather than deliberate -- but then why would you think you can write good code with an interface when the implementor can't (for instance the scan_long and scan_8long functions are almost completely broken in libowfat ... but fine in qmail).
This is a set of functions used in the
It is up to the user of the library whether you have a fixed or dynamically sized buffer, and I think you can return failure if memory isn't available, but the vstream.c and vstring.c implementations just assume this can't happen. The functions to act on the buffer are just copied APIs of ISO C (FILE *) manipulators, str* and mem* (with the addition of memcat()). Importantly there are no interfaces for removing data or substituting data in the string (you could probably do remove from the end of the string easily by playing with the pointers and counters, but you'd have to write your own function for it). There is no way to access anything but the entire string, using the API, or add data anywhere but the end of the string.
There is an interface for using netstrings, but instead of the simple begin and end semantics in Vstr the interface overloads the string interface ... so you have to say netstring_memcpy( ... ) which will copy data and encapsulate it as a netstring. It's also worth noting that the counters are of type "int", and the negative bit is used in the code ... so it's not possible to have a string bigger than INT_MAX.
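For reference, a netstring is just the decimal length, a ':', the raw bytes and a trailing ','. The sketch below encodes one into a fixed buffer; `netstring_encode()` is a hypothetical helper using the host C99 snprintf(), and is neither this library's interface nor the Vstr one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encode len bytes of data as "<len>:<data>," into dst.
 * A '\0' is added after the ',' for convenience.
 * Returns the encoded length (not counting the '\0'),
 * or 0 if dst is too small. */
static size_t netstring_encode(char *dst, size_t cap,
                               const void *data, size_t len)
{
    int hdr;
    size_t room;

    assert(dst && data && cap > 0);

    hdr = snprintf(dst, cap, "%zu:", len);
    if (hdr < 0 || (size_t)hdr >= cap)
        return 0;

    room = cap - (size_t)hdr;        /* space left after the header */
    if (room < 2 || len > room - 2)  /* need data + ',' + '\0' */
        return 0;

    memcpy(dst + hdr, data, len);
    dst[(size_t)hdr + len] = ',';
    dst[(size_t)hdr + len + 1] = '\0';
    return (size_t)hdr + len + 1;
}
```

Encoding the five bytes of "hello" gives "5:hello,". Note the lengths here are size_t, so this sketch doesn't share the INT_MAX limit mentioned above.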
The printf like functions are implemented by parsing the format string and then passing known good formats through to the host implementation sprintf() (after requesting enough space to hold them). It doesn't accept long long or long double types, i18n argument number specifiers or thousand separator modifiers.
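The "request enough space, then call the host" idea can be sketched without any format parsing on a C99 host, by asking vsnprintf() for the length first. Note this is a different technique from the one described above (which parses the format itself), and `format_alloc()` is a hypothetical name.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly malloc()ed string of exactly the right size.
 * Returns NULL on a conversion error or allocation failure.
 * The caller free()s the result. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    int need;
    char *out;

    assert(fmt);

    va_start(ap, fmt);
    va_copy(ap2, ap);   /* the arguments get walked twice */

    /* Pass 1: a NULL buffer of size 0 just measures the output. */
    need = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (need < 0) {
        va_end(ap2);
        return NULL;
    }

    /* Pass 2: format for real into exactly need + 1 bytes. */
    out = malloc((size_t)need + 1);
    if (out)
        vsnprintf(out, (size_t)need + 1, fmt, ap2);
    va_end(ap2);
    return out;
}
```

The measuring pass relies on C99 behaviour, which older hosts didn't all provide, and that is presumably one reason to parse the format by hand instead.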
Again, C++ ... uses a pointer and length model but allows reference counting on entire QString objects. This means that an assignment of an entire string from a to b will share most of the storage, but a substring or altering any part of either object will nullify all sharing. The printf like function has an internal implementation for parsing the format string (which doesn't allow i18n argument number specifiers -- or even l ll h hh size modifiers), but it also calls out to the host sprintf() implementation for numbers, pointers and doubles.
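The sharing behaviour described here (whole-string assignment shares storage, any alteration ends the sharing) is ordinary reference counting with copy on write. Below is a minimal C sketch of that idea; all the names are hypothetical and it says nothing about how QString is actually implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared storage: one reference count for the whole string. */
struct rc_data {
    size_t refs;
    size_t len;
    char   bytes[];   /* C99 flexible array member, '\0' terminated */
};

struct rc_str {
    struct rc_data *d;
};

static int rc_init(struct rc_str *s, const char *text)
{
    size_t len = strlen(text);

    s->d = malloc(sizeof *s->d + len + 1);
    if (!s->d)
        return -1;
    s->d->refs = 1;
    s->d->len = len;
    memcpy(s->d->bytes, text, len + 1);
    return 0;
}

/* Whole-string assignment into an uninitialised dst: shares storage. */
static void rc_assign(struct rc_str *dst, const struct rc_str *src)
{
    dst->d = src->d;
    dst->d->refs++;
}

static void rc_free(struct rc_str *s)
{
    if (s->d && --s->d->refs == 0)
        free(s->d);
    s->d = NULL;
}

/* Any alteration first takes a private copy if the data is shared. */
static int rc_set_char(struct rc_str *s, size_t pos, char c)
{
    assert(pos < s->d->len);

    if (s->d->refs > 1) {
        struct rc_data *copy = malloc(sizeof *copy + s->d->len + 1);

        if (!copy)
            return -1;
        copy->refs = 1;
        copy->len = s->d->len;
        memcpy(copy->bytes, s->d->bytes, s->d->len + 1);
        s->d->refs--;
        s->d = copy;
    }
    s->d->bytes[pos] = c;
    return 0;
}
```

Because the count covers the entire storage block, changing a single byte copies the whole string, and in this layout there is no way to share just a portion of it.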
C++ ... uses a pointer and length model but allows reference counting of the DwString objects. This means that an assignment of an entire string, or a substring, from a to b will share most of the storage, but altering any part of either object will nullify all sharing. Apart from that it is much like QString in the QT library.
Has a start pointer and length model, however it does grow the strings itself ... and calls abort() if the allocation fails. However, note that although it includes a bad snprintf() implementation (see below) it doesn't have a sprintf() call to write into the "spif string". typedefs and macros appear to be used just to make the code less readable. It has an "interesting" set of APIs, mainly due to the overhead of adding data to a string or getting a substring. For instance you can "splice" part of a string and another string, but you can't substitute data inside a string without copying it multiple times.
The fact that almost all searching/comparing APIs map onto C library APIs means that they silently fail on embedded NIL characters -- even though there are APIs to initialise strings from a file descriptor.
It has terrible abuse of the namespace outside of the string.c file, however it looks like you could separate the string.c code out without a lot of work -- at which point the namespace is well contained (but it also is built assuming that you'll be using the "spif" object model -- however this doesn't seem to be a requirement).
The printf() like function is a version of Patrick Powell's snprintf() with a really bad version of a floating point formatter added to it (see below for the Samba version, which is a slightly better version of Patrick's code with floating point).
These are libraries that provide a printf() like function, note that if the library provides more than just a printf() like it will be in the "string API" section above and the printf() like function will be compared as part of the overall string API.
Not a bad implementation for what is actually done (it seems to be fairly stds. compliant, and has optional platform specific behaviour; it seems to have been written by reading man pages from what I can see -- no small feat), however there is a lot of missing functionality (most of which it readily admits to on the web page). Probably the most noticeable of these is the missing 'zu', 'zd', 'ju', 'jd', 'td', 'hhd', etc. formats (it is hard to write a portable program without these). Next is the missing i18n format modifiers, and the 'n' format.
No attempt is made to provide user specified formats.
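As a reminder of why the 'z', 't' and 'j' length modifiers matter for portability, here is a small sketch using the host C99 snprintf(). `format_sizes()` is a hypothetical helper; without these modifiers you have to guess at a cast for each type on each platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a size_t, a ptrdiff_t and an intmax_t without any casts.
 * Returns what snprintf() returns. */
static int format_sizes(char *dst, size_t cap,
                        size_t count, ptrdiff_t delta, intmax_t big)
{
    assert(dst && cap > 0);

    /* 'z' matches size_t, 't' matches ptrdiff_t, 'j' matches intmax_t. */
    return snprintf(dst, cap, "%zu %td %jd", count, delta, big);
}
```

Called with 3, -2 and 10 this produces "3 -2 10" on a compliant host.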
The printf() like function is not stds. compliant in a number of ways, however it does try to implement most of the features of the std. Also worth noting is that the i18n format parameter modifiers have an arbitrary limit, and the double output is custom code that may well not produce the same output as a real implementation and as an I/O library the output model isn't very good for a dynamically expanding string.
User specified formats are supported, however they are done in such a way that the gcc warning parser makes them useless.
This code was originally posted to BugTraq, and has since been hacked on by at least four more people. This is the version in Samba, however there are other versions of this code floating around as it was a favourite for people to just copy and paste into their project.
This is very poor for stds. compliance; it doesn't even support '#x' to print a leading "0x" before the number. '\'' (thousand modifiers), 'A', 'a', 'F', 'zd', 'td', 'hhd', etc. and i18n format parameter modifiers are all completely missing. An empty precision format doesn't make the precision zero. Unspecified precisions are wrong on double values, and zero specified ones aren't correct for octal etc.
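Some of these corner cases are easy to check against the host implementation. The sketch below uses the host snprintf() (not the Samba code) and a hypothetical `host_printf_checks()` function; the expected strings are what ISO C requires for each format.

```c
#include <stdio.h>
#include <string.h>

/* Run four corner cases through the host snprintf() and return
 * how many give the ISO C result (4 means all of them). */
static int host_printf_checks(void)
{
    char buf[16];
    int ok = 0;

    /* '#' with 'x' adds a leading "0x" to non-zero values. */
    snprintf(buf, sizeof buf, "%#x", 255u);
    ok += strcmp(buf, "0xff") == 0;

    /* An empty precision means zero, and zero printed with
     * precision zero produces no characters at all. */
    snprintf(buf, sizeof buf, "%.d", 0);
    ok += strcmp(buf, "") == 0;

    /* '#' with 'o' forces a leading zero digit. */
    snprintf(buf, sizeof buf, "%#o", 8u);
    ok += strcmp(buf, "010") == 0;

    /* An empty precision on a double also means zero decimals. */
    snprintf(buf, sizeof buf, "%.f", 2.0);
    ok += strcmp(buf, "2") == 0;

    return ok;
}
```

The same formats can be fed to any of the implementations discussed here to see which ones they get wrong.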
'\'' (thousand modifiers), 'a', 'F', 'Lf', 'lld', 'td', 'zd', 'hhd', etc. and i18n format parameter modifiers are all completely missing (although 'qd' is available for "long long" ints). 'A' isn't the std. formatter but a custom one. Unspecified precision is broken, as are corner cases for octal etc., and infinity/nan output is not correct with regard to case. Buffer overflows are possible in the integer formatting paths.
Apart from the static extensions 'A' and 'I', this is the basis of the OSSP str library implementation.
The sfio97 version available here is the version I looked at; it's possible that the 2002 version fixes some of the problems. However I didn't get access to it.
This is fairly stds. compliant, for when it was written. However it doesn't support '\'' (thousand separators), 'td', 'jd', 'zd', 'hhd', 'a', 'A', or i18n parameter modifiers etc. Also double printing of infinity is wrong with regard to case.
This code contains a bunch of static extensions, and no way to define your own.
If you see any errors or omissions in the above, feel free to contact me at the address below.