I have some software written in C that has worked fine for me on every system that I've tested it on, until now. One of our sysadmins has installed it on our cluster and some weird behavior has emerged.
In the program, I'm creating a DBM database with Kyoto Cabinet, though on this box the process fails before the db is created. The program takes command-line arguments to specify the size of the memory map and the number of buckets used by the KC library. It appends these values to the file name in accordance with the documentation for kcdbopen. I.e., if the user passed --mmap-size=1024 --num-buckets=256 for a db file called foo.kch, then a string foo.kch#msiz=1024#bnum=256 is constructed and passed to kcdbopen.
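To make the intended result concrete, here is a minimal sketch of the string I expect to end up with, built in a single snprintf call. The helper name build_db_name is hypothetical (it is not part of my program or of Kyoto Cabinet), and this is only an illustration of the expected format, not the code that is currently failing:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes "<dbfile>#msiz=<mmap_size>#bnum=<num_buckets>"
   into out.  Returns 0 on success, -1 if the result would not fit.  */
static int
build_db_name (char *out, size_t out_size, const char *dbfile,
               unsigned long int mmap_size, unsigned long int num_buckets)
{
  int n = snprintf (out, out_size, "%s#msiz=%lu#bnum=%lu",
                    dbfile, mmap_size, num_buckets);

  /* snprintf reports the length it wanted to write; a negative value or
     one that is >= out_size means the output is unusable or truncated.  */
  if (n < 0 || (size_t) n >= out_size)
    return -1;
  return 0;
}
```

Called with a 512-byte buffer, "foo.kch", 1024 and 256, this should leave foo.kch#msiz=1024#bnum=256 in the buffer.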
As I said, this normally works fine. However, on this particular machine, the process will fail with an error that the memory could not be allocated. Plus, some debug text shows that the file that it tried to create starts with some garbage characters and the msiz and bnum numbers have overflowed to be something huge:
Failed to open database �!�u�foo.kch#msiz=249821240256#bnum=4199616: Cannot allocate memory
I'm using argp_parse to parse the command-line arguments. The relevant parts of that code are:
struct arguments
{
  char *args[1];
  unsigned long int mmap_size;
  unsigned long int num_buckets;
};

static error_t
parse_opt (int key, char *arg, struct argp_state *state)
{
  struct arguments *arguments = state->input;
  switch (key)
    {
    case 'm':
      arguments->mmap_size = arg ? atol (arg) : 1024;
      break;
    case 'b':
      arguments->num_buckets = arg ? atol (arg) : 100;
      break;
    case ARGP_KEY_ARG:
      if (state->arg_num >= 1)
        argp_usage (state);
      arguments->args[state->arg_num] = arg;
      break;
...<snip>....
I tried to be extra (overly?) cautious in building the filename string:
void *
zdb_create (char *dbfile, unsigned long int mmap_size, unsigned long int num_buckets,
            bool verbose)
{
...<snip>...
  char mmap_str[32];
  char buckets_str[32];
  char db_str[512];
  if (strlen (dbfile) > sizeof (db_str) - sizeof (buckets_str) - sizeof (mmap_str))
    error (EXIT_FAILURE, errno, "Filename too long");
  strncat (db_str, dbfile, strlen (dbfile));
  snprintf (mmap_str, sizeof(mmap_str), "#msiz=%lu", mmap_size);
  if (strlen (dbfile) + strlen (mmap_str) < sizeof (db_str))
    strncat (db_str, mmap_str, 32);
  else
    error (EXIT_FAILURE, errno, "Filename too long");
  snprintf (buckets_str, sizeof(buckets_str), "#bnum=%lu", num_buckets);
  if (strlen (dbfile) + strlen (mmap_str) < sizeof (db_str))
    strncat (db_str, buckets_str, 32);
  else
    error (EXIT_FAILURE, errno, "Filename too long");
  db = zdb_open (db_str, ZDB_CREATOR, false);
...<snip>...
So, I've just learned that there is a strtoul function, and my use of atol instead might be the source of the problem: atol only converts to a signed long, not an unsigned long. However, I want to be 100% sure that I fix the problem before I go and bother the sysadmin again to try reinstalling (unfortunately, it's impossible for me to test it on my own, AFAICT). What is the most likely source of this overflow behavior? Are the overflowing integers and the garbage characters at the start of the string related? Why would this work on most systems except this one?
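For reference, this is roughly what I have in mind for a strtoul-based replacement of the atol calls. It is a minimal sketch only: parse_ulong is a hypothetical helper name of my own (not something from argp or libc), and I don't know yet whether this change addresses the failure at all:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a non-negative decimal number into *out.
   Returns 0 on success, -1 on empty input, a minus sign, trailing junk,
   or a value out of range for unsigned long.  */
static int
parse_ulong (const char *s, unsigned long int *out)
{
  char *end;
  unsigned long int v;

  /* strtoul accepts a leading '-' and negates the result, so skip
     blanks and reject a minus sign (or an empty string) up front.  */
  while (*s == ' ' || *s == '\t')
    s++;
  if (*s == '-' || *s == '\0')
    return -1;

  errno = 0;
  v = strtoul (s, &end, 10);
  if (errno == ERANGE || end == s || *end != '\0')
    return -1;

  *out = v;
  return 0;
}
```

In parse_opt, the 'm' and 'b' cases would then call parse_ulong on arg and report a usage error when it returns -1, rather than storing whatever atol produced.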
The system is using Linux 2.6.18, gcc 4.1.2 and, well, it's hard to tell which version of libc: 2.3.x according to the info page for it. I've only tested on more recent systems, such as Linux 2.6.32 / gcc 4.4.3 / libc 2.8 and Linux 3.9.8 / gcc 4.8.1 / libc 2.17.
Thanks!