Question

I figure the following program should either complain that it can't compile the regular expression or treat it as legal and compile it fine (I don't have the standard, so I can't say for sure whether the expression is strictly legal; reasonable interpretations are certainly possible). What actually happens with g++ (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1 is that, when run, it crashes hard

*** Error in `./a.out': free(): invalid next size (fast): 0x08b51248 ***

in the guts of the library.

Questions are:

a) It's a bug, right? I assume (perhaps incorrectly) that the standard doesn't allow std::regex to crash just because it doesn't like the syntax. (MSVC eats it fine, FWIW.)

b) If it's a bug, is there some easy way to see whether it has already been reported? (My first time poking around the GNU bug systems was intimidating.)

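#include <iostream>
#include <regex>

int main()
    {
    // The trailing '|' leaves the second alternative empty.
    const char* Pattern = "^(%%)|";
    std::regex Machine;

    try {
        // Compile the pattern; operator= is specified as assign(Pattern).
        Machine = Pattern;
        }
    catch(const std::regex_error& e)   // catch by const reference, not by value
        {
        std::cerr << "regex could not compile pattern: "
          << Pattern << "\n"
          << e.what() << std::endl;
        throw;
        }

    return 0;
    }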

Solution

I would put this in a comment, but I can't, so...

I don't know if you've already noticed, but the pipe character | at the end of the pattern seems to be what causes the problem: "^(%%)|a" works fine for me. It looks like g++'s handling of | as the last character of the pattern makes a mess of something that then blows up when the regex code calls free().

The standard (or at least the online draft I'm reading) says:

28.8 Class template basic_regex [re.regex]

1 For a char-like type charT, specializations of class template basic_regex represent regular expressions constructed from character sequences of charT characters. In the rest of 28.8, charT denotes a given char-like type. Storage for a regular expression is allocated and freed as necessary by the member functions of class basic_regex.

2 Objects of type specialization of basic_regex are responsible for converting the sequence of charT objects to an internal representation. It is not specified what form this representation takes, nor how it is accessed by algorithms that operate on regular expressions. [ Note: Implementations will typically declare some function templates as friends of basic_regex to achieve this — end note ]

and later,

basic_regex& operator=(const charT* ptr);

3 Requires: ptr shall not be a null pointer.

4 Effects: returns assign(ptr).

So unless g++ thinks const char* Pattern = "|"; is a null pointer (I would imagine not...), the only stated requirement is met, and I guess it's a bug?

EDIT: Incidentally, consecutive || (even when not at the end) seem to cause a segmentation fault for me also.
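If you want to narrow down which patterns misbehave, a small probe along these lines might help. It is only a sketch: compiles is a made-up helper name, not anything from the library, and it simply reports whether constructing a std::regex throws std::regex_error.

```cpp
#include <regex>

// Hypothetical helper (not part of the standard library): returns true if
// the pattern compiles, false if std::regex rejects it by throwing.
bool compiles(const char* pattern)
{
    try {
        std::regex machine(pattern);   // default grammar, same as operator=
        return true;
    }
    catch (const std::regex_error&) {
        return false;                  // the library rejected the pattern cleanly
    }
}
```

Calling compiles("^(%%)|a"), compiles("^(%%)|") and compiles("a||b") one at a time would show which case each pattern falls into. Note that this can only detect a thrown regex_error; heap corruption like the free() failure in the question will still take the whole process down, so it is safer to test one pattern per run.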

Licensed under: CC-BY-SA with attribution