Constrain the lifetime of a data member to one method

https://stackoverflow.com/questions/11455872

20-06-2021
|

题

I have encountered a slightly unusual problem. Consider the following code:

class parser
{
    lexer lex;

public:

    node_ptr parse(const std::string& expression)
    {
        lex.init(expression.begin(), expression.end());
        // ...
        // call some helper methods
        // return the result
    }

private:

    // lots of small helper methods, many of them accessing lex
};

The parse method initializes the lexer with an init method. Before that, the lexer is in an unusable "default" state. Usually, one should intialize a member during construction, so why don't I simply do this:

class parser
{
    lexer lex;

public:

    parser(const std::string& expr) : lex(expr.begin(), expr.end()) {}

    node_ptr parse()
    {
        // call some helper methods
        // return the result
    }
    // ...
};

First, this means a client can call the parse method multiple times, which wouldn't make much sense.

Second, and more importantly, it can very easily cause serious lifetime issues:

parser my_parser("1 * 2 + 3 * 4");
auto root = my_parser.parse();

In the above code, the lexer will be initialized with a temporary string object that ceases to exist at the end of the line, so calling the parse method in the next line will invoke undefined behavior.

For both of these reasons, I really want to initialize and parse in the same method. Unfortunately, I cannot do it in the constructor, because I need to return a result, and constructors cannot return results.

Technically it is possible to construct the lexer inside the parse method and destruct it afterwards if I also change the constructor and the destructor accordingly:

class parser
{
    static std::string dummy;
    lexer lex;

public:

    parser() : lex(dummy.begin(), dummy.end())
    {
        lex.~lexer();
    }

    node_ptr parse(const std::string& expression)
    {
        new(&lex) lexer(expression.begin(), expression.end());
        // call some helper methods
        lex.~lexer();
        // return the result
    }

    ~parser()
    {
        new(&lex) lexer(dummy.begin(), dummy.end());
    }
    // ...
};

But this is by far the ugliest code I have written in a very long time. It's also not exception-safe; what if a helper method throws? Indeed, that's exactly what happens when a parse error is encountered.

So how should I solve this issue? Use a local lexer inside parse and have a lexer* member point to it? Use a boost::optional<lexer> member? Or should I just live with the init method? Or should I do the parsing in the constructor after all and throw an "expection" containing the desired result?

解决方案

I would definitely not do your 2nd example. Better would be constructing the lexer in Parse(), and store a pointer or boost::optional. But if you want to allow this, then your helper functions must check whether the lexer is valid or not before continuing. Seems messy to me.

Better yet is to just make Parse a standalone function. I envision this to be more sensible to callers, and solve your problem:

void parser_helper(lexer& lex)
{
    ...
}

node_ptr Parse(const std::string& inp)
{
    lexer lex(inp);
    ...
    parser_helper(lex);
    ...
    return ret;
}

Or if you have more state to pass around...

class parser_helper
{
    lexer lex;
    ... other state here

public:
    parser_helper(const std::string& inp) :
        lex(inp)
    {
    }

    ... helper functions here.
    void helper_function() { }
}

node_ptr Parse(const std::string& inp)
{
    parser_helper helper(inp);
    ...
    helper.helper_function();
    ...
    return ret;
}

Either way, the lexer should just be an auto variable in the Parse function.

The idea is that the interface the caller expects is just a single function. No need to make the caller deal with a class, just because the internals of Parse have state / helper functions.

其他提示

You present no reason why parse (and indeed, lex) should not be simple functions. In addition, you present no cause for parse to take the expression, or take an initialized lexer.

Edit:

Or simply create them as lambdas on the stack of parse. However, as I've said, I can see the need for parser to exist. But it doesn't seem to need to exist outside of the member method of parse itself, questioning why you don't refactor that method to go outside the class. Something like

class parser { 
    lexer l;
     // stuff
};
node_ptr parse(...) { 
    parser p(...); 
    return p(); 
}

In the above code, the lexer will be initialized with a temporary string object that ceases to exist at the end of the line, so calling the parse method in the next line will invoke undefined behavior.

That doesn't make sense. You could make a copy of the temporary string and use it in the lexer.

Use a local lexer inside parse and have a lexer* member point to it?

That gets my vote. That way you have complete control over its lifetime.

许可以下： CC-BY-SA 和归因

不隶属于 StackOverflow