Question

So I'm trying to build a script that automagically prepends valid column names with its appropriate table prefix (e.g. "t." or "r.")

$t_columns = array('id', 'name', 'label');
$r_columns = array('related_value');

INPUT:

id > 1 AND (name = 'Hello' OR label IN ('World', 'Planet name AND label')) AND (related_value > 1 AND related_value < 50)

OUTPUT:

t.id > 1 AND (t.name = 'Hello' OR t.label IN ('World', 'Planet name AND label')) AND (r.related_value > 1 AND r.related_value < 50)

Notice how you can't do a normal str_replace. What would be the simplest code (I'm guessing preg_replace) to ensure that all table names are properly prepended?

Was it helpful?

Solution

This can be done in a lot of ways, and also using regex. I'd personally use an array approach. First of all, I'd define the mangling table this way:

$table = array(
    'id' => 't.id',
    'name' => 't.name',
    'label' => 't.label',
    'related_value' => 'r.related_value'
);

This will make a lot easier the str_replace() call:

function mangling(&$v, $k, $table)
{
    if (($k & 1) == 0)
        $v = str_replace(array_keys($table), array_values($table), $v);
}

$spans = explode("'", ' ' . $input);
array_walk($spans, 'mangling', $table);
$output = implode("'", $spans);

OTHER TIPS

After a few seconds' thinking, here's how I'd tackle it:

Walk through the string, char by char, looking for single quotes, but skipping over escaped characters. The stuff between two unescaped single quotes (i.e. the strings) would be replaced with an unique token, and put into an associative array, with that token as the key, and the original string as the value.

Now that we have the strings out of the way, do a str_replace() (or preg_replace(), if you insist) for the known column names. I'd probably construct the column names into an associative array, with the table's alias as the key, and the value as an enumerated array containing the column names. This way, the replacement could be automated.

Once the table names are filled in, just do a str_replace() for the tokens, to replace the original strings back to their places, and you're done.

I'm sure someone might whip an super-awesome (and probably next to unmaintainable) regexp to do all this in one hit. But I prefer to use regexps only in situations where regular expressions are actually the right tool, not where a CFL would be more suitable.

I don't know if a regular expression is a good idea here. I'd say it'd be worth the minimal amount of increased computational complexity to perform the validation yourself in PHP. Then, should your database require any changes, you won't have to pull your hair out worrying about how to increase the robustness of your regular expression.

As Jamie Zawinski once said, "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems. "

In terms of creating a workflow that will make sure you're working on the right database, I'd consider taking an object-oriented approach. Classes for T and R tables can be instantiated children of a model class responsible for building the correct table-type object using conditional logic within the same method, rather than a regex.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top