Question

All my column headers in a MySQL database are prefixed with a number, 1_X, 2_X, etc... which makes bringing the data into IDL impossible using just a basic select statement to bring in the entire table. I'm not sure but I see two possible ways:

1) Bring in the table with column name aliases. Can I use TRIM or SUBSTRING_INDEX to remove/replace the first two characters?

2) Create a routine that uses the information schema to to recursively go through and delete the first two characters of the column headers and create a new table with those headers and copy the data in.

If there weren't so many different tables (all with 1_X, 2_X, etc...) there'd be no problem manually selecting 1_X AS X but that's not feasible. It would be great to be able to use TRIM/SUBSTRING on column headers in the select statement.

Thanks.

Was it helpful?

Solution

It's not possible to use functions in a SQL statement to alter the identifier assigned to a column being returned. The SQL way of specifying the identifier for the column in a resultset is to use the expr AS alias approach.


Rather than trim off the leading digit characters, you could prepend the identifiers with another valid character. (Trimming off leading characters seems like it would potentially lead to another problem, duplicate and/or zero length column names.)

You could just use a SQL statement to generate the SELECT list for you.

(NOTE: the GROUP_CONCAT function is limited by some system/session variables: group_concat_max_len and max_allowed_packet, it's easy enough to adjust these higher, though changing global max_allowed_packet may require MySQL to be restarted.)

To get it back the SELECT list on all one line (assuming you won't overflow the GROUP_CONCAT limits) something like:

SELECT c.table_schema
     , c.table_name
     , GROUP_CONCAT(
          CONCAT('t.`',c.column_name,'` AS `x',c.column_name,'`')
          ORDER BY c.ordinal_position
       ) AS select_list_expr
  FROM information_schema.columns c
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

Or, you could even get back a whole SELECT statement, if you wrapped that GROUP_CONCAT expression (which produces the select list) in another CONCAT

Something like this:

SELECT CONCAT('SELECT '
          , GROUP_CONCAT(
               <select_list_expr>
            )
          , ' FROM `',c.table_schema,'`.`',c.table_name,'` t;'
       ) AS stmt
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

You could use a more clever expression for <select_list_expr>, to check for leading "digit" characters, and assign an alias to just those columns that need it, and leave the other columns unchanged, though that again introduces the potential for returning duplicate column names.

That is, if you already have columns named '1_X' and 'x1_X' in the same table. But a carefully chosen leading character may avoid that problem...

The <select_list_expr> could be more clever by doing a conditional test for leading digit character, something like this:

SELECT CONCAT('SELECT '
          , GROUP_CONCAT(
               CASE
               WHEN c.column_name REGEXP '^[[:digit:]]'
               THEN CONCAT('t.`',c.column_name,'` AS `x',c.column_name,'`')
               ELSE CONCAT('t.`',c.column_name,'`')
               END
           )
          , ' FROM `',c.table_schema,'`.`',c.table_name,'` t;'
       ) AS stmt
  FROM information_schema.columns c
 WHERE c.table_schema = 'mydatabase'
 GROUP BY c.table_schema, c.table_name

Again, there's a potential for generation "duplicate" column names with this approach. The conditional test "c.column_name REGEXP" could be extended to check for other "invalid" leading characters as well.


As a side note, at some point, someone thought it a "good idea" to name columns with leading digit characters. Just because something is allowed doesn't mean it's a good idea.


Then again, maybe all that rigamarole isn't necessary, and just wrapping the column names in backticks would be sufficient for your application.

OTHER TIPS

I think you can follow option 2. However this will not be quick solution.

Another way around this could be,

  1. Generate schema script for the tables you want to correct.
  2. Open the script in notepad++ or any editor that supports find using regular expression.
  3. Search and replace with [0-9]+_ expression and empty string for replacement.
  4. Create the new tables using this script and copy data into them.

This may sound like a manual approach but you will do this once for all of your tables.

Look into a strategy of doing 2 selects, one for the column name, then one for the data with column alias. You might have to revert to some scripting language, like PHP, for help.

First, get the column names :

show columns from tbl_client;
+-------------------------------+-----------------------------------+------+-----+---------------------+-----------------------------+
| Field                         | Type                              | Null | Key | Default             | Extra                       |
+-------------------------------+-----------------------------------+------+-----+---------------------+-----------------------------+
| 1_X                           | int(11)                           | NO   | PRI | NULL                | auto_increment              |

Then, loop through the results and create a list of column alias

Then create your new select

SELECT 1_X as NEW_COLUMN_NAME_FOR_FIELD_1 FROM tbl_client;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top