Frage

I am using pentaho for data migration testing. I have set a "table input" step where many parts of the query inside "table inputs" are variables. I have been looking for a way to capture that query after it gets executed during runtime. I was wondering if there is any specific system log variables for sql or is it to do with metadata. need help! Thanks

War es hilfreich?

Lösung

Maybe the following approach will help:

  • We assume a transformation reading a CSV file to get the dynamic portion of the SELECT statement (e.g. the columns) and setting the variable columns with it.

enter image description here

  • The second transformation uses this variable to generate the SELECT statement and store it into the variable sql_statement.

enter image description here

  • In the main transformation we use ${sql_statement} as the SELECT statement of the table input and write the data to an output file (that's the business process so to say). From the same input we copy the output to another path. There we add the current time as a field (use element "Get system data") and we add the generated SQL statement, join them as a cartesian product and group the result by the sql_statement. That way we can compute the first time and the last time that the statement was used. These results are written to a text file.

enter image description here

  • The last thing we need is a job calling the three transformations sequentially.

enter image description here

This is a sample output:

sql_statement;min_time;max_time
SELECT my_column FROM test_table;2014/05/08 00:41:21.143;2014/05/08 00:41:21.144

Andere Tipps

Thank you Marcus! I did some thing similar. It works. awesome. I gathered parts of queries from table field where they were kept and formed a full query in javascript. After that full query will be sent as parameter to a transformation that will run and log the query.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top