Question

What is the proper, best-practice way of configuring my transformations?

In other words, imagine I have a large ETL solution based on Kettle that connects to several different data sources. I would like to store the definitions of these data sources in a centralized location and have each transformation look them up whenever it needs to connect somewhere.

SSIS has package configurations. What is the equivalent in Pentaho?

P.S. I do not want to install any third-party framework.

Thank you

Solution

This can be done in various ways.

  1. Parameterise the database connections and set the values in kettle.properties. That kettle.properties file can itself live in a shared location, so every job and transformation reads the same settings.

  2. As above, but populate the connection parameters by reading the credentials from a database. This has to be hand-crafted, but it can be made to work, with some caveats.

  3. If you use the repository, the database connections are stored centrally anyway. So if you have a dev and a production repository, promote everything except the database connection itself. This is trickier than it sounds, though.
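
To make option 1 more concrete, here is a minimal Python sketch of the idea: connection fields hold `${NAME}` placeholders, and the actual values come from a central properties file. This only illustrates the substitution principle and is not how Kettle implements it internally. The variable names (`DWH_HOST` and so on), the host name, and the helper functions are all hypothetical.

```python
import re

# Hypothetical contents of a shared kettle.properties file; the variable
# names and values are illustrative, not names that Kettle itself defines.
KETTLE_PROPERTIES = """\
# Data warehouse connection
DWH_HOST=db01.example.internal
DWH_PORT=5432
DWH_DB=warehouse
DWH_USER=etl_user
"""


def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def substitute(template, props):
    """Replace each ${NAME} with its value; leave unknown names untouched."""
    return re.sub(
        r"\$\{(\w+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        template,
    )


props = parse_properties(KETTLE_PROPERTIES)

# Fields as they might be typed into a parameterised connection definition.
connection_fields = {
    "host": "${DWH_HOST}",
    "port": "${DWH_PORT}",
    "database": "${DWH_DB}",
    "user": "${DWH_USER}",
}

resolved = {name: substitute(value, props) for name, value in connection_fields.items()}
print(resolved)
```

With this arrangement the transformation only ever contains the placeholders, so moving it between environments means swapping the properties file rather than editing the connection.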

That said, the new 4.4(?) release is supposed to have proper lifecycle management, which should make dealing with all of this a lot easier.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow