I'm using an Excel Input step in a transformation, and I need to process a lot of Excel files in a directory. The problem is that Kettle processes them in an arbitrary order, so the result is not always what I was hoping for. Is there some way to specify the order in which the files are processed? I need Spoon to process them by date, from the oldest to the newest. Thank you.


Solution

Late reply, but maybe still helpful.

You could first use a "Get File Names" step to get the list of files in the directory. Then use "Sort Rows" and sort by "lastmodifiedtime" (I don't think a "filecreatedtime" field is available, so that is a risk). Then write the result to a log. Afterwards, read this log and process the files one by one.
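To illustrate the idea outside of PDI, here is a minimal Python sketch of the same "list the files, then sort by last-modified time" logic. It is not PDI code: the function name `files_oldest_first` and the `*.xls*` pattern are assumptions made for this example, and like the "lastmodifiedtime" field it relies on modification time rather than creation time.

```python
from pathlib import Path


def files_oldest_first(directory, pattern="*.xls*"):
    """Return files in `directory` matching `pattern`, oldest first.

    Ordering uses the last-modified timestamp (st_mtime), which mirrors
    sorting on "lastmodifiedtime"; it is not the creation time.
    """
    # Keep regular files only, so matching sub-directories are skipped.
    files = [p for p in Path(directory).glob(pattern) if p.is_file()]
    # Ascending modification time puts the oldest file first.
    return sorted(files, key=lambda p: p.stat().st_mtime)
```

Calling `files_oldest_first("/path/to/excel/files")` (a placeholder path) would give the order in which the files should then be processed one by one.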

Other tips

I don't know if there's a reliable way to make PDI process the files in a particular order at the job level.

But what you can do is go to the 'Additional output fields' tab in the Excel input step and specify a field name for the file name (either 'Full filename field' or 'Short filename field'). This will cause the file name to be added as a column in the output of the Excel input step, under the name you specify. Then simply flow this through a Sort rows step and sort by that column.
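As a rough illustration of what sorting on the file name column does, here is a small Python sketch with made-up rows. The file names and the `amount` field are hypothetical. Note that sorting by name only gives chronological order if the names themselves encode the date in a sortable form, which is an assumption in this example.

```python
from operator import itemgetter

# Hypothetical rows, as if each carried the file name in an extra column.
rows = [
    {"filename": "sales_2021-03.xlsx", "amount": 30},
    {"filename": "sales_2021-01.xlsx", "amount": 10},
    {"filename": "sales_2021-02.xlsx", "amount": 20},
    {"filename": "sales_2021-01.xlsx", "amount": 11},
]

# Python's sorted() is stable, so rows from the same file keep
# their original relative order.
sorted_rows = sorted(rows, key=itemgetter("filename"))
```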

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow