Question

I'm using an Excel input step in a transformation, and I need to process a lot of Excel files in a directory. The problem is that Kettle processes them in an arbitrary order, so the result is not always what I was hoping for. Is there some way to specify the order in which the files are processed? I need Spoon to process them by date, from the oldest to the newest. Thank you.

Solution

Late reply, but maybe still helpful.

You could first use a "Get File Names" step to get the list of the files in the directory. Then use "Sort rows" and sort by "lastmodifiedtime" (I don't think a "filecreatedtime" field is available, so that is a risk). Then write the sorted result to a log. Afterwards, read this log and process the files one by one.
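
To make the ordering logic concrete, here is a minimal sketch of the same idea outside PDI, in plain Python: list the Excel files in a directory and sort them by last-modified time, oldest first. This is not Kettle code. The function name `excel_files_oldest_first` and the file patterns are assumptions made up for illustration, and like the "lastmodifiedtime" approach above it uses modification time rather than creation time.

```python
from pathlib import Path


def excel_files_oldest_first(directory, patterns=("*.xls", "*.xlsx")):
    """Return the Excel files in `directory`, oldest modification time first.

    Hypothetical helper for illustration only; it mirrors the
    "Get File Names" + "Sort rows" idea, not any PDI API.
    """
    files = set()
    for pattern in patterns:
        # Collect regular files matching each pattern (non-recursive).
        files.update(p for p in Path(directory).glob(pattern) if p.is_file())
    # Sort by last-modified time; the file name breaks ties deterministically.
    return sorted(files, key=lambda p: (p.stat().st_mtime, p.name))


if __name__ == "__main__":
    # Print the files of the current directory in processing order.
    for path in excel_files_oldest_first("."):
        print(path)
```

The sorted list plays the role of the intermediate log: each entry would then be handed to the processing step one at a time, in that order.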

Other tips

I don't know if there's a reliable way to make PDI process the files in a particular order at the job level.

But what you can do is go to the 'Additional output fields' tab in the Excel input step and specify a field name for the file name (either 'Full filename field' or 'Short filename field'). This will cause the file name to be added as a column in the output of the Excel input step, under the name you specify. Then simply pass the rows through a Sort rows step and sort by that column.

Licensed under: CC-BY-SA with attribution