It sounds to me like you have two steps here:
- Truncate existing table.
- Insert data from file.
You need to decide which step should execute first.
If you want to execute the 'truncate' step first, you can add checks to validate the file before performing the truncate. For example, you could use a @BeforeStep to check that the file exists, is readable, and is not empty (size 0).
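Those pre-truncate checks can be done with plain `java.nio.file` calls. Here is a minimal sketch; the class and method names are hypothetical, and the Spring Batch wiring (calling this from a @BeforeStep method and failing the step when it returns false) is left out to keep the snippet self-contained:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileChecks {

    // Returns true only if the input file exists, is readable, and is non-empty.
    // In a Spring Batch job you might call this from a listener's @BeforeStep
    // method and fail the step before the truncate runs when it returns false.
    public static boolean isLoadable(Path file) throws IOException {
        return Files.exists(file)
                && Files.isReadable(file)
                && Files.size(file) > 0;
    }
}
```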
If you need to guarantee that the entire file is parsed without error before loading the database table, then you will need to parse the data into some temporary location, as you mention, and then move the data from the temporary location to the final table in a second step. I see a few options there:
- Create a temporary table to hold the data as you suggest. After reading the file, in a different step, truncate the target table and move from the temp table to the target table.
- Add a 'createdDateTime' column or something similar to the existing table and an 'executionDate' job parameter. Then insert your new rows as they are parsed. In a second step, delete any rows whose created time is earlier than the executionDate (this assumes you are using a generated ID for the table's primary key).
- Add a 'status' column to the existing table. Insert the new rows as 'pending'. Then in a second step, delete all rows that are 'active' and update rows that are 'pending' to 'active'.
- Store the parsed data in-memory. This is dangerous for a few reasons, especially if the file is large. It also removes the ability to restart a failed job, since the data in memory would be lost on failure.
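To make the 'status' column option above concrete, the second step boils down to two set-based operations. Here is a minimal in-memory illustration; the map, the `promotePending` name, and the `target` table in the comments are all hypothetical stand-ins, and against a real database these would simply be the two SQL statements shown:

```java
import java.util.Map;

public class StatusSwap {

    // In-memory stand-in for the target table: row id -> status.
    // Against a real table these two operations would be plain SQL, e.g.:
    //   DELETE FROM target WHERE status = 'active';
    //   UPDATE target SET status = 'active' WHERE status = 'pending';
    public static void promotePending(Map<Integer, String> table) {
        table.values().removeIf("active"::equals);   // drop the old data
        table.replaceAll((id, status) -> "active");  // promote the new rows
    }
}
```

Doing the delete and the update in one step (ideally one transaction) means readers never see a moment where the table is empty, which is the main advantage over a plain truncate-then-load.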