MapReduce (MR) is a fault-tolerant framework. When a map task fails, the behavior is the same whether the job uses the streaming API or the Java API.
Once the job tracker is notified that the task has failed, it will try to reschedule the task. The temporary output generated by the failed task is deleted.
A more detailed discussion on how failures are handled in MR can be seen here
For your particular case, I think you need to query the external source in your setup() method to find out which records have already been processed, then use that information in your map() method to decide whether a particular record should be processed or skipped.
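
A rough sketch of that idea follows. It is deliberately Hadoop-free so it can be run on its own: in a real job the class would extend org.apache.hadoop.mapreduce.Mapper and override setup(Context) and map(key, value, context). The class name, the Supplier standing in for your external source, and the assumption that each record carries a unique ID are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hadoop-free sketch of the "skip already processed records" pattern.
// Assumption: every record has a unique ID that the external source also stores.
class SkipProcessedMapper {
    // Hypothetical stand-in for the external source (database, file, service, ...).
    private final Supplier<Set<String>> processedIdSource;

    // IDs known to be processed, loaded once per task attempt.
    private Set<String> processedIds = new HashSet<>();

    SkipProcessedMapper(Supplier<Set<String>> processedIdSource) {
        this.processedIdSource = processedIdSource;
    }

    // Mirrors Mapper.setup(): runs once before any record is mapped,
    // so a rescheduled attempt sees what earlier attempts completed.
    void setup() {
        processedIds = new HashSet<>(processedIdSource.get());
    }

    // Mirrors Mapper.map(): runs once per input record.
    void map(String recordId, String value, Consumer<String> output) {
        if (processedIds.contains(recordId)) {
            return; // already handled by an earlier attempt, skip it
        }
        output.accept(value);
    }
}
```

Loading the IDs once in setup() keeps the per-record check to an in-memory lookup. If the set of processed records is too large to hold in memory, you would need to query the external source per record or per batch instead.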