Java: How to find string patterns in a LARGE binary file?

Question 1

Google "finite state machine".

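Or read the file one byte at a time. If the byte doesn't match the first character of the search term, go on to the next byte. If it does match, you're now looking for the next character in the sequence, i.e., your state has gone from 0 to 1. On a mismatch partway through, the state has to drop back. Resetting to 0 is the simple choice, but it can miss matches when the search term repeats its own prefix (searching for "aab" in "aaab", for example), so fall back to the longest prefix that still matches, as Knuth-Morris-Pratt does. When your state equals the length of the search string, you've found it!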

Implementation/debugging left to the reader.
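
For what it's worth, here is a minimal sketch of that state machine, assuming a single search term. It uses the Knuth-Morris-Pratt failure table for the fallback on a mismatch, which is a named substitution for the plain "reset to 0" rule. The class and method names (StreamSearch, indexOf) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the name and API are illustrative only.
class StreamSearch {

    // Returns the offset of the first occurrence of pattern in the stream, or -1.
    // For a real file, pass a FileInputStream wrapped in a BufferedInputStream.
    static long indexOf(InputStream in, byte[] pattern) throws IOException {
        if (pattern.length == 0) return 0;

        // fail[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of it (the KMP failure table).
        int[] fail = new int[pattern.length];
        for (int i = 1, k = 0; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }

        int state = 0; // number of pattern bytes matched so far
        long pos = 0;  // number of bytes read so far
        int b;
        while ((b = in.read()) != -1) {
            pos++;
            // On a mismatch, fall back to the longest shorter match instead of 0.
            while (state > 0 && (byte) b != pattern[state]) state = fail[state - 1];
            if ((byte) b == pattern[state]) state++;
            if (state == pattern.length) return pos - pattern.length;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "xxaaabxx".getBytes(StandardCharsets.US_ASCII);
        byte[] pattern = "aab".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOf(new ByteArrayInputStream(data), pattern)); // prints 3
    }
}
```

The stream is read once and never rewound, so memory use is one int per pattern byte regardless of file size.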

Question 2

It seems like you are really looking for the Aho-Corasick string matching algorithm.

The algorithm builds an automaton from your dictionary of patterns, and then lets you find all matches in a single scan of your input.

The Wikipedia article links to this Java implementation.
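
To make the idea concrete, here is a compact sketch of Aho-Corasick over raw bytes. It is not the implementation the Wikipedia article links to, just an illustration. The class name and the hit format (pattern index, start offset) are assumptions, and each state uses a full 256-entry transition table, which costs about 1 KB per state.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Hypothetical class; names and result format are illustrative only.
class AhoCorasick {
    private final List<int[]> next = new ArrayList<>();        // next.get(s)[byte] = target state
    private final List<List<Integer>> out = new ArrayList<>(); // pattern indices ending at state s
    private final byte[][] patterns;

    AhoCorasick(byte[][] patterns) {
        this.patterns = patterns;
        newState(); // state 0 is the root
        // Step 1: insert every pattern into a trie.
        for (int p = 0; p < patterns.length; p++) {
            int s = 0;
            for (byte b : patterns[p]) {
                int c = b & 0xFF;
                if (next.get(s)[c] == -1) next.get(s)[c] = newState();
                s = next.get(s)[c];
            }
            out.get(s).add(p);
        }
        // Step 2: breadth-first pass that computes failure links and fills
        // every missing edge, so scanning needs one table lookup per byte.
        int[] fail = new int[next.size()];
        Queue<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < 256; c++) {
            int t = next.get(0)[c];
            if (t == -1) next.get(0)[c] = 0;
            else queue.add(t); // depth-1 states fail to the root
        }
        while (!queue.isEmpty()) {
            int s = queue.remove();
            out.get(s).addAll(out.get(fail[s])); // inherit matches of the suffix state
            for (int c = 0; c < 256; c++) {
                int t = next.get(s)[c];
                if (t == -1) {
                    next.get(s)[c] = next.get(fail[s])[c];
                } else {
                    fail[t] = next.get(fail[s])[c];
                    queue.add(t);
                }
            }
        }
    }

    private int newState() {
        int[] row = new int[256];
        Arrays.fill(row, -1);
        next.add(row);
        out.add(new ArrayList<>());
        return next.size() - 1;
    }

    // Scans the stream once; each hit is {pattern index, start offset}.
    List<long[]> search(InputStream in) throws IOException {
        List<long[]> hits = new ArrayList<>();
        int state = 0;
        long pos = 0;
        int b;
        while ((b = in.read()) != -1) {
            pos++;
            state = next.get(state)[b];
            for (int p : out.get(state)) hits.add(new long[] {p, pos - patterns[p].length});
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        String[] words = {"he", "she", "his", "hers"};
        byte[][] pats = new byte[words.length][];
        for (int i = 0; i < words.length; i++) pats[i] = words[i].getBytes(StandardCharsets.US_ASCII);
        byte[] data = "ushers".getBytes(StandardCharsets.US_ASCII);
        for (long[] hit : new AhoCorasick(pats).search(new ByteArrayInputStream(data))) {
            System.out.println(words[(int) hit[0]] + " at offset " + hit[1]);
        }
    }
}
```

For the sample input "ushers" this reports "she" at offset 1, "he" at offset 2, and "hers" at offset 2. With many long patterns, a sparse map per state would trade some speed for much less memory.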

Question 3

There are specialised algorithms for this but let's try a simple one first.

You can start by making the comparison on the fly, each time you read the next byte. Once you do that, it's easy to see that you never need to keep more bytes than the length of your longest pattern.

So you can just use a buffer that is as long as your longest pattern, put new bytes in at one end and drop them at the other.

As I said, there are more efficient algorithms than this, but it's a good start.
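
A rough sketch of that buffer, assuming several byte patterns and a ring buffer as the "in at one end, out at the other" structure. The class and method names are invented for the example. After every byte it checks whether the most recent bytes end with any of the patterns.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical class; names are illustrative only.
class SlidingWindowSearch {

    // Prints every match and returns how many were found.
    static int scan(InputStream in, byte[][] patterns) throws IOException {
        int maxLen = 0;
        for (byte[] p : patterns) maxLen = Math.max(maxLen, p.length);
        if (maxLen == 0) return 0;

        byte[] window = new byte[maxLen]; // ring buffer holding the last maxLen bytes
        long count = 0;                   // total bytes read so far
        int matches = 0;
        int b;
        while ((b = in.read()) != -1) {
            window[(int) (count % maxLen)] = (byte) b; // overwrite the oldest byte
            count++;
            for (int i = 0; i < patterns.length; i++) {
                if (endsWith(window, count, patterns[i])) {
                    System.out.println("pattern " + i + " at offset " + (count - patterns[i].length));
                    matches++;
                }
            }
        }
        return matches;
    }

    // True if the last pattern.length bytes read equal the pattern.
    private static boolean endsWith(byte[] window, long count, byte[] pattern) {
        if (pattern.length == 0 || pattern.length > count) return false;
        long start = count - pattern.length; // stream offset of the candidate match
        for (int j = 0; j < pattern.length; j++) {
            if (window[(int) ((start + j) % window.length)] != pattern[j]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[][] patterns = {
            "bc".getBytes(StandardCharsets.US_ASCII),
            "abcd".getBytes(StandardCharsets.US_ASCII)
        };
        byte[] data = "abcabcd".getBytes(StandardCharsets.US_ASCII);
        System.out.println(scan(new ByteArrayInputStream(data), patterns) + " matches");
    }
}
```

Each byte costs up to the combined length of all patterns in comparisons, which is the price of the simplicity. Memory stays fixed at the length of the longest pattern.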

Question 4

Use a FileInputStream wrapped in a BufferedInputStream and compare each byte. Keep a buffer the length of the sequence you're looking for so that you can backtrack if the match fails partway through. If the sequence you're looking for is too large to buffer, you could instead save the offset where the candidate match started and re-open the file to resume reading from there.
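
One way this could look, as a sketch only: BufferedInputStream's own mark() and reset() serve as the backtracking buffer, with the mark limit set to the pattern length. The class name and the command-line arguments are assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical class; names are illustrative only.
class MarkResetSearch {

    // Returns the offset of the first occurrence of pattern, or -1.
    static long indexOf(InputStream raw, byte[] pattern) throws IOException {
        if (pattern.length == 0) return 0;
        BufferedInputStream in = new BufferedInputStream(raw);
        long offset = 0; // stream offset of the current candidate start
        while (true) {
            in.mark(pattern.length); // remember where this candidate starts
            int matched = 0;
            int b = -1;
            while (matched < pattern.length
                    && (b = in.read()) != -1
                    && (byte) b == pattern[matched]) {
                matched++;
            }
            if (matched == pattern.length) return offset;
            if (b == -1) return -1;  // end of input: no later start can fit the pattern
            in.reset();              // backtrack to the candidate start
            in.read();               // then step forward by one byte
            offset++;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: MarkResetSearch <file> <ascii-term>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(indexOf(in, args[1].getBytes(StandardCharsets.US_ASCII)));
        }
    }
}
```

This is the naive search, so highly repetitive data can cost up to pattern length comparisons per byte. The buffering keeps the actual disk reads sequential.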

Working with streams: http://docs.oracle.com/javase/tutorial/essential/io/
String matching algorithms: http://en.wikipedia.org/wiki/String_searching_algorithm

Or if you just want something to copy and paste you could look at this SO question.