Anagram time complexity

Question 1

Well, let's see...

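Assuming that the length() method runs in constant time (i.e. it doesn't work like strlen()), your method frequencyMap is O(m), and anagram calls it twice. anagram is called roughly n-m times (once per candidate starting position). Total complexity is therefore on the order of 2*m*(n-m), which is O(m*n). If m is small enough to be treated as a constant (m << n), this simplifies to O(n).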

I can suggest a couple of optimizations. First, you are re-generating the frequency map for string b on every call to anagram. Build it once at the beginning of ana_check instead. You can then have an anagram method that takes a string and a frequency map instead of two strings.
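As a rough illustration of that first suggestion, here is a minimal Python sketch. Your original code isn't shown here, so everything in it is an assumption: I'm guessing that ana_check answers "does any substring of a form an anagram of b?", and is_anagram_of is a hypothetical name for the variant of anagram that takes a prebuilt map.

```python
from collections import Counter


def frequency_map(s):
    # Count how often each character occurs: O(len(s)).
    return Counter(s)


def is_anagram_of(candidate, target_map):
    # Hypothetical variant of anagram(): compares a string against a
    # frequency map that the caller has already built.
    return frequency_map(candidate) == target_map


def ana_check(a, b):
    # Assumed behaviour: True if some substring of a is an anagram of b.
    m = len(b)
    if m > len(a):
        return False
    target_map = frequency_map(b)  # built once, not once per window
    return any(
        is_anagram_of(a[i:i + m], target_map)
        for i in range(len(a) - m + 1)
    )
```

This halves the map-building work but leaves the overall order unchanged, since each window still costs O(m).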

The other thing I would do is to remove the length checks from anagram. Yes, it's a safety feature, but you already know the strings you passed in are the same size. And anyway, if they are different lengths the frequency maps will still not match, so the function is correct.

A trickier optimization would be to modify string a's frequency map instead of re-doing it every time. For the first substring, you do it as usual. But then you move ahead one character, subtracting the first character from the map and adding the new one. Sure, if m is <= 3 it won't make a difference, but anything larger than that would be a win.
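A sketch of that sliding update, again in Python and under the same assumption about what ana_check is meant to do (ana_check_sliding is a made-up name):

```python
from collections import Counter


def ana_check_sliding(a, b):
    # Assumed behaviour: True if some substring of a is an anagram of b.
    n, m = len(a), len(b)
    if m > n:
        return False
    target = Counter(b)
    window = Counter(a[:m])  # map for the first substring, built as usual
    if window == target:
        return True
    for i in range(m, n):
        outgoing = a[i - m]
        window[outgoing] -= 1  # drop the character leaving the window
        if window[outgoing] == 0:
            del window[outgoing]  # no zero entries, so == stays reliable
        window[a[i]] += 1  # add the character entering the window
        if window == target:
            return True
    return False
```

Updating the map is now constant work per position, although comparing the two maps still costs up to the number of distinct letters at each step.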

Question 2

You don't need to compare the whole map for every position.

Start by creating a signed frequency map and subtracting every letter in B. Keep a counter c of how many non-zero entries are contained in the map.

Next add the first m (length of B) letters of A into the map. For each letter you add, if that count used to be zero then increment c, or if it became zero after you added the letter then decrement c.

If c is now zero then you've found an anagram (every negative count from B has been balanced by a positive count from A), otherwise carry on.

Add the next letter of A to the frequency map, and remove the letter m letters prior to that, adjusting c appropriately for both operations.

Repeat those last two steps until c becomes zero or you run out of letters in A.
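The steps above could be sketched roughly like this in Python. The function name find_anagram, the helper adjust, and the choice to return the start index (or -1) are my own illustrative assumptions, not part of the question:

```python
from collections import defaultdict


def find_anagram(a, b):
    # Returns the start index of the first window of a that is an
    # anagram of b, or -1 if there is none.
    n, m = len(a), len(b)
    if m > n:
        return -1
    counts = defaultdict(int)  # signed map: negative means still owed to B
    nonzero = 0  # the counter c: how many entries are non-zero

    def adjust(ch, delta):
        nonlocal nonzero
        if counts[ch] == 0:
            nonzero += 1  # entry is about to leave zero
        counts[ch] += delta
        if counts[ch] == 0:
            nonzero -= 1  # entry has just returned to zero

    for ch in b:
        adjust(ch, -1)  # subtract every letter of B
    for ch in a[:m]:
        adjust(ch, +1)  # add the first m letters of A
    if nonzero == 0:
        return 0
    for i in range(m, n):
        adjust(a[i], +1)  # next letter of A enters the window
        adjust(a[i - m], -1)  # letter from m positions back leaves
        if nonzero == 0:
            return i - m + 1
    return -1
```

Each position now costs a constant number of map operations, so the whole scan is O(n + m) assuming constant-time map access.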

You might try to optimise this further by recognising that every time you add a character that doesn't appear in B, you're guaranteed a non-match for the next m characters (this is distinct from the case where a count simply goes positive, as other characters you've passed may cancel that before m). So you can restart the setup (resetting the map to B's counts and loading a fresh run of m letters) from just after that letter. The operations this lets you skip aren't very expensive, though, and this special-case code might not be any faster.