That is basically a tradeoff between computation and memory, so you have a few factors to consider. The lookup gap changes the size of the scratchpad, which is normally fixed at 128KB per hasher (for Litecoin mining). Your GPU has a small local memory for each core with VERY high bandwidth, plus a big global memory that is much slower. (You can see more about the memory architecture of a GPU here: http://www.microway.com/hpc-tech-tips/gpu-memory-types-performance-comparison/ )
The scratchpad sees a massive number of read/write operations, so the more bandwidth you have, the more speed you get. So maybe what is happening is that the scratchpad doesn't fit in your local memory, but with lookup-gap = 2 it is half the size, so more of it fits in local memory than before and the GPU can do more of these operations locally. The price is extra computation, because the entries that are no longer stored have to be recalculated when they are needed.
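To make the size/compute tradeoff concrete, here is a rough Python sketch of the idea. It assumes Litecoin's scrypt parameters (N = 1024 entries of 128 bytes, which gives the 128KB above). The mix() function is only a stand-in for scrypt's real mixing step, and the names scratchpad_bytes, build_table and lookup are made up for this example; this is not how cgminer's OpenCL kernel is actually written.

```python
import hashlib

N = 1024           # scrypt N used by Litecoin
BLOCK_BYTES = 128  # bytes per scratchpad entry (128 * r, with r = 1)


def scratchpad_bytes(gap: int) -> int:
    """Bytes of scratchpad kept per hasher for a given lookup gap."""
    entries = (N + gap - 1) // gap  # only every gap-th entry is stored
    return entries * BLOCK_BYTES


def mix(block: bytes) -> bytes:
    # Stand-in for scrypt's mixing step (assumption: NOT the real function).
    return hashlib.sha256(block).digest()


def build_table(seed: bytes, n: int, gap: int) -> list:
    """Walk the chain of n entries but keep only every gap-th one."""
    table = []
    x = seed
    for i in range(n):
        if i % gap == 0:
            table.append(x)
        x = mix(x)
    return table


def lookup(table: list, gap: int, j: int) -> bytes:
    """Return entry j, recomputing it from the nearest stored entry."""
    x = table[j // gap]
    for _ in range(j % gap):  # up to gap - 1 extra mix() calls
        x = mix(x)
    return x


if __name__ == "__main__":
    for gap in (1, 2, 4):
        print(gap, scratchpad_bytes(gap))
```

With gap = 1 every entry is stored and lookup() does no extra work. With gap = 2 the table is half the size, but about half of the lookups cost one extra mix() call.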
Another point: global memory is shared, and that is a problem when you are using all the cores of your GPU, because they cannot all do read/write operations on it at the same time. With local memory, every processor of the GPU has its own, so all of them can do massive read/write operations on the scratchpad at once.
That is one factor that can make the hashrate drop, but it's not necessarily the cause. There are a LOT of factors that can change your hash rate. I hope it helps :D