A.Q1. You cannot know exactly what is in the cache at any given moment, so this algorithm doesn't really make sense. You might expect most of the data to be in the cache if you read A sequentially from A to A+L3width-1 and avoid doing ANYTHING else, but that amounts to bringing the data into the cache and hoping it stays there for some (short) time.
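
To illustrate the "read A sequentially" idea from A.Q1, here is a minimal sketch in C. The function name `touch_range` and the 64-byte line size are assumptions made for illustration, not something from the question. It only requests each cache line once; it cannot guarantee the data is still cached afterwards, since the hardware or another thread may evict it at any time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. Check the actual value for your CPU. */
#define CACHE_LINE_BYTES 64

/*
 * Hypothetical helper: reads one byte from each cache line in
 * [base, base + len), so every line is requested from the memory hierarchy.
 * The volatile pointer and the returned sum keep the compiler from
 * optimizing the loads away. Nothing here pins the data in any cache level.
 */
uint64_t touch_range(const void *base, size_t len)
{
    assert(base != NULL);
    const volatile unsigned char *p = base;
    uint64_t sum = 0;

    for (size_t i = 0; i < len; i += CACHE_LINE_BYTES)
        sum += p[i];

    return sum;
}
```

Calling `touch_range(A, L3width)` right before the code that needs the data is about the best you can do from user space, and even then it is a hope rather than a guarantee.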
A.Q2. Nope. No way.
A.Q3. Of course it would, even more than L3.