The reason to use ebp is that esp changes while the function runs, e.g. when you push arguments before calling a subroutine. ebp stays fixed, so you can access the same variable with the same offset no matter where esp is pointing at that moment.
When you push a value onto the stack, esp is decremented; when you pop, it is incremented.
The code subtracts 12 (4*3) from esp to make room for three 32-bit (4-byte) integers. ebp points to the "bottom", where esp was before. So you access the variables using negative offsets, e.g. ebp-4. So your picture is wrong: ebp+whatever points to higher addresses, outside the space you reserved, at something that your code should not play with.
BEFORE
lower address
|
|<-------------- esp (=ebp after mov ebp, esp)
|
|
higher address
AFTER mov ebp, esp; sub esp, 12
lower address
|<-------------- esp
|
|
|<-------------- ebp
|
|
higher address
AFTER mov [ebp-4], 10 etc.
lower address
| 2 <-------------- esp
| 5
| 10
|<-------------- ebp
|
|
higher address
At this moment [esp] would retrieve the same value as [ebp-12], i.e. 2.
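To put the whole thing in one place, here is a minimal sketch in NASM syntax for 32-bit x86. The label `func` and the `push ebp` / `pop ebp` pair are my assumptions: that is the conventional prologue and epilogue, and your code may not have it. The extra `push` in the middle is only there to show that esp moves while the ebp offsets stay valid.

```nasm
; Sketch only: 32-bit x86, NASM syntax. "func" is a hypothetical label.
func:
    push ebp                 ; assumed: save the caller's ebp (conventional prologue)
    mov  ebp, esp            ; ebp = fixed reference point for this function
    sub  esp, 12             ; reserve 3 * 4 bytes for the locals

    mov  dword [ebp-4], 10   ; first local
    mov  dword [ebp-8], 5    ; second local
    mov  dword [ebp-12], 2   ; third local; right now [esp] is this same slot

    push dword [ebp-4]       ; esp drops by 4, so [esp] now holds 10 ...
    mov  eax, [ebp-12]       ; ... but [ebp-12] still reads the third local (2)
    add  esp, 4              ; discard the pushed value; esp is back at ebp-12

    mov  esp, ebp            ; release the locals
    pop  ebp                 ; assumed: restore the caller's ebp
    ret
```

After the `push`, reaching the third local through esp would need [esp+4] instead of [esp], and that offset would change again with every further push or pop. With ebp it is [ebp-12] the whole time.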