As noted in the comment by @larsmans, this can be solved within the reinforcement learning paradigm. In the context of neural networks, a popular approach is to use two networks:
actor network: learns which action (propeller power in this case) the agent ought to take in a given state (vertical speed in this case)
critic network: learns state values, that is, how much future reward the agent can "hope" to achieve from a given state
This approach is known as the actor-critic method. The only additional thing you need to do is design the reward (reinforcement) function. In your case it seems quite simple: it could be derived from the vertical velocity, with an additional penalty for deviating from some predefined height (otherwise the networks will simply learn to wait until the propeller falls and stops by itself).
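A reward of this kind could be sketched roughly as follows. This is only an illustration under my own assumptions: the goal is taken to be hovering, so vertical speed is penalised by its magnitude, and the names `hover_reward`, `target_height` and `height_weight` are hypothetical, not something from the question.

```python
def hover_reward(vertical_velocity, height, target_height, height_weight=1.0):
    """Hypothetical reward: higher (closer to zero) when hovering at the target height."""
    # Assumption: any vertical motion is undesirable, so penalise its magnitude.
    velocity_penalty = abs(vertical_velocity)
    # Penalise distance from the predefined height, so that resting on the
    # ground (zero velocity, wrong height) is not the best outcome.
    height_penalty = height_weight * abs(height - target_height)
    return -(velocity_penalty + height_penalty)


# Hovering at the target beats lying still on the ground.
hovering = hover_reward(0.0, 10.0, 10.0)
grounded = hover_reward(0.0, 0.0, 10.0)
print(hovering > grounded)
```

The weight between the two penalties is one more thing to tune; with `height_weight=0` the "fall and stop" strategy becomes optimal again.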
The main difficulty will be tuning all the parameters so that this works correctly; however, the problem seems very simple, so it may not be very hard.
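To make the moving parts and the parameters to tune more concrete, here is a minimal sketch of a single one-step (TD(0)) actor-critic update. To keep it short it uses linear function approximators in place of neural networks and a Gaussian policy with fixed noise; the feature choice, the constants and all function names are my assumptions for illustration only.

```python
GAMMA = 0.99      # discount factor (assumed value)
ACTOR_LR = 1e-3   # actor step size (assumed value)
CRITIC_LR = 1e-2  # critic step size (assumed value)
SIGMA = 0.5       # fixed std. dev. of the Gaussian exploration noise (assumed)


def features(vertical_velocity, height_error):
    # Hand-picked features; a neural network would learn its own representation.
    return [1.0, vertical_velocity, height_error]


def dot(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))


def actor_critic_step(actor_w, critic_w, state, action, reward, next_state):
    """Return updated (actor_w, critic_w, td_error) for one observed transition."""
    x, x_next = features(*state), features(*next_state)
    # TD error: how much better or worse the transition was than the critic expected.
    td_error = reward + GAMMA * dot(critic_w, x_next) - dot(critic_w, x)
    # Critic: move V(state) toward the bootstrapped target.
    critic_w = [w + CRITIC_LR * td_error * xi for w, xi in zip(critic_w, x)]
    # Actor: policy is N(mean, SIGMA^2) with mean = actor_w . x, so the gradient
    # of log-probability w.r.t. the weights is (action - mean) / SIGMA^2 * x.
    mean = dot(actor_w, x)
    grad_scale = (action - mean) / SIGMA ** 2
    actor_w = [w + ACTOR_LR * td_error * grad_scale * xi
               for w, xi in zip(actor_w, x)]
    return actor_w, critic_w, td_error


# One update from a made-up transition: state = (velocity, height error).
actor_w, critic_w, td = actor_critic_step(
    [0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
    state=(0.0, 1.0), action=1.0, reward=-1.0, next_state=(0.0, 0.5),
)
print(td, actor_w, critic_w)
```

In a full agent this update would run in a loop over transitions sampled from the simulator or the real device, and the four constants at the top (plus the reward weights) are the kind of parameters that usually need tuning.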