Please compare what your original problem looks like:
A global optimization problem
minimize f(y)
is solved by looking for solutions of the derivatives system
0=grad f(y) or 0=df/dy (partial derivatives)
(the gradient is the column vector containing all partial derivatives), that is, you are computing the "flat" or horizontal points of f(y).
For optimization under constraints
minimize f(y,u) such that g(y,u)=0
one builds the Lagrangian functional
L(y,p,u) = f(y,u)+p*g(y,u) (scalar product)
and then compute the flat points of that system, that is
g(y,u)=0, dL/dy(y,p,u)=0, dL/du(y,p,u)=0
After that, as also in the global optimization case, you have to determine what the type of the flat point is, maximum, minimun or saddle point.
Optimal control problems have the structure (one of several equivalent variants)
minimize integral(0,T) f(t,y(t),u(t)) dt
such that y'(t)=g(t,y(t),u(t)), y(0)=y0 and h(T,y(T))=0
To solve it, one considers the Hamiltonian
H(t,y,p,u)=f(t,y,u)-p*g(t,y,u)
and obtained the transformed problem
y' = -dH/dp = g, (partial derivatives, gradient)
p' = dH/dy,
with boundary conditions
y(0)=y0, p(T)= something with dh/dy(T,y(T))
u(t) realizes the minimum in v -> H(t,y(t),p(t),v)