When you render a triangle, the vertices' coordinates are interpreted as normalized device coordinates (NDC):
- The x-coordinate determines the horizontal position on the viewport. -1 is the left edge and +1 is the right edge.
- The y-coordinate determines the vertical position on the viewport. -1 is the bottom edge and +1 is the top edge.
- The z-coordinate determines the depth. -1 is the near plane and +1 is the far plane. This value is usually used to write to the depth buffer.
That's why your simple example renders a visible triangle at the far plane.
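To make the mapping concrete, here is a small sketch of the fixed-function NDC-to-window transform. The viewport size (800x600) is a hypothetical choice, and it assumes the default depth range of [0, 1] (i.e., no custom glDepthRange):

```python
def ndc_to_window(ndc, width=800, height=600):
    """Map normalized device coordinates to window coordinates.

    Assumes a viewport at the origin and the default depth range [0, 1].
    """
    x, y, z = ndc
    wx = (x + 1.0) * 0.5 * width      # [-1, 1] -> [0, width]
    wy = (y + 1.0) * 0.5 * height     # [-1, 1] -> [0, height]
    depth = (z + 1.0) * 0.5           # [-1, 1] -> [0, 1] depth-buffer value
    return wx, wy, depth

# A vertex with z = +1 lands exactly at the far plane (depth 1.0):
print(ndc_to_window((0.0, 0.0, 1.0)))  # (400.0, 300.0, 1.0)
```

So a triangle whose vertices all have z = +1 is rendered at the far plane, which is still inside the visible depth range.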
Now let's come to the view transformation. The transformation is constructed from four vectors: the image of (1, 0, 0), the image of (0, 1, 0), the image of (0, 0, 1), and a translation vector. However, since the view transformation is an inverse transformation (it maps from world space to camera space), the resulting matrix has to be inverted.
You are right that the view direction is `center - eye`. However, that is not what we need for the matrix. We need the image of (0, 0, 1). Usually, OpenGL programs use a right-handed coordinate system, in which the camera looks into the negative z-direction. So `center - eye` is actually the image of (0, 0, -1). The image of (0, 0, 1) is then just `eye - center`. That's what you need.
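Putting this together, a lookAt-style construction can be sketched as follows (a minimal numpy version, not the GLM/GLU implementation; the orthonormalization via cross products with the up vector is the standard approach):

```python
import numpy as np

def look_at(eye, center, up):
    eye, center, up = (np.asarray(v, dtype=float) for v in (eye, center, up))
    # Image of (0, 0, 1): eye - center, because the camera looks along -z.
    z = eye - center
    z /= np.linalg.norm(z)
    # Images of (1, 0, 0) and (0, 1, 0), made orthonormal to z.
    x = np.cross(up, z)
    x /= np.linalg.norm(x)
    y = np.cross(z, x)
    # Camera-to-world matrix: columns are the basis images plus the translation.
    m = np.eye(4)
    m[:3, 0], m[:3, 1], m[:3, 2], m[:3, 3] = x, y, z, eye
    # The view matrix is the inverse (world-to-camera).
    return np.linalg.inv(m)

view = look_at(eye=(0, 0, 5), center=(0, 0, 0), up=(0, 1, 0))
# The point the camera looks at ends up on the negative z-axis in camera space:
print(view @ np.array([0.0, 0.0, 0.0, 1.0]))  # [ 0.  0. -5.  1.]
```

Since the upper-left 3x3 block is orthonormal, the inverse could also be computed as the transposed rotation combined with a negated, rotated translation; `np.linalg.inv` is used here only for brevity.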
With this definition you will also need an appropriate projection transformation. Otherwise you will only see things behind the camera (because that's where the z-coordinate is positive and, hence, maps to a positive depth value). The projection transformation is responsible for turning negative z-coordinates into positive depth values.
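This sign flip is visible in the standard perspective matrix (the gluPerspective layout): the -1 in the bottom row copies -z into the clip-space w, so points in front of the camera (negative z) divide out to positive depth. A sketch with assumed example parameters:

```python
import numpy as np

def perspective(fovy_deg, aspect, near, far):
    """Right-handed perspective matrix mapping z in [-near, -far] to NDC [-1, 1]."""
    f = 1.0 / np.tan(np.radians(fovy_deg) / 2.0)
    m = np.zeros((4, 4))
    m[0, 0] = f / aspect
    m[1, 1] = f
    m[2, 2] = (far + near) / (near - far)
    m[2, 3] = 2.0 * far * near / (near - far)
    m[3, 2] = -1.0  # w_clip = -z_eye: this is what flips the sign of the depth
    return m

proj = perspective(60.0, 1.0, 0.1, 100.0)
# A point in front of the camera (negative eye-space z) ...
clip = proj @ np.array([0.0, 0.0, -1.0, 1.0])
ndc_z = clip[2] / clip[3]  # perspective divide -> positive NDC depth
```

Points on the near plane (z = -near) come out at NDC depth -1, points on the far plane (z = -far) at +1, matching the depth range described at the top.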