Picking in 3D with ray-tracing using NinevehGL or OpenGL on iPhone

Question 1

What you have is a 2D position on the screen. The first thing to do is convert that point from pixels to normalized device coordinates, which range from -1 to 1. Then you need to find the line in 3D space that the point represents. For this, you need the transformation matrices that your 3D app uses to create the projection and camera.

Typically you have three matrices: projection, view and model. When you specify vertices for an object, they're in "object space". Multiplying by the model matrix gives the vertices in "world space". Multiplying again by the view matrix gives "eye/camera space". Multiplying again by the projection gives "clip space". Clip space has non-linear depth. Adding a Z component to your mouse coordinates puts them in clip space. You can perform the line/object intersection tests in any linear space, so you must at least move the mouse coordinates to eye space, but it's more convenient to perform the intersection tests in world space (or object space, depending on your scene graph).

To move the mouse coordinates from clip space to world space, add a Z component and multiply by the inverse projection matrix and then the inverse camera/view matrix. To create a line, compute two points along Z: from and to.

In the following example, I have a list of objects, each with a position and bounding radius. A bounding sphere of course never matches the object perfectly, but it works well enough for now. This isn't pseudocode, but it uses my own vector/matrix library. You'll have to substitute your own in places.

vec2f mouse = (vec2f(mousePosition) / vec2f(windowSize)) * 2.0f - 1.0f;
mouse.y = -mouse.y; //origin is top-left and +y mouse is down

mat44 toWorld = (camera.projection * camera.transform).inverse();
//equivalent to camera.transform.inverse() * camera.projection.inverse() but faster

vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);

from /= from.w; //perspective divide ("normalize" homogeneous coordinates)
to /= to.w;

int clickedObject = -1;
float minDist = 99999.0f;

for (size_t i = 0; i < objects.size(); ++i)
{
    float t1, t2;
    vec3f direction = to.xyz() - from.xyz();
    //intersectSphere assumes a sphere at the origin, so move the ray into the sphere's local space
    if (intersectSphere(from.xyz() - objects[i].position, direction, objects[i].radius, t1, t2))
    {
        //object i has been clicked. probably best to find the minimum t1 (front-most object)
        if (t1 < minDist)
        {
            minDist = t1;
            clickedObject = (int)i;
        }
    }
}

//clicked object is objects[clickedObject]; if clickedObject is still -1, nothing was hit

Instead of intersectSphere, you could use a bounding box or other implicit geometry, or intersect a mesh's triangles (this may require building a kd-tree for performance reasons).
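As one possible bounding-box alternative, here is a minimal sketch of a ray versus axis-aligned box test using the common "slab" method. It is not from my library: the Vec3 struct and the intersectBox name are stand-ins invented for this example, so substitute your own vector type. Like intersectSphere, it treats the ray as an infinite line, so the returned times can be negative.

```cpp
#include <algorithm>
#include <limits>
#include <utility>

// Minimal stand-in vector type (hypothetical); substitute your library's vec3f.
struct Vec3 { float x, y, z; };

// Ray at position p with direction d against the box [boxMin, boxMax].
// On a hit, tNear and tFar are the entry and exit times along the ray.
bool intersectBox(const Vec3& p, const Vec3& d,
                  const Vec3& boxMin, const Vec3& boxMax,
                  float& tNear, float& tFar)
{
    const float origin[3] = { p.x, p.y, p.z };
    const float dir[3]    = { d.x, d.y, d.z };
    const float lo[3]     = { boxMin.x, boxMin.y, boxMin.z };
    const float hi[3]     = { boxMax.x, boxMax.y, boxMax.z };

    tNear = -std::numeric_limits<float>::infinity();
    tFar  =  std::numeric_limits<float>::infinity();

    for (int axis = 0; axis < 3; ++axis)
    {
        if (dir[axis] == 0.0f)
        {
            // Ray is parallel to this slab: a miss unless the origin lies inside it
            if (origin[axis] < lo[axis] || origin[axis] > hi[axis])
                return false;
            continue;
        }

        // Times at which the ray crosses the two planes of this slab
        float t0 = (lo[axis] - origin[axis]) / dir[axis];
        float t1 = (hi[axis] - origin[axis]) / dir[axis];
        if (t0 > t1)
            std::swap(t0, t1);

        // Shrink the interval during which the ray is inside every slab so far
        tNear = std::max(tNear, t0);
        tFar  = std::min(tFar, t1);
        if (tNear > tFar)
            return false;
    }
    return true;
}
```

In the picking loop, tNear would play the same role as t1 from the sphere test when looking for the front-most object.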

[EDIT]
Here's an implementation of the line/sphere intersection test (based on the link above). It assumes the sphere is at the origin, so instead of passing from.xyz() as p, pass from.xyz() - objects[i].position.

//ray at position p with direction d intersects sphere at (0,0,0) with radius r. returns intersection times along ray t1 and t2
bool intersectSphere(const vec3f& p, const vec3f& d, float r, float& t1, float& t2)
{
    //http://wiki.cgsociety.org/index.php/Ray_Sphere_Intersection
    float A = d.dot(d);
    float B = 2.0f * d.dot(p);
    float C = p.dot(p) - r * r;

    float dis = B * B - 4.0f * A * C;

    if (dis < 0.0f)
        return false;

    float S = sqrt(dis);

    t1 = (-B - S) / (2.0f * A);
    t2 = (-B + S) / (2.0f * A);
    return true;
}
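For reference, the coefficients A, B and C come from substituting the ray p + t d into the equation of a sphere centered at the origin and collecting powers of t. This is the standard derivation, written here with the same symbols as the function:

```latex
\begin{aligned}
\lVert \mathbf{p} + t\,\mathbf{d} \rVert^2 &= r^2 \\
(\mathbf{d}\cdot\mathbf{d})\,t^2 + 2\,(\mathbf{d}\cdot\mathbf{p})\,t + (\mathbf{p}\cdot\mathbf{p} - r^2) &= 0 \\
A = \mathbf{d}\cdot\mathbf{d}, \quad B = 2\,\mathbf{d}\cdot\mathbf{p}, \quad C &= \mathbf{p}\cdot\mathbf{p} - r^2 \\
t_{1,2} &= \frac{-B \mp \sqrt{B^2 - 4AC}}{2A}
\end{aligned}
```

A negative discriminant means the line misses the sphere. Because the direction passed in is to - from and is not normalized, t = 0 corresponds to from and t = 1 corresponds to to.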

Question 2

vec4f from = toWorld * vec4f(mouse, -1.0f, 1.0f);
vec4f to = toWorld * vec4f(mouse, 1.0f, 1.0f);

I'm assuming that 'from' is the position of the mouse cursor? If so, why is its z negative one, if we are assuming OpenGL coordinates?

Also, done this way, are we assuming that the depth at this point runs from -1 to +1, rather than over the depth of our frustum?