The libraries you mention don't do this automatically. I'm not sure how savvy you are with image processing, but for the sake of an answer I'll assume you are.
I'm not sure if you want to simply label pixels as "wall" or "finger", or if you want to know when the finger has come within some distance threshold of the wall.
If you have a clean frame, where the Kinect is just looking at the wall and there are no fingers, you can use it to find exactly where the wall is. Build a simple depth-based background model by averaging the depth at each pixel over several clean frames. Then, for subsequent frames, any pixel whose depth differs significantly from the background (say, by more than 3 cm) can be declared a finger.
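A minimal sketch of that background-subtraction idea, assuming depth frames arrive as NumPy arrays in millimetres (the function and array names here are illustrative, not from any particular Kinect library):

```python
import numpy as np

def build_background(depth_frames):
    """Average several clean depth frames (wall only, no fingers)
    into a per-pixel background model."""
    return np.mean(np.stack(depth_frames), axis=0)

def finger_mask(depth_frame, background, threshold_mm=30):
    """Label pixels whose depth differs from the wall model by more
    than the threshold (30 mm = the 3 cm suggested above)."""
    return np.abs(depth_frame.astype(np.float64) - background) > threshold_mm

# Toy example: a flat wall at 1500 mm, then one pixel 50 mm in front of it.
bg = build_background([np.full((4, 4), 1500.0) for _ in range(10)])
frame = np.full((4, 4), 1500.0)
frame[1, 2] = 1450.0          # simulated fingertip
mask = finger_mask(frame, bg)
```

In practice you would also want to ignore invalid depth readings (the Kinect reports 0 for pixels it can't resolve) before averaging or thresholding.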
A slightly more advanced alternative is to use surface normals. For each depth pixel (x, y, z), take the cross product of the vector from that pixel to its right neighbour (x+1, y, z1) and the vector to its lower neighbour (x, y+1, z2), where z1 and z2 are the depths at those neighbours. If you're looking at a non-curving wall, the surface normals should be uniform across it. In the Kinect Fusion video, they map surface normals (x, y, z) to colours (r, g, b) and the effect is really nice. Anything whose surface normal differs from the wall's, you can declare a finger.
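The per-pixel cross product above can be sketched like this, assuming you've already converted the depth map into an (H, W, 3) array of camera-space (x, y, z) points (names here are illustrative):

```python
import numpy as np

def surface_normals(points):
    """points: (H, W, 3) array of (x, y, z) camera-space coordinates.
    Returns (H-1, W-1, 3) unit normals from the cross product of the
    vectors to each pixel's right and lower neighbours."""
    dx = points[:, 1:, :] - points[:, :-1, :]   # vector to (x+1, y) neighbour
    dy = points[1:, :, :] - points[:-1, :, :]   # vector to (x, y+1) neighbour
    # Trim both difference arrays to a common shape before crossing.
    n = np.cross(dx[:-1, :, :], dy[:, :-1, :])
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n

# Toy example: a flat wall at constant depth -> every normal is (0, 0, 1).
h, w = 5, 5
ys, xs = np.mgrid[0:h, 0:w]
pts = np.dstack([xs, ys, np.full((h, w), 1500.0)]).astype(np.float64)
normals = surface_normals(pts)
```

To segment fingers, you'd then threshold the angle (or dot product) between each pixel's normal and the wall's average normal, rather than testing for exact equality, since real depth data is noisy.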