Recently, I was asked about the projection matrix in depth. I was discussing the topic with a colleague & realized that, for some reason, I had always taken this matrix for granted. With functions such as D3DXMatrixPerspectiveFovLH for DirectX & glm::perspective() for OpenGL, generating this matrix is a piece of cake. The real crux, however, lies in understanding how each term is evaluated & what it means. Questions such as why far-away points converge the way they do under perspective projection, or how the near & far planes govern the output, are best understood once you know what is happening under the hood. Earlier I referred to the 3D Game Primer book, which has an excellent explanation of how this matrix is calculated. The author is spot on with some of his explanations.
Another interesting piece of information on this topic is available on Songho, & I personally feel it is the best in-depth treatment you can get. Still, I find that even this material is not as easy to digest as it looks. In this post, I have tried to explain it in my own words. I find it quite useful to note things down in my own words somewhere so that they can be easily recollected in future if the need arises. It might also help someone else in some way.
I have taken scans of my notebook where I have scribbled down some of the formulas required.
Perspective Divide : If you look at the image above, it shows how a simple relation between two similar triangles can be used to find the projected point. If we apply the same analogy to the camera frustum ( side view & top view ), we get a relation for computing the projected point from the actual point. In the case of a camera, projected points are calculated by "dividing" by Z. This is the core concept we need to keep in mind all the time.
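Written out, the similar-triangles relation looks roughly like this. It is a sketch under an assumption of mine, namely the usual OpenGL eye-space convention in which the camera looks down the negative z axis. Here n is the distance to the near plane, (x_e, y_e, z_e) is the point in eye space and (x_p, y_p) is its projection onto the near plane; these symbol names are chosen for illustration.

```latex
\frac{x_p}{n} = \frac{x_e}{-z_e}
\;\Rightarrow\;
x_p = \frac{n\,x_e}{-z_e},
\qquad
\frac{y_p}{n} = \frac{y_e}{-z_e}
\;\Rightarrow\;
y_p = \frac{n\,y_e}{-z_e}
```

In a left-handed setup, where visible points have positive z, the minus sign disappears and the relation is literally a division by z.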
What is NDC : NDC stands for Normalized Device Coordinates. In perspective projection, a 3D point inside the frustum is mapped to a cube spanning [-1,1] on each axis; this space is known as NDC. So, the goal is to bring the projected points into the range [-1,1]. We will see what happens to the Z coordinate shortly.
We begin by mapping the projected points into the range [-1,1]. As you can see from the frustum diagram, the x value on the near plane varies over [l,r] whereas the y value on the near plane varies over [b,t]. So we know the minima & maxima for both our current values & the target values. This can be plotted as a graph with one axis varying over [l,r] & the other varying over [-1,1]. As you can see in the image above, if we consider a straight line from the point [l,-1] to the point [r,1], its equation can be found using the standard line equation, with 'm' being the slope of the line.
Once we figure out the slopes (m1 & m2) & the line intercepts (C1 & C2), we have line equations relating the NDC co-ords to the projected co-ords. As we saw in our earlier image, the projected points can also be expressed in terms of a division by the Z coordinate. Substituting & rearranging terms, we get equations I & II.
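For reference, here is how the line fit and the substitution work out. This assumes the OpenGL convention (camera looking down negative z, so the projected point is x_p = n x_e / (-z_e)), and it assumes that equation I is the one for x and equation II the one for y. The subscripts e, p and n (eye, projected, NDC) are my own labels.

```latex
\begin{aligned}
x_n &= m_1 x_p + C_1, & m_1 &= \frac{1-(-1)}{r-l} = \frac{2}{r-l}, & C_1 &= 1 - m_1 r = -\frac{r+l}{r-l} \\
y_n &= m_2 y_p + C_2, & m_2 &= \frac{2}{t-b}, & C_2 &= 1 - m_2 t = -\frac{t+b}{t-b}
\end{aligned}
```

Substituting the projected co-ords and putting everything over the common denominator -z_e gives:

```latex
\begin{aligned}
x_n &= \frac{\dfrac{2n}{r-l}\,x_e + \dfrac{r+l}{r-l}\,z_e}{-z_e} && \text{(I)} \\
y_n &= \frac{\dfrac{2n}{t-b}\,y_e + \dfrac{t+b}{t-b}\,z_e}{-z_e} && \text{(II)}
\end{aligned}
```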
Let's recall that a 4D homogeneous vector is mapped to the corresponding 3D vector by dividing by 'w'. Hence, the goal of the projection matrix is to get the correct value into w so that this division produces the desired projection! In fact, the projection doesn't take place when we multiply by this matrix; it happens when we divide x, y, z by w.
Please refer to the above image to understand what is happening under the hood for correct projection to occur. [x y z w] gets mapped to [x y z z] using the matrix specified, which on division by w, i.e. z in our case, produces the desired projection because both x & y get divided by z. ( Remember Perspective divide? ) There are a few differences between APIs in terms of handedness & row-major or column-major matrices.
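To make the "copy z into w, then divide" idea concrete, here is a minimal C++ sketch. The names Vec4, Mat4, mul, perspectiveDivide and copyZIntoW are made up for illustration and are not from any library. The matrix is written for column vectors, and it uses the simplified picture above where w simply receives z (no sign flip, no near/far handling).

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // indexed as m[row][col]

// Multiply a 4x4 matrix by a column vector.
Vec4 mul(const Mat4& m, const Vec4& v) {
    Vec4 out{};  // zero-initialised accumulator
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            out[r] += m[r][c] * v[c];
    return out;
}

// Homogeneous divide: this is the step where the projection actually happens.
Vec4 perspectiveDivide(const Vec4& v) {
    assert(v[3] != 0.0);  // a vector with w == 0 has no 3D counterpart
    return {v[0] / v[3], v[1] / v[3], v[2] / v[3], 1.0};
}

// Identity matrix, except that the last row copies z into w.
Mat4 copyZIntoW() {
    return {{
        {1, 0, 0, 0},
        {0, 1, 0, 0},
        {0, 0, 1, 0},
        {0, 0, 1, 0},
    }};
}
```

Multiplying the point (2, 4, 2, 1) by copyZIntoW() only changes w to 2; it is the call to perspectiveDivide() afterwards that divides x & y by z.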
By convention, OpenGL is RHS & uses column-major matrices whereas D3D is LHS & uses row-major matrices. If you ever need to go from one API to the other, the projection matrix is one of the major points of confusion. As a rule of thumb ( any hand thumb is fine here :D ), we need to transpose the matrix & invert the Z component while going from D3D to OpenGL or vice versa. Keep in mind that D3D also maps depth to [0,1] rather than OpenGL's [-1,1], so the Z row differs by more than just a sign.
After comparing equations I & II with this matrix, we can fill in the entries of our matrix to get the desired x & y projections.
Finally, we have to map the Z param into the range [-1,1]. This can be done by using the current matrix & calculating the A & B values as described in the image above.
Once we have the values of A & B evaluated, we can construct our final projection matrix, which takes points inside the view frustum to NDC space once the divide by w has been applied.
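With the x & y rows filled in, the matrix looks roughly like this. It is a sketch that assumes equations I & II have the usual OpenGL form, with -z_e as the common denominator; the numerators become the first two rows, and the last row is (0, 0, -1, 0) rather than (0, 0, 1, 0) so that w receives -z_e. A and B are the still-unknown entries of the Z row, and the subscripts c and e (clip, eye) are my own labels.

```latex
\begin{pmatrix} x_c \\ y_c \\ z_c \\ w_c \end{pmatrix}
=
\begin{pmatrix}
\frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & A & B \\
0 & 0 & -1 & 0
\end{pmatrix}
\begin{pmatrix} x_e \\ y_e \\ z_e \\ 1 \end{pmatrix}
```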
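The calculation of A & B can be worked through as follows. This is a sketch assuming the OpenGL convention: the Z row of the matrix is (0, 0, A, B), w receives -z_e, the near plane at z_e = -n must land on -1 and the far plane at z_e = -f must land on +1.

```latex
\begin{aligned}
z_n &= \frac{A\,z_e + B}{-z_e} \\
z_e = -n:&\quad \frac{-A\,n + B}{n} = -1 \;\Rightarrow\; -A\,n + B = -n \\
z_e = -f:&\quad \frac{-A\,f + B}{f} = +1 \;\Rightarrow\; -A\,f + B = f \\
\text{subtracting:}&\quad A\,(f-n) = -(f+n) \;\Rightarrow\; A = -\frac{f+n}{f-n} \\
\text{back-substituting:}&\quad B = A\,n - n = -\frac{2fn}{f-n}
\end{aligned}
```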
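Putting the pieces together in the OpenGL convention (an assumption on my part: column vectors, camera looking down negative z, depth mapped to [-1,1]), the final matrix should look like this. It is the same matrix that the glFrustum reference page documents.

```latex
P =
\begin{pmatrix}
\frac{2n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\
0 & 0 & -1 & 0
\end{pmatrix}
```

For a symmetric frustum (l = -r and b = -t) the two entries in the third column of the x & y rows vanish, which is the case a field-of-view based function such as glm::perspective() produces.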
The code above shows our projection matrix being used directly in place of the glm::perspective() function, which performs the same projection of points / vectors. Refer to the images below for the final output produced using both the glm::perspective() function & our custom projection matrix calculations.
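The comparison can also be sketched without a renderer. The C++ below is not the code from this post and does not call glm; frustum(), perspective() and project() are hypothetical helpers written for illustration. frustum() builds the hand-derived OpenGL-style matrix, while perspective() uses the field-of-view form that, to my understanding, glm::perspective() computes in its default right-handed, [-1,1]-depth configuration. Matrices are stored as m[row][col] for column vectors, which is the transpose of glm's in-memory layout.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<std::array<double, 4>, 4>;  // indexed as m[row][col]

// Hand-derived matrix from the frustum bounds l, r, b, t and planes n, f.
Mat4 frustum(double l, double r, double b, double t, double n, double f) {
    assert(r != l && t != b && n > 0.0 && f > n);
    Mat4 m{};                           // start with all zeros
    m[0][0] = 2.0 * n / (r - l);
    m[0][2] = (r + l) / (r - l);
    m[1][1] = 2.0 * n / (t - b);
    m[1][2] = (t + b) / (t - b);
    m[2][2] = -(f + n) / (f - n);       // A
    m[2][3] = -2.0 * f * n / (f - n);   // B
    m[3][2] = -1.0;                     // w_clip = -z_eye
    return m;
}

// Field-of-view form (fovy in radians), for a symmetric frustum.
Mat4 perspective(double fovy, double aspect, double n, double f) {
    assert(aspect > 0.0 && n > 0.0 && f > n);
    const double tanHalf = std::tan(fovy / 2.0);
    Mat4 m{};
    m[0][0] = 1.0 / (aspect * tanHalf);
    m[1][1] = 1.0 / tanHalf;
    m[2][2] = -(f + n) / (f - n);
    m[2][3] = -2.0 * f * n / (f - n);
    m[3][2] = -1.0;
    return m;
}

// Transform an eye-space point to NDC: matrix multiply, then divide by w.
Vec4 project(const Mat4& m, const Vec4& p) {
    Vec4 c{};
    for (int r = 0; r < 4; ++r)
        for (int k = 0; k < 4; ++k)
            c[r] += m[r][k] * p[k];
    assert(c[3] != 0.0);
    return {c[0] / c[3], c[1] / c[3], c[2] / c[3], 1.0};
}
```

To compare the two, pick t = n * tan(fovy / 2) and r = t * aspect, then build frustum(-r, r, -t, t, n, f) and perspective(fovy, aspect, n, f); the entries should agree up to floating-point rounding. The top-right near corner (r, t, -n) should project to roughly (1, 1, -1), and a point on the far plane should get a depth of roughly +1.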
[Image: our custom projection matrix implementation...]
[Image: glm::perspective() function ...]
As you can see, the output matches! This is how I have understood projection matrix evaluation & usage. I strongly recommend reading Songho's explanation for anything that I might have missed. Please feel free to point out any mistakes in the above description.
Hope it helps!