I had to dig a little bit, but eventually, I found some hidden gems.
According to Luis Argerich:

“The matrix factorization algorithms used for recommender systems try to find two matrices, P and Q, such that P*Q matches the KNOWN values of the utility matrix. This principle appeared in the famous SVD++ ‘Factorization meets the neighborhood’ paper, which unfortunately used the name ‘SVD++’ for an algorithm that has absolutely no relationship with the SVD.”
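Argerich’s description can be made concrete with a tiny sketch (the toy matrix and variable names here are mine, purely for illustration):

```python
import numpy as np

# Toy utility matrix; 0 marks an UNKNOWN rating (hypothetical data).
R = np.array([[5., 0., 3.],
              [0., 2., 1.]])
known = R > 0

# P (users x k) and Q (k x items) are the two matrices being sought.
k = 2
P = np.ones((2, k))
Q = np.ones((k, 3))

# The fit is judged only on the KNOWN entries of R.
loss = np.sum(((R - P @ Q) ** 2)[known])  # → 11.0 for this all-ones start
```

The goal is then to adjust P and Q so that this masked squared error shrinks; the unknown entries of P @ Q become the predicted ratings.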
For the record, I believe it was Funk, not the authors of SVD++, who first proposed this style of matrix factorization for recommender systems.
In fact, SVD++, as its name suggests, is an extension of Funk’s work.
Xavier Amatriain gives us a bigger picture:

“Let’s start by pointing out that the method usually referred to as ‘SVD’ in the context of recommendations is not, strictly speaking, the mathematical Singular Value Decomposition of a matrix, but rather an approximate way to compute a low-rank approximation of the matrix by minimizing the squared error loss. A more accurate, albeit more generic, way to call this would be Matrix Factorization. The initial version of this approach in the context of the Netflix Prize was presented by Simon Funk in his famous Try This at Home blog post. It is important to note that the ‘true SVD’ approach had indeed been applied to the same task years before, with not much practical success.”
Wikipedia also has similar information buried in its Matrix factorization (recommender systems) article:

“The original algorithm proposed by Simon Funk in his blog post factorized the user-item rating matrix as the product of two lower-dimensional matrices: the first one has a row for each user, while the second has a column for each item. The row or column associated with a specific user or item is referred to as that user’s or item’s latent factors. Note that, despite its name, in FunkSVD no singular value decomposition is applied.”
SVD is a somewhat complex mathematical technique that factorizes a matrix into three new matrices and has many applications, including PCA and recommender systems.
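For contrast, here is what the actual SVD looks like: it really does produce three matrices, as a quick numpy check shows (the toy matrix is mine):

```python
import numpy as np

R = np.array([[5., 3., 1.],
              [4., 2., 1.],
              [1., 1., 5.]])

# np.linalg.svd returns THREE factors: U, the singular values, and V^T.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Multiplying them back together recovers R up to floating-point error.
assert np.allclose(U @ np.diag(s) @ Vt, R)
```

Truncating the singular values would give the best low-rank approximation in the least-squares sense, which is the property people loosely have in mind when they borrow the name.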
Simon Funk applied a very smart strategy in the 2006 Netflix competition, factorizing the rating matrix into two other matrices and using gradient descent to find optimal values of the features and weights.
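A minimal sketch of that strategy (the toy data and hyper-parameters here are mine for illustration, not Funk’s actual settings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy rating matrix; 0 marks an unknown rating (hypothetical data).
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])
known = np.argwhere(R > 0)

k, lr, reg = 2, 0.01, 0.02          # illustrative hyper-parameters
P = rng.normal(scale=0.1, size=(R.shape[0], k))   # user features
Q = rng.normal(scale=0.1, size=(k, R.shape[1]))   # item features

# Gradient descent on the squared error of the KNOWN ratings only,
# with a small regularization term to keep the factors from blowing up.
for _ in range(1000):
    for u, i in known:
        err = R[u, i] - P[u] @ Q[:, i]
        P[u]    += lr * (err * Q[:, i] - reg * P[u])
        Q[:, i] += lr * (err * P[u]   - reg * Q[:, i])
```

After training, P @ Q approximates the known ratings, and its entries at the previously unknown positions serve as predictions.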
It’s not SVD, but he used that term anyway to describe his technique.
The more appropriate term for what Funk did is Matrix Factorization.
Because of the good results and the fame that followed, people still call that technique SVD because, well, that’s how the author named it.
I hope this helps to clarify things a bit.