At its core, cosine similarity measures how similar two vectors are by comparing the angle between them, not their size. This makes it useful because it ignores things like changes in image brightness, the size of faces, or other factors that don't affect the identity itself. Many modern face recognition and recommendation systems are trained to use cosine similarity, so using it when comparing faces or ranking your Amazon recommendations makes sense. Cosine similarity is also cheap to compute (just a dot product and two magnitudes), and it often gives better results than Euclidean distance for identifying people or making personalised recommendations.
$$\cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|}$$

Where:
- $A \cdot B$ is the dot product of vectors A and B
- $\|A\|$ is the magnitude (norm) of vector A
- $\|B\|$ is the magnitude (norm) of vector B
It outputs a value between -1 and 1:
- 1 = exactly the same direction (very similar)
- 0 = completely orthogonal (unrelated)
- -1 = exactly opposite direction
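To make the three cases concrete, here is a minimal sketch in plain Python. The function name `cosine_similarity` and the toy 2-D vectors are illustrative choices, not taken from any particular library, and the zero-vector check is an added assumption, since the angle is undefined when either vector has no length.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))        # dot product of a and b
    norm_a = math.sqrt(sum(x * x for x in a))     # magnitude of a
    norm_b = math.sqrt(sum(y * y for y in b))     # magnitude of b
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat the zero vector as an error, since it has no direction.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors chosen to hit the three reference points.
same = cosine_similarity([3.0, 4.0], [6.0, 8.0])        # same direction, close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # right angle, close to 0
opposite = cosine_similarity([3.0, 4.0], [-3.0, -4.0])  # opposite direction, close to -1

print(same, orthogonal, opposite)
```

Real embeddings have hundreds of dimensions rather than two, but the calculation is identical.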
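But here's the real kicker: it ignores the size (magnitude) of the vectors. So if two people wrote the exact same review but one used ALL CAPS and emojis while the other wrote it chill, cosine would still say "yep, same vibe." Put more precisely, multiplying a vector by any positive number leaves its cosine similarity with every other vector unchanged.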
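A quick way to see the magnitude point is to scale a vector and compare the result under both measures. The sketch below is a hedged illustration: the `review` vector is made-up word counts, `louder` is simply the same vector multiplied by 10, and `math.hypot` with multiple arguments and `math.dist` both assume Python 3.8 or newer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# Hypothetical word-count vector for a review (made-up numbers).
review = [2.0, 1.0, 0.0, 3.0]
# The "louder" version: same direction, ten times the magnitude.
louder = [10 * x for x in review]

cos = cosine_similarity(review, louder)   # stays close to 1
dist = math.dist(review, louder)          # grows with the scale factor

print(f"cosine similarity:  {cos:.4f}")
print(f"euclidean distance: {dist:.4f}")
```

Cosine similarity treats the two vectors as essentially identical, while Euclidean distance reports them as far apart, which is the practical difference between the two measures when only direction matters.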