Several ways to calculate squared Euclidean distance matrices in Python
The need to compute squared Euclidean distances between data points arises in many data mining, pattern recognition, and machine learning algorithms. Often we need not just a single distance but the whole matrix of pairwise squared distances.
If we are given an m×n data matrix X = [x1, x2, …, xn] whose n column vectors xi are m-dimensional data points, the task is to compute an n×n matrix D ∈ ℝ^(n×n) with entries Dij = ||xi − xj||².
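To make the definition concrete, here is a tiny worked example that translates it directly into loops. The toy matrix X is made up for illustration; it is not data from any real problem.

```python
import numpy as np

# Hypothetical toy data: m = 2 dimensions, n = 3 points stored as columns.
# x1 = (0, 0), x2 = (3, 4), x3 = (0, 1)
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])

n = X.shape[1]
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        diff = X[:, i] - X[:, j]
        D[i, j] = np.sum(diff ** 2)  # ||x_i - x_j||^2

print(D)
```

D is symmetric with zeros on the diagonal. For instance, D[0, 1] is 25 because x1 = (0, 0) and x2 = (3, 4) are 5 apart, and 5² = 25.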
There are already many ways to compute Euclidean distances in Python. Here I provide several methods that I know and use often at work.
Five methods:
- numpy.linalg.norm(x, ord, axis)
- numpy.dot(vector, vector)
- using the Gram matrix G = Xᵀ X
- avoiding for loops (vectorization)
- SciPy built-in function
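The Gram matrix entry in the list rests on expanding the squared norm. Written out, with G denoting the Gram matrix of the columns of X:

```latex
\begin{aligned}
D_{ij} &= \lVert x_i - x_j \rVert^2 \\
       &= (x_i - x_j)^\top (x_i - x_j) \\
       &= x_i^\top x_i - 2\, x_i^\top x_j + x_j^\top x_j \\
       &= G_{ii} - 2\, G_{ij} + G_{jj},
\qquad G = X^\top X .
\end{aligned}
```

So once G is computed with a single matrix product, every entry of D follows from three entries of G, with no further access to the data points themselves.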
The functions for the methods above are attached below; they can be called directly from your own Python script.
- Import the modules first:
- The five functions are as follows:
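One possible sketch of the imports and the five functions is given here. The function names are hypothetical, and the choice of `scipy.spatial.distance.cdist` with `metric="sqeuclidean"` as the SciPy built-in is an assumption; other SciPy routines such as `pdist` plus `squareform` would work as well. All functions assume X stores the data points as columns, as defined above.

```python
import numpy as np
from scipy.spatial.distance import cdist


def sqdist_norm(X):
    """Method 1: loop over column pairs, squaring numpy.linalg.norm."""
    n = X.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = np.linalg.norm(X[:, i] - X[:, j]) ** 2
            D[j, i] = D[i, j]  # D is symmetric
    return D


def sqdist_dot(X):
    """Method 2: loop over column pairs, using numpy.dot on the difference."""
    n = X.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = X[:, i] - X[:, j]
            D[i, j] = np.dot(d, d)
            D[j, i] = D[i, j]
    return D


def sqdist_gram_loop(X):
    """Method 3: Gram matrix G = X.T @ X, D_ij = G_ii + G_jj - 2 G_ij."""
    G = X.T @ X
    n = X.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = G[i, i] + G[j, j] - 2 * G[i, j]
            D[j, i] = D[i, j]
    return D


def sqdist_gram_vectorized(X):
    """Method 4: same Gram identity, with broadcasting instead of for loops."""
    G = X.T @ X
    g = np.diag(G)                       # squared norms of the columns
    D = g[:, None] + g[None, :] - 2 * G
    return np.maximum(D, 0)              # clip tiny negatives from round-off


def sqdist_scipy(X):
    """Method 5: SciPy built-in; cdist expects points as rows, so pass X.T."""
    return cdist(X.T, X.T, metric="sqeuclidean")
```

Each function takes the m×n matrix X and returns the n×n matrix D, so they can be swapped for one another, for example `D = sqdist_gram_vectorized(X)`. The Gram-based versions can differ from the others by floating-point round-off, which is why the vectorized variant clips small negative values to zero.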
I hope this summary helps. If you have any questions, please leave a comment. If you like it, your applause would be appreciated.
Thanks for reading…