Several ways to calculate squared Euclidean distance matrices in Python

R. Jin
2 min read · Sep 9, 2019

The need to compute squared Euclidean distances between data points arises in many data mining, pattern recognition, and machine learning algorithms. Often we need not just a single distance but the whole matrix of pairwise squared distances.

Given an m×n data matrix X = [x1, x2, …, xn] whose n column vectors xi are m-dimensional data points, the task is to compute the n×n matrix D ∈ ℝ^(n×n) with entries Dij = ||xi − xj||².

There are many ways to compute Euclidean distances in Python. Here I describe several methods that I know and often use at work.

The five methods:

  • numpy.linalg.norm(x, ord, axis)
  • numpy.dot(vector, vector)
  • using the Gram matrix G = X.T X
  • using the Gram matrix without for loops
  • SciPy built-in function

The functions for these methods are given below; they can be called directly from your own Python script.

  • Import modules first:
Import modules that are used in the functions
  • The five method functions are as follows:
Method 1: numpy.linalg.norm
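
A minimal sketch of this method, assuming the data points are the columns of X as defined above. The function name `sq_dist_norm` and the toy matrix are illustrative choices, not taken from the original post. Each entry is computed by taking the norm of a difference vector and squaring it.

```python
import numpy as np

def sq_dist_norm(X):
    """Squared Euclidean distance matrix using np.linalg.norm and explicit loops.

    X is an (m, n) array whose columns are the n data points.
    """
    n = X.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Default ord=None gives the 2-norm for a 1-D vector.
            d = np.linalg.norm(X[:, i] - X[:, j]) ** 2
            D[i, j] = d
            D[j, i] = d  # the matrix is symmetric
    return D

# Toy data (hypothetical): three 2-D points (0,0), (3,4), (0,1) stored as columns.
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])
D_norm = sq_dist_norm(X)
```

Only the upper triangle is computed and then mirrored, since Dij = Dji and the diagonal is zero.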
Method 2: numpy.dot(vector, vector)
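
One way this could look, again assuming column-wise data points; the name `sq_dist_dot` is hypothetical. The dot product of a difference vector with itself gives the squared distance directly, so no square root is taken and then undone.

```python
import numpy as np

def sq_dist_dot(X):
    """Squared Euclidean distance matrix using np.dot on difference vectors.

    X is an (m, n) array whose columns are the n data points.
    """
    n = X.shape[1]
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            diff = X[:, i] - X[:, j]
            d = np.dot(diff, diff)  # sum of squared components
            D[i, j] = d
            D[j, i] = d
    return D

# Toy data (hypothetical): three 2-D points (0,0), (3,4), (0,1) stored as columns.
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])
D_dot = sq_dist_dot(X)
```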
Method 3: using Gram matrix
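
The idea rests on the expansion ||xi − xj||² = xi·xi − 2 xi·xj + xj·xj = Gii − 2Gij + Gjj, where G = X.T X. A sketch under that assumption follows; the function name `sq_dist_gram_loops` is illustrative. The Gram matrix is computed once, and the loops only read entries from it.

```python
import numpy as np

def sq_dist_gram_loops(X):
    """Squared Euclidean distance matrix from the Gram matrix, with loops.

    X is an (m, n) array whose columns are the n data points.
    """
    n = X.shape[1]
    G = X.T @ X  # (n, n) Gram matrix, G[i, j] = x_i . x_j
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = G[i, i] - 2.0 * G[i, j] + G[j, j]
            D[i, j] = d
            D[j, i] = d
    return D

# Toy data (hypothetical): three 2-D points (0,0), (3,4), (0,1) stored as columns.
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])
D_gram = sq_dist_gram_loops(X)
```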
Method 4: avoid using for loops
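
A loop-free variant might use NumPy broadcasting on the diagonal of the Gram matrix, as sketched below. The name `sq_dist_vectorized` is hypothetical, and the final clipping step is an added safeguard (an assumption, not part of the method as listed) against tiny negative values caused by floating-point rounding.

```python
import numpy as np

def sq_dist_vectorized(X):
    """Squared Euclidean distance matrix from the Gram matrix, without loops.

    X is an (m, n) array whose columns are the n data points.
    """
    G = X.T @ X        # (n, n) Gram matrix
    g = np.diag(G)     # squared norms ||x_i||^2, shape (n,)
    # Broadcasting: D[i, j] = g[i] + g[j] - 2 * G[i, j]
    D = g[:, None] + g[None, :] - 2.0 * G
    # Assumption: clip rounding noise so no entry is slightly below zero.
    return np.maximum(D, 0.0)

# Toy data (hypothetical): three 2-D points (0,0), (3,4), (0,1) stored as columns.
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])
D_vec = sq_dist_vectorized(X)
```

For large n this form is typically much faster than the looped versions, because all the arithmetic runs inside NumPy rather than in the Python interpreter.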
Method 5: using SciPy
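
A possible SciPy version uses `scipy.spatial.distance.cdist` with the `sqeuclidean` metric. The wrapper name `sq_dist_scipy` is illustrative. Note that `cdist` expects one observation per row, so the column-oriented X from above is transposed first.

```python
import numpy as np
from scipy.spatial.distance import cdist

def sq_dist_scipy(X):
    """Squared Euclidean distance matrix using SciPy's cdist.

    X is an (m, n) array whose columns are the n data points.
    """
    P = X.T  # cdist wants shape (n_points, n_features)
    return cdist(P, P, metric="sqeuclidean")

# Toy data (hypothetical): three 2-D points (0,0), (3,4), (0,1) stored as columns.
X = np.array([[0.0, 3.0, 0.0],
              [0.0, 4.0, 1.0]])
D_scipy = sq_dist_scipy(X)
```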

I hope this summary is helpful. If you have any questions, please leave a comment. If you liked it, your applause would be appreciated.

Thanks for reading…
