TL;DR - By making a few geometric assumptions, the Haversine formula provides an exceptionally simple way of calculating the distance between two latitude/longitude pairs. This tutorial will show you how to implement your own version in Python.
A lot of my posts recently have focused on the analysis of spatial data coming from either the GPS on my phone (collected through Strava) or geo-tagged tweets. Because of this I ended up writing my own Python module for calculating the distance between two latitude/longitude pairs. This is accomplished using the Haversine formula. While more accurate methods exist, the Haversine formula and Python implementation couldn’t be any simpler. Below is a breakdown of the Haversine formula.
Haversine:
With:
and:
Where:
Φ = latitude
λ = longitude
R = 6371 (Earth's mean radius in km)
Much of Haverines’ simplicity comes from the underlying assumption that Earth is a perfect sphere (which it isn’t…). Because of this, it can lead to errors of up to 0.5%. More accurate methods exist but at the expense of computational complexity. Below is the Python implementation:
import math
def haversine(coord1, coord2, units='m'):
lon1, lat1 = coord1
lon2, lat2 = coord2
R = 6_371_000 # radius of Earth in meters
phi_1 = math.radians(lat1)
phi_2 = math.radians(lat2)
delta_phi=math.radians(lat2-lat1)
delta_lambda=math.radians(lon2-lon1)
a = (
math.sin(delta_phi / 2.0)**2 +
math.cos(phi_1) * math.cos(phi_2) *
math.sin(delta_lambda / 2.0)**2
)
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
match units:
case "m":
# output distance in meters
return R * c
case "km":
# output distance in kilometers
return (R * c) / 1000.0
case "mi":
# output distance in miles
return R * c * 0.000621371
case "ft":
# output distance in feet
return R * c * 3.28083888
case _:
print('unsupported units')
Example usage:
# define the points
p1 = [-84.412977, 39.152501]
p2 = [-84.412946, 39.152505]
Which will output the following:
# meters (default)
haversine(p1, p2)
>>> 2.7098232942902385
# kilometers
haversine(p1, p2, units='km')
>>> 0.0027098232942902385
# miles
haversine(p1, p2, units="mi")
>>> 0.00168380561019642
# feet
haversine(p1, p2, units="ft")
>>> 8.890493621837097
# error case
haversine(p1, p2, units="xx")
>>> unsupported units
A packaged version of this is available from my GitHub: here.
When should you use Haversine over a simpler method like Euclidean? Well if we simplify this down to the two dimensional case, the great-circle distance on a sphere becomes the arc length on a circle and the Euclidean distance becomes the chord length. If we recall from highschool geometry class, the arc length s of a circle is given by:
Where θ (in radians) is multiplied by the radius. The chord length c is given by:
And if we define the relative error as the following:
If you pause here and do the derivation, you’ll see that the radius r cancels out and the relative error becomes a function of just θ. This makes sense because the relative error between chord length and arc length will be the same whether your radius is measured in microns or light-years. If we plot this out, it looks like the following:
When we substitute the radius of the Earth into these same calculations, we can determine the relative error across a range of arc lengths as measured in miles.
At 100 miles, the Euclidean distance will be off by 14ft. At 10 miles, the error is less than a 1/4 of an inch. But again, only on the surface of a sphere (which the Earth is not).