Calculating the Distance Between Two GPS Coordinates with Python (Haversine Formula)

TL;DR - By making a few geometric assumptions, the Haversine formula provides an exceptionally simple way of calculating the distance between two latitude/longitude pairs. This tutorial will show you how to implement your own version in Python.

A lot of my posts recently have focused on the analysis of spatial data coming from either the GPS on my phone (collected through Strava) or geo-tagged tweets. Because of this I ended up writing my own Python module for calculating the distance between two latitude/longitude pairs. This is accomplished using the Haversine formula. While more accurate methods exist, the Haversine formula and Python implementation couldn’t be any simpler. Below is a breakdown of the Haversine formula.

Haversine:

With:

and:

Where:

Φ = latitude
λ = longitude
R = 6371 (Earth's mean radius in km)

Much of Haverines’ simplicity comes from the underlying assumption that Earth is a perfect sphere (which it isn’t…). Because of this, it can lead to errors of up to 0.5%. More accurate methods exist but at the expense of computational complexity. Below is the Python implementation:

import math


def haversine(coord1, coord2, units='m'):
    lon1, lat1 = coord1
    lon2, lat2 = coord2

    R = 6_371_000   # radius of Earth in meters
    phi_1 = math.radians(lat1)
    phi_2 = math.radians(lat2)

    delta_phi=math.radians(lat2-lat1)
    delta_lambda=math.radians(lon2-lon1)

    a = (
        math.sin(delta_phi / 2.0)**2 +
        math.cos(phi_1) * math.cos(phi_2) *
        math.sin(delta_lambda / 2.0)**2
    )

    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))

    match units:
        case "m":
            # output distance in meters
            return R * c
        case "km":
            # output distance in kilometers
            return (R * c) / 1000.0
        case "mi":
            # output distance in miles
            return R * c * 0.000621371
        case "ft":
            # output distance in feet
            return R * c * 3.28083888
        case _:
            print('unsupported units')

Example usage:

# define the points
p1 = [-84.412977, 39.152501]
p2 = [-84.412946, 39.152505]

Which will output the following:

# meters (default)
haversine(p1, p2)
>>> 2.7098232942902385

# kilometers
haversine(p1, p2, units='km')
>>> 0.0027098232942902385

# miles
haversine(p1, p2, units="mi")
>>> 0.00168380561019642

# feet
haversine(p1, p2, units="ft")
>>> 8.890493621837097

# error case
haversine(p1, p2, units="xx")
>>> unsupported units

A packaged version of this is available from my GitHub: here.

Haversine vs. Euclidean

When should you use Haversine over a simpler method like Euclidean? Well if we simplify this down to the two dimensional case, the great-circle distance on a sphere becomes the arc length on a circle and the Euclidean distance becomes the chord length. If we recall from highschool geometry class, the arc length s of a circle is given by:

Where θ (in radians) is multiplied by the radius. The chord length c is given by:

And if we define the relative error as the following:

If you pause here and do the derivation, you’ll see that the radius r cancels out and the relative error becomes a function of just θ. This makes sense because the relative error between chord length and arc length will be the same whether your radius is measured in microns or light-years. If we plot this out, it looks like the following:

When we substitute the radius of the Earth into these same calculations, we can determine the relative error across a range of arc lengths as measured in miles.

At 100 miles, the Euclidean distance will be off by 14ft. At 10 miles, the error is less than a 1/4 of an inch. But again, only on the surface of a sphere (which the Earth is not).