Sunday, June 2, 2019

Building a lightweight API (ASP.NET Core 2.0) to find the geo region that contains a given coordinate

Context: Given a set of geographic areas I needed an API that would, for a supplied coordinate, return the area that contained it.

So, it needed to:

- load a set of features from a Shapefile
- provide a Web API that receives as input a latitude and longitude
- return the feature that contains that point



Conceptually this is pretty standard, and I expected to find multiple C# implementations readily available on the web. Interestingly enough, that's not the case (or I probably didn't search properly).

So I've built a simple service to address the requirements stated above. It's available at: https://github.com/pmcxs/region-locator. Its README explains how to set it up and use it.

How does it work? Breakdown below:

Load a set of features from a Shapefile

To load the Shapefile I've used NetTopologySuite, particularly the NetTopologySuite.IO.Shapefile package.

I've been using NetTopologySuite for a long time and it works really well. It has also been fully ported to .NET Standard, so it's usable on Mac/Linux too. The following code reads all features, including both the metadata properties and the corresponding geometries.
FeatureCollection features = new FeatureCollection();
var geometryFactory = new GeometryFactory();
using (var shapeFileDataReader = new ShapefileDataReader(shpFilename, geometryFactory))
{
    DbaseFileHeader header = shapeFileDataReader.DbaseHeader;

    while (shapeFileDataReader.Read())
    {
        Feature feature = new Feature();
        AttributesTable attributesTable = new AttributesTable();

        IGeometry geometry = (Geometry)shapeFileDataReader.Geometry;

        // Copy every DBF attribute into the feature's attribute table.
        for (int i = 0; i < header.NumFields; i++)
        {
            DbaseFieldDescriptor fldDescriptor = header.Fields[i];
            // The reader exposes the geometry as value 0, so DBF values start at index 1.
            attributesTable.Add(fldDescriptor.Name, shapeFileDataReader.GetValue(i + 1));
        }

        feature.Geometry = geometry;
        // Cache the bounding box; it is used as a cheap pre-check at query time.
        feature.BoundingBox = geometry.EnvelopeInternal;
        feature.Attributes = attributesTable;
        features.Add(feature);
    }
}

Provide a Web API that receives as input a latitude and longitude

I've used ASP.NET Core 2 for this. I've created a "RegionsController" with the following method:
[HttpGet("byCoordinates")]
public ActionResult Get(string longitude, string latitude)
{
    // Parse the parameters and look up the matching region here.
}
This defines an endpoint that is called as:
http://<host>/api/regions/byCoordinates?longitude=xxx&latitude=yyy

The code that does the actual filtering goes inside this method.
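
Since both parameters arrive as strings, they have to be turned into numbers before any lookup. The following is a minimal sketch of how that could be done, not the code from the repo: the CoordinateParser class and its TryParse method are hypothetical names I'm using for illustration. It parses with the invariant culture (so "12.5" means the same thing regardless of the server's locale) and rejects values outside the valid longitude/latitude ranges.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the repo): validates the raw query-string values.
public static class CoordinateParser
{
    // Returns true only when both values are numeric and within valid ranges.
    // x receives the longitude and y the latitude.
    public static bool TryParse(string longitude, string latitude,
                                out double x, out double y)
    {
        x = 0;
        y = 0;
        const NumberStyles style = NumberStyles.Float;

        // Invariant culture: always treat '.' as the decimal separator.
        bool parsed =
            double.TryParse(longitude, style, CultureInfo.InvariantCulture, out x) &&
            double.TryParse(latitude, style, CultureInfo.InvariantCulture, out y);
        if (!parsed)
        {
            return false;
        }

        // Longitude must be in [-180, 180] and latitude in [-90, 90].
        // NaN fails every comparison, so it is rejected here as well.
        return x >= -180 && x <= 180 && y >= -90 && y <= 90;
    }
}
```

Inside the controller method, a failed parse would typically be answered with BadRequest() rather than running the lookup with garbage input.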

Return the feature that contains that point

I've used a semi-brute-force approach:
- Iterate over all polygons
- Because an exact containment test on a complex polygon is expensive, first check the polygon's bounding box and only run the exact test when the box contains the point.

The code for this is very simple:
for (var i = 0; i < features.Count; i++)
{
    // Cheap bounding-box test first; the exact (expensive) test runs only if it passes.
    if (_features[i].BoundingBox.Contains(coordinate)
        && _features[i].Geometry.Contains(new Point(coordinate)))
    {
        matchFeature = _features[i];
        break;
    }
}
I'm assuming that the regions are disjoint, hence breaking after finding the first match.
The exact match is obtained using the geometry's "Contains" method.
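
To make the two-step check concrete without pulling in any library, here is a self-contained sketch of the same idea. It is an illustration under simplifying assumptions, not what NetTopologySuite does internally: SimpleRegion and RegionLocator are hypothetical names, each region is a single ring without holes, and the exact test is a plain ray-casting check (points exactly on an edge may go either way).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified region: one outer ring of [x, y] vertices, no holes.
public sealed class SimpleRegion
{
    private readonly double[][] _ring;
    private readonly double _minX, _minY, _maxX, _maxY;

    public string Name { get; }

    public SimpleRegion(string name, double[][] ring)
    {
        Name = name;
        _ring = ring;
        // Pre-compute the bounding box once, at load time.
        _minX = double.MaxValue; _minY = double.MaxValue;
        _maxX = double.MinValue; _maxY = double.MinValue;
        foreach (var p in ring)
        {
            _minX = Math.Min(_minX, p[0]); _maxX = Math.Max(_maxX, p[0]);
            _minY = Math.Min(_minY, p[1]); _maxY = Math.Max(_maxY, p[1]);
        }
    }

    public bool Contains(double x, double y)
    {
        // Step 1: cheap bounding-box rejection.
        if (x < _minX || x > _maxX || y < _minY || y > _maxY)
        {
            return false;
        }

        // Step 2: ray casting. Count how many edges a horizontal ray
        // starting at (x, y) crosses; an odd count means "inside".
        bool inside = false;
        for (int i = 0, j = _ring.Length - 1; i < _ring.Length; j = i++)
        {
            double xi = _ring[i][0], yi = _ring[i][1];
            double xj = _ring[j][0], yj = _ring[j][1];
            bool crosses = (yi > y) != (yj > y)
                && x < (xj - xi) * (y - yi) / (yj - yi) + xi;
            if (crosses)
            {
                inside = !inside;
            }
        }
        return inside;
    }
}

public static class RegionLocator
{
    // Returns the first region containing the point, or null if there is none.
    // Stopping at the first match assumes the regions are disjoint.
    public static SimpleRegion Find(IReadOnlyList<SimpleRegion> regions, double x, double y)
    {
        foreach (var region in regions)
        {
            if (region.Contains(x, y))
            {
                return region;
            }
        }
        return null;
    }
}
```

For most points, almost every region is discarded by the four comparisons in step 1, so the per-vertex loop only runs for the one or two regions whose boxes actually contain the point.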

Obviously there are some additional details, but you can check them in the repo itself: https://github.com/pmcxs/region-locator


Closing remarks

The performance is satisfactory. On my Mac each reverse-geocoding operation takes around 1-2 ms with the ne_10m_admin_0_countries dataset from www.naturalearthdata.com.

For larger files I'm planning to support some fancier stuff like Quadtrees, which will reduce the number of polygons that need to be checked.
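
To give an idea of what that could look like, here is a rough, self-contained sketch of a quadtree over bounding boxes. It is not implemented in the repo; Box and QuadTree are hypothetical names, and the fixed maximum depth of 8 is an arbitrary assumption. Each box is stored in the deepest quadrant that fully covers it, and a query only walks down the quadrants that contain the point, returning candidates that would still need the exact "Contains" test.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical axis-aligned bounding box.
public struct Box
{
    public readonly double MinX, MinY, MaxX, MaxY;

    public Box(double minX, double minY, double maxX, double maxY)
    {
        MinX = minX; MinY = minY; MaxX = maxX; MaxY = maxY;
    }

    public bool Contains(double x, double y) =>
        x >= MinX && x <= MaxX && y >= MinY && y <= MaxY;

    public bool Covers(Box o) =>
        o.MinX >= MinX && o.MaxX <= MaxX && o.MinY >= MinY && o.MaxY <= MaxY;
}

// Hypothetical quadtree: items live in the deepest node that fully covers them.
public sealed class QuadTree<T>
{
    private const int MaxDepth = 8; // arbitrary limit for this sketch
    private readonly Box _bounds;
    private readonly int _depth;
    private readonly List<KeyValuePair<Box, T>> _items = new List<KeyValuePair<Box, T>>();
    private QuadTree<T>[] _children;

    public QuadTree(Box bounds, int depth = 0)
    {
        _bounds = bounds;
        _depth = depth;
    }

    public void Insert(Box box, T item)
    {
        if (_depth < MaxDepth)
        {
            if (_children == null) Split();
            foreach (var child in _children)
            {
                // Push the item down while a single quadrant fully covers it.
                if (child._bounds.Covers(box)) { child.Insert(box, item); return; }
            }
        }
        // No quadrant covers the box (or max depth reached): keep it here.
        _items.Add(new KeyValuePair<Box, T>(box, item));
    }

    // Returns every item whose box contains the point (candidates only).
    public List<T> Query(double x, double y)
    {
        var result = new List<T>();
        Collect(x, y, result);
        return result;
    }

    private void Collect(double x, double y, List<T> result)
    {
        foreach (var entry in _items)
        {
            if (entry.Key.Contains(x, y)) result.Add(entry.Value);
        }
        if (_children == null) return;
        foreach (var child in _children)
        {
            // Only descend into quadrants that can contain the point.
            if (child._bounds.Contains(x, y)) child.Collect(x, y, result);
        }
    }

    private void Split()
    {
        double midX = (_bounds.MinX + _bounds.MaxX) / 2;
        double midY = (_bounds.MinY + _bounds.MaxY) / 2;
        _children = new[]
        {
            new QuadTree<T>(new Box(_bounds.MinX, _bounds.MinY, midX, midY), _depth + 1),
            new QuadTree<T>(new Box(midX, _bounds.MinY, _bounds.MaxX, midY), _depth + 1),
            new QuadTree<T>(new Box(_bounds.MinX, midY, midX, _bounds.MaxY), _depth + 1),
            new QuadTree<T>(new Box(midX, midY, _bounds.MaxX, _bounds.MaxY), _depth + 1),
        };
    }
}
```

In practice I'd probably not hand-roll this: NetTopologySuite ships its own spatial index classes under the NetTopologySuite.Index namespaces, which would be the more natural fit for this service.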

Also, I'm pretty sure I could use something other than the "Contains" method, or perhaps break up the polygons up front to make the computation simpler.

The README contains all the information required to run this, but it's literally: clone, dotnet restore and dotnet run. Everything is set up to work out of the box.
