Saturday, April 28, 2012

Parsing UTFGrid data to polygons

I've talked previously about UTFGrid. It provides, as described by Development Seed, "a standard, scalable way of encoding data for hundreds or thousands of features alongside your map tiles".

So, to sum it up, if we have this image tile for the hexagons of one of my previous posts.


We would have a corresponding json file with this UTFGrid
         !!!!!!!!!!!!!!!!!###################$$$$$$$$$$$$$$%%
            !!!!!!!!!!!!!!!!####################$$$$$$$$$$$$$$%%
             !!!!!!!!!!!!!!######################$$$$$$$$$$$$%%%
             !!!!!!!!!!!!!!######################$$$$$$$$$$$$%%%
             &&&&&&&&&&&&&&######################''''''''''''%%%
            &&&&&&&&&&&&&&&&####################''''''''''''''%%
            &&&&&&&&&&&&&&&&&###################''''''''''''''%%
           &&&&&&&&&&&&&&&&&&##################''''''''''''''''%
           &&&&&&&&&&&&&&&&&&&#################'''''''''''''''''
          &&&&&&&&&&&&&&&&&&&&################''''''''''''''''''
         &&&&&&&&&&&&&&&&&&&&&&##############'''''''''''''''''''
         &&&&&&&&&&&&&&&&&&&&&&##############'''''''''''''''''''
        &&&&&&&&&&&&&&&&&&&&&&&&############''''''''''''''''''''
        &&&&&&&&&&&&&&&&&&&&&&&&############''''''''''''''''''''
        &&&&&&&&&&&&&&&&&&&&&&&&((((((((((((''''''''''''''''''''
         &&&&&&&&&&&&&&&&&&&&&&(((((((((((((('''''''''''''''''''
          &&&&&&&&&&&&&&&&&&&&&(((((((((((((((''''''''''''''''''
          &&&&&&&&&&&&&&&&&&&&((((((((((((((((''''''''''''''''''
           &&&&&&&&&&&&&&&&&&(((((((((((((((((('''''''''''''''')
           &&&&&&&&&&&&&&&&&&(((((((((((((((((('''''''''''''''')
            &&&&&&&&&&&&&&&&((((((((((((((((((((''''''''''''''))
             &&&&&&&&&&&&&&&((((((((((((((((((((('''''''''''''))
             &&&&&&&&&&&&&&(((((((((((((((((((((('''''''''''')))
             **&&&&&&&&&&&&(((((((((((((((((((((('''''''''''+)))
             ***************(((((((((((((((((((((+++++++++++++))
            ****************((((((((((((((((((((++++++++++++++))
           ******************((((((((((((((((((++++++++++++++++)
           ******************((((((((((((((((((++++++++++++++++)
          ********************((((((((((((((((++++++++++++++++++
          *********************(((((((((((((((++++++++++++++++++
         **********************((((((((((((((+++++++++++++++++++
         ***********************(((((((((((((+++++++++++++++++++
        ************************((((((((((((++++++++++++++++++++
        ************************(((((((((((,++++++++++++++++++++
         ***********************,,,,,,,,,,,,,+++++++++++++++++++
         **********************,,,,,,,,,,,,,,+++++++++++++++++++
          ********************,,,,,,,,,,,,,,,,++++++++++++++++++
          ********************,,,,,,,,,,,,,,,,++++++++++++++++++
           ******************,,,,,,,,,,,,,,,,,,++++++++++++++++-
            *****************,,,,,,,,,,,,,,,,,,,++++++++++++++--
            ****************,,,,,,,,,,,,,,,,,,,,++++++++++++++--
             **************,,,,,,,,,,,,,,,,,,,,,,++++++++++++---
             **************,,,,,,,,,,,,,,,,,,,,,,++++++++++++---
             ..............,,,,,,,,,,,,,,,,,,,,,,////////////---
            ................,,,,,,,,,,,,,,,,,,,,//////////////--
            .................,,,,,,,,,,,,,,,,,,,//////////////--
           ..................,,,,,,,,,,,,,,,,,,////////////////-
           ...................,,,,,,,,,,,,,,,,,/////////////////
          ....................,,,,,,,,,,,,,,,,//////////////////
         ......................,,,,,,,,,,,,,,///////////////////
         ......................,,,,,,,,,,,,,,///////////////////
        ........................,,,,,,,,,,,,////////////////////
        ........................,,,,,,,,,,,,////////////////////
        ........................000000000000////////////////////
         ......................00000000000000///////////////////
          .....................000000000000000//////////////////
          ....................0000000000000000//////////////////
           ...................00000000000000000////////////////1
           ..................000000000000000000////////////////1
            ................00000000000000000000//////////////11
            ................00000000000000000000//////////////11
             ..............0000000000000000000000////////////111
              .............0000000000000000000000//////////22111
                          00000000000000000000000222222222222111

Each symbol matches a certain feature. Each symbol representation maps a 4x4 pixels rectangle, thus, we have a 64x64 UTFGrid mapping a 256x256 tile image. A little bit of precision is lost, but the performance gain makes up for it.

Now, this data is then used to provide an interactive experience on otherwise raster data, allowing click and hover interaction. Check, for example, this page.

This approach however, has its limitations. We don't really have the vector data in client-side, as this is just a "trick" to provide some interaction features. The MapBox folks created a project called Glower which provides an additional feature: painting the data in client-side using HTML5. You can check it in action here.

Anyway, I don't really like the way it's done. It basically checks the character that is being hovered, paints itself and all the similar neighbors with small 4x4 rectangles. It doesn't work well in all browsers and is kind-of limiting.



What I would really like is to parse the UTFGrid and produce vector data, namely polygons. Then, with the polygons, one could do much more interesting stuff, like using the Map API to draw the data (without resorting to HTML5). To be honest I'm not particular sure on what will come out of this but, in a worst case scenario, I had some fun trying to come up with this algorithm.

Anyway, I want to pick something like this:
           
and somehow extract the following polygons:                    

Each polygon is composed of a bunch of vertex in a certain order. At first hand this may seem a trivial task, but it certainly wasn't for me. Not the coding part, but creating the algorithm itself. I tried to find something on the web that could be useful, but the only thing that really mapped itself to my problem was the "Convert raster to vector" algorithms, and using them would be like "killing a fly with a sledgehammer".

After some thought (and failed attempts) I devised an algorithm that I've baptized as "Domino Sweep" (fancy name, heh?).

I expect good performance because I create all the polygons in a single matrix iteration after visiting each row and column, regardless on the number of polygons displayed.

While traversing the matrix elements, I check the border edges produced on that particular element. For example, in line 1 column 3, the "A" element has a "border" to the north, east and south. I process the edges in a clockwise order, trying to find a polygon on which I may add the edge. I call this "Domino Sweep" because I only look at the first and last point of each polygon to see if they match any of the vertices of the current edge.



So, this is it:

For each element (row/column):
  • Check border edges (matrix limits or a border with a different symbol element)
  • For each border edge see if they may be added to an existing polygon for that particular feature
  • If there's no match, create a new polygon with that edge
  • If there's a match, add the edge to that polygon, connecting the similar vertex (like domino)
  • If there's a additional matching polygon, it means that we can merge both
I've implemented it in C#. The following code is not yet polished, but the algorithm works really well.

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

namespace UTFGridParser
{

    [Flags]
    enum EdgeType
    {
        None = 0,
        Top = 1,
        Right = 2,
        Bottom = 3,
        Left = 4
    }

    class Point
    {
        public Point(int y, int x)
        {
            X = x;
            Y = y;
        }
        public int X { get; set; }
        public int Y { get; set; }

        public override string ToString()
        {
            return string.Format("({0},{1}", Y, X);
        }

        public override bool Equals(object obj)
        {
            var p = obj as Point;

            if(p == null)
            {
                return false;
            }

            return p.X == X && p.Y == Y;
        }

        public Polyline Polyline { get; set; }
    }


    class Polyline
    {
        public Polyline()
        {
            Points = new List<Point>();
        }

        public List<Point> Points { get; set; }

        public Point StartPoint
        {
            get
            {
                return Points.Count == 0 ? null : Points[0];
            }
        }

        public Point EndPoint
        {
            get
            {
                return Points.Count == 0 ? null : Points[Points.Count - 1];
            }
        }
    }

    class Edge
    {
        public Edge(int row, int column, EdgeType edgeType)
        {
            switch (edgeType)
            {
                case EdgeType.Top:
                    PointA = new Point(row, column);
                    PointB = new Point(row, column + 1);
                    break;
                case EdgeType.Right:
                    PointA = new Point(row, column + 1);
                    PointB = new Point(row + 1, column + 1);
                    break;
                case EdgeType.Bottom:
                    PointA = new Point(row + 1, column);
                    PointB = new Point(row + 1, column + 1);
                    break;
                case EdgeType.Left:
                    PointA = new Point(row, column);
                    PointB = new Point(row + 1, column);
                    break;
            }            
        }

        /// <summary>
        /// First point of the edge
        /// </summary>
        public Point PointA { get; set; }

        /// <summary>
        /// Second point of the edge
        /// </summary>
        public Point PointB { get; set; }

        public override string ToString()
        {
            return string.Format("{0} - {1}", PointA, PointB);
        }

    }

    class Program
    {
        static void Main()
        {

            var polygons = new Dictionary<char, List<Polyline>>();

            string[] utfGrid = {
                                    "AAADDDD",
                                    "AADDDDD",
                                    "AAACCCC",
                                    "EEFFFGG",
                                    "EEFFFFF",
                                    "HHHHHHH"
                                };

            for(int i =0; i < utfGrid.Length; i++)
            {
                for (int j = 0; j < utfGrid[i].Length; j++)
                {
                    char currentKey = utfGrid[i][j];

                    if(currentKey == ' ')
                    {
                        continue;
                    }


                    //top edge
                    if (i == 0 || utfGrid[i - 1][j] != currentKey)
                    {
                        var newEdge = new Edge(i, j, EdgeType.Top);
                        AddEdge(newEdge, currentKey, polygons);
                    }

                    //right edge
                    if (j == utfGrid[i].Length - 1 || 
                        utfGrid[i][j + 1] != currentKey)
                    {
                        var newEdge = new Edge(i, j, EdgeType.Right);
                        AddEdge(newEdge, currentKey, polygons);
                    }

                    //bottom edge
                    if (i == utfGrid.Length - 1 || 
                        utfGrid[i + 1][j] != currentKey)
                    {
                        var newEdge = new Edge(i, j, EdgeType.Bottom);
                        AddEdge(newEdge, currentKey, polygons);
                    }

                    //left edge
                    if (j == 0 || utfGrid[i][j - 1] != currentKey)
                    {
                        var newEdge = new Edge(i, j, EdgeType.Left);
                        AddEdge(newEdge, currentKey, polygons);
                    }

                }
            }

        }

        private static void AddEdge(Edge newEdge, 
                                    char currentKey,  
                                    Dictionary<char, List<Polyline>> polygons)
        {
            if (!polygons.ContainsKey(currentKey))
            {
                polygons.Add(currentKey, new List<Polyline>());
            }

            var currentPolygons = polygons[currentKey];

            //Find a polygon which connects with the current edge

            Polyline matchingPolyline = null;

            foreach (var polyline in currentPolygons)
            {

                if (matchingPolyline != null)
                {
                    if (matchingPolyline.EndPoint.Equals(polyline.StartPoint))
                    {
                        matchingPolyline.Points.AddRange(polyline.Points.Skip(1));
                        currentPolygons.Remove(polyline);
                        break;
                    }

                    if (matchingPolyline.EndPoint.Equals(polyline.EndPoint))
                    {
                        var points = new List<Point>(polyline.Points);
                        points.Reverse();
                        points.RemoveAt(0);

                        matchingPolyline.Points.AddRange(points);
                        currentPolygons.Remove(polyline);
                        break;
                    }

                    if (matchingPolyline.StartPoint.Equals(polyline.EndPoint))
                    {
                        matchingPolyline.Points.InsertRange(0, 
                            polyline.Points.Take(polyline.Points.Count - 1));
                        currentPolygons.Remove(polyline);
                        break;
                    }

                    if (matchingPolyline.StartPoint.Equals(polyline.StartPoint))
                    {

                        var points = new List<Point>(polyline.Points);
                        points.RemoveAt(0);
                        points.Reverse();

                        matchingPolyline.Points.InsertRange(0, points);
                        currentPolygons.Remove(polyline);
                        break;
                    }

                }


                //check all points
                if (polyline.StartPoint.Equals(newEdge.PointA))
                {
                    polyline.Points.Insert(0, newEdge.PointB);
                    matchingPolyline = polyline;
                    continue;
                }

                if (polyline.StartPoint.Equals(newEdge.PointB))
                {
                    polyline.Points.Insert(0, newEdge.PointA);
                    matchingPolyline = polyline;
                    continue;
                }

                if (polyline.EndPoint.Equals(newEdge.PointA))
                {
                    polyline.Points.Add(newEdge.PointB);
                    matchingPolyline = polyline;
                    continue;
                }

                if (polyline.EndPoint.Equals(newEdge.PointB))
                {
                    polyline.Points.Add(newEdge.PointA);
                    matchingPolyline = polyline;
                    continue;
                }
            }

            if (matchingPolyline == null)
            {
                var newPolyline = new Polyline();
                newPolyline.Points.Insert(0,newEdge.PointA);
                newPolyline.Points.Insert(0,newEdge.PointB);

                currentPolygons.Add(newPolyline);
            }
        }
    }
}
I'll eventually write the code in Javascript and then, let the fun begin :P

2 comments:

  1. Hi there,

    You might be interested in the flood fill algorithm: http://en.wikipedia.org/wiki/Flood_fill

    (one of the authors of utfgrid/wax/tilemill/mapbox stuff): Glower is clearly a suboptimal solution (a two-day project), but algorithmically, I was running off of the fact that Canvas is rather slow for non-sharp edges and very fast for sharp edges; also that any kind of polygon smoothing will have to be aware of full topology & smart about boundaries. So, glower is the 20% of the solution; the 80% is not just building vectors, but lots more.

    ReplyDelete
  2. Hi Tom,

    Thanks for the feedback. This was just a fun experiment, and I know that converting the UTFGrid data to vectors wouldn't be the whole deal. My idea was to, having the vector part in client-side, be able to display something in the likes of GIS Cloud, or using the map providers API to draw the on-mouse-hover polygons, something like that. I didn't really think it through too much :D

    Anyway, congratulation on all the great products that you guys develop. Development Seed is certainly a reference for me.

    ReplyDelete