Wednesday, June 12, 2019

Drawing Lines in .NET Core: Comparing ImageSharp and System.Drawing.Common

This is a follow-up to an early experiment I did almost 3 years ago: https://build-failed.blogspot.com/2016/08/creating-simple-tileserver-with-net.html

At the time, I was trying to set up a simple Tile Server in C# that ran on .NET Core on Windows, Mac and Linux, generating images on-the-fly. As I couldn't find any C# lib that supported drawing lines and such, I built that part myself (see the above blog post for details) on top of a lib called "ImageProcessor", which allowed me to set individual pixels.

Now, as I'm doing another project that also requires drawing stuff, I decided to see how much the landscape has evolved over the last couple of years.

Spoiler alert: it has changed significantly. As a summary:
  • ImageProcessor is now marked as "in soft archive mode", with the focus shifting to another library called ImageSharp, from the same author.
  • ImageSharp now supports drawing functionality and actually seems very feature-rich in that regard. Also, it's fully managed, not relying on native OS calls. If any of you worked with GDI+ in the past, you'll probably remember its memory leaks and thread-"unsafety", particularly when trying to use it on the web.
  • Meanwhile, Microsoft released a NuGet package called "System.Drawing.Common", which is part of the Windows Compatibility Pack, aiming to help developers migrate their existing .NET code to .NET Core.
  • As opposed to ImageSharp, System.Drawing.Common acts as a bridge to OS-specific logic. On Windows it relies on GDI+; on Linux it requires installing libgdiplus (from Mono).
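As an aside, on Debian/Ubuntu-based distros that native dependency can typically be installed like this (the package name below is for Debian-based systems; it may differ on other distros):

```shell
# libgdiplus is Mono's GDI+ implementation, required by
# System.Drawing.Common on Linux (Debian/Ubuntu package name)
sudo apt-get update
sudo apt-get install -y libgdiplus
```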
I'll quote Scott Hanselman on this (https://www.hanselman.com/blog/HowDoYouUseSystemDrawingInNETCore.aspx):
There's lots of great options for image processing on .NET Core now! It's important to understand that this System.Drawing layer is great for existing System.Drawing code, but you probably shouldn't write NEW image management code with it. Instead, consider one of the great other open source options.
With that said, I'm very interested in understanding how both libs compare, so I've set up a very, very simple test. I'm not going to cover the functional differences between the two libs; I'm going to focus on the use-case that's most relevant to me: drawing lines.

All the relevant code is available in the Git repo: https://github.com/pmcxs/core-linedrawing-benchmark/

The following options can be parametrized: the number of lines being generated, the image size and the line width.

Ok, let's start:

Basic Functionality

Starting with a simple test, just to compare (visually) how both results fare:

Number of Lines: 10
Image Size: 800 px
Line Width: 10 px
(default zoom)
First the good news: both work :)

I did find a small difference in the output though. Although the images are very similar, both libraries handle corners differently (at least in their default behavior).

System.Drawing creates a strange protrusion on sharp edges. Zooming in on the previous image:
(zoomed in)
That seems really strange. Increasing the line thickness makes the effect look even worse.

Number of Lines: 10
Image Size: 800 px
Line Width: 50 px

(default zoom)
Yeah, even at the default zoom it looks awful with System.Drawing. Basically all corners converge to a single point, so the thicker the line, the worse the effect gets.
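If I had to guess, those spikes look like a miter line join artifact (GDI+'s default join is a miter). The classic miter-length formula explains why sharp corners explode: the tip extends by width / sin(θ/2), which blows up as the interior angle θ shrinks. A quick sketch (illustrative math only, not code from either library):

```python
import math

def miter_length(width, angle_deg):
    """Length of the miter tip for a stroke of the given width
    whose two segments meet at the given interior angle."""
    theta = math.radians(angle_deg)
    return width / math.sin(theta / 2)

# The sharper the corner, the longer the spike for a 50px-wide line
for angle in (90, 45, 10):
    print(angle, round(miter_length(50, angle), 1))
```

This matches the observation above: widening the line or sharpening the angle both make the protrusion dramatically longer, which is why renderers usually cap it (a "miter limit") or fall back to bevel/round joins.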

Performance

The code I've linked above already outputs the duration of both approaches. Unfortunately (although probably expected), System.Drawing is still faster than ImageSharp. Versions being tested:
  • SixLabors.ImageSharp.Drawing: 1.0.0-dev002737
  • System.Drawing.Common: 4.6.0-preview5.19224.8
Before starting, I actually found this page: https://docs.sixlabors.com/articles/ImageSharp/GettingStarted.html and it mentions a couple of potential issues that might cause performance problems:
A few troubleshooting steps to try:
  • Make sure your code runs on 64bit! Older .NET Framework versions are using the legacy runtime on 32 bits, having no built-in SIMD support.

I can confirm my test isn't affected by those problems: that flag returns true and I'm running in 64 bits.

I've varied the number of lines (keeping the line width to 10px and image size to 800x800) and the results are as follows:
line width: 10px
I was slightly surprised by the jump from 10 to 100 lines in ImageSharp, particularly as from 100 to 1000 lines its performance stayed almost the same.

I then did the same test, but increasing the line width from 10px to 100px:
line width: 100px
I don't understand how ImageSharp is faster when drawing 1000 lines here vs the previous test with just 100 lines:
  • 100 lines (10 pixel width): 1315 ms
  • 1000 lines (100 pixel width): 776 ms
ImageSharp's performance seems to get worse the thinner the lines are, which I can try to confirm by going to an extreme of 1px lines.

But let's make things even more interesting: I'll add my own custom implementation from 3 years ago to the mix. It didn't compile at first, but it was easy enough to update, including using the now-recommended Span-based approach to set pixels.

The results were interesting. My custom implementation is at its best the thinner the line is, hence it gets some really, really strong results. I'm tempted to say there's a bug or an unintended side-effect in the current ImageSharp implementation. I'll do some additional experiments before submitting an issue.
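For reference, the core of a pixel-setting approach like mine boils down to Bresenham's line algorithm for the 1px case. A minimal Python sketch of the idea (illustrative only, not the actual C# code from the repo):

```python
def bresenham(x0, y0, x1, y1):
    """Return the pixels of a 1px line from (x0, y0) to (x1, y1)
    using the integer-only Bresenham algorithm."""
    pixels = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step horizontally
            err += dy
            x0 += sx
        if e2 <= dx:  # step vertically
            err += dx
            y0 += sy
    return pixels
```

Since it only touches the pixels actually on the line, with integer arithmetic and no anti-aliasing or geometry building, it's no surprise this style of implementation flies on thin lines.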

I had to change the scale to logarithmic, as otherwise the results would be hidden by the strange behavior of ImageSharp, which takes over 2 minutes to render 1000 lines.
line width: 1px
For actual numbers, with 1000 lines of 1px (from slowest to fastest):
  • ImageSharp (Linux): 140000 ms
  • ImageSharp (Windows): 127000 ms
  • System.Drawing (Linux): 132 ms
  • Custom (Linux): 74 ms
  • System.Drawing (Windows): 64 ms
  • Custom (Windows): 62 ms
Increasing the width to 10px the results aren't as strong, but still quite good:
  • ImageSharp (Linux): 1530 ms
  • ImageSharp (Windows): 1311 ms
  • Custom (Linux): 146 ms
  • Custom (Windows): 127 ms
  • System.Drawing (Linux): 123 ms
  • System.Drawing (Windows): 122 ms
There's a catch though:
  • This is literally the only use case I've built. No polygons, beziers, text, etc
  • Also, my "corner handling" logic is really crappy
Regardless, there could be some potential there, so I might revisit this topic later on. I've included my custom logic in the repo mentioned above: https://github.com/pmcxs/core-linedrawing-benchmark/tree/master/src/ImageSharpCustomDrawing

Sunday, June 2, 2019

Building a lightweight api (ASP.NET Core 2.0) to find the geo region that contains a given Coordinate

Context: Given a set of geographic areas, I needed an API that would, for a supplied coordinate, return the area that contains it.

So, it needed to:

- load a set of features from a Shapefile
- provide a Web API that receives as input a latitude and longitude
- return the feature that contains that point



Conceptually this is pretty standard, and I expected to find multiple C# implementations readily available on the web. Interestingly enough, that's not the case (or I probably didn't search properly).
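The core geometric test behind such an API is point-in-polygon. A minimal ray-casting sketch of that test (illustrative pseudocode in Python, not the repo's actual implementation, and ignoring holes and edge cases):

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: does the polygon (list of (lat, lon) vertices)
    contain the point? Casts a horizontal ray eastward from the point
    and counts how many edges it crosses; odd means inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line at `lat`?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude where the edge crosses that line
            cross = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross:
                inside = not inside
    return inside
```

In practice you'd run this against every candidate feature loaded from the Shapefile (ideally pre-filtered with bounding boxes or a spatial index so you don't test every polygon for every request).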

I've decided to build a simple service to address the requirements stated above. It's very simple and available at: https://github.com/pmcxs/region-locator. Its README provides some info on how to set it up and use it.

How does it work? Breakdown below:

Saturday, May 25, 2019

Playing with Mapbox Vector Tiles (Part 3 - Using Mapbox GL)


In this post I'm going to pick up what I described in my previous two posts and create a demo of Mapbox Vector Tiles integrated with Mapbox GL.

I'll be more or less recreating what I did on this experiment: http://psousa.net/demos/maps-and-boardgames-part-2/demo3.html


That old experiment used:
  • TileMill to create the raster tiles
  • UTFGrid tiles to provide meta-information on the hexagons
  • Leaflet as the mapping technology
This new one uses:
  • Tippecanoe to generate the vector tiles (from GeoJSON sources)
  • Mapbox GL JS as the mapping technology
The tricky bit is making sure that the helicopters snap to the displayed hexagons, ideally leveraging the vector data that's present on the tiles.

With that said, I'm actually going to start with the end-result, followed by the break-down on how it's built.


I didn't try to replicate the experience at 100%, but it's close enough.

Live demo at: http://psousa.net/demos/vector/part3/

How this was done:

Thursday, April 18, 2019

Creating a large GeoTIFF with Forest Coverage for the whole World

For a pet project that I'm building, I was trying to find accurate forest coverage data for the whole World.

A raster file seemed more adequate and I wanted something like this, but with a much higher resolution (and ideally already georeferenced)


I found the perfect data-source from Global Forest Watch: https://www.globalforestwatch.org/

Global Forest Watch provides an incredible dataset called "Tree Cover (2000)": a raster with a 30x30 m resolution containing the density of tree canopy coverage.

It's too good to be true, right?

Well, in a sense yes. The main problem is that it's just too much data and you can't download the image as a whole.

Alternatively, they provide an interactive map where you can download each section separately, at: http://earthenginepartners.appspot.com/science-2013-global-forest/download_v1.6.html

This consists of 504 (36x14) images, already georeferenced. For example, if you download the highlighted square above you'll get the following picture:
https://storage.googleapis.com/earthenginepartners-hansen/GFC-2018-v1.6/Hansen_GFC-2018-v1.6_treecover2000_50N_010W.tif
It's "just" 218 MB, and at roughly 200 MB per tile, the whole set of 504 tiles should add up to something on the order of 100 GB. Massive.

So, three challenges:
  1. How to download all images
  2. How to merge them together to a single file
  3. (Optional, but recommended) Reducing the resolution a bit to make it more manageable 

1. How to Download all images

Well, doing it manually is definitely an option, although it's probably easier to do it programmatically.
import ssl
import urllib.request

# The https certificate of the download server isn't valid,
# so disable the ssl checks
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# Build the list of section names, e.g. "50N_010W".
# Latitudes go from 80N down to 50S, longitudes from 180W to 170E;
# latitude is zero-padded to 2 digits and longitude to 3,
# matching the dataset's file naming.
sections = []

for i in range(8, -6, -1):
    for j in range(-18, 18):
        lat = f'{str(abs(i) * 10).zfill(2)}{"N" if i >= 0 else "S"}'
        lon = f'{str(abs(j) * 10).zfill(3)}{"E" if j >= 0 else "W"}'
        sections.append(f'{lat}_{lon}')

# Download each 10x10 degree tile to the current folder
for section in sections:
    url = 'https://storage.googleapis.com/earthenginepartners-hansen/GFC-2018-v1.6/' + \
            f'Hansen_GFC-2018-v1.6_treecover2000_{section}.tif'

    with urllib.request.urlopen(url, context=ctx) as u, open(f"{section}.tif", 'wb') as f:
        f.write(u.read())
The code above, in Python 3, iterates over all the grid squares, builds the corresponding download URL and saves each image.

As the https certificate isn't valid, you need to turn off the SSL checks, hence the code at the beginning.

2. How to merge them together to a single file

It's actually quite simple, but you'll need GDAL for that, hence you'll need to install it first.

gdal_merge is incredibly simple to use:

gdal_merge.py -o output-file-name.tif file1.tif file2.tif fileN.tif

In addition to those parameters, I suggest compressing the output, as otherwise an already large file could become horrendously huge.

gdal_merge.py -o output-file-name.tif <files> -co COMPRESS=DEFLATE

And that's it. I'll show how this all ties together in the Python script at the end, but you can "easily" do it manually if you concatenate the 504 file names in this command.

3. Reducing the resolution a bit to make it more manageable 

As I've mentioned, the source images combined result in lots and lots of GBs, which I currently don't have available on my machine. Hence, I've reduced the resolution of each image.

Please note that this isn't simply a resolution change in a graphics application, as it needs to preserve the geospatial information. Again, GDAL to the rescue, now using the gdalwarp command:
gdalwarp -tr 0.0025 0.0025 input.tif output.tif

The two values after -tr represent the target pixel size. From running the command gdalinfo on any of the original tifs I can see that the original pixel size is:

Pixel Size = (0.0002500000000000,-0.0002500000000000)

Empirically I've decided to keep 1/10th of the original precision, hence the values above (0.0025 0.0025).

As before, I suggest compressing the content:
gdalwarp -tr 0.0025 0.0025 input.tif output.tif -co COMPRESS=DEFLATE

You do lose some quality, but it's a trade-off. If you have plenty of RAM and disk space you can keep a higher resolution.
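To put the pixel sizes in perspective: one degree of latitude is roughly 111 km, so 0.00025° is about 28 m (consistent with the advertised 30x30 m source resolution) and the reduced 0.0025° is about 278 m per pixel. A quick back-of-the-envelope check, assuming a spherical Earth:

```python
# Approximate meters per degree of latitude on a spherical Earth
EARTH_CIRCUMFERENCE_M = 40_075_000
METERS_PER_DEGREE = EARTH_CIRCUMFERENCE_M / 360  # ~111,319 m

# Original vs reduced pixel size, in meters on the ground
for pixel_deg in (0.00025, 0.0025):
    print(pixel_deg, round(pixel_deg * METERS_PER_DEGREE, 1), "m")
```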

Original
1/10th of resolution
Final script

The following Python 3 script does everything in one go. The interesting bit is that I reduce the resolution of each individual tile before merging the complete map. The script also cleans up after itself, leaving only the final tif file, named "treecover2000.tif".
import ssl
import urllib.request
import os

# The https certificate of the download server isn't valid,
# so disable the ssl checks
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

extension = ".small.tif"

# Build the list of section names, e.g. "50N_010W" (lat 80N..50S, lon 180W..170E),
# zero-padded to match the dataset's file naming
sections = []

for i in range(8, -6, -1):
    for j in range(-18, 18):
        lat = f'{str(abs(i) * 10).zfill(2)}{"N" if i >= 0 else "S"}'
        lon = f'{str(abs(j) * 10).zfill(3)}{"E" if j >= 0 else "W"}'
        sections.append(f'{lat}_{lon}')

for section in sections:
    print(f'Downloading section {section}')
    url = 'https://storage.googleapis.com/earthenginepartners-hansen/GFC-2018-v1.6/' + \
            f'Hansen_GFC-2018-v1.6_treecover2000_{section}.tif'

    with urllib.request.urlopen(url, context=ctx) as u, open(f"{section}.tif", 'wb') as f:
        f.write(u.read())

    # Reduce the tile to 1/10th of the resolution and drop the original
    os.system(f'gdalwarp -tr 0.0025 0.0025 -overwrite {section}.tif {section}{extension} -co COMPRESS=DEFLATE')
    os.system(f'rm {section}.tif')

# Merge all the reduced tiles into a single compressed GeoTIFF and clean up
os.system(f'gdal_merge.py -o treecover2000.tif {(extension + " ").join(sections)}{extension} -co COMPRESS=DEFLATE')
os.system(f'rm *{extension}')
The "treecover2000.tif" ends up at 751 MB and looks AWESOME. Zooming in on Portugal, Spain and a bit of France: