11:55 AM, Thursday January 14, 2021
Discrete Global Grids (DGGs) are imaginary mosaics that divide up the surface of a globe. We call each piece of the mosaic a "cell" or "zone" and give it a unique identifier, like a social security number.
The DGG is created carefully so that given the ID of any cell, you can determine the IDs of the cells that are next to it. This makes many kinds of spatial interrogation easier, since we have hidden away the shape of the cell.
Some DGGs, probably the more useful kind, are also hierarchical. This is when each cell contains another mosaic of cells, and is itself part of a bigger cell (just like your parents both have parents, and children). This relationship can keep going (we call that "recursive") as the cells get smaller and smaller in terms of the area of the Earth that they each represent.
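To make the parent and child idea a little more concrete, here is a small sketch in Python. It uses a made-up ID scheme, not the scheme of any real DGG: a base cell is a single letter, and every level below it adds one digit from 0 to 3. The function names are invented for illustration.

```python
from typing import List, Optional

# Toy cell IDs: one base-cell letter, then one digit (0-3) per level.
# This scheme is an assumption for illustration; real DGGs define their own IDs.

def parent(cell_id: str) -> Optional[str]:
    """Return the ID of the bigger cell that contains this one (None for a base cell)."""
    if len(cell_id) <= 1:
        return None
    return cell_id[:-1]

def children(cell_id: str) -> List[str]:
    """Return the IDs of the four smaller cells inside this one."""
    return [cell_id + digit for digit in "0123"]

def level(cell_id: str) -> int:
    """Return how many times the base cell was divided to reach this cell."""
    return len(cell_id) - 1

print(parent("N21"))    # N2
print(children("N2"))   # ['N20', 'N21', 'N22', 'N23']
print(level("N21"))     # 2
```

The point is that the ID alone is enough to move up and down the hierarchy; we never have to look at the shape of a cell.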
There is a similar concept for rasters called "pyramids".
So what's the difference? Pyramids are lower resolution versions of a higher-resolution dataset. They are used to make it faster to draw and study spatial data. Constructing pyramids is to construct "resampled" versions of your highest-resolution data, using well-known methods of resampling. But pyramids are not used to construct higher resolution versions of your data.
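As a rough sketch of what constructing a pyramid involves, the Python below halves the resolution of a small grid again and again by averaging each block of 2 x 2 cells. Averaging is only one of the well-known resampling methods, and the function names and the assumption of a square grid whose side is a power of two are made up for this example; real GIS software does this work for you.

```python
def downsample(grid):
    """Halve the resolution of a square grid by averaging each 2x2 block."""
    size = len(grid) // 2
    return [
        [
            (grid[2 * r][2 * c] + grid[2 * r][2 * c + 1]
             + grid[2 * r + 1][2 * c] + grid[2 * r + 1][2 * c + 1]) / 4
            for c in range(size)
        ]
        for r in range(size)
    ]

def build_pyramid(grid):
    """Return every level, from the full-resolution grid down to a single cell.

    Assumes the grid is square and its side length is a power of two.
    """
    levels = [grid]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1]))
    return levels

source = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
pyramid = build_pyramid(source)
print(pyramid[1])  # [[1.0, 2.0], [3.0, 4.0]]
print(pyramid[2])  # [[2.5]]
```

Notice that the levels only ever go from more detail to less detail: nothing here can invent detail that was not in the source grid.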
Pyramids are not created on-the-fly, but must be made before they can be used. Pyramids are usually stored in a file kept alongside the source dataset. Examples include an .ovr (overview) or .rrd (reduced resolution dataset) file. Pyramid files, like the source raster, can additionally be compressed. Raster formats with wavelet compression (including JPEG 2000) actually have "internal" pyramids: they don't need a pyramid file that sits next to the source file.
The primary difference between a raster pyramid and a hierarchical DGG is that the layers of a hierarchical DGG do not have to be downsampled versions of some higher-resolution dataset. Each layer of the hierarchical DGG could contain different attributes. For example, a hierarchical DGG with landuse information for each cell might, at a high level, use very broad or mixed classifications, such as "orcharding". As you step down through the levels, cells with an "orcharding" parent may be further resolved into "kiwifruit orcharding", "viticulture", etc. Further still we might discriminate between features within an orchard, such as sheds, access roads, shelter belts, and actual plants.
There is nothing that prevents a hierarchical DGG being used in a similar way to a raster pyramid, particularly if the hierarchical DGG cells contain continuous measurement data. But what's nice about hierarchical DGGs is that you aren't limited to that kind of downsampling.
With a hierarchical DGG there is also no need to always split a cell into smaller and smaller pieces: if the pieces would all have the same value as the larger cell, we can simply keep the large cell around without further work. This is just like how in some styles of mosaic, you can get away with using large blue pieces to represent a sea. This is called compacting.
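Here is a minimal sketch of compacting in Python. It assumes a made-up ID scheme (a base-cell letter followed by one digit from 0 to 3 for each level), so that dropping the last digit of an ID gives the parent cell. The function name and the scheme are invented for illustration and are not taken from any particular DGG library.

```python
def compact(cells):
    """Merge groups of four sibling cells that share one value into their parent.

    `cells` maps toy cell IDs (base letter + digits 0-3) to hashable values.
    Merging repeats until no group of four equal siblings is left.
    """
    cells = dict(cells)  # work on a copy
    changed = True
    while changed:
        changed = False
        # Visit the smallest (longest-ID) cells first.
        for cell_id in sorted(cells, key=len, reverse=True):
            if cell_id not in cells or len(cell_id) <= 1:
                continue  # already merged away, or a base cell
            parent_id = cell_id[:-1]
            siblings = [parent_id + digit for digit in "0123"]
            if all(s in cells for s in siblings):
                values = {cells[s] for s in siblings}
                if len(values) == 1:
                    for s in siblings:
                        del cells[s]
                    cells[parent_id] = values.pop()
                    changed = True
    return cells

detailed = {
    "N00": "sea", "N01": "sea", "N02": "sea", "N03": "sea",
    "N1": "sea", "N2": "sea", "N3": "land",
}
print(compact(detailed))
# {'N1': 'sea', 'N2': 'sea', 'N3': 'land', 'N0': 'sea'}
```

The four small sea cells collapse into one larger sea cell, but the result stops there because one of the four cells at the next level up is land.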
(Advanced readers may be interested in learning about R-trees, a graph data structure, and Hilbert curves, a kind of space-filling curve. Both also relate to the hierarchical division of space.)
Now we can start to see what makes a DGG similar to both raster and vector data. Like vector data, each cell has an identifier and can therefore be associated with any kind of data. Like raster data pyramids, each layer of a hierarchical DGG allows for more detail to emerge.
So far we have been talking about discrete global grids and have shown that they are discrete (each cell is distinct from each other cell) and form a regular grid (each cell has neighbour cells, and for hierarchical DGGs also "child" and "parent" cells). But we have not explained why they are global.
When making maps of the planet, we have to have some idea of what the planet looks like: how big it is, how round it is, how lumpy it is. This idea of the planet's shape and size is called a datum. Sometimes we need to look at or think about the Earth as being flat (for example, showing maps on flat screens and in books), and like peeling an orange and trying to get all the pieces to lie flat, it's not possible to do this without ripping it apart.
But if we instead pretend that the planet is like a special "pointy orange", we can get past some problems. One of the best problems we solve is how to keep track of which other faces each face of the special pointy orange touches: we call this topology. The best thing about the special pointy oranges is that they don't have gaps, yet still sort of look like balls, if you squint and enough faces are used.
The special pointy orange used to represent the Earth can be made up of different cell shapes, the same as how a mosaic wall can be made of differently-shaped tiles (rectangles, triangles, hexagons). However, we try to avoid using a random selection of shapes, because if the shapes are all the same, it is really fast for us to use computers to tell us things like "what cells are next to this cell?"
Some kinds of special pointy oranges are more useful than others. One thing we often want to do is to consider each cell to represent the same size area of the Earth. This is a really handy thing to do when counting things on Earth: if the cells were not the same size, larger cells would be more likely to have things inside them, just because they're bigger. One DGG that does this equal-area stuff is called HEALPix (Hierarchical Equal Area isoLatitude Pixelisation of a sphere), invented by some people at NASA.
HEALPix begins with 12 base cells. At each level, every cell is divided into four smaller ones.
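The numbers grow quickly. As a small sketch in Python, we can count the cells at each level (12 base cells, each divided into four, level after level) and work out how much of a sphere each one covers, since all cells at one level have the same area. The function names are made up here, and the radius of 6371 km is an assumed round figure for a ball about the size of the Earth.

```python
import math

BASE_CELLS = 12  # HEALPix begins with 12 base cells
SPLIT = 4        # every cell is divided into four at each level

def cell_count(level):
    """Number of cells covering the whole sphere at a given level."""
    return BASE_CELLS * SPLIT ** level

def cell_area_km2(level, radius_km=6371.0):
    """Area of one cell, found by sharing the sphere's surface equally.

    The default radius is an assumed mean Earth radius, for illustration.
    """
    sphere_area = 4 * math.pi * radius_km ** 2
    return sphere_area / cell_count(level)

for lvl in range(4):
    print(lvl, cell_count(lvl))
# 0 12
# 1 48
# 2 192
# 3 768
```

Because each step multiplies the count by four, each step also cuts the area of a single cell to a quarter.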
HEALPix only works with spheres: perfectly round balls. Although it looks like one, Earth is not actually a sphere; it is more like an "ellipsoid". A group of New Zealanders (Gibb, Raichev & Speth) have taken the idea of HEALPix and made an alternative, called rHEALPix, that works with ellipsoids. (Specifically, with "ellipsoids of revolution", which is where two axes of the ellipsoid have the same length. In reality, the Earth isn't exactly an ellipsoid of revolution either, since all three of its axes differ slightly in length. But the WGS84 ellipsoid, which you've probably used a lot, is an ellipsoid of revolution: its third axis is shorter than the other two, which makes it an "oblate spheroid". So it's not like this is a huge problem.) rHEALPix does this without losing any of the other things people like about HEALPix.
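To see what "two axes have the same length" looks like in numbers, here is a short Python sketch using the published defining values of the WGS84 ellipsoid (the equatorial radius and the inverse flattening). The variable names are my own choice for this example.

```python
# WGS84 defining values
EQUATORIAL_RADIUS_M = 6378137.0        # used for BOTH axes in the equatorial plane
INVERSE_FLATTENING = 298.257223563

# The polar (third) axis is a little shorter than the other two.
flattening = 1 / INVERSE_FLATTENING
polar_radius_m = EQUATORIAL_RADIUS_M * (1 - flattening)

difference_km = (EQUATORIAL_RADIUS_M - polar_radius_m) / 1000
print(round(polar_radius_m, 1))  # 6356752.3
print(round(difference_km, 1))   # 21.4
```

So the ball is squashed by about 21 km from pole to centre, which is small next to a radius of more than 6000 km, but big enough to matter when you want cells with equal areas.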
rHEALPix doesn't just leave it there: it can also be flattened out into a series of squares that are nicely aligned vertically and horizontally, like a tower of square blocks. Each of these squares contains some more squares (it is hierarchical). This makes rHEALPix easy for a computer to draw on a screen.
A lot of this thinking stems from work done by futurist architect Richard (R.) Buckminster Fuller. He is an interesting character in this story. He designed a geodesic dome known as the Biosphère, which is home to a museum of the environment. In 1927, when experiencing a dark night of the soul and contemplating suicide, he claimed to have felt himself suspended several feet above the ground, enclosed in a white sphere of light. A voice spoke directly to him:
From now on you need never await temporal attestation to your thought. You think the truth. You do not have the right to eliminate yourself. You do not belong to you. You belong to the Universe. Your significance will remain forever obscure to you, but you may assume that you are fulfilling your role if you apply yourself to converting your experiences to the highest advantage of others.
However, this account was not recorded by Fuller himself at the time he claims it occurred; and this is relevant because Fuller has the most self-documented life in human history—literally.
The benefits of using squares in rHEALPix don't mean that a different DGG that uses triangles or hexagons is worse. Choosing the best one for you depends on what you need to do. rHEALPix is based on squares, and each square in a grid of squares has eight neighbours, but only four of them share a side. A grid of triangles has a similar issue, with even more neighbours. Instead, a grid based on hexagons may be better since all six neighbours of a hexagon share a side. So if you're thinking about a problem to do with movement, or if you only want to consider one type of cell neighbour, then a DGG with hexagon cells may be better than one with square or triangle cells.
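The difference between the two kinds of neighbour is easy to see in a small Python sketch. This one works on a flat, endless grid rather than on a globe, which is a big simplification. Squares are addressed by (column, row), and hexagons by "axial" coordinates (q, r), a common way of numbering hexagon grids. The names are invented for this example.

```python
# Steps from a square to the four neighbours that share a side with it...
SQUARE_SIDE = [(0, 1), (1, 0), (0, -1), (-1, 0)]
# ...and to the four neighbours that only touch it at a corner.
SQUARE_CORNER = [(1, 1), (1, -1), (-1, -1), (-1, 1)]

# Steps from a hexagon (axial coordinates q, r) to its six neighbours.
# Every one of them shares a side; there are no corner-only neighbours.
HEX_SIDE = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def neighbours(cell, steps):
    """Return the cells reached by applying each step to `cell`."""
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in steps]

print(len(neighbours((2, 3), SQUARE_SIDE + SQUARE_CORNER)))  # 8
print(neighbours((2, 3), SQUARE_SIDE))  # [(2, 4), (3, 3), (2, 2), (1, 3)]
print(len(neighbours((2, 3), HEX_SIDE)))  # 6
```

With squares you have to decide whether a corner touch counts as "next to". With hexagons that question never comes up.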
There is no DGG that is perfect for every job. Hexagons have a drawback too: it is not possible to make a perfectly nested hierarchical DGG with hexagons, because you cannot form a large hexagon out of small hexagons. This does not make such a DGG bad. Actually, a hexagon grid is exactly what Uber uses for H3, where perfectly aligned shapes aren't as important as having an abstract hierarchy and touching shapes for studying movement.
A hierarchical DGG is also called a discrete global grid system, or DGGS. I prefer "hierarchical DGG" and not "DGGS" here because it's clearer about the difference between a DGG with one level, and a DGGS with nested ("hierarchical") levels.
This has been written with help from xkcd's Simple Writer, which tells me when I use difficult words. I have left some difficult words in because I could not think of words or phrases that I liked better. I'm not a technical writer by profession, so this is my best attempt at a general purpose introduction to DGGs.