Saturday, November 14, 2009

Pixel worth 1000 bits

It took a rather long time for raster imagery (maps but also aerial and satellite photography) to gain wide acceptance in online GIS applications. Raster imagery used to be just a backdrop to the “real data” (i.e. vectors), as substantial file sizes (often in obscure proprietary formats) and generally low resolution made imagery difficult to incorporate into applications and rather impractical to use for more than just the initial “splashback”. The fact that remote sensing, aerial surveys and GIS were traditionally seen as totally separate disciplines did not help to drive the amalgamation of various formats into a single data management and online visualisation solution.

It would be fair to say that the release of Google's online mapping technology was the catalyst for change. These days raster imagery dominates online mapping applications, especially those with high traffic such as Google Maps, Bing Maps and Yahoo Maps. The ability to show ground features in detail increased significantly with advances in image capture technologies, reducing the reliance on vectors to depict such information. The ingenious use of small image tiles, which overcomes the file size problem of raster data and improves the efficiency of online distribution, made it much easier to present information in raster rather than vector format.
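The tiling scheme that made this possible can be sketched in a few lines. The snippet below, a minimal illustration assuming the now-standard Web Mercator “slippy map” addressing (256×256-pixel tiles, 2^zoom tiles per axis), converts a latitude/longitude to the x/y index of the tile that covers it:

```python
import math

def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Convert a WGS84 lat/lon to x/y tile indices at a given zoom level
    (Web Mercator "slippy map" tiling: 2**zoom tiles along each axis)."""
    lat_rad = math.radians(lat_deg)
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Canberra at zoom level 10
print(latlon_to_tile(-35.28, 149.13, 10))  # -> (936, 619)
```

Because tile indices are deterministic, every client asks for exactly the same small images, which is what makes pre-rendering and caching so effective.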

[example of raster map: Topo 250K by Geoscience Australia]

Just to clarify: in traditional online GIS applications both raster images and vector data are presented on a computer screen as images (e.g. GIF, JPG, PNG). The difference is in how those images are created. In the case of vector data (and some raster data with per-pixel attribute information), the images are generated from data on the fly by a map server (caching techniques are now available for static information to reduce the load), and a “link” is maintained between what is presented on the image and the vector data. So, if the user clicks on a line representing, say, a road, attribute information specific to that segment of the line can be returned as a query result (e.g. name, width, type). In the case of raster imagery without per-pixel attribute information, the image is pre-generated in advance (e.g. a raster topographic map). With this approach, dynamic referencing to attribute information in the source dataset is lost.
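The “link” between a server-rendered image and its source attributes is exactly what the OGC WMS standard exposes through its GetFeatureInfo request: the client sends back the pixel that was clicked, and the server looks up the feature drawn there. A minimal sketch of building such a request follows; the server URL and layer name are hypothetical, invented for illustration only:

```python
from urllib.parse import urlencode

def get_feature_info_url(base_url, layer, bbox, width, height, i, j):
    """Build an OGC WMS 1.3.0 GetFeatureInfo request URL. The clicked
    pixel (i, j) within a width x height map image is sent back so the
    server can return attributes of the feature rendered at that spot."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetFeatureInfo",
        "LAYERS": layer,
        "QUERY_LAYERS": layer,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "I": i,   # pixel column of the user's click
        "J": j,   # pixel row of the user's click
        "INFO_FORMAT": "text/plain",
    }
    return base_url + "?" + urlencode(params)

# Hypothetical server and layer, for illustration only.
url = get_feature_info_url(
    "https://example.com/wms", "roads",
    (-36.0, 148.0, -35.0, 150.0), 512, 512, 120, 240)
print(url)
```

Pre-generated raster tiles have no equivalent of this round trip, which is precisely the loss of dynamic referencing described above.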

[example of map image generated from vector data: Topo 250K by Geoscience Australia]

It is technically possible to use a mix of vector data and raster imagery in a single application to get the best of both approaches (i.e. high resolution imagery, or nice pre-designed cartographic representation of vector data, delivered fast as tiled images yet still referenced to attribute information in the original dataset), but I have not yet seen this implemented as general practice. Here is another idea for Google – add a “get feature” service to your maps! It would work exactly like the “reverse geocoding” service, but rather than returning address information for a point (based on a cadastral vector dataset) it could also return information on other map features (roads, parks, buildings, other POI, etc.). Creating a “link” to the source vector data could also open up opportunities for all sorts of spatial query services: “distance” and “area”, but also more complex ones like “select within”, “adjacent”, etc.
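The core of such a “get feature” service is a point-in-polygon lookup against the source vector dataset. A minimal sketch, with entirely made-up features, could look like this:

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is point (x, y) inside the polygon given as a
    list of (x, y) vertices?"""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Count edge crossings of a horizontal ray from (x, y).
        if (y1 > y) != (y2 > y):
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def get_feature(x, y, features):
    """Return the attributes of the first feature whose polygon contains
    the clicked point -- the essence of a 'get feature' service."""
    for attrs, ring in features:
        if point_in_polygon(x, y, ring):
            return attrs
    return None

# Hypothetical features: a park and a building footprint.
features = [
    ({"name": "City Park", "type": "park"},
     [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ({"name": "Town Hall", "type": "building"},
     [(12, 0), (15, 0), (15, 5), (12, 5)]),
]
print(get_feature(3, 4, features))  # -> {'name': 'City Park', 'type': 'park'}
```

A production service would, of course, use a spatial index rather than a linear scan, but the principle – click location in, feature attributes out – is the same.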

With the exception of a handful of technologies developed specifically for true vector and imagery data streaming, the overwhelming majority of online mapping applications are not capable of efficiently rendering vector data into complex images – on the fly and in the volumes required for today’s Internet – without massive hardware infrastructure behind them.

Currently, reliance on true vector data in browsers is limited to presenting a small number of points, polygons or 3D objects, or to highly specialised applications. There is support for true vector data in the browser environment via Java, Flash or Silverlight, but making it work efficiently requires sophisticated solutions for on-the-fly vector generalisation and local data caching (as mentioned above, only a handful of companies have managed to do it, and they are not the industry leaders!). Although, I should mention that I am very impressed with an online application I saw recently, developed in Silverlight and nicely showing quite a volume of vector data – I will have to investigate in more detail!
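On-the-fly vector generalisation is usually some variant of the classic Ramer–Douglas–Peucker algorithm: vertices that deviate from the simplified line by less than a tolerance are dropped, so a coarse zoom level ships far fewer coordinates. A minimal sketch:

```python
def douglas_peucker(points, tolerance):
    """Ramer-Douglas-Peucker line generalisation: recursively keep the
    vertex furthest from the chord, drop everything within tolerance."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    # Perpendicular distance of each interior vertex from the chord.
    max_dist, index = 0.0, 0
    for k in range(1, len(points) - 1):
        x0, y0 = points[k]
        num = abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1)
        den = ((y2 - y1) ** 2 + (x2 - x1) ** 2) ** 0.5 or 1.0
        d = num / den
        if d > max_dist:
            max_dist, index = d, k
    if max_dist <= tolerance:
        return [points[0], points[-1]]   # chord is close enough
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right             # merge, dropping the duplicate

line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
print(douglas_peucker(line, 1.0))  # -> [(0, 0), (2, -0.1), (3, 5), (7, 9)]
```

Eight vertices collapse to four with no visible change at the target scale – exactly the kind of thinning a browser needs before it can draw large vector datasets responsively.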

Applications such as Google Maps make use of the browser’s support for vector data in VML/SVG format but, overall, browser processing capabilities are very limited. Therefore, although Google Maps accepts vector data in KML format, if the file is too big (100KB+) Google will convert it to image tiles to speed up rendering of the information in the browser. That is appropriate for presenting static data but will not work with dynamic information (e.g. thematic maps), because once vectors are converted to images on the initial map load they cannot be changed with a script. And if the same amount of data is imported into a Google Maps application and rendered as true vectors (e.g. with the GPolygon() function), loading and drawing the information on the map is rather slow.

There is a new concept emerging for handling spatial information, regardless of whether the source is raster imagery or vector data: the concept of a spatial grid. Traditionally, grids were used with Digital Elevation Model (DEM) data. Later they were also found to be applicable in the field of spatial analysis. Now the grid concept can also be applied to referencing a myriad of attributes to a specific location (cell) on the surface of the Earth – making all the data hierarchical and analysis-ready. If I understand correctly, organisations such as the USGS are now planning to start releasing information as referenced grid cells rather than traditional imagery, although there are still some challenges in defining those grids, indexing the data and, of course, in storage capabilities for high resolution datasets.
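The essence of the idea is that any number of attributes can hang off the same location key. A toy sketch, assuming the simplest possible grid (a regular lat/lon mesh rather than any of the hierarchical schemes being debated) and with made-up attribute values:

```python
def cell_id(lat, lon, cell_size_deg):
    """Map a lat/lon position to the (row, col) index of a regular
    geographic grid whose cells are cell_size_deg degrees on a side."""
    row = int((lat + 90.0) / cell_size_deg)
    col = int((lon + 180.0) / cell_size_deg)
    return row, col

# Attributes from different sources -- elevation, land cover,
# population -- all keyed by the same location index (values invented).
grid = {}
cell = cell_id(-35.28, 149.13, 0.01)   # ~1 km cells near the equator
grid.setdefault(cell, {}).update({"elevation_m": 577, "landcover": "urban"})
grid.setdefault(cell, {}).update({"population": 356000})
print(cell, grid[cell])
```

Once everything shares one cell index, overlay analysis degenerates into a key lookup – no geometric intersection required.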

The theory and technologies developed for handling imagery will find use in implementing this new approach. After all, image pixels are a form of grid. Graphic formats offer more efficient storage and compression capabilities than traditional spatial databases, and the emergence of general-purpose computation on graphics processing units (GPUs) offers great hope for very fast analytical capabilities – a new and exciting era in spatial data management, processing and delivery!


Jorge said...


I propose the following equation to answer the question: grid or vector format?

If interest [x,y] > interest [z] then preferred format = vector format
If interest [z] > interest [x,y] then preferred format = grid format

The problem is when interest [x,y] = interest [z] then ....!

Note: [x,y] are the planimetric values of spatial data and [z] the thematic values of spatial data.

Arek said...

Hi Jorge,

The suggested equations seem logical; however, if one looks from a totally objective perspective, [z] gives many more options. What I mean is that it is theoretically possible to extract planimetric values from gridded data as well, although the algorithms may be more complex. So, potentially, one data format can fit all requirements.
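To illustrate the point with a deliberately simple sketch (a toy regular grid, invented values): the planimetric [x,y] of a theme can be recovered from gridded [z] by emitting the centre of every cell that matches the thematic value.

```python
def cells_to_points(grid, value, cell_size, origin=(-90.0, -180.0)):
    """Recover planimetric [x, y] from gridded [z]: return the lat/lon
    centre of every cell whose stored value equals `value`."""
    lat0, lon0 = origin
    points = []
    for (row, col), z in grid.items():
        if z == value:
            lat = lat0 + (row + 0.5) * cell_size
            lon = lon0 + (col + 0.5) * cell_size
            points.append((lat, lon))
    return points

# Toy grid: a few 1-degree cells coded by land cover.
grid = {(0, 0): "water", (0, 1): "land", (1, 1): "water"}
print(cells_to_points(grid, "water", 1.0))
```

Tracing actual boundaries (e.g. marching squares) is more involved, which is the “more complex algorithms” caveat above, but the location information is demonstrably all there.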

It appears that the gridded format may have much wider application than we can currently anticipate. The new paradigm is that this is no longer a vector vs raster argument, but rather vector + raster vs grid. Again, in theory, grid can replace the other two formats.

I am not trying to suggest that "grid is better than the rest" but I am curious to explore the limits of this approach.