
Representing scientific data sets in KML: Methods and challenges

2010, Computers & …

Computers & Geosciences 37 (2011) 57–64
Journal homepage: www.elsevier.com/locate/cageo

Lisa M. Ballagh a,*, Bruce H. Raup a, Ruth E. Duerr a, Siri Jodha S. Khalsa a, Christopher Helm a,c, Doug Fowler a, Amruta Gupte b

a National Snow and Ice Data Center, Cooperative Institute for Research in Environmental Sciences, University of Colorado, 449 UCB, Boulder, CO 80309-0449, USA
b Department of Electrical, Computer, and Energy Engineering, University of Colorado, 425 UCB, Boulder, CO 80309-0425, USA
c National Renewable Energy Laboratory, 1617 Cole Boulevard, Golden, CO 80401-3393, USA

Article history: Received 30 May 2009; received in revised form 7 May 2010; accepted 13 May 2010.

Abstract

Virtual Globes such as Google Earth and NASA World Wind permit users to explore rich imagery and the topography of the Earth. While other online services such as map servers provide ways to view, query, and download geographic information, the public has become captivated with the ability to view the Earth’s features virtually. The National Snow and Ice Data Center began to display scientific data on Virtual Globes in 2006. The work continues to evolve with the production of high-quality Keyhole Markup Language (KML) representations of scientific data and an assortment of technical experiments. KML files are interoperable with many Virtual Globe or mapping software packages. This paper discusses the science benefits of Virtual Globes, summarizes KML creation methods, and introduces a guide for selecting tools and methods for authoring KML for use with scientific data sets.

© 2010 Elsevier Ltd. All rights reserved.

Keywords: Virtual Globe; Google Earth; Science education and outreach; Cryosphere; Snow and ice

1. Introduction

Virtual Globes such as Google Earth and NASA World Wind permit users to explore features on the Earth’s surface. These Virtual Globes tap into huge databases of imagery, roads, and other geographic features. One of the prime merits of Google Earth is the ability to view seamless, true color satellite imagery at every location on the surface of Earth. Although the quality of the imagery is not the same everywhere, even at lower resolutions significant features can be easily seen at every location. The date that the imagery was acquired varies but the information conveyed (land cover characteristics, settlements, structures, topography) is generally representative of the present or recent past. The National Snow and Ice Data Center (NSIDC) initiated Virtual Globes work in 2006 and these efforts continue to evolve with the development of high-quality Keyhole Markup Language (KML) representations of scientific data. Since KML is an Open Geospatial Consortium (OGC) standard (Wilson, 2008), KML files are interoperable with many Virtual Globes. This paper describes the science benefits of Virtual Globes, summarizes KML creation methods, and introduces a KML authoring guide for use with scientific data sets.

2. How Virtual Globes enhance science

While Virtual Globes are renowned for their intuitive visualization capabilities, they possess a characteristic that may be overlooked: the enhancement of science. Goodchild (2008) points out that by enabling exploration of spatial relationships Virtual Globes can be powerful aids to understanding and insight. In a report from the National Research Council (2006) it is argued that spatial thinking is integral to the everyday work of scientists and engineers. It has underpinned many scientific and technical breakthroughs and is therefore an “educational necessity.” There are now innumerable examples of Earth science data being displayed in Google Earth.
The KML files produced by the US Geological Survey are in everyday use by citizens, scientists and decision makers worldwide, with two outstanding examples: real-time water (footnote 1) and real-time earthquake data (footnote 2). O’Brien (2009) described an experiment in which magnetic declination was accurately determined using only a GPS receiver and a Virtual Globe. Researchers with different expertise are also collaborating via Google Earth. For example, by combining an Arctic sea ice layer with a layer tracking walrus movement in Google Earth, researchers can observe how changes to the Arctic sea ice are impacting walruses’ habitat, migratory patterns and behavior (Butler, 2006). At NSIDC, we have improved internal data quality control processes, provided users with means to browse and visualize satellite and other imagery, and enhanced communication of scientific results to the general public by making cryospheric images and other data accessible via KML files.

Footnote 1: http://waterwatch.usgs.gov/index.php?m=real&w=kml
Footnote 2: http://earthquake.usgs.gov/learn/kml.php
* Corresponding author. Tel.: +1 303 735 5402; fax: +1 303 492 2468. E-mail addresses: [email protected], [email protected] (L.M. Ballagh).
0098-3004/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2010.05.004

Fig. 1. GLAS data shown in Google Earth. A GLAS ground track crossing Ruth Glacier. The return pulse waveform (red) reflects a combination of roughness and slope within the selected footprint on the glacier surface. With repeat orbits, changes in glacial elevation can be determined. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

2.1. Improving data quality control

A practical example of the effectiveness of visualizing data on a Virtual Globe is the Global Land Ice Measurements from Space (GLIMS) data ingest process.
The GLIMS group located at NSIDC receives glacier outlines and other associated data from analysts around the world. A number of quality control (QC) steps are applied before inserting the data into the GLIMS glacier database (Armstrong et al., 2005; Raup et al., 2007). For example, polygons are checked for closure and proper assignment of basic attributes such as identification numbers. One step in this QC process is to convert incoming glacier outlines to KML, color-coding the glacier outlines according to their glaciological meaning (e.g. glacier boundary, rock or lake boundary within a glacier, or transient snow line). Paterson (1994) describes these glaciological meanings in more detail. Visualizing the data over Google Earth imagery allows for efficient checking of geolocation and proper assignment of data attributes, and also allows for detection of glaciers for which no outlines were sent.

2.2. Improving browse capabilities

NSIDC experimented with the use of Google Earth to browse data obtained with NASA’s Ice, Cloud, and land Elevation Satellite (ICESat). ICESat carries one instrument, the Geoscience Laser Altimeter System (GLAS). The GLAS laser emits short pulses of light toward the Earth and records the reflected radiation. Elevation of surface features is computed from the round-trip travel time of the pulse. The footprint illuminated on the surface of the Earth by the laser is approximately 70 m in diameter. The laser fires 40 times per second, placing footprints at 170 m intervals along the ground track. NSIDC now archives the data from over 1.5 billion laser shots from GLAS. Users ordering GLAS data do so without any knowledge of the data’s quality and often with only an approximate knowledge of where the individual shots are relative to their area of interest. NSIDC saw the use of Google Earth as one way to decrease the time users spend searching for and inspecting GLAS data.
From the Global Elevation Product (GLA06; Zwally et al., 2009), we extracted parameters for geolocating and assigning quality indicators to each laser shot. We used the NSIDC GLAS Altimetry elevation extractor Tool (NGAT), a tool available to users, to retrieve the data. Our approach was to display the GLAS footprints as disks of 70 m diameter, which, when displayed on a Virtual Globe, allow users to see more or less precisely where each shot was in the context of what was on the surface. The footprints are drawn as 3-D objects using KML models (Wernecke, 2008), so that their size is invariant with zoom level. Disks are color-coded to convey the quality of each shot. The timing and shape of the return laser pulse, called the waveform, is the basis of all information derived from GLAS. Aside from its primary mission of ice sheet change detection, the GLAS waveform can also be used to study vegetation height and tree-covered locations, which can be used to estimate biophysical parameters (Harding and Carabajal, 2005). To properly interpret and quality check the elevations produced by GLAS, inspection of the waveforms is essential. For this reason we decided that, in addition to providing precise location and some indication of quality through coloring of the disks, we would also display the waveform associated with each shot, allowing more advanced users to make precise assessments of data quality (Fig. 1). Each footprint has a coincident and similarly colored placemark which, if clicked, displays an image of the waveform, along with additional information such as the time and computed elevation. For this experiment, all waveform images were first created and then used in the generation of the KML files. One future endeavor may be to produce the waveforms on-the-fly. Although GLAS acquires data globally, our initial experiment was with a 5 by 5 degree region around Anchorage, Alaska.
This area was chosen for the diversity of snow, ice, water, tundra, forest, and urban surface types present. The density of data we were attempting to portray required us to employ various techniques (footnote 3) to stay within the capabilities of most users’ computers. These included partitioning the area into smaller tiles (a method called regionation, which is described later) and displaying only a sampling of footprints until the zoom level allows individual footprints to be seen.

Footnote 3: http://nsidc.org/data/icesat/visge/

2.3. Exploring temporal change

NSIDC explored delivering an entire time series of data products as KML. From NASA’s Moderate Resolution Imaging Spectroradiometer (MODIS) snow and ice product suite distributed by NSIDC, the monthly snow cover was selected as the first MODIS data set to experiment with, primarily because it is the simplest of all the MODIS products (Hall et al., 2006). Also, because MODIS is an optical instrument, cloud cover affects the data. Due to clouds, substantial data gaps exist in the daily and 8-day products. The monthly product has the fewest pixels with missing data due to cloud contamination, making it the best product for visualizing snow cover. The MODIS team at NSIDC initially created grayscale PNG images showing fractional snow cover using a 0–100% white transparency gradient and then overlaid the PNG images on Google Earth via KML. The transparency gradient allows various amounts of the underlying imagery to show through based on the amount of snow detected. KML files for the years from 2000 to 2008 were created. Each yearly file references the individual monthly products for the year. For example, Fig. 2 shows a MODIS image from April of 2001. If a user downloads and selects all eight KML files, Google Earth automatically configures the timeline so that the entire time series can be animated.
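The yearly files described above can be sketched as follows. This is a minimal illustration rather than NSIDC's actual code: it assumes one PNG per month with a hypothetical file-name pattern (`snow_YYYY_MM.png`) and a global latitude/longitude extent. The elements used (`GroundOverlay`, `Icon`, `LatLonBox`, `TimeSpan`) are standard KML, and the `TimeSpan` on each overlay is what lets a Virtual Globe place it on a timeline.

```python
import calendar
from xml.sax.saxutils import escape

KML_HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
    "<Document>\n"
)
KML_FOOTER = "</Document>\n</kml>\n"


def monthly_overlay(year, month, href):
    """Return one GroundOverlay whose TimeSpan covers a calendar month."""
    last_day = calendar.monthrange(year, month)[1]
    stamp = f"{year:04d}-{month:02d}"
    return (
        "<GroundOverlay>\n"
        f"  <name>Snow cover {stamp}</name>\n"
        f"  <TimeSpan><begin>{stamp}-01</begin>"
        f"<end>{stamp}-{last_day:02d}</end></TimeSpan>\n"
        f"  <Icon><href>{escape(href)}</href></Icon>\n"
        # Assumption: the image covers the whole globe in lat/lon.
        "  <LatLonBox><north>90</north><south>-90</south>"
        "<east>180</east><west>-180</west></LatLonBox>\n"
        "</GroundOverlay>\n"
    )


def yearly_kml(year, href_pattern="snow_{year:04d}_{month:02d}.png"):
    """Return a KML document referencing twelve monthly overlays.

    href_pattern is a hypothetical naming scheme for the PNG images.
    """
    overlays = "".join(
        monthly_overlay(year, month,
                        href_pattern.format(year=year, month=month))
        for month in range(1, 13)
    )
    return KML_HEADER + overlays + KML_FOOTER


if __name__ == "__main__":
    print(yearly_kml(2001)[:400])
```

Because each overlay carries its own time span, loading several yearly files together gives the viewer a continuous sequence of months to step through.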
In addition, by selecting and deselecting individual months within each year, the user can quickly create animations for specific months. For example, a user could create an animation of wintertime snow over the Rocky Mountains in April for the last eight years.

Fig. 2. MODIS image overlay displayed in Google Earth. Fractional snow cover over North America for April 2001 from MODIS/Terra Snow Cover Monthly L3 Global .05deg CMG product (MOD10CM).

2.4. Communicating science with Virtual Globes

KML files can serve as an effective outreach tool. The number of sources for KML files is growing and includes web pages such as NSIDC’s Virtual Globes web site (http://nsidc.org/data/virtual_globes/). As an example, it is interesting to learn that Carroll Glacier in Alaska thinned by more than 50 m and retreated between 1906 and 2004 (Molnia, 2007). Seeing the side-by-side photo comparisons of the glacier in Google Earth makes the retreat of Carroll Glacier even more compelling and puts that change into its regional context (Fig. 3). Scientists, researchers and KML developers have used KML files to help explain scientific phenomena and to convey scientific data and results. The ubiquity of Google Earth, with over 500 million unique downloads of the application (footnote 4), makes it a familiar platform that has great appeal due to the powerful visualization capabilities the software has to offer.

Fig. 3. Repeat photographs of Carroll Glacier in Google Earth. Carroll Glacier, taken in 1906 by Charles W. Wright and in 2004 by Bruce F. Molnia. This image does not replicate exact positions from the original photographs, but it shows the glacier terminus and provides spatial context. Source: NSIDC/WDC for Glaciology (2009).
The ability to communicate using Virtual Globes, whether through a captured image in a publication, an animation in a presentation, or an online tour, has become a powerful and efficient way to enhance science. Scientists publish their results in refereed journals and through other channels, and some authors are choosing to use Google Earth figures in their publications. For example, Serreze et al. (2007) used a Google Earth figure to illustrate how the September 2005 Arctic sea ice extent compares to the median ice extents in March and September from 1979 to 2000. Scientists are giving talks at educational workshops (footnote 5) and educators are publishing papers describing how Virtual Globes can benefit education (Schultz et al., 2008). Forums are beginning to emerge that allow KML developers working in scientific fields to present their work to a larger audience. One such forum is the American Geophysical Union Virtual Globes sessions that take place annually. Presentations at this conference and at other related Virtual Globes workshops cover a wide variety of topics, including how NASA World Wind has modified its mission based on user needs and requirements (Hogan et al., 2008), how SketchUp and Google Earth can work together for scientific applications (footnote 6), and a critique of the way in which maps are designed (footnote 7). Whether the focus of the presentation is on the science or the technology, these presentations provide insightful ways to communicate science in a manner that is visually energizing. While KML files are beneficial for the visual portrayal they enable, they also provide a way for scientists to communicate their results at scientific and outreach/educational conferences and in their published works.

3. A review of KML creation methods

It is clear that Virtual Globes can enhance science.
A valuable resource for those learning KML, and even for KML developers wanting to enrich their skills at a more intricate level, is The KML Handbook (Wernecke, 2008). This book guides the reader through various KML commands and also provides some scientific examples (e.g. models of volcanic ash cloud dispersion). In addition, open source and proprietary solutions exist for converting spatial data sets into KML. NSIDC has relied mainly on writing custom code and utilizing open source software, but we have experimented with proprietary software as well. When evaluating methods to create a KML file, a developer must be familiar with the data set attributes, the data volume, the intended audience, and any styling requirements before deciding on the most suitable KML creation method. The term “styling” is used here to refer to any coloring and classification used to visually distinguish different features in the KML.

3.1. Proprietary software

Proprietary software provides a common way to convert data into KML. Such packages typically provide a graphical user interface (GUI) to assist with the conversion. An example of a GUI-based tool is the ArcGIS plug-in Arc2Earth (http://www.arc2earth.com/). This is a widely used tool that permits the direct conversion of a spatial data file into a functional KML file. With ArcGIS v9.3, map layers created in ArcMap, ArcGlobe or ArcScene can be used to prepare and deliver information using KML. Geoprocessing tools are provided to convert individual map layers or entire maps into KML. This allows users to apply the advanced styling capabilities of ArcGIS and produce KML that directly reflects those styles. This type of tool is very useful for data sets that require many detailed styles or extensive data classification.
However, such tools present significant hurdles to generating useful KML for large data sets, as they often lack the ability to build KML files that utilize the full KML specification, such as network links (Wernecke, 2008), and they do not support the ability to generate KML on-the-fly. As a result, these tools are well suited to fast KML generation where a small number of files are involved, or to products that require human interaction to create, but are less of an option for large spatial data sets that often involve complex file structures in their design.

3.2. Custom code

Another option is to write custom code; however, this requires the ability to program and can be more labor intensive to implement. There are many examples of custom code generated KML available from NSIDC. One such example is the KML file containing the average location of the sea ice edge in the Nordic Seas between 1967 and 2002 (Divine and Dick, 2007). Writing a script to generate KML involves manually writing the opening section, writing a KML block for each data element, and then writing a closing section. By writing custom code, developers have the flexibility to tailor the KML file and its styling as they deem fit, with complete control over the functionality and aesthetics. If other similar data sets exist it may be optimal to re-use the custom code. The level of effort required to write custom scripts ranges from very low to very high, depending on the nature of the script. For example, a GDAL utility program can be used to convert a large number of shapefiles to KML format by putting that command in a simple loop in a four-line script in a language such as Bash (on UNIX/Linux) or a batch file (Windows). One only has to know the basic syntax of a scripting language to create a loop. On the other hand, a script to generate a system of KML files with full control over the styling can require more programming experience in order to design a system of data structures and subroutines properly.
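The opening section, per-element block, and closing section pattern described above can be sketched in a few lines. This is an illustrative sketch only, not NSIDC's script: the category names, colors, and function names are hypothetical, chosen to echo the color-coded glacier outlines of Section 2.1. KML colors are written in aabbggrr hexadecimal order.

```python
from xml.sax.saxutils import escape

# Hypothetical categories and colours (KML uses aabbggrr hex order).
CATEGORY_COLORS = {
    "glacier_boundary": "ff0000ff",  # opaque red
    "rock_boundary": "ff00ff00",     # opaque green
    "snow_line": "ffff0000",         # opaque blue
}


def opening(title):
    """Opening section: XML header, Document, and one Style per category."""
    styles = "".join(
        f'<Style id="{category}"><LineStyle><color>{color}</color>'
        "<width>2</width></LineStyle></Style>\n"
        for category, color in CATEGORY_COLORS.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        f"<Document>\n<name>{escape(title)}</name>\n" + styles
    )


def placemark(name, category, coords):
    """One KML block per data element: a styled outline."""
    if category not in CATEGORY_COLORS:
        raise ValueError(f"unknown category: {category}")
    # KML coordinates are lon,lat,alt triples separated by whitespace.
    coord_text = " ".join(f"{lon},{lat},0" for lon, lat in coords)
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        f"<styleUrl>#{category}</styleUrl>"
        "<LineString><tessellate>1</tessellate>"
        f"<coordinates>{coord_text}</coordinates></LineString>"
        "</Placemark>\n"
    )


def closing():
    """Closing section."""
    return "</Document>\n</kml>\n"


def build_kml(title, outlines):
    """outlines: iterable of (name, category, [(lon, lat), ...])."""
    body = "".join(placemark(*outline) for outline in outlines)
    return opening(title) + body + closing()
```

Keeping the styles in one shared table means a change of color scheme touches a single dictionary, which is part of the control over styling that custom code offers.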
In this case, the script itself would generate the KML, rather than relying on a utility program to do so. In short, some level of scripting is accessible to nearly anyone working with KML files, and the more one knows about KML and a particular programming language, the more control one can have over the KML generated.

3.3. Open source software

There is a vibrant and active community producing a wealth of open source geospatial software. These packages range from low-level libraries for working with map projections or format conversions to fully-functional desktop GIS software. The open source software that can be used for producing KML files is flexible and can be incorporated into a variety of computing processes. Open source software allows flexibility in the terms of use. The downside is that the learning curve can sometimes be steep, though no steeper than that of a proprietary software package; the primary difference is that many developers have more experience with proprietary packages. NSIDC has experience with the Regionator, GeoServer and GDAL tools, all open source solutions.

Footnote 4: http://en.oreilly.com/where2010
Footnote 5: http://serc.carleton.edu/files/NAGTWorkshops/tools08/meier_presentation.v2.ppt
Footnote 6: http://www.youtube.com/watch?v=J64kSF6JVuM
Footnote 7: http://www.youtube.com/watch?v=levgAXgxYw0

3.3.1. Regionator overview

When dealing with large data sets, it may be necessary to partition the data into “regions” in order to avoid the initial burden of loading a very large file. A “region” is a subset of the original data that either covers a small fraction of the area covered by the original data set or covers a substantial fraction of that area at lower resolution than the original data (Wernecke, 2008). In either case, it is only necessary to load those “regions” that match the user’s current view, thereby minimizing the size of the data files that need to be loaded.
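In KML terms, this behavior comes from a `Region` element (a `LatLonAltBox` plus a level-of-detail `Lod`) attached to a `NetworkLink`, so that the linked file is fetched only once the region occupies enough of the screen. The sketch below is a minimal illustration under stated assumptions: the function names, tile file names, the 128-pixel threshold, and the bounding box (a 5 by 5 degree box loosely centered on Anchorage) are all hypothetical.

```python
from xml.sax.saxutils import escape


def region_network_link(name, href, north, south, east, west,
                        min_lod_pixels=128, max_lod_pixels=-1):
    """NetworkLink that is followed only while its Region is active.

    min_lod_pixels is the on-screen size at which the region becomes
    active; a max_lod_pixels of -1 means there is no upper limit.
    """
    return (
        "<NetworkLink>\n"
        f"  <name>{escape(name)}</name>\n"
        "  <Region>\n"
        "    <LatLonAltBox>"
        f"<north>{north}</north><south>{south}</south>"
        f"<east>{east}</east><west>{west}</west>"
        "</LatLonAltBox>\n"
        f"    <Lod><minLodPixels>{min_lod_pixels}</minLodPixels>"
        f"<maxLodPixels>{max_lod_pixels}</maxLodPixels></Lod>\n"
        "  </Region>\n"
        f"  <Link><href>{escape(href)}</href>"
        "<viewRefreshMode>onRegion</viewRefreshMode></Link>\n"
        "</NetworkLink>\n"
    )


def tile_links(north, south, east, west, rows, cols):
    """Split a bounding box into rows x cols tiles, one link per tile."""
    dlat = (north - south) / rows
    dlon = (east - west) / cols
    links = []
    for r in range(rows):
        for c in range(cols):
            tile_north = north - r * dlat
            tile_west = west + c * dlon
            links.append(region_network_link(
                f"tile_{r}_{c}", f"tiles/tile_{r}_{c}.kml",
                tile_north, tile_north - dlat,
                tile_west + dlon, tile_west))
    return links


if __name__ == "__main__":
    # Hypothetical 5 x 5 degree box split into 1-degree tiles.
    links = tile_links(63.5, 58.5, -147.5, -152.5, rows=5, cols=5)
    print(len(links), "links")
    print(links[0])
```

The links would be placed inside a parent `Document`; each tile file can itself contain further region-gated links, which is how a multi-resolution hierarchy is built.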
We present two approaches to this partitioning: the first creates regions using a custom script and the second uses the Regionator open source software.

3.3.1.1. Custom Regionator

The GLIMS initiative makes extensive use of imagery from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) instrument for glacier mapping. To help the GLIMS community find suitable imagery, NSIDC serves metadata and browse imagery for approximately 200,000 ASTER images of glacierized terrain. This database of ASTER metadata is updated daily. We have implemented a Virtual Globe interface to this database of ASTER metadata.

Fig. 4. ASTER footprints in a KML file structure. A hierarchical structure of KML files used to provide a Google Earth interface to hundreds of thousands of ASTER browse images. Structure includes separation of data by time and region.

The Google Earth interface is implemented through a hierarchy of KML files that are linked to each other in a tree-like structure using network links. The top-level KML file links to a set of files, each of which contains data for only one year’s worth of imagery. Each year file is in turn linked to a set of files that divide the imagery into approximately 300 regions of glacierized terrain. Finally, each of these region files contains a point (KML placemark), image footprint (KML polygon), and link to the browse image for each image in this region, for this year. This is illustrated in Fig. 4. Several hurdles had to be overcome in developing this KML structure. The initial design used too few levels of network links, which led to too little regionation: the smallest pieces to be loaded at one time were still too large. This proved to be slow and ineffective due to the large size of the annual files, each of which contained all regions and was hundreds of megabytes. The solution to this problem was the addition of one more layer of links, and the global set of data was divided into smaller regions.
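The three-level tree described above (top-level file, one file per year, one file per region and year) can be sketched as a function that returns the file layout. This is a hedged illustration in Python rather than the Perl program the authors used; the file names, directory layout, and function names are hypothetical, and the region files are left empty where the placemarks, footprint polygons, and browse-image links would go.

```python
from xml.sax.saxutils import escape


def network_link(name, href):
    """A plain NetworkLink; href is resolved relative to the parent file."""
    return (
        f"<NetworkLink><name>{escape(name)}</name>"
        f"<Link><href>{escape(href)}</href></Link></NetworkLink>\n"
    )


def document(name, body):
    """Wrap a body of KML features in a complete KML document."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        f"<Document><name>{escape(name)}</name>\n"
        f"{body}</Document>\n</kml>\n"
    )


def build_hierarchy(years, region_ids):
    """Return {relative_path: kml_text} for a top/year/region tree."""
    files = {
        "index.kml": document(
            "ASTER browse imagery",
            "".join(network_link(str(y), f"{y}/index.kml") for y in years),
        )
    }
    for year in years:
        files[f"{year}/index.kml"] = document(
            str(year),
            "".join(network_link(rid, f"{rid}.kml") for rid in region_ids),
        )
        for rid in region_ids:
            # Placemarks, footprint polygons and browse-image links for
            # this region and year would be written here.
            files[f"{year}/{rid}.kml"] = document(f"{rid} {year}", "")
    return files


if __name__ == "__main__":
    tree = build_hierarchy([2000, 2001], ["region_001", "region_002"])
    for path in sorted(tree):
        print(path)
```

With this layout, opening the top-level file loads only a short list of year links; the large per-region content is fetched only when a user drills down.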
These regions are spatial groupings of data that are activated and deactivated automatically within KML as they go in and out of view. The use of regions radically improved the performance of the whole KML system. The final result was a dynamically linked series of hierarchical KML files that rendered faster than the original design. The KML files are generated automatically using a custom program written in Perl. It queries the database year-by-year, extracting the image data by region by intersecting the image footprints with the region boundaries using functions built into PostGIS. In the interface, the footprint center point can be clicked on in Google Earth to display the image metadata (e.g. granule IDs, timestamps, and links to download the imagery). The bounding box represents the spatial extent of each footprint, and is used to show where images overlap. The browse imagery, overlaid on the Virtual Globe, permits users to quickly detect the quality of each image (e.g. the amount of cloud cover within the image) (Fig. 5). The timestamps in the ASTER image KML shown in Fig. 5 activate the timeline tool within Google Earth.

Fig. 5. ASTER browse imagery over glaciated regions in Google Earth. ASTER browse imagery and their rectangular outlines (red boxes) are draped over terrain. GLIMS boundaries for several Argentine glaciers, including Torre and Blanco glaciers, are shown in red. Rock outcrops within glacier boundaries (nunataks) are outlined in green. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Each KML file makes up a single region within the overall KML file structure. Separating the data into different files not only increases the speed with which the data are rendered but also allows users to selectively turn each data type on and off.
The scripting process used to generate all of the data files allows for the customization of the styles, links, and regions used to make this data set a usable and effective interface to the large inventory of footprints.

3.3.1.2. Open source Regionator software

One problem with the initial version of the KML for the MODIS snow cover product was the large amount of time it took to load a single 7200 × 3600 pixel, 2.1 MB image. Download and display of a single year of data in Google Earth may take several tens of seconds on a high-speed network and many minutes or hours for dial-up or other low-data-rate users. As a result, many users found it practically impossible to download and display animations of the full 8-year time series. The open source Regionator software (footnote 8), developed by Google, helped address this problem. The Regionator creates smaller images termed “tiles” at different resolutions from an original image. These images are then linked to different child KML files. Depending on how much of the globe is viewed, only the much smaller KML files appropriate for that region are loaded. If the whole globe is in view, only a single reduced-resolution image is displayed. Thus, the initial burden of loading very large image files in Google Earth is substantially reduced. New KML files were generated using the Regionator tool, and a substantial reduction in download times resulted. Using the command-line Regionator software is reasonably straightforward. Users must have sufficient access to a target host to download and install the software, which has dependencies on other software packages, namely Python, Numeric, and GDAL.
Once the Regionator is successfully installed, running it is a simple matter of identifying which image in a KML file to regionate, selecting the size of the image tiles to be created, identifying a directory in which to place the resulting tiled child images and associated KML files, and running the command:

    KMLsuperoverlay.py -i <image_name> -k <old_KML_name> -r <new_KML_name> -d <folder_name> -t <tile_size>

Selecting a tile size is perhaps the most challenging part of this process. Google recommends that tiles evenly divide the image. For the MODIS monthly snow product (MOD10CM), a tile size of 225 evenly divides a MODIS image in both directions and produces 173 sub-images in a hierarchy at five different resolutions. Our initial impression, that the amount of disk space used per original image would increase substantially as the tile size decreased, was wrong: the total file size varied by at most 20% for tile sizes ranging from 1024 down to 225, while the total number of images grew from 13 to 173 files and the size of each file decreased from about 2 MB to about 0.1 MB. Given this, it seems logical to use a tile size near the recommended 256 pixels. Lastly, the new KML file simply contains the network link information needed to call the proper images, so this code needs to be merged back into the original KML file if that file contains other information that needs to be preserved.

3.3.2. GDAL (ogr2ogr)

In addition to the Regionator tools already discussed, there are other open source software tools that can read and write KML. OGR (not an acronym) is a library and set of command-line utilities for handling approximately 35 different vector data formats. It can be used to convert directly between any of those formats, including shapefiles, KML, Generic Mapping Tools, GeoJSON, Geography Markup Language, and geospatial databases.
As an example, the following command converts a shapefile to KML:

    ogr2ogr -f KML output.kml input.shp -dsco AltitudeMode=absolute

The geometry and attribute information in the input.shp file will be written to the file output.kml. There are options for customizing the name and description of features, as well as the altitude mode (as in the example), which affects how the features are displayed relative to the ground. The OGR set of utilities is particularly useful for two types of applications: scripted tasks for processing many files at once, and web applications for providing data in a choice of formats. Through the GLIMS Glacier Database, users can download glacier outline data in a choice of formats, including KML. The program ogr2ogr is used to generate KML (or other formats) from the data selected from a PostGIS geospatial database. At present, options for styling the resulting KML (i.e. color for lines or areas) are limited with the ogr2ogr program. One must edit the resulting KML file manually or with a script to add or change the styling, which may be burdensome when dealing with large data sets and/or when many complex styles are needed. Also, text-type attribute fields cannot exceed 254 characters in length.

3.3.3. GeoServer

For those who prefer not to write a custom program or use proprietary software, there is an alternative solution. In a Virtual Globes context, GeoServer (http://geoserver.org/) is an innovative solution, as it is an open source server that automatically generates KML files. A user can supply GeoServer with input data in a variety of formats such as PostGIS, and GeoServer renders a KML file automatically without any effort on the user’s part. This approach seemed suitable for the World Glacier Inventory (WGI) data set (National Snow and Ice Data Center, 2009). There are approximately 100,000 data records in the inventory. A static KML file would have been thousands of lines in length.
If new data were added to the data set, one would have to manually update the static KML file. GeoServer communicates with PostGIS dynamically; hence, when new WGI data are added to PostGIS, the KML reflects them immediately, with no developer intervention. While GeoServer is a user-friendly option for non-programmers, it may require resources from a systems administrator for routine installation and maintenance. The GeoServer graphical user interface is very intuitive and the online guides are well written. To better style the KML file, users have the option of creating styled layer descriptors (SLD) and KML templates. For the WGI data, a KML template was deployed to allow a view of each glacier metadata record when a specific placemark is clicked. The benefits of using GeoServer include a minimal learning curve, the simple user interface, and the dynamic approach to updating data. While the SLD and KML templates assist with styling, there are other aspects of the KML file that the developer cannot edit. For example, a developer does not have direct access to the KML generation code and therefore cannot make all styling changes.

3.3.4. Other open source tools
Custom programs that use open source KML libraries provide an effective means of addressing the limitations of many conversion tools. Employing custom scripts to generate KML enables total control over the KML, opening up the full capabilities of the KML standard. PyKML [9] and KeyTree [10] are two of the libraries that are available for this purpose. These libraries can be invoked from a script and used to create KML files easily. They contain functions that read in specific information, such as data coordinates, and output the data as KML placemark strings. Collections of generated placemarks can then be written to an output KML file. This allows the programmer to customize the KML and use various optimization techniques for large data. Adding to the flexibility of this method is the ability of scripting languages to interface with a wide range of GIS formats as well as to connect to various databases. This makes it possible to perform everything from simple format conversions to the development of KML files based on large, spatially dispersed data. Other open source tools that can create KML include “libKML” [11], Google’s C++ library (Java and Python bindings available) for reading, writing, and operating on KML; the Perl module “Geo::KML” [12]; and other software such as the Geographic Resources Analysis Support System (GRASS) (http://grass.osgeo.org/).

[8] http://code.google.com/p/regionator/.
[9] http://sourceforge.net/projects/pykml/.
[10] http://pypi.python.org/pypi/keytree/.

Table 1
Resource for KML Developers in Scientific Community. This table lists criteria for data (e.g. raster) and for KML development (e.g. styling) and offers a list of open source and proprietary software solutions dependent on scientific data set structure. Authors developed this table based on their experience working with KML and with scientific data. X symbols mark optimal software solutions based on the criteria.
Open source software: GDAL/OGR; GRASS; GeoServer; Regionator; Python (PyKML) or Perl (Geo::KML).
Proprietary software: Arc2Earth; ArcGIS 9.3.
Data criteria: Vector (Large, Small); Raster (Large, Small).
Styling, 1–3 (1 = min, 3 = max styling): GDAL/OGR 1; GRASS 1; GeoServer 2; Regionator 1; Python or Perl 3; Arc2Earth 3; ArcGIS 9.3 3.
Scripting, Required (Y/N) / Allowed (Y/N): GDAL/OGR N/Y; GRASS N/Y; GeoServer N/Y; Regionator N/Y; Python or Perl Y/Y; Arc2Earth N/N; ArcGIS 9.3 N/Y.

4. Introductory guide for KML developers in the scientific community
Based on NSIDC’s experience working with custom scripts, proprietary software and open source software, we introduce a guide (Table 1) for KML developers working in the scientific community.
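The custom-script approach described in Section 3.3.4 (read coordinates, emit placemarks, write the collection to a file) can be illustrated with a minimal Python sketch. It deliberately uses only the standard library module xml.etree.ElementTree rather than PyKML or KeyTree, so that no library-specific API is assumed; the function name build_kml and the sample records are hypothetical and are not taken from any NSIDC data set.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(records):
    """Return a KML document string with one Placemark per record.

    `records` is an iterable of (name, longitude, latitude) tuples.
    This is an illustrative helper, not part of any KML library.
    """
    # Serialize the KML namespace as the default (unprefixed) namespace.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lon, lat in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinate order is longitude,latitude[,altitude].
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical sample records (placeholder names and coordinates).
    sample = [("Example site A", -105.25, 40.01), ("Example site B", 10.5, 46.8)]
    with open("output.kml", "w", encoding="utf-8") as handle:
        handle.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        handle.write(build_kml(sample))
```

In practice the records would come from a file reader or a database query, and a dedicated KML library would likely offer more convenience than this hand-built element tree.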
Table 1 groups scientific data sets into two main categories: raster and vector data sets. The table also includes two additional criteria that KML developers should consider: the level of support for styling a KML file and whether or not a solution requires or allows scripting. The table contains a list of open source and proprietary software solutions, with our optimal solutions marked with the X symbol. This guide is based on the KML experience of the authors and can be used as a starting point for KML developers. The purpose of the guide is to provide a set of options for KML developers who are working with a scientific data set. For example, if you are working with a small vector data set and do not want to write a custom script, then GeoServer may be a viable option. However, if you want to create a heavily styled custom KML file and are working with a small raster data set, then utilizing Python or Perl libraries may be an optimal solution. This list of solutions is not exhaustive. There are many ways to create KML files and this guide can serve as a reference for those in the planning stages of KML development.

[11] http://code.google.com/p/libkml/.
[12] http://cpan.uwinnipeg.ca/htdocs/Geo-KML/Geo/KML.html.

5. Conclusions
NSIDC has used Virtual Globes to enhance science in a variety of ways: by assisting with quality control of glacier data ingested into a database, supporting users with browsing and viewing complex data products, facilitating visualization of long time series of data, and communicating science to the general public in an easily digested and very visceral way. Data sets vary dramatically in format and size. The tools appropriate for generating KML for these data sets also vary, not just based on the characteristics of the data but also on the needs and skills of the developer. Developing custom scripts using open source software is the most common approach used by the authors.
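One small instance of such a custom script is a post-processing step that adds styling to KML produced by a converter such as ogr2ogr, whose styling options the text describes as limited. The sketch below is an assumption-laden illustration, not the authors' code: it uses only the Python standard library, the function name add_line_style is hypothetical, and it assumes the input has a Document element directly under the kml root. It adds one shared LineStyle and points every Placemark at it; any inline Style elements already inside a Placemark are left untouched.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Feature children that the KML 2.2 schema orders before styleUrl/Style.
# (atom: and xal: elements are ignored in this simplified sketch.)
_HEAD = {"name", "visibility", "open", "address", "phoneNumber", "Snippet",
         "description", "LookAt", "Camera", "TimeStamp", "TimeSpan"}


def _q(tag):
    """Qualify a local tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"


def _head_len(elem):
    """Count the leading children that should precede style information."""
    count = 0
    for child in elem:
        if child.tag.rsplit("}", 1)[-1] not in _HEAD:
            break
        count += 1
    return count


def add_line_style(kml_text, style_id="outline", color="ff0000ff", width=2):
    """Return kml_text with a shared LineStyle applied to all Placemarks.

    `color` uses KML's aabbggrr hexadecimal order, so "ff0000ff" is opaque red.
    """
    ET.register_namespace("", KML_NS)
    root = ET.fromstring(kml_text)
    doc = root.find(_q("Document"))
    if doc is None:
        raise ValueError("expected a <Document> element under <kml>")

    # Build <Style id="..."><LineStyle><color/><width/></LineStyle></Style>.
    style = ET.Element(_q("Style"), {"id": style_id})
    line = ET.SubElement(style, _q("LineStyle"))
    ET.SubElement(line, _q("color")).text = color
    ET.SubElement(line, _q("width")).text = str(width)

    # Point every Placemark at the shared style, reusing an existing styleUrl.
    for placemark in root.iter(_q("Placemark")):
        url = placemark.find(_q("styleUrl"))
        if url is None:
            url = ET.Element(_q("styleUrl"))
            placemark.insert(_head_len(placemark), url)
        url.text = f"#{style_id}"

    doc.insert(_head_len(doc), style)
    return ET.tostring(root, encoding="unicode")
```

Filled polygons would need a PolyStyle as well; the same pattern extends to that case.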
The optimal solution for each project depends on the source data, the skills of the developer, and the time permitted to work on various projects. We provide a guide for KML developers based on our experience with multiple data sets and tools. A variety of KML creation methods were described. GeoServer works well for non-programmers dealing with vector data. Even if developers have programming experience, GeoServer offers a variety of options. The Regionator is useful for handling the display of large data sets and can be used within custom scripts to automate data handling. GDAL is often used within custom scripts, and custom scripts help with styling issues. A good way to begin writing KML is to examine an existing KML file. Alternatively, the Google KML Interactive Sampler [13] allows users to easily see how changes in KML affect the display on the Earth.

[13] http://kml-samples.googlecode.com/svn/trunk/interactive/index.html.

Acknowledgments
We acknowledge Ross Swick for developing the GLAS KML files and the sea ice animations. The PostGIS support for the World Glacier Inventory was provided by I-Pin Wang. The GLIMS ASTER metadata system relies on Web services from the EOS Clearing House (ECHO) system. The projects discussed in this paper were funded by NOAA’s National Geophysical Data Center (cooperative agreement: NA17RJ1229) and by NASA (award numbers: NNG08HZ07C (Snow and Ice Distributed Active Archive Center) and NNG04GF51A (GLIMS)).

References
Armstrong, R., Raup, B., Khalsa, S.J.S., Barry, R., Kargel, J., Helm, C., Kieffer, H., 2005. GLIMS glacier database. National Snow and Ice Data Center. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/nsidc-0272.html> (accessed 28 June, 2010).
Butler, D., 2006. The web-wide world. Nature 439, 776–778.
Divine, D.V., Dick, C., 2007. March through August ice edge positions in the Nordic Seas, 1750–2002. National Snow and Ice Data Center. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/g02169.html> (accessed 28 June, 2010).
Goodchild, M.F., 2008. The use cases of digital earth. International Journal of Digital Earth 1 (1), 31–42. doi:10.1080/17538940701782528.
Hall, D.K., Riggs, G.A., Salomonson, V.V., 2006. Updated monthly. MODIS/Terra snow cover monthly L3 global 0.05deg CMG V005, February 2000–December 2008. National Snow and Ice Data Center. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/mod10cmv5.html> (accessed 28 June, 2010).
Harding, D.J., Carabajal, C.C., 2005. ICESat waveform measurements of within-footprint topographic relief and vegetation vertical structure. Geophysical Research Letters 32 (L21S10), 1–4. doi:10.1029/2005GL023471.
Hogan, P., Gaskins, T., Bailey, J.E., 2008. NASA World Wind: A New Mission. Eos Transactions American Geophysical Union 89 (53), Fall Meeting Supplement, Abstract IN43B-06.
Molnia, B.F., 2007. Late nineteenth to early twenty-first century behavior of Alaskan glaciers as indicators of changing regional climate. Global and Planetary Change 56 (1–2), 23–56. doi:10.1016/j.gloplacha.2006.07.011.
National Research Council, 2006. Learning to Think Spatially: GIS as a Support System in the K-12 Curriculum. National Academies Press, Washington, DC, 314 pp.
National Snow and Ice Data Center, 2009. World glacier inventory. World Glacier Monitoring Service and National Snow and Ice Data Center/World Data Center for Glaciology. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/g01130.html> (accessed 28 June, 2010).
NSIDC/WDC for Glaciology, Boulder, compiler, 2009. Glacier photograph collection. National Snow and Ice Data Center/World Data Center for Glaciology. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/g00472.html> (accessed 28 June, 2010).
O’Brien Jr., W.P., 2009. Measuring magnetic declination with a compass, virtual globes and a global positioning system. International Journal of Digital Earth 2 (1), 31–43. doi:10.1080/17538940802585515.
Paterson, W.S.B., 1994. The Physics of Glaciers, 3rd edn. Pergamon Press, Tarrytown, NY, 480 pp.
Raup, B.H., Racoviteanu, A., Khalsa, S.J.S., Helm, C., Armstrong, R., Arnaud, Y., 2007. The GLIMS geospatial glacier database: a new tool for studying glacier change. Global and Planetary Change 56 (1–2), 101–110. doi:10.1016/j.gloplacha.2006.07.018.
Schultz, R.B., Kerski, J.J., Patterson, T.C., 2008. The use of virtual globes as a spatial teaching tool with suggestions for metadata standards. Journal of Geography 107 (1), 27–34. doi:10.1080/00221340802049844.
Serreze, M.C., Holland, M.M., Stroeve, J., 2007. Perspectives on the Arctic’s shrinking sea-ice cover. Science 315 (5818), 1533–1536. doi:10.1126/science.1139426.
Wernecke, J., 2008. The KML Handbook: Geographic Visualization for the Web, 1st edn. Addison-Wesley Professional, San Francisco, CA, 368 pp.
Wilson, T. (Ed.), 2008. OGC KML. OGC 07-147r2. Open Geospatial Consortium, Inc., 251 pp. <http://portal.opengeospatial.org/files/index.php?artifact_id=27810> (accessed 28 June, 2010).
Zwally, H.J., Schutz, R., Bentley, C., Bufton, J., Herring, T., Minster, J., Spinhirne, J., Thomas, R., 2009. GLAS/ICESat L1B global elevation data v028, 20 February 2003 to 21 March 2008. National Snow and Ice Data Center. Boulder, Colorado, USA. Digital media <http://nsidc.org/data/gla06.html> (accessed 28 June, 2010).