Cloud-based raster processors out there
Hi all,
Just trying to get my head around some of the new big raster processors out there, in addition of course to Google Earth Engine. Bear with me while I sort through these. Thanks to raster sleuth Stefania Di Tomasso for helping.
1. GeoTrellis (https://geotrellis.io/)
GeoTrellis is a Scala-based raster processing engine and one of the first geospatial libraries built on Spark. It is able to process big datasets. For regional or statewide datasets, users can interact with geospatial data and see results in real time in an interactive web application. For larger raster datasets (e.g. US NED), GeoTrellis performs fast batch processing, using Akka clustering to distribute data across the cluster. GeoTrellis was designed to solve three core problems, with a focus on raster processing:
- Creating scalable, high performance geoprocessing web services;
- Creating distributed geoprocessing services that can act on large data sets; and
- Parallelizing geoprocessing operations to take full advantage of multi-core architecture.
Features:
- GeoTrellis is designed to help a developer create simple, standard REST services that return the results of geoprocessing models.
- GeoTrellis will automatically parallelize and optimize your geoprocessing models where possible.
- In the spirit of the object-functional style of Scala, it is easy to both create new operations and compose new operations with existing operations.
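To make the ideas of composing operations and parallelizing them a little more concrete, here is a minimal sketch in plain Python. It is not GeoTrellis code and does not use its API: the helper names (`local_op`, `compose`, `split_rows`, `run_parallel`) and the tiny elevation grid are all made up for illustration, and a thread pool stands in for the cluster workers a real engine would use.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# A "tile" here is just a small 2D grid of cell values (a list of rows).

def local_op(fn):
    """Lift a per-cell function into an operation on a whole tile."""
    def op(tile):
        return [[fn(v) for v in row] for row in tile]
    return op

def compose(*ops):
    """Chain tile operations left to right into a single operation."""
    def composed(tile):
        return reduce(lambda acc, op: op(acc), ops, tile)
    return composed

def split_rows(raster, rows_per_tile):
    """Cut a raster into horizontal strips so each can be processed alone."""
    return [raster[i:i + rows_per_tile]
            for i in range(0, len(raster), rows_per_tile)]

def run_parallel(op, raster, rows_per_tile=2, workers=4):
    """Apply op to every strip concurrently, then stitch the strips back."""
    tiles = split_rows(raster, rows_per_tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(op, tiles))  # map preserves tile order
    return [row for tile in results for row in tile]

# Hypothetical model: convert feet to metres, then flag cells above 500 m.
feet_to_metres = local_op(lambda v: v * 0.3048)
above_500m = local_op(lambda v: 1 if v > 500 else 0)
model = compose(feet_to_metres, above_500m)

# Made-up elevation values in feet, purely for illustration.
elevation_ft = [
    [1000, 2000, 1500],
    [1700, 1600, 3000],
    [500, 1640, 1641],
    [0, 5000, 100],
]
mask = run_parallel(model, elevation_ft)
print(mask)
```

Because each operation only looks at one cell at a time, the strips can be processed independently and in any order, which is what makes this kind of model easy to spread over many workers.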
2. GeoPySpark - in short, GeoTrellis for the Python community
GeoPySpark provides Python bindings for working with geospatial data on PySpark (PySpark is the Python API for Spark). Spark is an open source processing engine originally developed at UC Berkeley in 2009. GeoPySpark makes GeoTrellis (https://geotrellis.io/) accessible to the Python community: Scala is a difficult language for many users, so the developers created this Python library.
3. RasterFoundry (https://www.rasterfoundry.com/)
They say: "We help you find, combine and analyze earth imagery at any scale, and share it on the web." And "Whether you’re working with data already accessible through our platform or uploading your own, we do the heavy lifting to make processing your imagery go quickly no matter the scale."
Key RasterFoundry workflow:
- Browse public data
- Stitch together imagery
- Ingest your own data
- Build an analysis pipeline
- Edit and iterate quickly
- Integrate with their API
4. GeoNotebook
From the Kitware blog: Kitware has partnered with The NASA Earth Exchange (NEX) to design GeoNotebook, a Jupyter Notebook extension created to solve these problems (i.e. big raster data stacks from imagery). Their shared vision: a flexible, reproducible analysis process that makes data easy to explore with statistical and analytics services, allowing users to focus more on the science by improving their ability to interactively assess data quality at scale at any stage of the processing.
Extending Jupyter Notebooks and Jupyter Hub, this Python analysis environment provides the means to easily perform reproducible geospatial analysis tasks that can be saved at any state and easily shared. As the geospatial datasets come in, they are ingested into the system and converted into tiles for visualization, creating a dynamic map that can be managed from the web UI and can communicate back to a server to perform operations like data subsetting and visualization.
Blog post: https://blog.kitware.com/geonotebook-data-driven-quality-assurance-for-geospatial-data/
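For a feel of what "converted into tiles for visualization" can involve, here is a small sketch of the standard XYZ (slippy-map) tile indexing used by many web maps. This is an assumption on my part: the source does not say which tiling scheme GeoNotebook uses, and the function names below are hypothetical, not part of its API.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles per side at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tiles_for_bbox(west, south, east, north, zoom):
    """List every tile index needed to cover a bounding box at one zoom."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]

# At zoom 1 the world is a 2 x 2 grid of tiles.
print(tiles_for_bbox(-180, -85, 180, 85, 1))
```

A tile server only has to render the handful of tiles that intersect the current map view, which is why a large raster can still be panned and zoomed interactively in the browser.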