
Welcome to the Kellylab blog

geospatial matters



SimplyMap & PolicyMap

Today I went to a great D-Lab workshop on demographic mapping tools.  Berkeley's GIS and Map Librarian, Susan Powell, walked us through several very easy-to-use mapping tools available through UC Berkeley.  Two stood out, SimplyMap and PolicyMap: both are really great for quickly visualizing data from many different sources.

#1: SimplyMap:

Pros: This interface allows easy visualization of census data (back to 1980), crime data, and lifestyle and market data.  SimplyMap is accessible with a UC Berkeley login through the Berkeley Library website.  It lets you export data as shapefiles or image files, has a table-building function, and allows limited data filtering and masking.  The data come with metadata, and most datasets can be visualized down to the census-tract or ZIP-code level.  You can save and share maps from your private account.

Cons: You cannot combine variables or years of data in the map itself, but you can do this in SimplyMap's table building function and export that.  The user interface is not always simple or straightforward. 

Above: Dollar amount spent at restaurants in Berkeley in 2014, by census tract. Map created using SimplyMap.

#2 PolicyMap:

Pros: PolicyMap includes census data (back to 2000), as well as housing, health, government-program, crime, and education data.  Like SimplyMap, PolicyMap allows quick and easy visualization of data for a single year.  It also lets you upload your own data and overlay it on existing datasets, and generally allows more layering of datasets; point data can be added on top of polygons.  You can generate quick, pre-defined reports on specific cities or areas, or define a custom study area.  It has a table builder as well as a really great data dictionary that explains where its data come from.

Cons: There are no private accounts.  All of Berkeley shares a single account, so you can see everyone else's data and they can see yours (you must log in through UC Berkeley's library website to gain access).  This data sharing may not be an absolute con, but it is a little weird.  PolicyMap does not let you export shapefiles, but it does let you build tables that can be easily joined with shapefiles if need be.  It too has some user-interface quirks that could be improved upon.
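That table-to-shapefile join is a one-liner in, say, pandas. Here is a hedged sketch, with made-up tract IDs and column names standing in for a PolicyMap export and a shapefile's attribute table (in practice you would read the shapefile side with a library such as GeoPandas):

```python
import pandas as pd

# Stand-in for the attribute table of a census-tract shapefile
# (in practice: geopandas.read_file("tracts.shp")). GEOIDs are made up.
tracts = pd.DataFrame({
    "GEOID": ["06001422100", "06001422200", "06001422300"],
    "name":  ["Tract 4221", "Tract 4222", "Tract 4223"],
})

# Stand-in for a table exported from PolicyMap, keyed on the same tract ID.
# The variable name and values are illustrative.
exported = pd.DataFrame({
    "GEOID": ["06001422100", "06001422200"],
    "median_income": [61000, 72500],
})

# A left join keeps every tract, even those missing from the export.
joined = tracts.merge(exported, on="GEOID", how="left")
print(joined)
```

With GeoPandas the `tracts` side would be a GeoDataFrame and the merged result could be written straight back out as a shapefile.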


New VTM retakes, this time from Heather

Plus ça change, plus c'est la même chose (the more things change, the more they stay the same). Thanks to Heather Constable, who went out exploring near Morro Bay. Here is one of her retakes.

Date of original photo: Feb 25, 1936, taken in San Luis Obispo County, California, US. Looking north toward Morro Bay. Shows an almost dense stand of Arctostaphylos morroensis in the foreground. Quad name: Cayucos. Quad number: 132B. Reference to map: 1. Photographer: Albert Wieslander.


Big data, Landsat and earth science


California Water Use Map

In response to Gov. Jerry Brown's announcement yesterday calling on all California residents to reduce water use by 25%, the folks at the New York Times put together a nice interactive map showing residential water use in California in gallons per day.

Take a look here!


Mapping the Berkeley Boom: Social Media and Mapping Help Unravel a Mystery

Last night we heard the Berkeley Boom again.  We’ve been hearing this thunderous boom quite frequently in the last month here in Berkeley, but this one sounded bigger than most.  Car alarms went off on the street.  The dog jumped.  “What IS that?” I wondered aloud.  With a quick search on the internet I found that the Berkeley Boom is a phenomenon whose Twitter reports are being actively mapped.  While Berkeley police and residents still have no idea what the mystery boom is, through the combined powers of social media and mapping we are gathering an understanding of where it is happening.  As Berkeley residents continue reporting the boom (#BerkeleyBoom), perhaps we’ll get to the bottom of this, the newest of Berkeley’s many mysteries.

For more on the Berkeley Boom see the Berkeleyside article:

Map from Berkeleyside article:


The drought indeed hits home: Berkeley water is less than its usual quality

The hills and lawns might still look green, but the drought has hit the East Bay hard. The sparkling, clean, tasty water usually delivered through our taps from the Mokelumne River Basin in the Sierra Nevada has taken on an off taste. Get out the Britas!

From our favorite and fastest source for local news, Berkeleyside:

The drinking water for 1 million customers of East Bay Municipal Utilities District had an “off” odor and taste over the weekend and, while EBMUD is fixing the issue, customers might have to get used to it. The culprit? The drought.

EBMUD usually draws the drinking water for the majority of its customers from the bottom of Pardee Reservoir, about 100 miles east of Berkeley, according to Abby Figueroa, a spokeswoman for EBMUD. But on Thursday, the water district started taking water from the top portion of the reservoir. The water there is warmer and contains some algae, so even though it was treated before gushing into pipes in Berkeley, Oakland and elsewhere, there was a peculiar smell.

Above: Route from the Mokelumne River Basin in the Sierra Nevada to the East Bay.

Accordingly, there was a run on Brita filters at all local hardware and houseware stores.

New water restrictions for California announced.


Satellites can be vulnerable to solar storms

I don't use ocean color data, but found this report of interest nonetheless. From the HICO website. HICO is the Hyperspectral Imager for the Coastal Ocean.

HICO Operations Ended. March 20, 2015

In September 2014 during an X-class solar storm, HICO’s computer took a severe radiation hit, from which it never recovered.  Over the past several months, engineers at NRL and NASA have attempted to restart the computer and have conducted numerous tests to find alternative pathways to communicate with it.  None of these attempts have been successful.  So it is with great sadness that we bid a fond farewell to HICO.

Yet we rejoice that HICO performed splendidly for five years, despite being built in only 18 months from non space-hardened, commercial-off-the-shelf parts for a bargain price.  Having met all its Navy goals in the first year, HICO was granted a two-year operations extension from the Office of Naval Research and then NASA stepped in to sponsor this ISS-based sensor, extending HICO’s operations another two years.  All told, HICO operated for 5 years, during which it collected approximately 10,000 hyperspectral scenes of the earth.

Most of the HICO scenes taken over sites worldwide are available now and will remain accessible to researchers through two websites.  HICO will live on through research conducted by scientists using HICO data, especially studies exploring the complexities of the world’s coastal oceans.


Data science for the 21st century: building a new team of researchers

Berkeley received one of eight new awards from the National Science Foundation's recently launched NSF Research Traineeship (NRT) program. These programs develop innovative approaches to graduate training; approaches used across the projects include industry internships, international experiences, citizen-science engagement, interdisciplinary team projects, and training in communication with the media, policy makers, and the general public.

Our program at UC Berkeley is called Data Science for the 21st Century (DS421).  Three Grand Challenges motivate our program:

  1. Data: data acquisition, assimilation, and analysis, and the resulting challenges and opportunities for the research community and society at large. The data revolution is a potentially disruptive advance that challenges the norms and traditions of scientific research. Data science is an opportunity, entailing a revolution in training and a reorientation of research priorities. Open science (open access to datasets, literature, scripted workflows, and the like) is a fundamental transformation that integrates scientific publication with the underlying data, analysis, and reasoning, using metadata and machine-readable research products to facilitate a semantic web of knowledge. These practices will make our research reproducible and transparent, documenting the evidentiary basis for scientific conclusions and their implications for policy.
  2. System dynamics: coupled human-natural systems and their responses to rapid environmental change. Social-ecological systems display a complex array of ecological and social processes interconnected across broad spatial, temporal, and socio-political scales. Our current approach to understanding ecological and economic systems is dominated by partial equilibrium models that are poorly suited to the dynamics of rapidly changing systems. Important research avenues include: characterizing the dynamics and feedbacks among and within systems to better plan for cross-scale and nonlinear uncertainties; identifying the proximity of tipping points or other critical transitions; understanding how the spatial structure of interactions affects system dynamics; and detecting and attributing responses to environmental and climatic drivers. Real-time data analytics combined with long-term monitoring and forecasting are critical tools to address these challenges.
  3. Action: evidence-based proposals in public policy, natural resource management, and environmental design to mitigate the impacts of rapid environmental change, and enhance societal resilience and sustainability. Effective decision-making depends on networks of diverse stakeholders, with rapid feedback between individuals and groups to evaluate the impact, efficiency, equity, and efficacy of policy and management actions. This third component is at the core of a practical data science ethic critical for translating science to societal benefit, and makes use of our partnerships with academic, private, governmental, and non-governmental organizations.

Cutting across these challenges, all students, and especially those engaged in interdisciplinary research, need excellent communication skills and the ability to adjust content and style to reach their audiences. Welcome to the new cohort!


Mapsense talk at BIDS for your viewing pleasure

Here is Erez Cohen's excellent talk from the BIDS feed:

Title: Big Data Mapping: Modern Tools for Geographic Analysis and Visualization

Speaker: Erez Cohen, Co-Founder and CEO of Mapsense

We'll discuss how smart spatial indexes can be used for fast search and filtering, generating interactive, dynamic maps in the browser over massive datasets. We'll go over vector maps, quadtree indices, geographic simplification, density sampling, and real-time ingestion, using example datasets featuring real-time maps of tweets, California condors, and crimes in San Francisco.
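For a sense of how a quadtree index supports that kind of fast spatial filtering, here is a minimal point-quadtree sketch. It is purely illustrative (not Mapsense's implementation): each node holds a handful of points, splits into four quadrants when full, and rectangle queries prune whole subtrees whose bounds miss the query box.

```python
class Quadtree:
    """Minimal point quadtree with rectangle queries (illustrative sketch)."""

    def __init__(self, x0, y0, x1, y1, capacity=4):
        self.bounds = (x0, y0, x1, y1)
        self.capacity = capacity
        self.points = []
        self.children = None  # four sub-quadrants once this node splits

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False  # point lies outside this node
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append((x, y))
                return True
            self._split()
        return any(c.insert(x, y) for c in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [
            Quadtree(x0, y0, mx, my, self.capacity),
            Quadtree(mx, y0, x1, my, self.capacity),
            Quadtree(x0, my, mx, y1, self.capacity),
            Quadtree(mx, my, x1, y1, self.capacity),
        ]
        for p in self.points:  # push stored points down into the quadrants
            any(c.insert(*p) for c in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        """Return all points inside the query rectangle."""
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []  # query box misses this node entirely: prune
        hits = [(x, y) for x, y in self.points
                if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        if self.children:
            for c in self.children:
                hits.extend(c.query(qx0, qy0, qx1, qy1))
        return hits

qt = Quadtree(0, 0, 100, 100)
for p in [(10, 10), (20, 20), (30, 30), (40, 40), (50, 50), (60, 60)]:
    qt.insert(*p)
print(sorted(qt.query(0, 0, 35, 35)))  # -> [(10, 10), (20, 20), (30, 30)]
```

The same pruning idea, at tile granularity, is what makes browser maps over massive datasets responsive.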

The BIDS Data Science Lecture Series is co-hosted by BIDS and the Data, Science, and Inference Seminar. 

About the Speaker

Erez is co-founder and CEO of Mapsense, which builds software for the analysis and visualization of massive spatial datasets. Previously Erez was an engineer at Palantir Technologies, where he worked with credit-derivative and mortgage-portfolio datasets. Erez holds a BS/MS from UC Berkeley's Industrial Engineering and Operations Research Department and was a PhD candidate in the same department at Columbia University.


print 'Hello World (from FOSS4G NA 2015)'

FOSS4G NA 2015 is going on this week in the Bay Area, and so far, it has been a great conference.

Monday had a great line-up of tutorials (including mine on PySAL and Rasterio), and yesterday was full of inspiring talks.  Highlights of my day: PostGIS Feature Frenzy; PyGeoprocessing, a new geoprocessing Python package released just last Thursday(!) by our colleagues down at Stanford who work on the Natural Capital Project; and a very interesting talk about AppGeo's history of, and future plans for, integrating open-source geospatial solutions into its business applications.

The talk by Michael Terner from AppGeo echoed my own ideas about tool development (ideas also shared by many others, including ESRI): open source, closed source, and commercial ventures are not mutually exclusive, and they can often be leveraged together in one project to maximize the benefits each brings. No one tool will satisfy all needs.

In fact, at the end of my talk yesterday on Spatial Data Analysis in Python, someone had a great comment related to this: "Every time I start a project, I always wonder if this is going to be the one where I stay in Python all the way through..."  He encouraged me to be honest about that reality, and about how Python is not always the easiest or best option.

Similarly, in his talk about the history and future of PostGIS features, Paul Ramsey from CartoDB reflected on how PostGIS is really great for geoprocessing because it leverages the benefits of database functionality (SQL, spatial querying, indexing), but that it is not so strong at spatial data analysis that requires mathematical operations like interpolation, spatial autocorrelation, etc. He ended by saying that he is interested in expanding those capabilities, but the reality is that there are so many other tools that already do that.  PostGIS may never be as good at mathematical functions as those other options, and why should we expect one tool to be great at everything?  I completely agree.
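Interpolation is a good example of the kind of mathematical operation Ramsey described as better handled outside the database. A minimal inverse-distance-weighting (IDW) sketch in pure Python, with made-up sample coordinates and values, shows how little code it takes once you are out of SQL:

```python
import math

def idw(samples, x, y, power=2):
    """Inverse-distance-weighted estimate at (x, y) from
    samples given as (sx, sy, value) triples."""
    num = den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a sample point
        w = 1.0 / d ** power  # nearer samples weigh more
        num += w * value
        den += w
    return num / den

# Four made-up sample points on a 10 x 10 grid.
samples = [(0, 0, 10.0), (10, 0, 20.0), (0, 10, 20.0), (10, 10, 30.0)]
# The center is equidistant from all four, so IDW reduces to the
# plain mean of the values (approximately 20.0).
print(idw(samples, 5, 5))
```

Libraries like SciPy or PySAL offer sturdier versions of this and of spatial autocorrelation measures, which is exactly the division of labor Ramsey was pointing at.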