Originally published: Feb 11, 2014
This article is an excerpt from chapter 12 in the book Beyond Transparency.
The past decade has brought enormous and growing benefits to ordinary citizens through applications built on public data. Any release of data offers advantages to experts, such as developers and journalists, but there is a crucial common factor in the most successful open data applications for non-experts: excellent design. In fact, open data and citizen-centered design are natural partners, especially as the government 2.0 movement turns to improving service delivery and government interaction in tandem with transparency. It’s nearly impossible to design innovative citizen experiences without data, but that data will not reach its full potential without careful choices about how to aggregate, present, and enable interaction with it.
Public data is rarely usable by ordinary citizens in the form in which it is first released. The release is a crucial early step, but it is only one step in the process of maximizing the usefulness of public resources for the people who own them. Because data carries important information about the parts of people’s lives that are necessarily communal, it needs to be available and accessible to all. It needs to be presented in ways that illuminate the information it contains and that allow residents to interact with it and incorporate that information into their lives.
The real-time transit apps that are such a strong early example of useful open data do more than offer a raw feed of bus positions. The best of them allow riders to search arrivals on multiple lines of their choosing and adjust their commute plans accordingly. We can see the difference between great and merely adequate design in markets where multiple applications have been built based on the same data. Designs that more smoothly facilitate the tasks that people want to do are the most adopted. Conversely, valuable data that is presented in a way that is at odds with citizens’ mental models or awkward to use often doesn’t get the attention it deserves.
When internal systems or processes first become transparent to end-users via the internet, something profound happens. Assumptions that seemed rock solid can come into question, and the entire reason for running the systems and processes can be redefined. I had the privilege of working at a large financial company during the early days of online stock trading in the 1990s. Since it was founded, the company had employed brokers to interact with customers and accept their trade requests. If the back-end systems supporting trading happened to go down, the brokers covered for them with multiple layers of backup plans. As experts and daily users, they also covered for quirks in the system, odd error messages, etc. The company invested heavily in technology and had a track record of ninety-nine percent system uptime, of which it was justifiably proud.
However, once it opened its doors on the web and allowed customers to place trade orders online, things changed. Ninety-nine percent uptime meant potentially fifteen minutes of downtime in twenty-four hours, which was enough to inconvenience thousands of customers if it happened to fall during the market day. A metric that had been important to the company, and on which it thought it was winning, was no longer close to good enough. Error messages written for employees who received six months of training (and were, of course, being paid to be there) were not clear or friendly enough for customers who were becoming accustomed to online interaction through retail. The company had to rethink everything from how it measured its mainframe performance to how it designed its web pages in order to present errors gracefully. It had to intentionally write and design error messages for the first time. It had to consider the needs of people who were not being paid to be there (and indeed, who had plenty of options with the company’s competitors) in making choices about its technology systems.
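The downtime figure above is simple arithmetic, and a few lines of Python make it easy to check. This is only a sketch: the function name `downtime_minutes` is made up for illustration, and the 99.9 percent comparison is an added example, not a figure from the company described here.

```python
def downtime_minutes(uptime_fraction: float, window_hours: float = 24.0) -> float:
    """Return the minutes of downtime implied by an uptime fraction over a window."""
    return (1.0 - uptime_fraction) * window_hours * 60.0


# Ninety-nine percent uptime over a full day: roughly the fifteen minutes cited above.
per_day_99 = downtime_minutes(0.99)

# Illustrative comparison (assumption, not from the article): 99.9 percent uptime.
per_day_999 = downtime_minutes(0.999)

print(f"99%   uptime: {per_day_99:.1f} minutes of downtime per day")
print(f"99.9% uptime: {per_day_999:.2f} minutes of downtime per day")
```

A metric that sounds nearly perfect as a percentage looks quite different when it is restated in minutes a customer might spend unable to trade.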
I’m happy to say that my old employer recognized and took on the challenge, and it continues to be a leader in modern, internet-enabled financial services today. I see an analogy between what happened in that industry in the 1990s and what is happening in government now in the 2010s. It was the opening of the systems to customer interaction that triggered a revolution in how the company approached designing for customers. This wasn’t just a financial industry phenomenon. As retail stalwarts like Nordstrom attracted online customers, inventory systems designed for internal use became accessible–or at first inaccessible–to customers, creating a frustrating experience. What Nordstrom did in its 2010 redesign has some similarities to a municipal open data release: the company exposed its entire inventory to customers shopping online, enabling people to directly find what they were looking for, wherever it existed within the company’s distribution and warehousing systems or its stores (Clifford, 2010). Again, the needs of customers now able to interact with Nordstrom’s systems engendered a profound rethinking of what information (data) it provided and how (design) it provided it.
Open data has the potential to trigger a similar revolution in how governments think about providing services to citizens and how they measure their success. It’s a huge opportunity, and to take advantage of it will require understanding citizen needs, behaviors, and mental models, and making choices about how to use data to support all of those. In other words, it will require design.
Data science can be understood in terms of seven stages: acquire, parse, filter, mine, represent, refine, and interact (Fry, 2004). For the eagerly waiting civic hacker, the first step, acquire, is accomplished through an open data release. For the skilled civic hacker, or for many journalists, that step is the critical one: she can thank the agency that released the data and proceed with her project. The average city resident, however, depends on others for the six steps that follow a data release, and in particular for the final three: represent, refine, and interact. These steps are strongly associated with the practice of citizen-centered design.
The difficult task of making data meaningful and useful to all the people who can benefit from it can draw on many methods and examples, but skipping these final steps or doing them poorly can lead to confusion and underutilization of the data that activists have worked hard to get released. Consider US Census data, to take a large example. Early versions of American FactFinder simply provided links to available datasets via the internet–a valuable service and a vast improvement on what was available before. Still, it was very challenging for untrained people to wade through them.
The latest version of FactFinder, which was released with the 2010 census data in early 2013, has employed design in order to go much further (see http://factfinder2.census.gov). This has been a process of evolution, from the first online releases after the 1990 census to today. The latest version allows a search by ZIP code and returns a set of tabs, each of which highlights one critical piece of information, such as the total population of that ZIP code on the population tab. The Income tab highlights median household income. There are many more facts available in neatly arranged web tables via links, and there is even a Guided Search wizard that helps users find their way to tables that are likely to interest them. It’s not Nordstrom.com (or any other large retailer with a design staff and design culture) in terms of ease of use, but it does a great deal to return this data, which is collected by the government and owned by the people, to the people in a form in which they can use it.
There’s more to designing open data well than just making it searchable and presenting it attractively. In a recent study of US counties’ official election department websites, my collaborators and I discovered a problem with election results released online (Chisnell, 2012). Counties, as everyone who follows elections knows, are the units that precincts roll up to, and for most of the US, they are the level of government that has officials who are responsible for ensuring fair elections and publishing results. All of the counties that we studied fulfilled their statutory obligation to provide vote totals within their county, but voters with whom we conducted usability sessions were dissatisfied with what they found. Why? The counties are releasing the same information they have released for decades, to newspapers in earlier days and to radio and television journalists as the twentieth century progressed. For hundreds of years, journalists (and state election officials) have performed the service of aggregating these county tallies for voters, so that they know who actually won. This is what voters have come to expect as the meaning of “election results”–“who” or “which side” prevailed in the contests they voted on. So, voters looking at county election websites were confused and disappointed by the results sections, which provided only local tallies and no “results.”
There’s nothing wrong with this public data, but there is a problem with information design. Voters look to these sites for what they think of as results, particularly on second and third rank contests that may not be well covered by the media. The sites currently don’t provide voters’ idea of results, but simple design changes would allow them to. Without complicating these sites’ visual or interaction design, the counties could provide links to the overall outcomes of the contests they report on and satisfy everything citizens want. Design isn’t necessarily about being fancy or even pretty–much of it is about the right information at the right time.
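The gap between local tallies and "results" is the aggregation step that journalists and state officials have traditionally performed. A minimal sketch of that roll-up follows. Everything in it is hypothetical: the county names, candidate names, vote counts, and the function `contest_outcome` are invented for illustration, and real election reporting involves certification rules that this ignores.

```python
from collections import Counter

# Hypothetical tallies for one contest: county -> {candidate: votes}.
# All names and numbers are invented for illustration.
county_tallies = {
    "County A": {"Candidate X": 1200, "Candidate Y": 900},
    "County B": {"Candidate X": 400, "Candidate Y": 1100},
    "County C": {"Candidate X": 800, "Candidate Y": 750},
}


def contest_outcome(tallies):
    """Sum per-county tallies and return (overall totals, leading candidate)."""
    totals = Counter()
    for county_votes in tallies.values():
        totals.update(county_votes)  # adds each candidate's votes to the running total
    # most_common(1) returns the top entry; ties are not handled in this sketch.
    leader, _ = totals.most_common(1)[0]
    return dict(totals), leader


totals, leader = contest_outcome(county_tallies)
print(totals)
print("Leading overall:", leader)
```

In this invented example, Candidate X leads in two of the three counties, yet Candidate Y leads overall. A voter looking only at the tally for County A or County C would come away with the wrong idea of who prevailed, which is exactly why the local numbers alone do not answer the question voters are asking.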
The government has collected the first names of children registered for Social Security since the program began. It has collected baby names from birth registrations for longer. In fact, births and names are a basic public record. In the 1990s, after the advent of the web, these records became much more interesting because the data was made available in a form that was easy to explore. We can thank an SSA employee named Michael Shackleford for writing the first search program and making first name data public (Graham, 2013). The agency has since evolved its own design and seen others build on top of its open data. One famous example is NameVoyager, which offers a brilliant design on top of public data. It visualizes the popularity of names over time on a graph, with pink and blue bands representing girls’ and boys’ names, and its simple interface narrows the bands as the user types each letter of a name. Together, these choices turn a bureaucratic dataset into a game.
Mobile apps using transit data are one of the biggest citizen-facing open data success stories, but again, an app that simply provides a feed of GPS coordinates for buses isn’t a winner. Those that provide the most features aren’t necessarily the best ones either.
Weather data has seen some interesting developments in 2012 and 2013 in terms of design. Government weather data has been considered a public good since the government gained the capability to collect meaningful weather data. However, until very recently, it has been offered to the public through basically a single information model. This model was regional, because information was distributed by broadcast, and it focused on large events and weather patterns, both because those make sense in a regional model and because the entities with the most pressing need for weather data were agricultural and industrial.
Now, consider three recent weather apps, all for mobile phones, that take public weather data a step further: Dark Sky, Swackett, and Yahoo! Weather. All use essentially the same public data, and each offers a different experience. Swackett (released in January 2012) proposes that the main reason individuals need weather data is to understand what kind of clothes to put on or whether or not to bring a jacket. Its interface shows a whimsical figure, which the user can customize through different editions, dressed appropriately for that day’s predicted weather in the user’s location. More traditional weather information is available through navigation.
Dark Sky (released in April 2012) doesn’t show a person, but it also assumes that an individual’s reason for looking up the weather is both hyper-local and immediate-future. Dark Sky’s default screen shows expected rainfall over the next sixty minutes for the user’s current location. It answers the question “do I need to take an umbrella if I go out right now,” and it sends notifications like “light rain starting in five minutes.” (All of this is only useful because the data is excellent.)
Yahoo! Weather’s new app, released in April 2013, combines government data with Yahoo’s repository of photos shared by its users to provide a simple temperature with a photographic background that gives a sense of what that weather feels like in the current location. Its designers chose radical simplicity–there are no radar maps, no extended forecasts, and no extras. Different people might choose differently among these three apps–none of them is clearly “better” than the others–but they all employ design in combination with open data to deliver an experience that far exceeds anything that existed prior to the 2010s.
Even our work in open data standards can be supported by good design choices. I don’t mean colors and fonts, but choices about where and how to display information that take account of how people use it. I’ve been guilty of being a killjoy in the past when I’ve heard about restaurant health inspection score data being released and civic hackers building apps on it. As a UX designer, I’ve never observed anyone paying attention to the required public posting of scores in restaurant windows, and it’s hard for me to imagine that anyone would actually use such an app in the course of ordinary restaurant going. That said, when Code for America collaborated with the city of San Francisco and Yelp to place restaurants’ latest scores within their Yelp profiles using the LIVES standard, I predicted that this would be a useful and successful design.
Why? Yelp is one of the key places where people make decisions about restaurants already. Having one more piece of information available within that interface supports established behaviors that would be difficult to change, whereas having to download and install a separate app specific to health inspections would complicate the process of evaluating restaurants. While this may seem like just an implementation choice, it’s a design choice that makes an enormous difference to the user experience.
Much of the work that we are proudest of at Code for America involves strong design, as well as clever technology. BlightStatus, built for the city of New Orleans by Alex Pandel, Amir Reavis-Bey, and Eddie Tejeda in 2012, is celebrated for its success in integrating data from seven disparate city departments. It employs plain language, simple and familiar web affordances, and clear information hierarchies.
DiscoverBPS, the Boston Public Schools search app created by Joel Mahoney in 2011, succeeds because it looks at the process of school choice from a parent’s perspective. Rather than listing data school by school, it allows comparison across factors that are likely to be important to families (based on the creators’ user research). In reducing the burden required to extract meaning (i.e. the specific information categories they care about) from public data, it uses design to make the information more accessible to everyone.
Read the full chapter The Beginning of a Beautiful Friendship: Data and Design in Innovative Citizen Experiences.
Cyd is renowned for building communities, which she’s done in San Francisco (as co-founder of Women on the Web), and at Code for America (where she’s a mentor). She’s also an adviser to ethnio, a product that helps people recruit research participants. You can follow Cyd on Twitter at @cydharrell.