Geeks With Blogs

BrustBlog Pontifications on Microsoft and the Tech Industry


June 21st, 2010


The Importance of Open Government Data

To all those present, good afternoon. My name is Andrew Brust. I help run a consulting firm, twentysix New York, here in Manhattan. I am also a technology columnist and blogger, and serve on the New York Technology Council’s Advisory Board. As I have explained in previous testimony, I am a lifelong New Yorker, and began my IT career in the employ of the government of the City of New York.

I’ve testified to this Committee before, voicing my support for Open Government Data. I’ll reiterate today that I feel the benefits of publishing data from all City agencies are huge. In the City of New York we have a large government consisting of many departments, authorities and commissions. Given the pervasiveness, especially recently, of technology in ordinary people’s lives, it only makes sense to publish this data online. Data published in raw form allows citizens to query that data from computers, smart phones, tablets and other devices. It also allows software developers, be they hobbyists or entrepreneurs, to create applications that analyze the data, mash it up with private data, or visualize it geographically, or through charts, gauges, and diagrams.

Just as the Federal government has created a portal for data from Federal agencies, so too should the City of New York, perhaps through DoITT, create a portal for City data. This would provide a platform for an aftermarket in City data-based products and services. It could stimulate greater public participation in government. In the same spirit as the Green Book directory and the various NYC TV cable channels, a City data portal that is both human- and machine-readable could enable self-service acquisition of government information and make City services more effective in the process.

The prospect of opening up each data stream in each agency might seem daunting to City IT professionals. I would encourage DoITT and the individual agencies to conceive of the requirement with the right mindset. Data feeds are just software services, and good software is built on the premise of designing a service layer at the foundation. So rather than taking the approach of building closed systems and then opening them up, agencies should premise the architecture of their systems on building the services/feeds first, then layering application logic and functionality on top of them. With this approach, Open Government Data would become a byproduct of normal software development, rather than a burdensome, discrete step.
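As a rough illustration of the "services first" idea, here is a minimal Python sketch. Everything in it is hypothetical: the `Inspection` record, the sample data, and the function names are invented for illustration and do not describe any actual City system. The point is only that the public feed and the internal application logic both sit on top of the same service function.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Inspection:
    """Hypothetical record type, invented for this sketch."""
    restaurant: str
    borough: str
    score: int


# Hypothetical in-memory data standing in for an agency database.
_RECORDS = [
    Inspection("Example Diner", "Queens", 7),
    Inspection("Sample Bistro", "Manhattan", 21),
    Inspection("Placeholder Cafe", "Queens", 12),
]


def list_inspections(borough=None):
    """Service layer: the single access path to the data."""
    return [r for r in _RECORDS if borough is None or r.borough == borough]


def public_feed(borough=None):
    """Public, machine-readable feed built directly on the service layer."""
    return json.dumps([asdict(r) for r in list_inspections(borough)])


def internal_average_score(borough):
    """Internal application logic layered on the very same service."""
    rows = list_inspections(borough)
    return sum(r.score for r in rows) / len(rows) if rows else None
```

Because both consumers call `list_inspections`, publishing the feed requires no separate extraction step once the service exists.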

This would still leave the “back catalog” of applications and databases, of course, but that can be processed in a phased, scheduled way. Each pre-existing data source exposed would facilitate not just public consumption of the data but re-use of that same data by the agency in new software applications.


Mashup Examples Redux

In my previous testimony, I suggested some examples of how government data could be utilized and commercialized. Allow me to present these ideas, briefly, once more.

Google Maps should be able to show where the big potholes are; Zagat should be able to indicate which restaurants have a sterling Health Department inspection record; WebMD should be able to create heatmaps showing which neighborhoods are hardest hit by an epidemic, and the New York Times ought to be able to indicate which boroughs and neighborhoods are getting the most, or least, arts funding.

Retail consultancies should be able to show which precincts are best and least served by certain types of shops. Tourists should be able to see where the highest concentrations of hotel rooms are, and thus where the most availability and best prices may exist. Members of this Committee should be able to see how well Verizon is living up to its commitment to deploy FiOS service to all areas of all five boroughs.

Children’s Aid Society should be able to illustrate where concentrations of child homelessness and abuse exist. Food for Survival should be able to show which ethnic, geographic, economic and age groups are most susceptible to hunger.

Ultimately, the thing to remember is that data is a raw material, which the City government can refine only to a certain extent. Making the raw material available to the public allows far more refinement and value to be added to that data than can be had by keeping it sequestered within the agency that has collected it.

The City as Data Consumer

Even City government can directly benefit from such Open Government Data. That’s because integration of systems between agencies will be much better facilitated through a normal data sharing regime than through customized, point-to-point data interchange. This will enable streamlined construction of numerous systems. For example, a comprehensive city-wide enterprise data warehouse will be much easier to build in a data sharing environment, making the Mayor’s Management Report much easier to produce. The notion of a general inquiry system, across all agencies, for 311, becomes compellingly feasible. Key Performance Indicators and hierarchical balanced scorecards for the entire City government become approachable, as does an automated alerting system that would flag any KPI that fell below or exceeded its acceptable values.
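The alerting idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KpiRange` type, the KPI names, and the thresholds are all hypothetical, and a real system would read its values from agency data feeds rather than from a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiRange:
    """Hypothetical acceptable range for one Key Performance Indicator."""
    name: str
    low: float
    high: float


def find_alerts(readings, ranges):
    """Return one message per KPI whose reading is outside its range.

    `readings` maps KPI name to its latest value; KPIs with no reading
    are skipped rather than treated as failures.
    """
    alerts = []
    for rng in ranges:
        value = readings.get(rng.name)
        if value is None:
            continue
        if value < rng.low:
            alerts.append(f"{rng.name} below acceptable range: {value} < {rng.low}")
        elif value > rng.high:
            alerts.append(f"{rng.name} above acceptable range: {value} > {rng.high}")
    return alerts
```

For example, with a hypothetical range of 0 to 5 days for pothole repair, a reading of 8 days would produce a single "above acceptable range" alert, while in-range KPIs would produce none.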

The exciting internal applications for Open Government data should eliminate any fear that the external applications would be limited or underwhelming. They should also eliminate doubt as to the importance of sharing virtually all data (within important privacy and security limits), no matter how mundane some of the data, in raw form, may seem.

A Possible Technology

The technology platform with which agencies publish and even host their data most likely should be determined at the discretion of the individual agency itself. Making all agencies adhere to a single technology, hosting or cloud platform would likely be cumbersome to the point of being counterproductive to the goal of publishing the data in the first place.

That said, I would like to alert the Committee to an important technology and platform, from a perhaps unlikely source: Microsoft. Microsoft has created a framework called the Open Government Data Initiative, or OGDI. The software developer’s kit for OGDI is, believe it or not, published under an open source license. It was developed not by a product team on the corporate campus in Redmond, WA, but by the field organization that works with developers in the U.S. public sector (including federal, state and local government). OGDI is already being used to publish data from the U.S. Bureau of Labor Statistics, General Services Administration and Geological Survey, as well as from the city government in the District of Columbia, and the City of Edmonton in Canada. Rather than just a static feed, OGDI provides a fully queryable Web service as well as built-in capabilities for mapping and charting the data. Data can also be downloaded in CSV, Excel, or KML formats, the last of these being compatible with Google Earth.

You may know I work closely with Microsoft and have done so now for the better part of two decades. As I have done so, I have often been critical of the proprietary approach the company takes to certain technologies, including data access. But recently things here have changed.

First, Microsoft developed a data Web Services technology for its .NET software platform; the technology is called WCF Data Services, but its original code name, which many people still use, was “Astoria.” From the beginning I thought any technology that shared a name with a neighborhood in Queens had to be good. And it was: Astoria is based on open standards including HTTP, REST, Atom, XML and JSON. These are common Web programming technologies that are in no way unique to Microsoft; that, in and of itself, was noteworthy. But then Microsoft took this approach a step further, doing something quite uncharacteristic: it took Astoria’s format and wire protocol and published it as an open protocol, which could be implemented on any platform. Microsoft christened the platform-neutral assets from Astoria the Open Data Protocol, or OData.
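Because OData rides on plain HTTP, a query is just a URL. The Python sketch below builds one using the `$filter` and `$top` query options from the OData protocol. The service root `https://data.example.gov/odata` and the `RestaurantInspections` entity set are hypothetical placeholders, not a real City or OGDI endpoint, and the sketch only constructs the URL without sending a request.

```python
from urllib.parse import quote, urlencode


def build_query_url(service_root, entity_set, filter_expr=None, top=None):
    """Build an OData-style query URL from a few common query options."""
    options = {}
    if filter_expr is not None:
        options["$filter"] = filter_expr
    if top is not None:
        options["$top"] = top

    url = f"{service_root.rstrip('/')}/{entity_set}"
    if options:
        # Keep "$" readable in option names; percent-encode spaces and quotes.
        url += "?" + urlencode(options, safe="$", quote_via=quote)
    return url


# Hypothetical endpoint and entity set, used only for illustration.
example_url = build_query_url(
    "https://data.example.gov/odata",
    "RestaurantInspections",
    filter_expr="Borough eq 'Queens'",
    top=5,
)
```

Any environment that can issue an HTTP GET against such a URL can consume the feed, which is what makes the protocol platform-neutral in practice.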

OData is compatible, in a raw way, with any programming environment which has the capability of interfacing with the Web. But what about a full library that consumes OData feeds and makes their data items appear as rich objects that developers can program against, without having to write the code to procure the feeds and parse each record and field within them? Of course such a facility exists for Microsoft’s .NET platform. But it extends well beyond that: OData native client libraries also exist for JavaScript, Java, PHP, Ruby, and Objective-C (used by the iPhone/iPad platform). Microsoft has actually published the source code from the .NET client implementation (available under the Apache Open Source license) so that everyone has access to a sturdy reference implementation. On the server side, in addition to Microsoft’s Astoria implementation, IBM has implemented OData for its WebSphere eXtreme Scale REST data service (for its grid database service).
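To give a sense of what such client libraries do on a developer's behalf, here is a hand-rolled Python sketch that turns a JSON feed into typed objects. It assumes a payload wrapped in a top-level `"d"` member, as in OData's JSON format, with collections either listed directly or under `"results"`. The `Pothole` type, its fields, and the sample payload are hypothetical, and a real client library would also handle fetching, paging, and metadata.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Pothole:
    """Hypothetical entity type, invented for this sketch."""
    id: int
    street: str
    borough: str


# Hypothetical sample payload; no network request is made.
SAMPLE_PAYLOAD = """
{"d": {"results": [
  {"Id": 1, "Street": "Example Ave", "Borough": "Brooklyn"},
  {"Id": 2, "Street": "Sample St", "Borough": "Bronx"}
]}}
"""


def parse_feed(payload):
    """Convert a JSON collection payload into a list of Pothole objects."""
    body = json.loads(payload)["d"]
    # Accept either {"d": {"results": [...]}} or {"d": [...]}.
    rows = body["results"] if isinstance(body, dict) else body
    return [
        Pothole(id=row["Id"], street=row["Street"], borough=row["Borough"])
        for row in rows
    ]


potholes = parse_feed(SAMPLE_PAYLOAD)
```

After parsing, application code works with attributes such as `potholes[0].street` instead of raw records and fields.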

The Open Government Data Initiative is built upon OData, and I hope City agencies will consider using it. To me, this isn’t about supporting Microsoft. It’s about getting Open Government Data up quickly and easily, in machine-readable form and with a basic user interface that is Section 508-compliant for accessibility. It’s also about encouraging Microsoft to continue this open approach to technology, so that it becomes more the rule than the exception.

Data for the People

Regardless of which technology or collection of technologies the City of New York uses to open its data, my hope is that it will indeed do so, somehow, and quickly. The data shouldn’t be proprietary to City agencies, because the data isn’t the City government’s property. The data belongs to the public, to use for public benefit and innovative results. I applaud the Council and this Committee for being such strong advocates of such an outcome, and I fervently support passage of Int. 029-2010. Thank you for your time and attention today.

Posted on Thursday, June 24, 2010 1:03 PM

Comments on this post: Testimony to the New York City Council Committee on Technology’s Hearing on Introduction 029-2010


Copyright © andrewbrust | Powered by: