government

Participating in the national online dialogue around recovery.gov

Yesterday, I wrote a story on ProgrammableWeb (An Online Dialogue to Shape Recovery.gov) to educate readers on recovery.gov (the government website intended to let Americans track the spending of money arising from the American Recovery and Reinvestment Act of 2009, the "Stimulus Package") and to draw attention to a “national dialogue” this week (until May 3) to solicit ideas aimed at answering the key question:

What ideas, tools, and approaches can make Recovery.gov a place where all citizens can transparently monitor the expenditure and use of recovery funds?

I've been reading some of the ideas presented so far and have voted on a couple. I have also added comments to two. In response to the proposal XML Web Services ("Make recovery data available as a web service via SOAP XML."), I wrote:

I agree that some type of rigorous programmatic interface that allows developers to access the data from recovery.gov is essential. I think that SOAP and the rest of the associated WS-* stack might be one way to implement such access mechanisms, but I would not want SOAP to be the exclusive protocol used. I would argue, for instance, that a RESTful approach is also an excellent alternative to consider for recovery.gov.

On a front closer to what our work has been about, in response to Making stimulus spending data accessible to the public, I wrote:

I'm one of the Berkeley researchers mentioned above involved with making recommendations on how data feeds should be used to make the recovery more transparent (see http://www.ischool.berkeley.edu/newsandevents/news/20090417recoveryguidelines and http://isd.ischool.berkeley.edu/stimulus/2009-029/).

Although some (but not all) agencies receiving and disbursing recovery funds are using feeds in their reporting (see a list that we compiled at http://isd.ischool.berkeley.edu/stimulus/feeds/feeds.html), the best data on dollars appropriated, obligated, or spent is in the Excel spreadsheets. Although there are apparently templates for the reports, they keep changing format, and there's nothing to stop agencies from inserting extra fields or omitting other fields. We know this for a fact since we've written programs to scrape the data from the spreadsheets, and we find it a challenge to keep up with changes that keep breaking our scripts.

The federal government should publish the data in the form of XML feeds in the first place (backed by a schema so that we can check that the data is valid), instead of making people who want to use that data scrape it out of Excel in a highly fragile process.
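To make the fragility concrete, here is a minimal sketch of the kind of header check a spreadsheet scraper ends up needing. The function name `diff_headers` and the column names are hypothetical, chosen only for illustration; they are not the actual fields in the agency report templates.

```python
def diff_headers(expected, actual):
    """Return (missing, extra) column names, compared case-insensitively."""
    def normalize(names):
        # Ignore empty cells and differences in case or surrounding whitespace.
        return {n.strip().lower() for n in names if n and n.strip()}

    exp, act = normalize(expected), normalize(actual)
    return sorted(exp - act), sorted(act - exp)


# Hypothetical header rows: what a script was written for vs. what it found.
expected = ["Agency", "Program", "Appropriated", "Obligated", "Spent"]
actual = ["Agency", "Program", "Obligated", "Spent", "Notes"]

missing, extra = diff_headers(expected, actual)
print("missing columns:", missing)
print("unexpected columns:", extra)
```

A check like this only reports that a report has drifted from the layout a script expects; a published schema would let the producer catch the problem before the data goes out.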

As I wrote yesterday, it will be interesting to see how well the recovery.gov site actually does at aggregating a large number of proposals and surfacing the best ones. Moreover,

government
recovery.gov tracking


Working with the bioguide ID for congresspeople in Freebase

The Congressional Biographical Directory contains entries for every congressperson from 1774 to the present. Each congressional representative is associated with an identifier (a bioguide ID). For example, the bioguide ID for Edward (Ted) Kennedy is K000105. With this ID, you can determine the URL for the corresponding biographical directory entry. Kennedy's, for example, is

http://bioguide.congress.gov/scripts/biodisplay.pl?index=K000105
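As a minimal sketch of that mapping: the function name `bioguide_url` is my own, and the only thing taken as given is the URL pattern in the Kennedy example above.

```python
from urllib.parse import urlencode

# Base URL taken from the Kennedy example above.
BIOGUIDE_BASE = "http://bioguide.congress.gov/scripts/biodisplay.pl"


def bioguide_url(bioguide_id: str) -> str:
    """Build the Biographical Directory URL for a given bioguide ID."""
    return f"{BIOGUIDE_BASE}?{urlencode({'index': bioguide_id})}"


print(bioguide_url("K000105"))
```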

I would like to make use of the bioguide ID in interacting with Freebase with respect to congresspeople. Start at Kennedy's topic page:

http://www.freebase.com/view/en/ted_kennedy

Then hit "explore" to go to

http://www.freebase.com/tools/explore/en/ted_kennedy

and see the topic's keys:

Outbound key(s):

key namespace
184136 /wikipedia/en_id
Ted_Kennedy /wikipedia/en
Edward_M$002E_Kennedy /wikipedia/en
Edward_Moore_Kennedy /wikipedia/en
Teddy_Kennedy /wikipedia/en
Edward_kennedy /wikipedia/en
Edward_M_Kennedy /wikipedia/en
EMK /wikipedia/en
Ed_Kennedy /wikipedia/en
Caroline_Bilodeau /wikipedia/en
aa1a62ca-f027-426e-810f-63556da55434 /authority/musicbrainz
ARTIST349855 /authority/musicbrainz/name
Edward_Kennedy /wikipedia/en
Ted_Kennedy$002FDraft_1 /wikipedia/en
Senator_Ted_Kennedy /wikipedia/en
ted_kennedy /en
The_Lion_of_the_Senate /wikipedia/en
Edward_Moore_$0022Ted$0022_Kennedy /wikipedia/en
K000105 /user/jamie/sunlight/bioguide_id
Cape_Cod_Orca /wikipedia/en

What's the MQL query to read all the keys for the topic?

{
  "id" : "/en/ted_kennedy",
  "key" : [
    {}
  ]
}

Among the various keys returned, we get:

{
  "namespace" : "/user/jamie/sunlight/bioguide_id",
  "type" : "/type/key",
  "value" : "K000105"
}
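Once the key list is in hand, picking out the bioguide ID is a matter of filtering on the namespace. Here is a minimal sketch that works on an already-parsed result; the `sample` dictionary is hand-built from a few of the keys listed above rather than fetched from Freebase, and `find_bioguide_id` is a name I made up.

```python
BIOGUIDE_NAMESPACE = "/user/jamie/sunlight/bioguide_id"


def find_bioguide_id(topic):
    """Return the bioguide ID from a topic's key list, or None if absent."""
    for key in topic.get("key", []):
        if key.get("namespace") == BIOGUIDE_NAMESPACE:
            return key.get("value")
    return None


# Hand-built sample shaped like the query result above (only a few keys).
sample = {
    "id": "/en/ted_kennedy",
    "key": [
        {"namespace": "/wikipedia/en", "type": "/type/key", "value": "Ted_Kennedy"},
        {"namespace": "/en", "type": "/type/key", "value": "ted_kennedy"},
        {"namespace": BIOGUIDE_NAMESPACE, "type": "/type/key", "value": "K000105"},
    ],
}

print(find_bioguide_id(sample))
```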

Keys are new to me — so I need to do a bit of learning right now.   Now, let's note the following

Let's now figure out how to write the bioguide ID for one of the senators who does not yet have one in Freebase: Jeanne Shaheen (Jeanne Shaheen facts – Freebase). Her bioguide_id is S001181. Here's an MQL write query that writes the bioguide_id to Freebase:

{
  "id" : "/en/jeanne_shaheen",
  "key" : {
    "connect" : "insert",
    "namespace" : "/user/jamie/sunlight/bioguide_id",
    "type" : "/type/key",
    "value" : "S001181"
  }
}
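Since the same write would have to be repeated for every congressperson missing a key, it helps to generate the query from the topic ID and the bioguide ID. The following is a minimal sketch that only builds the JSON shown above; the helper name `bioguide_write_query` is hypothetical, and actually submitting the query to Freebase is not shown.

```python
import json

BIOGUIDE_NAMESPACE = "/user/jamie/sunlight/bioguide_id"


def bioguide_write_query(topic_id, bioguide_id):
    """Build an MQL write query that attaches a bioguide ID key to a topic."""
    return {
        "id": topic_id,
        "key": {
            "connect": "insert",
            "namespace": BIOGUIDE_NAMESPACE,
            "type": "/type/key",
            "value": bioguide_id,
        },
    }


query = bioguide_write_query("/en/jeanne_shaheen", "S001181")
print(json.dumps(query, indent=2))
```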

Things to figure out: how to create keys in the first place, both in the Freebase UI and in MQL. I think regular users can create keys, but I'm not aware of how to do so in the Freebase UI. I didn't even see a way to insert the bioguide_id using the Freebase UI.

freebase
government


When did Obama pledge "universally accessible formats" for government data?

Barack Obama, as a presidential candidate, pledged that his administration would "put government data online in universally accessible formats". I learned about this campaign promise from the post techPresident – Sell Obama stimulus and create new transparency era by democratizing data by W. David Stephenson, who is the co-author of the upcoming O'Reilly book Democratizing Data, which is about:

strategies for automated structured data feeds and their use to improve worker efficiency, transparency, and to stimulate mass collaboration. He argues that governments and corporations, by creating automated data feeds in formats such as XML and KML, can simultaneously

  • for the first time give their entire workforces, not just senior management, access to the information they need to do their jobs more efficiently, but also collaborate organization-wide
  • restore public confidence through transparent operations that watchdog groups, the media, regulators, and the public can monitor on a real-time basis
  • find creative new solutions to problems and add profitable new services through mass collaboration leveraging their organizational data.

Stephenson refers to a YouTube video of a talk Obama gave at Google as a specific instance of Obama's mention of "universally accessible formats".  I was curious to nail down what Obama said exactly. If you jump to 9:11 into the talk, you will hear Obama say the following:

To seize this moment, we have to use technology to open up our democracy. It's no coincidence that one of the most secretive administrations in our history has favored special interests and pursued policies that could not stand up to the sunlight. As president, I'm going to change that. We will put government data online in universally accessible formats. [cheer from the Google crowd] I'll let citizens track federal grants, contracts, earmarks, and lobbying contracts. I'll let you participate in government forums, ask questions, in real time offer suggestions that will be reviewed before decisions are made, and let you comment on legislation before it is signed. And to ensure that every government agency is meeting 21st century standards, I will appoint the nation's first chief technology officer to coordinate and make certain that we are always at the forefront of technology and that we are incorporating it into every decision that we make.

I was gratified to see that on the very first full day of work in Obama's administration, Obama issued a Presidential Memorandum on Transparency and Open Government (pdf), which states:

  • "Executive departments and agencies should harness new technologies to put information about their operations and decisions online and readily available to the public. Executive departments and agencies should also solicit public feedback to identify information of greatest use to the public."
  • "Executive departments and agencies should offer Americans increased opportunities to participate in policymaking and to provide their Government with the benefits of their collective expertise and information. Executive departments and agencies should also solicit public input on how we can increase and improve opportunities for public participation in Government."
  • "Executive departments and agencies should solicit public feedback to assess and improve their level of collaboration and to identify new opportunities for cooperation."
  • "[Obama] direct[s] the Chief Technology Officer, in coordination with the Director of the Office of Management and Budget (OMB) and the Administrator of General Services, to coordinate the development by appropriate executive departments and agencies, within 120 days, of recommendations for an Open Government Directive, to be issued by the Director of OMB, that instructs executive departments and agencies to take specific actions implementing the principles set forth in this memorandum. The independent agencies should comply with the Open Government Directive."

(Right now, this memo is not available from either whitehouse.gov or the Federal Register.)

You might be interested in hearing Obama explain the concept of transparent government to the White House staff:

A geek note: In providing a reference to the YouTube video, I was able to provide a URL that loaded the YouTube video and fast-forwarded to the moment of interest using the experimental service VTagIt! (by Rick Jaffe). VTagIt is a great proof-of-concept for services that will be increasingly useful as time goes by. It's so much more useful to be able to point someone to a specific point in a video instead of saying "go to the video and fast-forward to point X".

government
politics


Tracking Mr. Shelby

One of the project areas I'm hoping that some of my students in the Mixing and Remixing Information 2009 (MRI 2009) course will take up is promoting government accountability and transparency in the new Obama era. (Interestingly enough, John Musser wrote last week on ProgrammableWeb about the Sunlight Labs Mashup Contest, something I'll definitely have to mention to the class in January.)

This morning, while reading Banks Got Bailout; Are They Making More Loans? : NPR, I came across a quote from United States Senator Richard Shelby (R, Alabama) that made me wonder what his voting record has been on a number of recent bills. My first thought was whether anyone has fed this data to Freebase; a quick look (Richard Shelby facts – Freebase) suggests that no, not yet, anyhow.

My next thought was to turn to the Sunlight Foundation projects. It turns out that OpenCongress was able to list the voting record for Richard Shelby, including Nay on the bank bailout bill (H.R.1424 Emergency Economic Stabilization Act of 2008).

Is it possible to retrieve this type of voting data with an API? Yes, with a bit of digging based on the following pointer at GovTrack: Source Code, Data, and APIs:

Vote Database API: Get voting records in XML, too. To get an overall list, see the XML download link at the top of the votes page. To include the votes of members of Congress, first find the member, then follow the link to the votes database, and then grab the XML download link.

I found an XML feed of Shelby's voting record linked off GovTrack: Sen. Richard Shelby [R-AL]'s Voting Record.

Clearly there's a lot more that we can do with this data — stuff to explore in class.

government
Mixing and Remixing information
politics
