Data Unbound

Helping organizations access and share data effectively. Special focus on web APIs for data integration.

September 3rd, 2009

plotting data for counties on Google Maps: Part I

There is a huge amount of government data, and socio-economic data in general, gathered at the county level. It would be nice to be able to plot that data on a desktop or online map (e.g., Google Maps). This morning I posted a question on the Sunlight Labs mailing list asking for some help:

I would like to display US counties on a Google map based on some scalar value (e.g., population) for each county and a color map that associates values to colors. Does anyone know of a library that makes this easy to do? (I'm interested in doing the same for other administrative regions, such as zip codes and congressional districts.)

(http://groups.google.com/group/Google-Maps-API/browse_frm/thread/fbc9266d4144e8fd/dbf74647b8baf8d1 contains a good discussion of the topic — and I have found other references that might be helpful,  but I have not seen the functionality I'm looking for distilled down into an easy-to-use library.)
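The color map mentioned in the question can be sketched in a few lines of Python. This is only an illustration of the idea, not a library recommendation: the function name `lerp_color`, the two endpoint colors, and the population figures are all made up for the example, and a linear ramp between two colors is just one of many possible color maps.

```python
def lerp_color(value, vmin, vmax, low=(255, 255, 204), high=(128, 0, 38)):
    """Map a scalar onto a hex color between `low` and `high` (RGB tuples).

    Values outside [vmin, vmax] are clamped to the nearest endpoint.
    """
    if vmax == vmin:
        t = 0.0
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))
    rgb = tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Made-up values keyed by county FIPS code, for illustration only
populations = {"06001": 1500000, "06003": 1200, "06005": 38000}
vmin, vmax = min(populations.values()), max(populations.values())
colors = {fips: lerp_color(pop, vmin, vmax) for fips, pop in populations.items()}
```

For heavily skewed data such as county populations, it would probably make sense to apply the ramp to the logarithm of the value or to use a handful of discrete classes instead.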

Building a ground overlay

When I tweeted my question, I got a very helpful response from Sean Gillies:

That's a lot of polygons (3489, see http://sgillies.net/blog/870/a-more-perfect-union-continued/) to draw in the browser. Make an image layer with OpenLayers?

Sean confirmed what I was thinking: that I had to compute a static image to use as an overlay, since otherwise drawing 3000+ polygons will slow down Google Maps prohibitively. In fact, in many ways, I've been trying to use the approach I've seen in the demo gallery of the Google Maps API v3: John Coryat's ProjectedOverlay example, which "uses OverlayView to render an image inside a given bounding box (LatLngBounds) on top of the map". (You can look at the overlay image (.png) directly and reuse ProjectedOverlay.js.)

So one approach would be to calculate a png of the counties (colored appropriately), and this png would provide an efficient way to display county data. I had started down this road a while ago; Sean's post gave me more direct guidance on how to create a useful Python-based desktop GIS setup for tasks such as creating my desired map in png form. To be honest, I've found the whole open source GIS world fairly confusing. I bought and read part of Gary Sherman's Desktop GIS: Mapping the Planet with Open Source Tools (Illustrated edition. Pragmatic Bookshelf, 2008) and was considering installing FWTools, GRASS GIS, and Quantum GIS. Sean's post alerted me to OSGeo.org and convinced me to try OSGeo4W, which is

a binary distribution of a broad set of open source geospatial software for Win32 environments (Windows XP, Vista, etc). OSGeo4W includes GDAL/OGR, GRASS, MapServer, OpenEV, uDig, QGIS as well as many other packages (about 70 as of summer 2008).

I installed OSGeo4W but have not been able to figure out the Python bindings (and hence can't yet try out the code that Sean posted). Neither has the Python setup from FWTools 2.4.3 worked for me. My next step is to follow the instructions at Python Package Index : GDAL 1.6.1 to see whether I'll have better luck.
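Whatever tool ends up drawing the png, the overlay approach described above needs county boundary coordinates converted into pixel positions that line up with the map. Google Maps uses a Mercator projection, so latitude cannot simply be scaled linearly. Here is a minimal sketch using only the standard library; the function names are my own, and it assumes the overlay image is stretched linearly between the corners of its bounding box in the map's projected space.

```python
import math


def mercator_y(lat_deg):
    """Spherical Mercator y (unitless) for a latitude in degrees."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))


def lonlat_to_pixel(lon, lat, bounds, width, height):
    """Convert lon/lat to (x, y) pixel coordinates in an overlay image.

    bounds is (west, south, east, north) in degrees; pixel (0, 0) is the
    top-left corner of the image, so y grows southward.
    """
    west, south, east, north = bounds
    x = (lon - west) / (east - west) * width
    y_top, y_bottom = mercator_y(north), mercator_y(south)
    y = (y_top - mercator_y(lat)) / (y_top - y_bottom) * height
    return x, y
```

With a rough, hypothetical bounding box for the lower 48 states such as `(-125.0, 24.0, -66.0, 50.0)`, the latitude halfway between south and north lands below the vertical middle of the image, which is the Mercator stretching at work.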

Joshua Tauberer's WMS service

Joshua Tauberer of Govtrack.us responded to my query by referring me to his experimental WMS service, which produces WMS layers for entities ranging from Congressional and state districts to counties. I modified one of the examples to try to plot the counties. For some reason, not all the counties show up yet. Still, this approach is very promising since it would save me the work of calculating the coordinates of the county boundaries to begin with. I have to come back to study and apply the techniques documented at WMS Server API Documentation.
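For reference, a WMS layer is ultimately fetched through GetMap requests, which are plain URLs with a standard set of query parameters. The sketch below builds such a URL for WMS 1.1.1. The endpoint `http://example.org/wms` and the layer name `county` are placeholders of mine, not the actual GovTrack values, which would have to be taken from the service's own documentation.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layers, bbox, width, height,
                   srs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.1.1 GetMap request URL.

    bbox is (minx, miny, maxx, maxy) in the units of `srs`.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty means the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
        "TRANSPARENT": "TRUE",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only
url = wms_getmap_url("http://example.org/wms", ["county"],
                     (-125.0, 24.0, -66.0, 50.0), 1000, 600)
```

A client such as OpenLayers issues requests of this shape for each tile or view, so the server does the polygon drawing and the browser only has to display images.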

Other things to study further

June 16th, 2007

Notelets for 2007.06.09 (a while ago)

‘omg my mom joined facebook!!’ – New York Times captures some of my own experiences on Facebook and might make a good piece for my Building Next Generation Web Applications course:

    So last week I joined Facebook, the social network for students that opened its doors last fall to anyone with an e-mail address. The decision not only doubled its active membership to 24 million (more than 50 percent of whom are not students), but it also made it possible for parents like me to peek at our children in their online lair.

I'm glad to hear that the current YouTube API will evolve on top of the Google GData API: YouTube API Blog: The Future

I didn't know about the Ruby-based API to Sketchup: SketchUp Ruby – Wikipedia, the free encyclopedia.

I follow EveryBlock with great interest. (See also Knight Foundation grant Holovaty.com — and Poynter Online – E-Media Tidbits, which has more preliminary details about EveryBlock.)

It is important to remember that JavaScript code is case sensitive. I couldn't get an event handler to fire because I wrote alink.onClick and not the correct alink.onclick

May 28th, 2007

Notelets: hosting, Wordpress, open access repositories, Firefox, LibraryLookup

My Dreamhost-hosted sites are down again: DreamHost Status » Blog Archive » Spacey filer issues. Time to move? But where to go?

If I want to add SSL access to any of the domains I host on dreamhost.com, I will need a unique IP address, which costs an extra $4/month. Some threads on this topic: Re: Unique ip?

Since I use Wordpress to display code, I'd dearly like to get bug #3066 (backslash disappears in <pre>) fixed.

I'm glad to see the emergence of APIs in the scholarly/library realm: OpenDOAR – About OpenDOAR – Directory of Open Access Repositories and the corresponding OpenDOAR – Application Programmers' Interface (API)

I'd like to learn how to write a Firefox toolbar. Born Geek » Firefox Toolbar Tutorial is a tutorial that might help:

    This tutorial explains how to create a toolbar extension for the Firefox web browser (specifically for version 1.5 and later). It provides an overview of how extensions are developed, the tools required to create an extension, and details on how toolbars are created. Please note that this tutorial is lengthy; I recommend spending time with it over the course of a few days (it makes for a good weekend read).

The online Barnes and Noble store (barnesandnoble.com) uses ISBN-13 in the links to books (e.g., RESTful Web Services). Amazon.com uses ISBN-10. Something to keep in mind to get LibraryLookup to work for Barnes and Noble.
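Converting between the two ISBN forms is mostly a matter of recomputing the check digit. Here is a minimal sketch of both directions; the function names are mine, and the code does not validate the check digit of its input.

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to its 978-prefixed ISBN-13 form."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    core = "978" + digits[:9]
    # ISBN-13 weights alternate 1, 3, 1, 3, ... over the first 12 digits
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def isbn13_to_isbn10(isbn13):
    """Convert a 978-prefixed ISBN-13 back to ISBN-10."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    core = digits[3:12]
    # ISBN-10 weights run 10, 9, ..., 2 over the first 9 digits, modulo 11
    total = sum(int(d) * (10 - i) for i, d in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))
```

For example, `isbn10_to_isbn13("0-306-40615-2")` returns `"9780306406157"`. A LibraryLookup-style bookmarklet could apply the reverse conversion to the ISBN-13 it finds in a Barnes and Noble link before querying a catalog that expects ISBN-10.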

Because I really dig Python, I perk up with any mention of free (?) Plone hosting, such as Objectis – Objectis Community

May 17th, 2007

A data architect on hiatus

Ever since I left my job as a data architect to focus on writing my book on mashups, I've not had much to say publicly about data architecture, especially as it applies to higher education and the world of libraries. Often, my posts have been in response to specific pieces of news that arrive on my desk in the course of my job. Now, since I have fewer immediate matters to which to react, I've been relatively inactive on this blog.

However, I do think a lot about some perhaps mundane problems that I face as I write my book, barriers that make it difficult to do research and to write up that research and present it on the Web. An example: even though I cite sources in my book, I've not figured out the best way to integrate Zotero (a bibliographic reference manager) into the writing process. I'm a tad embarrassed to admit that I've been formatting references by hand, even though I have a pretty good understanding of bibliographic reference managers and their potential benefits. (I used BibTeX in my Ph.D. dissertation.) How do I manage references that are scattered throughout my digital universe: in my social bookmarks, in my Word documents, in my wiki and blogs, etc.?

At any rate, please expect sporadic updates over the next months. Most of my blogging around my professional work will be happening on mashupguide.net. I will, however, write about ideas that come to me as I start to build up my consulting business around the use of XML, web services, and mashup-type thinking.

April 18th, 2007

My "guest expertise" on "Writing for Digital Media"

As a "guest expert" last week in Writing for Digital Media, an online course at Chatham College Online, I had a lot of fun interacting with students in back and forth writing. I thought that I'd capture (in a slightly edited form), what I wrote. (It's probably even more interesting to write down what the students asked me, but for now, I'll record one side of a conversation.) Here goes:

In my introduction I wrote:

    I'm currently writing a book on "web mashups." If you look up mashups on the Wikipedia (http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)), you'll find the following definition: "A mashup is a website or application that combines content from more than one source into an integrated experience." I also teach a course about mashups at the Information School at UC Berkeley. Why should you care about web mashups? With mashups, you are able to easily recombine content and applications to make powerful new creations without much work. We can talk more about mashups if you'd like. Perhaps you'd be interested in discussing the topic of electronic publishing. I've been a weblogger for over 7 years and plan to publish my book on my website in a variety of forms: PDF, Microsoft Word, HTML, a Wiki.

How did I get started with writing on the Web?

    I got interested in writing for the web when I laid eyes on Dave Winer's Scripting News in early 2000. I remember wondering what it was that I was looking at. It had a personal voice, sometimes obnoxious but mostly engaging. Dave W. was also pioneering an online writing environment (Manila) that involved one-click editing (that is, you see the page and you can hit the edit button). That close feedback loop between reading and writing was addictive. Moreover, the fact that the online writing environment was programmable hooked me for life. (I know I'm writing like a true geek here. I don't know whether you find a programmable writing environment cool, but I do!) I should say that I love blogging because writing helps to clarify my own thinking. There's nothing like having to explain stuff to others to work out what's really going on in my head. I also want to give back to the Web. I've learned so much from the Web that it's a great experience to share what I know online.

    In terms of starting out, in many ways, I'm not the best person to give advice. I say this because I can't say that I'm a model blogger. I write sporadically. I probably try to write to too varied an audience. However, I will recommend two articles written by my colleague Chris Ashley that convey some of the spirit of the new writing world: http://istpub.berkeley.edu:4201/bcc/Fall2001/feat.weblogging.html and http://istpub.berkeley.edu:4201/bcc/Winter2002/feat.weblogging2.html

    Chris has done an amazing job of actively and continually writing. See, for instance, http://chrisashley.net/weblog/

When asked for more details about mashups, specifically how difficult it is to write them and what are some specific examples, I answered:

    Mashups are becoming easier for non-programmers to create, and the term mashup applies whether the new combination of content is created by a programmer or a non-programmer. A good example to look at is http://housingmaps.com, which is a cross between craigslist and Google Maps. That is, you can look at real estate listings from craigslist on a map. Note that housingmaps.com was created by neither Google nor craigslist but by Paul Rademacher (http://www.technologyreview.com/tr35/Profile.aspx?Cand=T&TRID=437). When Paul R. made housingmaps.com in 2005, it was a really creative act. Google made it easier by releasing an API, or application programming interface (http://www.google.com/apis/maps/). You can follow the instructions, and it's not super hard, but it does help to be techie. Later, people started building tools (such as http://mapbuilder.net) to help people make maps without any programming background. Finally, Google decided to add features for making maps back into its own product (the new "My Maps" feature in http://maps.google.com). Still, the most powerful mashups will need programming skill to create at this point.

    http://www.programmableweb.com/popular is a good list of mashups. The easiest ones to understand are map-based ones. Chicagocrimes.org is another one in that genre.

I was asked to clarify the difference between mashups and a Google search:

    In most cases, web mashing is about making a web site that pulls data from different sources together. When you do a Google search, you find things that Google has brought together under a search term. But you are not really joining them together as in a web mashup.

My advice on how to get started with writing on the Web:

    A good start in writing for the Web is to find other people who are writing to the same audience and start engaging those folks in dialog. Link to those people. Comment on their work. There's a good chance that they will link back if you are constructively engaging. Reflect on the websites (especially weblogs) that you currently like and read. What is it about them that you like? Is the implicit or explicit audience that you are trying to reach similar to those sites?

How new of an operating system is needed?

What's an integrated experience in a mashup?

    An integrated experience can be either transitory or durable. I often write programs that are "throw-away", meaning I made them to do this one act of integration and then I use the product and don't really keep the program around. Other times, I want to create something that lasts and that can be used by others.

How about copyright?

    In terms of copyright, which is a broad and complex topic (and one in which I have little expertise: IANAL, "I am not a lawyer"), what specifically are you interested in? For DRM, take a look at http://en.wikipedia.org/wiki/Digital_Rights_Management: "Digital Rights Management (DRM) is an umbrella term referring to technologies used by publishers or copyright owners to control access to or usage of digital data or hardware, and to restrictions associated with a specific instance of a digital work or device." (e.g., copy protection on DVDs or on iTunes songs.) Yes, copyright is always a concern. But there are important provisions of fair use to keep in mind. (http://en.wikipedia.org/wiki/Fair_use)

Can you do web production all alone or should you work with others?

    In terms of collaboration: I would have to agree that there are very, very few people who are good at all the skills needed. I will have to say that I've found it difficult to pull together a team with all the skills at a very deep level. But often, you have to make do with good enough and move on.

Do you find other people's comments useful on your blog?

    I do find people's comments on my blog as I think aloud helpful. All my new blogs have comments turned on. Unfortunately, I have a bunch of older blogs that I have yet to upgrade — and I turned the comments off there because the spammers were overrunning the blogs!

When to write for online? How about online novels?

    You have to consider the genre of what you want to publish in deciding whether to go for electronic (or specifically, web-based) publishing. My book is about web mashups, a subject that is naturally tied to the web. Let me list some reasons why I will publish my book on the web in a variety of forms:

    • A lot of the references for my book are URLs.
    • I have a lot of examples that are displayed on the web.
    • My book comes in chunks that can be roughly aligned with a web page.
    • There's lots of inter-linking of materials.
    • My examples are in danger of going out of date hours after I commit stuff to paper.

    And yet my publisher Apress and I are still producing a paperback book because we believe that people will want to read a lot of the narrative in book form, even if they are allowed to print the whole thing out on their own printers. I myself own a lot of computer books and use online books at the same time; they serve complementary purposes.

    Novels are different entities. A lot of the reasons I list for publishing computer books online don't apply to novels (unless you are writing nonlinear novels embedded in the web!). Paper is an amazing medium, especially for novels. I can imagine that I'd like a novel in electronic format if I wanted to interact with it in new ways (for instance, searching for certain texts, being able to annotate the novel and share those annotations with others, or participating in online discussions and communities around the novel), but I don't know of much work on that front. I'd love to hear of such work; I just don't know that field. You asked about keeping the traditional audience while gaining a broader audience receptive to e-reading: maybe excerpting your work and putting it out on a blog is one idea?