Archive for May, 2007

Google Developer Day

With too many things on my plate right now, I decided not to attend next week’s Google Developer Day 2007 - Mountain View. One can, however, follow the sessions as they are broadcast: Google Developer Day 2007 - Mountain View - Sessions. Given what I just read in the Times about Google’s intense recruiting efforts, I wouldn’t doubt that many (Google employees and would-be employees) will show up in hopes of a bright future for Google.

Notelets: I School Master’s Projects and why write a computer book?

Over the summer, I hope to take a closer look at all the wonderful work contained in the collection of Master’s Final Projects: 2007.

I don’t think that there is an official API for Google Reader, although Niall Kennedy documented an unofficial Google Reader API a while back.

Shaking up tech publishing (Loud Thinking) is an interesting thread on the economics and incentive structures behind writing a computer book (which is what I’m doing right now!).

Notelets: hosting, Wordpress, open access repositories, Firefox, LibraryLookup

My Dreamhost-hosted sites are down again: DreamHost Status » Blog Archive » Spacey filer issues. Time to move? But where to go?

If I want to add SSL access to any of the domains I host on dreamhost.com, I will need a unique IP address, which costs an extra $4/month. Some threads on this topic: Re: Unique ip?

Since I use Wordpress to display code, I’d dearly like to get bug #3066 (backslash disappears in <pre>) fixed.

I’m glad to see the emergence of APIs in the scholarly/library realm: OpenDOAR - About OpenDOAR - Directory of Open Access Repositories and the corresponding OpenDOAR - Application Programmers’ Interface (API)

I’d like to learn how to write a Firefox toolbar. Born Geek » Firefox Toolbar Tutorial is a tutorial that might help:

    This tutorial explains how to create a toolbar extension for the Firefox web browser (specifically for version 1.5 and later). It provides an overview of how extensions are developed, the tools required to create an extension, and details on how toolbars are created. Please note that this tutorial is lengthy; I recommend spending time with it over the course of a few days (it makes for a good weekend read).

The online Barnes and Noble store (barnesandnoble.com) uses ISBN-13 in the links to books (e.g., RESTful Web Services). Amazon.com uses ISBN-10. Something to keep in mind to get LibraryLookup to work for Barnes and Noble.
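
Here is a minimal sketch of the conversion that would be involved, assuming the standard ISBN check-digit rules. The function names are my own, chosen for illustration; this is not LibraryLookup's actual code. Note that only ISBN-13s with the 978 prefix have an ISBN-10 equivalent, and that the sketch does not verify the ISBN-13's own check digit.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check character for the first nine digits."""
    # Weights run from 10 down to 2 across the nine digits.
    total = sum((10 - i) * int(d) for i, d in enumerate(first_nine))
    check = (11 - total % 11) % 11
    # A check value of 10 is written as the letter X.
    return "X" if check == 10 else str(check)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a 978-prefixed ISBN-13 to its ISBN-10 form."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected 13 digits, got %r" % isbn13)
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s map to an ISBN-10")
    # Drop the 978 prefix and the old check digit, then recompute.
    core = digits[3:12]
    return core + isbn10_check_digit(core)
```

A bookmarklet in the LibraryLookup style would presumably pull the ISBN-13 out of the Barnes and Noble URL, convert it along these lines, and hand the result to a catalog that expects ISBN-10.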

Because I really dig Python, I perk up at any mention of free (?) Plone hosting, such as Objectis - Objectis Community.

Cool to see a digital historian explain screen-scraping

I’m adding Digital History Hacks to my list of weblogs to follow on the strength of its author (William J. Turkel) being a historian working in “digital history” and writing about web spidering and scraping. To wit, Digital History Hacks: Teaching Young Historians to Search, Spider and Scrape:

    To get the most out of the web, however, it is crucial that we begin to teach history students the rudiments of web programming. Spidering, for example, is the (automated) process of visiting a webpage, creating an index and a list of links to further pages, and then following each of those in turn and doing the same thing. Whenever we follow the citations in a footnote to another source, and then begin to read its footnotes, we are doing a kind of spidering. By teaching students how to implement this process on the computer we will not only teach them a crucial skill, we will make them more aware of the technologies that have long underlain the historian’s craft. Scraping refers to the process of mechanically extracting information from sources (like webpages) that are intended to be read by people rather than machines. Because computers don’t understand text in the way that people do, scraping has to rely on the form of the text to extract information, rather than the meaning. As a result, scrapers are ‘brittle’: if the form changes, the scraper breaks. For this reason, it is important for historians to be able to create their own tools, rather than using the tools created by others, and this, again, means that it is necessary to learn some rudimentary web programming.
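
The spidering loop described in that passage (visit a page, index it, collect its links, follow each in turn) can be sketched in a few lines of Python. This is my own illustration, not code from Digital History Hacks: the names `LinkAndTextParser`, `spider`, and `PAGES` are made up, and the "web" here is an in-memory dictionary of two hypothetical pages so that the example runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collect the link targets and the lowercased words found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(data.lower().split())


def spider(start_url, fetch, max_pages=10):
    """Breadth-first crawl; `fetch` maps a URL to HTML, or None if missing."""
    index = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # dead link: skip it and move on
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        index[url] = parser.words
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return index


# Two hypothetical pages standing in for the real web.
PAGES = {
    "http://example.org/a": '<p>History hacks</p><a href="/b">next</a>',
    "http://example.org/b": '<p>Spider and scrape</p>'
                            '<a href="/a">back</a><a href="/missing">gone</a>',
}

index = spider("http://example.org/a", PAGES.get)
```

The parser also shows why scrapers are brittle: it relies entirely on the form of the markup (an `a` tag with an `href` attribute), so a page that expresses its links some other way would yield nothing.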

A data architect on hiatus

Ever since I left my job as a data architect to focus on writing my book on mashups, I’ve not had much to say publicly about data architecture, especially as it applies to higher education and the world of libraries. Often, my posts have been in response to specific pieces of news that arrive on my desk in the course of my job. Now, since I have fewer immediate matters to which to react, I’ve been relatively inactive on this blog.

However, I do think a lot about some perhaps mundane problems that I face as I write my book, barriers that make it difficult to do research and to write up that research and present it on the Web. An example: even though I cite sources in my book, I’ve not figured out the best way to integrate Zotero (a bibliographic reference manager) into the writing process. I’m a tad embarrassed to admit that I’ve been formatting references by hand, even though I have a pretty good understanding of bibliographic reference managers and their potential benefits. (I used BibTeX in my Ph.D. dissertation.) How do I manage references that are scattered throughout my digital universe: in my social bookmarks, in my Word documents, in my wiki and blogs, and so on?

At any rate, please expect sporadic updates over the next months. Most of my blogging around my professional work will be happening on mashupguide.net. I will, however, write about ideas that come to me as I start to build up my consulting business around the use of XML, web services, and mashup-type thinking.