05.28.07
Posted in iSchool, notelets, publishing at 6:25 am by yee
Over the summer, I hope to take a closer look at all the wonderful work contained in the collection of Master's Final Projects: 2007.
I don't think that there is an official API for Google Reader, although Niall Kennedy documented an unofficial Google Reader API a while back.
Shaking up tech publishing (Loud Thinking) is an interesting thread on the economics and incentive structures behind writing a computer book (such as what I'm doing right now!)
Posted in Wordpress, mashups, notelets, open access, repositories, web hosting at 6:18 am by yee
My Dreamhost-hosted sites are down again: DreamHost Status » Blog Archive » Spacey filer issues. Time to move? But where to go?
If I want to add SSL access to any of the domains I host on dreamhost.com, I will need a unique IP address, which costs an extra $4/month. Some threads on this topic: Re: Unique ip?
Since I use Wordpress to display code, I'd dearly like to get the bug #3066 (backslash disappears in <pre>) fixed.
I'm glad to see the emergence of APIs in the scholarly/library realm: OpenDOAR - About OpenDOAR - Directory of Open Access Repositories and the corresponding OpenDOAR - Application Programmers' Interface (API)
I'd like to learn how to write a Firefox toolbar. Born Geek » Firefox Toolbar Tutorial is a tutorial that might help:
This tutorial explains how to create a toolbar extension for the Firefox web browser (specifically for version 1.5 and later). It provides an overview of how extensions are developed, the tools required to create an extension, and details on how toolbars are created. Please note that this tutorial is lengthy; I recommend spending time with it over the course of a few days (it makes for a good weekend read).
The online Barnes and Noble store (barnesandnoble.com) uses ISBN-13 in the links to books (e.g., RESTful Web Services), whereas Amazon.com uses ISBN-10. Something to keep in mind to get LibraryLookup to work for Barnes and Noble.
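A minimal sketch of how a bookmarklet or script might bridge the two forms. It assumes the standard check-digit rules (ISBN-10 uses a weighted sum modulo 11, ISBN-13 uses alternating weights of 1 and 3 modulo 10) and that only 978-prefixed ISBN-13s map back to an ISBN-10. The function names are my own, and the sample number is just an illustration, not anything taken from LibraryLookup itself.

```python
def _clean(isbn: str) -> str:
    """Drop hyphens and spaces so '0-596-52926-0' and '0596529260' match."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its 978-prefixed ISBN-13 form."""
    digits = _clean(isbn10)
    if len(digits) != 10:
        raise ValueError("expected a 10-character ISBN")
    # The old check digit is discarded; the first nine digits are kept.
    core = "978" + digits[:9]
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ...
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a 978-prefixed ISBN-13 back to ISBN-10."""
    digits = _clean(isbn13)
    if len(digits) != 13 or not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    core = digits[3:12]
    # ISBN-10 check digit: weights run 10 down to 2; a value of 10 is written 'X'.
    total = sum(int(ch) * (10 - i) for i, ch in enumerate(core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


if __name__ == "__main__":
    print(isbn10_to_isbn13("0596529260"))
    print(isbn13_to_isbn10("9780596529260"))
```

With something like this in place, a lookup tool could normalize whichever form a bookseller's URL carries before querying a library catalog.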
Because I really dig Python, I perk up with any mention of free (?) Plone hosting, such as Objectis - Objectis Community
05.23.07
Posted in digital scholarship, higher education, humanities, screen scraping at 12:51 pm by yee
I'm adding Digital History Hacks to my list of weblogs to follow on the strength of its author (William J. Turkel) being a historian working in "digital history" and writing about web spidering and scraping. To wit, Digital History Hacks: Teaching Young Historians to Search, Spider and Scrape:
To get the most out of the web, however, it is crucial that we begin to teach history students the rudiments of web programming. Spidering, for example, is the (automated) process of visiting a webpage, creating an index and a list of links to further pages, and then following each of those in turn and doing the same thing. Whenever we follow the citations in a footnote to another source, and then begin to read its footnotes, we are doing a kind of spidering. By teaching students how to implement this process on the computer we will not only teach them a crucial skill, we will make them more aware of the technologies that have long underlain the historian's craft. Scraping refers to the process of mechanically extracting information from sources (like webpages) that are intended to be read by people rather than machines. Because computers don't understand text in the way that people do, scraping has to rely on the form of the text to extract information, rather than the meaning. As a result, scrapers are 'brittle': if the form changes, the scraper breaks. For this reason, it is important for historians to be able to create their own tools, rather than using the tools created by others, and this, again, means that it is necessary to learn some rudimentary web programming.
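The spidering loop described in that passage (visit a page, record its links, then follow each link in turn) can be sketched in a few lines of Python. This is only an illustration under some assumptions: the "web" here is a hypothetical in-memory dictionary called FAKE_WEB rather than real HTTP requests, and the names LinkExtractor, extract_links, and spider are my own. Link extraction, the scraping half, leans on the standard library's html.parser, which keys off the form of the markup, exactly the brittleness the quote warns about.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Scrape the outgoing links from one page of HTML."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


def spider(start, fetch, max_pages=50):
    """Breadth-first crawl: returns {url: [links found on that page]}.

    `fetch` is any callable mapping a URL to HTML text, or None if missing.
    """
    seen = {start}
    queue = deque([start])
    index = {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # dead link: skip it
        links = extract_links(html)
        index[url] = links
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return index


# Hypothetical stand-in for the web, so the sketch runs without a network.
FAKE_WEB = {
    "/a": '<p>See <a href="/b">B</a> and <a href="/c">C</a>.</p>',
    "/b": '<p>Back to <a href="/a">A</a>.</p>',
    "/c": "<p>No links here.</p>",
}

if __name__ == "__main__":
    for page, links in spider("/a", FAKE_WEB.get).items():
        print(page, "->", links)
```

Swapping FAKE_WEB.get for a function that actually downloads pages would turn this into a real (if impolite) spider; a production version would also need to resolve relative URLs and respect robots.txt.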
05.17.07
Posted in bibliographics, consulting, mashups at 6:44 pm by yee
Ever since I left my job as a data architect to focus on writing my book on mashups, I've not had much to say publicly about data architecture, especially as it applies to higher education and the world of libraries. Often, my posts have been in response to specific pieces of news that arrive on my desk in the course of my job. Now, since I have fewer immediate matters to which to react, I've been relatively inactive on this blog.
However, I do think a lot about some perhaps mundane problems that I face as I write my book, barriers that make it difficult to do research and to write up that research and present it on the Web. An example: even though I cite sources in my book, I've not figured out the best way to integrate Zotero (a bibliographic reference manager) into the writing process. I'm a tad embarrassed to admit that I've been formatting references by hand, even though I have a pretty good understanding of bibliographic reference managers and their potential benefits. (I used BibTeX in my Ph.D. dissertation.) How do I manage references that are scattered throughout my digital universe: in my social bookmarks, in my Word documents, in my wiki and blogs, and so on?
At any rate, please expect sporadic updates over the next months. Most of my blogging around my professional work will be happening on mashupguide.net. I will, however, write about ideas that come to me as I start to build up my consulting business around the use of XML, web services, and mashup-type thinking.
04.26.07
Posted in libraries, services at 3:49 pm by yee
Just to express some agreement with Peter Brantley when he wrote in PB keynote at DLF Forum:
I do not think it is the place of libraries to build applications that directly permit the sciences' domain consumption of content, but I do believe that libraries should develop services that allow our content riches to be discovered, manipulated, and recombined. I think, in other words, that we need to go up the stack, beyond the content, a bit more than we have in the past.
Without such services, how can we effectively interact with library materials, of which more and more are coming in digital form?
04.24.07
Posted in Citizendium, OCLC, Uncategorized, libraries, notelets at 5:08 pm by yee
I wish I could attend the Educause Western Regional Conference happening the week after next in SF, whose speaker list includes a number of folks I know personally.
It's great to see more library catalogs with APIs, such as those documented in REST output from Huddersfield's catalogue.
Congratulations to Roy Tennant on his new position at OCLC:
With OCLC I have an incredible opportunity to be active on a broader stage. OCLC is big enough to put libraries on the Internet map in a way that none of us could achieve alone. Open WorldCat is but one example of many. I will be working as a Senior Program Manager with the RLG Programs unit of OCLC Research and Programs. I will report to Jim Michalko, who in turn reports to Lorcan Dempsey. I have met virtually all of the top management team at OCLC and I've been very impressed. They know where things are heading and they're determined to position libraries in a way that will do us the most good.
It's a big loss for CDL — but I'm looking forward to seeing Roy's influence at work on the larger playing field of OCLC.
I unintentionally deleted all my cookies in Firefox. Argh. The interface should have warned me that I was deleting all my cookies and not just the one I had highlighted. Deleting cookies should be prompted!
The Citizendium editorial council email list is archived on the web — e.g., The Cz-editcouncil April 2007 Archive by thread
Any reason to use Shelfari instead of LibraryThing?
Posted in APIs, Citizendium at 10:48 am by yee
I'll have to write more soon about Citizendium, which is
an experimental new wiki project. The project, started by a founder of Wikipedia, aims to improve on that model by adding "gentle expert oversight" and requiring contributors to use their real names.
As a member of the Citizendium Editorial Council, I've not yet had much time to contribute to the Citizendium but hope to have more once I'm over the big push right now on my book. My interests around Citizendium are as much about the technological framework as the actual content of the system. For instance, I hope to help shape technical standards and APIs at the Citizendium to allow for better reuse of its content. (See the documentation for the API available at the Wikipedia that allows external programs to access the Wikipedia and recombine and reuse content.)
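To give a feel for what such an API looks like from the outside, here is a small sketch that only builds a request URL for fetching a page's wikitext. It assumes the MediaWiki api.php endpoint and its action=query, prop=revisions, rvprop=content, and format=json parameters; the function name and the endpoint default are my own choices, and whether Citizendium would expose the same interface is purely an assumption.

```python
from urllib.parse import urlencode

# Assumed endpoint for the English Wikipedia's MediaWiki API.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_page_query(title, endpoint=API_ENDPOINT):
    """Return a URL asking the API for the latest revision text of one page."""
    params = {
        "action": "query",     # read-only query module
        "prop": "revisions",   # ask for revision data...
        "rvprop": "content",   # ...specifically the page content
        "titles": title,
        "format": "json",      # machine-readable response
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_page_query("Mashup (web application hybrid)"))
```

An external program would fetch that URL and parse the JSON response, at which point the article text is available for recombination and reuse.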
04.18.07
Posted in mashups, weblogging, writing at 5:13 pm by yee
As a "guest expert" last week in Writing for Digital Media, an online course at Chatham College Online, I had a lot of fun interacting with students in back and forth writing. I thought that I'd capture (in a slightly edited form), what I wrote. (It's probably even more interesting to write down what the students asked me, but for now, I'll record one side of a conversation.) Here goes:
In my introduction I wrote:
I'm currently writing a book on "web mashups." If you look up mashups on the Wikipedia (http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)), you'll find the following definition: "A mashup is a website or application that combines content from more than one source into an integrated experience." I also teach a course about mashups at the Information School at UC Berkeley. Why should you care about web mashups? With mashups, you are able to easily recombine content and applications to make powerful new creations without much work. We can talk more about mashups if you'd like. Perhaps you'd be interested in discussing the topic of electronic publishing. I've been a weblogger for over 7 years and plan to publish my book on my website in a variety of forms: PDF, Microsoft Word, HTML, a Wiki.
How did I get started with writing on the Web?
I got interested in writing for the web when I laid eyes on Dave Winer's Scripting News in early 2000. I remember wondering what it was that I was looking at. It had a personal voice, sometimes obnoxious but mostly engaging. Dave W. was also pioneering an online writing environment (Manila) that involved one-click editing (that is, you see the page and you can hit the edit button). That close feedback loop between reading and writing was addictive. Moreover, the fact that the online writing environment was programmable hooked me for life. (I know I'm writing like a true geek here; I don't know whether you find a programmable writing environment cool, but I do!) I should say that I love blogging because writing helps to clarify my own thinking. There's nothing like having to explain stuff to others to work out what's really going on in my head. I also want to give back to the Web. I've learned so much from the Web that it's a great experience to share what I know online.
In terms of starting out, in many ways, I'm not the best person to give advice. I say this because I can't say that I'm a model blogger. I write sporadically. I probably try to write to too varied an audience. However, I will recommend two articles written by my colleague Chris Ashley that convey some of the spirit of the new writing world: http://istpub.berkeley.edu:4201/bcc/Fall2001/feat.weblogging.html and http://istpub.berkeley.edu:4201/bcc/Winter2002/feat.weblogging2.html
Chris has done an amazing job of actively and continually writing. See, for instance, http://chrisashley.net/weblog/
When asked for more details about mashups, specifically how difficult it is to write them and what are some specific examples, I answered:
Mashups are becoming easier for non-programmers to create, and the term mashup applies whether the new combination of content is created by a programmer or a non-programmer. A good example to look at is http://housingmaps.com, which is a cross between craigslist and Google Maps. That is, you can look at real estate listings from craigslist on a map. Note that housingmaps.com was created by neither Google nor craigslist but by Paul Rademacher (http://www.technologyreview.com/tr35/Profile.aspx?Cand=T&TRID=437). When Paul R. made housingmaps.com in 2005, it was a really creative act. Google made it easier by releasing an API (an application programming interface) at http://www.google.com/apis/maps/; you can follow the instructions, and it's not super hard, but it does help to be techie. Later, people started building tools (such as http://mapbuilder.net) to help people make maps without any programming background. Finally, Google decided to add features for making maps back into its own product (the new "My Maps" feature in http://maps.google.com). Still, at this point the most powerful mashups need programming skill to create.
http://www.programmableweb.com/popular is a good list of mashups. The easiest ones to understand are map-based ones. Chicagocrimes.org is another one in that genre.
I was asked to clarify the difference between mashups and a Google search:
In most cases, web mashing is about making a web site that pulls data from different sources together. When you do a Google search, you find things that Google has brought together under a search term. But you are not really joining them together as in a web mashup.
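To make the "joining" concrete, here is a toy sketch in the spirit of housingmaps.com: one source supplies listings, another supplies coordinates, and the mashup is the join between them. Everything in it is hypothetical (the sample listings, the addresses, the coordinate table, and the function name mash); a real mashup would pull the listings from a feed and the coordinates from a geocoding service.

```python
# Hypothetical source 1: housing listings, as they might come from a feed.
LISTINGS = [
    {"title": "Sunny 1BR", "address": "100 Example St", "rent": 1500},
    {"title": "Studio near campus", "address": "200 Sample Ave", "rent": 1100},
    {"title": "Unmapped loft", "address": "999 Nowhere Rd", "rent": 2000},
]

# Hypothetical source 2: address -> (latitude, longitude), standing in for a geocoder.
GEOCODES = {
    "100 Example St": (37.87, -122.27),
    "200 Sample Ave": (37.86, -122.26),
}


def mash(listings, geocodes):
    """Join listings to coordinates, producing map-marker records."""
    markers = []
    for item in listings:
        coords = geocodes.get(item["address"])
        if coords is None:
            continue  # no location known, so nothing to put on the map
        lat, lng = coords
        label = f'{item["title"]} (${item["rent"]}/month)'
        markers.append({"lat": lat, "lng": lng, "label": label})
    return markers


if __name__ == "__main__":
    for marker in mash(LISTINGS, GEOCODES):
        print(marker)
```

A search engine would hand back the listings and the map as separate results; the mashup's contribution is the combined record that neither source holds on its own.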
My advice on how to get started with writing on the Web:
A good start in writing for the Web is to find other people who are writing to the same audience and start engaging those folks in dialog. Link to those people. Comment on their work. There's a good chance that they will link back if you are constructively engaging. Reflect on the websites (especially weblogs) that you currently like and read. What is it about them that you like? Is the implicit or explicit audience that you are trying to reach similar to those sites?
How new of an operating system is needed?
What's an integrated experience in a mashup?
An integrated experience can be either transitory or durable. I often write programs that are "throw-away", meaning I made them to do this one act of integration and then I use the product and don't really keep the program around. Other times, I want to create something that lasts and that can be used by others.
How about copyright?
In terms of copyright, which is a broad and complex topic (and one in which I have little expertise: IANAL, "I am not a lawyer"), what specifically are you interested in? For DRM, take a look at http://en.wikipedia.org/wiki/Digital_Rights_Management: "Digital Rights Management (DRM) is an umbrella term referring to technologies used by publishers or copyright owners to control access to or usage of digital data or hardware, and to restrictions associated with a specific instance of a digital work or device." (e.g., copy protection on DVDs or on iTunes songs). Yes, copyright is always a concern. But there are important provisions of fair use to keep in mind (http://en.wikipedia.org/wiki/Fair_use).
Can you do web production all alone or should you work with others?
In terms of collaboration: I would have to agree that there are very, very few people who are good at all the skills needed. I will have to say that I've found it difficult to pull together a team with all the skills at a very deep level. But often, you have to make do with good enough and move on.
Do you find other people's comments useful on your blog?
I do find people's comments on my blog as I think aloud helpful. All my new blogs have comments turned on. Unfortunately, I have a bunch of older blogs that I have yet to upgrade — and I turned the comments off there because the spammers were overrunning the blogs!
When to write for online? How about online novels?
You have to consider the genre of what you want to publish in deciding whether to go for electronic (or specifically, web-based) publishing. My book is about web mashups, a subject that is naturally tied to the web. Let me list some reasons why I will publish my book on the web in a variety of forms:
- A lot of the references for my book are URLs.
- I have a lot of examples which are displayed on the web.
- My book comes in chunks that can be roughly aligned with a web page.
- There's lots of inter-linking of materials.
- My examples are in danger of going out of date hours after I commit stuff to paper.
And yet my publisher Apress and I are still producing a paperback book because we believe that people will want to read a lot of the narrative in book form, even if they are allowed to print the whole thing out on their own printers. I myself own a lot of computer books and use online books at the same time; they serve complementary purposes.
Novels are different entities. A lot of the reasons I list for publishing computer books online don't apply to novels (unless you are writing nonlinear novels embedded in the web!). Paper is an amazing medium, especially for novels. I can imagine that I'd like a novel in electronic format if I wanted to interact with it in new ways (for instance, searching for certain texts, being able to annotate the novel and share those annotations with others, or participating in online discussions and communities around the novel), but I don't know of much work on that front. I'd love to hear of such work; I just don't know that field. You asked about keeping the traditional audience while gaining a broader audience receptive to e-reading: maybe excerpting your work and putting it out on a blog is one idea?
04.03.07
Posted in personal news at 7:21 am by yee
April 17 is my last day as a data architect in IST-Data Services. I'm stepping down to make more time to finish my book Pro Web 2.0 Mashups: Remixing Data and Web Services, which will be published this fall. I'm sad to leave the close working relationships that I've developed over the 8-1/2 years I've been with IST. I've learned a tremendous amount during that time. I would not be writing about mashups today had it not been for the opportunity to work on the Scholar's Box in the basement of Barrows Hall many years ago. My hope is that I'll be able to stay in touch with my friends and colleagues.
Once the book is completed, I'll turn to establishing a consultancy focused on how universities, libraries, museums, and for-profit entities can best deploy remixing and mashup technologies and methodologies. Please contact me if you are interested in working with me on that front!
Posted in REST, SOAP, web services at 5:12 am by yee
This afternoon, some of us IST architects are meeting to take up the "SOAP vs REST" question. Some resources that have been put forth as supporting references are:
I found Don Box's Pragmatics helpful — short and insightful.