
Notelets: Educause, library catalogue APIs, Roy Tennant moving to OCLC, Citizendium API

I wish I could attend the Educause Western Regional Conference happening the week after next in SF, whose speaker list includes a number of folks I know personally.

It’s great to see more library catalogs with APIs, such as those documented in REST output from Huddersfield’s catalogue.

Congratulations to Roy Tennant on his new position at OCLC:

    With OCLC I have an incredible opportunity to be active on a broader stage. OCLC is big enough to put libraries on the Internet map in a way that none of us could achieve alone. Open WorldCat is but one example of many. I will be working as a Senior Program Manager with the RLG Programs unit of OCLC Research and Programs. I will report to Jim Michalko, who in turn reports to Lorcan Dempsey. I have met virtually all of the top management team at OCLC and I’ve been very impressed. They know where things are heading and they’re determined to position libraries in a way that will do us the most good.

It’s a big loss for CDL — but I’m looking forward to seeing Roy’s influence at work on the larger playing field of OCLC.

I unintentionally deleted all my cookies in Firefox. Argh. The interface should have warned me that I was deleting all my cookies and not just the one I had highlighted.

The Citizendium editorial council email list is archived on the web, e.g., The Cz-editcouncil April 2007 Archive by thread.

Any reason to use Shelfari instead of LibraryThing?

Citizendium and APIs

I’ll have to write more soon about Citizendium, which is

    an experimental new wiki project. The project, started by a founder of Wikipedia, aims to improve on that model by adding “gentle expert oversight” and requiring contributors to use their real names.

As a member of the Citizendium Editorial Council, I’ve not yet had much time at all to contribute to the Citizendium but hope to have more time once I’m over the current big push on my book. My interests around Citizendium are as much about the technological framework as the actual content of the system. For instance, I hope to help shape technical standards and APIs at the Citizendium to allow for better reuse of its content. (See the documentation for the API available at the Wikipedia that allows external programs to access the Wikipedia and recombine and reuse content.)
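To make the idea of programmatic reuse a little more concrete, here is a minimal Python sketch of how an external program might ask a MediaWiki-style API for the current text of a page. The `api.php` endpoint and the `action=query`, `prop=revisions`, and `rvprop=content` parameters belong to the MediaWiki API; the helper names `build_query_url` and `extract_wikitext` are made up for illustration, and `SAMPLE_RESPONSE` is a hand-written stand-in that assumes the API's older JSON layout, not real output.

```python
import json
from urllib.parse import urlencode

# Wikipedia's API endpoint; a Citizendium equivalent is an assumption, not a given.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, endpoint=API_ENDPOINT):
    """Return a URL asking the API for the latest revision text of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


def extract_wikitext(response_text):
    """Map page title to wikitext, assuming the older JSON layout
    (pages keyed by page id, revision text stored under "*")."""
    data = json.loads(response_text)
    pages = data["query"]["pages"]
    return {
        page["title"]: page["revisions"][0]["*"]
        for page in pages.values()
        if "revisions" in page
    }


# Hand-written stand-in for a response (hypothetical page id and text).
SAMPLE_RESPONSE = json.dumps({
    "query": {
        "pages": {
            "123": {
                "pageid": 123,
                "title": "Mashup",
                "revisions": [{"*": "A '''mashup''' combines content from more than one source."}],
            }
        }
    }
})

if __name__ == "__main__":
    print(build_query_url("Mashup (web application hybrid)"))
    print(extract_wikitext(SAMPLE_RESPONSE))
```

The sketch deliberately stops short of making a network call: fetching the URL and handing the body to `extract_wikitext` is the only missing step, and the parsed text can then be recombined with other sources.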

My “guest expertise” on “Writing for Digital Media”

As a “guest expert” last week in Writing for Digital Media, an online course at Chatham College Online, I had a lot of fun interacting with students in back-and-forth writing. I thought that I’d capture, in a slightly edited form, what I wrote. (It’s probably even more interesting to write down what the students asked me, but for now, I’ll record one side of the conversation.) Here goes:

In my introduction I wrote:

    I’m currently writing a book on “web mashups.” If you look up mashups on the Wikipedia (http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)), you’ll find the following definition: “A mashup is a website or application that combines content from more than one source into an integrated experience.” I also teach a course about mashups at the Information School at UC Berkeley. Why should you care about web mashups? With mashups, you are able to easily recombine content and applications to make powerful new creations without much work. We can talk more about mashups if you’d like. Perhaps you’d be interested in discussing the topic of electronic publishing. I’ve been a weblogger for over 7 years and plan to publish my book on my website in a variety of forms: PDF, Microsoft Word, HTML, a Wiki.

How did I get started with writing on the Web?

    I got interested in writing for the web when I laid eyes on Dave Winer’s Scripting News in early 2000. I remember wondering what it was that I was looking at. It had a personal voice, sometimes obnoxious but mostly engaging. Dave W. was also pioneering an online writing environment (Manila) that involved one-click editing (that is, you see the page and you can hit the edit button). That close feedback loop between reading and writing was addictive. Moreover, the fact that the online writing environment was programmable hooked me for life. (I know I’m writing like a true geek here. I don’t know whether you find a programmable writing environment cool, but I do!) I should say that I love blogging because writing helps to clarify my own thinking. There’s nothing like having to explain stuff to others to work out what’s really going on in my head. I also want to give back to the Web. I’ve learned so much from the Web that it’s a great experience to share what I know online.

    In terms of starting out, in many ways, I’m not the best person to give advice. I say this because I can’t say that I’m a model blogger. I write sporadically. I probably try to write to too varied an audience. However, I will recommend two articles written by my colleague Chris Ashley that convey some of the spirit of the new writing world: http://istpub.berkeley.edu:4201/bcc/Fall2001/feat.weblogging.html and http://istpub.berkeley.edu:4201/bcc/Winter2002/feat.weblogging2.html

    Chris has done an amazing job of actively and continually writing. See, for instance, http://chrisashley.net/weblog/

When asked for more details about mashups, specifically how difficult it is to write them and what are some specific examples, I answered:

    Mashups are becoming easier for non-programmers to create, and the term “mashup” applies whether the new combination of content is created by a programmer or a non-programmer. A good example to look at is http://housingmaps.com, which is a cross between craigslist and Google Maps. That is, you can look at real estate listings from craigslist on a map. Note that housingmaps.com was created by neither Google nor craigslist but by Paul Rademacher (http://www.technologyreview.com/tr35/Profile.aspx?Cand=T&TRID=437). When Paul R. made housingmaps.com in 2005, it was a really creative act. Google made it easier by releasing an API (an application programming interface) at http://www.google.com/apis/maps/. You can follow the instructions, and it’s not super hard, but it does help to be techie. Later, people started building tools (such as http://mapbuilder.net) to help people make maps without any programming background. Finally, Google decided to add features for making maps back into its own product (the new “My Maps” feature in http://maps.google.com). Still, the most powerful mashups will need programming skill to create at this point.

    http://www.programmableweb.com/popular is a good list of mashups. The easiest ones to understand are map-based ones. Chicagocrimes.org is another one in that genre.

I was asked to clarify the difference between mashups and a Google search:

    In most cases, web mashing is about making a web site that pulls data from different sources together. When you do a Google search, you find things that Google has brought together under a search term. But you are not really joining them together as in a web mashup.
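A toy sketch of that “joining together,” in the spirit of a listings-on-a-map mashup: one source supplies rental listings, a second supplies coordinates, and the mashup is the join between them. Everything here is hypothetical (the listings, the addresses, the coordinates, and the `mash_up` function), and a real mashup would fetch both sources over the web rather than hard-code them.

```python
# Source 1: rental listings, as they might come from a classifieds feed (made-up data).
LISTINGS = [
    {"title": "Sunny 1BR near campus", "price": 1450, "address": "2100 Durant Ave"},
    {"title": "Studio with bay view", "price": 1200, "address": "1500 Shattuck Ave"},
    {"title": "2BR cottage", "price": 2100, "address": "99 Unknown Rd"},
]

# Source 2: a geocoder's table from address to (latitude, longitude) (made-up data).
GEOCODES = {
    "2100 Durant Ave": (37.8670, -122.2660),
    "1500 Shattuck Ave": (37.8800, -122.2690),
}


def mash_up(listings, geocodes):
    """Join listings to coordinates; skip listings the geocoder cannot place."""
    markers = []
    for listing in listings:
        coords = geocodes.get(listing["address"])
        if coords is None:
            continue  # no coordinates, so nothing to put on the map
        lat, lng = coords
        markers.append({
            "lat": lat,
            "lng": lng,
            "label": f'{listing["title"]} (${listing["price"]}/mo)',
        })
    return markers


if __name__ == "__main__":
    for marker in mash_up(LISTINGS, GEOCODES):
        print(marker)
```

Neither source knows about the other; the new thing is the combined list of map markers, which is what a search-results page does not give you.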

My advice on how to get started with writing on the Web:

    A good start in writing for the Web is to find other people who are writing to the same audience and start engaging those folks in dialog. Link to those people. Comment on their work. There’s a good chance that they will link back if you are constructively engaging. Reflect on the websites (especially weblogs) that you currently like and read. What is it about them that you like? Is the implicit or explicit audience that you are trying to reach similar to those sites?

How new of an operating system is needed?

What’s an integrated experience in a mashup?

    An integrated experience can be either transitory or durable. I often write programs that are “throw-away”, meaning I made them to do this one act of integration and then I use the product and don’t really keep the program around. Other times, I want to create something that lasts and that can be used by others.

How about copyright?

    Copyright is a broad and complex topic, and one in which I have little expertise (IANAL: “I am not a lawyer”). What specifically are you interested in? For DRM, take a look at http://en.wikipedia.org/wiki/Digital_Rights_Management: “Digital Rights Management (DRM) is an umbrella term referring to technologies used by publishers or copyright owners to control access to or usage of digital data or hardware, and to restrictions associated with a specific instance of a digital work or device” (e.g., copy protection on DVDs or on iTunes songs). Yes, copyright is always a concern. But there are important provisions of fair use to keep in mind (http://en.wikipedia.org/wiki/Fair_use).

Can you do web production all alone or should you work with others?

    In terms of collaboration: I would have to agree that there are very, very few people who are good at all the skills needed. I will have to say that I’ve found it difficult to pull together a team with all the skills at a very deep level. But often, you have to make do with good enough and move on.

Do you find other people’s comments useful on your blog?

    I do find people’s comments on my blog helpful as I think aloud. All my new blogs have comments turned on. Unfortunately, I have a bunch of older blogs that I have yet to upgrade, and I turned the comments off there because the spammers were overrunning the blogs!

When to write for online? How about online novels?

    You have to consider the genre of what you want to publish in deciding whether to go for electronic (or specifically, web-based) publishing. My book is about web mashups, a subject that is naturally tied to the web. Let me list some reasons why I will publish my book on the web in a variety of forms:

    • A lot of the references for my book are URLs.
    • I have a lot of examples which are displayed on the web.
    • My book comes in chunks that can be roughly aligned with a web page.
    • There’s lots of inter-linking of materials.
    • My examples are in danger of going out of date hours after I commit stuff to paper.

    And yet my publisher Apress and I are still producing a paperback book because we believe that people will want to read a lot of the narrative in book form, even if they are allowed to print the whole thing out on their own printers. I myself own a lot of computer books and use online books at the same time; they serve complementary purposes.

    Novels are different entities. A lot of the reasons I list for publishing computer books online don’t apply to novels (unless you are writing nonlinear novels embedded in the web!). Paper is an amazing medium, especially for novels. I can imagine that I’d like a novel in electronic format if I wanted to interact with it in new ways (for instance, searching for certain passages, being able to annotate the novel and share those annotations with others, or participating in online discussions and communities around the novel), but I don’t know of much work on that front. I’d love to hear of such work; I just don’t know that field. You asked about keeping the traditional audience while gaining a broader audience receptive to e-reading: maybe excerpting your work and putting it out on a blog is one idea?

Leaving IST to work on my book

April 17 is my last day as a data architect in IST-Data Services. I’m stepping down to make more time to finish my book Pro Web 2.0 Mashups: Remixing Data and Web Services, which will be published this fall. I’m sad to leave the close working relationships that I’ve developed over the 8-1/2 years I’ve been with IST. I’ve learned a tremendous amount during that time. I would not be writing about mashups today had it not been for the opportunity to work on the Scholar’s Box in the basement of Barrows Hall many years ago. My hope is that I’ll be able to stay in touch with my friends and colleagues.

Once the book is completed, I’ll turn to establishing a consultancy focused on how universities, libraries, museums, and for-profit entities can best deploy remixing and mashup technologies and methodologies. Please contact me if you are interested in working with me on that front!

REST vs SOAP or is that SOAP vs REST

This afternoon, some of us IST architects are meeting to take up the “SOAP vs REST” question. Some resources that have been put forth as supporting references are:

I found Don Box’s Pragmatics helpful — short and insightful.

Shun multitasking if you want to be productive!

Slow Down, Brave Multitasker, and Don’t Read This in Traffic – New York Times is making me think twice about having my email window open on my desktop most of the time:

    In a recent study, a group of Microsoft workers took, on average, 15 minutes to return to serious mental tasks, like writing reports or computer code, after responding to incoming e-mail or instant messages. They strayed off to reply to other messages or browse news, sports or entertainment Web sites.

Building the Berkeley Technology Platform: A Proposal

The single greatest challenge for UC Berkeley is retaining its pre-eminence as a world-famous university in the face of not only such traditional competitors as Stanford and Harvard but also the myriad distributed groups of individuals and organizations that use the Web to produce and disseminate information. A big lesson of Web 2.0 is the incredible amount of knowledge and skill available to be harvested and distributed throughout the Berkeley community (our faculty, our students, our staff, our alumni) as well as the world beyond UC Berkeley. To meet that challenge through technology, I would put my focus on building a collaborative platform (both virtual and “in real life”) to enable all these people to contribute and work together. And because I do not know all the answers, I would encourage experimentation and invite many people to work with me.

Building services for faculty as researchers and teachers

We need to help our faculty apply computational techniques to their cutting-edge research. To that end, I suggest that we assemble teams that combine disciplinary and IT expertise; create a blend of centralized and discipline-specific computational infrastructure to support research and teaching; forge collaborations among IT organizations, libraries, and educational technologists to tackle institution-wide problems such as institutional repositories; and create packages of basic commodity hosting to support research and teaching.

Building a Berkeley Technology Platform (BTP) and an underlying SOA

This is a great time for UC Berkeley to develop an information technology architecture to support deep collaboration, specifically an SOA that will work for this context. Because there is little experience of deploying an SOA at the university, we can start with small pilot projects that emphasize the consumption of web services, followed by the deployment of a small set of web services. For example: a web service that gives the roster of a course and another web service that lists the courses a professor is currently teaching. I know that such web services would have an immediate audience. Once we gain experience with web services, we can look at building a larger framework for the deployment and consumption of web services in SOA fashion. At that point, I would advocate for the building of a Berkeley Technology Platform (BTP) that exploits XML and XML web services to create an underlying service-oriented architecture for the campus. By the BTP, I mean the equivalent of the Amazon technology platform: a set of services and infrastructure available both to internal programmers, to create web interfaces and access data, and to external audiences, to build complementary services on top of the ones provided by the platform. The BTP would be a rallying point for integration. Departments have data that can be reused by other departments. The Berkeley Technology Platform would provide an integrated framework for that data. Moreover, the BTP provides a way for internal and external audiences to come together. The Berkeley platform is an opportunity for collaboration around campus, certainly among application infrastructure and data architects within IST.
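As a rough illustration of the two pilot services mentioned above (a course roster and the list of courses a professor is teaching), here is a minimal Python sketch of what each might return as XML. None of this describes an existing campus system: the `ROSTERS` data, the course and person identifiers, the element names, and the two function names are all assumptions made for the example, and a real service would sit behind HTTP with authentication and read from a system of record.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory data standing in for a campus system of record.
ROSTERS = {
    "DEMO-101": {
        "title": "Intro to Remixing",
        "instructor": "instructor-1",
        "students": ["student-a", "student-b"],
    },
    "DEMO-201": {
        "title": "XML and Web Services",
        "instructor": "instructor-1",
        "students": ["student-c"],
    },
    "DEMO-301": {
        "title": "Mashup Studio",
        "instructor": "instructor-2",
        "students": [],
    },
}


def course_roster_xml(course_id):
    """Service 1: return the roster of one course as an XML string."""
    course = ROSTERS[course_id]
    root = ET.Element("roster", {"course": course_id})
    ET.SubElement(root, "title").text = course["title"]
    for student_id in course["students"]:
        ET.SubElement(root, "student", {"id": student_id})
    return ET.tostring(root, encoding="unicode")


def courses_taught_xml(instructor_id):
    """Service 2: return the courses one instructor is teaching as XML."""
    root = ET.Element("courses", {"instructor": instructor_id})
    for course_id, course in sorted(ROSTERS.items()):
        if course["instructor"] == instructor_id:
            ET.SubElement(root, "course", {"id": course_id}).text = course["title"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(course_roster_xml("DEMO-101"))
    print(courses_taught_xml("instructor-1"))
```

Keeping each service this small is the point of a pilot: two narrow, well-defined XML responses are easy to publish and easy for another department to consume, and they make a reasonable first test of how an SOA would work on campus.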

In developing the BTP, we should invite students to be active co-developers, to use our web services and show us what can be done with them. If we are doing things right, we will be surprised by how people use them. Several years ago, I hired a student who had made a name for himself by web-scraping the Berkeley course catalog system to create an alternative, and reportedly superior, interface, and I wanted to find more students like him. Ideally, we can build our systems so that students do not have to web-scrape them but have an API to access the data and wrap their own interface around it. Moreover, from teaching my own course “Mixing and Remixing Information,” I know that students who have very few computer skills are capable of building reasonably elaborate systems that bring together disparate elements. There is a lot of talent among students to be tapped.
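To show why an API is kinder to student developers than scraping, here is a small Python sketch of the consuming side: a student's program reads structured XML and wraps its own interface around it, with no guessing at HTML layout. The XML shape, the course identifiers, and the function names `parse_catalog` and `render_listing` are all hypothetical; they stand in for whatever a real campus catalog service might expose.

```python
import xml.etree.ElementTree as ET

# Made-up response from a hypothetical course catalog web service.
SAMPLE_CATALOG_XML = """
<catalog term="Fall">
  <course id="DEMO-101" units="3"><title>Intro to Remixing</title></course>
  <course id="DEMO-201" units="4"><title>XML and Web Services</title></course>
  <course id="DEMO-301" units="2"><title>Mashup Studio</title></course>
</catalog>
"""


def parse_catalog(xml_text):
    """Turn the service's XML into plain dictionaries a student can work with."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "id": course.get("id"),
            "units": int(course.get("units")),
            "title": course.findtext("title"),
        }
        for course in root.findall("course")
    ]


def render_listing(courses, min_units=0):
    """One possible alternative interface: a plain-text listing with a filter."""
    lines = [
        f'{c["id"]}: {c["title"]} ({c["units"]} units)'
        for c in courses
        if c["units"] >= min_units
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_listing(parse_catalog(SAMPLE_CATALOG_XML), min_units=3))
```

With a scraper, the parsing step breaks whenever the page design changes; with a documented XML response, the student's effort goes into `render_listing`, which is where the surprising new interfaces would come from.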

Building collaboration systems that combine the virtual with our physical co-location

The internet has shown a profound capability for connecting people around the world. I believe that UC Berkeley can better apply networked technologies to supporting collaboration right on campus, where tens of thousands of people are co-located. For example, might it be worthwhile to set up something equivalent to the Stanford Wiki at Berkeley?

Building structures for IT staff to learn from each other

We can do more to enable UC Berkeley IT staff to learn from each other. I would like to teach staff on campus a version of the School of Information course I teach on XML and web services. With the right opportunities to learn, mentor, and experiment, the staff will be inspired and empowered to create the elements we need in the BTP.

Large scale IT Trends Facing the University

I identify three trends in IT that will have a large impact on the university:

  • Increasingly inexpensive storage, network, and computation power for individuals. For $25/year, I am promised unlimited storage and bandwidth for all my photos by Flickr. I can upload all my videos to YouTube or Google Video for free. For $16/month, I have 400 GB of storage and 4 TB of monthly bandwidth from dreamhost.com. With this comparatively inexpensive infrastructure, I can create sophisticated web applications that fuse together a vast array of open source libraries and applications, as well as further storage (S3) and computation power (EC2) from amazon.com and numerous other providers.
  • The rise of peer production/mass collaboration in “Web 2.0”. In naming “You” (that is, all the many, typically nameless, individuals who participate on the Web) as Person of the Year, Time summarizes this trend in the following way: “In 2006, the World Wide Web became a tool for bringing together the small contributions of millions of people and making them matter.” It is easy to spot the plentiful junk emerging from Web 2.0, yet universities will find it increasingly difficult to dismiss the astounding richness of such entities as the Wikipedia and Flickr.
  • The continued deployment of XML web services. XML will continue to be used widely by organizations and, more recently, by individual users. Using service-oriented architectures, organizations/enterprises will re-factor their infrastructure in terms of reusable services that will be accessible through XML web services.

After first dismissing these technology trends as merely faddish, the university community will come to terms with them, taking advantage of their positive aspects and adapting them to the university environment while avoiding the negatives (which are very real, because of the difference in priorities between commercial enterprises and the university).

These technology trends will accentuate the computerization of research in academic disciplines. Some pioneers, especially those in disciplines that have a long history of computation, have already taken advantage of commodity hardware and built extensive computer-based collaborations. Many other researchers will be struggling to use the same technology. I argue that it is in the institution’s interests to help all of its members to work at some baseline level. Moreover, there will be challenges, such as the long-term archiving of data, that the university as a whole will have to tackle, creating a demand for architectures and policies to handle these common needs.

The availability of cheap hardware and storage outside the university presents an immediate challenge to the university. Many pioneering university members will be tempted to use those systems because of their low prices, even if these services are not quite optimized for users’ academic needs. Should people at the university be encouraged to use those outside services? Is there a way for the university to purchase those services and adapt them on behalf of the university community? What policies should be put in place concerning the use of outside services? I predict that the university will figure out a combination of industrial partnerships, system integration, and ways to help individuals cobble together the best solutions that will satisfy their research needs and also handle relevant policy issues.

The university community will have its own large collections of data and digital content to handle. Take, for example, the digitization of the UC library, which will result in a collection of millions of digitized books available to the university community. These data present incredible opportunities for education and research, ones that are best exploited if we work together as a community.

This is a great time for the university to develop an information technology architecture to handle these challenges, specifically an SOA that will work for this context.

UC Berkeley’s new Chief Technology Architect

Shel Waggener, the CIO of the campus, announced last week the appointment of the new CTA:

    I am pleased to announce the appointment of Dr. Hébert Díaz-Flores as the campus’s first Chief Technology Architect (CTA). Reporting to me as manager of the Technology Standards, Practices, and Architecture unit, Dr. Díaz-Flores will be the lead architect and evaluator in developing best-practices technology architecture and process assessments for the campus. He will work with Information Services and Technology and campus departments as a key stakeholder to develop and implement appropriate technology solutions.

Notelets for 2007.03.19

On my reading list: ALA Changing Roles of Academic and Research Libraries and Users and Uses of Bibliographic Data Meeting – Meetings – (Library of Congress)

On my viewing list: videos from code4lib 2007

Open Content:

    The searchable indexes below expose public domain ebooks, open access digital repositories, Wikipedia articles, and miscellaneous human-cataloged Internet resources.