<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Data Unbound</title>
	<atom:link href="http://blog.dataunbound.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.dataunbound.com</link>
	<description>Data Architect, Consultant, Trainer, and Author Raymond Yee on data and software in research and education</description>
	<pubDate>Thu, 12 Jun 2008 01:15:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
	<image>
  <link>http://blog.dataunbound.com</link>
  <url>http://blog.dataunbound.com/wp-content/plugins/favicon-manager/dataunbound.ico</url>
  <title>Data Unbound</title>
</image>
		<item>
		<title>Sorin Matei on Project Bamboo and the role of mashups</title>
		<link>http://blog.dataunbound.com/2008/06/11/sorin-matei-on-project-bamboo-and-the-role-of-mashups/</link>
		<comments>http://blog.dataunbound.com/2008/06/11/sorin-matei-on-project-bamboo-and-the-role-of-mashups/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 00:58:11 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[mashups]]></category>

		<category><![CDATA[Project Bamboo]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/?p=73</guid>
		<description><![CDATA[Project Bamboo has been on my list of stuff to write about for a while.  According to the Project Bamboo website:
 Bamboo is a multi-institutional,  interdisciplinary, and inter-organizational effort that brings together  researchers in arts and humanities, computer scientists, information  scientists, librarians, and campus information technologists to tackle  the question: [...]]]></description>
			<content:encoded><![CDATA[<p>Project Bamboo has been on my list of stuff to write about for a while.  According to the <a class="external" href="http://projectbamboo.org/">Project Bamboo website</a>:</p>
<ul> Bamboo is a multi-institutional,  interdisciplinary, and inter-organizational effort that brings together  researchers in arts and humanities, computer scientists, information  scientists, librarians, and campus information technologists to tackle  the question: How can we advance arts and humanities research through  the development of shared technology services?</ul>
<p>Not only is the project of intellectual interest to me (as someone  deeply interested in the issues of &#8220;shared technology services&#8221;) but  it is also of great personal interest (since I know quite a few of the  personnel involved with the project, including one of the co-project  directors, David Greenbaum, who used to be my boss). One particular  angle I hope to explore is this question: what are the  implications of Project Bamboo for Zotero, and vice versa?</p>
<p>The immediate prompt for this post is Sorin Matei&#8217;s <a class="external" href="http://matei.org/ithink/2008/06/06/the-bamboo-digital-humanities-initiative-a-modest-proposal/">The Bamboo Digital Humanities Initiative: A Modest Proposal</a>.  Matei&#8217;s post has been of sufficient interest to me that I am using it to <a class="external" href="http://thatcamp.org/2008/06/continuing-our-discussions-a-suggested-topic/">prompt some discussion in a community of humanists and technologists</a>.   Matei makes a lot of useful points, but the segment that caught my attention is:</p>
<ul> The role of the Bamboo platform would be to  simplify this task by making access to tools, by enhancing our ability  to connect digital objects and artifacts, our ability to connect with  colleagues and students via simple, directly intuitive and universally  available interfaces that <em>all converge on the scholars’ desktop, preferably in the format of a word processor.</em> [emphasis mine] Moreover, the platform should integrate in the most  straightforward manner the learning and writing processes with those  dedicated to publishing. This should be done in such a manner that  dedicated genres and modus operandi (articles, book monographs, peer  review, scientific validity checks, etc.) would survive, flourish even,  under the new digital regime.</ul>
<p>Amen. That&#8217;s an approach I&#8217;ve been pursuing for a while now (in the  Scholar&#8217;s Box, for example)&#8211; and one I think that Zotero, as a desktop  client, with some capacity <a class="external" href="http://www.zotero.org/extend/">for extensibility</a>, can embody rather deeply.</p>
<p>Matei goes on:</p>
<ul> I stop here, rather abruptly, waiting for  reactions. I am planning, however, to release a sketch of such a  platform, including essential services and affordances. It will also  try to leverage the idea of the mashup editor as basic architecture  strategy, which could be use to support the infrastructure of the  system.</ul>
<p>I&#8217;m naturally intrigued as someone <a class="external" href="http://blog.mashupguide.net/">focused on mashups</a> and interested in developing &#8220;Zotero as a mashup platform&#8221;. Has Sorin  Matei used Zotero? How would Zotero fit in with Matei&#8217;s sketch of such  a platform?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/06/11/sorin-matei-on-project-bamboo-and-the-role-of-mashups/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Notelets for 2005.05.20</title>
		<link>http://blog.dataunbound.com/2008/05/21/notelets-for-20050520/</link>
		<comments>http://blog.dataunbound.com/2008/05/21/notelets-for-20050520/#comments</comments>
		<pubDate>Wed, 21 May 2008 15:46:20 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[notelets]]></category>

		<category><![CDATA[codepad]]></category>

		<category><![CDATA[JCDL]]></category>

		<category><![CDATA[JCDL 2008]]></category>

		<category><![CDATA[workshops]]></category>

		<category><![CDATA[XML in libraries]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/?p=72</guid>
		<description><![CDATA[As I prepare my JCDL 2008 Tutorial (Creating and Enabling Data Mashups), might I make use of Eric Lease Morgan&#8217;s XML in libraries: A workshop?
Of note is an upcoming workshop aimed at libraries &#8212; CARL-IT  North: Mashup the Library: An introduction to mashup technology and the  art of remixing library and information resources.
codepad:
 [...]]]></description>
			<content:encoded><![CDATA[<p>As I prepare my JCDL 2008 Tutorial (<a class="external" href="http://jcdl2008.org/tutorials/tutorial1.html">Creating and Enabling Data Mashups</a>), might I make use of Eric Lease Morgan&#8217;s <a class="external" href="http://infomotions.com/musings/xml-in-libraries/">XML in libraries: A workshop</a>?</p>
<p>Of note is an upcoming workshop aimed at libraries &#8212; <a class="external" href="http://carlnit.blogspot.com/2008/05/save-date-carl-n-it-interest-group.html">CARL-IT  North: Mashup the Library: An introduction to mashup technology and the  art of remixing library and information resources</a>.</p>
<p><a class="external" href="http://codepad.org/">codepad</a>:</p>
<ul> codepad.org is an online compiler/interpreter, and  a simple collaboration tool. Paste your code below, and codepad will  run it and give you a short URL you can use to share it in chat or  email.</ul>
<p>I&#8217;d like to experiment with it to see whether it&#8217;s a good place to  paste code fragments that I want to share (and run!). I also want to  think through how it sandboxes runnable code.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/05/21/notelets-for-20050520/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Data Hosting vs Data Portability</title>
		<link>http://blog.dataunbound.com/2008/05/21/data-hosting-vs-data-portability/</link>
		<comments>http://blog.dataunbound.com/2008/05/21/data-hosting-vs-data-portability/#comments</comments>
		<pubDate>Wed, 21 May 2008 15:38:30 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[repositories]]></category>

		<category><![CDATA[data hosting]]></category>

		<category><![CDATA[data portability]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/?p=71</guid>
		<description><![CDATA[A friend sent me a link to a recent post by Brad Templeton, Data hosting instead of data portability:
 A data hosting approach has your personal data  stored on a server chosen by you. (You might have that server right in  your own house, or pay for hosting services.) If you pay, that [...]]]></description>
			<content:encoded><![CDATA[<p>A friend sent me a link to a recent post by Brad Templeton, <a class="external" href="http://ideas.4brad.com/data-hosting-instead-data-portability">Data hosting instead of data portability</a>:</p>
<ul> A data hosting approach has your personal data  stored on a server chosen by you. (You might have that server right in  your own house, or pay for hosting services.) If you pay, that server’s  duty is not to exploit your data, but rather to protect it. That’s what  you’re paying for. You can have more than one (with different personas,  if you like) but for now let’s imagine having just one. Your data host’s job is to perform actions on your data. Rather than  giving copies of your data out to a thousand companies (the Facebook  and Data Portability approach) you host the data and perform actions on  it, programmed by those companies who are developing useful social  applications.</ul>
<p>I find data hosting appealing and would like to shift towards hosting  my own data as opposed to having my data hosted elsewhere. It&#8217;s a  matter of making it practical though.</p>
<p>For instance, I&#8217;m a big fan of Flickr because it makes it so easy to  have my photos taken care of. But ideally, I&#8217;d like to host my own  photos and directly control how people access them. I&#8217;d do that if I  could build a good repository and layer services on top of it &#8212; just  like Flickr. But Flickr has an economy of scale that I don&#8217;t have &#8212; it  can solve that problem and provide the solution to many people.</p>
<p>Now, it&#8217;s possible that we can solve that problem too and sell and/or  share the solution with lots of people so that they can do more of their own data  hosting. Is that a business that I would want to be in?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/05/21/data-hosting-vs-data-portability/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Some musings on where I&#8217;d like to go next professionally</title>
		<link>http://blog.dataunbound.com/2008/05/19/some-musings-on-where-id-like-to-go-next-professionally/</link>
		<comments>http://blog.dataunbound.com/2008/05/19/some-musings-on-where-id-like-to-go-next-professionally/#comments</comments>
		<pubDate>Mon, 19 May 2008 21:44:36 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/?p=70</guid>
		<description><![CDATA[In January, a correspondent, having heard that I was about to publish a book on mashups, wrote me, saying that he would &#8220;love to find out more what [I'm] thinking&#8221;.    Flattered to be asked, I replied.   Here I quote an edited version of what I wrote.   (I tend [...]]]></description>
			<content:encoded><![CDATA[<p>In January, a correspondent, having heard that I was about to publish a book on mashups, wrote me, saying that he would &#8220;love to find out more what [I'm] thinking&#8221;.    Flattered to be asked, I replied.   Here I quote an edited version of what I wrote.   (I tend to like what I write in email because my writing tends to be energetically conversational.)</p>
<blockquote>
<p style="font-style: normal;">Let me tell you a bit of what I&#8217;m thinking and where I&#8217;m coming from.    Obviously, I think that the topic of mashups is a big deal given my willingness to write a whole book about it.   The element that excites me most is the power that individuals and small groups of people now have to recombine data and services &#8212; to use mashups to make sense of the world &#8212; particularly in the corner of the  world in which I&#8217;m immersed (teaching, learning, and research in the context of higher education, libraries, and museums).   When I first learned about XML and web services, I thought &#8212; wow &#8212; this is going to change the way we do research and the way we teach and learn.  I spoke about this topic at the <a href="http://web.archive.org/web/20060523173607/http://conferences.oreillynet.com/cs/et2003/view/e_sess/3668">O&#8217;Reilly ETCon in 2003</a>.</p>
<p>I&#8217;ve built a research prototype (called the <a href="http://scholarsbox.net">Scholar&#8217;s Box</a>) to enable scholars to gather data from different sources, create personal collections, and share them with others.  (I&#8217;m an advisor to a project called Zotero (<a href="http://www.zotero.org/">http://www.zotero.org/</a>) &#8212; which provides a Firefox plugin to enable people to manage bibliographic collections within the web browser &#8212; and ultimately to share their collections.)</p>
<p>I teach a course at the School of Information at UC Berkeley called &#8220;<a href="http://www.ischool.berkeley.edu/programs/courses/290-mri">Mixing and Remixing Information</a>&#8221;.    This semester will be the third term I teach the course.  It&#8217;s a project-based course, in which the focus is on helping students build their own mashups (see <a href="http://blog.mixingandremixing.info/">http://blog.mixingandremixing.info/s08/class-projects/</a> for some mashups from [this] year&#8217;s class).   A good number of my students have next-to-no experience with web programming.  I have found that showing students the power of mashups &#8212; to get them excited about the possibilities &#8212; and then teaching them how to make mashups is an excellent way into web programming.  I took this approach with teenagers with some success last summer &#8212; I taught a <a href="http://www-atdp.berkeley.edu/2007/07compsci.html#2738">six week course on the Berkeley campus</a>.</p>
<p>In addition to master&#8217;s students this semester, I&#8217;ll be teaching a six week hands-on course to campus IT staff about building next-generation campus IT services &#8212; again by studying things like Flickr and Google maps and Yahoo! Pipes, getting them to build mashups, and thinking about how we can do things like that on campus &#8212; for administration and for research.</p>
<p>Now that I&#8217;m finished writing my book, I&#8217;m thinking about other opportunities.  Perhaps it&#8217;s just the geek in me, but I really do think that some combination of Web 2.0 mashups, a bit more rigor from SOA, imagination, and some understanding of real problems can transform the worlds of education and research (and other worlds too &#8212; but education and research are something I know about.)  I&#8217;m setting out to build a small company whose goal is to help the educational community effectively use Web 2.0 ideas  (with a specific emphasis on remixability) to change the way we do things in that community.  I will confess that my business plan still needs to be written, however&#8230;. In the meantime, I&#8217;m experimenting with a mix of teaching, consulting, and building software.  (Some collaborators and I have a grant proposal in to enhance the teaching and learning of art history by integrating Flickr into the computational fabric of the classroom.)  Most of all, I believe in the power of ideas &#8212; hence, I wrote a book to teach others.</p></blockquote>
<p>Lots of questions remain, however.  Now that <a href="http://blog.dataunbound.com/2008/05/18/what-ive-been-up-to/">my teaching jobs have come to an end</a>, I have some serious amounts of time to plot out my next steps.  Writing is a great help to me in sorting out my thoughts, especially when I&#8217;m writing for a public audience.   I would like to build a business but am unclear on exactly what it should look like.   Undoubtedly, there will be details that would be unwise for me to share publicly&#8211; but I believe that a lot of my thinking would benefit from putting my ideas out there.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/05/19/some-musings-on-where-id-like-to-go-next-professionally/feed/</wfw:commentRss>
		</item>
		<item>
		<title>What I&#8217;ve been up to</title>
		<link>http://blog.dataunbound.com/2008/05/18/what-ive-been-up-to/</link>
		<comments>http://blog.dataunbound.com/2008/05/18/what-ive-been-up-to/#comments</comments>
		<pubDate>Mon, 19 May 2008 04:13:03 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[news]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/?p=68</guid>
		<description><![CDATA[Here&#8217;s an update on my current professional activities that I hope will give you, my readers, a sense of where this blog will be heading:


My book  Pro Web 2.0 Mashups:  	Remixing Data and Web Services was published by Apress on 	February 25, 2008.   It&#8217;s gotten some good reviews, and I&#8217;ve heard 	from [...]]]></description>
			<content:encoded><![CDATA[<p style="font-style: normal;">Here&#8217;s an update on my current professional activities that I hope will give you, my readers, a sense of where this blog will be heading:</p>
<ul>
<li>
<p style="font-style: normal;">My book  <em><a href="http://apress.com/book/view/159059858X">Pro Web 2.0 Mashups:  	Remixing Data and Web Services</a> </em>was published by Apress on 	February 25, 2008.   It&#8217;s gotten some good reviews, and I&#8217;ve heard 	from some happy readers.  It&#8217;s time, however, for some more intense 	promotion of my book to make sure it fully reaches  the audience it 	is meant to serve.  (Most of my book-related activities will be 	discussed at my <a href="http://blog.mashupguide.net">MashupGuide blog</a>.)</p>
</li>
<li>
<p style="font-style: normal;">In April, I finished teaching a six-week course (“<a href="http://collab-dev.berkeley.edu/misc/Next-genAnnc_wApp.pdf">Building Next-Generation Campus Information Services</a>”) for IT staff on the Berkeley campus.  “The course designed to introduce campus professionals to the concepts of Web 2.0, XML, web services, and elements of web application development through the lens of mashups. While completing a six-week long project, participants will advance their knowledge and abilities, and gain insight into potential solutions to the information management needs they face on the job.”  I plan to post more details about the course, including how it was structured, what projects came out of the class, and how I think this course can be improved.</p>
</li>
<li>
<p style="font-style: normal;">Last week marked the <a href="http://blog.mixingandremixing.info/2008/05/12/open-house-2008/">culminating open house</a> of the <a href="http://www.ischool.berkeley.edu/programs/courses/290-mri">Mixing and Remixing Information</a> course I teach at the School of Information at UC Berkeley.  I had a blast teaching the course for the third time, though I wonder whether it&#8217;s time for a total (or at least substantial) revamp of the course.</p>
</li>
<li>
<p style="font-style: normal;">I&#8217;ve started to contribute 	regularly to <a href="http://www.programmableweb.com">ProgrammableWeb</a>, which I described in my book as 	“the most useful web site for keeping up with the world of 	mashups, specifically, the relationships between all the APIs and 	mashups out there.&#8221;  That was before I started writing for 	it!   See the <a href="http://www.programmableweb.com/profile/RaymondYee">posts I&#8217;ve written for  PW</a> so far.</p>
</li>
<li>
<p style="font-style: normal;">Finally, I&#8217;ve recently become the 	<a href="http://chnm.gmu.edu/staff/raymond-yee/">Integration Advisor</a> for the 	<a href="http://zotero.org">Zotero Project</a>, 	 working on developing developer documentation for them, thinking 	about how to integrate Zotero with other things (in a sense, Zotero 	as a client-side mashup platform) &#8212; specifically in the context of 	<a href="http://www.dancohen.org/2007/12/12/zotero-and-the-internet-archive-join-forces/">Zotero-Internet Archive alliance</a>.  	 My work for Zotero will be a big part of what I&#8217;ll be discussing on 	this blog.</p>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/05/18/what-ive-been-up-to/feed/</wfw:commentRss>
		</item>
		<item>
		<title>notes from the Open Library developers&#8217; meeting</title>
		<link>http://blog.dataunbound.com/2008/03/13/notes-from-the-open-library-developers-meeting/</link>
		<comments>http://blog.dataunbound.com/2008/03/13/notes-from-the-open-library-developers-meeting/#comments</comments>
		<pubDate>Thu, 13 Mar 2008 21:56:57 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[openlibrary]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/2008/03/13/notes-from-the-open-library-developers-meeting/</guid>
		<description><![CDATA[  I wasn&#8217;t able to make it to the Open Library Developers Meeting 2008 (Open Library) because I was in Los Angeles but I look forward to catching up on what happened that day.   I&#8217;m excited to see how far the OpenLibrary project will get in terms of making data about books freely available [...]]]></description>
			<content:encoded><![CDATA[<p>  I wasn&#8217;t able to make it to the <a href="http://demo.openlibrary.org/about/olmeeting2008" class="external">Open Library Developers Meeting 2008 (Open Library)</a> because I was in Los Angeles, but I look forward to catching up on what happened that day.   I&#8217;m excited to see how far the <a href="http://www.openlibrary.org/">OpenLibrary </a>project will get in terms of making data about books freely available to the world, not only through a user interface but also through an API so that people can mash up the data.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/03/13/notes-from-the-open-library-developers-meeting/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Mashupawards, Symfony and web frameworks</title>
		<link>http://blog.dataunbound.com/2008/01/31/mashupawards-symfony-and-web-frameworks/</link>
		<comments>http://blog.dataunbound.com/2008/01/31/mashupawards-symfony-and-web-frameworks/#comments</comments>
		<pubDate>Fri, 01 Feb 2008 03:09:00 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[mashup symfony Django]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/2008/01/31/mashupawards-symfony-and-web-frameworks/</guid>
		<description><![CDATA[  MashupAwards - best mashups on the web is a good list of mashups.
As I learn Django, a Python web programming framework, I&#8217;m starting to think about alternative frameworks, such as Ruby on Rails and Symfony (for PHP5).  Is Symfony something to recommend to my students?
]]></description>
			<content:encoded><![CDATA[<p>  <a href="http://mashupawards.com/" class="external">MashupAwards - best mashups on the web</a> is a good list of mashups.</p>
<p>As I learn <a href="http://www.djangoproject.com/" class="external">Django</a>, a Python web programming framework, I&#8217;m starting to think about alternative frameworks, such as <a href="http://www.rubyonrails.org/" class="external">Ruby on Rails</a> and <a href="http://www.symfony-project.org/about" class="external">Symfony</a> (for PHP5).  Is Symfony something to recommend to my students?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/01/31/mashupawards-symfony-and-web-frameworks/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A nice example of how useful Amazon EC2 and S3 can be</title>
		<link>http://blog.dataunbound.com/2008/01/25/a-nice-example-of-how-useful-amazon-ec2-and-s3-can-be/</link>
		<comments>http://blog.dataunbound.com/2008/01/25/a-nice-example-of-how-useful-amazon-ec2-and-s3-can-be/#comments</comments>
		<pubDate>Sat, 26 Jan 2008 02:04:36 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[journalism]]></category>

		<category><![CDATA[NYTimes AmazonEC2 AmazonS3]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/2008/01/25/a-nice-example-of-how-useful-amazon-ec2-and-s3-can-be/</guid>
		<description><![CDATA[In several weeks, I&#8217;ll be giving a talk to campus IT staff.  I&#8217;ve long wanted to talk up the value of such services as Amazon EC2 and S3.  Whenever I bring them up, I have tended to talk in the abstract of all the possibilities.  I just came across a nice example [...]]]></description>
			<content:encoded><![CDATA[<p>In several weeks, I&#8217;ll be giving a talk to campus IT staff.  I&#8217;ve long wanted to talk up the value of such services as Amazon EC2 and S3.  Whenever I bring them up, I have tended to talk in the abstract of all the possibilities.  I just came across a nice example in a blog that I just learned about:  <a href="http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/#comments">Self-service, Prorated Super Computing Fun!</a> on <a href="http://open.blogs.nytimes.com/">open.blogs.nytimes.com</a>, a blog about open source at the NY Times.    The post describes how the author used EC2 and S3 to convert millions of files to PDF files:</p>
<blockquote><p>I then began some rough calculations and determined that if I used only four machines, it could take some time to generate all 11 million article PDFs. But thanks to the swell people at Amazon, I got access to a few more machines and churned through all 11 million articles in just under 24 hours using 100 EC2 instances, and generated another 1.5TB of data to store in S3. (In fact, it work so well that we ran it twice, since after we were done we noticed an error in the PDFs.)</p></blockquote>
<p>Wow, we as individuals have access to more and more computing power at lower prices all the time.  I&#8217;ve long wanted to make use of the EC2 and S3 infrastructure.    I don&#8217;t think that many people on campus know about EC2 and S3.   Researchers who need a lot of computational power might build their own clusters or access the central campus services &#8212; or they may start using things like EC2 and S3.  (That&#8217;s my argument). So far I&#8217;ve not had any need for S3 and EC2  &#8212; but I&#8217;m pretty sure that this year will bring some projects my way that will give me an excuse to use EC2 and S3!</p>
<p>(BTW, I&#8217;m thrilled to learn about open.blogs.nytimes.com, which lets geeks who are also fans of the <em>Times</em>  get a glimpse into the technology behind an important online paper.)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/01/25/a-nice-example-of-how-useful-amazon-ec2-and-s3-can-be/feed/</wfw:commentRss>
		</item>
		<item>
		<title>More technical books on my reading list</title>
		<link>http://blog.dataunbound.com/2008/01/25/more-technical-books-on-my-reading-list/</link>
		<comments>http://blog.dataunbound.com/2008/01/25/more-technical-books-on-my-reading-list/#comments</comments>
		<pubDate>Sat, 26 Jan 2008 01:45:20 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<category><![CDATA[books]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/2008/01/25/more-technical-books-on-my-reading-list/</guid>
		<description><![CDATA[It&#8217;ll be fun to work through Visualizing Data &#8212; after I get through reading Programming Collective Intelligence .  But instead of just reading books, I need to have some specific problems in mind &#8212; which I do.  More soon on what those problems are.
]]></description>
			<content:encoded><![CDATA[<p>It&#8217;ll be fun to work through <a href="http://proquest.safaribooksonline.com/9780596514556?tocview=true" class="external">Visualizing Data</a> &#8212; after I get through reading <a href="http://proquest.safaribooksonline.com/9780596529321" class="external">Programming Collective Intelligence </a>.  But instead of just reading books, I need to have some specific problems in mind &#8212; which I do.  More soon on what those problems are.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/01/25/more-technical-books-on-my-reading-list/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Interesting application of scholarly data mining</title>
		<link>http://blog.dataunbound.com/2008/01/25/interesting-application-of-scholarly-data-mining/</link>
		<comments>http://blog.dataunbound.com/2008/01/25/interesting-application-of-scholarly-data-mining/#comments</comments>
		<pubDate>Sat, 26 Jan 2008 01:39:30 +0000</pubDate>
		<dc:creator>yee</dc:creator>
		
		<category><![CDATA[data mining]]></category>

		<category><![CDATA[open access]]></category>

		<guid isPermaLink="false">http://blog.dataunbound.com/2008/01/25/interesting-application-of-scholarly-data-mining/</guid>
		<description><![CDATA[When I saw Copycat Articles Seem Rife in Science Journals, a Digital Sleuth Finds - Chronicle.com,  I was curious about the technology behind the findings. How did the  researchers figure out the level of duplication in the medical  literature? One aspect was the use of eTBlast.  (To learn more, I can [...]]]></description>
			<content:encoded><![CDATA[<p>When I saw <a href="http://chronicle.com/daily/2008/01/1362n.htm?utm_source=at&amp;utm_medium=en" class="external">Copycat Articles Seem Rife in Science Journals, a Digital Sleuth Finds - Chronicle.com</a>,  I was curious about the technology behind the findings. How did the  researchers figure out the level of duplication in the medical  literature? One aspect was the use of <a href="http://invention.swmed.edu/etblast/index.shtml" class="external">eTBlast</a>.  (To learn more, I can follow up by reading the <em>Nature News</em> article (<a href="http://www.nature.com/news/2008/080123/full/news.2008.520.html" class="external">How many papers are just duplicates?</a>) that in turn points to the full article (<a href="http://www.nature.com/nature/journal/v451/n7177/full/451397a.html" class="external">A tale of two citations : Article : Nature</a>).)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.dataunbound.com/2008/01/25/interesting-application-of-scholarly-data-mining/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
