
Adding all tabs to Zotero Version 2 — scraping translatable sites

In a previous post, Adding all Firefox tabs to Zotero using Chickenfoot, I showed how to write a Chickenfoot script that loops through all Firefox tabs and adds each of them as an item in Zotero. A limitation of that script was that it used only ZoteroPane.addItemFromPage for every tab, even if a given tab had a translator that could be used to save the item. As explained in Adding items to Zotero with Chickenfoot, you can use Zotero_Browser.scrapeThisPage to invoke the appropriate translator for a tab. The reason I didn't use Zotero_Browser.scrapeThisPage in that script is that I didn't know how to write a function to determine whether a suitable translator exists.

Now I think I've come up with a way of determining whether a translator exists, though I'm not highly confident that the solution is foolproof. I'll share my Chickenfoot script here, explain the logic behind it, and write about its possible limitations. First, the script:

// add_each_tab_to_Zotero_2.js
// R. Yee

var Zotero = chromeWindow.Zotero;
var ZoteroPane = chromeWindow.ZoteroPane;
var Zotero_Browser = chromeWindow.Zotero_Browser;
var tabBrowser = getTabBrowser(chromeWindow);

// getIcon returns the URL of the translator icon for the current tab,
// false if there is no suitable translator, or
// 'chrome://zotero/skin/treesource-collection.png'
// if there are multiple savable items on the page

function getIcon() {

  var browser = Zotero_Browser.tabbrowser.selectedBrowser;
  var tab0 = new Zotero_Browser.Tab(browser);

  // need to figure out whether doc is HTMLDocument
  // doc instanceof HTMLDocument doesn't seem to work here

  var doc = browser.contentWindow.document;

  // in emulation of
  // https://www.zotero.org/trac/browser/extension/tags/1.0.7/chrome/content/zotero/browser.js#L311

  var rootDoc = doc;
  if (rootDoc.defaultView) {
    while(rootDoc.defaultView.frameElement) {
      rootDoc = rootDoc.defaultView.frameElement.ownerDocument;
    }
  }

  // detect possible translators and return the corresponding icon
  tab0.detectTranslators(rootDoc,doc);
  return tab0.getCaptureIcon();

} // getIcon()

// create a new collection with current date
var new_Collection = Zotero.Collections.add("_Saved " + (new Date()).toLocaleString());

// output # tabs
output(tabBrowser.browsers.length);

// loop through tabs, selecting each one in turn
for (var i=0; i < tabBrowser.browsers.length; i++) {
  tabBrowser.mTabContainer.advanceSelectedTab(1, true);
  output(tabBrowser.selectedBrowser.contentWindow.location);
  var icon = getIcon();
  // if icon is not false and not representing multiple items -- scrape page
  if (icon && icon != 'chrome://zotero/skin/treesource-collection.png') {
    Zotero_Browser.scrapeThisPage(new_Collection.id);
  // otherwise add item as a generic web page
  } else {
    ZoteroPane.addItemFromPage(new_Collection.id);
  }
}
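
The two pieces of logic in the script, walking up from a framed document to the top-level document and deciding between scraping and a plain web-page save, can be pulled out into small functions that run outside Firefox. This is only a sketch: the names findRootDocument, chooseSaveAction, and MULTIPLE_ICON are hypothetical helpers introduced for illustration and are not part of Zotero or Chickenfoot.

```javascript
// Hypothetical helper names for illustration; not part of Zotero or Chickenfoot.
var MULTIPLE_ICON = 'chrome://zotero/skin/treesource-collection.png';

// Walk up from a (possibly framed) document to the top-level document,
// mirroring the while loop inside getIcon().
function findRootDocument(doc) {
  var rootDoc = doc;
  if (rootDoc.defaultView) {
    while (rootDoc.defaultView.frameElement) {
      rootDoc = rootDoc.defaultView.frameElement.ownerDocument;
    }
  }
  return rootDoc;
}

// Map the value returned by getIcon() to the action the loop takes.
function chooseSaveAction(icon) {
  if (icon && icon !== MULTIPLE_ICON) {
    return 'scrape';       // a single-item translator icon was returned
  }
  return 'addAsWebPage';   // no translator, or a page with multiple items
}
```

In the loop above, 'scrape' corresponds to calling Zotero_Browser.scrapeThisPage and 'addAsWebPage' to calling ZoteroPane.addItemFromPage. Keeping the decision in its own function makes it easy to check with plain objects standing in for real documents and icons.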

A few points about the script:

When I ran the script with 3 tabs, it seemed to work fine. When I had 20+ tabs, all the tabs were saved, but only the first one ended up in the right collection. (I don't know why.) Also, the code is not terribly elegant: for example, it creates a new Zotero_Browser.Tab for each tab. I figure there should be some way to read off whether an icon already exists in the interface without having to recalculate the possible translators.
