Peter Suber (Twitter: oatp), who is one of the central nodes of the network of those who work on open access issues, has given us a peek at his list of the top developments in open access to bibliographic data this year, in chronological order:

  • JISC released a toolkit to help librarians share their catalog records.

    http://www.jisc.ac.uk/news/stories/2010/02/podcast98librarycatalogue.aspx

  • Six libraries in Cologne became the first German libraries to commit to OA for their bibliographic data. The libraries used CC0 to assign more than 5.4 million records to the public domain.

    http://www.hbz-nrw.de/projekte/linked_open_data/english_version/

    http://www.hbz-nrw.de/dokumentencenter/presse/pm/datenfreigabe_engl

  • The Open Knowledge Foundation launched a working group on open bibliographic data.

    http://blog.okfn.org/2010/03/03/new-working-group-on-open-bibliographic-data/

  • OCLC released a new draft policy on the use of WorldCat records and welcomed comments until May 20.

    http://www.libraryjournal.com/article/CA6725522.html?nid=2673&source=title&rid=17392268

  • The CERN Library provided OA to its book catalog and assigned the data to the public domain. The goal is to encourage copying and reuse. Said Jens Vigen, Head of the CERN Library: “Books should only be catalogued once.”

    http://gs-service-bookdata.web.cern.ch/gs-service-bookdata/announcement.html

  • The University of Konstanz and Cambridge University libraries announced plans to provide bibliographic data under an open license.

    http://blog.okfn.org/2010/10/05/new-open-bibliographic-data-from-konstanz-and-cambridge/

  • The University of Tübingen has provided OA to its bibliographic data since at least May 2010.

    http://wiki.bsz-bw.de/doku.php?id=v-team:daten:openaccess:tuub

  • The Open Knowledge Foundation gave us a preview of Bibliographica, its new open-source tool to gather and share semantically rich bibliographic information.

    http://blog.okfn.org/2010/05/20/bibliographica-an-introduction/

  • LibraryThing launched OverCat, an OA index of bibliographic data second in size only to WorldCat. The OverCat data was collected from over 700 sources.

    http://www.librarything.com/blogs/librarything/2010/06/announcing-overcat/

  • WorldCat upgraded its Digital Collection Gateway, which lets libraries, museums, and archives contribute digital resources and metadata.

    http://www.oclc.org/us/en/news/releases/2010/201044.htm

  • The Rheinisch-Westfälische Technische Hochschule Aachen (RWTH Aachen University) opened up its bibliographic data, using CC0 to assign them to the public domain.

    http://www.bth.rwth-aachen.de/offbibdat.html

  • WorldCat announced that it now has 200 million bibliographic records. OCLC is still in the process of rethinking the access or data-sharing policy for WorldCat records.

    http://www.oclc.org/news/releases/2010/201047.htm

  • The British Library made three million bibliographic records OA under the CC0 Public Domain Dedication. “This dataset consists of the entire British National Bibliography, describing new books published in the UK since 1950; this represents about 20% of the total BL catalogue, and we are working to add further releases.”

    http://openbiblio.net/2010/11/17/jisc-openbibliography-british-library-data-release/

  • Soon after the British Library’s release of open bibliographic data (previous item), the JISC Open Bibliography project announced two ways in which it had made the data more useful. “The data has been loaded into a Virtuoso store that is queryable through the SPARQL Endpoint, and the URIs that we have assigned each record use the ORDF software to make them dereferenceable, performing content auto-negotiation as well as embedding RDFa in the HTML representation.” (A sketch of what that enables follows this list.)

    http://lists.okfn.org/pipermail/open-bibliography/2010-November/000629.html
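
To make that concrete, here is a minimal sketch, in Python, of what a client of such a setup might do: run a query against a SPARQL endpoint, then dereference a record URI with different Accept headers to trigger content negotiation. The endpoint URL, record URI, and the dct:title property are placeholders of my own, not the project’s actual addresses or schema.

```python
# Hypothetical client for a SPARQL endpoint with dereferenceable record URIs.
# The URLs below are placeholders, not the JISC Open Bibliography endpoints.
import requests

SPARQL_ENDPOINT = "http://example.org/sparql"  # placeholder endpoint

# Ask the store for a few records and their titles.
query = """
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?record ?title WHERE {
  ?record dct:title ?title .
} LIMIT 5
"""

resp = requests.get(
    SPARQL_ENDPOINT,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["record"]["value"], "-", row["title"]["value"])

# Content auto-negotiation: the same record URI can return RDF or HTML,
# depending on what the client asks for.
record_uri = "http://example.org/record/123"  # placeholder record URI
as_turtle = requests.get(record_uri, headers={"Accept": "text/turtle"})
as_html = requests.get(record_uri, headers={"Accept": "text/html"})
print(as_turtle.headers.get("Content-Type"), as_html.headers.get("Content-Type"))
```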

He also points to two items that are out of chronological order (an artifact of the email back-and-forth that occasioned his sharing the list with us):

1. The Open Knowledge Foundation Working Group on Open Bibliographic Data released a draft version of Principles on Open Bibliographic Data for public comment.

http://openbiblio.net/2010/10/15/principles-for-open-bibliographic-data/

2. JISC released the Open Bibliographic Data Guide for institutions providing OA to library catalogue records. The guide offers advice on licensing the data, the legal issues to consider, and the potential costs and savings.

http://infteam.jiscinvolve.org/wp/2010/11/15/what-does-open-bibliographic-metadata-mean-for-academic-libraries/


This is an olde post that I’m coming back to and adding onto. Two interesting uses of Twitter:

1) Twitter as Subject Stream: Over on TechCrunch there’s a post about how Quora is using Mechanical Turk to automate the creation of Twitter accounts. Quora is a mass Q&A website for anything. You ask a question (“Where’s the best place to crowdsource an icon?”) and you get a response; for example, user alton sun answered: “99designs.com….”

The Quora site is organized into many subject areas you can subscribe to (UI, Startups, etc.). Quora is creating a Twitter account for each of these subject areas, so interested Quora users can subscribe to the feed and get the newest messages from that subject area. It’s cool.

2) Twitter with a High-Pass Filter: This is the newer part of the post. Jeff Miller has created a Twitter feed that broadcasts Hacker News stories when they reach a certain point value. http://twitter.com/newsyc100 was the first one I noticed; it broadcasts stories once they reach 100 points. But it seems he has also set up feeds with 20pt, 50pt, and 150pt triggers. I really like the idea that once something has reached a level of community interesting-ness, as manifested in points, you can grant Hacker News the ability to become a verb and reach out and tell YOU about it.

You can decide that anything of n interesting-ness to a community is of interest to you.
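
The mechanics behind a feed like newsyc100 are simple enough to sketch. Below is a rough Python version of the idea, written against Hacker News’s official Firebase API (which, for the record, postdates Miller’s feeds; I don’t know how his bots actually work). The announce() function is a stub standing in for whatever posts the tweet.

```python
# High-pass filter on Hacker News: announce a story once its score
# crosses a threshold. announce() is a stand-in for posting to Twitter.
import time
import requests

HN_API = "https://hacker-news.firebaseio.com/v0"
THRESHOLD = 100    # points, as in newsyc100
announced = set()  # IDs of stories we've already broadcast

def announce(story):
    # Stub: a real bot would post to Twitter here.
    print(f"{story['score']}pt: {story['title']} {story.get('url', '')}")

while True:
    top_ids = requests.get(f"{HN_API}/topstories.json").json()
    for story_id in top_ids[:100]:
        if story_id in announced:
            continue
        story = requests.get(f"{HN_API}/item/{story_id}.json").json()
        if story and story.get("score", 0) >= THRESHOLD:
            announce(story)
            announced.add(story_id)
    time.sleep(300)  # re-check every five minutes
```

The 20pt, 50pt, and 150pt feeds would just be different values of THRESHOLD, one bot per filter level.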


Dan Gillmor has a good post at Salon about archiving the Net, spurred by meetings at the Library of Congress. I’m especially interested in his comments — pointing to a post by Dave Winer — about the role of long-lived institutions, including universities.

Have we all concluded at this point that there is no hope of keeping a full and accurate archive? The Net is too vast, too ever-changing, too complexly linked. I can’t even keep a full archive of my own computer; the Mac’s Time Machine makes hourly backups, but not minutely or secondly ones, and it only preserves daily backups over the long-ish haul. All records are broken to one degree or another, because records require choices about what’s worth recording and energy to do the recording. “Full record” is an oxymoron.

So the question is, what is the right periodicity and scope of the Internet record we want? Usually, questions about archives and records are relative to some use case. A general record of the Net is like a general record of life. So, we’ll just have to make some choices that inevitably will turn out to be wrong for some unanticipated uses. We’ll have to deal with it.

Personally, I’m heartened to see this discussion occurring at an institution with the gravitas of the Library of Congress, and that it includes people like Dan and Dave.


The Twitter hashtag #FailShare is accumulating instances of failed library projects, so that we can learn from them, and also, I imagine, to take the sting out of failure (on the grounds that sting-y failure makes for stingy ideas).

And, a brand new wiki page has gone up on the same topic.