The Royal Society — publisher of the first peer-reviewed scientific article in history — has announced that its entire archive (which goes back to 1665) and all future issues will be available online for free. Here’s the searchable index.
The Association of College and Research Libraries has signed the Berlin Declaration. The ACRL is a division of the American Library Association, and has 12,500 members (which is about 20% of the ALA’s membership). The Berlin Declaration was written in 2003 and encourages open access publishing. [via American Libraries Magazine, and a hat tip to David Curry.]
Wouldn’t that be an interesting collection of metadata to pull into LibraryCloud?
For those interested in comparing the demos of the 6 DPLA Beta Sprint finalists (to be presented Oct. 21 in Washington, DC):
Links to the demos:
Robert Darnton, historian and Director of the Harvard Library, talks about the future of books and libraries.
Avi Solomon at BoingBoing has a terrific interview with Michael Greer about the appeal of bookbinding, and about Michael’s “Digital Bible.”
I love the photo:
S. Peter Davis at Cracked explains the disturbing fact that libraries pulp books regularly and in secret.
Snippets of recent happenings in the Lab:
Instead of writing this in my stylesheet
-webkit-transition: margin .15s;
-moz-transition: margin .15s;
-o-transition: margin .15s;
transition: margin .15s;
I just wrote this
transition: margin .15s;
How do you find the leading legal cases cited in law review journals throughout their publishing history? That is the goal of an exploratory project now being set up by visiting scholar Richard Leiter in collaboration with the Innovation Lab. The hope is to compile a list of the most frequently cited cases and, depending on what is discovered, possibly to facet the results by subject, law-review clusters, etc.

Our initial approach: set up a scripted parser to inspect plain-text OCR from sample law journal volumes (generously made available to us by HeinOnline). Using the most basic pattern-matching on these initial results, we identify case-citation passages, which in turn allow us to further refine the parser. Checking against the associated PDFs lets us determine how successfully we’re capturing citations and spot new patterns to fold into the parser. Additional parsing will be necessary to handle initial vs. subsequent case-citation formats, in-text vs. footnoted references, article tagging, and textually non-standard citation locations (such as page-spanning citations). We hope the lessons learned here will scale to mining general corpora of OCR texts for citation data.
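To make the idea concrete, here is a minimal sketch of the kind of first-pass pattern-matching described above. The regex and the handful of reporter abbreviations are illustrative assumptions, not the project’s actual parser, and real law-review OCR would need many more citation forms plus OCR-error handling:

```python
import re
from collections import Counter

# Illustrative pattern for a standard case citation:
# volume, reporter abbreviation, first page (e.g. "410 U.S. 113").
CITATION_RE = re.compile(
    r"\b(\d{1,4})\s+(U\.S\.|S\. Ct\.|F\.2d|F\.3d|F\. Supp\.)\s+(\d{1,5})\b"
)

def extract_citations(text):
    """Return (volume, reporter, page) tuples found in OCR plain text."""
    return CITATION_RE.findall(text)

def most_cited(pages, top=5):
    """Tally citation occurrences across a corpus of page texts."""
    counts = Counter()
    for page in pages:
        counts.update(extract_citations(page))
    return counts.most_common(top)

sample = "See Roe v. Wade, 410 U.S. 113 (1973); cf. 410 U.S. 113, 120."
print(extract_citations(sample))
```

Even a toy pass like this surfaces the problems the project anticipates: the same case shows up in both full and short-form citations, so later stages have to normalize initial vs. subsequent references before counting.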
The app is about as crude as it can get. Searching for books is done by keyword-matching NYT topics (each NYT piece is assigned a topic) against LCSH, and even that matching is done in the crudest possible way. I think we can get much, much better matching with some more work: if we create links from DBpedia topics to LCSH, we can get really good, semantic matching.
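For flavor, a crude keyword-overlap matcher of the sort described might look like the sketch below. The topic string and LCSH headings are toy examples, and the real app’s data and scoring are not shown here:

```python
def tokens(phrase):
    """Lowercased word set for naive keyword overlap."""
    return set(phrase.lower().replace(",", "").split())

def match_subjects(nyt_topic, lcsh_headings):
    """Rank LCSH headings by how many keywords they share with an NYT topic."""
    topic_words = tokens(nyt_topic)
    scored = [(len(topic_words & tokens(h)), h) for h in lcsh_headings]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [h for score, h in ranked if score > 0]

headings = ["Space flight", "Space flight -- History", "Cooking, French"]
print(match_subjects("Space Exploration and Flight", headings))
```

The crudeness is obvious: “flight” matches “Space flight” but “exploration” matches nothing, which is exactly why linking through DBpedia topics looks so much more promising.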
I was very pleased that Dan Brickley blogged this week about the work he’s been doing with the Lab on trying to figure out how to slot Web content into established library categories: how can a system automatically figure out that, say, a TED Talk about space travel ought to be clustered with the right Library of Congress Subject Headings? This is a phenomenally difficult problem because Web content can come with very little metadata. Dan has been exploring linked open data spaces, as well as some open source semantic extraction tools, to see if it can be done. We’ve been working with him all summer on this, which often means watching in amazement as he does his wizardry, and he reports that he is actually making some progress on this deep problem.
(Kim’s away at the Mobility Shifts conference in NYC, showing off ShelfLife and talking about libraries, education, and other little topics.)
No need for silly text; check out this video, part of a pitch to the Harvard Library Lab fund:
We’ve been working with the brilliant Dan Brickley all summer (he’s very modest, so now I’ve embarrassed him) trying to figure out how to use all available metadata to slot Web content into library categorization schemes automagically. For example, if we include in our collection — or, more to the point, if the DPLA includes in its collection — library-worthy material such as TED talks, is there a way in which we could automatically categorize those talks within the general mix of library items? We’d like to be able to do this at scale, even if roughly.
No one knows linked open data better than Dan (there, I’ve embarrassed him again!), and he’s been experimenting with all sorts of metadata and connections. Today he posted about what he’s been up to. It’s pretty damn fascinating, and our team has learned a ton working with him on this all summer. We’re looking forward to more!
This is a montage-y video of snippets from various library folk (including users) here at Harvard addressing aspects of the library’s present and future. We put it together as the opener at the first in a year of public conversations about the future of libraries.