I’ve read two articles by Mike Loukides, an editor over at O’Reilly, that I’ve liked a lot. What’s cool is that they offer a layperson’s intro to data topics, but then quickly accelerate to specifics, practicalities and examples.
The first is “What is Data Science?”: http://radar.oreilly.com/2010/06/what-is-data-science.html
The second is “Data as a service”: http://radar.oreilly.com/2010/07/data-as-a-service.html
In this second story, he talks about visualization. There clearly has been an explosion of info visualization out on the web, much of it unremarkable. But he cites a super beautiful example that Ben Fry and company did for GE about aging: http://www.ge.com/visualization/aging/
Annie Jo pointed out that it’s a Java Applet. Slide that bar back and forth on the bottom and watch how SMOOOTH it is…
Peter Sime has posted an 11-slide deck that explains FRBR with Macbeth as his example. (FRBR is a way of expressing the sometimes complex relationships among the Platonic form of the book and all its various manifestations.) (via the frbr blog)
According to an article in the Minneapolis Star Tribune, a St. Louis Park couple had so many books that they bought the house next door and turned it into their own library.
The article doesn’t tell us how many books they own, but a reasonable guess might be, oh, 200 gigabytes worth.
Hi, this is David Weinberger, and I’m thrilled to be able to post that on Monday I’ll be the Lab’s new co-director, along with the fabulous Kim Dulin, who has over the past year hired an amazing group of people and guided them toward a set of awesome projects.
I’ve posted a bit about this new job over at my personal blog.
I’ve been working with the Lab as a consultant for quite a while now, and with Kim and the team, for whom I have the highest regard, so this is not a big change for the Lab. But it is a very happy change for me.
With a precision that we can only assume they are winking at, Google has announced that there are 129,864,880 different books in the world.
The post, by Leonid Taycher, explains some of the decisions Google made when deciding what constitutes a book, but there are obviously cans of worms by the truckload waiting to be opened if someone really wanted to pin this number down. Or, put differently, there is no conceivable way of pinning this number down because books are too important and too ancient to be capable of anything except arbitrary definitions. Google does it in part by making one-at-a-time human decisions: “Twice every week we group all those records into ‘tome’ clusters, taking into account nearly all attributes of each record.” It’s dirty work, but someone has to do it.
Actually, it’s dirty, messy work that would seem perfectly suited to an expert-amateur collaboration: Librarians and readers. For example, just think how valuable it would be to know that two books were almost considered to be the same! Not to mention all the other relations among books that together we could discover and publish.
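To make the "almost considered to be the same" idea concrete, here is a toy sketch in Python. It is emphatically not how Google does it: the record fields (title, author, year), the two thresholds, and the greedy grouping are all assumptions made up for illustration. It uses the standard library's difflib to score how alike two records are, groups the ones that clear the "same" threshold, and keeps a list of the near misses that fell just short.

```python
from difflib import SequenceMatcher

# Hypothetical record fields; real catalog records carry far more attributes.
FIELDS = ("title", "author", "year")


def similarity(a, b):
    """Average string similarity (0.0 to 1.0) across the assumed fields."""
    scores = [
        SequenceMatcher(None, str(a[f]).lower(), str(b[f]).lower()).ratio()
        for f in FIELDS
    ]
    return sum(scores) / len(scores)


def cluster(records, same=0.9, near=0.7):
    """Greedily group records; thresholds are arbitrary illustrative values.

    Returns (clusters, near_misses). Each cluster is a list of records whose
    first member acts as its representative. A near miss is a tuple of
    (record, representative, score) for a record that scored at least `near`
    but below `same` against its closest cluster.
    """
    clusters = []
    near_misses = []
    for rec in records:
        best_score, best_cluster = 0.0, None
        for c in clusters:
            score = similarity(rec, c[0])
            if score > best_score:
                best_score, best_cluster = score, c
        if best_cluster is not None and best_score >= same:
            best_cluster.append(rec)
        else:
            if best_cluster is not None and best_score >= near:
                near_misses.append((rec, best_cluster[0], best_score))
            clusters.append([rec])
    return clusters, near_misses
```

The near-miss list is the interesting part: those are exactly the pairs you would want to hand to librarians and readers for a second opinion.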
Paul Gillin blogs about CIThread (while disclosing that he is advising them):
The curator starts by presenting the engine with a basic set of keywords. CIThread scours the Web for relevant content, much like a search engine does. Then the curator combs through the results to make decisions about what to publish, what to promote and what to throw away.
As those decisions are made, the engine analyzes the content to identify patterns. It then applies that learning to delivering a better quality of source content. Connections to popular content management systems make it possible to automatically publish content to a website and even syndicate it to Twitter and Facebook without leaving the CIThread dashboard.
There’s intelligence on the front end, too. CIThread can also tie in to Web analytics engines to fold audience behavior into its decision-making. For example, it can analyze content that generates a lot of views or clicks and deliver more source material just like it to the curator. All of these factors can be weighted and varied via a dashboard.
I haven’t seen the software so I don’t know anything about the actual implementation, but providing ever more clever computer assistance to human curators sounds like an inevitably useful path.
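Just to illustrate the kind of loop Gillin describes, here is a toy sketch in Python. It has nothing to do with CIThread's actual code, which I haven't seen: the class name, the word-weight scoring, the learning rate, and the way page views are folded in are all hypothetical. The curator seeds it with keywords, it ranks candidate items, and each publish-or-discard decision (optionally with a view count) nudges the weights used for the next ranking.

```python
from collections import defaultdict


class CurationEngine:
    """Hypothetical keyword-weight learner, for illustration only."""

    def __init__(self, seed_keywords, learning_rate=0.5):
        # Every seed keyword starts with the same weight.
        self.weights = defaultdict(float)
        for kw in seed_keywords:
            self.weights[kw.lower()] = 1.0
        self.learning_rate = learning_rate

    def score(self, text):
        """Sum the learned weights of the distinct words in the text."""
        return sum(self.weights.get(w, 0.0) for w in set(text.lower().split()))

    def rank(self, candidates):
        """Return candidates ordered from most to least promising."""
        return sorted(candidates, key=self.score, reverse=True)

    def feedback(self, text, published, views=0, view_weight=0.01):
        """Learn from one curator decision, optionally boosted by audience views.

        view_weight is an assumed knob standing in for the dashboard weighting.
        """
        signal = (1.0 if published else -1.0) + view_weight * views
        for w in set(text.lower().split()):
            self.weights[w] += self.learning_rate * signal
```

For example, after `engine = CurationEngine(["libraries"])`, calling `engine.feedback("linked metadata projects", published=True)` makes later items that mention metadata rank higher, while `engine.feedback("celebrity gossip roundup", published=False)` pushes gossip down.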
Harvard has announced the creation of the Harvard Library Lab:
The Lab promotes the development of projects in all areas of library activity and leverages the entrepreneurial aspirations of people throughout the library system and beyond. Proposals from faculty and students from anywhere in the university will also be welcomed and the Lab will encourage collaboration with projects being developed at MIT.
This is great news, both in its practical import and as yet another sign of Harvard’s desire to innovate to help make libraries more useful, valuable, and relevant than ever. Thanks to the Arcadia Fund for supporting this. (Our own John Palfrey is one of the members of the new Library Lab. Yay!)
In other news, it looks like our little library lab is going to be changing its name.
According to Inside Higher Ed, the US Copyright Office has approved “sweeping new exemptions to the anti-circumvention provisions of the Digital Millennium Copyright Act” that allow the educational use of clips of movies decrypted from locked DVDs. Previously the act of decrypting the DVDs was itself (arguably) a violation of the DMCA.
The Association of College & Research Libraries’ Planning and Review Committee has posted what a February survey of the literature and of its members reveals as the top ten trends affecting libraries “now and in the near future.” They list them in alphabetical order:
Academic library collection growth is driven by patron demand and will include new resource types.
Budget challenges will continue and libraries will evolve as a result.
Changes in higher education will require that librarians possess diverse skill sets.
Demands for accountability and assessment will increase.
Digitization of unique library collections will increase and require a larger share of resources.
Explosive growth of mobile devices and applications will drive new services.
Increased collaboration will expand the role of the library within the institution and beyond.
Libraries will continue to lead efforts to develop scholarly communication and intellectual property services.
Technology will continue to change services and required skills.
The definition of the library will change as physical space is repurposed and virtual space expands.
Hard to argue with anything on that list, beyond alphabetizing on the word “the.” Some of the items seem to bury the lede a bit, though. For example, access to digitized, full-text sources shows up at the end of the first point on the list. Under that same point, “the effect of Google Books on library collections” shows up at the end in a comma-separated list of “additional collection development trends.”
One point that the Lab is particularly interested in that didn’t make it explicitly onto the list: The rise in value of library metadata. There’s tons of it around. It can be of incredible and continuing use to anyone trying to find or understand items in (or linked to) collections of all sorts. Library metadata is going to be big! Big, we tell you!
SpokenWord.org aggregates podcasts, almost all of which are free, and makes it easy for users to export them to, say, iTunes. It’s a non-profit site and is all about the openness. (Disclosure: I’m on its board.)
The site is, let’s say, very busy graphically, with a bunch of different ways to find what you want or browse to discover something good to listen to. But now SpokenWord is looking for volunteers to curate podcast feeds and episodes in topics that interest them. These curated collections will be the main feature at the SpokenWord site, because nothing knows what’s interesting to humans better than other humans do. Details here.