Sourced From SearchDay
By Gary Price


Yahoo, The Internet Archive and several other organizations announced the formation of the Open Content Alliance (OCA) to make thousands of books, multimedia files and other materials freely searchable and accessible online.

Brewster Kahle, founder of The Internet Archive, said today’s announcement is a “call for open participation in the Open Content Alliance.” In other words, the group would love to see other organizations (libraries, publishers, archives, etc.) join the group.

As of today’s launch, The Internet Archive, Yahoo, Adobe Systems, The European Archive, Hewlett Packard Labs, The National Archives (UK), O’Reilly Media Inc, and the University of California are the founding members of the OCA.

Content that members of the Open Content Alliance will digitize will eventually be accessible in several places, including the OCA web site and via Yahoo. All of this goes towards Yahoo’s goal to help users find, share, use and expand (FUSE) human knowledge both via a Yahoo search and elsewhere.

The OCA project differs from other digitization projects in that the database of scanned material will be available for anyone to use on any site. Yes, it’s an open access database! You could even create a focused database (let’s say one on American literature) and use it on your own web site.

Without getting into legal “what if’s,” most of the material in the OCA will be available as full text. There are no limits on how much you can view or download for offline viewing or printing. Kahle said that in some cases you can find content via the Open Content Alliance, print it, and slap a cover on it. Sort of a “make your own book” type of thing.

The OCA is an opt-in program. An organization digitizing a work must have permission from the copyright holder if the material is not in the public domain. “At the option of the copyright holder, copyrighted content may be distributed through a Creative Commons license,” says the press release announcing the OCA.

“Creative Commons is a non-profit organization whose licensing encourages personal use, reuse and re-purposing of digital content. Content that is made available on the OCA website will be available in PDF and other widely adopted formats. This approach enables mass media and independent publishers to expand their reach by submitting content that spans categories, file formats and languages while retaining their copyrights.”

The press release also contains a quote from Sally Morris, Chief Executive of the Association of Learned and Professional Society Publishers (ALPSP), and someone who Danny Sullivan has chatted with on the Search Engine Watch blog about digitization and copyright.

“We welcome the launch of the OCA because its approach respects the rights of publishers and other copyright owners,” said Sally Morris, Chief Executive of the ALPSP. “Many publishers already make some of their book and journal content freely available online, and the OCA’s model of allowing rights holders to control which of their works are opened up, when, and where they are hosted may encourage others to do so.”

The OCA plans to offer public access to the database by the end of 2005 with much more coming in 2006.

Final Thoughts

It’s important to remember that long before Google took the world by storm with its plans to digitize some or all of the books in five large libraries, many companies, libraries, and other organizations were already busy digitizing new and old materials for searching and viewing online. For example, Project Gutenberg has been around since the 1970s, and now counts more than 16,000 ebooks in its digital collection.

In the SearchDay article about Google Library last December, we pointed out that Brewster Kahle and The Internet Archive have plans to digitize the full text of over one million books and make them freely accessible and searchable on the web. Since then we’ve posted several other articles about the topic and links to online book tools including the wonderful Online Books Page from the University of Pennsylvania that offers free full text access to over 20,000 titles (both new and old) from a wide variety of sources.

I hope other organizations decide to join the OCA (it will be interesting to see how the opt-in approach is received) and that any legal issues that arise get resolved quickly. Of course, as a searcher, I have to be concerned with the already large Yahoo database getting even larger without people having the skills or the time to get what they need.

I think the idea of allowing people to mine OCA content and create their own database is an exciting one. It’s easy to see how something like this could be of value to the college academic or even an elementary school teacher. Of course, the OCA will have to make it easy for disparate groups to create their own tools, but this will be a business opportunity for those who can help create these types of tools.