BLOGWITHOUTALIBRARY.NET

libraries, technology, UX, &c.

POSTED
15 Nov 2006, 15:11

ILS Symposium: Alan Darnell

Welcoming the Prodigal Child: E-Resources and the OPAC, Alan Darnell, Scholars Portal Project

Scholars Portal
- repatriation of e-journal literature from publishers
- collection of 7500 journals and 10 million full-text articles
- local load of 130 abstracts and index databases representing over 150 million records
- interest in extending this model to ebooks

Why bother?
- archiving
- ease of access: single interface to find content
- capturing the conversation that occurs in scholarly research

The catalogue
- describes an important body of scholarly research and source material
- its absence is a huge gap in our effort to represent the scholarly conversation, which is captured only partially in the electronic article literature
- mix of primary and secondary content
- historical coverage
- but the catalogue and e-journals (the tools we use to make e-journal content available) aren’t well integrated

Back in the day (early 90s), OPACs were hot and journals were staid and boring. But then something happened – journals “left” (prodigal child!). Scholars Portal is focused on bringing the journal content back, in an effort to make it modern, relevant, innovative, and user-focused. Answer this question honestly: do you consider your OPAC to be a PC or a Mac?!

Scopus Lucene project
- Elsevier was interested in exploring Lucene to index its content (currently they use Fast) – Lucene is open source
- combine Scopus A&I content with full-text articles from Scholars Portal and XMLMARC records from the U of Toronto Catalogue
- Index them all under Lucene and what do you get?

Challenges, Benefits
- authority control vs. relevance ranking
- whole item vs. components (the OPAC describes the book itself, not the chapter)
- surrogate metadata vs. digital objects (electronic resources are true digital objects)
- open content vs. commercial content (electronic publishing, in the current context, has commercial value and needs to be protected with rights management)

Authority control
- important not only for searching but also, maybe more important, for clustering
- in electronic journals there is no consistency in recording author names – varies from journal to journal
- different vocabularies used by different publishers (sometimes only author supplied tags) – so subject access has never been great in ejournals
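
To make the clustering point concrete, here is a minimal sketch of how an authority file could pull variant author strings from different journals under a single heading. The `AUTHORITY` mapping, the names in it, and the sample articles are all made up for illustration; nothing here comes from the talk or from any real authority file.

```python
from collections import defaultdict

# Hypothetical authority file: variant name forms -> one authorized heading.
AUTHORITY = {
    "smith, j.": "Smith, John, 1950-",
    "smith, john": "Smith, John, 1950-",
    "j. smith": "Smith, John, 1950-",
    "doe, jane": "Doe, Jane",
}


def normalize(name):
    """Lowercase and collapse whitespace so lookups tolerate minor variation."""
    return " ".join(name.lower().split())


def cluster_by_authority(articles):
    """Group article titles under the authorized heading for each author string."""
    clusters = defaultdict(list)
    for title, author in articles:
        # Fall back to the raw string when no authority record matches.
        heading = AUTHORITY.get(normalize(author), author)
        clusters[heading].append(title)
    return dict(clusters)


# Made-up sample data: the same author recorded three different ways.
articles = [
    ("Paper A", "Smith, J."),
    ("Paper B", "J. Smith"),
    ("Paper C", "Doe, Jane"),
    ("Paper D", "Unknown, X."),
]
clusters = cluster_by_authority(articles)
```

Without the authority lookup, "Smith, J." and "J. Smith" would land in separate clusters; with it, both papers sit under one heading, and unmatched names simply stay as they were recorded.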

So how can we bridge the two?

- Scholars Universe (from CSA) is trying to bridge the gap.
- what if we could continue this by taking the authority records in our catalogues and applying them to ejournal content?

Leveraging the strength of the catalogue
- can we match articles and ebooks to print surrogates and then map vocabularies?
- can we seed automatic classification algorithms with authority terms?

Supplementing surrogate records
- many libraries use TOC, cover image, reviews (e.g. Syndetics) to supplement catalogue records – like eye candy after you’ve gone through the search process
- is there any way we can turn this content into supplementary access points? Like searching the reviews?

Elephant and mouse
- mixing surrogate records with full-text digital objects creates complexities with relevance ranking
- word occurrence weighted against document length
- using traditional relevance ranking algorithms will favour less complete records
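
A toy illustration of the elephant-and-mouse problem, assuming the simplest possible scoring: term occurrences divided by document length. This is not how Lucene actually scores (its formula is more involved); `tf_score` and the two sample documents are hypothetical.

```python
def tf_score(query_terms, text):
    """Term frequency divided by document length (a simplified, non-Lucene score)."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(words.count(term.lower()) for term in query_terms)
    return hits / len(words)


# A two-word surrogate record (the mouse) ...
surrogate = "Darwin evolution"
# ... and a 1000-word full-text article that mentions the term five times (the elephant).
full_text = "evolution " * 5 + "filler " * 995

surrogate_score = tf_score(["evolution"], surrogate)
full_text_score = tf_score(["evolution"], full_text)
```

The brief record wins easily, even though the article says far more about the topic, which is why mixed indexes need some rethinking of how length is weighted.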

Commercial content
- OPACs are open to all
- if we integrate the content how do we make sure certain material is not available to unauthenticated users?
- move to a finer grained rights management when entitlements are complex (like Shibboleth)
- in the era of Google Scholar, can libraries begin pushing the envelope on public access to metadata from commercial sources?
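
One way to picture the split between open metadata and protected content, as a rough sketch only: everyone sees the descriptive record, and full text appears only when the item is open or the user is entitled. The `Record` class, the `licensed` flag, and the `user_entitled` boolean are assumptions for illustration; a real deployment would derive entitlements from something like Shibboleth attributes rather than a single flag.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    full_text: str
    licensed: bool  # True when the full text is commercial content


def view_for(record, user_entitled):
    """Return what a user may see: metadata always, full text only when permitted."""
    visible = {"title": record.title}
    if not record.licensed or user_entitled:
        visible["full_text"] = record.full_text
    return visible


# Made-up sample record for a licensed article.
article = Record(title="Licensed article", full_text="(article body)", licensed=True)

public_view = view_for(article, user_entitled=False)
campus_view = view_for(article, user_entitled=True)
```

The public view still exposes the title, which is the "public access to metadata" question in miniature.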

Finding a common playing field
- do we load econtent into the OPAC or do we load OPAC records into econtent services?
– neither fits very well
– OPAC serves as both a resource discovery tool and an inventory tool
– both functions are necessary but not necessarily best combined into a common application

Liberating data from the OPAC
- XML encoding of MARC records and adoption of Unicode makes it easier to use these records in other contexts
- but do we need an XML schema that represents the object and not the catalogue record?
- provides ability to search and re-factor the content for different views to satisfy different information needs

How do we do this?
- search engine technology (e.g. Lucene)
- but also need structured content repository based on XML

XML databases
- emerging class of tech that allows for storage and querying of XML documents in native format
- XQuery allows for search
- XQuery allows for extraction of elements, refactoring these to create new documents, new views
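
The extract-and-refactor idea can be sketched without an XML database. The example below uses Python's standard `xml.etree.ElementTree` in place of XQuery (a swapped-in technique, not what was demoed) to pull the title and author out of a MARCXML record and build a new, simpler document. The sample record, the `to_brief_view` helper, and the `<item>` output format are invented for illustration; the MARC tags used (245 for title, 100 for main author) are the standard ones.

```python
import xml.etree.ElementTree as ET

# Made-up MARCXML record, in the Library of Congress MARC21 slim namespace.
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Example, Author</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    element = record.find(path, NS)
    return element.text if element is not None else None


def to_brief_view(record):
    """Refactor a MARC record into a hypothetical, object-oriented <item> view."""
    brief = ET.Element("item")
    ET.SubElement(brief, "title").text = subfield(record, "245", "a")
    ET.SubElement(brief, "creator").text = subfield(record, "100", "a")
    return brief


record = ET.fromstring(MARCXML)
brief = to_brief_view(record)
brief_xml = ET.tostring(brief, encoding="unicode")
```

In a native XML database the same selection and reshaping would be expressed as a query rather than application code, but the shape of the operation is the same: pick elements, emit a new document.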

Stupid XQuery tricks!
- great demos and visuals for the rest of the session; I’m not sure if Alan is putting his slides online, but if he does, I’ll let you know where they are.

