Providing Access to the HathiTrust Titles Without Adding Millions of Objects to the SFX Knowledgebase
Margery Tibbetts
California Digital Library
HathiTrust (pronounced "hod-ee") was born of the Google Books project, with items contributed by CIC libraries.
HathiTrust has a new Bibliographic API for item lookup by ISSN, ISBN, LCCN, or OCLC number. It returns the matching items and the rights for each item, which is important because only ~16% of the approx. 200 TB of data is in the public domain. Margery and CDL are working with the new API to create an SFX target, target parser, and plug-in to query HathiTrust titles.
The plug-in gathers all available identifiers, calls the Hathi API with them, and gets back a JSON response containing the record information and item information.
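As a rough illustration of the lookup step, here is a minimal Python sketch (the actual plug-in is SFX code, not this). The URL pattern, the `records`/`items` keys, and the field names in the sample response are assumptions based on the public Bibliographic API; `build_lookup_url`, `extract_items`, and the sample values are hypothetical names and placeholder data made up for this example.

```python
import json
from urllib.parse import quote

# Assumed base URL for the "brief" form of the Bibliographic API.
API_BASE = "https://catalog.hathitrust.org/api/volumes/brief"

# Identifier types named in the talk.
ID_TYPES = {"issn", "isbn", "lccn", "oclc"}


def build_lookup_url(id_type, value):
    """Build a single-identifier lookup URL (hypothetical helper)."""
    if id_type not in ID_TYPES:
        raise ValueError(f"unsupported identifier type: {id_type}")
    return f"{API_BASE}/{id_type}/{quote(value, safe='')}.json"


def extract_items(response_text):
    """Split a JSON response into its record and item information."""
    data = json.loads(response_text)
    return data.get("records", {}), data.get("items", [])


# Placeholder response for illustration only, not real HathiTrust data.
SAMPLE_RESPONSE = """
{
  "records": {"000000001": {"titles": ["Example title"]}},
  "items": [
    {"htid": "xxx.0000000001", "fromRecord": "000000001", "rightsCode": "pd"}
  ]
}
"""

records, items = extract_items(SAMPLE_RESPONSE)
print(build_lookup_url("oclc", "424023"))
print(items[0]["htid"], items[0]["rightsCode"])
```

Keeping URL construction separate from response parsing means the parsing can be exercised against saved responses without hitting the API.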
Margery currently uses the hathiID from the item information, along with the handle URL, to link back to the record. If a better way to link in using only the hathiID turns up later, it will take only a ParseParam change in SFX Admin rather than a software change. Brilliant.
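The idea of keeping the link pattern in configuration rather than in code can be sketched in Python as follows. The handle prefix `2027` and the template form are assumptions about how HathiTrust handle URLs look, and `item_link` is a hypothetical helper; in SFX the equivalent template would live in the target's ParseParam, not in Python.

```python
from urllib.parse import quote

# Assumed handle URL pattern; the template is data, so it can be swapped
# without touching the code that builds the link.
HANDLE_TEMPLATE = "https://hdl.handle.net/2027/{htid}"


def item_link(htid, template=HANDLE_TEMPLATE):
    """Fill the hathiID into a link template (hypothetical helper)."""
    # Keep ':' '/' and '$' unescaped, since some hathiIDs are ARK-style.
    return template.format(htid=quote(htid, safe=":/$"))


# Placeholder hathiID for illustration only.
print(item_link("xxx.0000000001"))
# Switching to a different (made-up) link pattern needs only a new template.
print(item_link("xxx.0000000001", "https://example.org/view?id={htid}"))
```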
She created a hathi_trust.config file to control the timeout on the API call. This keeps users from waiting forever if the API is down, although she hasn't had timeout problems so far.
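A configurable timeout of this kind might look like the Python sketch below. The real format of hathi_trust.config was not shown, so the INI layout, the `[api]` section, the `timeout` key, and the 5-second default are all assumptions, and `load_timeout` and `fetch` are hypothetical helpers.

```python
import configparser
import urllib.request

# Assumed default, used when the config does not set a timeout.
DEFAULT_TIMEOUT = 5.0


def load_timeout(config_text):
    """Read the API timeout in seconds from INI-style config text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.getfloat("api", "timeout", fallback=DEFAULT_TIMEOUT)


def fetch(url, timeout):
    """Return the response body, or None if the API is down or too slow."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return None


# Hypothetical config contents.
EXAMPLE_CONFIG = """
[api]
timeout = 3
"""

print(load_timeout(EXAMPLE_CONFIG))
```

Returning `None` on failure lets the caller simply leave the HathiTrust target out of the menu instead of stalling the whole page.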
Returned records are limited to public domain, public domain US, and world-available items (rights codes come back as part of the item information). Results are also limited to books, even though the Hathi digital library contains serials as well; the goal is just to get something working for now.
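The rights filter described above could be sketched in Python like this. The code values `pd`, `pdus`, and `world` and the `rightsCode` field name are assumptions about how the API labels those three categories; `open_items` and the sample items are hypothetical. The books-only restriction is not shown because the notes do not say how it is detected.

```python
# Assumed rights codes for public domain, public domain US, and world available.
OPEN_RIGHTS = {"pd", "pdus", "world"}


def open_items(items):
    """Keep only the items whose rights code allows full viewing."""
    return [item for item in items if item.get("rightsCode") in OPEN_RIGHTS]


# Placeholder items for illustration only.
sample_items = [
    {"htid": "xxx.0000000001", "rightsCode": "pd"},
    {"htid": "xxx.0000000002", "rightsCode": "ic"},
    {"htid": "xxx.0000000003", "rightsCode": "pdus"},
    {"htid": "xxx.0000000004"},
]

for item in open_items(sample_items):
    print(item["htid"], item["rightsCode"])
```

Items with a missing or unrecognized rights code are dropped, which errs on the side of not offering a link that may turn out to be restricted.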
There is no good way to keep Google Book Search out of the menu when HathiTrust is available, and the two will often appear together because of Hathi's origins. The Google Book Search lookup happens after the SFX menu is generated, so display logic doesn't work well with it.
Margery did her development work with WorldCat because it reliably had the metadata she needed for testing. She has a working source parser for Aleph, so users can link directly from a cataloged item to Hathi through SFX. CDL will probably write a source parser for III, too.
Next steps include support for a Voyager parser; lookup of multiple items by upgrading the plug-in (Margery is only retrieving one item right now); and retrieval of items contributed by a given institution.
Unexpected glitches include WorldCat Local: OCLC is loading HathiTrust titles as a separate digital library with their own OCLC numbers, not merged with the print OCLC records for these items, and the digital item OCLC numbers have not been fed back to Hathi. Margery is working to get this fixed but doesn't know how long it will take. Since CDL is a Hathi member, she has access to the Hathi metadata for figuring things out and testing.
Rollout:
Hathi is being cautious about hits to the API, so it wants a phased rollout.
Code will go out to CIC libraries (Hathi contributors) within a month or so, then be posted on EL Commons for SFX customers in 4-6 months.