Thursday, May 19, 2022

MediaFilters for full-text search and thumbnail creation

Thumbnails make PDF documents such as books, reports, articles and magazines more visible in the repository. The DSpace filter-media script processes the PDF files of submitted items, extracting their text for full-text search and creating thumbnails. Until the script has run, a newly submitted item is displayed without a thumbnail.


The filter-media script can either be run manually right after an item is submitted, for an instant result, or be scheduled with cron. Use the following command to run filter-media manually:

sudo /dspace/bin/dspace filter-media
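
When only one item has just been submitted, processing the whole repository can be more work than needed. The sketch below builds a filter-media command restricted to a single handle. It assumes the -v (verbose) and -i (identifier) options described in the DSpace media filter documentation, and the handle 123456789/42 is a hypothetical placeholder. The function only prints the command, so it can be checked before it is run.

```shell
#!/bin/sh
# Path taken from the post; adjust it if DSpace is installed elsewhere.
DSPACE_BIN="/dspace/bin/dspace"

# Print the filter-media command for a single item, collection or community.
# $1 is a handle; the value used below is a hypothetical placeholder.
filter_media_cmd() {
    handle="$1"
    # -v: verbose output, -i: process only the object with this handle
    # (options assumed from the DSpace media filter documentation)
    printf '%s filter-media -v -i %s\n' "$DSPACE_BIN" "$handle"
}

filter_media_cmd "123456789/42"
```

Replace the placeholder with the handle of the item you submitted, then run the printed command as the user that owns the DSpace installation.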

The filter-media command can be added to the crontab so that it runs on a schedule. Add the cron settings under the Linux user account that runs Tomcat (and owns the DSpace installation directory). Open that user's crontab for editing:

crontab -e

If DSpace runs on a server that is up 24x7, add the following entry to the crontab:

# Run the media filter at 03:00 am every day.
0 3 * * * /dspace/bin/dspace filter-media

To run the job at a different time, change the minute and hour fields to a convenient time. See the different cron time examples available here. After the filter-media script has run, the thumbnails appear alongside the item.
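
Changing the schedule only means changing the first two fields of the cron entry (minute, then hour). As a minimal sketch, the helper below prints a daily crontab line for a given hour and minute. The function name cron_entry is made up for this illustration; the DSpace path is the one used in this post.

```shell
#!/bin/sh
# Path taken from the post; adjust it if DSpace is installed elsewhere.
DSPACE_BIN="/dspace/bin/dspace"

# Print a crontab line that runs filter-media once a day.
# $1 = hour (0-23), $2 = minute (0-59). Returns 1 on invalid input.
cron_entry() {
    hour="$1"
    minute="$2"
    # Reject empty or non-numeric values.
    case "$hour" in ''|*[!0-9]*) return 1 ;; esac
    case "$minute" in ''|*[!0-9]*) return 1 ;; esac
    # Reject values outside the ranges cron accepts.
    [ "$hour" -le 23 ] || return 1
    [ "$minute" -le 59 ] || return 1
    # Cron field order: minute hour day-of-month month day-of-week command
    printf '%s %s * * * %s filter-media\n' "$minute" "$hour" "$DSPACE_BIN"
}

cron_entry 3 0     # 03:00 every day, the same entry as in this post
cron_entry 23 30   # 23:30 every day
```

Paste the printed line into the editor opened by crontab -e.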

References

Media filters for Transforming DSpace Content
https://wiki.lyrasis.org/display/DSDOC7x/Mediafilters+for+Transforming+DSpace+Content


Wednesday, May 18, 2022

OAI-PMH / OAI-ORE Harvester


OAI-PMH and OAI-ORE are standards for describing and exchanging metadata and digital objects between archives. DSpace supports both, which means metadata and digital objects (e.g. text, images, data, and video) can be imported into DSpace. A DSpace administrator can import metadata from an e-journal, e-book, or institutional repository (e.g. arxiv.org, doabooks.org, doaj.org). Harvesting metadata and digital objects from an external source enriches an institutional repository running on DSpace and also enhances the user experience. The steps to harvest content from an external source are as follows.
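
Before starting, it can help to see what an OAI-PMH request looks like, since the harvester needs the provider's base URL, a set and a metadata format. The sketch below builds request URLs with the standard OAI-PMH verbs Identify, ListSets and ListRecords. The base URL is a hypothetical placeholder, not the address of any of the sources named above, and the function name oai_url is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical OAI-PMH base URL; replace it with the provider's real endpoint.
OAI_BASE_URL="https://repository.example.org/oai/request"

# Print an OAI-PMH request URL.
# $1 = verb (e.g. Identify, ListSets, ListRecords)
# $2 = optional metadataPrefix (e.g. oai_dc, required by ListRecords)
oai_url() {
    verb="$1"
    prefix="${2:-}"
    if [ -n "$prefix" ]; then
        printf '%s?verb=%s&metadataPrefix=%s\n' "$OAI_BASE_URL" "$verb" "$prefix"
    else
        printf '%s?verb=%s\n' "$OAI_BASE_URL" "$verb"
    fi
}

oai_url Identify             # describes the repository
oai_url ListSets             # lists the sets that can be harvested
oai_url ListRecords oai_dc   # lists records as simple Dublin Core
```

Opening a printed URL in a browser, or fetching it with curl, should return an XML response if the source supports OAI-PMH.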