dLibra, dMuseion, dArceo and dLab form an extensive set of tools that supports research and cultural heritage institutions in mass digitisation. The set covers text recognition (OCR), cooperation with external digitisation centres, automated conversion of source data to a lightweight version, long-term preservation services, and a digital library or museum system. While dLibra, dMuseion and dArceo are each responsible for specific aspects of the digitisation process, dLab makes them cooperate and complement one another, resulting in a comprehensive system for research and cultural heritage institutions.
Case study – Digital Repository of Scientific Institutes
The Digital Repository of Scientific Institutes (DRSI), as stated on its web page, aims to create an open, multidisciplinary digital repository whose reach is more than regional. It is composed of digitised materials, including archives, scientific publications, research documentation and cultural heritage items, selected from 16 Polish scientific institutes and their libraries, which together form the Consortium of the Digital Repository of Scientific Institutes.
During digitisation, the institutes involved in creating DRSI use several independent digitisation centres, a carriage list mechanism for monitoring the current location of physical assets, and dedicated solutions for handling specific materials, e.g. bound periodicals or books. All these requirements have been taken into account in the deployment of the software tools offered by PSNC: primarily the dLab system for digitisation workflow management, the dLibra system for making digital resources available online, and the dArceo system for long-term preservation.
The general scheme of the dLab, dLibra and dArceo deployment is presented in the figure below.
The central part of the DRSI deployment is the dLab system, which is responsible for monitoring digitisation tasks and activities. Two of these activities communicate with other systems: one archives master files in dArceo, and the other publishes digital objects on the internet through dLibra. The integration of dLibra and dLab has one more characteristic: both systems have been configured so that as soon as a planned digital object is created in dLibra, it is automatically propagated to dLab, where a dedicated task is created. This approach makes it possible to set up an efficient, automated environment for digitising and publishing digital assets.
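The propagation of planned objects from dLibra to dLab can be pictured as a simple notification pattern. The sketch below is only an illustration under stated assumptions: the class names (`Library`, `WorkflowManager`, `PlannedObject`, `DigitisationTask`), their fields and the callback mechanism are hypothetical and do not reflect the actual dLibra or dLab interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class PlannedObject:
    """A planned digital object as it might be registered in dLibra (hypothetical model)."""
    object_id: str
    title: str


@dataclass
class DigitisationTask:
    """A task as it might be tracked in dLab (hypothetical model)."""
    object_id: str
    name: str
    status: str = "new"


@dataclass
class WorkflowManager:
    """Stand-in for dLab: keeps at most one task per planned object."""
    tasks: dict = field(default_factory=dict)

    def on_planned_object_created(self, obj: PlannedObject) -> DigitisationTask:
        # Idempotent: a repeated notification does not create a second task.
        if obj.object_id not in self.tasks:
            self.tasks[obj.object_id] = DigitisationTask(obj.object_id, obj.title)
        return self.tasks[obj.object_id]


@dataclass
class Library:
    """Stand-in for dLibra: notifies registered listeners about new planned objects."""
    listeners: list = field(default_factory=list)

    def create_planned_object(self, object_id: str, title: str) -> PlannedObject:
        obj = PlannedObject(object_id, title)
        for notify in self.listeners:
            notify(obj)
        return obj


# Usage: wire the two stand-ins together and create one planned object.
workflow = WorkflowManager()
library = Library(listeners=[workflow.on_planned_object_created])
library.create_planned_object("obj-1", "Bound periodical, vol. 3")
```

After the last call, `workflow.tasks` holds a single task for `obj-1`. Making the handler idempotent is a design choice of this sketch, so that a notification delivered twice does not duplicate work.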
Besides communicating with dLibra and dArceo, dLab cooperates within DRSI with external tools dedicated to creating presentation versions of digital objects:
- Document Express in order to prepare DjVu files,
- ABBYY Recognition Server in order to prepare PDF files.
In both cases the files can be enriched with a text layer produced by dedicated OCR engines.
Moreover, the dLab system has built-in mechanisms and extensions that allow, for example:
- mass file renaming according to the specified rules,
- making a backup copy of master files before final archiving in dArceo,
- enforcing specific naming of tasks within the dLab environment.
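As an illustration of rule-based mass file renaming, the following minimal sketch assumes one possible rule: files are sorted by name and renamed to a common prefix plus a zero-padded sequence number, with the extension lower-cased. The rule, the function names (`plan_renames`, `apply_renames`) and the example file names are assumptions made for this example, not the rules or code actually used in dLab.

```python
from pathlib import Path


def plan_renames(filenames, prefix, width=4):
    """Map each original name to '<prefix>_<sequence><ext>', in sorted order."""
    plan = {}
    for index, name in enumerate(sorted(filenames), start=1):
        suffix = Path(name).suffix.lower()  # normalise the extension, e.g. '.TIF' -> '.tif'
        plan[name] = f"{prefix}_{index:0{width}d}{suffix}"
    return plan


def apply_renames(directory, plan):
    """Rename files in `directory` according to `plan`, refusing to overwrite."""
    directory = Path(directory)
    # Check every target first so that nothing is renamed if any target is taken.
    for old, new in plan.items():
        if old != new and (directory / new).exists():
            raise FileExistsError(f"target already exists: {directory / new}")
    for old, new in plan.items():
        if old != new:
            (directory / old).rename(directory / new)


# Usage: build the plan first, inspect it, then apply it to a directory.
example_plan = plan_renames(["scan_b.TIF", "scan_a.TIF"], prefix="DRSI_0001")
```

Separating the plan from its application lets an operator review the proposed names before any file on disk is touched.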
Additionally, several reports are available within the dLab system, e.g.:
- report on unfinished tasks (since a particular date),
- report on tasks whose copyright status is not determined,
- report on tasks not permitted for publication (by the copyright holder).
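The three reports listed above can be thought of as filters over the task list. The sketch below is a hedged illustration only: the `Task` model and its fields are hypothetical, and "since a particular date" is interpreted here as "created on or after that date", which is an assumption rather than documented dLab behaviour.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Task:
    """Hypothetical task record; not the actual dLab data model."""
    name: str
    created: date
    finished: bool
    publication_permitted: Optional[bool]  # None means copyright is not yet determined


def unfinished_since(tasks, since):
    """Unfinished tasks created on or after `since` (assumed interpretation)."""
    return [t for t in tasks if not t.finished and t.created >= since]


def copyright_undetermined(tasks):
    """Tasks whose copyright status has not been determined."""
    return [t for t in tasks if t.publication_permitted is None]


def publication_not_permitted(tasks):
    """Tasks the copyright holder has not permitted to be published."""
    return [t for t in tasks if t.publication_permitted is False]


# Usage with a few made-up records.
sample_tasks = [
    Task("Atlas 1901", date(2013, 1, 10), finished=False, publication_permitted=None),
    Task("Journal vol. 2", date(2013, 3, 5), finished=True, publication_permitted=True),
    Task("Field notes", date(2013, 4, 20), finished=False, publication_permitted=False),
]
report = unfinished_since(sample_tasks, date(2013, 2, 1))
```

Using `None` for an undetermined copyright status keeps it distinct from an explicit refusal, so the second and third reports never overlap.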
In summary, for the needs of DRSI the dLab system has been configured to manage a complex digitisation process consisting of 29 types of activities. There are two types of task groups, used to execute activities at the level of a bound book or periodical, or at the level of a shipping list. The workflow includes 5 activities executed fully by automated tools and 6 activities supported by dedicated tools that help automate the work (e.g. file naming).
For detailed information, please contact us.