1. Fully operational system and available services;
2. Main flows successfully tested;
3. Four scientific articles presented at conferences and/or published in ISI journals;
4. One patent application.
Achievements so far

The main objective of the Smart Search component project is to implement software services for digital libraries. These services cover three functionalities: 1) optimized indexing of documents in a non-relational Elasticsearch database; 2) search algorithms that retrieve relevant documents from keywords or text snippets, with filters to narrow down the results; 3) semantic recommendations, that is, retrieving documents similar to a given reference document.
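As an illustration of functionalities 2) and 3), the sketch below builds Elasticsearch Query DSL request bodies as plain Python dictionaries. The field names (`title`, `content`, `language`) and the index name are assumptions made for the example and are not taken from the project. The `more_like_this` query is only a lexical stand-in for "similar documents"; it does not reproduce the semantic models used by Smart Search.

```python
from typing import Any, Dict, Optional, Sequence


def build_search_query(
    text: str,
    filters: Optional[Dict[str, Any]] = None,
    fields: Sequence[str] = ("title", "content"),  # hypothetical field names
) -> Dict[str, Any]:
    """Keyword/snippet search, optionally narrowed down by exact-match filters."""
    filter_clauses = [
        {"term": {name: value}} for name, value in sorted((filters or {}).items())
    ]
    return {
        "query": {
            "bool": {
                # Scored full-text match on the given fields.
                "must": [{"multi_match": {"query": text, "fields": list(fields)}}],
                # Filters restrict the result set without affecting the score.
                "filter": filter_clauses,
            }
        }
    }


def build_similar_documents_query(
    index: str,
    doc_id: str,
    fields: Sequence[str] = ("content",),  # hypothetical field name
) -> Dict[str, Any]:
    """Term-based similarity to an already indexed reference document."""
    return {
        "query": {
            "more_like_this": {
                "fields": list(fields),
                "like": [{"_index": index, "_id": doc_id}],
            }
        }
    }
```

The resulting dictionaries can be sent as the body of a search request to an Elasticsearch index; how the project actually issues its queries is not described in this report.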

From a functional perspective, the Smart Search project covers the systematization and indexing of documents, the semantic search for relevant documents, and the exploration of intertextual links between document collections. These services rely on semantic models that represent the knowledge contained in collections of Romanian texts and on algorithms for semantic recommendation. They are offered through an intuitive web application, currently in beta, which is installed and running on a UPB server provisioned for BCUB.

The Lib2Life web portal incorporates both general functionalities accessible to any user (account creation, authentication, password recovery) and functionalities specific to each user role (administrator, librarian, reader), according to the functional specifications defined in the project proposal. The interaction scheme is shown in Figure 1, and the detailed functionalities are listed in Table 1. In the table, functionality developed since the last progress report is marked in green.

Figure 1. Workflow of various user roles.
Features shown with dotted lines are under development.

The document upload process takes as input the OCR scans produced by the P2 subproject. Unlike in the previous report, where the metadata was provided in a separate XML file, these scans now embed the metadata. Once the document is uploaded, text processing begins: automatic text extraction and correction, followed by the extraction of headers, footnotes, images, and tables. A final step allows the extracted text to be edited manually. The processed document and its metadata are then saved in the Elasticsearch database. The interaction scheme with the P2 eLibrary Builder component project is shown in Figure 2.
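The order of the upload steps described above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: all names are hypothetical, the processing stages are trivial placeholders (whitespace normalization stands in for text correction), and an in-memory dictionary stands in for the Elasticsearch database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Document:
    doc_id: str
    ocr_text: str               # text layer of the OCR scan received from P2
    metadata: Dict[str, str]    # metadata embedded in the scan
    text: str = ""
    history: List[str] = field(default_factory=list)


def extract_and_correct(doc: Document) -> Document:
    # Placeholder for automatic extraction and correction: collapse whitespace.
    doc.text = " ".join(doc.ocr_text.split())
    doc.history.append("extract_and_correct")
    return doc


def extract_structure(doc: Document) -> Document:
    # Placeholder for header, footnote, image, and table extraction.
    doc.history.append("extract_structure")
    return doc


def manual_edit(doc: Document, edited_text: Optional[str] = None) -> Document:
    # The librarian may overwrite the extracted text; None keeps it unchanged.
    if edited_text is not None:
        doc.text = edited_text
    doc.history.append("manual_edit")
    return doc


def save(doc: Document, store: Dict[str, dict]) -> Document:
    # `store` is an in-memory stand-in for the Elasticsearch index.
    store[doc.doc_id] = {"text": doc.text, "metadata": doc.metadata}
    doc.history.append("save")
    return doc


def process_upload(
    doc: Document, store: Dict[str, dict], edited_text: Optional[str] = None
) -> Document:
    doc = extract_and_correct(doc)
    doc = extract_structure(doc)
    doc = manual_edit(doc, edited_text)
    return save(doc, store)
```

Keeping each stage as a separate function mirrors the step-by-step description and makes it possible to replace a placeholder with a real implementation without touching the others.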

Table 1. Smart Search functionalities.

Functionality | Required role | Description

Implemented functionalities

Register | General | Allows the creation of a user account within the application.
Authentication | General | Allows users to authenticate within the system.
Password recovery | General | Allows the generation of a new authentication password.
Search facilities | General | Allows simple and advanced (filter-based) searches for documents saved in the system.
Document view | General | Allows viewing the original document (PDF).
Document upload | Librarian | Allows uploading documents and their metadata, as well as extracting and processing their text.
Metadata editing | Librarian | Allows changing the metadata of documents saved in the system.
Document deletion | Librarian | Allows deleting problematic documents.
Ontology view | Librarian | Allows viewing the OWL ontology.
Semantic recommendations | General | Recommends similar documents based on semantic models.

Functionalities in progress

User management | Administrator | Allows the management of user accounts and their respective roles.
Ontology editing | Administrator | Will allow the modification of the OWL ontology through the web application.
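The mapping between functionalities and required roles in Table 1 can be expressed as a simple lookup. The sketch below is illustrative only: the identifiers are hypothetical, and it assumes that roles are not hierarchical (for example, it does not grant an administrator the librarian functionalities), which the report does not specify.

```python
# Required role per functionality, transcribed from Table 1.
# "general" means the functionality is accessible to any user.
REQUIRED_ROLE = {
    "register": "general",
    "authentication": "general",
    "password_recovery": "general",
    "search": "general",
    "document_view": "general",
    "document_upload": "librarian",
    "metadata_editing": "librarian",
    "document_deletion": "librarian",
    "ontology_view": "librarian",
    "semantic_recommendations": "general",
    # Functionalities still in progress.
    "user_management": "administrator",
    "ontology_editing": "administrator",
}


def is_allowed(user_role: str, functionality: str) -> bool:
    """Return True if `user_role` may use `functionality`.

    Assumption: no role inheritance; a role only gets the general
    functionalities plus those that list it explicitly.
    """
    required = REQUIRED_ROLE[functionality]
    return required == "general" or user_role == required
```

For example, `is_allowed("reader", "search")` is true, while `is_allowed("reader", "document_upload")` is false.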

Figure 2. Interaction with the P2 project and the PDF document processing workflow.