Projects metadata

OntoClue

Started on 2021-01-01

Description

OntoClue aims to provide a framework to optimize and compare document-similarity and doc2doc-relevance approaches based on word-embeddings and document-embeddings. Using the RELISH dataset, each approach creates document-embeddings and calculates the cosine similarity between them. An optimizer finds the hyperparameter combination that, with no further tuning or training, best matches the three document relevance assessments coming from RELISH. The approaches are compared using Precision (P@N) and Normalized Discounted Cumulative Gain (NDCG@N). TREC 2005 Genomics Track data has also been analyzed, using a repurposed version that transforms document-to-topic relevance into document-to-document relevance. The main focus of this project is RELISH.
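
The evaluation described above can be illustrated with a minimal Python sketch. It is not the OntoClue implementation: the function names, the toy vectors, and the mapping of the three relevance assessments to graded values 2, 1, and 0 are all assumptions made for illustration, and NDCG is computed here with a linear gain and a log2 discount, which is only one common variant.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, candidates):
    """Return candidate ids sorted by descending cosine similarity to the query."""
    return sorted(
        candidates,
        key=lambda doc_id: cosine_similarity(query_vec, candidates[doc_id]),
        reverse=True,
    )


def precision_at_n(ranked_relevance, n, threshold=1):
    """Fraction of the top-n ranked items whose relevance is >= threshold."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sum(1 for r in ranked_relevance[:n] if r >= threshold) / n


def dcg_at_n(gains, n):
    """Discounted cumulative gain with linear gain and log2 position discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:n]))


def ndcg_at_n(ranked_relevance, n):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_n(sorted(ranked_relevance, reverse=True), n)
    if ideal == 0:
        return 0.0
    return dcg_at_n(ranked_relevance, n) / ideal


# Hypothetical toy data: one query document and three candidate documents.
query_embedding = [1.0, 0.0, 1.0]
candidate_embeddings = {
    "doc_a": [1.0, 0.1, 0.9],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.5, 0.5, 0.5],
}
# Assumed graded relevance of each candidate to the query (2, 1, 0).
assessed_relevance = {"doc_a": 2, "doc_b": 0, "doc_c": 1}

ranking = rank_by_similarity(query_embedding, candidate_embeddings)
ranked_relevance = [assessed_relevance[doc_id] for doc_id in ranking]

print("ranking:", ranking)
print("P@2:", precision_at_n(ranked_relevance, 2))
print("NDCG@3:", ndcg_at_n(ranked_relevance, 3))
```

In a setup like this, a hyperparameter search would repeat the ranking and scoring steps for each candidate configuration of an embedding approach and keep the configuration with the best scores.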

Keywords

word-embeddings, document-embeddings, ontology-embeddings, document similarity, document relevance, doc2doc relevance, ontology enrichment

Url

Current project members

Previous project members

Department

Semantic Technologies team at ZB MED

Parent organization, consortium or research project

Deutsche Zentralbibliothek für Medizin (ZB MED) - Informationszentrum Lebenswissenschaften

NFDI4DataScience

STELLA Living Labs Project

Funding

  • Identifier: 460234259
  • Description: Project no. 460234259 (corresponding to the NFDI4DataScience consortium)

  • Identifier: 407518790
  • Description: Project no. 407518790 (corresponding to the STELLA project)

Outcomes

A Comparison of Vector-based Approaches for Document Similarity Using the RELISH Corpus

OntoClue, a framework to compare vector-based approaches for document relatedness using the RELISH corpus

OntoClue, a framework to compare vector-based approaches for document relatedness using the RELISH corpus - Poster

Ontology Clustering with OWL2Vec*

Complete Medline abstracts corpus between 2015-2019 annotated Whatizit text annotation tool

Whatizit performance evaluation against CRAFT corpus

External contributors