Datasets metadata

Whatizit performance evaluation against CRAFT corpus

identifier DOI:10.5281/zenodo.4903981
name Whatizit performance evaluation against CRAFT corpus
description Whatizit performance evaluation against the CRAFT corpus with respect to Gene Ontology annotations
keywords
  • Whatizit

  • Semantic annotation

  • CRAFT

  • manual annotation

  • performance

license
url https://zenodo.org/record/4903981
about Performance assessment
datePublished 2021-06-05
encodingFormat text/csv
isAccessibleForFree True
author
publisher

Complete Medline abstracts corpus between 2015-2019 annotated Whatizit text annotation tool

identifier DOI:10.5281/zenodo.5035290
name Complete Medline abstracts corpus between 2015-2019 annotated Whatizit text annotation tool
description Gene Ontology annotations for Medline abstracts from 2015 to 2019 using Whatizit
keywords
  • Whatizit

  • Semantic annotation

  • Medline

  • text-mining

license
url https://zenodo.org/record/5035290
about Pattern-matching ontological annotation
datePublished 2021-06-27
encodingFormat xml
isAccessibleForFree True
author
publisher

Document-to-document relevant assessment for TREC Genomics Track 2005

identifier DOI:10.5281/zenodo.7324822
name Document-to-document relevant assessment for TREC Genomics Track 2005
description A CSV table with document-to-document relevance assessment judgements on a subset of the TREC Genomics Track 2005, produced by four annotators. The 'raw data document evaluation' table contains six columns: first, a consecutive id; second, the original TREC topic; third, the PubMed ID (PMID) used as the reference document; fourth, the PMID evaluated for relevance with respect to the reference document; fifth, the relevance score (2 definitely relevant, 1 partially relevant, 0 non-relevant); and sixth, the annotator id
keywords
  • Document-to-document relevance

  • TREC Genomics Track 2005

  • relevance assessment

license
url https://zenodo.org/record/7324822
about Document-to-document relevant assessment for TREC Genomics Track 2005
datePublished 2022-11-15
measurementTechnique Manual curation
variableMeasured Document-to-document relevance assessment
encodingFormat text/csv
isAccessibleForFree True
author
publisher
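
The six-column layout described for this record can be read with a few lines of Python. This is a minimal sketch, not the dataset's official loader: the column names below are hypothetical labels for the six positions given in the description, the file is assumed to have no header row, and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical names for the six columns listed in the description;
# the actual file may use different headers or none at all.
COLUMNS = ["id", "topic", "reference_pmid", "evaluated_pmid", "relevance", "annotator"]

# Score meanings as stated in the description.
RELEVANCE_LABELS = {2: "definitely relevant", 1: "partially relevant", 0: "non-relevant"}


def read_judgements(handle):
    """Yield one dict per judgement row, with the relevance score as an int."""
    reader = csv.DictReader(handle, fieldnames=COLUMNS)
    for row in reader:
        row["relevance"] = int(row["relevance"])
        yield row


# Made-up rows for illustration only (not taken from the dataset).
sample = io.StringIO(
    "1,T1,111,222,2,A1\n"
    "2,T1,111,222,1,A2\n"
    "3,T1,111,333,0,A1\n"
)
rows = list(read_judgements(sample))

# Count how often each relevance label was assigned.
counts = Counter(RELEVANCE_LABELS[row["relevance"]] for row in rows)
```

If the downloaded file does start with a header row, dropping the `fieldnames` argument lets `csv.DictReader` take the names from the file instead.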

Fleiss kappa for doc-2-doc relevance assessment

identifier DOI:10.5281/zenodo.7338056
name Fleiss kappa for doc-2-doc relevance assessment
description Fleiss' kappa measuring inter-annotator agreement on a document-to-document relevance assessment task. The table contains 7 columns. The first column lists the topics, 8 in total. The second column lists the “reference articles”, identified by their PubMed IDs and organized by topic. The third column gives the Fleiss' kappa results. The fourth column gives the interpretation of the Fleiss' kappa results: i) “Poor” for results <0.20, ii) “Fair” for results within 0.21 - 0.40, and iii) “Moderate” for results within 0.41 - 0.60. The fifth, sixth, and seventh columns list the PubMed IDs of the evaluation articles that the four annotators rated as “Relevant”, “Partially relevant”, and “Non-relevant”, respectively, with regard to the corresponding “reference article”
keywords
  • Fleiss' Kappa

  • Inter-annotator agreement

  • TREC Genomics Track 2005

  • relevance assessment

license
url https://zenodo.org/record/7338056
about Inter-annotator agreement for relevance assessment
datePublished 2022-11-19
measurementTechnique Fleiss' kappa
variableMeasured Inter-annotator agreement
encodingFormat text/tsv
isAccessibleForFree True
author
publisher
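
For readers who want to see how a Fleiss' kappa value and its interpretation label relate, the following is a minimal sketch of the standard Fleiss' kappa formula. It is not the authors' code. It assumes every item is rated by the same number of annotators, and the handling of band boundaries (for example, a value of exactly 0.20) is an assumption, since the description only names the three bands.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of category labels
    (one label per annotator; every item needs the same number of raters)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item must have the same number of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of annotator pairs that agree on this item.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        per_item_agreement.append(agreeing / (n_raters * (n_raters - 1)))

    observed = sum(per_item_agreement) / n_items
    # Agreement expected by chance from the overall category proportions.
    expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


def interpret(kappa):
    """Map kappa to the bands named in the description.
    Boundary handling is an assumption, not taken from the dataset."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    return "above the bands listed in the description"


# Made-up example: two evaluation articles, four annotators,
# scores 2 / 1 / 0 as in the relevance assessment task.
example = [[2, 2, 1, 0], [0, 0, 0, 0]]
kappa = fleiss_kappa(example)
label = interpret(kappa)
```

In this made-up example the kappa works out to 11/51, roughly 0.22, which falls in the “Fair” band.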

Protein Function Embeddings: First Beta Release of Datasets

identifier DOI:10.5281/zenodo.7793384
name Protein Function Embeddings: First Beta Release of Datasets
description Datasets generated in a thesis project that explores how information on protein functions can be exploited through embeddings to improve protein function annotations
keywords
  • Protein function

  • Protein function embeddings

  • Word embeddings

  • Document embeddings

license
url https://zenodo.org/record/7793384
measurementTechnique
  • Word embeddings

  • Document embeddings

  • Cosine similarity

about Protein Function Embeddings
datePublished 2023-04-02
isAccessibleForFree True
author
publisher