OntoTagME API Documentation

OntoTagME is an entity linker built for biologically relevant texts. It processes biological texts provided by the user and extracts a set of spots, linking each one to the relevant Wikidata page. You can annotate text snippets of various lengths. OntoTagME is also fully integrated with PubTator, a Named Entity Recognition tool specialized for biology. If you need to annotate a biological paper (identified by a PubMed ID or a PubMed Central ID), OntoTagME integrates its results with those from PubTator, thus increasing the quality of the annotation. You can query OntoTagME through REST APIs as described below.

Version

OntoTagME is using a biological subset of the English Wikidata dump, version: 2022-10.

Endpoint URL

https://ontotagme-entity-linker.d4science.org/

Registering to the service

The service is hosted by the D4Science Infrastructure. To obtain access, you need to register to the TagMe VRE and get your authorization token by clicking on the "access the VRE" button. You then have everything in place to issue a query to the OntoTagME RESTful API.

How to Annotate

OntoTagME works in two different ways:

  • PAPER QUERY: in this case, results from OntoTagME and PubTator are integrated. If the paper is not present in PubTator, the resulting annotations are provided by OntoTagME only.
  • TEXT QUERY: in this case, only OntoTagME is used as a backend.

A generic annotation looks like the following.

{
    "wid": Identifier of the entity,
    "spot": textual fragment referring to the entity,
    "Word": Title of the Wikidata page,
    "categories": LIST of categories assigned to the entity,
    "start_pos": index of the first character of the spot in the text,
    "end_pos": index of the last character of the spot in the text,
    "section": section where the entity was extracted,
    "annotation_mode": info regarding which annotator found the annotation (PubTator or OntoTagME),
    "wiki_url": url of the wikidata page regarding the entity, if applicable
}
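
As a minimal sketch of how these fields might be consumed, the following Python snippet groups annotations by the annotator that produced them. It assumes the annotations are available as a list of dictionaries with the keys listed above; the exact shape of the service response is not specified here, and the function name and sample records are hypothetical.

```python
from collections import defaultdict

def group_by_annotator(annotations):
    """Group annotation dicts by their "annotation_mode" field."""
    groups = defaultdict(list)
    for ann in annotations:
        # Falling back to "unknown" when the field is missing is an assumption.
        groups[ann.get("annotation_mode", "unknown")].append(ann)
    return dict(groups)

# Hypothetical records, made up for illustration; not real service output.
sample = [
    {"spot": "alkaline phosphatases", "annotation_mode": "OntoTagME"},
    {"spot": "5-nucleotidase", "annotation_mode": "PubTator"},
]

for annotator, anns in group_by_annotator(sample).items():
    print(annotator, [a["spot"] for a in anns])
```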

Annotate by Article ID (PubMed and PubMed Central)

In order to get the annotations for a full text on PubMed Central or an abstract on PubMed, you can use the endpoint

https://ontotagme-entity-linker.d4science.org/annotate_by_id

The only supported HTTP method is POST.
The annotation is performed by OntoTagME and enriched with PubTator.

Parameters

  • a_id - required - the article ID to annotate. Formatted like PMCxxxxxx for full-text annotations, and xxxxx for annotating abstracts only.
  • token - required - the TagMe VRE authorization token. To get it, click on the "access the VRE" button.
  • mode - defaults to 'pmc' - in this field, specify 'pm' if you want to annotate abstracts, or 'pmc' if you want to annotate full-texts.
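
The relationship between the ID format and the mode can be sketched as a small helper that picks the mode from the ID itself. The helper names infer_mode and build_payload are hypothetical, and the regular expressions are an assumption based only on the ID formats described above (a PMC prefix followed by digits, or digits alone).

```python
import re

def infer_mode(a_id):
    """Return 'pmc' for IDs like PMC6982432 and 'pm' for purely numeric IDs."""
    if re.fullmatch(r"PMC\d+", a_id):
        return "pmc"
    if re.fullmatch(r"\d+", a_id):
        return "pm"
    raise ValueError("Unrecognized article ID: {!r}".format(a_id))

def build_payload(a_id):
    """Build a request body with the a_id and mode parameters."""
    return {"a_id": a_id, "mode": infer_mode(a_id)}

print(build_payload("PMC6982432"))
print(build_payload("33403489"))
```

Rejecting IDs that match neither pattern is a design choice of this sketch; it surfaces typos before a request is sent.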

Example

The following simple Python script tests the "query by id" endpoint.

import requests

# Endpoint for annotating a PubMed or PubMed Central article by its ID.
URL_ID = "https://ontotagme-sobigdata.d4science.org/annotate_by_id"
TOKEN = "your_tagme_vre_token"  # replace with your TagMe VRE token
# The authorization token is sent in the "gcube_token" header.
headers = {"Content-Type": "application/json", "gcube_token": TOKEN}

def annotate_by_id(a_id, mode="pmc"):
    # mode is 'pmc' for full texts or 'pm' for abstracts only.
    payload = {"a_id": a_id, "mode": mode}
    r = requests.post(URL_ID, headers=headers, json=payload)
    if r.status_code != 200:
        raise Exception("Error on article: {}\n{}".format(a_id, r.text))
    return r.json()

print(annotate_by_id("PMC6982432"))
print(annotate_by_id("33403489", mode="pm"))
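
When annotating several articles, one failing ID need not stop the whole run. The following is a minimal sketch, not part of the service itself: annotate_many is a hypothetical helper that takes any function with the same calling convention as annotate_by_id and collects results and errors separately. A stand-in function is used here so the example runs without network access.

```python
def annotate_many(article_ids, annotate):
    """Call annotate(a_id) for each ID, keeping results and errors apart."""
    results, errors = {}, {}
    for a_id in article_ids:
        try:
            results[a_id] = annotate(a_id)
        except Exception as exc:  # the script raises a plain Exception on failure
            errors[a_id] = str(exc)
    return results, errors

def fake_annotate(a_id):
    # Stand-in for annotate_by_id; returns an empty list instead of real output.
    if not a_id:
        raise Exception("Error on article: empty ID")
    return []

results, errors = annotate_many(["PMC6982432", ""], fake_annotate)
print(sorted(results), sorted(errors))
```

In real use, annotate_by_id would be passed in place of fake_annotate.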

Annotate by Text

In order to get the annotations for a textual snippet, you can use the endpoint

https://ontotagme-entity-linker.d4science.org/annotate_by_text

The only supported HTTP method is POST.
The annotation is performed by OntoTagME alone. In this case, no PubTator enrichment is performed.

Parameters

  • text - required - The snippet of text to annotate.
  • token - required - the TagMe VRE authorization token. To get it, click on the "access the VRE" button.

Example

The following simple Python script tests the "query by text" endpoint.

import requests

# Endpoint for annotating a free-text snippet.
URL_TEXT = "https://ontotagme-sobigdata.d4science.org/annotate_by_text"
TOKEN = "your_tagme_vre_token"  # replace with your TagMe VRE token
# The authorization token is sent in the "gcube_token" header.
headers = {"Content-Type": "application/json", "gcube_token": TOKEN}

def annotate_by_text(text):
    payload = {"text": text}
    r = requests.post(URL_TEXT, headers=headers, json=payload)
    if r.status_code != 200:
        # Report the offending snippet together with the server response.
        raise Exception("Error on text: {}\n{}".format(text, r.text))
    return r.json()

print(annotate_by_text("Comparison with alkaline phosphatases and 5-nucleotidase"))
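
Because each annotation carries start_pos and end_pos, a returned spot can be checked against the submitted text. The sketch below assumes, following the field descriptions, that end_pos is the index of the last character of the spot (inclusive). The helper name spot_matches and the sample annotation are hypothetical, not real service output.

```python
def spot_matches(text, ann):
    """Check that text[start_pos..end_pos] (end inclusive) equals the spot."""
    return text[ann["start_pos"]:ann["end_pos"] + 1] == ann["spot"]

text = "Comparison with alkaline phosphatases and 5-nucleotidase"
spot = "5-nucleotidase"
start = text.index(spot)

# Hypothetical annotation built by hand for illustration.
ann = {"spot": spot, "start_pos": start, "end_pos": start + len(spot) - 1}

print(spot_matches(text, ann))
```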

Credits and References

OntoTagME is a joint effort between the University of Pisa, Scuola Normale Superiore, and the University of Catania.

Its first version was presented in the Applied Network Science paper "NetME: On-the-fly knowledge network construction from biomedical literature", by Muscolino et al.

Please cite our work if you decide to use OntoTagME for your research.
