PharmaKG Explorer: Knowledge Graph Embedding-Driven Conversational Querying and Real Time Link Prediction in Drug-Gene-Disease Networks

Author ORCID

Shehan Irteza Pranto 0000-0002-9818-4439

Publication Date

5-2-2025

Abstract

In this project, we develop an end-to-end platform for interactive knowledge graph embedding, designed to accelerate drug discovery. Starting from curated biomedical datasets (PharmAlchemy), we extract and filter 311,496 triplets linking drugs, genes, diseases, and side effects, then deploy the complete graph, including nodes, relations, and properties, on Neo4j Aura. A natural‑language interface built with Gradio and LangChain uses LLM prompting (Gemma2‑9b‑It) to translate user inputs into Cypher queries, enabling conversational exploration of the graph. We also implement and compare three state‑of‑the‑art embedding methods (TransE, RotatE, and ComplEx) on link prediction benchmarks, evaluating performance with metrics such as Hits@k and mean reciprocal rank. To make the link prediction findings actionable, we provide an interactive GUI that lets researchers visualize and test potential new relationships in real time. This integrated framework combines embedding techniques with user‑friendly interfaces to support rapid hypothesis generation and discovery in biomedical research.
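For readers unfamiliar with the three embedding methods and the ranking metrics named above, the following is a minimal sketch of their commonly published definitions in plain Python. It is not the project's implementation: the function names, the choice of L1 distance, the optimistic tie handling, and the toy vectors are all assumptions made for illustration.

```python
import cmath

def transe_score(h, r, t):
    """TransE: negative L1 distance between h + r and t (higher = more plausible).
    The L1 norm is an assumption; L2 is also commonly used."""
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))

def rotate_score(h, phases, t):
    """RotatE: rotate each complex component of h by a relation phase,
    then take the negative distance to t."""
    return -sum(abs(hi * cmath.exp(1j * p) - ti)
                for hi, p, ti in zip(h, phases, t))

def complex_score(h, r, t):
    """ComplEx: real part of the trilinear product <h, r, conj(t)>."""
    return sum((hi * ri * ti.conjugate()).real
               for hi, ri, ti in zip(h, r, t))

def rank_of_true(scores, true_index):
    """1-based rank of the true candidate among all scored candidates.
    Ties are resolved optimistically (an assumption)."""
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)

def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triples."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test triples whose true entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Toy example with hypothetical 2-dimensional embeddings.
head = [1.0, 0.0]
relation = [0.5, 1.0]
candidate_tails = [[1.5, 1.0], [0.0, 0.0], [1.0, 1.0]]
scores = [transe_score(head, relation, t) for t in candidate_tails]

# Rank each candidate as if it were the true tail of its own test triple.
ranks = [rank_of_true(scores, i) for i in range(len(candidate_tails))]
mrr = mean_reciprocal_rank(ranks)
hits1 = hits_at_k(ranks, 1)
```

In a real evaluation the candidate set would be every entity in the graph, and known true triples are typically filtered out of the candidates before ranking.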

Keywords

Knowledge Graph Embedding, Neo4j Aura, Link Prediction, Drug Repurposing, Cypher Query Language, LLM

Repository

Zenodo

Distribution License

This work is licensed under a Creative Commons Attribution 4.0 International License.
