PharmaKG Explorer: Knowledge Graph Embedding-Driven Conversational Querying and Real Time Link Prediction in Drug-Gene-Disease Networks
Author ORCID
Shehan Irteza Pranto 0000-0002-9818-4439
Publication Date
5-2-2025
Abstract
In this project, we develop an end-to-end platform for interactive knowledge graph embedding, designed to accelerate drug discovery. Starting from the curated PharmAlchemy biomedical datasets, we extract and filter 311,496 triplets linking drugs, genes, diseases, and side effects, then deploy the complete graph, including nodes, relations, and properties, on Neo4j Aura. A natural‑language interface built with Gradio and LangChain uses LLM prompting (Gemma2‑9b‑It) to translate user inputs into Cypher queries, enabling conversational exploration of the graph. We also implement and compare three state‑of‑the‑art embedding methods (TransE, RotatE, and ComplEx) on link prediction benchmarks, evaluating performance with metrics such as Hits@k and mean reciprocal rank. To make the link prediction findings actionable, we provide an interactive GUI that lets researchers visualize and test potential new relationships in real time. This integrated framework combines embedding techniques with user‑friendly interfaces to support rapid hypothesis generation and discovery in biomedical research.
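As an illustration of the three embedding methods and the ranking metrics named in the abstract, the following minimal Python sketch writes out the standard textbook scoring functions for TransE, RotatE, and ComplEx, together with Hits@k and mean reciprocal rank. It is not the project's implementation: the function names, the toy vectors, and the helper `rank_of_true_tail` are assumptions introduced here for illustration, and a real system would use trained embeddings and a dedicated library.

```python
import cmath
import math


def transe_score(h, r, t):
    """TransE: negative L2 distance between h + r and t (real vectors)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rotate_score(h, phases, t):
    """RotatE: the relation rotates each complex coordinate of h by a phase.

    The score is the negative sum of complex moduli of (h * e^{i*phase} - t).
    """
    return -sum(abs(hi * cmath.exp(1j * p) - ti) for hi, p, ti in zip(h, phases, t))


def complex_score(h, r, t):
    """ComplEx: real part of the trilinear product <h, r, conj(t)>."""
    return sum((hi * ri * ti.conjugate()).real for hi, ri, ti in zip(h, r, t))


def rank_of_true_tail(score_fn, h, r, true_tail, candidates):
    """1-based rank of the true tail among candidate tails (higher score wins).

    `candidates` maps a (hypothetical) entity name to its embedding.
    """
    true_score = score_fn(h, r, candidates[true_tail])
    better = sum(
        1
        for name, emb in candidates.items()
        if name != true_tail and score_fn(h, r, emb) > true_score
    )
    return 1 + better


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test triplets."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triplets whose true entity is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


if __name__ == "__main__":
    # Toy, hand-made embeddings (not learned from any dataset).
    head, relation = [1.0, 0.0], [0.0, 1.0]
    tails = {"gene_a": [1.0, 1.0], "gene_b": [5.0, 5.0], "gene_c": [1.0, 2.0]}
    rank = rank_of_true_tail(transe_score, head, relation, "gene_c", tails)
    print("rank of gene_c:", rank)
    print("MRR:", mean_reciprocal_rank([rank]), "Hits@3:", hits_at_k([rank], 3))
```

In all three scoring functions a higher score means a more plausible triplet, so the same ranking and metric code can be reused across models when comparing them.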
Keywords
Knowledge Graph Embedding, Neo4j Aura, Link Prediction, Drug Repurposing, Cypher Query Language, LLM
Repository
Zenodo
Distribution License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Access Instructions
This data is available under the CC-BY 4.0 License