Tuba Gokhan, Ph.D.

Senior Postdoctoral Researcher in NLP

I am a Natural Language Processing (NLP) researcher specialising in Regulatory NLP (RegNLP) and Retrieval-Augmented Generation (RAG). My work focuses on regulatory question answering, multi-passage retrieval over long, structured documents, regulation-focused summarisation and explanation, and synthetic dataset generation for high-stakes, confidentiality-sensitive domains with limited labelled data. I also develop evaluation methods tailored to legal and compliance use cases.

At MBZUAI, I combine academic research with applied projects in collaboration with the Abu Dhabi Global Market (ADGM) and other partners. I design datasets, retrieval pipelines, and evaluation frameworks to make LLM-based systems reliable, auditable, and usable for compliance and supervision tasks.

In parallel, I co-lead the RegNLP community, which brings together researchers and practitioners working on regulation, law, and NLP through shared tasks, workshops, and collaborative projects.

RegNLP
RAG & Evaluation

Work & Education

Professional Experience

Sep 2023 – Present
Senior Postdoctoral Associate in Natural Language Processing

Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates

Leading research on Regulatory NLP and retrieval-augmented generation (RAG/GraphRAG) for high-stakes compliance use cases. I design datasets, retrieval pipelines and evaluation metrics for confidential regulatory corpora and coordinate collaborations with academic and industry partners.

Regulatory NLP · RAG / GraphRAG · Synthetic Dataset Generation · Evaluation Metrics
Sep 2023 – Present
External Research Collaborator – Regulatory NLP & RegTech

Abu Dhabi Global Market (ADGM), FSRA, Abu Dhabi, United Arab Emirates

Co-developing RegTech prototypes based on Regulatory NLP and RAG, including synthetic and real QA benchmarks, answer evaluation workflows and experimental compliance assistants to support supervisory and policy teams.

RegTech · Compliance QA · Benchmarking
May 2019 – May 2023
Doctoral Researcher (Ph.D.) & Graduate Teaching Associate – NLP

University of Birmingham, Birmingham, United Kingdom

Conducted research on graph-based and long-document summarisation, information retrieval and evaluation, with applications to financial and regulatory narratives. Taught core Computer Science / NLP modules and supervised undergraduate and master’s projects.

Long-Document Summarisation · Information Retrieval · Teaching & Supervision
Dec 2013 – Jul 2017
Research Assistant – Computer Engineering

Gazi University, Ankara, Türkiye

Supported teaching and research in the Computer Engineering Department, contributing to data mining and machine learning projects and assisting with laboratory sessions and student supervision.

Data Mining · Machine Learning · Teaching Support
Jul 2012 – Dec 2013
Senior Software Engineer – iOS Developer

ESTER, Eskişehir, Türkiye

Developed and maintained iOS applications for Turkish ministries and public-sector clients, covering requirements analysis, UX implementation, testing and deployment to production environments.

iOS Development · Public Sector · End-to-End Delivery

Education

Doctor of Philosophy (Ph.D.), Computer Science

University of Birmingham · Birmingham, United Kingdom

2019 – 2024

Research on graph-based and unsupervised methods for long-document summarisation, with applications to financial and regulatory texts.

Master of Science (M.Sc.), Advanced Computer Science

Newcastle University · Newcastle upon Tyne, United Kingdom

2017 – 2018

Focus on data-intensive computing and visualisation, including stream data analytics and machine learning.

Master of Science (M.Sc.), Computer Engineering

Gazi University · Ankara, Türkiye

2014 – 2016

Research on data warehousing and big data platforms for nutrition and health research.

Bachelor of Science (B.Sc.), Computer Engineering

Eskişehir Osmangazi University · Eskişehir, Türkiye

2008 – 2012

Foundations in computer engineering, algorithms, and software development.

Contact

If you would like to discuss research collaborations, RegTech projects, shared tasks, or invited talks and tutorials, you can reach me via email or LinkedIn.