Computational Epidemiologist • AI Researcher • ML Specialist • Data Scientist

Dr Reju Sam John

Transforming complex data into evidence that drives decisions

About Me

I'm a computational epidemiologist and AI researcher based in Auckland, New Zealand. With a PhD in Physics (Computational Modelling) and 7+ years of postdoctoral research, I bridge the gap between advanced computational methods and real-world health outcomes.

My work spans the full research lifecycle, from study design and ethics applications through data collection and analysis to publication in high-impact journals such as Nature Communications and the Journal of the Royal Society Interface.

I'm passionate about how AI and machine learning can improve diagnostic services and advance health equity, particularly within Aotearoa New Zealand's health system. I bring a strong commitment to Te Tiriti o Waitangi principles, ensuring Māori data sovereignty is embedded in research design and practice.

Currently seeking data scientist and research collaboration opportunities where I can apply my expertise in computational modelling, ML pipelines, and health informatics to drive evidence-based decisions.

AI & ML in Health

Designing and implementing AI/ML pipelines for infection dynamics modelling, diagnostic evaluation, and health equity analysis.

Data Science

Scalable ETL pipelines, ensemble ML models with 90% forecasting accuracy, and end-to-end reproducible analytical workflows.

Epidemiology

Metapopulation SEIR models, zoonotic disease dynamics, and computational frameworks published in top-tier journals.

Health Equity

Equity-focused AI evaluation frameworks with culturally grounded research practice and Māori data sovereignty principles.

Work Experience

AI, Machine Learning & Data Science Consultant

Molecular Epidemiology and Public Health Laboratory

Current • Mar 2025 – Present
  • Design and implement AI/ML pipelines (Python, R) to model infection dynamics, evaluate transmission risk, and produce evidence for public health recommendations.
  • Translate analytical outputs into reports and presentations for clinical, policy, and non-technical audiences; mentor PhD researchers in reproducible workflows.
  • Embed Te Tiriti o Waitangi principles and Māori data sovereignty considerations in study design, data-sharing agreements, and dissemination planning.
Python • R • ML Pipelines • Public Health • Mentoring

Production Operator — Regulated Healthcare Manufacturing

Fisher & Paykel Healthcare

Sep 2024 – Feb 2026
  • Operated within an ISO-regulated medical device manufacturing environment, gaining direct operational knowledge of quality systems, risk documentation, and compliance requirements applicable to health technology evaluation.
ISO Compliance • Quality Systems • Medical Devices

Postdoctoral Fellow — Data Science, Network Theory & Health Informatics

The University of Auckland

Jan 2023 – Mar 2024
  • Built and managed a longitudinal participant database for a clinical study tracking infection dynamics in vulnerable populations.
  • Designed large-scale computational models of disease propagation coupled with (mis)information diffusion, using pragmatic research designs relevant to AI-enabled health interventions.
  • Developed reproducible, version-controlled analytical workflows in Linux environments and delivered results to academic, clinical, and policy audiences.
Network Theory • Health Informatics • Clinical Studies • Linux

Postdoctoral Fellow — Computational Epidemiologist & Health Data Scientist

Massey University

Nov 2020 – Jan 2023
  • Led the design and delivery of metapopulation SEIR models across 340 cities, producing a first-author paper in Journal of the Royal Society Interface.
  • Built scalable ETL pipelines integrating multi-source public health datasets, achieving 90% accuracy in epidemic trend forecasting using ensemble ML models.
  • Collaborated with international multidisciplinary teams across New Zealand, USA, and Europe.
SEIR Models • ETL Pipelines • Ensemble ML • Forecasting

Postdoctoral Fellow — Astrophysics Simulations & Computational Data Science

Inter-University Centre for Astronomy and Astrophysics (IUCAA), India

Aug 2018 – Nov 2020
  • Designed automated processing pipelines for large-scale (1 TB+) simulation datasets on HPC clusters; applied statistical inference and ML to high-dimensional data.
HPC • Simulations • Statistical Inference • Big Data

Technical Skills

Programming & Tools

Python • R • SQL • Bash/Shell • C • Git/GitHub • Linux • LaTeX

AI & Machine Learning

Scikit-learn • Deep Learning • NLP • Ensemble Methods • Statistical Modelling • Time Series • Classification

Data Engineering

ETL Pipelines • REDCap • Web Scraping • APIs • Data Validation • Pandas • NumPy

Visualization & BI

Power BI • Tableau • Streamlit • Matplotlib • Seaborn • Plotly

Cloud & DevOps

AWS • Google Cloud • GitHub Actions • CI/CD • HPC Clusters • Docker

Domain Expertise

Epidemiology • Health Informatics • Clinical Trials • ICH GCP R3 • Health Equity • Te Tiriti o Waitangi

Selected Publications

11 peer-reviewed articles • 169+ citations • Google Scholar • ORCID

2024

High connectivity and human movement limits the impact of travel time on infectious disease transmission

John, R.S. et al.

Journal of the Royal Society Interface, 21(210)

First Author

2024

Modelling Lassa virus dynamics in West African Mastomys natalensis

John, R.S., Fatoyinbo, H.O., & Hayman, D.T.S.

Journal of the Royal Society Interface, 21(216)

First Author

2023

Identifying SARS-like coronavirus spillover risk hotspots

Muylaert, R.L. et al. (incl. John, R.S.)

Nature Communications

Nature Comms

2022

Transmission models indicate Ebola virus persistence in non-human primate populations is unlikely

Hayman, D.T.S., John, R.S., & Rohani, P.

Journal of the Royal Society Interface, 19(187)

Featured Projects

Open-source research code and analytics platforms

Respiratory Biometrics Analytics

End-to-end clinical analytics pipeline processing 34,000+ ICU vital-sign measurements with four-gate ETL, audit logging, and CI/CD via GitHub Actions.

Python • ETL • Power BI • CI/CD

SEIR Metapopulation Model

Open-source code accompanying a first-author publication: a metapopulation SEIR model across 340 Chinese cities, built as end-to-end reproducible research at publication standard.

Jupyter • Epidemiology • Published

Lassa Virus Dynamics Model

Zoonotic Lassa transmission modelling in West African rodent populations. Open-source research code accompanying Journal of the Royal Society Interface publication.

Jupyter • Zoonotic Disease • Published

Bakery Sales Intelligence

Market basket analysis and temporal pattern detection for bakery sales data with weather integration. Demonstrates applied data science for business analytics.

Python • Market Basket • Analytics

Education & Certifications

Education

Ph.D. in Physics
Computational Modelling, Data Analysis & Simulation
Pondicherry University, India
2011 – 2018

M.Sc. in Physics
Mahatma Gandhi University, India
2006 – 2008

Certifications

Understanding Te Tiriti o Waitangi
Groundwork • Sep 2025
ICH Good Clinical Practice R3
Global Health Training Centre • Jul 2025
AWS Machine Learning Essential Training
Mar 2025
Google Cloud: Building Data Pipelines
Apr 2025
Deep Learning • NLP with Python • Ensemble Learning
Apr 2025
Power BI • SQL, Tableau, Python & Spark
2024

Get in Touch

Open to data science roles, research collaborations, and consulting opportunities

Looking for a Data Scientist?

I bring 7+ years of research experience, a strong publication record, and hands-on expertise in AI/ML, ETL pipelines, and health informatics. Let's talk about how I can contribute to your team.

Send Me an Email

Research Collaboration?

My expertise spans computational epidemiology, disease modelling, AI diagnostic evaluation, and health equity analytics. I'm eager to collaborate on impactful health research.

Propose a Collaboration