Irmasari Hafidz

MSc, BSc

Pronouns: She/her
  • Doctoral Researcher


Irma is a full-time PhD student at the Advanced VR Research Centre. She obtained an MSc in Business and Information Technology from Universiteit Twente, Netherlands, in 2011 and a BSc in Information Systems from Institut Teknologi Sepuluh Nopember, Indonesia, in 2007.



  • OxML2022 – Oxford Machine Learning 2022, St Catherine's College, University of Oxford, Oxford, 7-10 August 2022 (in person)
  • RSECon 2022 – The 6th Research Software Engineering Conference, Frederick Douglass Centre, Newcastle University, 6-8 September 2022 – travel grant awarded by RSECon2022
  • CardiffNLP, Cardiff University, Wales, UK, 30 June – 1 July 2022 – travel grant from the LU Women in Engineering Society Travel Bursary, Loughborough University


  • Collaborations Workshop 2021 (CW21), Software Sustainability Institute, London/online, UK, Tuesday, 30 March to Thursday, 1 April 2021 - Runners Up CW21 - Collaborative Ideas Groups and Winners, “Using Raspberry Pis to deliver Carpentries training in remote locations”, Team Members: Becca Wilson, Irma Hafidz, Alison Clarke, Talia Caplan, Jannetta Steyn
  • Data Study Group – The Alan Turing Institute – OSNI Challenge, 12-23 July 2021, PhD Student Researcher for OSNI Challenge, PIs: Dr Andrew Elliott (University of Glasgow), Dr Stephen Law (UCL/Turing Fellow), Dr Griffith Rees (Postdoc, Alan Turing Institute)

Title of thesis: Health/Clinical Text Processing in Social Media using NLP 

The widespread use of social media has made it a valuable source of information for many domains, including healthcare. However, the unstructured nature of social media data makes it difficult to collect relevant information for healthcare studies. Social media sites such as Twitter, Facebook, and Instagram are popular in Indonesia for discussing health-related concerns such as illness outbreaks, drug usage, and healthcare experiences.

This PhD thesis uses NLP techniques to address the challenges of health and clinical text processing in non-English languages, in particular Indonesian. The first objective is to create and test an NLP pipeline for analysing Indonesian social media data relevant to health and clinical issues. To extract meaningful information from the text, the pipeline will combine multiple NLP techniques, such as tokenization, part-of-speech tagging, named entity recognition, and sentiment analysis, using the CLAMP method (Soysal et al., 2018).
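
As a rough illustration of two of the stages named above (tokenization and sentiment analysis), the following is a minimal Python sketch. It is not the CLAMP method and not the thesis pipeline: the function names, the regular expression, and the four-word Indonesian lexicon are all hypothetical and chosen only to show how pipeline stages can be chained.

```python
import re

# Hypothetical toy lexicon, for illustration only. A real study would rely on
# a curated Indonesian sentiment resource or a trained model.
TOY_LEXICON = {"sembuh": 1, "sehat": 1, "sakit": -1, "lelah": -1}

# Keep hashtags and mentions as single tokens; everything else splits on
# non-word characters.
TOKEN_RE = re.compile(r"#\w+|@\w+|\w+")


def tokenize(text):
    """Lowercase a post and split it into words, hashtags, and mentions."""
    return TOKEN_RE.findall(text.lower())


def sentiment_score(tokens, lexicon=TOY_LEXICON):
    """Sum the lexicon polarity of each token; unknown tokens count as 0."""
    return sum(lexicon.get(token, 0) for token in tokens)


def process(text):
    """Run the two sketch stages in sequence and return their outputs."""
    tokens = tokenize(text)
    return {"tokens": tokens, "sentiment": sentiment_score(tokens)}


if __name__ == "__main__":
    print(process("Masih lelah dan sakit kepala #longcovid"))
```

A lexicon lookup is only the simplest possible stand-in for sentiment analysis; part-of-speech tagging and named entity recognition would need trained models and are left out of this sketch.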

The second objective is to apply the developed pipeline to a large dataset of Indonesian social media posts related to health and clinical issues. The dataset will be compiled from tweets with the hashtags #longcovid and #covid and will cover a variety of COVID-related health concerns. The processed data will subsequently be used for several purposes, including illness surveillance, detection of adverse drug events, and sentiment analysis of COVID-19 discussions.
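
The hashtag-based selection described above could be sketched as follows. This is an assumption-laden illustration, not the project's collection code: it filters posts that are already available as plain strings, the helper names are hypothetical, and matching is exact and case-insensitive (so #covid19 would not count as #covid).

```python
import re

# Hashtags named in the thesis description, stored in lowercase.
TARGET_HASHTAGS = {"#longcovid", "#covid"}

HASHTAG_RE = re.compile(r"#\w+")


def hashtags(text):
    """Return the set of lowercase hashtags found in a post."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def select_posts(posts, targets=TARGET_HASHTAGS):
    """Keep posts that carry at least one target hashtag (exact match)."""
    return [post for post in posts if hashtags(post) & targets]


if __name__ == "__main__":
    sample = [
        "Sudah sembuh #LongCovid",
        "Cuaca cerah hari ini",
        "Vaksin kedua #covid19",
        "Tes lagi #COVID",
    ]
    for post in select_posts(sample):
        print(post)
```

Whether near-variants such as #covid19 should be included is a design choice for the dataset; the exact-match rule here is just the most conservative reading of the description.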

Supervisors: Professor Roy Kalawsky and Dr Melanie King