The dengue dataset has two parts:

  1. PDF files in a public Google Drive folder. There are two types of files — "clusters" record the location and size of dengue clusters, whereas "cases" show the daily and weekly reports of new dengue cases.
  2. CSV files converted from the PDFs on a best-effort basis. Disclaimer: we cannot guarantee perfect conversion because the source data is dirty (e.g. wrong, misspelled, or duplicate addresses).

Each file is a snapshot of the National Environment Agency's (NEA) webpage taken on a certain date. Hence, each filename contains the capture date in YYMMDD format (e.g. 12 August 2014 is written as 140812). This file-naming convention allows the files to be sorted in chronological order of data capture.
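As a rough illustration of the naming convention, the sketch below pulls the YYMMDD date out of a filename and sorts snapshots chronologically. The filenames and the `snapshot_date` helper are hypothetical; the actual names in the Drive folder may follow a different pattern around the six-digit date.

```python
import re
from datetime import date, datetime

def snapshot_date(filename: str) -> date:
    """Extract the YYMMDD capture date embedded in a snapshot filename."""
    # Look for a run of exactly six digits (assumed to be the YYMMDD date).
    match = re.search(r"(?<!\d)(\d{6})(?!\d)", filename)
    if match is None:
        raise ValueError(f"no YYMMDD date found in {filename!r}")
    return datetime.strptime(match.group(1), "%y%m%d").date()

# Hypothetical filenames, used only to demonstrate the sort.
filenames = ["clusters-140812.csv", "clusters-130531.csv", "clusters-140101.csv"]
ordered = sorted(filenames, key=snapshot_date)
```

When every filename shares the same prefix, a plain alphabetical sort gives the same order; parsing the date is mainly useful when "clusters" and "cases" files are mixed or when the date is needed for analysis.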

Data has been collected twice a week since May 2013 (except for a gap in October 2013). To date, we have collected 250+ snapshots. We are sharing this dataset because detailed historical data is not available on NEA's website (only current information is shown).

The dataset is also available in a machine-friendly format known as Comma-Separated Values (CSV). Every PDF file is converted into CSV format. Each row in the CSV file represents a single location where dengue cases are reported.

So that dengue locations can be plotted on a map, the CSV files include the latitude and longitude of each address, which are not available in the PDF files. The following table shows the CSV schema (sample CSV file). For enquiries, please contact admin (at) sgcharts (dot) com.

Number Of Cases
  Number of reported dengue cases at this location
Street Address
  Street address where dengue cases are reported (down to the apartment block level)
Latitude
  Latitude of the street address
Longitude
  Longitude of the street address
Cluster Number
  Every dengue cluster is labelled with a serial number. However, this serial number cannot be used as a unique identifier because (1) the serial number is reused in other snapshots and (2) the serial number can change during the cluster's lifetime.
Recent Cases In Cluster**
  Number of dengue cases with onset in the last 2 weeks
Total Cases In Cluster
  Total number of dengue cases reported in this cluster
Date
  Date string in YYMMDD format
Month Number
  Index number of the month, where 1=January and 12=December

**NEA published the count of recent cases per cluster only from December 2013 onwards. For prior data, this field is substituted with a placeholder value of -1.
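
A minimal sketch of reading one CSV snapshot and handling the -1 placeholder is shown below. It assumes the CSV header row uses the field labels from the schema; the "Latitude", "Longitude" and "Date" header names in particular are assumptions, and the sample rows are made up for illustration. Check the sample CSV file for the real header before relying on this.

```python
import csv
import io

# Made-up sample rows; header names are assumed to match the schema labels.
SAMPLE = """Number Of Cases,Street Address,Latitude,Longitude,Cluster Number,Recent Cases In Cluster,Total Cases In Cluster,Date,Month Number
3,Example Street 1 (Block 10),1.3000,103.8000,1,-1,12,131101,11
2,Example Street 2 (Block 20),1.3100,103.8100,2,4,9,140812,8
"""

def load_rows(text_stream):
    """Parse a snapshot CSV into dicts, mapping the -1 placeholder to None."""
    rows = []
    for raw in csv.DictReader(text_stream):
        recent = int(raw["Recent Cases In Cluster"])
        rows.append({
            "cases": int(raw["Number Of Cases"]),
            "address": raw["Street Address"],
            "lat": float(raw["Latitude"]),
            "lon": float(raw["Longitude"]),
            # -1 means NEA had not yet published recent-case counts.
            "recent_cases": None if recent == -1 else recent,
            "total_cases": int(raw["Total Cases In Cluster"]),
            "snapshot": raw["Date"],  # kept as a YYMMDD string
        })
    return rows

rows = load_rows(io.StringIO(SAMPLE))
```

Converting -1 to `None` keeps the placeholder from being summed or averaged as if it were a real count. To read an actual file, pass an open file handle (opened with `newline=""`) instead of the `io.StringIO` wrapper.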

If you would like to use this dataset, please ensure proper attribution to the National Environment Agency's website. Acknowledgement of SG Outbreak with a link back to this site is appreciated :)