Dengue fever is a disease caused by the mosquito-borne dengue virus. Like several other flaviviruses, it is primarily transmitted by Aedes mosquitoes. There are four common serotypes of dengue, with a fifth announced in 2013. Typically, exposure to the dengue virus induces lifelong immunity to that serotype. However, it is hypothesized that exposure to a second serotype can result in Antibody-Dependent Enhancement (ADE) and the more severe disease, dengue hemorrhagic fever. Regardless of the mechanism, infection with a second serotype is more likely to produce severe disease, which has further complicated the development of a safe vaccine. Dengue is common in most tropical countries, and descriptions of outbreaks date back to at least 1779. It is estimated that dengue virus infects 50-100 million people per year and results in approximately 40,000 deaths.

Our Contribution

We applied a time-series approach, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model, to forecasting the monthly number of dengue cases in Brazil, Mexico, Singapore, Sri Lanka and Thailand. We tested the approach using both state-level and national-level data and also examined whether climatic information (precipitation and temperature) improves the forecast. We found that SARIMA models generally outperform a "NULL" model, which is based on a ten-year historic average, out to the longest forecast horizon considered (4 months). Including precipitation as a covariate generally yields modest further improvements. Our investigations also demonstrated the importance of timely reporting: forecast quality degrades as reporting delays increase. These findings are generally consistent with previous studies on smaller data sets.
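As a rough illustration of the baseline comparison described above, the following Python sketch computes a NULL forecast as the average of the same calendar month over the preceding ten years and scores a candidate forecast against it. The source does not say how the ten-year average is computed or which error metric was used. The per-calendar-month averaging, the mean absolute error, and all function names here are assumptions made for illustration, and the series is synthetic.

```python
from statistics import mean


def null_forecast(cases, target, years=10, period=12):
    """Assumed NULL model: mean of the same calendar month in prior years.

    `cases` is a list of monthly counts in time order and `target` is the
    index of the month being forecast. Only observations at least one full
    year before `target` are used, so for horizons of up to 12 months every
    input is already available at the forecast origin.
    """
    past = [
        cases[target - k * period]
        for k in range(1, years + 1)
        if target - k * period >= 0
    ]
    if not past:
        raise ValueError("no history available for this calendar month")
    return mean(past)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observations and predictions."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def relative_mae(observed, model_pred, null_pred):
    """Model MAE divided by NULL MAE; values below 1 favour the model."""
    return mean_absolute_error(observed, model_pred) / mean_absolute_error(
        observed, null_pred
    )


# Synthetic 11-year monthly series (hypothetical, not real dengue counts):
# a fixed seasonal shape plus a slow upward drift of one case per year.
cases = [100 + 10 * (i % 12) + (i // 12) for i in range(132)]

# NULL forecast for the first month of year 11 from the ten prior Januaries.
baseline = null_forecast(cases, target=120)
```

A forecast from any other model, such as SARIMA, could be passed to `relative_mae` as `model_pred` alongside the NULL predictions for the same months. Repeating this for each horizon from 1 to 4 months would give one ratio per horizon.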

Figure 1: One-month-ahead SARIMA predictions of dengue incidence in Brazil over a period of two years. Each panel shows a different state (ordered by size from largest to smallest). Data are shown in black, predictions in red, and 5/95% confidence intervals are marked by the purple shaded regions.


Data for dengue fever tend to be temporally coarse (monthly) but spatially more refined than data for other diseases. Most countries have data for their first level of administrative boundaries, and the Brazil data go all the way down to individual neighborhoods. Our dataset incorporates countries from South America, North America, Oceania, and Asia, with the most coverage in Southeast Asia. We use many sources, but most data are pulled directly from national ministries of health. Data can be viewed, analyzed, and downloaded from here:

Future Plans

Currently, our team is re-focusing its efforts on better understanding the spread and severity of COVID-19.