We combine climate and epidemiological data using Machine Learning to forecast dengue outbreaks, giving public health managers precise insights to direct interventions and health policies more efficiently.
Since 1981, dengue has established itself as endemic in Brazil. The number of interacting factors involved makes control and prediction extremely challenging.
The serotypes DENV-1 through DENV-4 circulate across the country. Infection by one serotype does not provide lasting immunity against the others, keeping the population vulnerable.
In 2019, a new DENV-2 variant caused an increase of up to 149% in cases in some Brazilian states.
The tropical climate provides ideal conditions for Aedes aegypti reproduction, intensified by improper waste disposal and lack of sanitation.
We apply state-of-the-art Machine Learning. Recent studies already show that combining environmental data with epidemiological records can forecast dengue outbreaks.
We incorporate AutoML tools such as TPOT, which uses genetic programming to explore thousands of candidate pipelines and find a well-performing forecasting configuration.
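To give a feel for how an evolutionary search narrows down a forecasting configuration, here is a toy sketch in plain Python. It is not TPOT's implementation and not our production pipeline: it uses only selection and mutation (no crossover, no pipeline trees), the "model" is a deliberately simple lagged moving average, and the series, the function names and the parameter ranges are all made up for illustration.

```python
import math
import random

def forecast_error(config, series):
    """Mean absolute error of a lagged moving-average forecast (a stand-in model)."""
    lag, window = config
    errors = []
    for t in range(lag + window - 1, len(series)):
        # Forecast series[t] from `window` values ending `lag` steps earlier.
        past = series[t - lag - window + 1 : t - lag + 1]
        errors.append(abs(series[t] - sum(past) / window))
    return sum(errors) / len(errors)

def mutate(config, rng, low=1, high=8):
    """Nudge one randomly chosen gene by +1 or -1, clamped to [low, high]."""
    genes = list(config)
    i = rng.randrange(len(genes))
    genes[i] = min(high, max(low, genes[i] + rng.choice((-1, 1))))
    return tuple(genes)

def evolve(series, generations=10, population_size=12, seed=0):
    """Keep the best half of each generation and refill it with mutated copies."""
    rng = random.Random(seed)
    population = [(rng.randint(1, 8), rng.randint(1, 8)) for _ in range(population_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: forecast_error(c, series))
        history.append(forecast_error(ranked[0], series))
        survivors = ranked[: population_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(population_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda c: forecast_error(c, series))
    return best, forecast_error(best, series), history

# Synthetic weekly series with a 26-week cycle (illustrative values only).
series = [50 + 40 * math.sin(2 * math.pi * t / 26) for t in range(104)]
best, best_error, history = evolve(series)
print("best (lag, window):", best, "error:", round(best_error, 3))
```

Because the best half always survives, the best error can never get worse from one generation to the next. Tools such as TPOT apply the same principle to whole preprocessing-plus-model pipelines rather than to two integers.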
Beyond accelerating predictive model production, our approach aims to democratize AI use, making Machine Learning accessible to a broader audience.
Central objective: to offer a faster, optimized and efficient method to forecast dengue outbreaks before they occur.
Our project doesn't rely on guesswork; it's grounded in a decade of rigorously processed historical data.
Complete history of dengue cases in Presidente Prudente (SP) between 2014 and 2024, analyzed week by week through epidemiological weeks.
Hourly meteorological data from the National Institute of Meteorology (INMET) for the same 10-year period: temperature, humidity, precipitation and atmospheric pressure.
Gaps in the climate data were filled, and the hourly readings were aggregated into weekly averages so that they line up with the weekly dengue case records.
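The gap filling and weekly aggregation can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes evenly spaced hourly readings, uses linear interpolation as one possible gap-filling rule, and assumes epidemiological weeks that run Sunday to Saturday. The function names and the sample readings are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean

def fill_gaps(values):
    """Linearly interpolate None entries; gaps at the edges copy the nearest reading."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        raise ValueError("no observations to interpolate from")
    for i in range(known[0]):                      # leading gap
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):    # trailing gap
        filled[i] = filled[known[-1]]
    for a, b in zip(known, known[1:]):             # interior gaps
        step = (filled[b] - filled[a]) / (b - a)
        for i in range(a + 1, b):
            filled[i] = filled[a] + step * (i - a)
    return filled

def weekly_means(timestamps, values):
    """Average readings per week, keyed by the Sunday that starts the week (assumption)."""
    buckets = {}
    for ts, value in zip(timestamps, values):
        day = ts.date()
        # weekday(): Monday=0 ... Sunday=6, so this steps back to the latest Sunday.
        week_start = day - timedelta(days=(day.weekday() + 1) % 7)
        buckets.setdefault(week_start, []).append(value)
    return {week: mean(vals) for week, vals in sorted(buckets.items())}

# Four hourly temperature readings straddling a Saturday/Sunday boundary,
# with one missing value (made-up numbers).
start = datetime(2024, 1, 6, 22)
hours = [start + timedelta(hours=h) for h in range(4)]
temps = [20.0, None, 24.0, 26.0]

weekly = weekly_means(hours, fill_gaps(temps))
for week_start, avg in weekly.items():
    print(week_start, avg)
```

Keying each week by its start date makes the join with weekly case counts a simple dictionary lookup. Rainfall would normally be summed rather than averaged, which only requires swapping `mean` for `sum`.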
All results presented were obtained exclusively from publicly available data from conventional INMET weather stations, with no proprietary stations deployed in the field. This means the 73.5% accuracy represents a conservative baseline: with dedicated local sensors capable of capturing microclimates and hyperlocal variations, we expect the model's predictive performance to improve.
With thousands of data points, we use Principal Component Analysis to reduce complexity and reveal the variables with the greatest impact.
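As an illustration of how Principal Component Analysis exposes the variables that move together, the sketch below standardizes three weekly variables, builds their correlation matrix and extracts the first principal component by power iteration. It is a simplified stand-in: a real analysis would normally use a library such as scikit-learn, and the variable names and readings here are invented for the example.

```python
from math import sqrt
from statistics import mean, stdev

def standardize(columns):
    """Scale each column to zero mean and unit sample variance (assumes no constant column)."""
    scaled = {}
    for name, values in columns.items():
        m, s = mean(values), stdev(values)
        scaled[name] = [(v - m) / s for v in values]
    return scaled

def correlation_matrix(scaled, names):
    """Correlation matrix of already standardized columns."""
    n = len(scaled[names[0]])
    return [[sum(a * b for a, b in zip(scaled[p], scaled[q])) / (n - 1)
             for q in names] for p in names]

def first_component(matrix, iterations=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix via power iteration."""
    k = len(matrix)
    # Uneven start vector, to reduce the chance of starting orthogonal to the answer.
    vec = [1.0 / (i + 1) for i in range(k)]
    for _ in range(iterations):
        nxt = [sum(row[j] * vec[j] for j in range(k)) for row in matrix]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(k))
                     for i in range(k))
    return vec, eigenvalue

# Made-up weekly readings: dew point tracks temperature, pressure does not.
columns = {
    "temperature_c": [20, 22, 24, 26, 28, 30],
    "dew_point_c": [15, 16.5, 18.2, 19.8, 21.1, 23],
    "pressure_hpa": [1012, 1009, 1013, 1010, 1012, 1011],
}
names = list(columns)
loadings, eigenvalue = first_component(correlation_matrix(standardize(columns), names))
explained_share = eigenvalue / len(names)  # trace of a correlation matrix = number of variables

for name, weight in zip(names, loadings):
    print(f"{name}: {weight:.2f}")
print(f"variance explained by first component: {explained_share:.0%}")
```

Variables with large loadings on the leading components are the ones that carry most of the shared variation, which is how the ranking of influential factors is read off.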
Dew point temperature and rainfall accumulated over the preceding 4 to 8 weeks are the strongest factors. The climate from weeks ago drives today's outbreak.
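Lagged and accumulated climate features of this kind can be built along these lines. This is a sketch under assumptions: the 4- and 8-week windows come from the range above, while the 4-week dew point lag, the feature names and the sample values are hypothetical choices made for the example.

```python
def rolling_sum(series, window):
    """Sum of the current value and the (window - 1) before it; None until a full window exists."""
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(series[i + 1 - window : i + 1]))
    return out

def lag(series, weeks):
    """Shift a series so that position t holds the value observed at t - weeks."""
    return ([None] * weeks + list(series))[: len(series)]

def build_features(rain_mm, dew_point_c, windows=(4, 8), dew_lag=4):
    """Return (week_index, feature_dict) pairs for weeks where every feature is defined."""
    cols = {f"rain_sum_{w}w": rolling_sum(rain_mm, w) for w in windows}
    cols[f"dew_point_lag_{dew_lag}w"] = lag(dew_point_c, dew_lag)
    rows = []
    for t in range(len(rain_mm)):
        row = {name: values[t] for name, values in cols.items()}
        if None not in row.values():
            rows.append((t, row))
    return rows

# Ten weeks of made-up weekly rainfall totals (mm) and mean dew point (degrees C).
rain = [10, 0, 5, 20, 0, 15, 30, 0, 5, 10]
dew = [18.0, 18.5, 19.0, 19.5, 20.0, 20.5, 21.0, 21.5, 22.0, 22.5]

features = build_features(rain, dew)
for week, row in features:
    print(week, row)
```

The first weeks are dropped because an 8-week window has no complete history yet. Each remaining row can be paired with that week's case count as a training example.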
Marked fluctuations in temperature and atmospheric pressure dictate the daily and seasonal rhythm of the vector.
Abrupt variations (intense droughts or sudden torrential rains) function as rapid triggers for the disease.
Nuances in wind behavior and relative humidity complete the puzzle of Aedes aegypti proliferation.
Average wind speed and maximum gusts reveal how ventilation affects mosquito dynamics.
The true value of data lies in how it is used. Our tool transforms raw records into strategic intelligence.
More precise and timely insights for public health authorities, enabling action before the outbreak spreads.
Awareness campaigns and control measures applied in the right locations and times, before the outbreak occurs.
Integration between epidemiology, climatology and data science to mitigate the impact of vector-borne diseases.
Any disease whose incidence varies with climate can be forecast using the same approach. The engine is generic; the data changes. For the pharmaceutical industry, this means anticipating demand weeks in advance.
Dengue, zika, chikungunya, urban yellow fever. Same vector, same climate logic.
Flu, bronchitis, seasonal asthma, pneumonia. Winter peaks predictable by climate patterns.
Rhinitis, conjunctivitis, atopic dermatitis. Humidity and temperature dictate demand for antihistamines.