Open and Accurate Air Quality Monitors
We design professional, accurate and long-lasting air quality monitors that are open-source and open-hardware so that you have full control on how you want to use the monitor.
by Anika Krause & Siriel Saladin on March 13, 2025
In September 2024, we published an article about PM2.5 and how the EPA correction algorithm improves the accuracy of AirGradient’s monitors. We observed that our sensors, the PMS5003 from Plantower, systematically overestimate PM2.5 when concentrations are above 10 to 20 µg/m3. The algorithm from the EPA (equation 4) does a great job of bringing down the high readings. Since then, we have implemented the algorithm for our outdoor and indoor monitors, as explained here. Today’s article takes another look at the algorithm’s performance, based on additional data from our global colocation project.
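To give a feel for how such a correction pulls high readings down, here is a minimal sketch. It assumes the widely cited US-wide linear correction for Plantower-based sensors published by Barkjohn et al. (2021), PM2.5 = 0.524 × raw − 0.0862 × RH + 5.75. The coefficients come from that publication, not from this article, and the EPA equation referenced above may be a piecewise extension of this form. The function name `epa_correct_linear` and the clamp at zero are our own assumptions for illustration, so this should not be read as the exact formula running on AirGradient monitors.

```python
def epa_correct_linear(pm25_raw: float, rh_percent: float) -> float:
    """Apply a linear humidity-aware correction to a raw PM2.5 reading.

    Assumption: coefficients follow the US-wide linear correction from
    Barkjohn et al. (2021); they are not taken from this article.
    pm25_raw is in µg/m3, rh_percent is relative humidity in percent.
    """
    corrected = 0.524 * pm25_raw - 0.0862 * rh_percent + 5.75
    # Assumption: clamp at zero, since a negative concentration is not physical.
    return max(corrected, 0.0)


# Illustrative raw readings at 50 % relative humidity.
for raw in (10.0, 50.0, 100.0):
    print(raw, round(epa_correct_linear(raw, 50.0), 1))
```

With a slope of roughly one half, the higher the raw reading, the larger the absolute reduction, which matches the behaviour described above for concentrations beyond 10 to 20 µg/m3.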
We started testing our outdoor OpenAir monitors in 2023 with various reference instruments in various locations around the globe. This effort is only possible thanks to the generous support of numerous local research partners. We are extremely grateful for these ongoing collaborations!
The last blog article (September 2024) covered nine outdoor locations in six countries with a total of 19 years of AirGradient PM2.5 data. Since then, our colocation data has roughly doubled: today’s article covers 17 outdoor locations in 11 countries with 37 years of PM2.5 colocation data.
The table below provides information about the individual locations and reference instruments. The new locations are highlighted in blue.
Abbreviation | City | Country | Reference |
---|---|---|---|
anac | Anacortes | United States | BAM 1020 |
bell | Bellingham | United States | BAM 1020 |
bogo | Bogota | Colombia | Thermo FH62C14-DHS |
brus | Brussels | Belgium | Thermo 1405-DF |
chen | Chennai | India | BAM 1022 |
chia | Chiang Mai | Thailand | BAM |
edm1 | Edmonton (Greenwood) | Canada | Grimm 180 |
empa | Dübendorf | Switzerland | Palas Fidas |
guat | Guatemala City | Guatemala | Teledyne? (not verified) |
lon1 | London (Marylebone Road) | United Kingdom | BAM |
lon2 | London (Honor Oak Park) | United Kingdom | Palas Fidas 200E |
man0 | Manchester* | United Kingdom | Quant Modular |
newy | New York | United States | Thermo TEOM 1405 |
otta | Ottawa | Canada | Not disclosed |
payn | Paynes Prairie | United States | Teledyne T640x |
sydn | Sydney | Australia | Thermo 5014i |
ucam | Cambridge | United Kingdom | Palas Fidas 200S |
vand | Vanderbijlpark | South Africa | Grimm |
* Collocation in Manchester UK took place indoors with six ‘One’ monitors from AirGradient and one ‘Modulair’ from QuantAQ as a reference.
The appendix of this article shows individual scatter plots for every location. It further includes R2, intercept, slope, MAE, RMSE, nRMSE, and mean for every single sensor. The raw reference and AirGradient data for all locations will also be available soon in three versions: hourly and daily averages and higher time-resolution where available.
Below are all 12513 daily PM2.5 data points visualized in a scatter plot. Raw data is shown on the left, and the EPA-corrected data is on the right. This data excludes the Manchester data due to the lack of a FEM reference.
Note that the correlation results shown in the two scatter plots above consider every single data point equally, meaning that locations with more data have more weight on these numbers.
The table below describes the accuracy of our outdoor OpenAir monitors before (‘raw’) and after (‘corrected’) applying the EPA correction algorithm for PM2.5. It is based on daily averages. The R2 value describes how well the AirGradient monitors correlate with the collocated reference instrument – the closer R2 is to 1, the stronger the correlation. nRMSE indicates the relative error of the AirGradient monitors, assuming the reference instruments are 100% correct – the lower the nRMSE, the better the agreement. The last column shows whether and by how much the EPA correction algorithm has improved the accuracy of our monitors.
Improvement factor = nRMSE (raw) / nRMSE (corrected)
If the factor is greater than 1, the error of the EPA-corrected measurements is lower than the error of the raw data, i.e. the accuracy has improved, and the larger the improvement factor, the better the improvement. If the factor is below 1 the EPA formula introduces a larger error, i.e. a reduced accuracy.
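The metrics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code behind this article: R2 is taken as the squared Pearson correlation, and nRMSE as the RMSE divided by the mean reference concentration. The article does not state which normalisation was used, so that choice is an assumption, and the function names are hypothetical.

```python
import math


def r_squared(reference, sensor):
    """Squared Pearson correlation between reference and sensor readings."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_sen = sum(sensor) / n
    cov = sum((r - mean_ref) * (s - mean_sen) for r, s in zip(reference, sensor))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_sen = sum((s - mean_sen) ** 2 for s in sensor)
    return cov ** 2 / (var_ref * var_sen)


def nrmse_percent(reference, sensor):
    """RMSE divided by the mean reference value, in percent (assumed normalisation)."""
    n = len(reference)
    rmse = math.sqrt(sum((s - r) ** 2 for r, s in zip(reference, sensor)) / n)
    return 100.0 * rmse / (sum(reference) / n)


def improvement_factor(nrmse_raw, nrmse_corrected):
    """nRMSE (raw) / nRMSE (corrected); values above 1 mean the correction helped."""
    if nrmse_corrected <= 0:
        raise ValueError("corrected nRMSE must be positive")
    return nrmse_raw / nrmse_corrected


# nRMSE values (raw, corrected) in percent, copied from the table in this article.
for location, (raw, corrected) in {"anac": (89, 34), "bogo": (18, 31)}.items():
    print(location, round(improvement_factor(raw, corrected), 1))
```

For Anacortes the factor is well above 1 (the correction helps), while for Bogota it falls below 1 (the correction increases the error).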
Location | N | R2 (Raw) | R2 (Corrected) | nRMSE (%) (Raw) | nRMSE (%) (Corrected) | nRMSE improvement (factor) |
---|---|---|---|---|---|---|
anac | 528 | 0.85 | 0.89 | 89 | 34 | 2.6 |
bell | 899 | 0.775 | 0.755 | 67 | 44 | 1.5 |
bogo | 69 | 0.93 | 0.955 | 18 | 31 | 0.6 |
brus | 156 | 0.907 | 0.953 | 136 | 46 | 3.0 |
chen | 100 | 0.97 | 0.98 | 52 | 15 | 3.5 |
chia | 348 | 0.99 | 0.982 | 34 | 14 | 2.4 |
edm1 | 251 | 0.97 | 0.97 | 89 | 39 | 2.3 |
empa | 831 | 0.783 | 0.79 | 105 | 39 | 2.7 |
guat | 42 | 0.93 | 0.975 | 79 | 100 | 0.8 |
lon1 | 2191 | 0.798 | 0.845 | 56 | 32 | 1.8 |
lon2 | 1682 | 0.83 | 0.845 | 107 | 37 | 2.9 |
newy | 132 | 0.727 | 0.833 | 96 | 33 | 2.9 |
otta | 567 | 0.93 | 0.97 | 105 | 29 | 3.6 |
payn | 10 | 0.78 | 0.79 | 84 | 14 | 6.0 |
sydn | 122 | 0.7 | 0.68 | 41 | 46 | 0.9 |
ucam | 2358 | 0.875 | 0.91 | 77 | 26 | 3.0 |
vand | 2227 | 0.893 | 0.908 | 108 | 59 | 1.8 |
Average | | 0.861 | 0.883 | ~79 | ~38 | ~2.5 |
The mean absolute error (MAE, shown in the appendix) improved from 7 to 3 µg/m3 as a result of the algorithm (average across all sensors). In other words, after correction, the reference and AirGradient readings differed by 3 µg/m3 on average. Note that this metric is greatly influenced by episodes with high PM2.5 concentrations, so it is not necessarily representative of the error at low PM2.5 readings.
In the table above, we find that the average nRMSE improved from 79 % to 38 % upon correction with the EPA algorithm. This is similar to last year’s results (86 % to 34 %). Again, an average improvement factor of 2.5 was found.
While nRMSE improved in 9 of 9 locations last year, this time it improved (factor above 1) in only 14 of 17 locations. In three locations (Sydney, Guatemala City, and Bogota), applying the EPA algorithm worsened the monitors’ accuracy as measured by nRMSE on daily averages. This prompted us to investigate these locations in more detail.
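The summary numbers can be reproduced directly from the table above. The sketch below copies N and the two nRMSE columns from that table and computes the plain per-location averages. For contrast, it also computes averages weighted by the number of days N, to illustrate the earlier note that pooling all data points gives more weight to locations with more data. The N-weighted mean of per-location nRMSE is only an illustration of that weighting effect and is not identical to an nRMSE computed on the pooled data. Variable names are our own.

```python
# (location, N, nRMSE raw %, nRMSE corrected %), copied from the table above.
ROWS = [
    ("anac", 528, 89, 34), ("bell", 899, 67, 44), ("bogo", 69, 18, 31),
    ("brus", 156, 136, 46), ("chen", 100, 52, 15), ("chia", 348, 34, 14),
    ("edm1", 251, 89, 39), ("empa", 831, 105, 39), ("guat", 42, 79, 100),
    ("lon1", 2191, 56, 32), ("lon2", 1682, 107, 37), ("newy", 132, 96, 33),
    ("otta", 567, 105, 29), ("payn", 10, 84, 14), ("sydn", 122, 41, 46),
    ("ucam", 2358, 77, 26), ("vand", 2227, 108, 59),
]

total_days = sum(n for _, n, _, _ in ROWS)
factors = [raw / corr for _, _, raw, corr in ROWS]

# Unweighted averages: every location counts equally.
mean_raw = sum(raw for _, _, raw, _ in ROWS) / len(ROWS)
mean_corr = sum(corr for _, _, _, corr in ROWS) / len(ROWS)
mean_factor = sum(factors) / len(factors)
improved = sum(1 for f in factors if f > 1)

# N-weighted averages: locations with more days of data count more.
weighted_raw = sum(n * raw for _, n, raw, _ in ROWS) / total_days
weighted_corr = sum(n * corr for _, n, _, corr in ROWS) / total_days

print("days:", total_days, "improved locations:", improved, "of", len(ROWS))
print("unweighted nRMSE raw/corrected:", round(mean_raw), round(mean_corr))
print("mean improvement factor:", round(mean_factor, 1))
print("N-weighted nRMSE raw/corrected:", round(weighted_raw), round(weighted_corr))
```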
We have 122 days of data where an AirGradient OpenAir monitor is collocated in Sydney (Australia) with a Thermo 5014i reference instrument. The scatter plot below shows the 122 days of data (raw – no correction applied).
The reported PM2.5 concentrations in Sydney were mainly below 10 µg/m3.
The reference instrument is based on beta attenuation, which has been reported to be subject to increased noise when PM2.5 is low. We observed this behavior for all locations where we have beta attenuation references at such low concentrations (see scatter plots for Anacortes, Bellingham, and London Marylebone Rd in the appendix of this article). At these concentrations, a substantial fraction of the difference between reference and low-cost sensor is caused by the reference’s noise. If only looking at PM2.5 concentrations higher than 10 µg/m3, we find an improvement through the EPA algorithm in Sydney.
We have two sensors collocated with a reference in Guatemala City. Each sensor gave us 21 days of colocation data, where reference data was available. The plot below shows a timeline with hourly data of the 2 x 21 days. The reference is shown in red, while the two AirGradient monitors are shown in blue. We currently cannot verify what reference instrument is in place here, so this location should be interpreted with care.
At the beginning of the 21 days, the AirGradient monitors tended to report higher readings than the reference. The opposite is the case at the end of the 21 days, where all three instruments reported very high readings of >100 µg/m3. We find it hard to evaluate which of the two behaviors is more representative of this location, given that only 21 days are available. Note that both MAE and RMSE are more sensitive to high than to low PM2.5 concentrations, so the end of the 21 days dominates both metrics.
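A toy calculation illustrates why a few high-concentration days can dominate MAE and RMSE. The numbers below are hypothetical and are not the Guatemala City data: 18 days at 5 µg/m3 and 3 days at 100 µg/m3, with the sensor reading 30 % too high on every day. Although the relative error is identical throughout, almost all of the absolute and squared error comes from the three high days.

```python
import math

# Hypothetical reference series: 18 low days and 3 high days (µg/m3).
reference = [5.0] * 18 + [100.0] * 3
# Hypothetical sensor that overestimates by 30 % on every day.
sensor = [r * 1.3 for r in reference]

abs_errors = [abs(s - r) for r, s in zip(reference, sensor)]
sq_errors = [e ** 2 for e in abs_errors]

mae = sum(abs_errors) / len(abs_errors)
rmse = math.sqrt(sum(sq_errors) / len(sq_errors))

# Share of the total error contributed by the three high days.
high_share_mae = sum(abs_errors[-3:]) / sum(abs_errors)
high_share_sq = sum(sq_errors[-3:]) / sum(sq_errors)

print("MAE:", round(mae, 2), "RMSE:", round(rmse, 2))
print("share of absolute error from high days:", round(high_share_mae, 2))
print("share of squared error from high days:", round(high_share_sq, 2))
```

The squared-error share is even larger than the absolute-error share, which is why RMSE reacts more strongly to such episodes than MAE does.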
Two OpenAir monitors are collocated with a Thermo FH62C14-DHS in Bogota (Colombia). We have 38 and 31 days of collocated PM2.5 data, partly shown in the timeline below (hourly data). The reference is in red, and the raw readings of the two AirGradient monitors are shown in blue.
In Bogota, we already observe a good agreement between the raw AirGradient readings and the reference. The EPA algorithm decreases the PM readings and leads to an increased difference between AirGradient and reference. We have no verified explanation for that. It could be related to the reference instrument or the ambient aerosol. Note that Bogota is the only location where we test with a Thermo FH62C14-DHS, so we do not know how this reference normally compares to our monitors.
We have found a document from the US EPA where daily PM2.5 readings from a Thermo FH 62 C14 were compared with FRM results. FRM refers to ‘Federal Reference Method’ and can be considered to be the most accurate. According to that document, the Thermo instrument overestimates the PM2.5 concentrations by 60% in a chamber test with ammonium sulphate (Phase I, cf. Landis et al.). In other words, the Thermo instrument itself seems to overestimate, which needs to be considered.
Fun fact: We interpret the document above as the Thermo having overestimated PM2.5 on average by a factor of 1.6 for ammonium sulphate. We normally observe such an error for our own monitors, which would explain the good agreement with the Thermo.
It could be that the Plantower PMS5003 sensors are factory calibrated using a Thermo FH 62 C14 or another reference that overestimates ambient PM in a similar fashion. This could explain the systematic overestimation of PM readings from the Plantower sensors.
Today’s article repeated the analysis from last September to assess the performance of EPA’s correction algorithm for raw PM2.5 readings from AirGradient. This time, we have used more data from more locations.
The findings from last year could be confirmed: the algorithm significantly improved our PM2.5 sensor accuracy, at least in 14 of 17 outdoor locations. In three locations (Sydney, Bogota and Guatemala City), an increased average difference between AirGradient and reference was observed as a result of the algorithm. However, this may be attributed to limitations of the reference instruments (high noise in the reference signal in Sydney, unknown reference instrument in Guatemala, and overestimating reference instrument in Bogota).
Today’s analysis emphasizes three aspects:
The Appendix can be downloaded here.
The data this article is based on will be available for download soon.
A big thank you goes to everyone who supported this study by co-locating our sensors and providing reference data: