My weather station lies about the sun
A cheap Ecowitt station reports UV and solar radiation that read high and jump around. The repeater that forwards it to Weather Underground smooths and scales those channels first, and never lets a slow upstream API stall the station.
I run an Ecowitt WS2320 weather station. It POSTs its readings on a fixed cadence, and a small Go service I wrote sits in the middle: it translates Ecowitt's format to the Weather Underground one and forwards it, and it serves a local dashboard so I'm not dependent on anyone else's site to see my own backyard. Two things needed solving before the data was worth publishing.
The sun is not that bright
The temperature, humidity, pressure, and wind off this station are fine. The two channels that aren't are UV index and solar radiation. The cheap photodiode reads high, and it jitters: minute to minute it swings more than the actual sky does. Forward the raw numbers and you're telling Weather Underground it's a brighter day than it is, with noise on top.
Two corrections, in order. First a moving average over the last few samples to take the jitter out, then a flat scale factor to bring the magnitude back to what the sky was actually doing.
const movingAverageWindow = 5
smoothedUV := utils.SmoothValue(uvValue, &uvValues, &uvMutex)
smoothedSolarRadiation := utils.SmoothValue(solarRadiationValue, &solarValues, &solarMutex)
// The sensor reads high. 0.94 was tuned until the numbers matched reality.
correctedUV := math.Round(smoothedUV * 0.94)
correctedSolarRadiation := smoothedSolarRadiation * 0.94
SmoothValue is the boring half: keep the last five readings per channel, return their mean, drop the oldest as new ones arrive. The 0.94 is the part that only exists because I compared the station against reality and it was overreading by roughly six percent across the range. It's not a calibration curve, it's a single empirical constant, and that's the accurate description of it: a number I measured into place, not one I derived. UV gets rounded because Weather Underground wants an integer index; solar radiation stays fractional.
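For readers who want to see the moving average concretely, here is a minimal sketch of a helper with the same call shape as utils.SmoothValue. Only the call signature, the window of five, and the 0.94 factor come from the post; the function body, the scaleFactor name, the correctUV wrapper, and the sample readings are assumptions made for illustration. This version averages over however many samples it has until the window fills, which may differ from the real helper.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// Window size and scale factor as quoted in the post.
const movingAverageWindow = 5
const scaleFactor = 0.94

// Per-channel state, shaped like the uvValues/uvMutex pair in the post.
var (
	uvValues []float64
	uvMutex  sync.Mutex
)

// SmoothValue appends v to the window, trims the window to the newest
// movingAverageWindow samples, and returns the mean of what remains.
// Hypothetical body: the post shows only how the function is called.
func SmoothValue(v float64, window *[]float64, mu *sync.Mutex) float64 {
	mu.Lock()
	defer mu.Unlock()

	*window = append(*window, v)
	if len(*window) > movingAverageWindow {
		// Drop the oldest samples so only the newest five are kept.
		*window = (*window)[len(*window)-movingAverageWindow:]
	}

	sum := 0.0
	for _, x := range *window {
		sum += x
	}
	return sum / float64(len(*window))
}

// correctUV smooths a raw reading, applies the flat scale, and rounds last.
func correctUV(raw float64) float64 {
	return math.Round(SmoothValue(raw, &uvValues, &uvMutex) * scaleFactor)
}

func main() {
	// Made-up readings, for illustration only.
	for _, raw := range []float64{6, 8, 7, 9, 5, 10} {
		fmt.Printf("raw=%.0f corrected=%.0f\n", raw, correctUV(raw))
	}
}
```

The mutex matters because each POST from the station is handled on its own goroutine, so two requests could otherwise touch the same window at once.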
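The ordering matters, though not for the averaging itself: a mean and a constant multiplier commute, so smoothing then scaling gives the same number as scaling then smoothing. What matters is that the window stores raw readings, which keeps the constant a separate, final step, and that rounding comes last: round each sample before averaging and the mean loses resolution it could have used.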
Don't make the station wait
The second problem isn't the data, it's the plumbing. The station POSTs on its own schedule and expects a quick answer. If I forward to Weather Underground synchronously inside that request handler, then the station's POST is only as fast as Weather Underground's API on its worst day. A slow upstream, or a momentary stall, and the station is sitting there blocked on something that has nothing to do with it.
So the handler does only the cheap, local work: parse, smooth, correct, translate. Then it hands the finished payload to a buffered queue and returns. Workers drain the queue and deal with the slow remote POST on their own time.
const workerCount = 5
var jobQueue = make(chan url.Values, 100)
// ...inside the request handler, after correcting the values:
jobQueue <- wundergroundData // returns at once while the buffer has room; never waits on the WU request itself
// started once at boot:
for i := 0; i < workerCount; i++ {
	go worker()
}
func worker() {
	for job := range jobQueue {
		// the slow part: POST to Weather Underground, retry, log
	}
}
}
The buffer is a hundred deep, which is far more than the station will ever queue at its posting rate, so a Weather Underground hiccup gets absorbed instead of backing up into the station. As long as the buffer has room, the station gets its fast 200; the slow path lives behind the channel.
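One caveat about buffered channels in Go: a plain send blocks once the buffer is full, so a long enough outage could still reach the handler. A possible guard, sketched here under assumptions, is a non-blocking send that drops the reading instead of waiting. The enqueue and newQueue helpers are hypothetical names, not part of the repeater, and dropping on overflow is one policy among several.

```go
package main

import (
	"fmt"
	"net/url"
)

// newQueue builds a buffered queue with the same element type as the
// jobQueue in the post. Hypothetical helper, for illustration only.
func newQueue(depth int) chan url.Values {
	return make(chan url.Values, depth)
}

// enqueue tries to hand the payload to the workers. It reports false
// instead of blocking when the buffer is already full.
func enqueue(q chan<- url.Values, job url.Values) bool {
	select {
	case q <- job:
		return true
	default:
		// Buffer full: drop this reading rather than stall the caller.
		return false
	}
}

func main() {
	// A tiny queue with no workers attached, to force the full-buffer case.
	q := newQueue(2)
	for i := 0; i < 3; i++ {
		job := url.Values{"sample": {fmt.Sprint(i)}}
		fmt.Printf("sample %d queued: %v\n", i, enqueue(q, job))
	}
}
```

A dropped weather sample is usually cheap to lose, since the next one arrives on the station's regular cadence; a handler that logged the drop would at least make a long outage visible.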
Do the cheap work on the hot path, defer the slow
A request handler that calls a third-party API inline inherits that API's latency and its bad days. Split the work: whatever is local and fast (parsing, the smoothing, the scale) happens on the request; whatever is remote and slow goes on a queue with workers behind it. The caller you actually care about, here a weather station that just wants to be acknowledged, never pays for someone else's outage.
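The slow half of that split can be sketched as a worker that retries with a growing pause. This is an illustration under assumptions, not the repeater's actual code: sendFunc stands in for the real upload, flakySend simulates a failing upstream, and the three attempts and doubling pause are arbitrary choices.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/url"
	"sync"
	"time"
)

// sendFunc stands in for the real upload; the post does not show that code.
type sendFunc func(url.Values) error

var errUpstream = errors.New("upstream unavailable")

// flakySend returns a sendFunc that fails its first n calls, then succeeds.
// It exists only to exercise the retry path without touching the network.
func flakySend(n int) sendFunc {
	var mu sync.Mutex
	calls := 0
	return func(url.Values) error {
		mu.Lock()
		defer mu.Unlock()
		calls++
		if calls <= n {
			return errUpstream
		}
		return nil
	}
}

// sendWithRetry tries send up to attempts times (attempts >= 1), doubling
// the pause after each failure, and returns the last error if all fail.
func sendWithRetry(send sendFunc, job url.Values, attempts int, pause time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = send(job); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(pause)
			pause *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// worker drains jobs until the channel is closed, logging uploads that fail.
func worker(jobs <-chan url.Values, send sendFunc, wg *sync.WaitGroup) {
	defer wg.Done()
	for job := range jobs {
		if err := sendWithRetry(send, job, 3, 10*time.Millisecond); err != nil {
			log.Printf("upload failed: %v", err)
		}
	}
}

func main() {
	jobs := make(chan url.Values, 100)
	var wg sync.WaitGroup
	send := flakySend(2) // first two attempts fail, the third succeeds

	for i := 0; i < 2; i++ {
		wg.Add(1)
		go worker(jobs, send, &wg)
	}

	jobs <- url.Values{"sample": {"1"}} // made-up payload
	close(jobs)
	wg.Wait()
	fmt.Println("queue drained")
}
```

A real upload would also want a timeout on its HTTP client, so that a hung connection ties up one worker for a bounded time rather than indefinitely.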
The dashboard borrows, it doesn't hammer
The local dashboard shows more than the station knows: moon phase, sunrise and sunset, an RSS feed, recent history pulled back from Weather Underground. Every one of those is someone else's API with its own rate limits, so each is cached on a TTL that matches how fast it actually changes. Moon phase refreshes hourly, the RSS feed every fifteen minutes, sunrise and sunset once and then held until midnight. Nothing re-fetches on a page load. The dashboard reads from cache and the cache refills on its own clock.
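The "refills on its own clock" idea can be sketched as a cached value with a background refresher. Everything named here (the cached type, its fetch field, the run loop) is hypothetical; the post gives only the refresh intervals. The sketch keeps the previous value when a refresh fails, which is an assumption about the desired behavior, and it does not model the hold-until-midnight rule for sunrise and sunset.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cached holds one value. Readers never trigger a fetch; only the
// refresher does.
type cached struct {
	mu    sync.RWMutex
	value string
	fetch func() (string, error) // stands in for a third-party API call
}

// Get returns whatever was stored by the last successful refresh.
func (c *cached) Get() string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.value
}

// refresh calls fetch once and stores the result. On error the previous
// value is kept, so the dashboard keeps showing stale data.
func (c *cached) refresh() error {
	v, err := c.fetch()
	if err != nil {
		return err
	}
	c.mu.Lock()
	c.value = v
	c.mu.Unlock()
	return nil
}

// run fills the cache once, then refreshes on every tick until stop closes.
func (c *cached) run(every time.Duration, stop <-chan struct{}) {
	_ = c.refresh()
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			_ = c.refresh()
		case <-stop:
			return
		}
	}
}

func main() {
	moon := &cached{fetch: func() (string, error) {
		return "fetched at " + time.Now().Format("15:04:05.000"), nil
	}}
	stop := make(chan struct{})
	// The post refreshes moon phase hourly; a short interval keeps the demo quick.
	go moon.run(50*time.Millisecond, stop)

	time.Sleep(120 * time.Millisecond)
	for i := 0; i < 3; i++ {
		fmt.Println(moon.Get()) // page loads read the cache, nothing more
	}
	close(stop)
}
```

With this shape, the number of upstream calls depends only on the interval, never on how many times the dashboard is loaded.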
None of this is exotic. It's a translator with two measured corrections in front of it and a queue keeping a slow API off the critical path. But the result is a station whose published numbers match the sky, and a hub that stays responsive no matter what the services it leans on are doing.
The code is on GitHub.