Remio is built by a small team in public. This is the working log: capture paths we tore apart, encoder settings that surprised us, latency budgets we busted and rebuilt. Newer entries are at the top. Older entries stay as they were written, even when we have since changed our minds.
Last updated · 2026-05-19 · 7 entries
Research notes
The engineering log, in chronological order.
Each entry below is dated, statused, and self-contained. Status reads as shipping when the change is live in the latest build, experiment when we are still measuring, and archived when we have moved on.
Shipping
Virtual display now follows the client window.
For weeks the macOS host kept its virtual display fixed at the client screen resolution, which meant the host cursor and the client cursor disagreed about size the moment either changed. We rewrote the virtual-display manager so that the host workspace follows the client window canvas in real time. Cursor sizes now match across the boundary at any window size, the way Apple Screen Sharing has always behaved.
Why we capped the virtual-display envelope at 3840 pixels.
The internal macOS framework that builds display modes for virtual displays has a quietly destructive behaviour: when given a max dimension of 4096 or above, it invents a phantom 2× Retina mode at the maximum and steals the active flag. Requested pixel count doubles. We confirmed this empirically over four nights of probe-and-log. Three thousand eight hundred and forty fits every Retina MacBook fullscreen plus a 4K external, and the framework stops synthesising. We are still watching the long tail.
empirical breakage at ≥ 4096 · cap = 3840 · covers MacBook Pro 16″ Retina (3456 × 2234) and 4K external (3840 × 2160)
Shipping
Reverting to the previous streaming engine build after a one-week experiment.
We rebuilt our streaming engine against a newer branch hoping the updated pacing logic would unlock a few milliseconds on the local-network path. Instead the receive buffer regressed on every remote session we measured: median end-to-end climbed by 18 ms and the variance widened. We reverted to the stable build across iOS, macOS, Android, and Windows. The new-branch patches are archived in case the upstream behaviour changes.
new build median latency · 64 ms remote · previous build median · 46 ms remote · tuning flags unchanged
Shipping
Adaptive local-network versus remote bitrate detection from the first round-trip sample.
The host now waits for the first round-trip-time measurement after the connection comes up, then picks one of two bitrate envelopes. Below 15 ms the connection is treated as local-network direct: start 8 Mbps, minimum 8 Mbps, maximum 30 Mbps. Above 15 ms it is treated as relayed or distant: start 2 Mbps, minimum 300 Kbps, maximum 15 Mbps. The threshold sits between same-network WiFi and any plausible cellular or cross-region relay, so false positives have been rare in beta telemetry.
local direct · 1 – 5 ms · WiFi same-net · 5 – 10 ms · regional relay · 30 – 80 ms · 4G · 40 – 100 ms
Shipping
Hardware H.265 on AMD as the Windows-host primary encoder.
After a long bake-off against competing encoders, hardware H.265 on AMD won every Windows-host benchmark we cared about: lower median encode time, lower keyframe spike, and the ultra-low-latency constant-bitrate preset behaves predictably under buffer pressure. H.264 stays as the fallback when the GPU disagrees. B-frames are off. Low-delay flag is on. The encoder always emits exactly one frame per capture tick.
A remote desktop is not a video call. Every frame is the latest screen state. An old frame is, by definition, a lie. We rebuilt the receive path so that there is no playout buffer: a frame arrives, it decodes, it renders, in that order, on the same frame interval. Lost frames request a fresh keyframe rather than a retransmit of the old one. We accept the occasional brief freeze. We do not accept the steady millisecond tax of a queue.
forced playout delay · min 0 / max 0 · minimum playout delay · 0 · keyframe request on loss · no packet retransmits
Archived
Latency budget, written down for the first time.
We sat in a room and worked backwards from the number we wanted users to feel — under twenty milliseconds, glass-to-glass, on a typical local-network session. Capture, encode, pace, send, receive buffer, decode, render. We wrote each one down and gave it a millisecond budget we could measure against. Most of the entries in this log come from defending those numbers in production traffic.
capture · 1 ms · encode · 3 ms · pacing burst · ≤ 10 ms · receive buffer · 0 ms · decode · 2 ms · render · 1 ms
Signal envelope
What we measure, session after session.
These four numbers travel in every release note because they describe what the connection actually feels like — not what it claims on a spec sheet.
Stream resolution
4Knative pixel envelope, 1× or 2× Retina chosen per window
Frame rate
60 fpsfixed cadence, B-frames disabled, one capture tick per frame
Glass-to-glass latency, local-network
< 5 msmeasured median on a direct device-to-device connection
Transport encryption
End-to-endbank-grade encryption with a fresh session key every time
Follow the build, not the brochure.
The lab updates when something lands or something breaks. If that sounds more useful than a quarterly newsletter, the download link is one click away — and so is the security whitepaper if you want the long version.