vnc_worker: multi-client support, device dirty tracking, and docs#3233
Conversation
|
Thanks for your work on the VNC support very much! Could you please mention that MobaXterm works now? Before your changes, it wouldn't. Seems important as MobaXterm has got more chances to be trusted/allowed to use due to its easier to check provenance. |
You're welcome - it's just an unintended side effect of sticking to the standards. MobaXterm seems to have become the unofficial standard in the Windows sysadmin world and is widely adopted among sysadmins and IT folks working on Windows, the homelab community, corporate networks where Windows clients are the norm but Linux servers need managing, and universities / research institutions (where it's often the recommended tool for students). Given that, I think it's worth explicitly mentioning it as a compatible client. That said, the last word is of course Microsoft's |
|
@jstarks - bump. Can we merge please, then we should discuss how to handle keyboard mappings for RealVNC (or not) and special character pasting, as well as iron out issues or optimisations You might find. That PR is big enough now, I dont want to add on top of it - but the requested feature (multiclient, dirty region tracking) needed a big, atomic PR in order to test and implement everything properly - anything You might not like, we can address after that in hopefully much smaller PR´s. |
|
Apologies for the unsolicited advice in advance if it annoys you (especially its size), certainly not my intention! I'm just looking from the sidelines hoping to get MobaXTerm support in OpenVMM, don't mean to be presumptuous :) Merging That said, the amount of commits and the speed of producing the content in this PR is likely higher than the average speed of consuming/reviewing content by humans. So again, squashing and rebasing on the top |
No worries at all - different repos, different rules. Some projects I contribute to actively discourage squashing and rebasing, so I just defaulted to my usual habits. Happy to clean it up if it makes the reviewers' lives easier. |
Reworks the VNC worker to support multiple simultaneous clients on a single listening socket, adds IPv6 dual-stack binding, and lands a collection of input-path, encoding, and documentation fixes. Server - Per-client accept loop replaces the single-client server. --vnc-max-clients caps concurrent connections; --vnc-evict-oldest optionally drops the oldest client when a new connection would otherwise be refused. Client counting uses abort_senders (not the clients list) to avoid transient overshoot during eviction. - --vnc-listen accepts an IPv6 literal, a hostname, or "::" for dual-stack binding; --vnc-port remains a shortcut over the default 127.0.0.1 bind. IPv6 parsing and dual-stack error handling corrected. - VNC CLI options grouped into a VncCli struct under openvmm_entry. Framebuffer and encoding - 16x16-tile DirtyBitmap underpins the framebuffer scan; dirty tiles are merged into rectangles via a QEMU-style single-pass greedy algorithm (extend right, extend down while columns stay dirty, emit, clear, continue). Tests cover bump, vertical-gap, and checkerboard patterns. - Cursor-only update busy loop fixed. Rect count truncation, shift overflow, and stale region tracking corrected. Framebuffers with more than 8 bits per channel are rejected instead of silently truncating. - Dirty-channel close no longer freezes the update loop; tile_diff fallback picks up when the device-provided dirty channel goes away. Input - F2-F10 scancode off-by-one corrected. - Modifier keys released on both sides (left/right shift) for keysym paste and QEMU-style paste flows. Supporting changes - uidevices: video channel process extracted into a standalone function. - parking_lot::Mutex replaces std::sync::Mutex in test helpers per the workspace-wide disallowed_types lint. Documentation - New VNC reference pages under the Guide: architecture overview, keyboard handling, what-we-are-proud-of, updated graphical_console and management CLI pages. - Tested clients include MobaXterm. - CopyRect documented as not planned with rationale.
477afb9 to
a0395c9
Compare
|
@kromych - done. Squashed and rebased for a cleaner diff. Won't touch it again until merged.
|
|
whow, that works amazingly good. Youtube on 4 different clients, fullscreen, in parallel (RealVNC, TigerVNV, MobaXterm, noVNC) works just fine. no artefacts. |
@bitranox Qemu is a GPL-licensed project, and OpenVMM is not. So borrowed how exactly, was it a clean room/independent reimplementation or closer to a line-by-line translation? Please refer to the exact lines from which that was borrowed and where it went over here. Kindly disclose any other projects from which the code was borrowed referencing lines over there and over here. |
independent reimplementation of the algorithm, not a translation. I can't cite exact QEMU line correspondences because I didn't look at those lines during implementation. What's nonetheless similar by construction:
I cross-checked right now what's definitely different (expression level):
Other projects consulted: none during implementation. The TigerVNC / libvncserver descriptions are algorithmic summaries, not code I borrowed. Minor honesty correction I should make The doc comment at dirty_bitmap.rs:223 calls the algorithm "QEMU-style". That's informally fine, both are greedy scan-right-then-down-then-clear, but strictly speaking ours is stricter than QEMU's (no over-coalescing). Other projects No code was consulted from TigerVNC, libvncserver, UltraVNC, or any other VNC server during implementation. My earlier descriptions of their approaches were algorithmic summaries, not borrowings. Licensing posture QEMU is GPLv2; openvmm is MIT. Algorithms are not copyrightable, so algorithmic similarity is fine. Expressive similarity would be the concern; the evidence above - different semantics in the vertical check, different decomposition, different tokens, different control flow, different storage strategy supports this being a clean reimplementation rather than a derivation. |
@bitranox Could you resolve the apparent contradiction here, I might be confused: you didn't read the qemu code but you have just cited exact lines from the codebase, how's that possible? |
because I cross checked after Your valid concerns. I ment, I did not look at Qemu code during implementation. BTW - MobaXterm shares the same behaviour with RealVNC - only US Keyboard, no extended scancodes. |
@bitranox I'm very pleased that you are investing in supporting MobaXterm that is indeed very important to me. Please test it extensively at the limits of you capacity, consider reporting to me every hour on the progress. Thank you. One small clarification: who and when taught you the details of the qemu vnc support? You didn't read the code, and implemented the logic with impressive speed and precision Rust which you had never used before as your GitHub profile shows. |
can I ask You why that is so important to you ? but there is nothing really to report yet, it just works. The only open issue is the keyboard: specifically, how to type and paste non-ASCII characters into the guest (for pasting passwords and the like), ideally without requiring any guest-side software. That still needs some research and probably keyboard mappings in openvmm. I also see VNC not as a method to really work on such VM´s - its for quick administration or setup without any special guest software, but there are probably better ways and protocols like rustdesk, rdp, chrome remote desktop, AnyDesk, TeamViewer, you name it.
I have private and commercial organizations and projects as well. I also run my own Gitea Servers where my commercial code lives - But yes, Rust is relatively new to me so I take every help or advice I can. But I'm a seasoned - though not full-time - developer; I started with 6502 and Z80, which should give you a sense of my vintage. Ever started Turbopascal from Floppydisk ? I've done a comprehensive analysis of the algorithms used in other VNC server implementations with copilot and drafted a plan for how they can fit into the OpenVMM codebase. The biggest win came from jstark's hint about dirty rectangles from the Hyper-V device - not exactly my home turf, as I'm primarily a Linux person, but it opened up the right direction, saving a lot of cpu cycles and VRAM Memory Access. If You like my rust code - great and let me know. If there is smth not right, also let me know. Honest, ruthless roasting is no problem for Austrians (we are like Germans, only a bit more flexible). |
|
@kromych - do You know if MobaXterm supports ContinuousUpdates ? |
@bitranox It's important to me as otherwise I'd be devastated if you didn't. Very important for us Austrians not to be, right? New important data just in, we have to act decisively now:
|
for more - or off topic - discussions You might consider to contact me directly. I am easy to find. |
|
@jstarks - a comment on that would be nice, if we should keep it open / discard / change smth ? |
|
@kromych @smalis-msft - a comment on that would be nice, if I should keep it open / discard / change smth ? |
|
Apologies, we are just very busy. I'll poke John to take a look. |
874262c to
a0395c9
Compare
|
bump ACKd |
- .gitignore: remove .idea/ and private planning doc entries (moved to a
.gitignore one directory above this repo so they don't leak upstream).
- video/mod.rs: move `use video_core::DirtyRect` to the top with the
other imports (was mid-file after parse_packet).
- openvmm_entry: try SocketAddr first, fall back to bare IP. Reads more
naturally ("parse the full form first") even though the two formats
are disjoint so order is correctness-neutral.
- openvmm_entry/cli_args: add missing doc on `vnc: VncCli` field.
- vnc: minor fmt fixups picked up by cargo fmt.
The multi-client VNC PR added `dirt_send` to `SynthVideoHandle` and `dirty_recv` / `max_clients` / `evict_oldest` to `VncParameters`, but only updated the `openvmm` caller. The CI quick-check job builds the whole workspace and failed compiling `underhill_core` and `petri`. - underhill_core: add `dirt_send` to `UnderhillRemoteConsoleCfg`, create a `mesh::channel()` in `new_underhill_remote_console_cfg`, route the receiver into `VncParameters` (defaults: max_clients=16, evict_oldest=false, matching the openvmm CLI defaults). OpenHCL now gets dirty-rect forwarding from the synthetic video device to the VNC worker, same as openvmm. - petri: pass `dirt_send: None` in the integration-test VM builder; petri does not run a VNC server.
The crate-level `//!` doc in `workers/vnc_worker/vnc/src/lib.rs` linked to `Encoder`, `UpdateState`, and `ClientState` with intra-doc-link syntax. All three are private structs, so rustdoc under `-D warnings` rejects the links as "public documentation links to private item" — breaking the upstream "build and check docs" CI job. Replace those three links with plain backtick mentions. `Server` (which is `pub`) keeps its `[`Server`]` link.
| view: Arc<Mutex<ViewWrapper>>, | ||
| /// Cloneable input sender -- each client gets its own clone. | ||
| input_send: mesh::Sender<InputData>, | ||
| /// Dirty rectangles from the synthetic video device. None if no video |
There was a problem hiding this comment.
If no video device is configured, what are we even doing here? What would we be displaying?
There was a problem hiding this comment.
The None case is actually the PCAT / non-synth-video path: with --vnc --pcat (or any guest where the framebuffer is driven by VGA BIOS / UEFI GOP instead of the SynthVideo device), the VNC server still has a framebuffer to display, it just doesn't receive dirty-rect hints, and falls back to whole-framebuffer tile-diff scanning. So None means "no synth video device producing dirty hints", not "no video at all." I'll tighten the doc comment to say that.
| /// its own pending bitmap -- no shared state, no clearing issues. | ||
| dirty_senders: Vec<( | ||
| u64, | ||
| futures::channel::mpsc::Sender<Arc<Vec<video_core::DirtyRect>>>, |
There was a problem hiding this comment.
Why a futures channel instead of mesh?
There was a problem hiding this comment.
this needs bounded backpressure with non-blocking sends. mesh::Sender is unbounded and infallible, so a stuck client would let the broadcast channel grow without limit. The workspace clippy.toml already flags futures::channel::mpsc::channel with the reason "use mesh or async-channel"; the right choice here
is async-channel (bounded), not mesh. I've switched the per-client channel to async_channel::bounded(4) and dropped the #[expect(clippy::disallowed_methods)]. The behavior is the same: when try_send returns Full, the coordinator sets missed_dirty and the client falls back to a full refresh on the next pass.
| /// Per-client dirty rect senders. The coordinator broadcasts device rects | ||
| /// to all clients via these channels. Each client accumulates rects in | ||
| /// its own pending bitmap -- no shared state, no clearing issues. | ||
| dirty_senders: Vec<( |
There was a problem hiding this comment.
Make this tuple a proper struct with field names.
There was a problem hiding this comment.
Agreed, replaced with a named struct. Updated the three destructure sites (retain on eviction, retain on disconnect, and the broadcast loop) to use field access.
| @@ -38,7 +75,7 @@ pub enum Error { | |||
| #[error("resolution changed but client does not support DesktopSize")] | |||
| ResizeUnsupported, | |||
| #[error("zlib compression failed")] | |||
| ZlibCompression, | |||
| ZlibCompression(#[from] flate2::CompressError), | |||
There was a problem hiding this comment.
nit: use #[source] and explicit .map_err calls instead of from.
There was a problem hiding this comment.
Done. Switched ZlibCompression(#[from] ...) to #[source] and changed the single propagation site (Encoder::append_zlib) to .map_err(Error::ZlibCompression)?.
Io(#[from] std::io::Error) is still in the same enum i can convert it the same way if you want; left it for now since the comment was specifically on ZlibCompression.
If you'd prefer the explicit style for the whole Error enum, I'll do Io in the same commit.
| /// (need sending to a VNC client). At 1920x1080, this is 120x68 = 8160 tiles, | ||
| /// fitting in ~1KB of bitmap data. | ||
| pub struct DirtyBitmap { | ||
| bits: Vec<u64>, |
There was a problem hiding this comment.
Would using a BitVec make things cleaner here?
There was a problem hiding this comment.
Agreed, it would. bitvec is already a workspace dep (and your fork at that), so this is canonical. I'll refactor mark_rect's mask/shift block becomes a range fill(true), the custom BitIter collapses to iter_ones(), mask_trailing_bits disappears since the BitVec is sized exactly. Existing tests should keep passing, and I'll re-verify the hot path (merge_into) doesn't regress.
Five review threads on microsoft#3233, bundled into one commit. * Tighten Option<...> doc comments on the dirty-rect channel fields (VncParameters.dirty_recv, SynthVideoHandle.dirt_send, UnderhillRemoteConsoleCfg.dirt_send). The original "None when no video device" wording was misleading. None actually means "no synth video device (for example, --pcat or VGA BIOS guests); the VNC server still has a framebuffer and falls back to whole-framebuffer tile-diff scanning." * Replace `Vec<(u64, Sender, Arc<AtomicBool>)>` in MultiClientServer with a named struct, ClientDirtySender { id, sender, missed_dirty }. Updated the three destructure sites. * Switch the per-client dirty-rect broadcast channel from `futures::channel::mpsc::channel(4)` to `async_channel::bounded(4)`. This honors the workspace clippy lint ("use mesh or async-channel"), drops the `#[expect(clippy::disallowed_methods)]` annotation, and preserves the bounded-with-try_send plus missed_dirty fallback. Same switch in the matching DirtyRectReceiver type in the vnc crate, plus the two unit tests that construct senders directly. * Error::ZlibCompression: replace `#[from]` with `#[source]` and use an explicit `.map_err(Error::ZlibCompression)?` at the single propagation site (Encoder::append_zlib). Makes the conversion grep-able rather than hiding it behind `?` propagation. * Replace the hand-rolled `Vec<u64>` dirty bitmap with `BitVec<u64, Lsb0>` (bitvec is already a workspace dep). The BitIter adapter and mask_trailing_bits helper disappear, mark_rect's mask/shift block collapses to `bits[first..=last].fill(true)`, and or_from / take_from collapse to a `|=`. Net -81 lines in dirty_bitmap.rs, all 36 unit tests pass unchanged. Verified with `cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets`, `cargo doc --workspace --no-deps`, and `cargo test -p vnc -p vnc_worker` (46/46 pass).
The VNC server still needs several rounds of work before it is production-worthy: authentication (VNC password / VeNCrypt), additional encodings (Tight, ZRLE, CopyRect), WebSocket transport, LED state propagation, continuous updates, and test coverage for the multi-client coordinator's eviction / abort paths. Each of those will land as its own PR. Keeping all of that in a single 2296-line lib.rs would mean every follow-up diff scrolls past concerns it does not touch, and every review covers more surface than the change needs. Split the file along concern boundaries so each future PR is a small diff in a focused module. * error.rs: pub Error enum * traits.rs: pub Framebuffer, pub Input * pixel.rs: pub(crate) PixelConversion + pixel-format tests * encoder.rs: pub(crate) Encoder, build_cursor * update_state.rs: pub(crate) UpdateState, DirtyResult, DirtySource * server.rs: pub Server, pub Updater, pub(crate) ClientState, the RFB protocol state machine, and the e2e wire tests Public API is unchanged: Error, Server, Updater, Framebuffer, Input, DirtyRectReceiver, DirtyBitmap, Rect remain importable as vnc::*. No behavior or doc-comment edits; items used across module boundaries were raised from private to pub(crate). Tests stayed in the module they exercise; the e2e tests that speak the RFB wire protocol via the private rfb module stayed in server.rs. lib.rs: 2296 -> 62 lines. Verified with `cargo fmt --all -- --check`, `cargo clippy -p vnc -p vnc_worker --all-targets`, `cargo doc -p vnc -p vnc_worker --no-deps`, and `cargo test -p vnc -p vnc_worker` (46/46 pass).
The Updater handle's nudge channel (capacity 1, used to wake the
server's main loop) still used `futures::channel::mpsc::channel(1)`
with `#[expect(clippy::disallowed_methods)] // TODO`. Switch it to
`async_channel::bounded(1)`, matching the per-client dirty-rect
broadcast channel and the workspace clippy lint ("use mesh or
async-channel").
* Update `update_recv` / `update_send` field types and the
`Updater(...)` tuple from `mpsc::{Receiver,Sender}<()>` to
`async_channel::{Receiver,Sender}<()>`.
* Drop the `#[expect(clippy::disallowed_methods)] // TODO` line.
* Receive site: `update_recv.select_next_some()` becomes
`Box::pin(update_recv.recv()).fuse()` since async_channel::Receiver
is not Unpin and futures::select! requires FusedFuture. A Closed
error is unreachable in practice (Server owns its own Sender for
its lifetime); the loop body checks pending_dirty regardless.
* Remove now-unused imports `futures::channel::mpsc` and
`futures::StreamExt`.
No more `#[expect(clippy::disallowed_methods)]` in the vnc crate.
Verified with `cargo fmt --all -- --check`, `cargo clippy -p vnc -p
vnc_worker --all-targets`, `cargo doc -p vnc -p vnc_worker --no-deps`,
and `cargo test -p vnc -p vnc_worker` (46/46 pass).
a33c12e to
95306b6
Compare
…ng_docs Both crates carried a crate-level `#![expect(missing_docs)]` that masked real gaps on the public surface. Remove the attribute and add the doc comments the lint was warning about. * `vnc_worker_defs`: add a crate-level `//!` doc explaining the worker boundary; add docs to `VNC_WORKER_TCP` (TCP listener, used by openvmm) and `VNC_WORKER_VMSOCKET` (vmsocket listener, used by openhcl's paravisor). * `vnc::Rect`: document the four pub fields (x, y, w, h). * `vnc::Error`: brief doc on the enum and on each of the eight variants describing when the error fires. * `vnc::traits::Framebuffer` and `vnc::traits::Input`: docs on the four trait methods (resolution, read_line, key, mouse) covering the inputs, outputs, and pixel/scancode formats. * `vnc::server::Updater` (struct + `update` method) and `vnc::server:: Server` (`new`, `updater`, `done`): docs covering the lifecycle and the dirty-rect/fallback-tile-diff distinction in `Server::new`. No behavior change. Pre-push pipeline green: fmt --all --check, clippy --workspace --all-targets, doc --workspace --no-deps, and test -p vnc -p vnc_worker (46/46 pass).
|
@smalis-msft : Thanks again for taking the time on this one. I know it grew larger than ideal, and I appreciate the careful review. Would you consider merging at this point? My reasoning for asking now rather than folding more in: I have a queue of follow-up work that I'd much rather send as small, focused PRs on top of a merged base than keep stacking onto this branch. I deliberately left that (and many other items) out of this PR to keep it as a clean base. Merging this lets the follow-ups stay smaller and self-contained, which should make them much easier to review than this one was. No rush, and happy to address anything else you'd like changed first. yours sincerely |
|
Thanks so much for the work here, and for your patience! My apologies again for taking so long, things are just very hectic here right now, but your contributions are absolutely welcomed and appreciated. |
|
thank You very much, its a pleasure to work with you guys. |
|
Thanks for your continued contributions here. One thing I'd like to see in a subsequent change (either by you or by us, eventually) is an improvement to the dirty rectangle scheme. There are two issues:
I had been thinking that perhaps dirty rectangles should be serialized into a circular buffer allocated as a memfd/pagefile-backed section and shared across processes with mesh. And we'd have some scheme for broadcasting the latest offset into the infinite-length virtual buffer. If a client (e.g. the VNC server) falls behind (i.e. it misses more dirty rects than the length of the buffer), it refreshes the whole screen to resynchronize. |
|
I will look into that. What the standalone GUI does better then the pretty good optimized VNC Clients or any other remote tool thats out there ? I dont see the usecase, but maybe thats only me ;-) |
Summary
This started as "a little VNC patch" and somehow turned into 2980 lines + tests. I blame
scope creep@jstarks and the fact that VNC is deceptively simple until you actually try to do it right. You requested Device-driven dirty region tracking and multi-client, and here we are.Core changes:
--vnc-max-clientsvnc-max-clients, new clients will be rejected, or the oldest evicted with option--vnc-evict-oldesthyperv_drmguests)socket.write_all())DirtyBitmapwith 16x16 tile tracking (~1KB at 1080p)PixelConversionwith zero-copy fast path for native 32bppEncoderabstraction separating pixel conversion from wire encodingDocumentation added:
The remaining gaps: VNC authentication, WebSocket transport, Tight/ZRLE encoding (for WAN performance), and continuous updates are not scope of this contribution.
Correct keyboard handling across all clients (RealVNC sends US scancodes regardless of client layout, TigerVNC intercepts
Ctrl+Alt) is a known limitation and out of scope for this contribution.
Test plan
Tested:
Remaining:
I'd be genuinely impressed if this gets merged. Not as a dare -- just acknowledging that reviewing 2980 lines of VNC protocol code is nobody's idea of a fun afternoon. I appreciate anyone who makes it to the bottom of the diff.