XMPP: Support XEP-0461: Message Replies#134
XMPP: Support XEP-0461: Message Replies#134sh4sh wants to merge 14 commits intomatterbridge-org:masterfrom
Conversation
|
Neat! I will try to find some time to test this by the end of this week (need to fix my test server first). |
|
I got this building and looked at it some more and ended up adding two more features:
So now this covers all four directions:
|
bridge/xmpp/xmpp.go
Outdated
| if !b.GetBool("keepquotedreply") { | ||
| for strings.HasPrefix(body, "> ") { | ||
| lineIdx := strings.IndexRune(body, '\n') | ||
| if lineIdx == -1 { | ||
| body = "" | ||
| } else { | ||
| body = body[(lineIdx + 1):] | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Copied from the matrix bridge. Seems janky.
XMPP has a roundabout answer for this. When quoting a message, it also notes the slice that is the quote, e.g. body[0:19]:
<message id='6199b314-bda7-4b2f-a117-9844bf06a90b' lang='en' type='groupchat' to='nick@kousu.ca/dino.ed452c99' from='test@example.im/kousu'>
<body>
> <kousu_> Discord
XMPP!!
</body>
<reply id='019c6e2c-ef6a-7b98-adec-bfeebdb9ee35' to='test@example.im/kousu' xmlns='urn:xmpp:reply:0' />
<fallback for='urn:xmpp:reply:0' xmlns='urn:xmpp:fallback:0'>
<body start='0' end='19' />
</fallback>
<stanza-id by='test@example.im' xmlns='urn:xmpp:sid:0' id='019c6e2d-1774-7791-8965-c578c132fe9d' />
</message>If <fallback> could be bridged that would be better than stripping the quote blindly, because whether or not the destination should see the quote depends on the destination, not the source: e.g. IRC doesn't support replies, so it should include the quote; Discord does support replies, so it should not see the redundant quote. But I don't know how to do that, that seems complicated.
There was a problem hiding this comment.
IRC doesn't support replies, so it should include the quote
I'll let @poVoq and others express opinions but i don't think this is true. Bridging other network features transparently is usually very poorly received by IRC users. That was one of the main complaints against the matrix<->libera bridge.
When you format a reply with a manual quotation, you will select a relevant part of the text to reply to, not quote entire lines or strip the message meaninglessly. For example, i user A says: hello i've started implementing XEP-0461 for matterbridge and i'm looking for feedback. What's the best way to display replies on IRC side?, a manual reply would probably be something like:
B: > best way to display replies on IRC
You should consider FOOBAR
C: RE IRC: this should be configurable by the room moderators
Any form of automatic reply formatting produces unbearable results such as:
B: Hey let's have a meeting about this (re @A hello i've started implementing...)
This example is taken from a real-world situation with matterbridge doing telegram-xmpp bridging and it's completely unreadable to me because it does not provide the actual context of the origin message (especially since in this example the message being replied to was on another Telegram channel).
Differing opinions welcome :)
There was a problem hiding this comment.
Very few popular IRC servers support the IRCv3 message tag feature that is required to make replies on IRC. There is some very recent news that Libera.chat and Hackint might start supporting this soon, but even then there are only 2-3 rather niche IRC clients that can display replies.
I agree that workarounds in the message body are typically poorly received on IRC networks, especially if they result in lengthy messages that get split because of no multiline support (which I believe is lacking from the gIRC library that Matterbridge uses).
I think if we would support IRCv3 replies via message tags in the future, then clients that support them would display them and clients that don't would silently drop the reply reference, so that would probably satisfy both types of IRC users. But since gIRC also doesn't seem to support message tags, that is probably still a bit away.
There was a problem hiding this comment.
You're both very correct. IRC was a bad example.
Are there any protocols that do multiline messages but not replies? I wouldn't want people on those protocols confused.
If this parsed <fallback> too it could strip the quote cleanly before passing to the gateway. We might do that anyway instead of using the string parsing.
There was a problem hiding this comment.
Are there any protocols that do multiline messages but not replies? I wouldn't want people on those protocols confused.
IRC and Mumble come to mind. Probably many more if we actually dig.
Personal opinion: the reply formatting should be handled by the destination bridge depending on "local" customs. In our XMPP case, i believe the proper way would be to strip the fallback text before sending it to other bridges, but using the built-in reply mechanism (such as you're doing with the stanza id mapping). Does that sound good to you?
There was a problem hiding this comment.
I think maybe I'm misunderstanding you. The loop here is stripping before sending, it's just not doing it particularly safely. But after these messages we had this discussion where I think we decided that <fallback> should be parsed and used to strip the quote on incoming messages, and when sending outgoing XMPP there should be a AddMucReply(to StanzaID, fallback string). I just don't understand if you think the built in reply mechanism can look up reply text; it's just ParentID and gw.Messages which are also IDs. nothing in matterbridge stores text long-term as far as I can tell.
By the way I poked around the public XMPP chats on jabber.network and https://xmpp.org/community/chat/ and I didn't see you, but maybe I just missed you! My xmpp DMs are the email on my commits if you want to reach me, and I've joined matterbridge for somewhere reliable to find me 🌟 🐋 . @sh4sh is in there too 🐢
There was a problem hiding this comment.
nothing in matterbridge stores text long-term as far as I can tell.
You are correct, and i think it's part of the problem!
I just don't understand if you think
Sorry, my wording was really bad, and we're having parallel conversations on different issues. confusion emerges. I think we now have the same understanding: a bridge receiving a message on its native protocol should do its best to separate reply text from quote text, and when sending a message should decide whether to add the original quote, and somewhere in the middle we need to rework the matterbridge internals to store message contents in addition to related IDs. Is that correct?
There was a problem hiding this comment.
Yes i agree with all that!
What does that mean for this PR?
It would make sense to implement <fallback> in xmppo/go-xmpp#226 and replace this loop with it before merging. If so, I propose we'll have to add a new config.Message.ParentText and fill it in in this PR. Then it would be good enough to merge.
Then i propose we would:
- In the short term, adapt all the other bridges to read and write
ParentText. (e.g. adjust Support Discord Reply (fix #43) #124 to write intoParentTextinstead of preparing a quote) - In the medium term, add a local message cache to gateway.go, and replace
ParentTextwith a helper methodParentText(), so that there's no need for blocking calls to look it up like in Support Discord Reply (fix #43) #124 .
There was a problem hiding this comment.
Added parentText to config.Message in de23394 and took out the keepquotedreply thing
I think in xmppo/go-xmpp#226 we wanted to give the go-xmpp maintainers some time to review, it's only been 2 days since our last replies so perhaps we give them a few before implementing fallback?
There was a problem hiding this comment.
I'll read the commits in here again and try start a review. Yes, let's give upstream some time and keep experimenting here. No pressure on them :)
bridge/xmpp/xmpp.go
Outdated
| xc *xmpp.Client | ||
| xmppMap map[string]string | ||
| connected bool | ||
| stanzaIDs *lru.Cache |
There was a problem hiding this comment.
This maps StanzaID -> matterbridge ID (which is also the ID used on the <message> element).
This name could use some work. "stanzaIDs" makes it sound like the contents are StanzaIDs, but actually it's the reverse, the contents are IDs and the keys are StanzaIDs.
| // Generate a dummy ID because to avoid collision with other internal messages | ||
| // However this does not provide proper Edits/Replies integration on XMPP side. | ||
| msgID := xid.New().String() | ||
| return msgID, nil |
There was a problem hiding this comment.
Currently matterbridge insists on knowing a message ID right after bridge.Send. I believe this may be a wrong architecture designed around synchronous operations, but we really want to make matterbridge asynchronous by nature.
I think your interpretation of the codebase is correct and "should work". I'm ok to merge this (after more review etc), but i would also appreciate if we took some time in the coming month to document the reply system entirely and make it fully asynchronous.
bridge/xmpp/xmpp.go
Outdated
| xc *xmpp.Client | ||
| xmppMap map[string]string | ||
| connected bool | ||
| stanzaIDs *lru.Cache |
There was a problem hiding this comment.
Can we specify the type here so that we don't have to downcast later? I see in lru README they do something like this:
lru.New[int, any](128)I'm wondering if we could declare it like (i don't know about the actual syntax):
*lru.Cache[StanzaID, MatterbridgeID](this would remove the type assert linter errors further down)
There was a problem hiding this comment.
that's super, I didn't realize it was a generic type. That's clearly an improvement.
| body := v.Text | ||
| // Capture quoted lines into parentText so destination bridges can decide | ||
| // how they should be displayed. | ||
| for strings.HasPrefix(body, "> ") { |
There was a problem hiding this comment.
My understanding of the spec is that > as a leading quote character is unspecified. So first thing, the stripping of the fallback should be done upstream in go-xmpp in a dedicated method. Second thing, if we actually keep the original message cache, we don't need to rely on that fallback text at all, I believe and can use that cache as source of truth for what quote to pass around to other bridges.
There was a problem hiding this comment.
Actually no, it's very specified! In XEP-0393. Still we probably don't need to parse it at all in our case (message cache).
There was a problem hiding this comment.
After some discussion on jdev@muc.xmpp.org, archived here, it appears there's really no guarantee the fallback has to contain a XEP-0393 quote and so the fallback should not be edited at all.
That's ok because we will start storing the message contents on matterbridge side so if i understand correctly we should just drop the fallback entirely, right?
There was a problem hiding this comment.
Yes, caching the full log simplifies this greatly :)
Would you be up for taking a look at this @sh4sh ? You'll have to add to the go-xmpp patch again.
…bbed from prior art on kousu's fork, this is still buggy tho
Previously, the xmpp bridge tried to be helpful by storing the XEP-0359 StanzaID as the matterbridge ID; it's logical but subtly broke XEP-0461 replies when actually implemented: messages _received_ from XMPP would have a StanzaID and could be looked up when any other bridge replied to them, but messages _sent_ from other bridges would not know their StanzaID and an attempt to reply to them would just get lost. It is simpler to stick with tracking the internal mapping between messages using the internally generated message IDs, and instead keep StanzaIDs in a cache private to xmpp. Plus, the old code that assumed they existed without checking, and that is wrong; StanzaIDs are an optional XEP, just a very common one. If this code ever ran into a bare-bones server without them it would have completely lost track of all XMPP messages.
I think this is more understandable? I was hoping to maybe generate the xmpp.clientReply directly but that is a private struct
…ply, now public) This was redundant. We might as well just share the struct between the two modules.
This adds 'xmpp.KeepQuotedReply' config option that needs to be documented. I took the code from the matrix bridge; it's buggy if there are multiple. I would prefer not to add that option at all; the destination bridge should be able to figure out if the quote should be stripped, based on whether it supports replies on its side. This also adds a second lru.Cache. I feel like maybe there's a way to avoid that, but maybe there's not. I would like some feedback on that.
…formatting replies/quotes
<reply> is optional, so it should be optional in our datamodel too.
I think .ID is a better model of the underlying XMPP data structure but go-xmpp already defined OriginID and in practice ID == OriginID so using it is a less invasive change.
| if v.StanzaID.ID != "" { | ||
| // Here the stanza-id has been set by the server and can be used to provide replies | ||
| // as explained in XEP-0461 https://xmpp.org/extensions/xep-0461.html#business-id | ||
| b.stanzaIDs.Add(v.StanzaID.ID, v.ID) | ||
| b.replyHeaders.Add(v.ID, xmpp.Reply{ID: v.StanzaID.ID, To: v.Remote}) | ||
| } |
There was a problem hiding this comment.
I don't want to be noisy in another project's repo so I'm following-up on this here, although the issue crosses both projects.
Why ID and not OriginID
@kousu: origin-id is not set by all clients (Dino, for example)
@selfhoster1312: I don't think this is the case. dino definitely sets origin-id (there's even old issues from 2018 about that feature on their github). And if a client doesn't set origin-id, it just cannot support message replies. message
idfield is only guaranteed unique on a C2S level… for a message federated with a third party (private message or MUC) onlyorigin-idshould be used.
Well I'm not sure about 2018 but Dino's not sending origin-id in 2026:
Dino log
XMPP OUT [test@kousu.ca stream:0x55ef0733e550 thread:0x55ef0656e570 2026-03-06T11:34:59-0500]
<message id='ce3f4b97-aab1-4908-a1fb-3bf3a55ab0df' to='bridge-test@example.im' type='groupchat'>
<body>
XMPP Original
</body>
</message>
XMPP OUT [test@kousu.ca stream:0x55ef0733e550 thread:0x55ef0656e570 2026-03-06T11:34:59-0500]
<r xmlns='urn:xmpp:sm:3' />
XMPP IN [test@kousu.ca stream:0x55ef0733e550 thread:0x55ef0656e570 2026-03-06T11:35:00-0500]
<a h='113' xmlns='urn:xmpp:sm:3' />
XMPP IN [test@kousu.ca stream:0x55ef0733e550 thread:0x55ef0656e570 2026-03-06T11:35:00-0500]
<message type='groupchat' from='bridge-test@example.im/test' lang='en' to='test@kousu.ca/dino.a5466526' id='ce3f4b97-aab1-4908-a1fb-3bf3a55ab0df'>
<body>
XMPP Original
</body>
<occupant-id id='r6pg036dNqOJs9TanMiowwrEGgZP4nr8ivU+W7pE0Og=' xmlns='urn:xmpp:occupant-id:0' />
<stanza-id by='bridge-test@example.im' id='2026-03-06-e30b84344d0ead48' xmlns='urn:xmpp:sid:0' />
</message>
I made a branch that uses OriginID. Maybe you could take some time to experiment with it and see why it doesn't work:
cd $(mktemp -d)
git clone --depth 1 -b xmpp-reply-origin-id https://github.com/kousu/matterbridge
cd matterbridge
go build
./matterbridge -version
With that in place these are the logs:
matterbridge log
[0020] INFO xmpp: RECV <message type='groupchat' from='bridge-test@example.im/test' xml:lang='en' to='matterbridge@kousu.ca/_AYBJoz1aVnZ' id='ce3f4b97-aab1-4908-a1fb-3bf3a55ab0df'><body>XMPP Original</body><occupant-id id='r6pg036dNqOJs9TanMiowwrEGgZP4nr8ivU+W7pE0Og=' xmlns='urn:xmpp:occupant-id:0'/><stanza-id by='bridge-test@example.im' id='2026-03-06-e30b84344d0ead48' xmlns='urn:xmpp:sid:0'/></message>
[0020] DEBUG xmpp: [handleXMPP:bridge/xmpp/xmpp.go:323] == Receiving xmpp.Chat{ID:"ce3f4b97-aab1-4908-a1fb-3bf3a55ab0df", Remote:"bridge-test@example.im/test", Type:"groupchat", Text:"XMPP Original", Subject:"", Thread:"", Oob:xmpp.Oob{XMLName:xml.Name{Space:"", Local:""}, Url:"", Desc:""}, Ooburl:"", Oobdesc:"", Lang:"en", StanzaID:xmpp.StanzaID{XMLName:xml.Name{Space:"urn:xmpp:sid:0", Local:"stanza-id"}, Text:"", Xmlns:"urn:xmpp:sid:0", ID:"2026-03-06-e30b84344d0ead48", By:"bridge-test@example.im"}, OriginID:"", Reply:(*xmpp.Reply)(nil), Roster:xmpp.Roster(nil), Other:[]string{""}, OtherElem:[]xmpp.XMLElement{xmpp.XMLElement{XMLName:xml.Name{Space:"urn:xmpp:occupant-id:0", Local:"occupant-id"}, Attr:[]xml.Attr{xml.Attr{Name:xml.Name{Space:"", Local:"id"}, Value:"r6pg036dNqOJs9TanMiowwrEGgZP4nr8ivU+W7pE0Og="}, xml.Attr{Name:xml.Name{Space:"", Local:"xmlns"}, Value:"urn:xmpp:occupant-id:0"}}, InnerXML:""}}, Stamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC)}
[0020] DEBUG xmpp: [handleXMPP:bridge/xmpp/xmpp.go:331] Caching StanzaID: 2026-03-06-e30b84344d0ead48 -> -> {{ } 2026-03-06-e30b84344d0ead48 bridge-test@example.im/test}
The
matterbridge/bridge/xmpp/xmpp.go
Lines 329 to 331 in e9584fa
Caching StanzaID: 2026-03-06-e30b84344d0ead48 -> -> {{ } 2026-03-06-e30b84344d0ead48 bridge-test@example.im/test}
line confirms OriginID is nil.
and the result is that replies don't reply:
is rendered as:
I would love to make only minimal changes to go-xmpp but we tried and found it didn't work 🥲, so this is where we ended up.
We're talking past each other. I'm sure you have an important concern that I'm totally overlooking. Something broad maybe? What is it? Are you worried that id isn't long-term stable? Is it because you prefer to use StanzaID as the canonical ID? Are you worried about putting too much pressure for changes on go-xmpp?
Avoiding ID
I have thought about trying to do this without exposing ID in go-xmpp. It's possible but super awkward and I'm sure it will lead to confusion and bugs.
The task: we need to pick IDs that the gateway can use to refer to each message permanently as soon as the message is sent/received.
The incoming case is easy: set config.Message.ID = xmpp.Chat.StanzaID.ID (which is in fact the actual situation), and then an incoming reply will be tagged with the same StanzaID and can pass it directly up to the gateway as ParentID, so that's great ⭐ .
But the outgoing case is doubly complicated to compensate for the simplicity on that side 😭🧅🧅🧅: it needs to find out the OriginID and/or ID set by go-xmpp (always necessitating some patch to go-xmpp) -- let's say we use OriginID -- and then return that to the gateway as the stable ID. When the echoed copy comes in we need to record it in a outgoingMessageIDs lru.Cache[StanzaID, OriginID] but make sure to detect it was an echo to one of our outgoing messages; how would you do that 1? And then when a reply comes in, we need to check for OriginID , _ := outgoingMessageIDs.Get(StanzaID) and return the OriginID in that case.
So then there's two sets of IDs intermixed in the gateway's cache.
Using .ID everywhere is simpler to reason about.
Using StanzaID
Just to be clear, I am for using StanzaID as the stable ID, but until #159 is done that is really hard because we don't know it until the echo. And that's the kind of patch that can go on for months or years. Please don't let it block XMPP replies; I want to use replies now, and I want to build reactions on it soon.
Having Send() -> xmpp.Chat
mdosch: I think in general it would make sense to return xmpp.Chat as there might also be other stuff generated in the future that might not be of interest today.
I made a POC commit for the Send() function there: xmppo/go-xmpp@1089118
@kousu : I like the idea of xmppo/go-xmpp@1089118 a lot, I can definitely work with that instead
@selfhoster1312 : Please do, as it seems to be the way the go-xmpp maintainer has chosen :)
Having go-xmpp pick the outgoing IDs is cleaner and I will use that, but we still need it to expose .ID because otherwise the code is split-brained between the outgoing and incoming paths for all the reasons above.
🏴☠️ 🥇 🤝 🥇 🏴☠️
I'm sorry this is dragging on so long 😅 . I don't want to turn this into a lumbering monster. What will it take to move it forward?
Footnotes
-
Naively: keep a list
outstandingIDs []OriginID, so that's one extra data structure to juggle. Or avoid it by examiningfrom == "matterbridge@example.im"but I'm worried that's forgeable; it surely risks breaking in weird ways if multiple bridges ran simultaneously by accident or if someone logged in manually as the bridge for testing. So it's hard to see how to avoid juggling that extra list. ↩
To be removed upon completion of xmppo/go-xmpp#226










Adding support for XEP-0461: Message Replies for the xmpp bridge
Depends on xmppo/go-xmpp#226 before it can be merged here.