Skip to content

Vendor stringdiff to fix pathological poor performance#412

Merged
lihaoyi merged 8 commits intomasterfrom
diff-perf
Jan 9, 2026
Merged

Vendor stringdiff to fix pathological poor performance#412
lihaoyi merged 8 commits intomasterfrom
diff-perf

Conversation

@lihaoyi
Copy link
Copy Markdown
Member

@lihaoyi lihaoyi commented Jan 9, 2026

Assertions like assert(Seq.tabulate(500)(identity) == Nil) would hang when rendering the diff due to non-linear performance slowdowns when comparing elements. It seems likely caused by usage of Views throughout app.tulz::stringdiff:0.3.4. stringdiff seems unmaintained, and fixing it required more changes than could be done easily when shading in the build.mill file, so I just vendored the whole thing in utest/src/utest/shaded/stringdiff/ and fixed it up in place

@lihaoyi lihaoyi merged commit ae97549 into master Jan 9, 2026
4 checks passed
@lihaoyi lihaoyi deleted the diff-perf branch January 9, 2026 04:52
@lefou
Copy link
Copy Markdown
Member

lefou commented Jan 9, 2026

I can't see a single open issue or pull request in the now vendored stringdiff project. In fact, I can't find any report of poor performance there at all. So how did you came to the conclusion, it's no longer maintained?

@yurique
Copy link
Copy Markdown

yurique commented Jan 10, 2026

oh, wow!

I had no idea stringdiff was a dependency in utest 🙀 (or, anywhere, to be honest - I don't think I ever got any feedback about it since I released it :) ).

I definitely can apply these performance fixes upstream.

@lihaoyi
Copy link
Copy Markdown
Member Author

lihaoyi commented Jan 10, 2026

@lefou @yurique I came to the conclusion in like 30s of looking at the upstream repo haha, it wasn't some big decision. This whole PR was vibe coded in like 10 minutes

@lihaoyi
Copy link
Copy Markdown
Member Author

lihaoyi commented Jan 10, 2026

@yurique don't worry about it, I'm happy to take OSS code and patch/maintain it as necessary. You don't need to feel obliged to support it unless you really want to!

@lefou
Copy link
Copy Markdown
Member

lefou commented Jan 10, 2026

@yurique don't worry about it, I'm happy to take OSS code and patch/maintain it as necessary. You don't need to feel obliged to support it unless you really want to!

TBH, that's a worst case usage scenario an OSS author and maintainer can face.

@yurique
Copy link
Copy Markdown

yurique commented Jan 10, 2026

I applied the changes and one of the tests is failing, although it's a subtle difference (the white-space):

                                    ⌄                         ⌄
[ InFirst(prefix-1 ), InBoth(match-1 ), InSecond(diff-1 diff-2 ), InBoth(match-2) ]
was not equal to 
[ InFirst(prefix-1 ), InBoth(match-1), InSecond( diff-1 diff-2), InBoth( match-2) ]
                                                ^                       ^

Now I'm trying to find why. (I even reverted those lastOptions back to maxOptions)
Interesting 🤔

@yurique
Copy link
Copy Markdown

yurique commented Jan 11, 2026

If anyone is curious (and it took me a LOT of printlns :) ) - the difference was happening because the old version (with views) had a hidden subtle bug.

There is a piece that moves whitespaces around when normalizing diffs:

  def moveWhitespace(
    diff: List[DiffElement[IndexedSeqView[String]]]
  ): List[DiffElement[IndexedSeqView[String]]] =
    ListScan(diff) {
      case InSecond(second) :: InBoth(Whitespace(both)) :: InFirst(first) :: tail if first == second => // <---
        Nil -> (
          InFirst(both) :: InBoth(second) :: InSecond(both) :: tail
        )

      case InFirst(first) :: InBoth(Whitespace(both)) :: InSecond(second) :: tail if first== second => // <---
        Nil -> (
          InSecond(both) :: InBoth(second) :: InFirst(both) :: tail
        )
...

It was comparing views with == - which always returned false.

@yurique
Copy link
Copy Markdown

yurique commented Jan 11, 2026

@lihaoyi @lefou

Version 0.4.0 is out: https://github.com/tulz-app/stringdiff/releases/tag/v0.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants