WIP: Reduce use of Selenium in favour of rack-test by pablobm · Pull Request #6497 · openstreetmap/openstreetmap-website

pablobm · 2025-11-03T16:13:03Z

Just a proposal: would it make sense to use rack-test for system tests, where possible?

If we can reduce the use of Selenium, the test suite will run faster and more reliably. The code in this PR is just a quick experiment, showing that only 40% of system tests need Selenium. The relevant output is the following:

272 runs, 1285 assertions, 53 failures, 55 errors, 0 skips
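
The 40% figure appears to come from counting every failure or error under rack-test as a test that needs Selenium. A quick check of that arithmetic, assuming that reading of the output is correct:

```ruby
# Figures copied from the test run output above.
runs = 272
failures = 53
errors = 55

# Assumption: each failing or erroring run is a test that needs JS (Selenium).
needs_selenium = failures + errors
share = (needs_selenium.to_f / runs * 100).round(1)

puts "#{needs_selenium} of #{runs} runs (#{share}%) did not pass under rack-test"
```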

tomhughes · 2025-11-03T17:09:11Z

I'm not familiar with rack-test so can you explain a bit about how it works and how it differs from using selenium?

Presumably it's not actually using a browser so all you're really doing is checking that the expected HTML is returned? So what does it offer over a controller or integration test?

pablobm · 2025-11-03T17:18:38Z

Right, that's exactly it. The difference with controller tests is that you get to browse across different controllers, not just one. The difference with integration tests is that you browse by clicking links, submitting forms, etc., as opposed to sending low-level HTTP requests.

In general, my view is that it's best to avoid Selenium unless you really need JS. In this case, the tests involved are typically about the profile section and similar, which don't need JS most of the time. Anywhere we think the JS is relevant, we can add it back.

The implementation would look something like "Selenium by default, except in these tests where we use rack-test" or the other way around.

gravitystorm · 2025-11-04T09:41:38Z

I'd be happy to see the tests updated to use rack-test by default, and a full browser only when needed for full javascript execution. I don't think there's any noticeable gap between the html generation of rack-test and a browser, whereas there were often issues back in the poltergeist-vs-real-browser days for JS execution.

For me, the "visit link" style testing is the most important bit. I'd prefer if the controller-style and integration-style tests were all rewritten away from raw POSTs etc (apart from the tests that deliberately try to inject incorrect form submissions to validate the error handling) but that's such a huge project I've never thought it was worthwhile.

I'm used to using rspec in other projects, and only tagging the specific system tests that need javascript with an option, e.g. it "does something", :js do. I don't know if there's a way to do similar tagging with minitest, or whether trying to do that becomes more complex (especially for novice developers) than it's worth.

tomhughes · 2025-11-04T09:53:07Z

As @gravitystorm says, I know he's long been keen to move away from controller and integration tests to system tests, and I understand the logic that you're actually properly testing the behaviour of the site as the user interacts with it. The downside has always been the cost in time/resources of running everything through selenium.

So it sounds like this is a solution to that but with the downside that you're not really testing how the site will behave in a browser any more.

Is there a tension here with more use of turbo? Presumably rack-test won't be doing any additional loads that turbo might call for either, or will it?

pablobm · 2025-11-04T10:56:42Z

100% agree with leaving controller/integration tests for low-level testing. My experience is that these are typically used for testing specific technical details, variants of "happy paths" reflected in system tests, and malicious usage of APIs.

My main experience is also with RSpec with the syntax shared by @gravitystorm. I want to think that with minitest it can be done with something like driven_by :rack_test similar to how there are driven_by declarations on https://github.com/openstreetmap/openstreetmap-website/blob/master/test/system/embed_test.rb, but I haven't gone that far into testing this idea yet.

Re: Turbo, yes, that's a tradeoff. With rack-test, links and form submissions will always trigger full-page loads. This creates a difference between what the tests do and what real users do. At this point the question is: how much do we trust Rails to do what it promises to do?

When using an external tool, we'd like to think that we don't need to test the tool itself, just the things we do with it, if that makes sense. Having said that, things can still go wrong if something is misconfigured, e.g. if we label a turbo frame incorrectly by mistake. Still, we can go on a case-by-case basis: for example, using Selenium only for the happy paths, or for specific situations where there's a known risk.

In exchange, we gain a faster execution of the test suite, a reduction of flaky tests, and more easily debuggable system tests.

gravitystorm · 2025-11-04T11:12:21Z

So it sounds like this is a solution to that but with the downside that you're not really testing how the site will behave in a browser any more.

I have the "behave like a browser" split into two aspects in my mind:

With regards to the manner that the tests are written in, I prefer "click_on link, fill_in foo, click submit" as more closely mimicking what a user is doing. The alternative is "assert page has_link(foo), get foo, assert_dom has_input_field, post bar" and I find that more fragile and generally less like mimicking the user experience. So that's in favour of the system test syntax.
Then when it comes to the manner the tests are run, I'd love to just push everything through a real (albeit headless) browser, but for practical reasons it can be painfully slow. I console (delude?) myself that using e.g. rack-test (or similar) is like "using the site with javascript disabled". It's not ideal, but I haven't come across many actual problems with doing that.

I live in hope that someone can create something that's much closer to a real browser, in terms of layout engines and js execution, without having to actually be an entire browser and all the headaches that brings.

gravitystorm · 2025-12-03T19:18:20Z

I don't know if there's a way to do similar tagging with minitest

https://github.com/ordinaryzelig/minispec-metadata - but this only works if we're already using the spec syntax which would be a ton of work - almost at the "might as well use rspec" level of effort, perhaps.
https://github.com/bernardoamc/minitag - as far as I can tell, this is only useful for choosing which tests to run, rather than being a source of metadata that e.g. before blocks can use to switch drivers. I could be wrong on this. I'm not keen on the syntax.

So I think the only approach is to be more explicit with choosing the driver in each class or in specific tests. Something like:

class MixedDriverTest < ApplicationSystemTestCase
  driven_by :rack_test # default

  test "simple non-JS test" do
    visit "/status"
    assert_text "OK"
  end

  test "switch to selenium just for this test" do
    Capybara.using_driver(:selenium_chrome_headless) do
      visit "/dashboard"
      # JS interactions here
      assert_selector ".map"
    end
  end
end

I would probably default to :rack_test at the ApplicationSystemTestCase level, and then provide a custom method in place of the using_driver stuff, for ease of refactoring and being more explicit as to why we're doing it e.g.

  test "switch to selenium just for this test" do
    using_javascript do
      visit "/dashboard"
      # JS interactions here
      assert_selector ".map"
    end
  end
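
A minimal sketch of what such a helper could look like, assuming it simply wraps `Capybara.using_driver`. The module name `JavascriptHelper` and the choice of `:selenium_chrome_headless` are assumptions for illustration, and the stub `Capybara` module exists only so the sketch runs outside Rails:

```ruby
# Minimal stand-in so the sketch runs outside Rails. In the real suite,
# Capybara itself would be loaded and this stub would not be defined.
unless defined?(Capybara)
  module Capybara
    class << self
      attr_accessor :current_driver

      # Mirrors the behaviour of Capybara.using_driver: switch drivers for
      # the duration of the block, then restore the previous one.
      def using_driver(driver)
        previous = current_driver
        self.current_driver = driver
        yield
      ensure
        self.current_driver = previous
      end
    end
  end
end

# Hypothetical helper: the method name comes from the example above,
# the module name and the driver choice are assumptions.
module JavascriptHelper
  JS_DRIVER = :selenium_chrome_headless

  def using_javascript(&block)
    Capybara.using_driver(JS_DRIVER, &block)
  end
end
```

In the real suite this would presumably be included into ApplicationSystemTestCase, so that the driver choice lives in one place and can be changed without touching individual tests.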

I'd be interested in a comparison of test suite runtimes, to see if it's worthwhile.

github-actions · 2026-01-29T14:47:07Z

	1 Warning
⚠️	Number of updated lines of code is too large to be in one PR. Perhaps it should be separated into two or more?

Generated by 🚫 Danger

pablobm · 2026-01-30T11:07:35Z

I had another pass at this, converting a few tests for comparison. At the moment I'm seeing the combination of Selenium+rack-test running in ~2/3 of the time it takes to run with Selenium only. For example: on my machine I ran a handful of converted tests (see tests.yml here) 10 times. The Selenium+rack-test strategy ran in ~11min, compared to ~16min for running them with Selenium only.
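
As a rough check of the "~2/3" figure, using only the approximate timings quoted above (the variable names are illustrative):

```ruby
# Approximate wall-clock times quoted above, in minutes, for 10 runs.
selenium_only_minutes = 16.0
mixed_minutes = 11.0

ratio = mixed_minutes / selenium_only_minutes
saving_percent = ((1 - ratio) * 100).round

puts format("mixed / selenium-only = %.2f (about %d%% less wall time)", ratio, saving_percent)
```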

The DX story is not great at the moment. For each test class, the individual tests need to be grouped as follows:

class AccountDeletionTest < ApplicationSystemTestCase
  class JsTest < AccountDeletionTest
    driven_by_selenium
    
    test "this requires selenium" do
      # ...
    end
  end

  # These need to go into their own subclass. Otherwise
  # JsTest above will also run the "html tests",
  # because as a subclass it will contain them
  # in addition to its own "js tests".
  class HtmlTest < AccountDeletionTest
    # `driven_by :rack_test` is already default
    
    test "this will work with just rack-test" do
      # ...
    end
  end
  
  # And in addition this will run three times: twice
  # with rack-test and once with Selenium, because
  # minitest detects the superclass and both subclasses
  # as containing it.
  test "some test" do
    # ...
  end
end
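
The triple run described in the last comment can be reproduced without Rails. The sketch below uses a plain-Ruby stand-in (`FakeTestCase` is hypothetical) and assumes that Minitest discovers tests by listing public instance methods, inherited ones included, whose names start with `test_`:

```ruby
# Plain-Ruby stand-in for a test case class (not Minitest itself).
class FakeTestCase
  # Same idea as Rails' `test "name" do ... end` macro.
  def self.test(name, &block)
    define_method("test_#{name.tr(' ', '_')}", &block)
  end

  # Assumption: inherited methods are included, as in Minitest.
  def self.runnable_methods
    public_instance_methods(true).grep(/\Atest_/).map(&:to_s).sort
  end
end

class AccountDeletionCase < FakeTestCase
  test("some test") {}

  class JsCase < AccountDeletionCase
    test("this requires selenium") {}
  end

  class HtmlCase < AccountDeletionCase
    test("this will work with just rack-test") {}
  end
end

[AccountDeletionCase, AccountDeletionCase::JsCase, AccountDeletionCase::HtmlCase].each do |klass|
  puts "#{klass}: #{klass.runnable_methods.inspect}"
end
```

Each of the three classes lists `test_some_test`, which is why a test left in the outer class runs once per class.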

Not a huge deal, we could come up with something more ergonomic. For example we could have test "a rack-test test" and js_test "a selenium test" or something like that. But before spending more time here, I want to check that this is looking compelling enough to the maintainers.

tomhughes · 2026-01-30T19:12:57Z

One option would be to split up tests so that any one file was all selenium or all rack test and then you just need to add driven_by_selenium to the selenium ones, or maybe have a SeleniumTestCase class they can inherit from.

The down side of that is that it might mean splitting up tests which are related and which would make more sense being grouped together?

pablobm · 2026-01-30T20:06:45Z

The down side of that is that it might mean splitting up tests which are related and which would make more sense being grouped together?

Yeah, that was my thought.

pablobm · 2026-02-03T10:46:20Z

Tried to put together some metaprogramming to provide a js_test method that defines Selenium-driven tests, separate from the rack-test ones. No success yet. I'm trying to get this method to put these Selenium tests in a separate class, basically automating what I'm doing with HtmlTest/JsTest, however I'm finding a problem:

If the new class inherits from the one where the tests are actually defined in code, then the tests will run twice: once for the new class and once for the parent.
If the new class doesn't inherit from there, then the tests will not have access to setup/teardown or private helpers.

Putting this to rest for now until I can come up with better ideas that are not overly clever.
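
One direction that might sidestep the double run, sketched in plain Ruby and untested against Minitest or Rails: let the generated class inherit from the parent, but have it list only the tests defined directly on it. Everything here is hypothetical, including `js_test`, `FakeSystemTestCase` and `driver_name`; in a real suite the generated class would also need its own `driven_by` call, and overriding `runnable_methods` may interfere with Minitest's test ordering.

```ruby
# Plain-Ruby stand-in for a Minitest-style base class (not Minitest itself).
class FakeSystemTestCase
  def self.test(name, &block)
    define_method("test_#{name.tr(' ', '_')}", &block)
  end

  # Assumption: mirrors how Minitest lists tests (inherited methods included).
  def self.runnable_methods
    public_instance_methods(true).grep(/\Atest_/).map(&:to_s).sort
  end

  # Defines the test on a lazily created subclass, so it keeps access to the
  # parent's setup and private helpers through ordinary inheritance.
  def self.js_test(name, &block)
    js_class.test(name, &block)
  end

  def self.js_class
    @js_class ||= const_set(:JsTest, Class.new(self) do
      # Only list tests defined directly on this subclass, so the parent's
      # tests are not run a second time under the JS driver.
      def self.runnable_methods
        public_instance_methods(false).grep(/\Atest_/).map(&:to_s).sort
      end

      def driver_name
        :selenium
      end
    end)
  end

  def driver_name
    :rack_test
  end
end

class ProfileCase < FakeSystemTestCase
  test("plain html") { helper }
  js_test("needs javascript") { helper }

  private

  def helper
    "#{driver_name} via shared helper"
  end
end
```

Here `ProfileCase` lists only its rack-test test, `ProfileCase::JsTest` lists only the JS one, and both can call the private `helper`.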

pablobm · 2026-04-16T11:25:21Z

Closing this as I'm not liking the options.

This was referenced Dec 3, 2025

Allow meta description and search title to be translated #6569

Merged

Flakey Docker workflow #6585

Open

pablobm force-pushed the rack-test-in-system-tests branch from 3408925 to 9a7d689 Compare January 29, 2026 14:46

github-actions bot added the big-pr label Jan 29, 2026

pablobm force-pushed the rack-test-in-system-tests branch 2 times, most recently from 87c1de7 to 4f43265 Compare January 29, 2026 15:36

pablobm mentioned this pull request Feb 2, 2026

Replacing Selenium with a more reliable alternative #6768

Open

gravitystorm mentioned this pull request Mar 11, 2026

Use rack by default for system tests #6878

Closed

pablobm added 4 commits April 16, 2026 12:06

EXPERIMENT: how many system tests require Capybara?

cc0e743

Porting some tests to Rack Test to measure

1ec6c7e

EXPERIMENT: where are we at the moment?

b952526

EXPERIMENT: Attempt at better DX

152aa19

pablobm force-pushed the rack-test-in-system-tests branch from 4f43265 to 152aa19 Compare April 16, 2026 11:22

pablobm closed this Apr 16, 2026
