Extending test coverage for 2D visualization #5000
orbeckst merged 11 commits into MDAnalysis:develop from
Conversation
Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on GitHub Discussions so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files:
@@ Coverage Diff @@
## develop #5000 +/- ##
===========================================
+ Coverage 93.43% 93.58% +0.15%
===========================================
Files 177 189 +12
Lines 21894 22960 +1066
Branches 3095 3095
===========================================
+ Hits 20457 21488 +1031
- Misses 986 1019 +33
- Partials 451 453 +2
orbeckst left a comment:
Quick surface-level review: Given that these functions were previously private, we should keep them out of the public API by naming them with a leading underscore:
_produce_list_indices_point_in_polygon_this_frame()
_produce_list_centroids_this_frame()
Then they also don't have to show up in docs.
Also add yourselves to AUTHORS.
We don't need a CHANGELOG entry for tests/internal refactoring.
@lilyminium @tylerjereddy would you be able to add a review — should be quick. Please leave an accept or changes requested to be clear about your intentions. This is GSoC relevant so it needs to get done soon and also needs to be unambiguous. Thank you!
By end of day Monday or sooner?
End of Monday works!
Force-pushed from 5c5e5a5 to a293f3a (compare)
Made the requested changes!
lilyminium left a comment:
Great additions to the test coverage @TRY-ER, I've added some suggestions below.
indices_tuple = (np.array([1, 3]),)
list_indices = [indices_tuple]
result = streamlines._produce_list_centroids_this_frame(list_indices, pts)
expected = np.average(pts[[1, 3]], axis=0)
- expected = np.average(pts[[1, 3]], axis=0)
+ expected = np.array([2., 2.])
It's better to test actual values instead of re-running through the function logic.
Added actual values directly!
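The point generalizes well beyond this PR. A minimal sketch with hypothetical data (the `pts` array and the `centroid_of_indices` helper below are stand-ins, not the actual test fixtures): a hand-computed literal catches bugs that re-running the function's own averaging logic would silently reproduce.

```python
import numpy as np

# Hypothetical points; pts[[1, 3]] averages to exactly (2.0, 2.0).
pts = np.array([[0.0, 0.0], [1.0, 3.0], [5.0, 5.0], [3.0, 1.0]])

def centroid_of_indices(indices, points):
    # Stand-in for the centroid helper: average the selected points.
    return np.average(points[indices], axis=0)

# Anti-pattern: recomputes the function's own logic as the "expected" value,
# so a bug in the averaging call would appear on both sides of the check.
expected_recomputed = np.average(pts[[1, 3]], axis=0)

# Preferred: a hand-computed literal, independent of the implementation.
expected_literal = np.array([2.0, 2.0])

result = centroid_of_indices([1, 3], pts)
np.testing.assert_allclose(result, expected_literal)
```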
expected1 = np.average(pts[[0, 2]], axis=0)
expected2 = np.average(pts[[1, 3, 4]], axis=0)
Here as well could you please use the actual values?
Added actual values!
(np.array([0, 0]), [0.7999992370605469, 0.5399990081787109]),
(np.array([1, 0]), [0.8000001907348633, 0.5399971008300781]),
(np.array([2, 0]), [0.8000020980834961, 0.5400047302246094]),
(np.array([3, 0]), [0.8000001907348633, 0.5400009155273438]),
(np.array([4, 0]), [0.7999982833862305, 0.5400009155273438]),
(np.array([0, 1]), [0.7999992370605469, 0.5399999618530273]),
(np.array([1, 1]), [0.7999954223632812, 0.5400009155273438]),
(np.array([2, 1]), [0.7999992370605469, 0.5400047302246094]),
(np.array([3, 1]), [0.7999954223632812, 0.5399932861328125]),
(np.array([4, 1]), [0.8000030517578125, 0.5399932861328125]),
(np.array([0, 2]), [0.8000068664550781, 0.5399999618530273]),
(np.array([1, 2]), [0.7999992370605469, 0.5400009155273438]),
(np.array([2, 2]), [0.8000106811523438, 0.5400009155273438]),
(np.array([3, 2]), [0.8000106811523438, 0.5399932861328125]),
(np.array([4, 2]), [0.8000030517578125, 0.5399932861328125]),
(np.array([0, 3]), [0.7999954223632812, 0.5399999618530273]),
(np.array([1, 3]), [0.7999954223632812, 0.5400009155273438]),
(np.array([2, 3]), [0.8000030517578125, 0.5399971008300781]),
(np.array([3, 3]), [0.8000030517578125, 0.5399932861328125]),
(np.array([4, 3]), [0.8000030517578125, 0.5400009155273438]),
(np.array([0, 4]), [0.79998779296875, 0.5400009155273438]),
(np.array([1, 4]), [0.8000106811523438, 0.5399971008300781]),
(np.array([2, 4]), [0.7999954223632812, 0.5400047302246094]),
(np.array([3, 4]), [0.8000030517578125, 0.5400009155273438]),
(np.array([4, 4]), [0.7999954223632812, 0.5399932861328125]),
It looks like these are all roughly (0.8, 0.54) -- could you use the atol parameter to check for that instead of including such fine precision?
As suggested, a check was added comparing the approximate values against 0.8 and 0.5 (rather than checking the whole output).
tylerjereddy left a comment:
At first glance, I think I agree with Lily--this cleans up some nested functions I never should have written that way (the code is very old, I was still learning) and exposes them to some tests.
The tests may be mostly around basic functionality, but that still has some value of course. I added a few inline comments/suggestions.
One more thing--the description of this PR cites gh-597 and I don't see the connection--that issue is about checking that errors are raised properly in our source code by testing faulty inputs, etc. The only relationship to this PR is the theme of improving test coverage, but that particular issue is about that specific type of test coverage, for ensuring that errors are raised as expected.
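For illustration, the kind of test that issue asks for might look like this (the `grid_spacing` helper is hypothetical, not an MDAnalysis function): feed a faulty input and assert that the expected exception is raised.

```python
import pytest

def grid_spacing(xmin, xmax, num_bins):
    # Hypothetical helper that validates its inputs and raises on
    # faulty values, in the style of test coverage gh-597 asks for.
    if num_bins <= 0:
        raise ValueError("num_bins must be positive")
    if xmax <= xmin:
        raise ValueError("xmax must be greater than xmin")
    return (xmax - xmin) / num_bins

def test_grid_spacing_rejects_bad_bounds():
    # The faulty input (xmax < xmin) must raise, with a matching message.
    with pytest.raises(ValueError, match="xmax must be greater"):
        grid_spacing(1.0, 0.0, 10)
```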
list_indices = [indices_tuple]
result = streamlines._produce_list_centroids_this_frame(list_indices, pts)
expected = np.array([2.0, 2.0])
np.testing.assert_array_almost_equal(result[0], expected)
We should prefer assert_allclose in new code per the docstring of this NumPy function I think.
Used assert_allclose in recent changes.
        [0.5, 0.5],
        [1.5, 0.5],
    ]
)
minor: I think we could write this more compactly
Not sure why, but black was reporting errors, hence I wrote the list in this manner.
Even though I agree with @tylerjereddy, I am going to succumb to black's opinion here and just leave it in whichever way black wants it to be. Not worthwhile to mark it as an exception.
    vertex_list, points
)
expected = [(np.array([0], dtype=int),)]
np.testing.assert_array_equal(result[0][0], expected[0][0])
Maybe I'm misunderstanding something, but above it says:
matplotlib.path.Path.contains_points does not include boundary points.
But expected has the index-0 point detected as interior, no?
My bad, I just checked: the boundary point is included by the Matplotlib path here, and the call returns a 1D NumPy array containing index 0. I updated the comment accordingly.
I think you're just checking the 0-index here, which wouldn't check if result had more than one entry -- could you convert both result and expected to arrays?
In this specific test, I was checking a single point on the shape's boundary! Since there is only one expected element in the array, I tested only the first element in the initial test. As you suggested, I now check all the items in the list rather than only the first one.
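For context, here is a small standalone check (assuming matplotlib is installed) of how `matplotlib.path.Path.contains_points` behaves: interior and exterior points are unambiguous, while a point exactly on the boundary is not guaranteed either way by the API, which is why the test comment needed care.

```python
import numpy as np
from matplotlib.path import Path

# Unit square as a matplotlib Path.
square = Path([(0, 0), (1, 0), (1, 1), (0, 1)])

points = np.array([
    [0.5, 0.5],  # strictly interior
    [2.0, 0.5],  # strictly exterior
    [1.0, 0.5],  # exactly on the boundary: result is not guaranteed
])

inside = square.contains_points(points)
# Only the first two entries have a defined answer (True, False);
# the boundary entry may differ across matplotlib versions/platforms.
```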
for entry in values:
    res = entry[1]
    assert res[0] == pytest.approx(0.8, abs=1e-1)
    assert res[1] == pytest.approx(0.5, abs=1e-1)
Probably minor, but I suspect using assert_allclose here with a slice of the first two indices will tell us if only one of these is failing, whereas as written the first failure could mask the success/failure of the second assertion.
Done with numpy.testing.assert_allclose()!
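A sketch of the difference with hypothetical values: the vectorized check reports every mismatching component in one failure message, whereas the first of two scalar asserts can hide the status of the second.

```python
import numpy as np

res = np.array([0.81, 0.52])  # hypothetical (x, y) displacement entry

# Two scalar asserts: if the first fails, the second never runs, so we
# never learn whether the y component was also off.
# assert res[0] == pytest.approx(0.8, abs=1e-1)
# assert res[1] == pytest.approx(0.5, abs=1e-1)

# One vectorized check: a single failure report covers both components.
np.testing.assert_allclose(res[:2], [0.8, 0.5], atol=1e-1)
```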
ymin=univ.atoms.positions[..., 1].min(),
ymax=univ.atoms.positions[..., 1].max(),
maximum_delta_magnitude=2.0,
num_cores="maximum",
I might be a little hesitant to test num_cores="maximum" by default, at least in the CI. One option might be to have a slow test marker for the multi-core test and only do that locally/when needed, but I'm not sure if we have such a marker at the moment (it is quite common upstream for resource-intensive tests where you occasionally may want to check/bisect, but don't want to/can't run continuously).
I suspect the pragmatic thing for now may be to just use a single core, or perhaps 2 so that we get concurrency with reduced risk of locking things up.
The test case was added to cover the "maximum" core-inclusion case. I have changed the cores to 2. If such a marker is implemented later, I will modify the test case accordingly to check that the "maximum" case works as expected.
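A possible shape for such a marker, following the custom-marker pattern from the pytest documentation (the test names and marker registration below are hypothetical, not existing MDAnalysis infrastructure):

```python
import pytest

# The marker would be registered once in the test suite's configuration,
# e.g. in pytest.ini:
#
#   [pytest]
#   markers =
#       slow: resource-intensive tests (deselect with -m "not slow")

@pytest.mark.slow
def test_streamlines_maximum_cores():
    # Hypothetical multi-core test body; run on demand with `pytest -m slow`,
    # skipped in CI with `pytest -m "not slow"`.
    ...

def test_streamlines_two_cores():
    # Cheap default-path test: stays unmarked and runs everywhere.
    ...
```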
You are right that the referenced issue does not entirely align with the contribution. But among the issues tagged as
orbeckst left a comment:
Overall looks good to me. As formalities:
- add your GH handle to the author list for the 2.10 release in CHANGELOG – we don't need an entry but you should be visible as a contributor
- fix author order in AUTHORS
@lilyminium and @tylerjereddy please let me know if you have any remaining concerns/comments. Thank you for reviewing, much appreciated!!
- Matthew Davies
- Jia-Xin Zhu
- Tanish Yelgoe
2025

package/AUTHORS (outdated)
- Namir Oues
- Lexi Xu
- BHM-Bob G
- Debasish Mohanty
Please add your name at the end of the list.
@TRY-ER sorry about what I said in my earlier review.
I should have been more precise: we don't need a bullet point in CHANGELOG but we do want your name in there as a contributor. I realize that my earlier review may have come across as not valuing your contribution here. I am sorry for that. ALL contributions are important for the success of the project and we want to give credit to all contributors.
Just for information! My name is Debasish Mohanty (in case my GitHub handle confuses anyone). [I am proposing the MDAKit for polymers for GSoC 2025.]
orbeckst left a comment:
LGTM from my vantage point! Thanks. I am waiting for opinions from @tylerjereddy and @lilyminium .
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
vertex_list = [square]
points = np.array([[1, 0.5]])  # exactly on the boundary
Thanks for adding this test! Just to be rigorous (and because I'm curious), what happens if vertex_list contains two adjacent squares both including the boundary that points lies on? Does _produce_list_indices_point_in_polygon_this_frame identify the point as being in both squares?
Yes, as implemented in the _produce_list_indices_point_in_polygon_this_frame function, the index of a point lying on the shared boundary of two adjacent squares will be included for both sets of vertices (identifying the point as being in both squares).
I see, thanks. Sorry, I should have been clearer -- could you please update the test to include this case and make it clearer? Otherwise the PR looks good!
A test has been added to check that the indices of points shared by adjacent squares are returned as expected by _produce_list_indices_point_in_polygon_this_frame.
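The scenario can be reproduced standalone (assuming matplotlib; `indices_in_polygon` below is a simplified stand-in, not the MDAnalysis helper): a point on the shared edge x = 1 is attributed to both squares when a small radius tolerance is applied, and testing both radius signs sidesteps matplotlib's orientation-dependent radius convention.

```python
import numpy as np
from matplotlib.path import Path

def indices_in_polygon(vertices, points, eps=0.2):
    # Simplified stand-in for the point-in-polygon helper: include points
    # close to the boundary by testing both radius signs, since the effect
    # of the radius sign depends on the path's winding order in matplotlib.
    path = Path(vertices)
    inside = (path.contains_points(points, radius=eps)
              | path.contains_points(points, radius=-eps))
    return np.nonzero(inside)[0]

left = [(0, 0), (1, 0), (1, 1), (0, 1)]
right = [(1, 0), (2, 0), (2, 1), (1, 1)]
points = np.array([[1.0, 0.5]])  # on the shared edge x = 1

# The single point should be attributed to both adjacent squares.
hits = [indices_in_polygon(sq, points) for sq in (left, right)]
```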
lilyminium left a comment:
LGTM -- thank you @TRY-ER!
Thank you for your contribution @TRY-ER! Congratulations on the merged PR 🎉!
Thank you @orbeckst, @lilyminium, @tylerjereddy for your kind guidance and support.
Issue 597
GSoC Primary Contribution for test coverage!
Changes made in this Pull Request:
- visualization.streamlines.py showed 54% test coverage.
- Cases where the "maximum" number of cores is utilized were handled.
- Increased code coverage in the specific file by 38%.
PR Checklist
- package/CHANGELOG file updated?
- Is your name in package/AUTHORS? (If it is not, add it!)

Developers Certificate of Origin
I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.
📚 Documentation preview 📚: https://mdanalysis--5000.org.readthedocs.build/en/5000/