
Conversation

benegee (Collaborator) commented Aug 25, 2023

Some experimental code to run a gtest with MPI.

The good thing is that it seems to work within our CI.

The bad thing is that it fails. What actually fails is, however, instructive:

  • The number of elements is 128 instead of 256. Given the fact that 2 processes were used, this seems plausible.
  • Some computed cell average value is just 0. So it seems the solution data was not synchronized, which is plausible, as I did not take care of that.

benegee mentioned this pull request Aug 25, 2023
sloede (Member) commented Aug 28, 2023

  • The number of elements is 128 instead of 256. Given the fact that 2 processes were used, this seems plausible.

Yes, this is good news 👍 However, it also shows that we should update the AnalysisCallback in Trixi.jl to report the global number of elements, not just the local number (which it does right now). It also shows that you need to make GTest MPI-aware. Thus, you should probably link it against MPI and initialize MPI within GTest, so that you also have access to MPI_Comm_rank, MPI_Comm_size, etc. to figure out how to adjust your expectations (e.g., for the number of local elements).

Another option would be to set a special environment variable (e.g., LIBTRIXI_NUM_RANKS) and if it is set, get the information from there. This might be easier in the beginning, but eventually you will have to use actual MPI support, so maybe just go down this road immediately.
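For illustration, a rough sketch of what such an MPI-aware GTest setup could look like (everything here is hypothetical: the test, the get_local_element_count placeholder standing in for the corresponding libtrixi call, and the commented-out LIBTRIXI_NUM_RANKS fallback; it only mimics the 256-element case discussed in this PR):

```cpp
// Sketch of an MPI-aware GoogleTest driver (illustrative, not the actual
// test code of this PR): MPI is initialized before the tests run and
// finalized afterwards, so individual tests can query the communicator.
#include <cstdlib>
#include <gtest/gtest.h>
#include <mpi.h>

// Hypothetical placeholder for the libtrixi call that returns the number of
// elements owned by this rank; here it simply mimics a uniform partition of
// the 256-element mesh used in this PR.
int get_local_element_count() {
  int nranks = 1;
  MPI_Comm_size(MPI_COMM_WORLD, &nranks);
  return 256 / nranks;
}

TEST(MeshInfo, LocalElementCount) {
  int nranks = 1;
  MPI_Comm_size(MPI_COMM_WORLD, &nranks);

  // Alternative without MPI awareness (would replace the MPI_Comm_size
  // query above), as mentioned in the comment:
  // const char* env = std::getenv("LIBTRIXI_NUM_RANKS");
  // const int nranks = (env != nullptr) ? std::atoi(env) : 1;

  // With a uniform partition, each rank is expected to hold 256 / nranks
  // elements (128 for the 2-rank CI run).
  EXPECT_EQ(get_local_element_count(), 256 / nranks);
}

int main(int argc, char** argv) {
  ::testing::InitGoogleTest(&argc, argv);
  MPI_Init(&argc, &argv);

  const int result = RUN_ALL_TESTS();

  MPI_Finalize();
  return result;
}
```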

  • Some computed cell average value is just 0. So it seems the solution data was not synchronized, which is plausible, as I did not take care of that.

I'm not sure it's necessarily a synchronization problem: each rank is supposed to compute the cell averages for its own cells only. So when you're checking for the average in cell 929 (928 + 1) in https://github.com/trixi-framework/libtrixi/pull/85/files#diff-bf51675732cd01fd6ac5d3c6f3645b10163b540a95927b3c2b8dc15a529ee673R55, not only does it give you the value of a different cell, but also of a different variable:

In the original case, you had 1600 entries per variable (structure-of-arrays layout); with 2 ranks you only have 800 on each rank. Thus, before you were checking the 929th density value, whereas with 2 ranks you are checking the 129th x-velocity value.
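To spell out the index arithmetic, here is a small standalone illustration using the numbers from this discussion (the variable and constant names are just for this sketch):

```cpp
// Why flat index 928 hits a different (variable, cell) pair once the data is
// split over 2 ranks: with a structure-of-arrays layout, the flat index of
// variable v in cell c is v * nelements + c.
#include <cstdio>

int main() {
  const int flat_index       = 928;       // index checked in the test (0-based)
  const int nelements_serial = 1600;      // 1 rank: 1600 entries per variable
  const int nelements_local  = 1600 / 2;  // 2 ranks: 800 entries per variable

  // serial run: variable 0 (density), cell 929 (1-based)
  std::printf("serial:  variable %d, cell %d\n",
              flat_index / nelements_serial, flat_index % nelements_serial + 1);

  // 2 ranks: variable 1 (x-velocity), cell 129 (1-based)
  std::printf("2 ranks: variable %d, cell %d\n",
              flat_index / nelements_local, flat_index % nelements_local + 1);
  return 0;
}
```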

Here's a good resource on using Google Test with MPI: https://elib.dlr.de/144193/1/Testing%20HPC%20C%2B%2B%20software%20with%20GoogleTest.pdf
Maybe this will give you some ideas on how to add parallel testing with GTest?

It seems that the t8code folks use it as well (see, e.g., https://github.com/DLR-AMR/t8code/blob/main/test/t8_gtest_main.cxx), so maybe this could be something to bring up at the next ADAPTEX meeting where t8code folks are in attendance?

benegee (Collaborator, Author) commented Aug 28, 2023

Thanks for the detailed analysis and the helpful hints!

benegee (Collaborator, Author) commented Aug 28, 2023

Just for reference: https://github.com/DLR-SC/googletest_mpi
It seems a little outdated, though.

sloede (Member) commented Aug 31, 2023

I don't understand the errors: for example, https://github.com/trixi-framework/libtrixi/actions/runs/6038153366/job/16383897333?pr=85#step:19:382 compares two values, 0.0028566952356658794 and 0.0028566952356658751, which differ by only -4.336808689942018e-18. Maybe your tolerances are too tight?

codecov bot commented Aug 31, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.03% 🎉

Comparison is base (dd260de) 98.20% compared to head (e2e8102) 98.23%.

❗ Current head e2e8102 differs from the pull request's most recent head 28ef707. Consider uploading reports for commit 28ef707 to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #85      +/-   ##
==========================================
+ Coverage   98.20%   98.23%   +0.03%     
==========================================
  Files          12       12              
  Lines         500      510      +10     
==========================================
+ Hits          491      501      +10     
  Misses          9        9              
Flag        Coverage Δ
unittests   98.23% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.

Files Changed                Coverage Δ
LibTrixi.jl/src/LibTrixi.jl  100.00% <ø> (ø)
src/api.f90                   97.36% <ø> (ø)
LibTrixi.jl/src/api_c.jl     100.00% <100.00%> (ø)
LibTrixi.jl/src/api_jl.jl     98.38% <100.00%> (+0.08%) ⬆️
src/api.c                    100.00% <100.00%> (ø)


benegee (Collaborator, Author) commented Aug 31, 2023

Maybe your tolerances are too tight?

Exactly! I had used EXPECT_DOUBLE_EQ, which according to the docs

Verifies that the two double values val1 and val2 are approximately equal, to within 4 ULPs from each other.

I had not heard of units in the last place (ULPs) before.

I now use EXPECT_NEAR with an absolute tolerance of 1e-14 (the values to compare are in the range [0.1, 0.9]).
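For reference, a minimal sketch of the change (the two values are the pair from the CI log quoted above; the test name is made up):

```cpp
// Illustration of the tolerance change: EXPECT_DOUBLE_EQ allows a deviation
// of at most 4 ULPs, which these two values exceed, while EXPECT_NEAR takes
// an explicit absolute tolerance.
#include <gtest/gtest.h>

TEST(CellAverages, Tolerance) {
  const double reference = 0.0028566952356658794;  // value from the CI log
  const double computed  = 0.0028566952356658751;

  // fails: the values are more than 4 ULPs apart
  // EXPECT_DOUBLE_EQ(computed, reference);

  // passes: absolute tolerance of 1e-14
  EXPECT_NEAR(computed, reference, 1e-14);
}
```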

benegee (Collaborator, Author) commented Aug 31, 2023

A Julia test for the new trixi_nelements_global was missing.
Coverage should recover.

benegee marked this pull request as ready for review, August 31, 2023 15:24
benegee requested a review from sloede, August 31, 2023 15:24
sloede (Member) left a comment:

Minor suggestions, but other than that, this looks great!

sloede enabled auto-merge (squash), August 31, 2023 16:17
sloede merged commit 734c787 into main, Aug 31, 2023
sloede deleted the bg/gtest-mpi branch, August 31, 2023 16:46