Skip to content

Probability Distribution -- Gamma Distribution #1099

Merged
sebastian-mutz merged 56 commits intofortran-lang:masterfrom
Calluumm:master
Feb 26, 2026
Merged

Probability Distribution -- Gamma Distribution #1099
sebastian-mutz merged 56 commits intofortran-lang:masterfrom
Calluumm:master

Conversation

@Calluumm
Copy link
Copy Markdown
Contributor

Adds a Gamma Distribution to Stats
This is a modernisation/continuance of #278 all credit to the original author.

Comparatively to #278;

  • Tests have been standardised to current stdlib standards
  • Documentation lightly amended
  • Filepaths adjusted for current structure

ctest distribution_gamma passed fine
the function itself was previously tested against SciPy so I did again it's still accurate, I further tested against FSML's function and again it was identical.
specialfunctions did have to be linked into stats for the lower incomplete gamma function, if this isn't standard it could be duplicated into stats but I'm not sure what the practice is there for this lib

Not sure if having the specialfunctions dependancy is standard
Almost untouched from previous gamma distribution PR
Re-written to fit current stdlib standards for tests compared to older PR
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 24, 2026

Codecov Report

❌ Patch coverage is 0% with 51 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.28%. Comparing base (de3e59f) to head (23f2fd2).
⚠️ Report is 116 commits behind head on master.

Files with missing lines Patch % Lines
...ple/stats_distribution_gamma/example_gamma_cdf.f90 0.00% 18 Missing ⚠️
...ple/stats_distribution_gamma/example_gamma_pdf.f90 0.00% 18 Missing ⚠️
...ple/stats_distribution_gamma/example_gamma_rvs.f90 0.00% 15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1099      +/-   ##
==========================================
- Coverage   68.55%   68.28%   -0.28%     
==========================================
  Files         396      399       +3     
  Lines       12746    12797      +51     
  Branches     1376     1376              
==========================================
  Hits         8738     8738              
- Misses       4008     4059      +51     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Gamma distribution implementation to the stats component, including build integration, tests, and user-facing specifications.

Changes:

  • Introduces stdlib_stats_distribution_gamma with rvs_gamma, pdf_gamma, and cdf_gamma.
  • Updates build system to compile/link the new stats module (including linking specialfunctions).
  • Adds a stats test target and FORD spec documentation + index entry.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 15 comments.

Show a summary per file
File Description
src/stats/stdlib_stats_distribution_gamma.fypp Implements Gamma RNG/PDF/CDF (uses specialfunctions for incomplete gamma).
src/stats/CMakeLists.txt Adds the module to stats sources and links stats against specialfunctions.
test/stats/test_distribution_gamma.fypp Adds regression tests for RNG/PDF/CDF and a chi-squared RNG check.
test/stats/CMakeLists.txt Registers the new gamma distribution test.
doc/specs/stdlib_stats_distribution_gamma.md Adds module specification and examples.
doc/specs/index.md Adds Gamma distribution spec to the documentation index.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Calluumm and others added 12 commits January 25, 2026 09:08
fixes variable mixup and comment spelling mistake

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
spelling mistake

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
fixes comment inconsistency

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
fixed comment mistake

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
yeah 0 should return at <= 0 in gamma CDF, potentially could add a a different shape/rate > 0 check

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Makes sense hadn't considered that, clamping the index properly

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Better numerical stability at larger shapes
spelling mistake

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
further relying on specialfunction's gamma; suggested by copilot
@jalvesz
Copy link
Copy Markdown
Contributor

jalvesz commented Jan 26, 2026

Hi @Calluumm, thanks for taking care of this upgrade!
Regarding skipping some of the real kinds, instead of

#:set WITH_QP = False

You should prefer

#:for k, t, s in ...
#:if k in ['sp', 'dp']
...
#:endif
#:endfor

Please notice that the variable you were touching is a global preprocessing variable, which means that if you change it here, whichever other file is treated after will inherit that value. This is very dangerous and difficult to track when fypp preprocessing in parallel.

made gamma complex type to force sp dp xdp throughout
opted for same solution as the module, force using gamma complex type
@Calluumm
Copy link
Copy Markdown
Contributor Author

Thanks,
They were relics of the original I wasn't sure how to fix;
I've given that solution a go and it tested fine

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Calluumm
Copy link
Copy Markdown
Contributor Author

Calluumm commented Feb 5, 2026

Docs should be largely better, templating the existing ones highlighted what gamma was missing

@jalvesz
Copy link
Copy Markdown
Contributor

jalvesz commented Feb 6, 2026

@Calluumm I saw that you put the exampled as embedded text in the documentation. Examples should be runnable standalone program units which are embedded in the markdown for documenting. Could you please make sure to put the examples in their own files within the corresponding folder in example and embed the files in the markdown documentation?

@Calluumm
Copy link
Copy Markdown
Contributor Author

Calluumm commented Feb 6, 2026

Absolutely @jalvesz I had set them up yesterday but hadn't comitted while I worked on descriptions, I think theyre setup correct now

Copy link
Copy Markdown
Member

@sebastian-mutz sebastian-mutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Calluumm. Some more comments on the specs below.


Cumulative distribution function (cdf) of the single real variable gamma distribution:

$$ F(x)= \frac{\gamma (k, \lambda (x-\text{loc}))}{\Gamma (k)},\quad x>\text{loc},\ k>0,\ \lambda>0 $$
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For PDF and CDF, you currently include the loc shift in the equations.

A case can be made for or against that. On one hand, it is one of the arguments. On the other, it's not a standard parameter for the distributions and lengthens equations (perhaps needlessly?) For reference, SciPy lists the equations without loc and I used the same approach with FSML. Whatever is decided on, it should then be kept consistent throughout stdlib stats.

Any opinions on how this should be done for stdlib? @jalvesz

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing ones don't so it's probably best i put this in line with those, but I'll wait to see; I should've cleaned up the other doc fixes in the commit underneath

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the description is a more high-level introduction to the method/mathematics rather than the specific implementation (the numerically more stable implementation using logs also isn't part of the description), I lean more towards skipping the loc parameter here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Calluumm @sebastian-mutz, I skimmed through the docs of matlab https://www.mathworks.com/help/stats/gamma-distribution.html and also on wikipedia https://en.wikipedia.org/wiki/Gamma_distribution and indeed I didn't see immediately references to the shift loc parameter. The clearest doc was indeed the SciPy one cited by @sebastian-mutz

Now, going through the current implementation, loc is a mandatory argument, I didn't see any optional formulation. So, in that case it does deserve to appear clearly in the docs and refer to the common use case.

This opens another door: would making these parameters optional make things even more clear for both the implementer and consumer of the procedure? It seems that SciPy does handle it as optional arguments. For instance, when I see cdf(x, a, loc=0, scale=1) in their doc, I understand that x and a are mandatory but loc and scale are optional with clear defaults.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Calluumm and @jalvesz.

As a non-optional parameter, it does need to be stated clearly what it does, but I'd follow SciPy's approach and leave it out of the equation in the description. This could easily create some confusion.

The loc parameter isn't something you'd typically come across in maths/stats books/resources; not as "loc" anyway. It's a parameter added for computational convenience and consistency across distribution. For example, normal distribution functions usually include loc and scale with their common mathematical notations (e.g., mu and sigma in Matlab's normpdf). Giving it more general names and consistently making them (optional) arguments for different distribution seems to be SciPy's approach and I see the merit int this.

For stdlib, making it optional and falling back to defaults (standardised version with loc=0; scale=1) if not passed is arguably the best path in the long term. It allows users with a maths/stats background to use it more intuitively without having to bother with loc, while it's still there for convenience and those used to working with it (from SciPy, for example). This would mean making the loc and scale parameter optional for other distributions too (for consistency).

Copy link
Copy Markdown
Member

@sebastian-mutz sebastian-mutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Calluumm. Nice work. A few comments below. After that, it looks pretty rounded off to me.

@sebastian-mutz sebastian-mutz self-assigned this Feb 23, 2026
Copy link
Copy Markdown
Member

@sebastian-mutz sebastian-mutz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @Calluumm. Let's give it a few days in case @jalvesz or anyone else has any comments.

@sebastian-mutz sebastian-mutz merged commit 48c3a19 into fortran-lang:master Feb 26, 2026
75 of 77 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants