Truncnormal by benjamin-lieser · Pull Request #32 · rust-random/rand_distr

benjamin-lieser · 2025-10-27T15:54:05Z

Added a CHANGELOG.md entry

Summary

Added a TruncatedNormal distribution

Motivation

#7

Details

The implementation follows Robert, Christian P. (1995), "Simulation of truncated normal variables".

The test still needs to be improved. The code seems to be working, but it is still a draft.

benjamin-lieser · 2025-10-28T12:09:00Z

The two-sided algorithm can fail (very low acceptance probability) when the range is wide but far away from the mean.

One method is still missing: it uses the one-sided algorithm and applies rejection sampling to get to the two-sided case. (I will implement this.)

The problem is how to efficiently determine which of the 4 methods to use:

Naive rejection sampling
One sided
Two sided with the uniform proposal
Two sided with rejection sampling on the one sided case
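
To make the failure mode of the uniform proposal concrete, here is a rough sketch that estimates its acceptance probability by numerical integration. It assumes standardized bounds 0 < a < b and the acceptance weight exp((a² − z²)/2) from Robert (1995); the function name and the midpoint rule are illustrative choices, not code from this PR.

```rust
/// Acceptance probability of the uniform proposal on [a, b], assuming
/// 0 < a < b in standardized units. A proposal z ~ U[a, b] is accepted with
/// probability exp((a^2 - z^2) / 2), so the overall rate is the average of
/// that weight over the interval (approximated here with the midpoint rule).
fn uniform_proposal_acceptance(a: f64, b: f64) -> f64 {
    assert!(0.0 < a && a < b && b.is_finite());
    let steps = 1_000_000usize;
    let h = (b - a) / steps as f64;
    let mut sum = 0.0;
    for i in 0..steps {
        // Midpoint of the i-th sub-interval.
        let z = a + (i as f64 + 0.5) * h;
        sum += ((a * a - z * z) / 2.0).exp();
    }
    // Integral of the weight divided by the interval length.
    sum * h / (b - a)
}
```

For a narrow interval such as [5, 5.1] most proposals are accepted, while for a wide interval far from the mean such as [5, 50] fewer than 1% are, which is the case where the one-sided sampler plus rejection should do better.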

The paper gives useful acceptance probabilities, but they are all quite expensive to evaluate. This does not matter if a lot of samples are drawn, but the setup cost could dominate if only a few are used.
Maybe we should add a tuning option to spend more upfront to get better sampling later.

I think we could also use the inverse CDF approach implemented by @Caellian with some numerical optimizations. This has the advantage that it does not use rejection sampling and might vectorize well.

Caellian · 2025-10-29T13:45:48Z

In the paper you're referencing (Robert, Christian P. (1995)) and later in Chopin 2012, a cutoff is computed analytically as 0.47 σ (or 0.477 σ) for finite intervals and 0.5 σ for semi-finite ones, so if a ≥ 0.477 σ, switch to an exponential-tail sampler (Robert/Marsaglia/Chopin).

So you can compare both bounds in the constructor to figure out where they fall, and then just use whichever algorithm is most efficient for that specific range. Branching is unavoidable: it will always have to be either a vtable lookup or a jump (preferable). Rust is more likely to optimize away a simple jump if the range is constant, and CPU branch prediction should remove this cost in most cases.

let a = (start - mean) / std;
let b = (end - mean) / std;
match () {
    // Extremely narrow interval: treat as degenerate
    _ if (b - a).abs() < 1e-6 => 0.5 * (a + b),

    // Narrow interval: inverse CDF works best here, with f64.
    // Use log-space for extreme a, b (e.g., > 8 sigma or < -8 sigma)
    _ if (b - a) < 1e-3 => sample_inverse_cdf(a, b),

    // Both bounds above the mean (left cutoff above mean)
    _ if a >= 0.47 => sample_exponential_tail(a, b),
    // Both bounds below the mean (right cutoff below mean):
    // symmetric, so flip and reuse the upper tail sampler
    _ if b <= -0.47 => -sample_exponential_tail(-b, -a),

    // Straddling zero (typical central case):
    // the standard rejection sampler works efficiently
    _ if a < 0.47 && b > -0.47 => sample_rejection(a, b),

    // Asymmetric truncation (one tail + near-zero cutoff):
    // mixed region with a tail on one side and a cutoff on the other;
    // use exponential or hybrid rejection
    _ if a >= 0.0 => sample_rejection(a, b),
    _ if b <= 0.0 => -sample_rejection(-b, -a),
    _ => panic!("Invalid truncation bounds or NaN"),
}
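
As a minimal sketch of the "decide once in the constructor" idea, the choice can be stored as a small enum so that sampling only has to branch on a tag. The `Strategy` enum and `choose_strategy` function are hypothetical names, and the thresholds are simply the ones from the snippet above, not tuned values from this PR.

```rust
/// Hypothetical strategy tag, chosen once when the distribution is built.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    Degenerate,
    InverseCdf,
    UpperTail,
    LowerTail,
    Rejection,
}

/// Pick a strategy from standardized bounds a < b.
/// Returns `None` for invalid or NaN bounds instead of panicking.
fn choose_strategy(a: f64, b: f64) -> Option<Strategy> {
    // `!(a < b)` is also true when either bound is NaN.
    if !(a < b) {
        return None;
    }
    let width = b - a;
    let strategy = if width < 1e-6 {
        Strategy::Degenerate
    } else if width < 1e-3 {
        Strategy::InverseCdf
    } else if a >= 0.47 {
        Strategy::UpperTail
    } else if b <= -0.47 {
        Strategy::LowerTail
    } else {
        Strategy::Rejection
    };
    Some(strategy)
}
```

A sampler would then `match` on the stored tag, which is a plain jump rather than a vtable lookup.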

benjamin-lieser · 2025-10-31T10:15:02Z

Thanks, this already looks like a good approach :)
I don't think this works well in all cases, though. I guess by sample_exponential_tail you mean sampling from the one-sided distribution and then doing rejection sampling. Depending on the bounds, this has poor acceptance rates and the uniform proposal would be better (it is also significantly faster per proposal). The narrow range would probably also be better served by the uniform proposal.
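
For reference, the uniform proposal mentioned here can be sketched as follows. This is an illustration in standardized units based on the description in Robert (1995), not the code in this PR; the closure-based `uniform` argument and the small xorshift helper are stand-ins for a real `rand` RNG.

```rust
/// Tiny xorshift64* step returning a uniform value in [0, 1).
/// Stand-in for a real RNG; `state` must be nonzero.
fn next_uniform(state: &mut u64) -> f64 {
    *state ^= *state >> 12;
    *state ^= *state << 25;
    *state ^= *state >> 27;
    let bits = state.wrapping_mul(0x2545_F491_4F6C_DD1D);
    (bits >> 11) as f64 / (1u64 << 53) as f64
}

/// Two-sided sampler with a uniform proposal on [a, b] (finite bounds).
/// `uniform` must return values in [0, 1).
fn sample_uniform_proposal(a: f64, b: f64, mut uniform: impl FnMut() -> f64) -> f64 {
    assert!(a < b && a.is_finite() && b.is_finite());
    // Point of [a, b] closest to 0, where the normal density is largest.
    let peak = if a > 0.0 {
        a
    } else if b < 0.0 {
        b
    } else {
        0.0
    };
    loop {
        let z = a + (b - a) * uniform();
        // Ratio of the density at z to the density at the peak (at most 1).
        let accept_prob = ((peak * peak - z * z) / 2.0).exp();
        if uniform() <= accept_prob {
            return z;
        }
    }
}
```

Each proposal costs two uniform draws and one `exp`, which is why it is cheap per proposal; its weakness is only the acceptance rate on wide intervals.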

benjamin-lieser · 2025-10-31T10:19:58Z

a = 0.45, b = inf uses naive rejection sampling and has an acceptance rate of 32%, which seems like it could be improved.
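
As a rough check of these numbers, the sketch below computes the acceptance rate of naive rejection and, for comparison, that of a translated exponential proposal with rate α* = (a + √(a² + 4))/2 as in Robert (1995). The closed form used for the exponential case, √(2π)·α·P(Z > a)·exp(αa − α²/2), is my own derivation from the acceptance weight, and all function names are made up for illustration.

```rust
use std::f64::consts::PI;

/// Standard normal density.
fn phi(x: f64) -> f64 {
    (-0.5 * x * x).exp() / (2.0 * PI).sqrt()
}

/// Upper tail probability P(Z > a) for finite a >= 0, computed as
/// 0.5 minus a Simpson's rule integral of the density over [0, a].
fn upper_tail(a: f64) -> f64 {
    assert!(a >= 0.0 && a.is_finite());
    let n = 1000; // number of Simpson intervals (must be even)
    let h = a / n as f64;
    let mut sum = phi(0.0) + phi(a);
    for i in 1..n {
        let weight = if i % 2 == 1 { 4.0 } else { 2.0 };
        sum += weight * phi(i as f64 * h);
    }
    0.5 - sum * h / 3.0
}

/// Naive rejection: draw Z ~ N(0, 1) and keep it if Z > a.
fn naive_acceptance(a: f64) -> f64 {
    upper_tail(a)
}

/// Translated exponential proposal with the rate alpha* described above.
fn exponential_acceptance(a: f64) -> f64 {
    let alpha = (a + (a * a + 4.0).sqrt()) / 2.0;
    (2.0 * PI).sqrt() * alpha * upper_tail(a) * (alpha * a - 0.5 * alpha * alpha).exp()
}
```

Under these assumptions, a = 0.45 gives roughly 0.33 for naive rejection and roughly 0.82 for the exponential proposal, so there does seem to be room for improvement.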

Caellian · 2025-10-31T21:21:57Z

> a = 0.45, b = inf uses naive rejection sampling and has an acceptance rate of 32%, which seems like it could be improved.

It should use rejection sampling only if the range includes [-0.47 σ, 0] or [0, 0.47 σ]; [-0.47 σ, 0.47 σ] is where the density is highest. For [0.45 σ, ∞) use a tail algorithm (Robert (lemma 2.2)/Marsaglia/Chopin); this is what I mean by sample_exponential_tail.

Wikipedia says Marsaglia is faster on average than Robert even though it has a higher rejection rate, because it does not require the costly numerical evaluation of the exponential function.

/// Marsaglia
fn sample_exponential_tail<R: Rng + ?Sized>(rng: &mut R, a: f64, b: f64) -> f64 {
    assert!(a > 0.477 && a < b); // this is here only for example purposes, remove it

    // NOTE: the caller reverses a & b if b < 0.0, so the same function is
    // called and only the returned value is negated
    // NOTE: if the range intersects 0, use the current sampler impl with rejection

    loop {
        // uniform f64 in [0, 1), clamped away from 0 so that ln() stays finite
        let u1: f64 = rng.random::<f64>().max(1e-16);
        let x = (a * a - 2.0 * u1.ln()).sqrt();
        if x > b {
            // reject if beyond the upper bound; this is never true if b is
            // infinity, so assume the check is optimized away by the compiler
            // or hidden by branch prediction
            continue;
        }
        let u2: f64 = rng.random::<f64>(); // second uniform draw for the acceptance test
        if u2 < a / x {
            return x;
        }
    }
}
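
For comparison with the Marsaglia variant above, here is a sketch of the one-sided sampler with a translated exponential proposal as described in Robert (1995). It shows where the extra `exp` call per proposal comes from. The `SplitMix64` generator is only a dependency-free stand-in for a `rand` RNG, and `sample_tail_robert` is a made-up name.

```rust
/// Minimal SplitMix64 generator, used here only to avoid external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    /// Uniform f64 in [0, 1) built from the top 53 bits of the output.
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        (z >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Sample a standard normal conditioned on being >= a (upper bound infinite).
fn sample_tail_robert(rng: &mut SplitMix64, a: f64) -> f64 {
    // Rate of the exponential proposal that maximizes the acceptance rate.
    let alpha = (a + (a * a + 4.0).sqrt()) / 2.0;
    loop {
        // 1 - u lies in (0, 1], so the logarithm is finite.
        let z = a - (1.0 - rng.next_f64()).ln() / alpha;
        // Acceptance weight; this is the exp() that Marsaglia's variant avoids.
        let rho = (-0.5 * (z - alpha) * (z - alpha)).exp();
        if rng.next_f64() <= rho {
            return z;
        }
    }
}
```

A finite upper bound b could be handled the same way as in the snippet above, by rejecting any z > b before the acceptance test.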


benjamin-lieser · 2026-03-13T17:44:15Z

I have updated the code; it should now sample in all cases with reasonable performance.

If we want to give value stability guarantees, we need to decide on a strategy for when to use which method. Changing this later will probably change the generated values and so break those guarantees.

mstoeckl · 2026-03-16T00:47:27Z

It may take me some time to review this in detail, but I can give some initial comments:

This implementation may work just as well for general F satisfying F: Float, StandardNormal: Distribution<F> + Exp: Distribution<F>, as I don't see any logic requiring f32 or f64 in particular.
The sampling logic looks fundamentally OK for "reasonable" parameter values that do not trigger underflow or overflow. That being said, there are a few edge-case parameters on which the implementation can panic. (To find them, I've added NormalTruncated to the fuzzer in https://github.com/mstoeckl/rand_distr/tree/fuzz-params.) There may be more issues of this type (for example, if mean is NaN...). While handling extreme values may not be needed for typical use, these can crop up as inputs if users provide parameters resulting from a calculation. Would you like to resolve them all?

mstoeckl · 2026-03-16T00:49:30Z

src/normal_truncated.rs

+    pub fn new(mean: f64, stddev: f64, lower: f64, upper: f64) -> Result<Self, Error> {
+        if !(stddev > 0.0) {
+            return Err(Error::InvalidStdDev);
+        }
+        if !(lower < upper) {
+            return Err(Error::InvalidBounds);
+        }
+
+        let std_lower = (lower - mean) / stddev;
+        let std_upper = (upper - mean) / stddev;


The specific first issue that I noticed causing panics when fuzzing is that even if lower < upper, std_lower may equal std_upper, making the sampling in NormalTruncatedTwoSided fail.

(I'm not certain how best to resolve this. Perhaps make NormalTruncatedTwoSided::sample sample based on the original lower..upper range?)
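
A small sketch of how this collapse can happen: with a mean far outside the range, both subtractions round to the same value. The `standardize` helper mirrors the two lines from the diff; `bounds_collapsed` is a hypothetical guard, not something the PR contains.

```rust
/// Standardize the bounds the same way as the constructor in the diff.
fn standardize(mean: f64, stddev: f64, lower: f64, upper: f64) -> (f64, f64) {
    ((lower - mean) / stddev, (upper - mean) / stddev)
}

/// Hypothetical guard: true when `lower < upper` holds for the inputs but
/// rounding has made the standardized bounds equal (or otherwise unordered).
fn bounds_collapsed(mean: f64, stddev: f64, lower: f64, upper: f64) -> bool {
    let (std_lower, std_upper) = standardize(mean, stddev, lower, upper);
    lower < upper && !(std_lower < std_upper)
}
```

For example, with mean = 1e17, stddev = 1, lower = 0 and upper = 1, neighbouring f64 values near 1e17 are 16 apart, so `upper - mean` rounds to the same value as `lower - mean` and the guard fires.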

mstoeckl · 2026-03-16T10:25:11Z

src/normal_truncated.rs

+    pub fn new(mean: f64, stddev: f64, lower: f64, upper: f64) -> Result<Self, Error> {
+        if !(stddev > 0.0) {
+            return Err(Error::InvalidStdDev);
+        }


It may slightly improve API consistency to have NormalTruncated generalize Normal and allow stddev = 0.0 iff mean is in [lower, upper). The Wikipedia page is also careful to define the real-valued truncated normal distribution in a way that allows this.
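
One possible shape for such a check is sketched below. The `ParamError` and `Checked` types and the `MeanOutsideBounds` variant are hypothetical, and the half-open interval [lower, upper) is taken directly from the comment above.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ParamError {
    InvalidStdDev,
    InvalidBounds,
    MeanOutsideBounds,
}

/// Hypothetical result of validation.
#[derive(Debug, PartialEq)]
enum Checked {
    /// stddev == 0: every sample equals the mean.
    PointMass(f64),
    /// stddev > 0: the usual truncated normal.
    Proper,
}

fn check_params(mean: f64, stddev: f64, lower: f64, upper: f64) -> Result<Checked, ParamError> {
    // Negated comparisons so that NaN inputs are rejected as well.
    if !(stddev >= 0.0) {
        return Err(ParamError::InvalidStdDev);
    }
    if !(lower < upper) {
        return Err(ParamError::InvalidBounds);
    }
    if stddev == 0.0 {
        // A zero-width normal is only valid if its single point is in range.
        return if lower <= mean && mean < upper {
            Ok(Checked::PointMass(mean))
        } else {
            Err(ParamError::MeanOutsideBounds)
        };
    }
    Ok(Checked::Proper)
}
```

With this layout the degenerate case never reaches the standardization step, so it also avoids dividing by a zero stddev.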

benjamin-lieser added 10 commits September 15, 2025 09:33

first draft

ee06809

second draft

fd06f66

one sided case

f0b29b3

two sided case

a8a4c61

first prototype

89dd17d

added ks test

7378ba2

fmt and num_traits Float

a2afcec

clippy

3a5f549

typo

c4f74e1

fmt

e2e2f3d

benjamin-lieser added 4 commits March 13, 2026 18:37

added two sided tail sampling. All 4 cases are tested in KS test and …

4e628ee

…the runtime of all of them is good

Merge remote-tracking branch 'origin/master' into truncnormal

0d88e5b

rand v10 changes

12744af

fmt

0221c45

benjamin-lieser added 2 commits March 13, 2026 18:49

fix tests and doc

1742ba8

fmt

4830819

benjamin-lieser marked this pull request as ready for review March 13, 2026 17:54

dhardy assigned mstoeckl Mar 14, 2026

mstoeckl reviewed Mar 16, 2026

benjamin-lieser wants to merge 16 commits into master from truncnormal


3 participants
