@@ -79,7 +79,7 @@ or sample.
7979:func: `median ` Median (middle value) of data.
8080:func: `median_low ` Low median of data.
8181:func: `median_high ` High median of data.
82- :func: `median_grouped ` Median, or 50th percentile, of grouped data.
82+ :func: `median_grouped ` Median (50th percentile) of grouped data.
8383:func: `mode ` Single mode (most common value) of discrete or nominal data.
8484:func: `multimode ` List of modes (most common values) of discrete or nominal data.
8585:func: `quantiles ` Divide data into intervals with equal probability.
@@ -329,55 +329,56 @@ However, for reading convenience, most of the examples show sorted sequences.
329329 be an actual data point rather than interpolated.
330330
331331
332- .. function :: median_grouped(data, interval=1)
332+ .. function :: median_grouped(data, interval=1.0 )
333333
334- Return the median of grouped continuous data, calculated as the 50th
335- percentile, using interpolation. If * data * is empty, :exc: ` StatisticsError `
336- is raised. * data * can be a sequence or iterable .
334+ Estimates the median for numeric data that has been ` grouped or binned
335+ <https://en.wikipedia.org/wiki/Data_binning> `_ around the midpoints
336+ of consecutive, fixed-width intervals .
337337
338- .. doctest ::
338+ The *data * can be any iterable of numeric data with each value being
339+ exactly the midpoint of a bin. At least one value must be present.
339340
340- >>> median_grouped([52 , 52 , 53 , 54 ])
341- 52.5
341+ The *interval * is the width of each bin.
342342
343- In the following example, the data are rounded, so that each value represents
344- the midpoint of data classes, e.g. 1 is the midpoint of the class 0.5--1.5, 2
345- is the midpoint of 1.5--2.5, 3 is the midpoint of 2.5--3.5, etc. With the data
346- given, the middle value falls somewhere in the class 3.5--4.5, and
347- interpolation is used to estimate it:
343+ For example, demographic information may have been summarized into
344+ consecutive ten-year age groups, with each group represented by the
345+ midpoint of its interval (25 for ages 20 to 30, 35 for ages 30 to 40, and so on):
348346
349347 .. doctest ::
350348
351- >>> median_grouped([1 , 2 , 2 , 3 , 4 , 4 , 4 , 4 , 4 , 5 ])
352- 3.7
353-
354- Optional argument *interval * represents the class interval, and defaults
355- to 1. Changing the class interval naturally will change the interpolation:
349+ >>> from collections import Counter
350+ >>> demographics = Counter({
351+ ... 25 : 172 , # 20 to 30 years old
352+ ... 35 : 484 , # 30 to 40 years old
353+ ... 45 : 387 , # 40 to 50 years old
354+ ... 55 : 22 , # 50 to 60 years old
355+ ... 65 : 6 , # 60 to 70 years old
356+ ... })
357+ ...
358+
359+ The 50th percentile (median) is the 536th person out of the
360+ 1071-member cohort. That person is in the 30-to-40-year-old age group.
361+
362+ The regular :func: `median ` function would assume that everyone in the
363+ tricenarian age group was exactly 35 years old. A more tenable
364+ assumption is that the 484 members of that age group are evenly
365+ distributed between 30 and 40. For that, we use
366+ :func: `median_grouped `:
356367
357368 .. doctest ::
358369
359- >>> median_grouped([1 , 3 , 3 , 5 , 7 ], interval = 1 )
360- 3.25
361- >>> median_grouped([1 , 3 , 3 , 5 , 7 ], interval = 2 )
362- 3.5
363-
364- This function does not check whether the data points are at least
365- *interval * apart.
366-
367- .. impl-detail ::
368-
369- Under some circumstances, :func: `median_grouped ` may coerce data points to
370- floats. This behaviour is likely to change in the future.
371-
372- .. seealso ::
370+ >>> data = list (demographics.elements())
371+ >>> median(data)
372+ 35
373+ >>> round (median_grouped(data, interval = 10 ), 1 )
374+ 37.5
373375
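The interpolation described above can be reproduced by hand. The following is a minimal sketch, not the library's implementation: `grouped_median_sketch` is a hypothetical helper name, and it assumes the input is a mapping of bin midpoints to counts with every bin having the same width.

```python
from collections import Counter
from statistics import median_grouped


def grouped_median_sketch(counts, interval):
    """Interpolate the median from {midpoint: count} bins of equal width.

    Hypothetical helper for illustration; assumes members of the median
    bin are spread evenly across that bin.
    """
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no data points")
    cumulative = 0  # number of data points in bins below the current one
    for midpoint in sorted(counts):
        freq = counts[midpoint]
        # The median bin is the one holding the element at sorted index n // 2.
        if cumulative + freq > n // 2:
            lower = midpoint - interval / 2  # lower edge of the median bin
            return lower + interval * (n / 2 - cumulative) / freq
        cumulative += freq


demographics = Counter({25: 172, 35: 484, 45: 387, 55: 22, 65: 6})
estimate = grouped_median_sketch(demographics, 10)
print(round(estimate, 1))  # 37.5
```

Here the median bin starts at 30, 172 people fall below it, and half the cohort is 535.5, so the estimate is 30 + 10 * (535.5 - 172) / 484, which rounds to 37.5.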
374- * "Statistics for the Behavioral Sciences", Frederick J Gravetter and
375- Larry B Wallnau (8th Edition).
376+ The caller is responsible for making sure the data points are separated
377+ by exact multiples of *interval *. This is essential for getting a
378+ correct result. The function does not check this precondition.
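Because the function does not verify this precondition, a caller may want to check it first. The sketch below shows one possible check; `is_on_bin_grid` is a hypothetical helper, not part of the `statistics` module, and the tolerance values are assumptions chosen for illustration.

```python
import math


def is_on_bin_grid(data, interval, *, tol=1e-9):
    """Return True if every value lies a whole number of intervals
    away from the smallest value (hypothetical helper)."""
    values = sorted(set(data))
    if not values:
        return True  # nothing to check
    base = values[0]
    for value in values[1:]:
        steps = (value - base) / interval
        # Allow a small tolerance so float midpoints such as 2.5 still pass.
        if not math.isclose(steps, round(steps), rel_tol=tol, abs_tol=tol):
            return False
    return True


print(is_on_bin_grid([25, 35, 35, 45, 65], 10))  # True
print(is_on_bin_grid([25, 32, 45], 10))          # False
```

A check like this only confirms that the spacing is consistent with *interval*; it cannot tell whether the values really are bin midpoints.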
376379
377- * The `SSMEDIAN
378- <https://help.gnome.org/users/gnumeric/stable/gnumeric.html#gnumeric-function-SSMEDIAN> `_
379- function in the Gnome Gnumeric spreadsheet, including `this discussion
380- <https://mail.gnome.org/archives/gnumeric-list/2011-April/msg00018.html> `_.
380+ Inputs may be any numeric type that can be coerced to a float during
381+ the interpolation step.
381382
382383
383384.. function :: mode(data)