mosum
¶
Moving Sum Based Procedures for Changes in the Mean
Submodules¶
Package Contents¶
Functions¶
|
Computes the asymptotic critical value for the MOSUM test |
|
Multiscale MOSUM algorithm with bottom-up merging |
|
Multiscale MOSUM algorithm with localised pruning |
|
Default choice for the set of multiple bandwidths |
|
Test data with piecewise constant mean |
|
3D Visualisation of multiscale MOSUM statistics |
- mosum.mosum(x, G, G_right=float('nan'), var_est_method=['mosum', 'mosum_min', 'mosum_max', 'custom'][0], var_custom=None, boundary_extension=True, threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_custom=float('nan'), criterion=['eta', 'epsilon'][0], eta=0.4, epsilon=0.2, do_confint=False, level=0.05, N_reps=1000)¶
MOSUM procedure for multiple change point estimation
Computes the MOSUM detector, detects (multiple) change points and estimates their locations.
- Parameters:
x (list) – input data
G (int) – bandwidth; should be less than ‘len(x)/2’
G_right (int) – if ‘G.right != G}, the asymmetric bandwidth ‘(G, G.right)’ will be used; if ‘max(G, G.right)/min(G, G.right) > 4’, a warning message is generated
var_est_method (how the variance is estimated; possible values are) – ‘mosum’ : both-sided MOSUM variance estimator ‘mosum_min’ : minimum of the sample variance estimates from the left and right summation windows ‘mosum_max’ : maximum of the sample variance estimates from the left and right summation windows ‘custom’ : a vector of ‘len(x)’ is to be parsed by the user; use ‘var.custom’ in this case to do so
var_custom (float) – vector (of the same length as ‘x}) containing local estimates of the variance or long run variance; use iff ‘var.est.method = “custom”’
boundary_extension (bool) – a logical value indicating whether the boundary values should be filled in with CUSUM values
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha`. Alternatively it is possible to parse a user-defined numerical value with ‘threshold.custom’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_custom (float) – value greater than 0 for the threshold of significance; use iff ‘threshold = “custom”’
criterion (Str) – indicates how to determine whether each point ‘k’ at which MOSUM statistic exceeds the threshold is a change point; possible values are ‘eta’ : there is no larger exceeding in an ‘eta*G’ environment of ‘k’ ‘epsilon’ : ‘k’ is the maximum of its local exceeding environment, which has at least size ‘epsilon*G’
eta (float) – a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
epsilon (float) – a numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (iff ‘criterion = “epsilon”’)
do_confint (bool) – flag indicating whether to compute the confidence intervals for change points
level (float) – use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
N_reps (int) – use iff ‘do.confint = True’; number of bootstrap replicates to be generated
- Returns:
mosum_obj object containing
x (list) – input data
G_left, G_right (int) – bandwidths
var_est_method, var_custom, boundary_extension (Str) – input
stat (list) – MOSUM statistics
rollsums (list) – MOSUM detector
var_estimation (list) – local variance estimates
threshold, alpha, threshold_custom – input
threshold_value (float) – threshold of MOSUM test
criterion, eta, epsilon – input
cpts (ndarray) – estimated change point
cpts_info (DataFrame) – information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
do_confint (bool) – input
ci – confidence intervals
Examples
>>> import mosum >>> xx = mosum.testData("blocks")["x"] >>> xx_m = mosum.mosum(xx, G = 50, criterion = "eta", boundary_extension = True) >>> xx_m.summary() >>> xx_m.print()
- mosum.criticalValue(n, G_left, G_right, alpha)¶
Computes the asymptotic critical value for the MOSUM test
- mosum.multiscale_bottomUp(x, G=None, threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_function=None, eta=0.4, do_confint=False, level=0.05, N_reps=1000)¶
Multiscale MOSUM algorithm with bottom-up merging
- Parameters:
x (list) – input data
G (int) –
- vector of bandwidths; given as either integers less than len(x)/2,
or numbers between 0 and 0.5 describing the moving sum bandwidths relative to len(x)
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha`. Alternatively it is possible to parse a user-defined function with ‘threshold_function’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_function (function) –
eta (float) – a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
do_confint (bool) – flag indicating whether to compute the confidence intervals for change points
level (float) – use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
N_reps (int) – use iff ‘do.confint = True’; number of bootstrap replicates to be generated
- Returns:
multiscale_cpts object containing
x (list) – input data
G (int) – bandwidth vector
threshold, alpha, threshold_function, eta – input
cpts (ndarray) – estimated change point
cpts_info (DataFrame) – information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
pooled_cpts (ndarray) – change point candidates
do_confint (bool) – input
ci – confidence intervals
Examples
>>> import mosum >>> xx = mosum.testData("blocks")["x"] >>> xx_m = mosum.multiscale_bottomUp(xx, G = [50,100]) >>> xx_m.summary() >>> xx_m.print()
- mosum.multiscale_localPrune(x, G=None, max_unbalance=4, threshold='critical_value', alpha=0.1, threshold_function=None, criterion='eta', eta=0.4, epsilon=0.2, rule='pval', penalty='log', pen_exp=1.01, do_confint=False, level=0.05, N_reps=1000)¶
Multiscale MOSUM algorithm with localised pruning
- xlist
input data
- Gint
- vector of bandwidths; given as either integers less than len(x)/2,
or numbers between 0 and 0.5 describing the moving sum bandwidths relative to len(x)
- max_unbalancefloat
a numeric value for the maximal ratio between maximal and minimal bandwidths to be used for candidate generation, at least 1
- thresholdStr
indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha`. Alternatively it is possible to parse a user-defined function with ‘threshold_function’.
- alphafloat
numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
- threshold_functionfunction
criterion : Str
indicates how to determine whether each point ‘k’ at which MOSUM statistic exceeds the threshold is a change point; possible values are ‘eta’ : there is no larger exceeding in an ‘eta*G’ environment of ‘k’ ‘epsilon’ : ‘k’ is the maximum of its local exceeding environment, which has at least size ‘epsilon*G’
- etafloat
a positive numeric value for the minimal mutual distance of changes, relative to moving sum bandwidth (iff ‘criterion = “eta”’)
- epsilonfloat
a numeric value in (0,1] for the minimal size of exceeding environments, relative to moving sum bandwidth (iff ‘criterion = “epsilon”’)
- ruleStr
Choice of sorting criterion for change point candidates in merging step. Possible values are: ‘pval’ : smallest p-value ‘jump’ : largest (rescaled) jump size
- penaltyStr
Type of penalty term to be used in Schwarz criterion; possible values are: ‘log’ : use ‘penalty = log(len(x))**pen_exp’ ‘polynomial’ : use ‘penalty = len(x)**pen_exp’
- pen_expfloat
penalty exponent
- do_confintbool
flag indicating whether to compute the confidence intervals for change points
- levelfloat
use iff ‘do_confint = True’; a numeric value (‘0 <= level <= 1’) with which ‘100(1-level)%’ confidence interval is generated
- N_repsint
use iff ‘do.confint = True’; number of bootstrap replicates to be generated
multiscale_cpts object containing x : list
input data
- Gint
bandwidth vector
- threshold, alpha, threshold_function, eta
input
- cptsndarray
estimated change point
- cpts_infoDataFrame
information on change points, including detection bandwidths, asymptotic p-values, scaled jump sizes
- pooled_cptsndarray
change point candidates
- do_confintbool
input
- ci
confidence intervals
>>> import mosum >>> xx = mosum.testData("mix")["x"] >>> xx_m = mosum.multiscale_localPrune(xx, G = [8,15,30,70]) >>> xx_m.summary() >>> xx_m.print()
- mosum.bandwidths_default(n, d_min=10, G_min=10, G_max=None) int ¶
Default choice for the set of multiple bandwidths
- mosum.testData(model=['custom', 'blocks', 'fms', 'mix', 'stairs10', 'teeth10'][1], lengths=None, means=None, sds=None, rand_gen=np.random.normal, seed=None, rand_gen_args=[0, 1])¶
Test data with piecewise constant mean
Generate piecewise stationary time series with independent innovations and change points in the mean.
- Parameters:
model (str) – custom or pre-defined signal
lengths (int) – vector of segment lengths (custom only)
means (int) – vector of segment means (custom only)
sds (int) – vector of segment standard deviations (custom only)
rand_gen (function) – innovation function
seed (int) – random seed
rand_gen_args (ndarray) – arguments for rand_gen
- Returns:
x (ndarray) – simulated data series
mu (ndarray) – signal
sigma (float) – standard deviation
cpts (ndarray) – true change points
Examples
>>> mosum.testData() >>> mosum.testData("custom", lengths = [100,100], means=[0,1], sds= [1,1])
- mosum.persp3D_multiscaleMosum(x, mosum_args=dict(), threshold=['critical_value', 'custom'][0], alpha=0.1, threshold_function=None, palette=cm.coolwarm, xlab='G', ylab='time', zlab='MOSUM')¶
3D Visualisation of multiscale MOSUM statistics
- Parameters:
x (list) – input data
mosum_args (dict) – dictionary of keyword arguments to mosum
threshold (Str) – indicates which threshold should be used to determine significance. By default, it is chosen from the asymptotic distribution at the given significance level ‘alpha`. Alternatively it is possible to parse a user-defined function with ‘threshold_function’.
alpha (float) – numeric value for the significance level with ‘0 <= alpha <= 1’; use iff ‘threshold = “critical_value”’
threshold_function (function) –
palette (matplotlib.colors.LinearSegmentedColormap) – colour palette for plotting, accessible from matplotlib.cm
xlab (Str) – axis labels for plot
ylab (Str) – axis labels for plot
zlab (Str) – axis labels for plot
Examples
>>> import mosum >>> xx = mosum.testData("blocks")["x"] >>> mosum.persp3D_multiscaleMosum(xx)