Take as an example a study by Scott Ross and Robert Horner (Ross & Horner, 2009)[2]. They were interested in how a school-wide bullying prevention program affects the bullying behaviour of particular problem students. At each of three different schools, the researchers studied two students who had regularly engaged in bullying. During the baseline phase, they observed the students for 10-minute periods each day during the lunch break and counted the number of aggressive behaviours they exhibited towards their peers. (The researchers used handheld computers to record the data.) After 2 weeks, they implemented the program at the first school.

After another 2 weeks, they implemented it at the second school, and after 2 more weeks, at the third school. They found that each student's number of aggressive behaviours decreased shortly after the program was implemented at his or her school. Note that if the researchers had studied only one school, or had introduced the treatment at all three schools at the same time, it would not be clear whether the reduction in aggressive behaviour was due to the bullying program or to something else introduced at about the same time (e.g., a vacation, a television program, a change in the weather). But with the multiple baseline design, this kind of coincidence would have to happen three times – a very unlikely event – to explain the results. With the exception of Natesan and Hedges (2017), all quantitative methods for SCEDs assume that each observed data point truly belongs to the phase to which it is assigned. However, this is not always the case, especially when latency is expected. Latency occurs when a treatment takes time to take effect after it is administered, or takes time to stop working after it is withdrawn. Although latency is not desirable in SCEDs, in some cases it can be expected given the nature of the treatment and outcome variables.
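To make the logic of the staggered introduction concrete, the following minimal Python sketch simulates count data from a hypothetical multiple baseline design across three schools. The school labels, rates, and change days are illustrative assumptions, not values from Ross and Horner (2009).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical multiple baseline: daily counts of aggressive behaviours
# (10-minute lunch observations) at three schools, with the intervention
# introduced 2 weeks (14 school days) apart at each school.
n_days = 60
start_days = {"School A": 14, "School B": 28, "School C": 42}
baseline_rate, treatment_rate = 6.0, 1.5   # mean behaviours per observation

data = {}
for school, start in start_days.items():
    rates = np.where(np.arange(n_days) < start, baseline_rate, treatment_rate)
    data[school] = rng.poisson(rates)

for school, counts in data.items():
    pre = counts[:start_days[school]].mean()
    post = counts[start_days[school]:].mean()
    print(f"{school}: baseline mean {pre:.1f}, treatment mean {post:.1f}")
```

Because the drop in each simulated series coincides with that school's own start day, an alternative explanation would have to produce three separate, correctly timed drops, which is the design's source of internal validity.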

For example, a child diagnosed with autism may not respond immediately to a specific treatment, or it may take some time for a drug to be completely eliminated from the body. In such cases, gradual and/or delayed effects should be identified or considered with appropriate analytical strategies so that they can be disentangled from long-term treatment effects (Duan, Kravitz & Schmid, 2013). Our method can be used to assess immediacy, one aspect of quantifying SCED outcomes, through transparent, objective and repeatable procedures. The multiple baseline design was first reported in 1960 in basic operant research. It was applied to human experiments in the late 1960s in response to the practical and ethical problems that arose when apparently effective treatments were withdrawn from human subjects.[10] It involves plotting two or more (often three) behaviours, people, or settings in a staggered graph, where a change is made to one but not the other two, then to the second but not the third. The differential changes that occur in each behaviour, person, or setting help to strengthen what is essentially an AB design against its problematic competing hypotheses.[citation needed] There are repeated measurements or observations (as in a longitudinal group design). This property means that a limited number of measures are carefully selected so that they are sufficiently sensitive to relevant change while remaining robust to distortion from repeated use. These measures may, but need not, be questionnaires standardized on large groups; they may also be measures developed for the individual case, taking its particular circumstances into account.
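Returning to the latency issue raised above: one simple way to represent a delayed, gradual effect is to let the expected outcome level decay exponentially from the baseline level toward a treatment asymptote after the change point. The function below is a minimal illustrative sketch; the parameter names and values are assumptions, not part of any cited model.

```python
import numpy as np

def expected_level(t, t0, baseline, asymptote, tau):
    """Mean outcome with a delayed, gradual treatment effect.

    Before the change point t0 the mean stays at `baseline`; afterwards it
    decays exponentially toward `asymptote` with latency scale `tau`
    (larger tau = slower-acting treatment).
    """
    t = np.asarray(t, dtype=float)
    ramp = asymptote + (baseline - asymptote) * np.exp(-(t - t0) / tau)
    return np.where(t < t0, baseline, ramp)

days = np.arange(40)
print(expected_level(days, t0=14, baseline=6.0, asymptote=1.5, tau=5.0).round(2))
```

With tau near zero the effect is effectively immediate; large values of tau produce the slow-acting pattern described above, which an abrupt-change analysis would misattribute across phases.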

In addition to focusing on individual participants, single-subject research differs from group research in the way data are typically analyzed. As we have seen throughout the book, group research involves combining data across participants. Group data are described using statistics such as means, standard deviations, and Pearson's r to identify general trends, and inferential statistics are then used to decide whether the sample result is likely to generalize to the population. In contrast, single-subject research relies heavily on a very different approach, called visual inspection. This means plotting each participant's data as described in this chapter, carefully examining those data, and judging whether and to what extent the independent variable influenced the dependent variable. Inferential statistics are generally not used. Eighteen subjects participated in a study (see [9]) investigating the effects of loperamide on anorectal function in healthy men.
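As an illustration of visual inspection, the following sketch plots hypothetical AB data for a single participant with the phase change marked; no inferential statistics are computed. All values are simulated for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)

# Hypothetical AB data for one participant: 14 baseline days, 26 treatment days.
change_day = 14
counts = np.concatenate([rng.poisson(6.0, change_day), rng.poisson(1.5, 26)])

fig, ax = plt.subplots(figsize=(7, 3))
ax.plot(np.arange(len(counts)), counts, "o-", color="black")
ax.axvline(change_day - 0.5, linestyle="--", color="gray")  # phase boundary
ax.text(change_day, counts.max(), "intervention begins", ha="left")
ax.set_xlabel("Observation day")
ax.set_ylabel("Aggressive behaviours")
ax.set_title("Participant 1")
plt.tight_layout()
plt.show()
```

The analyst then judges level, trend, variability, and immediacy of change across the phase boundary by eye, rather than by a significance test.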

Anorectal manometry was performed on these subjects on two different days at least seven days apart. A rectal balloon was positioned at three different locations (5, 10 or 15 cm from the anal verge), with the order of the three locations randomized. On one day, subjects were given a 10 mg dose of loperamide; on the other day, a placebo, with the order of the two conditions counterbalanced. The subjects were blind to the drug and to the location of the balloon, and the researcher who performed the manometry was blind to the drug. Using analysis of variance, a "significant" main effect of balloon position was found for a first dependent variable. For a second dependent variable, there was also a main effect of balloon position and, in addition, a "significant" interaction between location and drug. No "significant" results were found for a range of other statistical tests. Various single-case designs have evolved to allow clinical researchers to rule out the various threats to internal validity described above. Typically, these designs involve complex comparisons of different intervention phases and staggered baselines across behaviours, settings, and participants. The logic of many of these designs also stems from applied behaviour analysis and the application of basic learning-theory assumptions.
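A design like this can be analyzed as a two-way repeated-measures ANOVA, for example with statsmodels' AnovaRM. The sketch below uses simulated data with the same 18-subject, position-by-drug structure; the variable names, effect sizes, and outcome are illustrative assumptions, not the original study's data or analysis.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(9)

# Hypothetical long-format data with the study's structure: 18 subjects,
# balloon position (5, 10, 15 cm) fully crossed with drug (loperamide, placebo).
position_effect = {"5cm": 0.0, "10cm": 8.0, "15cm": 16.0}  # illustrative only
rows = []
for subject in range(1, 19):
    for pos, eff in position_effect.items():
        for drug in ("loperamide", "placebo"):
            rows.append({"subject": subject, "position": pos, "drug": drug,
                         "y": 50.0 + eff + rng.normal(0.0, 5.0)})
df = pd.DataFrame(rows)

# F tests for the position and drug main effects and their interaction
res = AnovaRM(df, depvar="y", subject="subject",
              within=["position", "drug"]).fit()
print(res)
```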

It is therefore questionable to what extent this myriad of complex designs is relevant to psychotherapeutic applications, and we have accordingly avoided describing these designs in detail. Several of the references cited in Further Reading describe the different designs, their rationales, and their uses. Instead, we focus on the most influential design feature, which concerns issues of replication. Researchers have typically used some form of single-case experimental design methodology to assess the effects of the Good Behavior Game (GBG). Single-case designs derive their power to rule out alternative explanations of treatment effects (i.e., their internal validity) from comparisons of performance under different conditions applied to the same person or group of people (as in the GBG) over time. In most GBG studies, performance or behavior data from the group or class are aggregated and evaluated as data from a single person would be. Bayesian methods often outperform conventional methods in particularly small samples, allow a more directly probabilistic interpretation of statistics than conventional methods, and can more easily accommodate model complexity, such as distributions that reflect the scale of the observed variables, autocorrelation modeling, and the representation of hierarchical data structures. Bayesian estimation works well with small samples because it does not depend on asymptotic, large-sample theory (Ansari and Jedidi, 2000; Gelman, Carlin, Stern & Rubin, 2004).
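To show what the autocorrelation issue looks like in practice, here is a minimal sketch of the ordinary lag-1 autocorrelation estimator applied to a short simulated AR(1) series of the length typical of SCEDs. The true autocorrelation and series length are assumptions chosen for illustration.

```python
import numpy as np

def lag1_autocorr(x):
    """Ordinary lag-1 autocorrelation estimate (known to be biased
    toward zero in the short series typical of SCEDs)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d * d)

rng = np.random.default_rng(11)
phi = 0.5                        # true AR(1) autocorrelation
n = 20                           # short series, as in many SCEDs
e = rng.normal(size=n)
x = np.empty(n)
x[0] = e[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

print(lag1_autocorr(x))          # typically underestimates phi at this length
```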

This makes Bayesian estimation particularly beneficial for SCEDs. Bayesian estimation yields a probability distribution for each parameter. This posterior distribution can be used to calculate any summary statistic for the parameter of interest, and its 95% highest density interval (HDI) can be interpreted directly as having a 95% probability of containing the true value (Lynch, 2007). Change-point posteriors with high probability mass at several points indicate weak evidence of a treatment effect. Bayesian estimates of autocorrelation have the advantage of being more accurate than frequentist estimates; frequentist confidence intervals for the autocorrelation show poor coverage (Shadish et al., 2013). Gelman, Carlin, Stern, Dunson, Vehtari and Rubin (2013) offer a comprehensive discussion of Bayesian methods.
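The following sketch illustrates a change-point posterior and its HDI in a deliberately simplified setting: a single change point in Poisson counts, with conjugate Gamma priors on the phase rates and a uniform prior on the change point. It is an illustrative toy model, not the method of Natesan and Hedges (2017).

```python
import numpy as np
from scipy.special import gammaln

def log_marginal_poisson(y, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts y under a
    conjugate Gamma(a, b) prior on the rate."""
    y = np.asarray(y)
    s, m = y.sum(), len(y)
    return (a * np.log(b) - gammaln(a) + gammaln(a + s)
            - (a + s) * np.log(b + m) - gammaln(y + 1).sum())

def changepoint_posterior(y):
    """Posterior over the first treatment-phase index k (uniform prior on k)."""
    ks = np.arange(1, len(y))                # k splits y into y[:k] and y[k:]
    logp = np.array([log_marginal_poisson(y[:k]) + log_marginal_poisson(y[k:])
                     for k in ks])
    p = np.exp(logp - logp.max())            # stabilize before normalizing
    return ks, p / p.sum()

def hdi_discrete(ks, p, mass=0.95):
    """Smallest set of points holding at least `mass` posterior probability."""
    order = np.argsort(p)[::-1]
    keep = order[np.cumsum(p[order]) <= mass]
    keep = order[:len(keep) + 1]             # include the point that crosses mass
    return np.sort(ks[keep])

rng = np.random.default_rng(13)
y = np.concatenate([rng.poisson(6.0, 14), rng.poisson(1.5, 26)])
ks, post = changepoint_posterior(y)
print("posterior mode:", ks[post.argmax()])
print("95% HDI set:", hdi_discrete(ks, post))
```

A posterior concentrated at a single k (a narrow HDI set) corresponds to strong evidence of an abrupt treatment effect; probability mass spread over many candidate change points is the weak-evidence pattern described above.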