Rapid Communication

Austin J Autism & Relat Disabil. 2016; 2(3): 1025.

A Brief Overview of Within-Subject Experimental Design Logic for Individuals with ASD

Cox DJ*

Department of Psychology, University of Florida, USA

*Corresponding author: David J Cox, Department of Psychology, University of Florida, 945 Center Drive, Gainesville, FL 32611, USA

Received: May 10, 2016; Accepted: June 27, 2016; Published: June 28, 2016

Abstract

Many individuals with ASD receive Applied Behavior Analysis (ABA) services. Most researchers and practicing Board Certified Behavior Analysts (BCBAs) use within-subject designs to determine the effect of an intervention on behavior. This article outlines the logic for demonstrating experimental control with within-subject research designs for researchers and professionals currently unfamiliar with how to analyze and interpret these designs. Those not familiar with these research designs will be better able to understand and critique within-subject research from behavior analytic journals, converse more easily with their behavior analytic colleagues, and use these designs to validate their own work with individual clients.

Keywords: ABA; Within-Subject Design; Evidence-Based Practices

Introduction

The Centers for Disease Control and Prevention estimates approximately 1 in 68 children are identified with Autism Spectrum Disorders (ASD) [1]. Many children with ASD require structured intervention to learn communicative, social, and daily living skills they may not learn as independently or rapidly as typically developing peers. The breadth of skill deficits demonstrated by individuals with ASD often requires remedial services provided by a variety of healthcare professions. Some of these professionals include Speech-and-Language Therapists (SLPs), Occupational Therapists (OTs), special education teachers, psychologists, and Board Certified Behavior Analysts (BCBAs).

Developing and maintaining collaborative relationships across disciplines will likely increase the overall consistency of interventions for individuals with ASD. As interdisciplinary collaboration continues to increase and is increasingly required by funding agencies, professionals face the difficulty of establishing and maintaining professional relationships between disciplines. These disciplines often differ widely in terminology, methods, and underlying theoretical assumptions. Such differences have led some to discuss how collaboration might be accomplished, without abandoning the theoretical foundations underlying a profession [2], through language common to all disciplines [3].

Since the author is most familiar with ABA, this brief overview covers experimental methods that are used by BCBAs and behavior analysts in clinical and research ABA settings. The goal is to briefly discuss the logic underlying various within-subject designs [4]. Discussing the rationale by which socially significant effects are demonstrated should allow the reader to understand how and why claims of effectiveness are made by researchers who employ these methods. This should, in turn, allow individuals previously unfamiliar with within-subject experimental designs to (a) feel more comfortable reading and interpreting behavior analytic research that uses within-subject designs, (b) better communicate with BCBA colleagues who use these designs in their practice, and (c) use these methods in their own practice with individual clients where experimental evidence of the effect of an intervention is warranted.

The intended audience of this manuscript is clinicians and researchers who are primarily familiar with statistical techniques for data analysis. The first section discusses how the number of observations is often defined; its purpose is to highlight the differences between obtaining large Ns across group designs versus within-subject designs, as well as the strengths and weaknesses of both relative to external and internal validity. The second section discusses the difference between statistical and visual analysis of data, highlighting differences between ABA research and common psychological research. Finally, the third section discusses the concepts of prediction, verification, and replication that are used to analyze intervention effectiveness in three common within-subject experimental designs, and provides examples and a description of how to analyze data from these designs to determine intervention effectiveness.

N as Number of Observations

Nearly all published research reports the total N from the experiment. N in these contexts refers to the total number of observations made by researchers. In many situations N and the number of participants may be equivalent, but this is not always so. For example, N would be identical (N = 100) for a researcher who chooses to record 1 observation on each of 100 participants, 10 observations on each of 10 participants, or 100 observations on 1 participant, as the sketch below illustrates. Within-subject designs often use the latter approach across several subjects (see the Journal of Applied Behavior Analysis for a multitude of examples). Just as in group design research, findings from within-subject designs are interpreted in light of the number of observations present within each phase of the experiment.
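
A minimal sketch of this equivalence (the three scenarios are the hypothetical ones above, not data from any study): total N is simply the number of participants multiplied by the number of observations per participant.

    # Each tuple is a hypothetical (participants, observations per participant) scenario
    designs = [(100, 1), (10, 10), (1, 100)]
    for participants, obs_each in designs:
        print(f"{participants} participants x {obs_each} observations each -> N = {participants * obs_each}")
    # All three print N = 100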

All experiments seek to provide valid findings, and research is often critiqued in terms of external and internal validity. Obtaining one observation from each of a large number of participants would result in greater external validity of findings and lower internal validity [5]. That is, the researcher would be able to make general statements about the influence of an independent variable on a dependent variable for that particular sample of individuals. However, the researcher would be unable to say much about any individual participant from the sample. Compared to within-subject designs, external validity is a strength of group designs. The comparative weakness of group designs is a decreased understanding of the mechanisms of individual behavior change [6].

In contrast, recording 100 observations for 1 participant would result in greater internal validity of findings and less external validity. Every response emitted by an individual varies in some dimension from previous responses; it is impossible to emit the exact same response twice. The more observations collected for a participant, the greater the probability that the collected observations accurately reflect the effect of an independent variable on the dependent variable for that particular individual. This allows the researcher to make more specific statements about the effect of an independent variable on a dependent variable for that specific participant [6]. This is a significant strength of within-subject designs, especially for clinicians interested in the effect of an intervention for a specific individual. However, these designs are less equipped than group designs to make claims about how results generalize to other participants within the larger population.

The most appropriate data collection strategy likely depends on the research question being asked and resources available to the researcher. ABA often opts for a large number of observations for each participant. Over time, the results of a large number of participants who are exposed to the same independent variable are sometimes aggregated and analyzed [7]. However, the majority of researchers publishing within ABA use a handful of participants in any one study (see Journal of Applied Behavior Analysis for examples).

Evaluating Data

Another difference between ABA and many other psychological fields is how data are analyzed. Most fields of psychology use inferential statistics to determine the effect of an intervention on a target behavior. ABA typically uses visual analysis, rather than statistical techniques, to interpret data. Visual analysis of clinical interventions typically relies on two criteria for determining the effectiveness of an intervention. The first criterion is a clear demonstration of functional control of the behavior of interest (more on this below). The second criterion is the notion of social significance, as opposed to statistical significance.

A socially significant change can be defined as a change in the occurrence of behavior to levels acceptable to the client and socially relevant stakeholders (e.g., caregivers, other therapists, teachers, funding agencies, etc.) [8]. This differs from statistically significant change. Statistical significance refers to the probability of observing a difference between the experimental and control groups at least as large as the one obtained if the independent variable really had no effect. An illustrative example could be as follows: An individual is observed to strike their own head at a rate of 100 times per hour with a standard deviation of 10 strikes. An intervention is put into place that consistently reduces self-injury to 50 times per hour across all participants exposed to the intervention, with a standard deviation of 5. This reduction will surely be statistically significant. However, the result would not be socially significant, as the individual is still engaging in self-injurious behavior at rates harmful to his or her overall health. Statistically, the intervention would be deemed effective; based on social significance, the intervention would still need substantial improvement. This does not mean that statistically significant differences are never socially significant, only that social significance is a different criterion by which to judge change in behavior.
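
A minimal sketch of this distinction, using simulated data generated from the hypothetical values in the example above (not data from any real client):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Hypothetical rates of self-injury (strikes per hour), per the example above
    baseline = rng.normal(loc=100, scale=10, size=30)
    intervention = rng.normal(loc=50, scale=5, size=30)

    # Statistical significance: the p-value will be vanishingly small
    t, p = stats.ttest_ind(baseline, intervention)
    print(f"t = {t:.1f}, p = {p:.2e}")

    # Social significance asks a different question: is roughly 50 strikes
    # per hour an acceptable level for the client and stakeholders? It is not.
    print(f"Mean rate after intervention: {intervention.mean():.0f} strikes/hour")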

Some have argued that the use of visual analysis introduces bias on the part of the individual interpreting the data, leading to inconsistent analysis [9]. This has led some to propose structured criteria for interpreting within-subject design graphs [10], as well as the use of statistical techniques for interpreting within-subject design data [11]. However, other researchers have observed relatively high consistency among those trained in the visual analysis of within-subject design graphs [12]. As such, visual analysis in most published research and clinical settings uses neither structured interpretation criteria nor statistical techniques.

The second requirement for labeling an intervention as effective is functional control. Similar to group designs, demonstrating functional control using a within-subject design involves comparing observations in a treatment condition to observations in a control condition. This control condition is often referred to as a baseline period and provides measurement of the behavior in the absence of the intervention. Observations are typically collected until stability in responding is achieved or the behavior is trending (i.e., increasing or decreasing) in a direction opposite to the intended effect of the intervention. The overall level, variability, and trend during the baseline control condition are then compared to the level, variability, and trend during the intervention condition. If these comparisons can be verified and replicated within a subject, the intervention can be argued to have demonstrated functional control over responding. How verification and replication are demonstrated visually depends on the type of within-subject design used.
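
A minimal sketch of the three quantities compared across phases (the session data and the helper name describe_phase are hypothetical, for illustration only):

    import numpy as np

    def describe_phase(observations):
        # Level (mean), variability (standard deviation), and trend
        # (slope of a least-squares line fit across session order).
        obs = np.asarray(observations, dtype=float)
        sessions = np.arange(len(obs))
        slope = np.polyfit(sessions, obs, deg=1)[0]
        return {"level": obs.mean(), "variability": obs.std(ddof=1), "trend": slope}

    baseline = [12, 14, 13, 15, 14, 13]   # hypothetical responses per session
    treatment = [6, 5, 4, 4, 3, 3]

    print("Baseline: ", describe_phase(baseline))
    print("Treatment:", describe_phase(treatment))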

Common Within-Subject Designs

ABAB reversal design

Figure 1 shows an example ABAB reversal design. This design starts in a baseline/control condition until one of two things occurs. First, stability in responding is observed. Stability can be defined as an absence of trend (behavior increasing or decreasing) combined with all measures falling within a small range of values (low variability) [11]. Second, the data path shows a trend in the direction opposite to what is expected with the intervention. For example, if an intervention is aimed at decreasing self-injurious behavior, an increasing baseline trend would suggest the researcher could implement the intervention. This is because a reduction in self-injurious behavior during the intervention would not be confounded with the baseline trend (Figure 1).
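
A minimal sketch of such a stability check, assuming illustrative thresholds (max_slope and max_range_frac are arbitrary values for demonstration, not published criteria):

    import numpy as np

    def is_stable(observations, max_slope=0.5, max_range_frac=0.2):
        # Stability: near-zero trend (small absolute slope) and low
        # variability (range small relative to the mean level).
        obs = np.asarray(observations, dtype=float)
        sessions = np.arange(len(obs))
        slope = abs(np.polyfit(sessions, obs, deg=1)[0])
        spread = (obs.max() - obs.min()) / obs.mean() if obs.mean() else 0.0
        return slope <= max_slope and spread <= max_range_frac

    print(is_stable([10, 11, 10, 9, 10]))  # True: flat trend, tight range
    print(is_stable([5, 8, 11, 14, 17]))   # False: clear increasing trend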