Construct Validity and Validity to Change of the Patient-Specific Functional Scale in Patients with Shoulder and Low Back Pain: A Clinimetric Study

Kromer TO; Saner J; Sieben JM; Bastiaenen CHG

Research Article

Phys Med Rehabil Int. 2021; 8(2): 1181.

Kromer TO¹,⁵, Saner J²,⁵, Sieben JM³,⁴,⁵ and Bastiaenen CHG⁴,⁵,⁶*

¹Faculty of Health, Safety, Society, Furtwangen University, Germany

²School of Health Professions, Institute of Physiotherapy, Zurich University of Applied Sciences, Switzerland

³Department of Anatomy and Embryology, Maastricht University, The Netherlands

⁴Research Line Functioning & Rehabilitation, Maastricht University, The Netherlands

⁵Caphri Research Institute, Research Line Functioning & Rehabilitation, Maastricht University, The Netherlands

⁶Department of Epidemiology, Maastricht University, The Netherlands

*Corresponding author: Caroline H.G. Bastiaenen, Associate Professor, Research Program Functioning and Rehabilitation, Department of Epidemiology, P. Debyeplein 1, 6229 HA Maastricht, The Netherlands

Received: May 28, 2021; Accepted: July 09, 2021; Published: July 16, 2021

Abstract

Background: Patient-specific and condition-specific measures are widely used in clinical practice and research to measure disability or change over time. While condition-specific outcome measures comprise a range of restrictions generally relevant for all patients, the Patient-Specific Functional Scale measures restrictions chosen by the individual patient.

Objectives: Based on the hypothesis that patient-specific and condition-specific scales deliver comparable results when used at group level, the aim of this study was to test for floor and ceiling effects and to evaluate the construct validity and validity to change of the Patient-Specific Functional Scale when compared to condition-specific outcome measures. For this purpose, two datasets from patients with shoulder pain and low back pain were analyzed.

Methods: Patient-Specific Functional Scale scores were compared to the Shoulder Pain and Disability Index and the Roland Morris Disability Questionnaire at 4 time-points using stem-and-leaf-plots and correlations using Pearson’s r. Hypothesis-driven correlation levels for data interpretation were predefined, with r ≥0.75=high, r ≥0.5=moderate, r ≥0.25=low.

Results: Patient-Specific Functional Scale floor effects were comparable to those of the condition-specific outcome measures in both samples. At none of the time-points did the Patient-Specific Functional Scale correlate with the condition-specific outcome measures in the expected manner.

Conclusion: Hypotheses regarding expected ranges of correlation between the Patient-Specific Functional Scale and the condition-specific outcome measures for construct validity and validity to change were not met. While the use of the Patient-Specific Functional Scale in a clinical context has its advantages, the measure is not recommended for group-level evaluations.

Keywords: Patient-centered outcome; Validity; Subacromial pain syndrome; Low back pain

Abbreviations

BL: Baseline; CSOM: Condition-Specific Outcome Measures; GPE: Global Perceived Effect Scale; LBP: Low Back Pain; MCI: Movement Control Impairment; NDI: Neck Disability Index; NRS: Numeric Rating Scale; ODI: Oswestry Disability Index; Pearson’s r: Pearson’s Correlation Coefficient; PSFS: Patient-Specific Functional Scale; RMDQ: Roland & Morris Disability Questionnaire; ROC: Receiver Operating Characteristics Curve; SD: Standard Deviation; SPADI: Shoulder Pain and Disability Index; SPS: Subacromial Pain Syndrome; T1-T3: Follow-up Time Points.

Background

For a therapist, it is essential to ascertain whether improvements in body function or structure also lead to increased activity and participation levels. Therefore, assessment tools that reflect the actual status or degree of restriction and that can measure a patient’s change over time are of crucial importance. Improvements in body functions and structures are predominantly assessed through physical testing; activities and participation are commonly measured using questionnaires. Scores gathered using these measurement tools also allow comparison at a group level and enable patients, therapists and researchers to “measure” the impact of a disease, its progression over time or the effect of an intervention. However, since questionnaires often contain very specific items related to certain activities, some items may not be relevant to all patients in the target group, and the importance of individual items may vary between patients. Moreover, a “prefixed” item set may not include activities that are of importance to individual patients. Therefore, patients may be required to score questions that are only partly relevant to them. As a result, these standard questionnaires might not adequately reflect a patient’s individual restrictions or the change in these restrictions over time. In an attempt to solve this problem, the Patient-Specific Functional Scale (PSFS) was developed with the intention of monitoring a patient’s progress based on relevant restrictions chosen by the individual patient [1]. The PSFS comprises 1 to 5 activities; each activity is rated on an 11-point Numeric Rating Scale (NRS) from 0 (impossible to do) to 10 (no difficulties at all). The PSFS is easy to administer and takes about five minutes to complete. However, researchers have also used the PSFS to determine the current state of function and the development of activity restrictions over time at an average group level.
By choosing this approach, researchers have moved away from the originally intended individual focus of the instrument and applied the PSFS to situations for which it was not developed or validated. From a test-theoretical perspective, there are numerous problems in deviating from the original construct. Firstly, interpreting an average score across activities that each individual has selected is a challenge. For researchers and clinicians who are familiar with interpreting data on a clearly defined aspect of disability, it is tempting to interpret outcomes using the same approach; but in fact, one is averaging different constructs. Another problem is that floor or ceiling effects could occur if a patient chooses either lightly restricted activities with scores at the lower end of the scale or severely restricted activities with scores at the upper end of the scale. In the first case, it is difficult to detect a positive development and in the second case to detect a negative development over time; this may affect the results of the validity-to-change analysis to a certain degree. Problems may also occur when the initially chosen activities become increasingly irrelevant as time goes by, either because of the patient’s improved condition or reduction in complaints or because of seasonal effects, as for example snow shoveling in spring. Depending on the activities chosen by the individual patient, PSFS outcomes could also indicate higher or lower disability levels for that patient than Condition-Specific Outcome Measures (CSOMs), and PSFS scores may differ considerably more between patients than their corresponding outcomes on a CSOM, where all patients rate the same standardized set of items.
Despite these problems, which have not yet been adequately recognized or addressed, several researchers have investigated the psychometric properties of the PSFS on a group level for a variety of musculoskeletal conditions. Results have been described as “promising”, since the PSFS has been reported as having good construct validity, discriminant validity, and responsiveness [2-4]. Based on these results we think that testing psychometric properties and comparisons at a group level can be justified by defining the PSFS as an instrument assessing “activity restriction based on items selected by an individual patient” as the overarching construct. We hypothesized that specific musculoskeletal disorders (in our example subacromial shoulder pain and low back pain) lead to specific activity restrictions and specific pain patterns. CSOMs, in our case the Shoulder Pain and Disability Index (SPADI) and the Roland & Morris Disability Questionnaire (RMDQ), summarize these typical activities and include a range of tasks from easy to more difficult. These questionnaires were designed to include items that cover the whole range of items assumed relevant for a patient group, although not every item may be of equal importance to each individual patient. Therefore, we assume that many activity restrictions chosen by individual patients for their PSFS can be traced back to, or are closely related to, items listed in the CSOMs. If this is the case, the PSFS could be approached as a construct and, because of the hypothesized close association between the operationalization of both types of measurement, assumed to deliver a relatively high correlation with the CSOM, especially in a cross-sectional analysis. However, there is also the possibility that the PSFS measures a different dimension not covered in the CSOM.
Taking these arguments into consideration, the aims of this paper are threefold: In two groups of patients, suffering from either Subacromial Pain Syndrome (SPS) or Low Back Pain (LBP), and using the CSOM as an external standard comparator: 1) To test for possible floor and ceiling effects of the PSFS; 2) To evaluate its construct validity compared to the CSOM, and 3) To assess the ability of the PSFS to detect changes over time with reference to an external anchor [5].

Methods

Data were used from two different datasets collected during randomized controlled trials investigating effects of physiotherapy interventions in a patient group with SPS and a second group with LBP in primary care. A detailed description of the inclusion processes, applied treatments and primary analyses can be found in the published study protocols [6,7] and trial results [8-11]. Ethical approval was granted by the ethics committee of the Ludwig-Maximilians-University Munich, Germany (project-no. 018-10) for the SPS trial, and the Swiss Ethics Committee granted ethical approval (KEK-ZH-NR: 2010-0034/5) for the LBP trial. All patients in each trial gave informed consent. Datasets of the two samples were analyzed independently of each other.

Dataset 1 - SPS patients

Participants were recruited through referral for physiotherapy due to shoulder complaints. After baseline assessment, they were randomly assigned to either an intervention or a control group. The intervention group received exercise therapy plus manual therapy, while the control group received only exercise therapy. Baseline characteristics of the 90 participants included in the trial are presented in Table 1.

Table 1: Baseline demographic data and results for SPS and LBP patients initially included in the original trials (mean (SD) if not otherwise stated).

| | SPS (n=90) | LBP (n=106) |
|---|---|---|
| Age in years | 51.8 (11.2) | 41.6 (14.1) |
| Gender (female) %, n | 41.1, 46 | 66.0, 40 |
| Duration of the current episode in weeks | 33.9 (42.8) | -- |
| Overall duration of complaints in years | 8.7 (12.7) | 10.0 (11.0) |
| SPADI/RMDQ total score | 40.4 (17.0) | 8.7 (3.3) |
| PSFS average score | 6.0 (1.7) | 5.7 (1.6) |
| GCPS total score | -- | 27.8 (10.4) |
| GCPS sub-score disability | -- | 12.4 (7.6) |
| FABQ total score | 32.7 (17.4) | 32.2 (14.7) |

SD: Standard Deviation; SPADI: Shoulder Pain and Disability Index; RMDQ: Roland Morris Disability Questionnaire; PSFS: Patient-Specific Functional Scale; FABQ: Fear Avoidance Beliefs Questionnaire; GCPS: Graded Chronic Pain Scale (Total Score: 70; Pain Intensity: 0-30, Disability: 0-40); SPS: Subacromial Pain Syndrome; LBP: Low Back Pain.

The primary outcome measure was the SPADI, a shoulder-specific, self-reported questionnaire measuring pain and disability [12]. SPADI sub-scales for pain (items 1 to 5) and function (items 6 to 13) are scored from 0 to 100, with higher scores reflecting higher pain or disability levels. The total SPADI score was calculated by averaging the scores of the two sub-scales. The SPADI has been shown to be valid and highly sensitive [12,13]. The German version of the SPADI has also been shown to have excellent reliability and internal consistency [14]. The PSFS [1,15] was also applied. Patients were instructed to choose 3 activities important to them, in which they were impaired, and to rate their ability to perform those on an 11-point NRS from 0 (impossible to do) to 10 (fully capable). The average score across all activities was calculated. For reasons of standardization, the PSFS has been rescaled in this paper, so that 0 now means “no difficulties at all” and 10 means “impossible to do”, in accordance with the other outcome measures used in this analysis. All measurement instruments were applied at Baseline (BL), after 5 weeks (T1), 12 weeks (T2), and at one year follow-up (T3).
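The scoring rules described above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code: the function names are hypothetical, the rescaling is assumed to be a simple reversal (10 minus the average rating), and each SPADI sub-scale score is assumed to be the mean of its item scores.

```python
from statistics import mean


def psfs_average(ratings):
    """Average the 0-10 ratings of a patient's self-selected activities."""
    if not ratings:
        raise ValueError("at least one activity rating is required")
    if any(not 0 <= r <= 10 for r in ratings):
        raise ValueError("PSFS ratings must lie between 0 and 10")
    return mean(ratings)


def psfs_rescaled(ratings):
    """Reverse the scale so that 0 = no difficulties and 10 = impossible to do.

    Assumption: the rescaling is a plain reversal of the original score.
    """
    return 10 - psfs_average(ratings)


def spadi_total(pain_items, function_items):
    """Average the pain and function sub-scale scores (each 0-100)."""
    pain = mean(pain_items)          # items 1 to 5
    function = mean(function_items)  # items 6 to 13
    return (pain + function) / 2


# Hypothetical patient: three activities rated 3, 5 and 4 on the original scale.
example_psfs = psfs_rescaled([3, 5, 4])
# Hypothetical SPADI item scores on the 0-100 scale.
example_spadi = spadi_total([50, 60, 40, 50, 50], [30] * 8)
```

With these made-up inputs the original PSFS average is 4, so the rescaled score is 6, and the SPADI total is the mean of the two sub-scale scores of 50 and 30.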

Dataset 2 - LBP patients

A total of 106 patients with LBP, defined as pain persisting for longer than six weeks and with no radiating symptoms below the knee, were included in the original LBP trial. Eligible patients presented with LBP in combination with defined complaints associated with Movement Control Impairment (MCI). Other inclusion criteria were a score of at least two positive out of six movement tests (representing MCI) and a minimal level of disability of 5 points on the RMDQ [7,16]. Participants were randomly allocated either to an intervention group that received an individual complaint-specific exercise program, or to a control group that received general exercise therapy. Baseline characteristics are presented in Table 1.

The primary outcome measure was the PSFS [1,15,17]. Patients received the same instructions as in the shoulder trial. A secondary outcome was the RMDQ, which measures LBP-related disability. It consists of 24 dichotomous questions to be answered with either “yes” or “no”, with a “yes” answer indicating disability. Reliability was shown to be high and construct and internal validity to be good, also for the German version [18-20]. All outcomes were measured at baseline (BL), at 9-12 weeks (T1), 6 months (T2), and at one year follow-up (T3).
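As a small illustration of the RMDQ scoring and the trial's minimum disability criterion, the sketch below counts the “yes” answers. The function name and the example answers are hypothetical.

```python
def rmdq_score(answers):
    """Count the 'yes' answers (True) among the 24 dichotomous RMDQ items."""
    if len(answers) != 24:
        raise ValueError("the RMDQ has 24 items")
    return sum(1 for answer in answers if answer)


# Hypothetical patient answering "yes" to 9 of the 24 items.
answers = [True] * 9 + [False] * 15
score = rmdq_score(answers)

# The LBP trial required a minimal disability level of 5 points on the RMDQ.
meets_inclusion = score >= 5
```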

An overview and description of all outcome measures for both datasets are provided in Table 2. For this study, we decided only to include those patients with complete data regarding variables relevant to our analysis.

Table 2: Outcome measures used in the two trials.

| Outcome measures | Dimension | Scale | Scorings |
|---|---|---|---|
| a) SPS | | | |
| Shoulder Pain and Disability Index (SPADI) | SPS-related pain & activity limitations | Items 1-5: 0-100, continuous; items 6-13: 0-100, continuous | Items 1-5 scored on a 100 mm VAS; items 6-13 scored on a 100 mm VAS. Mean of item scores. Higher scores mean higher pain/disability. |
| Patient-Specific Functional Scale (PSFS) | SPS-related disability | 0-10, continuous | 11-point visual numeric rating scale (end descriptors of 0 = impossible to do, 10 = no difficulties at all) |
| b) LBP | | | |
| Patient-Specific Functional Scale (PSFS) | LBP-related activity limitations | 0-10, continuous | 11-point visual numeric rating scale (end descriptors of 0 = impossible to do, 10 = no difficulties at all) |
| Roland-Morris Disability Questionnaire (RMDQ) | LBP-related disability | 0-24, continuous | Dichotomous questions (yes = with disability, no = no disability); scores 0-24 (minimal enrolment to trial RMDQ = 5) |

SPS: Subacromial Pain Syndrome; LBP: Low Back Pain; VAS: Visual Analogue Scale.

Data analysis and hypotheses

Floor and ceiling effects: Since patients may be greatly restricted in their activities at the start of treatment, they may also have high scorings for their chosen activities on the PSFS. Because high or low scores at any time point could have influenced the measurement properties of the outcomes, data were checked for possible floor and ceiling effects by using stem-and-leaf-plots at every measurement point before validity to change was investigated. Floor and/or ceiling effects were assumed when more than 15% of values were within 10% of the highest and/or lowest possible scores.

Hypothesis 1: There are no floor or ceiling effects at any measurement point.
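The 15%/10% criterion can be expressed as a short check. The following Python sketch is illustrative only: the function name, the inclusive boundaries (scores of exactly 1 or 9 on a 0-10 scale count as being within the band) and the example scores are assumptions, not details reported by the authors.

```python
def extreme_score_shares(scores, scale_min=0.0, scale_max=10.0,
                         band=0.10, threshold=0.15):
    """Report the share of scores within `band` of each end of the scale.

    An effect is flagged when more than `threshold` of all scores fall
    into the respective band. Boundaries are treated as inclusive, which
    is an assumption made for this sketch.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    width = band * (scale_max - scale_min)
    n = len(scores)
    lowest = sum(1 for s in scores if s <= scale_min + width) / n
    highest = sum(1 for s in scores if s >= scale_max - width) / n
    return {
        "lowest_share": lowest,
        "highest_share": highest,
        "lowest_effect": lowest > threshold,
        "highest_effect": highest > threshold,
    }


# Hypothetical rescaled PSFS averages for ten patients at one time point.
example = extreme_score_shares(
    [0.5, 1.0, 0.0, 3.0, 4.5, 6.0, 2.0, 7.5, 5.0, 9.5]
)
```

In this made-up sample three of ten scores lie at or below 1 and one lies at or above 9, so only the lower end exceeds the 15% threshold.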

Construct validity: To test aspects of construct validity of the PSFS we calculated correlations between the PSFS and SPADI or RMDQ, respectively, at every measurement point using Pearson’s r. A high correlation was defined as r ≥0.75, a moderate correlation as r ≥0.5 and a low correlation as r ≥0.25. High correlations were expected for baseline scorings because both measurements should reflect the status of disability. For the consecutive time points T1, T2, and T3, progressively decreasing correlations were expected: from high at baseline to low at T3, especially in patients showing good improvement. One reason for this expectation was that patients were not allowed to change the initially chosen PSFS activities over the 1-year follow-up period, an application of the PSFS often used in research and clinical practice nowadays. In consequence, we expected that these activities would become increasingly irrelevant for patients as their health status improved over time.

Hypothesis 2a: There is a high correlation between the PSFS and CSOMs (RMDQ/SPADI) at baseline.

Hypothesis 2b: The correlation between the PSFS and the CSOMs (RMDQ/SPADI) measured at every follow-up point in a cross-sectional independent way (T1, T2, and T3) is lower than the correlation of the preceding point: r-values will decrease from high at baseline to moderate, and to low at T3.
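To make the predefined correlation levels concrete, the sketch below computes Pearson's r from paired scores and maps it onto the categories defined above. It is a minimal illustration with hypothetical function names; the label "below low" for r <0.25 is an assumption, since no category was defined for that range.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


def classify(r):
    """Map r onto the predefined levels: high, moderate, low."""
    if r >= 0.75:
        return "high"
    if r >= 0.5:
        return "moderate"
    if r >= 0.25:
        return "low"
    return "below low"  # assumed label; not defined in the hypotheses


# Hypothetical paired scores for five patients at one time point.
psfs = [6.0, 7.0, 5.0, 8.0, 4.0]
csom = [40.0, 55.0, 35.0, 50.0, 30.0]
level = classify(pearson_r(psfs, csom))
```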

Validity to change: To test the ability of the PSFS to detect change over time we calculated correlations between the change scores in the SPADI/RMDQ and the change scores in the PSFS for the following intervals: BL to T1, T1 to T2, and T2 to T3. We expected that the correlation between change scores would be acceptable in the short term, but would diverge over the longer term. Therefore, our third hypothesis was:

Hypothesis 3: PSFS change scores show high correlations with both the SPADI and the RMDQ change scores between BL and T1, moderate correlations between T1 and T2, and low correlations between T2 and T3. The CSOMs are used as external anchors.
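The change-score analysis can be sketched as follows. This is a hedged illustration with made-up data for five patients: the helper names are hypothetical, and the direction of subtraction (earlier minus later, so that improvement is positive on scales where higher means more disability) is an assumption that does not affect r as long as both measures use the same direction.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def change_scores(earlier, later):
    """Per-patient change between two time points (earlier minus later)."""
    return [e - l for e, l in zip(earlier, later)]


def interval_correlations(psfs, csom, intervals):
    """Correlate PSFS and CSOM change scores for each (start, end) interval."""
    return {
        (start, end): pearson_r(
            change_scores(psfs[start], psfs[end]),
            change_scores(csom[start], csom[end]),
        )
        for start, end in intervals
    }


# Hypothetical rescaled PSFS and SPADI scores for five patients.
psfs = {"BL": [6, 7, 5, 8, 6], "T1": [4, 6, 3, 5, 5], "T2": [3, 5, 3, 2, 4]}
spadi = {"BL": [40, 50, 30, 60, 45], "T1": [20, 40, 10, 30, 35],
         "T2": [15, 30, 10, 10, 20]}

result = interval_correlations(psfs, spadi, [("BL", "T1"), ("T1", "T2")])
```

The BL to T1 data were constructed so that the SPADI change is exactly ten times the PSFS change, which yields r = 1 for that interval; real data will of course not behave this neatly.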

Results

Participants

Complete datasets were available for 87 SPS-participants (96.7%), and for 60 LBP-participants (56.6%). Characteristics of both samples are described in Table 3.

Table 3: Baseline demographic data and results for SPS and LBP samples included in this analysis (mean (SD) if not otherwise stated).

| | SPS (n=87) | LBP (n=60) |
|---|---|---|
| Age in years | 52.0 (11.4) | 41.8 (13.9) |
| Gender (female) %, n | 49.4, 43 | 38.3, 23 |
| Duration of the current SPS episode in weeks | 33.6 (43.5) | -- |
| Overall duration of complaints in years | 8.6 (12.9) | 9.1 (10.4) |
| SPADI/RMDQ total score | 41.0 (17.0) | 8.7 (3.3) |
| PSFS average score | 6.0 (1.6) | 5.7 (1.6) |
| GCPS total score | -- | 26.5 (10.1) |
| GCPS sub-score disability | -- | 11.1 (7.4) |
| FABQ total score | 32.0 (17.2) | 29.3 (13.8) |

SD: Standard Deviation; SPADI: Shoulder Pain and Disability Index; RMDQ: Roland Morris Disability Questionnaire; PSFS: Patient-Specific Functional Scale; FABQ: Fear Avoidance Beliefs Questionnaire; GCPS: Graded Chronic Pain Scale (Total Score: 70; Pain Intensity: 0-30, Disability: 0-40); SPS: Subacromial Pain Syndrome; LBP: Low Back Pain.

Floor and ceiling effects (hypothesis 1)

For the PSFS, no evidence of ceiling effects was found at any measurement point. Floor effects were found in the SPS sample at T2 and T3, with 32% (n=28) and 52.9% (n=46) of participants, respectively, scoring within 10% of the lowest possible score. However, at T2 these participants had a mean (SD) SPADI score of 3.7 (3.6) points, with only three participants scoring 10 points or higher. At T3, the mean SPADI score was 2.5 (3.6), again with only three participants scoring 10 points or higher. In the LBP sample floor effects were found at T1, T2, and T3, increasing from 21.6% (n=13) to 36.7% (n=22) and 43.3% (n=26), respectively. As seen in the SPS sample, the average scores of these patients on the RMDQ were also comparably low. Results are summarized in Table 4.

Table 4: Floor and ceiling effects: n (%) of patients scoring within the lowest and highest 10% of the PSFS.

| | BL | T1 | T2 | T3 |
|---|---|---|---|---|
| SPS sample (n=87) | | | | |
| PSFS ≤1 | n=0 (0%) | n=13 (14.9%) | n=28 (32.0%) | n=46 (52.9%) |
| SPADI mean (SD) score* | -- | -- | 3.7 (3.6) | 2.5 (3.6) |
| PSFS >9 | n=5 (5.8%) | n=0 (0%) | n=2 (2.3%) | n=2 (2.3%) |
| LBP sample (n=60) | | | | |
| PSFS ≤1 | n=0 (0%) | n=13 (21.6%) | n=22 (36.7%) | n=26 (43.3%) |
| RMDQ mean (SD) score* | -- | 1.3 (1.6) | 1.8 (3.0) | 1.5 (1.8) |
| PSFS >9 | n=1 (1.8%) | n=0 (0%) | n=0 (0%) | n=0 (0%) |

*If more than 15% scored in the lower or higher 10% range of the scale, the mean (SD) SPADI/RMDQ score of this sub-sample is displayed. SD: Standard Deviation; SPADI: Shoulder Pain and Disability Index; RMDQ: Roland Morris Disability Questionnaire; PSFS: Patient-Specific Functional Scale; SPS: Subacromial Pain Syndrome; LBP: Low Back Pain.

Construct validity (hypotheses 2a and 2b)

The PSFS correlated positively with the SPADI (r = 0.41) and the RMDQ (r = 0.30) at baseline. However, both correlation coefficients were below our predefined cutoff level of r ≥0.75. Our hypothesis 2a, therefore, had to be rejected. Correlations for the time-points T1, T2 and T3 showed a progressive increase: we found the strongest correlations at T3, with r = 0.90 in the SPS and r = 0.74 in the LBP sample. This development was completely contrary to our hypothesis 2b, which therefore also had to be rejected. Detailed results are displayed in Table 5.

Table 5: Correlations (Pearson’s r) between PSFS and condition-specific disability scores.

| Outcome measure | BL (SPS & LBP) | T1 (SPS: 5 weeks (a); LBP: 9-12 weeks (a)) | T2 (SPS: 12 weeks; LBP: 6 months) | T3 (SPS & LBP: 12 months (b)) | Change BL-T1 | Change T1-T2 | Change T2-T3 | Change BL-T3 |
|---|---|---|---|---|---|---|---|---|
| SPADI | 0.406 | 0.663 | 0.794 | 0.903 | 0.572 | 0.417 | 0.661 | 0.662 |
| RMDQ | 0.301 | 0.523 | 0.651 | 0.743 | 0.497 | 0.542 | 0.176 | 0.552 |

BL: Baseline; r: Pearson’s Correlation Coefficient; a: Post Intervention; b: Final Follow Up; SPADI: Shoulder Pain and Disability Index; RMDQ: Roland Morris Disability Questionnaire; PSFS: Patient-Specific Functional Scale; SPS: Subacromial Pain Syndrome; LBP: Low Back Pain.

Validity to change (hypothesis 3)

Here we expected decreasing correlation coefficients for change scores over time. However, results did not support this hypothesis. Instead of decreasing correlation coefficients, we found alternating patterns that varied between samples. The SPS r-values varied in a down-up sequence, with the strongest correlation for the change score between T2 and T3. In contrast, the LBP sample showed an up-down pattern. Here, a very low correlation was found between T2 and T3. Results are displayed in Table 5.
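The comparison between the reported coefficients and the hypothesized patterns can be retraced with a short script. The r-values below are those reported in Table 5; everything else (names, the "below low" label for r <0.25, and reading hypothesis 2b as requiring a strictly decreasing sequence) is a simplifying assumption made for this sketch.

```python
# Pearson's r values as reported in Table 5.
REPORTED = {
    "SPADI": {"time_points": [0.406, 0.663, 0.794, 0.903],  # BL, T1, T2, T3
              "changes": [0.572, 0.417, 0.661]},            # BL-T1, T1-T2, T2-T3
    "RMDQ": {"time_points": [0.301, 0.523, 0.651, 0.743],
             "changes": [0.497, 0.542, 0.176]},
}


def classify(r):
    """Map r onto the predefined levels; 'below low' is an assumed label."""
    if r >= 0.75:
        return "high"
    if r >= 0.5:
        return "moderate"
    if r >= 0.25:
        return "low"
    return "below low"


def strictly_decreasing(values):
    """True if every value is smaller than its predecessor."""
    return all(a > b for a, b in zip(values, values[1:]))


summary = {}
for measure, r in REPORTED.items():
    change_labels = [classify(v) for v in r["changes"]]
    summary[measure] = {
        "baseline_label": classify(r["time_points"][0]),
        "h2a_met": classify(r["time_points"][0]) == "high",
        "h2b_met": strictly_decreasing(r["time_points"]),
        "change_labels": change_labels,
        "h3_met": change_labels == ["high", "moderate", "low"],
    }
```

Under these assumptions none of the three hypotheses is met for either measure, which matches the rejections stated in the text.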

Discussion

The aim of this paper was to test the following hypotheses regarding the PSFS when compared to well-established CSOMs: no floor and ceiling effects, acceptable construct validity and validity to change over time. Two samples from different populations were analyzed and similar results were found. Four measurement points were incorporated in the analysis, which made it possible to present the development of the relationship between the PSFS and the CSOMs over the period of one year. Except for floor and ceiling effects, the results were the opposite of what we had hypothesized, resulting in the rejection of our hypotheses regarding construct validity and validity to change. These results demonstrate that the overarching construct defined in the introduction must be doubted. The PSFS certainly does not reflect change on a group level in the same way that CSOMs do. The development over time of the correlations between PSFS and CSOMs has led us to conclude that the underlying constructs are different and that the two types of measure, therefore, should not be used for the same purpose. Although the PSFS has been used in several studies as a secondary outcome measure to analyze longitudinal development of activity restrictions on a group level [8,11,21-23] and seemed to perform well for this purpose, our data suggest that the underlying construct remains unclear. Therefore, we cannot currently recommend the use of the PSFS on a group level without taking into account that its underlying construct not only differs from that of CSOMs but is also unclear to interpret [24].

Other authors have also investigated validity aspects of the PSFS. Hall et al. [25] investigated responsiveness of the RMDQ and PSFS in patients with LBP and reported “acceptable” responsiveness for both outcome measures. In a first step, they calculated correlations between RMDQ/PSFS and the Global Perceived Effect scale (GPE), which they used as an external reference standard for change. In a second step, they used a receiver operating characteristics (ROC) curve to assess responsiveness. This methodological approach provides information about the relationship between RMDQ/PSFS and GPE as an external indicator of change. Although GPE is often used for this purpose, GPE (change) scores might be more a reflection of the current health status than of true change and GPE might, therefore, be insufficient to serve as a valid external reference for change [26]. Furthermore, when using ROC it is necessary to dichotomize the external change scores, which leads to a loss of information on the magnitude of change [27]. Our purpose was to analyze whether the PSFS was responsive relative to our CSOMs. This difference in approach from Hall et al. [25] may explain the difference in the conclusions drawn.

Thoomes-de Graaf et al. [28] tested convergent validity between the PSFS and the Neck Disability Index (NDI) in patients with neck pain; they assumed that both tools measure the same construct of “activity limitations” and expected to find a strong baseline correlation between them. However, results showed only a moderate correlation.

Abbott and Schmitt [2] investigated concurrent validity (which would be defined as construct validity in the absence of a gold standard according to Mokkink et al. [5]) and validity to change of the PSFS in a sample with mixed acute and chronic musculoskeletal disorders. According to our classification system for the correlation coefficient, they found a moderate correlation between PSFS and CSOMs at baseline in the subgroup with upper extremity disorders and a low correlation in the subgroup with LBP. Interestingly, these correlations were stronger at 6-month follow-up, which was similar to the outcomes in our samples. The moderate correlations found for change scores between baseline and follow-up at 6 months were also similar to ours (between baseline and our last measurement point at 1 year), although the use of different time frames may complicate this comparison. Similar results regarding construct validity were also found by Heldmann et al. [29] in LBP patients. They used the Oswestry Disability Index (ODI) as reference measure and found low correlations at baseline and moderate correlations at follow-up. The fact that we obtained similar results to those of Abbott and Schmitt [2] and Heldmann et al. [29] for patients with LBP and SPS has led us to the assumption that results could be independent of the condition-specific population of interest. This assumption is underpinned by similar results found for lower extremity and neck disorders in the study by Abbott et al. [30], and by results from Stratford et al. [31], who also found low correlations between change scores of the PSFS and the CSOM used in patients who underwent total knee arthroplasty. In our opinion, it is more likely that the similarity between the results of the mentioned studies and our study is mainly based on a basic difference between the constructs of the PSFS and CSOMs.

In another study, Mannberg-Bäckman et al. [32] investigated the validity and sensitivity to change of the PSFS in a group of patients after proximal humerus fracture. Although 62% of the chosen items in the PSFS were also represented in the CSOM, correlation between the two instruments was low, whereas sensitivity to change was high in both instruments. Resnik et al. [33] investigated responsiveness of several CSOMs and the PSFS in patients undergoing rehabilitation for upper limb prosthesis. The PSFS showed the largest effect sizes of all outcomes analyzed. This difference between the PSFS and other outcomes was most obvious for long-term results, where the PSFS still showed significant changes while most of the other instruments did not. The authors concluded that the PSFS was one of the outcome measures most responsive to change. Heldmann et al. [29] also concluded that the PSFS is more responsive than the ODI because they found higher effect sizes for the PSFS. However, effect sizes might rather be a reflection of quantitative properties in the study population (it is merely a measure focusing on individual changes) than necessarily of the validity of the instrument itself. Therefore, these results may alternatively suggest that the PSFS measures a different construct than CSOMs and that larger effects can be reached when using the PSFS instead of a CSOM, but they do not clarify the construct underlying the PSFS. Originally, the PSFS was developed to address individual activity restrictions, to prioritize treatment goals, and, last but not least, to keep patients motivated by working on personally important goals. We would agree with using the PSFS for these purposes. We also would agree with using the PSFS for risk assessment [34] and to monitor short-term effects in individual patients.
The PSFS also may help to redirect focus towards value driven treatment goals chosen by the individual patients themselves and consequently on function and ability rather than pain and disability. Based on our data, we believe that the PSFS does not provide a clinically meaningful unidimensional scale comparable on a group level to a CSOM, because it aggregates not only heterogeneous functional items in one scale but also averages item-dependent scores. This makes it difficult to compare outcomes of the two types of measures on a group level in patients.

Limitations

Both of our samples contained mainly patients with chronic complaints, and our results could differ from those of studies that investigate samples including acute patients. Furthermore, Pearson’s r measures how closely two measures follow a linear relationship; the relationship between two measures may be close but non-linear, which we did not analyze. The Pearson’s r values that we set for the acceptance of the stated hypotheses were high and subjective to a certain degree; this led to a more conservative interpretation of the results. Our results for construct validity were mainly based on convergent validity; based on existing data it was not possible for us to analyze divergent validity.
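The point about linearity can be illustrated with a toy example. The sketch below compares Pearson's r with a rank-based coefficient (Spearman's rho) on made-up data that are perfectly monotonic but non-linear. Spearman's rho is used here only as one possible illustration; it was not part of the analysis in this study, and all names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Made-up data: y increases with x, but along a cubic curve.
x = list(range(1, 11))
y = [value ** 3 for value in x]

linear_r = pearson_r(x, y)
rank_rho = spearman_rho(x, y)
```

Here the rank-based coefficient equals 1 because the ordering of the two variables is identical, while Pearson's r stays below 1 because the points do not lie on a straight line.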

Conclusions

Our hypotheses of the expected ranges of correlation between the PSFS and the CSOM for construct validity and validity to change had to be rejected. While the use of the PSFS in a clinical context has its advantages, the measure is not recommended to assess the development of pathology or syndromes or to compare between patients on a group level.

Citation: Kromer TO, Saner J, Sieben JM and Bastiaenen CHG. Construct Validity and Validity to Change of the Patient-Specific Functional Scale in Patients with Shoulder and Low Back Pain: A Clinimetric Study. Phys Med Rehabil Int. 2021; 8(2): 1181.