24  Single Word Recognition (ROAR-Word) Concurrent Validity

ROAR-Word is designed to measure the latent construct of single word reading. Traditionally, single word reading is measured by having children read lists of real words and pseudowords of increasing complexity and scoring the accuracy of their pronunciation. Thus, we first establish that the silent lexical decision task in ROAR-Word taps into the same latent construct by comparing ROAR-Word scores to a variety of other standardized measures of single word reading (Section 24.1).

24.1 Convergent validity with oral measures of single word reading

24.1.1 Woodcock Johnson Basic Reading Skills

24.1.1.1 Background: Published studies

In an initial proof-of-concept study, we compared proportion correct on a pilot version of ROAR-Word to individually administered Woodcock-Johnson Letter Word Identification (WJ-Word-ID) scores and found an exceptionally high correlation (r = 0.91, disattenuated r = 0.94; Yeatman et al. 2021; Figure 24.1). Moderation analysis confirmed that ROAR-Word is equally valid for children with dyslexia and typical readers (6-18 years of age). Different measures of single word reading, such as the Woodcock-Johnson (WJ) and the Test of Word Reading Efficiency (TOWRE), are highly correlated, and a variety of standardized measures largely tap into the same latent construct. For example, it is common to use a threshold on WJ or TOWRE to group research participants into dyslexic versus control groups (Fletcher et al. 2006). The correlations among timed, untimed, real word, and pseudoword reading measures across these assessments ranged from r = 0.72 to 0.93; ROAR-Word is similarly correlated with each measure (Figure 24.1). Thus, in terms of convergent validity, ROAR-Word is highly correlated with the many measures that are often used interchangeably in reading and dyslexia research.

In a second proof-of-concept study, we optimized the measurement scale with item response theory (IRT), added items tailored to younger participants, and deployed a new version of ROAR-Word that was half the length. Data from kindergarten, first, and second grade students revealed an exceptionally high correlation of r = 0.97 between ROAR-Word and WJ-Word-ID (Figure 24.2; Yeatman et al. 2021).

24.1.1.2 Additional validation against Woodcock Johnson Basic Reading Skills

The validation studies published in Yeatman et al. (2021) provided strong initial evidence that ROAR-Word accurately taps into the construct of single word reading ability. However, the sample in Yeatman et al. (2021) was recruited to participate in research studies in the Yeatman Lab and was therefore not representative of the diversity of students in the United States. Hence, we undertook a series of additional validation studies in collaboration with school districts that had adopted ROAR. Figure 24.3 shows the age distribution and Table 24.1 shows the demographics of the students who participated in these validation studies.

                                           N       %   % Missing
Female                                   558   38.67       18.30
Free or Reduced Lunch                     58    4.02       86.21
Race/Ethnicity
  Hispanic Ethnicity                     334   23.15       24.74
  White                                  719   49.83       25.85
  Black or African American               32    2.22       25.85
  Asian                                  236   16.35       25.85
  American Indian or Alaska Native        14    0.97       25.85
  Hawaiian or Other Pacific Islander      16    1.11       25.85
  Multiracial                             55    3.81       25.92
Total                                   1443
Table 24.1: Demographics of participants in concurrent validity study of Woodcock Johnson Basic Reading Skills (WJ BRS) and ROAR-Word
Figure 24.3: Age distribution of concurrent validity study of Woodcock Johnson Basic Reading Skills (WJ BRS) and ROAR-Word

Figure 24.4 shows the relationship between ROAR-Word raw scores and Woodcock Johnson Letter Word Identification (WJ LWID) assessed at the same time point. Figure 24.5 shows this relationship broken down by grade. The relationship is strongest in early elementary school; it weakens in later grades because many students are at ceiling in single word reading by late elementary school. Table 24.2 reports the correlations separately for each grade.
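The effect of a ceiling on an observed correlation can be illustrated with a small simulation. The sketch below is not part of the ROAR analyses: the sample size, noise level, ceiling, and the function names `pearson` and `simulate_ceiling` are hypothetical choices made only for illustration. Two noisy measures of the same simulated ability are correlated once as-is and once after one of them is capped at a ceiling.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_ceiling(n=5000, noise_sd=0.5, ceiling=0.0, seed=0):
    """Return (r_without_ceiling, r_with_ceiling) for simulated scores.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0, 1) for _ in range(n)]
    # Two tests that measure the same ability with independent noise
    test_a = [t + rng.gauss(0, noise_sd) for t in ability]
    test_b = [t + rng.gauss(0, noise_sd) for t in ability]
    # Scores on test B cannot exceed the ceiling
    test_b_capped = [min(score, ceiling) for score in test_b]
    return pearson(test_a, test_b), pearson(test_a, test_b_capped)


r_full, r_capped = simulate_ceiling()
print(f"without ceiling: r = {r_full:.2f}; with ceiling: r = {r_capped:.2f}")
```

In this simulated setting the capped correlation comes out lower than the uncapped one even though both tests measure the same underlying ability, which is the pattern the by-grade comparison is meant to account for.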

Figure 24.4: ROAR-Word is highly correlated with Woodcock Johnson Letter Word Identification (WJ LWID)
Figure 24.5: Correlations between ROAR-Word and Woodcock Johnson Letter Word Identification across different grade bands.

When comparing ROAR-Word scores with individually administered Woodcock Johnson (WJ) assessments, the correlation is strongest in early elementary school and decreases in middle school and high school (Table 24.2). This decrease is likely driven by the fact that many middle school and high school students are at ceiling in single word reading, which restricts the range of WJ scores. Another way to consider this is through the lens of reliability: the maximum correlation that can be observed between two different measurements (e.g., ROAR and WJ) is limited by the reliability of each measure. For example, if we imagine a scenario where ROAR and WJ are measuring the exact same thing (i.e., a true correlation of r = 1.0), the correlation we would expect to observe in a study would still be capped by the reliabilities of the two measures. Thus, we compute the disattenuated correlation, which allows us to estimate the true relationship between ROAR-Word and WJ while correcting for the decrease in WJ reliability in the older grades (Muchinsky 1996).
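The classical correction for attenuation can be written as follows, where \(r_{xy}\) is the observed correlation between two measures, \(r_{xx}\) and \(r_{yy}\) are their reliabilities, and \(r_{T_x T_y}\) is the correlation between their true scores. This notation is introduced here for illustration and is an assumption rather than a quotation from the cited source.

\[
r_{xy} = r_{T_x T_y}\,\sqrt{r_{xx}\, r_{yy}}
\qquad\Longrightarrow\qquad
r_{T_x T_y} = \frac{r_{xy}}{\sqrt{r_{xx}\, r_{yy}}}
\]

Under this formula, even a true correlation of 1.0 yields an observed correlation of only \(\sqrt{r_{xx}\, r_{yy}}\). For example, two measures with hypothetical reliabilities of 0.90 and 0.80 could correlate at most \(\sqrt{0.72} \approx 0.85\).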

Since the two WJ subtests (Letter Word Identification and Word Attack) are administered together as indicators of the Basic Reading Skills latent construct, we used the correlation between the two WJ subtests as an estimate of WJ reliability in each age bin. When examining WJ reliability in each age bin we found, as expected, a decrease in reliability in older grades. Disattenuated correlations between ROAR-Word and WJ (correcting for measurement error) demonstrated that the decrease in correlation between ROAR-Word and WJ in older grades was mostly explained by ceiling effects that impacted WJ reliability in these grades. Table 24.3 shows disattenuated correlations that correct for differences in reliability in each sample.
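The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used for this chapter: the function names are hypothetical, and the ROAR-Word reliability of 0.90 is a placeholder (the actual values are reported in Chapter 16). The observed correlation (0.71) and WJ reliability (0.84) are the kindergarten values from Table 24.2.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def disattenuate(r_observed, reliability_x, reliability_y):
    """Classical correction for attenuation: r / sqrt(rel_x * rel_y)."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


def disattenuated_roar_wj(roar, wj_word_id, wj_word_attack, roar_reliability):
    """Disattenuated ROAR-Word vs. WJ Word ID correlation for one grade bin.

    WJ reliability is estimated as the correlation between the two WJ
    subtests, following the description in the text.
    """
    wj_reliability = pearson(wj_word_id, wj_word_attack)
    r_observed = pearson(roar, wj_word_id)
    return disattenuate(r_observed, roar_reliability, wj_reliability)


# Kindergarten summary values from Table 24.2; 0.90 is a hypothetical
# placeholder for ROAR-Word reliability, NOT the value from Chapter 16.
example = disattenuate(r_observed=0.71, reliability_x=0.90, reliability_y=0.84)
print(f"disattenuated r = {example:.2f}")
```

Because the ROAR-Word reliability here is a placeholder, the printed value is not expected to reproduce the corresponding entry in Table 24.3.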

Grade           Word ID  Word Attack  Basic Reading Skills  WJ Reliability     N
Kindergarten       0.71         0.71                  0.70            0.84   230
1                  0.79         0.79                  0.79            0.81   257
2                  0.78         0.78                  0.78            0.82   344
3                  0.76         0.76                  0.76            0.83   222
4                  0.70         0.70                  0.69            0.87    72
5                  0.69         0.69                  0.62            0.85    48
6-8                0.64         0.64                  0.63            0.78    83
9-12               0.67         0.67                  0.68            0.83   180
Table 24.2: Pearson correlations (\(\rho\)) between ROAR-Word and Woodcock Johnson Scores by grade. Basic Reading Skills is calculated by summing Word ID and Word Attack subtests. WJ Reliability is calculated as the Pearson correlation between Word ID and Word Attack subtests. This reliability metric gives an upper bound on the correlation for ROAR-Word with either subtest.

Grade           Word ID  Word Attack     N
Kindergarten       0.86         0.86   230
1                  0.92         0.92   257
2                  0.89         0.89   344
3                  0.86         0.86   222
4                  0.77         0.77    72
5                  0.77         0.77    48
6-8                0.75         0.75    83
9-12               0.77         0.77   180
Table 24.3: Disattenuated correlations between ROAR-Word and Woodcock Johnson Scores by grade. The correlation between WJ Word ID and WJ Word Attack was used as an estimate of WJ reliability in each sample and ROAR-Word reliability was taken from Chapter 16.

Figure 24.6 shows concurrent validity data comparing ROAR-Word and WJ split by demographic groups. The relationship between ROAR-Word and WJ is similar across different demographics.

Figure 24.6: Correlations between ROAR-Word and Woodcock Johnson Letter Word Identification split by demographics

24.1.2 FastBridge

24.1.2.1 Background: Published studies

The Formative Assessment System for Teachers (FAST) from FastBridge Learning is a screener and curriculum-based measure widely used across many schools in the United States. In a published validation study of the new, shortened computer adaptive version of ROAR-Word (ROAR-CAT), which is now the current standard of practice, we compared the \(\theta\) estimates from ROAR-Word against the individually administered FAST™ earlyReading measure and found a correlation of r = 0.89 in 1st grade and r = 0.73 in 2nd grade. This initial published study had a small sample but provided strong evidence of construct validity for the new computer adaptive measure. Figure 24.7 reproduces Figure 10 from Ma et al. (2023), which shows convergent validity of the shortened CAT version of ROAR-Word with FastBridge and Fountas & Pinnell. The following sections undertake similar analyses with a much larger sample that includes multiple school districts.

24.1.2.2 Additional validation against FastBridge

In collaboration with two large and diverse school districts in the State of California, we ran a study of concurrent validity to compare ROAR against FastBridge. Table 24.4 shows the demographics of the sample.

Table 24.4: Demographics of concurrent validity study comparing ROAR-Word and FastBridge
                                           N       %   % Missing
Female                                  1688   50.34        0.66
Free or Reduced Lunch                    520   15.51       10.98
English Learner                          493   14.70       10.98
Special Education Status                 113    3.37       10.92
Race/Ethnicity
  Hispanic Ethnicity                     702   20.94        0.03
  White                                 1207   36.00        0.03
  Black or African American               76    2.27        0.03
  Asian                                  919   27.41        0.03
  American Indian or Alaska Native        20    0.60        0.03
  Hawaiian or Other Pacific Islander      18    0.54        0.03
  Multiracial                            539   16.08        0.03
Total                                   3353

We compared ROAR-Word scores against the following FastBridge measures administered within a month (concurrent validity):

  • FastBridge Curriculum Based Measurement for Reading (FAST™ CBMreading) is an Oral Reading Fluency (ORF) measure where students read a leveled passage out loud for one minute. The FastBridge technical manual states “CBMreading is a simple, efficient, evidence-based assessment used for universal screening in grades 1 through 8, and progress monitoring for grades 1-12” (Christ and Colleagues 2018, 14). Words Read Correct, or WRC, “is the primary metric used in reporting student performance on FAST™ CBMreading” (Christ and Colleagues 2018, 19). This measure includes scores on three separate passages as well as a composite score.
  • FAST™ earlyReading is designed to measure component skills of reading in kindergarten and first grade (Christ and Colleagues 2018, 30). It includes Sight Word (real word list) and Nonsense Word (pseudoword list) decoding measures.

Figure 24.8 shows the relationship between FAST™ CBMreading, FAST™ earlyReading and ROAR-Word scores. Table 24.5 reports the Pearson correlations between each measure. Correlations between all the measures were exceptionally high and the correlation between ROAR-Word and FastBridge was almost as high as the internal consistency of FastBridge measures.

(a) ROAR-Word is correlated with FAST™ CBMreading Oral Reading Fluency
(b) ROAR-Word is correlated with FAST™ earlyReading Composite
Figure 24.8: ROAR-Word is highly correlated with FAST™ CBMreading and FAST™ earlyReading

                          ROAR-Word  ORF Passage 1  ORF Passage 2  ORF Passage 3  ORF Composite  Nonsense Words  Sight Words  earlyReading Composite
ROAR-Word                      1.00           0.82           0.82           0.82           0.83            0.75         0.78                    0.76
ORF Passage 1                  0.82           1.00           0.96           0.96           0.98            0.86         0.87                    0.90
ORF Passage 2                  0.82           0.96           1.00           0.96           0.99            0.84         0.87                    0.90
ORF Passage 3                  0.82           0.96           0.96           1.00           0.98            0.84         0.87                    0.91
ORF Composite                  0.83           0.98           0.99           0.98           1.00            0.85         0.87                    0.91
Nonsense Words                 0.75           0.86           0.84           0.84           0.85            1.00         0.81                    0.86
Sight Words                    0.78           0.87           0.87           0.87           0.87            0.81         1.00                    0.85
earlyReading Composite         0.76           0.90           0.90           0.91           0.91            0.86         0.85                    1.00
Table 24.5: Convergent validity of ROAR-Word: Comparison to FastBridge

Table 24.6 shows the correlation between FAST™ earlyReading and ROAR-Word in kindergarten (ORF is not typically administered until first grade). Table 24.7 shows the correlations for first grade, and Table 24.8 shows the correlations for second grade.
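Splitting a sample by grade and recomputing the pairwise correlations within each grade can be sketched as follows. This is a minimal illustration, not the code used to produce the tables in this chapter: the record fields (`grade`, `roar_word`, `orf_composite`, `early_reading`), the function names, and the scores are all made up, and missing scores are handled with pairwise-complete observations as an assumption.

```python
import math
from collections import defaultdict
from itertools import combinations


def pearson(x, y):
    """Pearson correlation; returns None if either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlations_by_grade(records, measures, min_n=3):
    """Return {grade: {(measure_a, measure_b): r or None}}.

    Each pair uses only students with both scores present
    (pairwise-complete observations); None plays the role of NA.
    """
    by_grade = defaultdict(list)
    for record in records:
        by_grade[record["grade"]].append(record)

    result = {}
    for grade, rows in by_grade.items():
        result[grade] = {}
        for a, b in combinations(measures, 2):
            pairs = [(row[a], row[b]) for row in rows
                     if row.get(a) is not None and row.get(b) is not None]
            if len(pairs) < min_n:
                result[grade][(a, b)] = None  # too few paired scores
            else:
                xs, ys = zip(*pairs)
                result[grade][(a, b)] = pearson(xs, ys)
    return result


# Made-up records for illustration only (not ROAR or FastBridge data).
MEASURES = ["roar_word", "orf_composite", "early_reading"]
records = [
    {"grade": "1", "roar_word": 10, "orf_composite": 20, "early_reading": 31},
    {"grade": "1", "roar_word": 14, "orf_composite": 35, "early_reading": 40},
    {"grade": "1", "roar_word": 18, "orf_composite": 41, "early_reading": 44},
    {"grade": "1", "roar_word": 25, "orf_composite": 60, "early_reading": 58},
    {"grade": "2", "roar_word": 20, "orf_composite": 55, "early_reading": None},
    {"grade": "2", "roar_word": 26, "orf_composite": 70, "early_reading": None},
    {"grade": "2", "roar_word": 30, "orf_composite": 72, "early_reading": None},
    {"grade": "2", "roar_word": 35, "orf_composite": 90, "early_reading": None},
]

tables = correlations_by_grade(records, MEASURES)
for grade, pairs in tables.items():
    for (a, b), r in pairs.items():
        shown = "NA" if r is None else f"{r:.2f}"
        print(f"grade {grade}: {a} vs {b}: {shown}")
```

In this toy data the second-grade students have no `early_reading` score, so every pair involving that measure is reported as NA, mirroring how a measure that was not administered in a grade appears in a by-grade correlation table.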

                          ROAR-Word  Nonsense Words  Sight Words  earlyReading Composite
ROAR-Word                      1.00            0.60         0.63                    0.58
Nonsense Words                 0.60            1.00         0.81                    0.90
Sight Words                    0.63            0.81         1.00                    0.90
earlyReading Composite         0.58            0.90         0.90                    1.00
Table 24.6: Convergent validity of ROAR-Word: Comparison to FAST™ earlyReading in kindergarten

                          ROAR-Word  ORF Passage 1  ORF Passage 2  ORF Passage 3  ORF Composite  Nonsense Words  Sight Words  earlyReading Composite
ROAR-Word                      1.00           0.81           0.83           0.82           0.83            0.73         0.77                    0.79
ORF Passage 1                  0.81           1.00           0.96           0.96           0.98            0.86         0.87                    0.90
ORF Passage 2                  0.83           0.96           1.00           0.97           0.99            0.84         0.87                    0.90
ORF Passage 3                  0.82           0.96           0.97           1.00           0.99            0.84         0.87                    0.91
ORF Composite                  0.83           0.98           0.99           0.99           1.00            0.85         0.87                    0.91
Nonsense Words                 0.73           0.86           0.84           0.84           0.85            1.00         0.79                    0.87
Sight Words                    0.77           0.87           0.87           0.87           0.87            0.79         1.00                    0.87
earlyReading Composite         0.79           0.90           0.90           0.91           0.91            0.87         0.87                    1.00
Table 24.7: Convergent validity of ROAR-Word: Comparison to FAST™ earlyReading in first grade

                          ROAR-Word  ORF Passage 1  ORF Passage 2  ORF Passage 3  ORF Composite  earlyReading Composite
ROAR-Word                      1.00           0.80           0.79           0.78           0.80                      NA
ORF Passage 1                  0.80           1.00           0.95           0.95           0.98                      NA
ORF Passage 2                  0.79           0.95           1.00           0.95           0.98                      NA
ORF Passage 3                  0.78           0.95           0.95           1.00           0.98                      NA
ORF Composite                  0.80           0.98           0.98           0.98           1.00                      NA
earlyReading Composite           NA             NA             NA             NA             NA                      NA
Table 24.8: Convergent validity of ROAR-Word: Comparison to FAST™ earlyReading in second grade

References

Christ, Theodore, and Colleagues. 2018. Formative Assessment System for Teachers™ Technical Manual. Author and FastBridge Learning.
Fletcher, J M, G R Lyon, L S Fuchs, and M A Barnes. 2006. Learning Disabilities: From Identification to Intervention. New York: Guilford Press.
Ma, Wanjing A, Adam Richie-Halford, Amy Burkhardt, Klint Kanopka, Clementine Chou, Benjamin Domingue, and Jason D Yeatman. 2023. “ROAR-CAT: Rapid Online Assessment of Reading Ability with Computerized Adaptive Testing.”
Muchinsky, Paul M. 1996. “The Correction for Attenuation.” Educational and Psychological Measurement 56 (1): 63–75.
Yeatman, Jason D, Kenny An Tang, Patrick M Donnelly, Maya Yablonski, Mahalakshmi Ramamurthy, Iliana I Karipidis, Sendy Caffarra, Megumi E Takada, Klint Kanopka, Michal Ben-Shachar, and Benjamin W Domingue. 2021. “Rapid Online Assessment of Reading Ability.” Scientific Reports 11 (1): 6396.