20 Reliability of ROAR-Palabra

20.1 Background: Published studies

Bhat et al. (2024) reported a large study (N=1,337) examining the relationship between reading skills, math skills, and phonological awareness in Spanish-speaking students in Colombia. In this sample, the reported marginal reliability was 0.92 for ROAR-Palabra, 0.82 for ROAR-Frase, and 0.85 for ROAR-Fonema.

20.2 Criteria for identifying disengaged participants and flagging unreliable scores

To account for unreliable scores and disengaged participation (as discussed in Chapter 15 and shown in Chapter 11), participants with a median response time below 450 ms are flagged in ROAR-Score reports, and their data are excluded from analyses.
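The flagging rule above can be sketched in Python as follows. This is a minimal illustration rather than the ROAR implementation: the function name `flag_disengaged`, the input layout (a mapping from participant ID to a list of per-item response times in milliseconds), and the sample data are assumptions made for this example. Only the 450 ms median threshold comes from the text.

```python
from statistics import median

RT_THRESHOLD_MS = 450  # median response time threshold stated in the text


def flag_disengaged(response_times_ms, threshold_ms=RT_THRESHOLD_MS):
    """Return the IDs of participants whose median response time is below the threshold.

    `response_times_ms` maps a participant ID to a list of per-item
    response times in milliseconds (a hypothetical layout).
    """
    return {
        pid
        for pid, times in response_times_ms.items()
        if times and median(times) < threshold_ms
    }


# Made-up response times, for illustration only.
sample = {
    "p1": [300, 350, 420, 380],   # median 365 ms: flagged
    "p2": [700, 820, 640, 910],   # median 760 ms: kept
    "p3": [200, 1500, 460, 480],  # median 470 ms: kept
}
flagged = flag_disengaged(sample)
kept = {pid: times for pid, times in sample.items() if pid not in flagged}
```

Using the median rather than the mean keeps a single very slow or very fast response (as for `p3`) from deciding whether a participant is flagged.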

20.3 Reliability of fixed-length ROAR-Palabra

ROAR-Palabra runs as a fixed-length test, and scores are computed based on a Rasch model. The current version of ROAR-Palabra takes about 5 minutes (70 items). In the near future, a computer-adaptive version that leverages the full item bank will become available. At that point, more items can be administered for a more precise measure, or fewer items can be administered as a quick screener. Table 20.1 reports marginal reliability computed under the IRT model, based on data from 6034 students, for the standard 70-item version of ROAR-Palabra. Reliability (\(\rho_{xx^\prime}\)) is computed from the estimated variance of \(\hat{\theta}\) relative to the estimated error variance, that is, the squared standard error (\(\widehat{SE}(\hat{\theta})^2\)), using Equation 20.1:

\[ \hat{\rho}_{xx^\prime} = \frac{\widehat{VAR}(\hat{\theta})}{\widehat{VAR}(\hat{\theta}) + \widehat{SE}(\hat{\theta})^2}. \tag{20.1}\]
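As a rough illustration of Equation 20.1, the Python sketch below computes the reliability ratio from per-student ability estimates and standard errors. It rests on two assumptions not stated in the text: the \(\widehat{SE}(\hat{\theta})^2\) term is taken as the mean of the squared standard errors across students, and the variance of \(\hat{\theta}\) is the population variance. The function name `empirical_reliability` and the input values are hypothetical.

```python
from statistics import mean, pvariance


def empirical_reliability(theta_hat, se_theta):
    """Reliability as VAR(theta_hat) / (VAR(theta_hat) + mean SE^2).

    Assumptions for this sketch: population variance of the ability
    estimates, and the squared standard errors averaged across students.
    """
    if len(theta_hat) != len(se_theta):
        raise ValueError("theta_hat and se_theta must have the same length")
    var_theta = pvariance(theta_hat)
    mean_sq_se = mean([se ** 2 for se in se_theta])
    return var_theta / (var_theta + mean_sq_se)


# Made-up ability estimates and standard errors, for illustration only.
theta_hat = [-1.0, 0.0, 1.0]
se_theta = [0.5, 0.5, 0.5]
rho = empirical_reliability(theta_hat, se_theta)  # (2/3) / (2/3 + 1/4) = 8/11
```

The ratio approaches 1 when the standard errors are small relative to the spread of the ability estimates, and it drops as measurement error grows or as the sample becomes more homogeneous in ability.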

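| Grade | Empirical Reliability | N |
|---|---|---|
| All | 0.94 | 5408 |
| 1 | 0.75 | 412 |
| 2 | 0.88 | 638 |
| 3 | 0.91 | 515 |
| 4 | 0.91 | 510 |
| 5 | 0.91 | 580 |
| 6 | 0.89 | 589 |
| 7 | 0.84 | 509 |
| 8 | 0.83 | 423 |
| 9 | 0.81 | 412 |
| 10 | 0.79 | 421 |
| 11 | 0.79 | 398 |

Table 20.1: Reliability of ROAR-Palabra by Grade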

To ensure that ROAR-Palabra is fair and equitable for different demographic groups, we also report reliability by gender (Table 20.2), eligibility for free and reduced-price lunch (Table 20.3), English learner status based on State of California designations (Table 20.4), primary language spoken (Table 20.5), special education status (Table 20.6), ethnicity (Table 20.7), and race (Table 20.8).

| Gender | Empirical Reliability | N |
|---|---|---|
| All | 0.94 | 4349 |
| F | 0.94 | 2189 |
| M | 0.93 | 2160 |

Table 20.2: Reliability of ROAR-Palabra by Gender

| Free/Reduced Lunch Status | Empirical Reliability | N |
|---|---|---|
| All | 0.86 | 250 |
| Free | 0.86 | 120 |
| Paid | 0.86 | 84 |
| Reduced | 0.87 | 46 |

Table 20.3: Reliability of ROAR-Palabra by FRL (California Sub-sample Only)

| English Learner Status | Empirical Reliability | N |
|---|---|---|
| All | 0.86 | 250 |
| English Learner | 0.85 | 142 |
| English Only | 0.86 | 70 |
| Initial Fluent English Proficient | 0.86 | 22 |
| Reclassified Fluent English Proficient | 0.90 | 15 |

Table 20.4: Reliability of ROAR-Palabra by EL Status (California Sub-sample Only)

| Primary Language | Empirical Reliability | N |
|---|---|---|
| All | 0.86 | 239 |
| English | 0.87 | 123 |
| Spanish | 0.84 | 116 |

Table 20.5: Reliability of ROAR-Palabra by Primary Language (California Sub-sample Only)

| Special Education Status | Empirical Reliability | N |
|---|---|---|
| All | 0.86 | 250 |
| No | 0.86 | 234 |
| Yes | 0.83 | 16 |

Table 20.6: Reliability of ROAR-Palabra by Special Education Status (California Sub-sample Only)

| Hispanic Ethnicity | Empirical Reliability | N |
|---|---|---|
| All | 0.85 | 232 |
| No | 0.79 | 13 |
| Yes | 0.85 | 219 |

Table 20.7: Reliability of ROAR-Palabra by Hispanic Ethnicity (California Sub-sample Only)

| Race | Empirical Reliability | N |
|---|---|---|
| All | 0.85 | 232 |
| Asian | 0.89 | 2 |
| Black or African American | 0.00 | 2 |
| Hispanic | 0.85 | 219 |
| White | 0.83 | 8 |

Table 20.8: Reliability of ROAR-Palabra by Race (California Sub-sample Only)
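Subgroup breakdowns like those in the tables above can be sketched by applying the reliability ratio within each group. The Python example below is a hypothetical illustration, not the analysis code behind these tables: the record layout, the function name `reliability_by_group`, and the `min_n` guard are assumptions. The guard is included because a variance estimated from only a handful of students is very noisy, so estimates for very small groups are hard to interpret.

```python
from collections import defaultdict
from statistics import mean, pvariance


def reliability_by_group(records, min_n=30):
    """Compute the reliability ratio separately for each group.

    `records` is an iterable of (group, theta_hat, se) tuples (a
    hypothetical layout). Returns {group: (reliability or None, n)};
    reliability is None when a group has fewer than `min_n` students
    (an illustrative cutoff) or when the denominator is zero.
    """
    groups = defaultdict(list)
    for group, theta_hat, se in records:
        groups[group].append((theta_hat, se))

    results = {}
    for group, rows in groups.items():
        n = len(rows)
        if n < min_n:
            results[group] = (None, n)
            continue
        var_theta = pvariance([theta for theta, _ in rows])
        mean_sq_se = mean([se ** 2 for _, se in rows])
        total = var_theta + mean_sq_se
        results[group] = (var_theta / total if total > 0 else None, n)
    return results


# Made-up records, for illustration only.
records = [
    ("A", -1.0, 0.5), ("A", 0.0, 0.5), ("A", 1.0, 0.5),
    ("B", 0.2, 0.4),
]
by_group = reliability_by_group(records, min_n=3)
```

In this toy input, group `A` meets the cutoff and receives an estimate, while group `B` is reported with its count only.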

References

Bhat, Kruttika G., Alexa Mogan, Ana Saavedra, Mia Fuentes-Jimenez, Julian M. Siebert, Wanjing Anya Ma, Carrie Townley-Flores, et al. 2024. “Shared and Unique Influences of Phonological Processing on Reading and Math.” OSF Preprints. https://doi.org/10.31219/osf.io/em3bg.