11 ROAR-Morphology Assessment Design
11.1 Structure of the task and design of the items
ROAR-Morphology uses a four-alternative forced choice sentence completion task adapted from the Real Word Suffix task (Tyler and Nagy 1989; Goodwin, Petscher, and Reynolds 2021). Students see a sentence with a blank (e.g., “The ________ kitten jumped on the string”) and choose the correct word form from four morphologically-related options derived from the same base word (e.g., “play, played, player, playful”). Each item includes the target (correct) word, the untransformed base word, and two morphologically complex distractors representing incorrect transformations. Here are two examples.
This systematic distractor creation provides valuable diagnostic information that can be interpreted through error pattern analysis while maintaining construct validity. The sentence-context format offers greater ecological validity than decontextualized word manipulation tasks, as it better represents how students encounter morphologically complex words during authentic reading experiences (Carlisle 2000; Nagy, Carlisle, and Goodwin 2013). Real words are used rather than pseudowords to engage students’ existing lexical knowledge in ways that mirror authentic reading comprehension processes.
11.2 Methodological Controls
The assessment was originally designed for use in grades 2–5, representing a critical window during which students increasingly encounter complex text with greater morphological load in independent reading contexts (Chall 1983; Berko 1958; Carlisle and Nomanbhoy 1993). Work is currently underway to extend the assessment downward to kindergarten and first grade and upward through middle school.
To ensure construct validity and minimize construct-irrelevant variance, several methodological controls were implemented across all items. To keep vocabulary appropriate, 80% of base words have an estimated age of acquisition (AoA) rating of 8 or below according to the Kuperman, Stadthagen-Gonzalez, and Brysbaert (2012) norms, so that vocabulary knowledge would not impede the assessment of morphological skills for students aged 8 and above, the age of a typical second grader. Base-word AoA ranged from below kindergarten age (2.61 years) to about fifth grade (11.22 years), with an average of 6.2 years. Word frequency norms were based on the SUBTLEXus corpus (Brysbaert and New 2009).
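The AoA criterion can be checked with a simple proportion. The sketch below is illustrative only: the function name is hypothetical, and the ratings are made-up values (apart from the reported minimum of 2.61 and maximum of 11.22), not the actual base-word norms.

```python
def share_at_or_below(aoa_ratings, threshold=8.0):
    """Return the proportion of AoA ratings at or below `threshold`."""
    if not aoa_ratings:
        raise ValueError("aoa_ratings must not be empty")
    return sum(1 for rating in aoa_ratings if rating <= threshold) / len(aoa_ratings)


# Hypothetical ratings for five base words; only 2.61 and 11.22 appear in the text.
hypothetical_aoa = [2.61, 4.5, 5.9, 7.2, 11.22]
share = share_at_or_below(hypothetical_aoa)
print(f"{share:.0%} of base words have AoA <= 8")
```

In this toy list, four of the five ratings fall at or below 8, which would meet an 80% criterion.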
Sentence complexity was carefully controlled to minimize syntactic processing demands. The assessment used primarily short sentences (fewer than six words) containing one main clause with only one finite verb. All items were written at or below a Flesch-Kincaid grade level of 1.2, ensuring readability appropriate for second graders. The first word in a sentence was never the target, avoiding conflation of morphological knowledge with sentence parsing demands.
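For reference, the Flesch-Kincaid grade level combines average sentence length with average syllables per word. The sketch below applies the standard published formula to counts supplied by the caller. The function name and the example sentence are illustrative, and syllable counts are passed in rather than estimated, because automatic syllable counting is itself a heuristic. The tooling actually used to compute readability for the items is not specified here.

```python
def flesch_kincaid_grade(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Standard Flesch-Kincaid grade level computed from raw counts."""
    if n_words <= 0 or n_sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    words_per_sentence = n_words / n_sentences
    syllables_per_word = n_syllables / n_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical sentence (not an assessment item): "The cat sat on the mat."
# It has 6 words, 1 sentence, and 6 syllables.
grade = flesch_kincaid_grade(n_words=6, n_sentences=1, n_syllables=6)
print(round(grade, 2))
```

Short sentences built from one-syllable words yield very low (even negative) grade levels, which is why the formula rewards the short, single-clause frames described above.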
Morphological variation was systematically implemented across the item pool (see Table 11.1). Items included a mix of inflectional shifts (32%) and derivational shifts (68%), and suffix selection varied between common suffixes (65% of items) and less common suffixes (35% of items) based on established frequency counts (Warren et al. 1973; Honig et al. 2018; White, Sowell, and Yanagihara 1989).
| Word Type | Suffix Frequency | Number of Items | Percentage |
|---|---|---|---|
| Inflectional | Common | 8 | 20% |
| Inflectional | Less Common | 5 | 12% |
| Derivational | Common | 18 | 45% |
| Derivational | Less Common | 9 | 23% |
| Total | | 40 | 100% |
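The marginal percentages quoted in the text follow directly from the cell counts in Table 11.1. The short sketch below recomputes them; the variable and function names are illustrative. It shows that the reported 32% and 68% are 32.5% and 67.5% before rounding.

```python
# Item counts taken from Table 11.1: (word type, suffix frequency) -> number of items.
counts = {
    ("Inflectional", "Common"): 8,
    ("Inflectional", "Less Common"): 5,
    ("Derivational", "Common"): 18,
    ("Derivational", "Less Common"): 9,
}
total = sum(counts.values())


def share(position: int, label: str) -> float:
    """Percentage of items whose key has `label` at tuple index `position`."""
    matching = sum(n for key, n in counts.items() if key[position] == label)
    return 100 * matching / total


inflectional = share(0, "Inflectional")
derivational = share(0, "Derivational")
common = share(1, "Common")
less_common = share(1, "Less Common")
print(total, inflectional, derivational, common, less_common)
```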
Second grade serves as the optimal starting point for morphological screening. Grades 2–5 represent a critical window, as students in this range increasingly encounter complex text with greater morphological load in independent reading contexts (Chall 1983; Berko 1958; Carlisle and Nomanbhoy 1993). The assessment was therefore designed to identify students across grades 2–5 who may struggle with morphological knowledge, with second grade as a developmentally appropriate point at which that knowledge becomes increasingly important for reading success. The item bank is being expanded to better serve additional grade levels.
All items underwent review by an expert panel consisting of trained researchers and educators specializing in literacy development and morphological knowledge. Panel members evaluated each item for construct alignment, developmental appropriateness, linguistic accuracy, and cultural sensitivity. Items were revised based on panel feedback to ensure they accurately reflected the intended morphological transformation, maintained contextual validity within sentence frames, and presented clear answer choices consistent with the distractor logic.
11.3 Scoring
ROAR-Morphology uses dichotomous scoring: each response is scored as correct or incorrect, and students receive full credit for each correct answer. Student responses map to developmental waypoints, levels that describe a student's stage of morphological knowledge and give educators clear guidance for instructional planning.
Raw Scores: The student’s raw score represents the number of correct responses out of the total items completed, ranging from 0 to 40 for the current item pool.
Developmental Waypoints: Student responses map to construct waypoints based on their morphological understanding:
Waypoint 3 (Strategic): Above -0.50 logits - Demonstrates full range of morphological shifts, including derivational shifts in challenging contexts
Waypoint 2 (Developing): -2.00 to -0.50 logits - Shows ability for morphological shifts even with mid-frequency words and attractive distractors
Waypoint 1 (Emerging): -3.00 to just below -2.00 logits - Shows some success with morphological shifts, especially inflectional shifts in simple contexts
Waypoint 0 (Not Yet Evident): Below -3.00 logits - Does not yet demonstrate application of morphological knowledge
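The waypoint ranges above amount to three cut points on the logit scale. The sketch below maps an ability estimate to a waypoint, assuming cut points at -3.00, -2.00, and -0.50. The function name is hypothetical, and the handling of a score that falls exactly on a cut point is an assumption of this sketch, chosen to agree with "above -0.50" for Strategic and "below -3.00" for Not Yet Evident.

```python
LABELS = {0: "Not Yet Evident", 1: "Emerging", 2: "Developing", 3: "Strategic"}


def waypoint(theta: float) -> int:
    """Map an ability estimate in logits to a waypoint number (0-3).

    Assumed cut points: -3.00, -2.00, -0.50. Assigning scores that sit
    exactly on a cut point is an assumption, not a documented rule.
    """
    if theta < -3.00:
        return 0  # Not Yet Evident: below -3.00
    if theta < -2.00:
        return 1  # Emerging: -3.00 up to (not including) -2.00
    if theta <= -0.50:
        return 2  # Developing: -2.00 through -0.50
    return 3  # Strategic: above -0.50


for estimate in (-3.5, -2.5, -1.0, 0.3):
    print(estimate, LABELS[waypoint(estimate)])
```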
Future Score Types:
Percentile Scores: Based on ROAR norms (in development)
Standard Scores: Age-standardized scores (planned for future versions)
As with other ROAR assessments, computer adaptive testing will be implemented to further improve assessment efficiency while maintaining measurement precision. Future enhancements may include more detailed breakdowns by morphological pattern type, providing educators with increasingly targeted instructional guidance.
11.4 Assessment Development
The ROAR-Morphology assessment was developed using the BEAR Assessment System framework (Wilson 2023, 2004) to systematically capture morphological development from basic inflectional patterns to complex derivational transformations. This approach enabled us to create items targeting specific developmental stages, establish clear developmental waypoints, and implement psychometric modeling that places students and assessment items on the same measurement scale.
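To make "the same measurement scale" concrete: one common way to place students and items on a shared logit scale is the Rasch model, in which the probability of a correct response depends only on the difference between student ability and item difficulty. This section does not name the specific model used, so the sketch below is an assumption for illustration, with hypothetical values.

```python
import math


def p_correct(theta: float, difficulty: float) -> float:
    """Rasch-model probability of a correct response (assumed model, for illustration)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# Hypothetical student at -1.0 logits facing two hypothetical items.
matched = p_correct(theta=-1.0, difficulty=-1.0)  # item difficulty equals ability
easier = p_correct(theta=-1.0, difficulty=-3.0)   # item is 2 logits easier
print(round(matched, 2), round(easier, 2))
```

Under this model, a student whose ability equals an item's difficulty has an even chance of answering correctly, and the chance rises as the item becomes easier relative to the student.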
While ROAR-Morphology was initially developed and validated for students in grades 2-5, ongoing field testing has demonstrated its utility with students across a broader range of developmental levels. Current development efforts include expanding the item bank to optimize measurement precision across elementary and middle school grades.
11.4.1 Construct Mapping Process
Building on the developmental progression outlined in our theoretical framework, our construct map operationalizes the established sequence from inflectional to derivational morphology and from common to less common suffixes into four distinct waypoints. The ROAR-Morphology construct map was developed through an iterative, evidence-centered design process that integrated theoretical understanding with empirical validation.
11.4.2 Guiding Theoretical Considerations
The construct map development translated established theoretical principles into operational assessment features. Three key design considerations guided this process:
Developmental progression: The documented progression from inflectional to derivational morphology (Berko 1958; Carlisle and Nomanbhoy 1993) informed waypoint boundaries and the expected difficulty ordering of morphological transformations.
Frequency effects: Evidence that common suffixes are mastered before less common ones (Deacon 2008) guided the distribution of suffix types across difficulty levels.
Contextual validity: Evidence that sentence contexts support morphological processing in authentic reading (Nagy, Carlisle, and Goodwin 2013) justified the sentence completion format as the basis for the construct map.
These considerations provided the foundation for operationalizing morphological knowledge into a systematic construct map with distinct developmental waypoints.
11.4.3 Waypoint Development
The construct map includes four waypoints representing qualitatively distinct levels of morphological knowledge development, detailed in the table below.
| Application of Morphological Knowledge | Description |
|---|---|
| Strategic (above -0.50 logits) | Demonstrates a full range of ability for morphological shifts, including derivational shifts in challenging contexts (e.g., when the target word is lower in frequency and two derivational distractors are present). |
| Developing (from -2.00 to -0.50 logits) | Demonstrates some range of ability for morphological shifts even when target words are in the mid-range of frequency and attractive derivational distractors are present. |
| Emerging (from -3.00 to just below -2.00 logits) | Demonstrates some success with morphological shifts, especially with inflectional shifts in a simple context (when the target word is high frequency and only one or no derivational distractor is presented). |
| Not Yet Evident (below -3.00 logits) | Does not yet demonstrate the application of morphological knowledge. |
11.4.4 Item Design Methodology
The item design for ROAR-Morphology systematically operationalizes the Morphological Pathways Framework by targeting multiple pathways through which morphological knowledge contributes to reading development.
11.4.4.1 Task Format Selection
Building on the Real Word Suffix task approach described above, the four-alternative forced-choice format was selected and refined using the ordered multiple-choice distractor logic of Briggs et al. (2006). Several converging reasons support this choice on both psychometric and practical grounds. Sentence-context tasks offer greater ecological validity than decontextualized word manipulation tasks, as they better represent how students encounter morphologically complex words during authentic reading experiences (Carlisle 2000; Nagy, Carlisle, and Goodwin 2013). The systematic distractor design provides diagnostic information about error patterns while maintaining construct validity, enabling educators to understand not just what students got wrong, but how their morphological processing may have led to specific errors. From a practical standpoint, the multiple-choice format enables automated scoring and efficient group administration, making the assessment feasible for large-scale implementation. The format is also familiar to students and requires minimal test-taking skills beyond the target morphological knowledge, reducing construct-irrelevant variance that might interfere with accurate measurement.
11.4.4.2 Systematic Item Features
Each item was systematically crafted to assess specific aspects of the Morphological Pathways Framework by targeting the multiple routes through which morphological knowledge contributes to reading development. The direct pathway to reading comprehension is assessed through contextual sentence frames that require semantic understanding of morphological relationships, ensuring that students must understand both the meaning and appropriate usage of morphological transformations.
The indirect pathway via word reading is captured through systematic variation of word types, with items requiring either inflectional or derivational shifts that represent different levels of orthographic-morphological complexity.
The indirect pathway via vocabulary knowledge is controlled through carefully selected word frequency (Brysbaert and New 2009) and age-of-acquisition parameters (Kuperman, Stadthagen-Gonzalez, and Brysbaert 2012) (see Section 11.2 for detailed specifications), ensuring that vocabulary knowledge does not impede assessment of morphological skills while still representing authentic reading demands.
11.4.4.3 Distractor Logic
Each item follows a systematic distractor structure including four morphologically-related options derived from the same base word:
The target word (correct response)
The untransformed base word
Two morphologically complex distractors representing incorrect transformations
For example: “She is _______ pizza with cheese” with ‘eat’ as the base word includes ‘eating’ (correct), ‘eat’ (untransformed base), ‘eaten’ (wrong inflectional suffix), and ‘eater’ (wrong derivational suffix).
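The distractor structure lends itself to a small data representation in which each option carries a role, so that a response can be scored dichotomously and, when incorrect, attributed to an error type. The sketch below uses the "eat" example from the text. The class, the role labels, and the student responses are hypothetical and do not describe the ROAR implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Item:
    sentence: str
    options: dict  # option text -> role label (labels are illustrative)

    def role_of(self, response: str) -> str:
        """Return the role of the chosen option."""
        return self.options[response]

    def score(self, response: str) -> int:
        """Dichotomous score: 1 for the target word, 0 otherwise."""
        return 1 if self.options[response] == "target" else 0


eat_item = Item(
    sentence="She is _______ pizza with cheese",
    options={
        "eating": "target",
        "eat": "base",
        "eaten": "wrong_inflectional",
        "eater": "wrong_derivational",
    },
)

# Hypothetical responses from four students to the same item.
responses = ["eating", "eat", "eater", "eating"]
raw = sum(eat_item.score(r) for r in responses)
errors = Counter(eat_item.role_of(r) for r in responses if eat_item.score(r) == 0)
print(raw, dict(errors))
```

Tallying incorrect responses by role is one way the error-pattern information described above could be summarized across students or items.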
11.4.4.4 Expert Panel Review Process
All items underwent comprehensive expert review by trained researchers and educators specializing in literacy development and morphological knowledge. Panel members evaluated items across four critical dimensions:
Construct alignment: Accurate measurement of intended morphological knowledge
Developmental appropriateness: Suitability for second- through fifth-grade students
Linguistic accuracy: Morphological transformations adhering to English language patterns
Cultural sensitivity: Freedom from bias or inappropriate content; consideration of dialectal differences
11.4.5 Assessment Refinement
ROAR-Morphology underwent systematic refinement through multiple iterations to optimize its measurement of morphological knowledge across developmental levels. This refinement process focused specifically on enhancing the morphology-specific features that distinguish this assessment within the broader ROAR suite.
11.4.6 Morphological Construct Map Refinement
The initial construct map required empirical validation to ensure that theoretical predictions about morphological development aligned with actual student performance patterns. Early pilot data revealed that the progression from inflectional to derivational morphology was more nuanced than initially conceptualized, leading to refinements in the waypoint descriptions and score boundaries for those waypoints. Specifically, the construct map was adjusted to better capture the developmental progression where students demonstrated varying competence with derivational forms depending on suffix frequency, with some students showing success with common derivational patterns before mastering all inflectional forms. This finding led to a revision of the waypoints to emphasize the complexity of morphological knowledge rather than strictly categorizing by inflectional versus derivational types. The refined waypoints now reflect empirically-observed performance patterns while maintaining theoretical coherence with established morphological development research.
11.5 Item Refinement Process
11.5.1 Distractor Effectiveness Analysis
Initial pilot testing revealed that certain morphological distractors were either too obvious or insufficiently attractive to provide diagnostic information. Items underwent systematic revision to optimize distractor effectiveness, with particular attention to ensuring that incorrect morphological transformations represented plausible student errors rather than arbitrary alternatives.
For example, early versions included distractors that violated basic English morphological patterns, which students easily eliminated regardless of their morphological knowledge. Revised distractors were designed to reflect common morphological processing errors, such as over-applying inflectional patterns or selecting semantically related but morphologically inappropriate forms.
11.5.2 Item Selection Criteria
From the initial pool of 45 candidate items, 40 items were retained based on morphology-specific performance criteria. Items were eliminated if they showed poor discrimination between students at adjacent waypoint levels or failed to contribute meaningfully to the overall measurement precision of the assessment. The final item selection prioritized items that provided clear diagnostic information about specific morphological processing abilities while maintaining appropriate difficulty progression across waypoint levels. Particular attention was paid to retaining items that effectively distinguished between students who rely primarily on lexical knowledge versus those who demonstrate systematic morphological knowledge.
11.5.3 Refinement Outcomes
The refinement process resulted in an assessment optimized for morphological knowledge measurement with improved construct validity and enhanced utility. The revised construct map better reflects empirically-observed developmental progressions, while the refined item pool provides more precise measurement across the full range of morphological development typically observed in elementary students.