Bilingual Semantics Assessment Design

Fifth in the series on Development of a Bilingual Test for Spanish-English Children.

Lisa Bedore

DOI: 10.1044/cred-pvd-c13010


The following is a transcript of the presentation video, edited for clarity. Presentation slides are available for download via the PDF button in the toolbar.

Originally presented at the ASHA Convention (November 2013) as part of the session Development of a Bilingual Test for Spanish-English Children: A Long and Winding Road. Videos in this series are:

  1. Challenges in Assessing Bilingual Populations

    (Elizabeth D. Peña, University of Texas at Austin)

  2. Steps in Test Development

    (Elizabeth D. Peña, University of Texas at Austin)

  3. Bilingual Phonology Assessment Design

    (Brian A. Goldstein, La Salle University)

  4. Bilingual Pragmatics Assessment Design

    (Aquiles Iglesias, Temple University)

  5. Bilingual Semantics Assessment Design

    (Lisa M. Bedore, University of Texas at Austin)

  6. Bilingual Morphosyntax Assessment Design

    (Vera F. Gutierrez-Clellen, San Diego State University)

Semantics, Primary Language Impairment, and Bilingualism

When we started working on the semantics test, not a lot was known about semantic development in bilingual children.

We started to think about what the relationship would be between semantics and primary language impairment. Some of the things that we know about the vocabulary development in children are that children with PLI have weaker semantic representations than their typically developing peers, and they have less semantic depth.

We also know that they require more exposures to learn new words.

But we also know that if we give them a single-word vocabulary test, their vocabulary knowledge may fall within the low-normal range.

If we think about what’s going on with bilingual children, we know that they have normal language learning ability. There’s no empirical reason to believe they have greater risk for language impairment than their monolingual peers.

We also know that their experiences across their two languages are shared. So the concepts that they have are shared across their two languages.

Because their experiences come in two languages, they have divided input. So each of their experiences with their words will be less reinforced. They’ll have fewer opportunities to hear and use those words, and so they may need more time to learn the phonotactics or to learn those words.

However, again, if we give children a single-word vocabulary test, we know that they’ll score in the low-normal range.

Assessment Approach

We knew that as we developed our semantics test, we would need to develop an item set that was challenging enough to differentiate children with and without language impairment, but it wouldn’t be dependent on the specific words that children knew in each of their languages. Because we know children are learning their different words at different times, they have different experiences. They’re not necessarily going to have the same vocabulary available in each of their languages.


We decided to take an approach of organizing our test items or test blueprint around several core areas that tap semantic knowledge across the children’s languages.

We drew this from the typically developing literature available for English-speakers at the time.

We had a number of different kinds of item types that we used.

  • Analogies. These are actually pretty challenging. For example, an item such as, “Hamburger is to plate as soup is to ___”

  • Descriptions. We would ask children to tell us things about an object they might be familiar with. So something like, “Tell me three things about a school bus.” Or, “Tell me three things about a truck.”

  • Category generation. We had items such as, “Tell me as many zoo animals as you can think of.” And we gave children a set amount of time to generate as many items as they could in that category.

  • Similarities and differences. For example, we showed children pictures of cards or invitations and we would ask children to tell us what makes these kinds of things go together. They might be the same color, or the same shape.

  • Item functions. We asked children to help us identify what we used different kinds of items for. “What is this pencil for?” or “What is a knife for?”

  • Associations. We asked children to tell us items that go with other categories. For example, we’d say, “Tell me something that goes with bird.” And we would expect the children to tell us that birds can fly.

  • Linguistic concepts. That’s an important school concept for children. Children might be asked to identify the color or shape of something like a balloon or box.

As we developed the actual items for the test, we focused on items we thought would be challenging enough to separate typically developing children from children with language impairment. For example, instead of having single-word items where children would just have to name or recognize something, we had items children might look at and have to explain the difference to us.

“Here are two piñatas, they are different. Tell us what’s different about the piñatas.” And children would tell us something about the points. Or tell us there’s a different number of points, or there’s something different about them.

Psychometric Equivalence

Another thing we looked at when we were trying to develop our items was the possibility of psychometric equivalence across the items in English and Spanish.

When you ask children to do the exact same thing, as if you were directly translating a test, and you give children the same item in English as you do in Spanish, they may take that repetition of the item to indicate that they did something wrong the first time you asked and change their answer. It’s not because they didn’t know what it was that was different about the item or what you were asking. But they just take the repetition as an opportunity to say something different, and maybe get it right this time, and you’ll stop asking them.

This is a characteristic property item, where we were asking children to tell us about the features of these two items. So for example, in Spanish we might ask the child, “¿Cómo es la pelota?” [What’s the ball like?] and they can talk about the color or the shape or the features, and in English we can get at that same kind of item saying, “What’s this present like?” And again the child can talk about the color, the shape, and the features.

We addressed these items with different questions in each language, but they were psychometrically equivalent in terms of their difficulty.

Finally, we developed items that require semantic knowledge, but that you could respond to using different kinds of vocabulary. In the same way that Aquiles was talking about with the Pragmatics test, there’s lots of different kinds of appropriate responses that could get at the key feature that you were talking about, but could be answered in different ways.

Here we’re asking about how these presents are similar. So children could talk about the red bows, the red ribbons. They could talk about red string. They could talk about the shape of the presents. But they would have to know, for example, that they’re not the same size.

As we score these kinds of items, we provide sample responses that children might give in either language. One of the unique things about the semantics test is that we allow children to respond in either language. So, in the English test the most likely response is going to be the English response, but if the child responds in Spanish, we count that just the same as if they had responded in English. And they would get a 1 for either of the bolded responses to “What makes these gifts go together?”
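
The either-language scoring rule just described can be sketched in Python. This is only an illustration under stated assumptions, not the published scoring procedure: the accepted responses, the language codes, and the names `ItemScore` and `score_response` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ItemScore:
    points: int           # 1 if the response matches an accepted answer, else 0
    other_language: bool  # True if credit came from the non-test language


def score_response(response, accepted, test_language):
    """Score one response against accepted answers in both languages.

    `accepted` maps a language code ("en" or "es") to a set of acceptable
    responses. Credit is the same regardless of which language matched.
    """
    normalized = response.strip().lower()
    for language, answers in accepted.items():
        if normalized in answers:
            return ItemScore(points=1, other_language=(language != test_language))
    return ItemScore(points=0, other_language=False)


# Hypothetical accepted answers for "What makes these gifts go together?"
accepted = {
    "en": {"red bows", "red ribbons"},
    "es": {"moños rojos", "listones rojos"},
}

english_reply = score_response("Red bows", accepted, test_language="en")
spanish_reply = score_response("moños rojos", accepted, test_language="en")
```

Both replies earn the same point; only the `other_language` flag differs, which mirrors the idea of tracking the response language without penalizing it.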

We do also mark whether the children provide other language responses, just so we can keep track of what language children are responding to the test in.

Iterative Test Development and Item Analysis

So, we started off with a very large set of items. I think about 187 per language. It took a couple of days, at least, to provide this item set to the children. Not the whole day — just a couple of test sessions. Although, maybe the teachers felt like we were pulling these children out for the full day.

We started off with an equal number of items in each of the categories. We tested these out with 71 children in our local school districts in Texas. We tested 4-, 5-, and 6-year-old children. And we had a subset of 5-year-olds who either had nice, typical language skills or definitely had language impairment.

So we used these iterative approaches to giving the test to identify groups of items that would reflect good items for the children — items that reliably elicited the targets we were looking for, and that differentiated children with and without impairment.

What we can see is that as we progressed from our local set of kids to the larger set of children that we tested in different sites across the country, the number of items goes down, because we’re getting rid of items that aren’t so reliable and don’t differentiate children. And we also see that over time, the number of items of each type varies between the languages.

We ended up with 24 items in Spanish and 24 items in English in our final version of the test. But you can see that, for example, we have many more Similarities and Differences items in English than we had in Spanish. But we had many more Functions of objects type questions in Spanish than in English. So the overall difficulty of the test is the same, but the configuration of items varies related to each of the languages.

In the final version of the test, for 4-, 5-, and 6-year-olds, we see p values increasing by age. Remember that p values are the percentage of children who get items correct at each age. So there are systematic increases in Spanish and in English for typically developing children, as they increase from about 55% correct to 80% correct. We also see that there is a progression for the language impaired children. They go from about 26% to about 50% correct.
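
As a rough illustration of how item p values are computed, here is a minimal Python sketch. The score matrix is made up for the example and the function name `item_p_values` is hypothetical; it is not the analysis code used for the test.

```python
def item_p_values(responses):
    """Return the proportion of children who answered each item correctly.

    `responses` is a list of per-child lists of 0/1 item scores,
    all of the same length.
    """
    n_children = len(responses)
    n_items = len(responses[0])
    return [
        sum(child[i] for child in responses) / n_children
        for i in range(n_items)
    ]


# Hypothetical scores for one age group: 4 children x 3 items
age_group_scores = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
]

p_values = item_p_values(age_group_scores)  # [0.75, 0.5, 0.5]
```

Running the same calculation separately for each age group and each ability group would give the kind of age progression described above.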

We also took into account item discrimination. That’s the difference between how typically developing children and language impaired children score. And here we see that all of these items cluster around 0.3. That’s an ideal discrimination level for differentiating children with and without language impairment. That’s the target value. We see that for both languages across ages we were able to hit this target through the iterative process of getting rid of items that don’t work.
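
One simple way to picture item discrimination is as the difference between the two groups’ proportions correct on an item. The Python sketch below assumes that definition; the item data, the 0.3 retention threshold used as a minimum, and the function names are hypothetical and are not taken from the actual item analysis.

```python
def proportion_correct(scores):
    """Proportion of 0/1 scores that are correct."""
    return sum(scores) / len(scores)


def discrimination_index(td_scores, li_scores):
    """Difference in proportion correct between the typically developing
    (TD) group and the language impaired (LI) group on one item."""
    return proportion_correct(td_scores) - proportion_correct(li_scores)


# Hypothetical 0/1 scores per item for each group
items = {
    "item_a": {"td": [1, 1, 1, 0, 1], "li": [1, 0, 0, 0, 1]},  # 0.8 vs 0.4
    "item_b": {"td": [1, 1, 1, 1, 1], "li": [1, 1, 1, 1, 1]},  # 1.0 vs 1.0
}

MIN_DISCRIMINATION = 0.3  # assumed cutoff for this sketch

indices = {
    name: round(discrimination_index(groups["td"], groups["li"]), 2)
    for name, groups in items.items()
}
kept_items = [name for name, d in indices.items() if d >= MIN_DISCRIMINATION]
```

Here `item_b` is dropped because both groups pass it equally often, so it tells us nothing about group membership.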

The other thing that we see as we look at our semantics test is that it corresponds or correlates with related measures. For example, we see moderate correlations — but highly significant correlations — with language sample measures. We collected narrative samples on the children to whom we gave the semantics test and the other tests.

We see correlations in the 0.3 to 0.4 range. And we also looked at how our test correlated to the Expressive One Word Picture Vocabulary Test, which is a naming test. So looking at children’s single-word vocabulary. Again we see a moderate correlation there.

But I think the other important thing to think about here, is we don’t have correlations of 0.8 or 0.9. So our test isn’t doing the exact same thing as these other measures do. So it speaks to the validity of the kinds of tasks we do, but it’s telling you that you’re going to get different information from doing this than doing these other kinds of measures you might routinely include in a large-scale assessment.

We finalized our analysis by looking at whether we were doing a good job. What’s our classification accuracy on the semantics test?

We used discriminant function analysis to set cut points, that is, to decide where the ideal cut point was to differentiate children with and without language impairment. What we see is that we get very nice sensitivity and specificity in Spanish. We do see a little challenge there with the five-year-olds, and I’ll talk about that when I get all the way through this.

We looked at the same thing in English. Again we see sensitivity and specificity at all ages, above 80%.
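
To make sensitivity and specificity concrete, here is a small Python sketch that applies a given cut score to a set of scores. It does not perform discriminant function analysis; the cut score is simply supplied, and the scores, group labels, and function name are hypothetical.

```python
def classification_accuracy(scores, has_impairment, cut_score):
    """Return (sensitivity, specificity) when scores below `cut_score`
    are flagged as indicating language impairment."""
    true_pos = false_neg = true_neg = false_pos = 0
    for score, impaired in zip(scores, has_impairment):
        flagged = score < cut_score
        if impaired and flagged:
            true_pos += 1
        elif impaired:
            false_neg += 1
        elif flagged:
            false_pos += 1
        else:
            true_neg += 1
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical test scores: first five children have language impairment
scores = [30, 42, 55, 61, 70, 48, 81, 77, 66, 90]
has_impairment = [True] * 5 + [False] * 5

sensitivity, specificity = classification_accuracy(
    scores, has_impairment, cut_score=65
)
```

Sensitivity is the share of children with impairment who fall below the cut, and specificity is the share of children without impairment who fall at or above it.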

One of the unique things we did with the semantics test, as we did with all of our tests, is what we see here. The children are bilingual. Looking at the classification slides in terms of how children do relative to their dominance, we know that sometimes children who are bilingual do better in their first language, because they have strong word knowledge there. We also know that in typical development, children learn words before they learn grammar. And so maybe they’re doing better in their second language because that’s starting to take over.

So you can use an approach where you look at children’s best language to classify them. We see that generally speaking, classification values go up for the children.
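
The best-language idea can be sketched as taking the higher of a child’s two scores before applying a cut score. This Python example assumes both scores are on a comparable scale and that a single cut score applies to both; those assumptions, the scores, and the function names are hypothetical.

```python
def best_language_score(spanish_score, english_score):
    """Use whichever language the child performed better in."""
    return max(spanish_score, english_score)


def flag_for_impairment(spanish_score, english_score, cut_score):
    """Flag a child only if even the better language falls below the cut."""
    return best_language_score(spanish_score, english_score) < cut_score


# Hypothetical (Spanish, English) scores
children = {
    "child_1": (82, 60),  # stronger in Spanish
    "child_2": (58, 79),  # stronger in English
    "child_3": (55, 61),  # low in both languages
}

flags = {
    name: flag_for_impairment(spanish, english, cut_score=70)
    for name, (spanish, english) in children.items()
}
```

Only the child who is low in both languages is flagged, which is why this approach tends to avoid misclassifying children who are simply weaker in one language.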

One thing that we see in both the Spanish and the best-language classification is that we’re right there, at just about 70% to 75% sensitivity for five-year-olds. We attribute this to some of the demands of increased use of English at kindergarten, putting pressure both on their English and on their ability to retain Spanish. So that one’s a little bit lower. Of course, this semantics test doesn’t stand by itself. It stands in conjunction with the other subtests that we developed, so you would need to combine it with morphosyntax to get really good classification of all of your children.

Lisa Bedore
University of Texas at Austin

Co-Presenters: Elizabeth D. Peña, University of Texas at Austin; Aquiles Iglesias, Temple University; Vera F. Gutierrez-Clellen, San Diego State University; Brian A. Goldstein, La Salle University; and Lisa M. Bedore, University of Texas at Austin.
Disclosure: All of the above-listed authors/co-presenters benefit financially from royalty payments from the Bilingual English-Spanish Assessment (BESA).
Copyrighted Material. Reproduced by the American Speech-Language-Hearing Association in the Clinical Research Education Library with permission from the author or presenter.