Getting More Academic and Grade-Assigned Words from the Lexile WordBank

This topic explains how we developed WordBank and why relatively few words have domain and grade assignments. It also suggests ways you can use WordBank if you need more academic words or more words with grade assignments.

One of the challenges of developing vocabulary resources is that different users have different purposes and needs. For direct instruction, for example, one might choose relatively few words to highlight at each grade. For reference resources, many words at each grade might be more appropriate. We've tried to strike a balance between providing raw, descriptive information that different users can use in different ways and providing easy, ready-to-use lists of words that we think will be generally useful to many audiences. WordBank therefore serves two needs:
  • First, we provide an account of which words students are likely to encounter, at which grades, and in which domains. For this goal, we provide information on essentially all of the words that occurred in the textbook programs. Many of these words would not be appropriate for additional vocabulary instruction or assessment in general. They include the many technical words, proper nouns, and rare words that name things as far-flung as foods, plants, animals, physical objects, and emotions. In this case, we provide all the data for reference purposes, not as a prescriptive list by grade. Content authors might use this information to guide their writing, for example by setting policies on how many words may appear in a passage targeted at a certain grade when those words have not occurred significantly by that grade and might therefore be unfamiliar.
  • Second, the WordBank highlights a subset of words that we believe are more important for instructional or assessment focus. There are roughly 6,000 domain-specific and general-academic words with additional data in WordBank. Unavoidably, some decisions must be made as to which words should be included in this subset and which should not, and different users of WordBank may have different needs.

To show how you can use the additional columns in WordBank to find more words for a domain or at a specific grade level, we present a series of examples of filtering and sorting by various WordBank columns. Two sets of columns are useful for this: a) the by-grade frequencies and b) the "Likelihood" columns, one for each domain and one for general academic vocabulary, which indicate the degree to which a word could be considered, for example, a 'science' or 'social studies' academic word.

We will start by trying to find more “science” words overall. Sorting by the likelihood column for science, these are the top 10 most “sciency” words:
Word          Grade 1-12 Overall Frequency  Science Academic Word (Yes/No)  Grade Assignment
Ribose        13                            No
Thyroxine     17                            Yes                             9-12
Dipeptide     8                             No
Gravitropism  15                            Yes                             9-12
Brachiopod    5                             No
Interphase    46                            Yes                             6-8
Nucleon       59                            Yes                             9-12
Homeostasis   220                           Yes                             6-8
Lepton        15                            Yes                             9-12
Mutualism     29                            Yes                             5

As you can see, most of them are flagged as science academic words and are assigned to a grade, but not all are. The words ribose, dipeptide, and brachiopod were not added to our list. There can be multiple reasons why we would not add a word, but the most likely one here is that the overall frequency of those "non-list" words is very low. One decision we made to produce lists that we thought would be generally useful was to limit them to words we suspect students are likely to encounter repeatedly. However, if your goal is to build more extensive word lists, you could make different decisions and choose to include lower-frequency words.
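The idea of relaxing the frequency cutoff can be sketched in a few lines of Python. This is only an illustration: the rows are copied from the table above, while the function name `extra_candidates` and the cutoff of 8 are assumptions made for the example, not WordBank settings.

```python
# Rows copied from the table above:
# (word, grade 1-12 overall frequency, flagged as a science academic word?)
TOP_SCIENCE = [
    ("ribose", 13, False),
    ("thyroxine", 17, True),
    ("dipeptide", 8, False),
    ("gravitropism", 15, True),
    ("brachiopod", 5, False),
    ("interphase", 46, True),
    ("nucleon", 59, True),
    ("homeostasis", 220, True),
    ("lepton", 15, True),
    ("mutualism", 29, True),
]


def extra_candidates(rows, min_frequency):
    """Return off-list words whose overall frequency is at least min_frequency.

    min_frequency is a hypothetical, user-chosen cutoff.
    """
    return [word for word, freq, on_list in rows
            if not on_list and freq >= min_frequency]


# With a cutoff of 8, ribose and dipeptide qualify; brachiopod (5) does not.
print(extra_candidates(TOP_SCIENCE, min_frequency=8))
```

Lowering the cutoff further would pull in rarer words, which may suit a reference list better than an instructional one.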

More specifically, let’s consider finding more grade 1 science words. First, a word about our logic for grade assignments, which is just one of many ways to assign words to grades or to choose which words are appropriate at which grades and for which purposes. Because the volume of reading content is uneven across grades, we wanted to create appropriately sized lists by grade. Many fewer words and concepts are discussed in 1st grade than in 2nd, 3rd, or 9-12th grade. Therefore, we set a target number of words for each grade and domain based on each grade's relative volume of content. We also chose to make grade designations mutually exclusive (which is not necessary for all purposes). To accomplish this, we took the top X most frequent science, social studies, math, and general academic words and assigned them to 1st grade, then excluded them from assignment to any other grade. So the grade 2 list consists of the words most frequent in grade 2 books that were NOT assigned to 1st grade, and so on. Again, you might choose different conventions or criteria for grade assignments depending on your purposes.
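The mutually exclusive, top-X-per-grade logic described above can be sketched as follows. This is a simplified illustration for a single domain, not our production procedure: the function name `assign_grades`, the alphabetical tie-breaking rule, and the toy counts and quotas are all assumptions invented for the example.

```python
def assign_grades(freq_by_grade, quota_by_grade):
    """Assign each word to at most one grade.

    freq_by_grade:  {grade: {word: frequency in that grade's books}}
    quota_by_grade: {grade: number of words to assign (the "top X")}

    Grades are processed in ascending order. A word taken by an earlier
    grade is excluded from every later grade.
    """
    assigned = {}
    taken = set()
    for grade in sorted(freq_by_grade):
        available = [(word, freq)
                     for word, freq in freq_by_grade[grade].items()
                     if word not in taken]
        # Most frequent first; ties broken alphabetically (an assumption).
        available.sort(key=lambda pair: (-pair[1], pair[0]))
        picks = [word for word, _ in available[:quota_by_grade[grade]]]
        assigned[grade] = picks
        taken.update(picks)
    return assigned


# Toy counts invented for illustration only; these are not WordBank data.
toy_freq = {
    1: {"word_a": 40, "word_b": 25, "word_c": 10},
    2: {"word_a": 60, "word_b": 30, "word_c": 35, "word_d": 20},
}
toy_quota = {1: 2, 2: 2}

print(assign_grades(toy_freq, toy_quota))
```

In the toy data, word_a and word_b are the most frequent words in grade 2 books, but because grade 1 already claimed them, grade 2 receives word_c and word_d instead. Dropping the `taken` set would produce overlapping lists by grade.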

Let's see how we can use WordBank to find these potential grade 1 science words by first filtering on whether the word occurred in 1st grade above some threshold of occurrences and then sorting by the science likelihood column. Here are the most “sciency” words that occur more than five times in 1st grade and that were not identified as grade 1 words:
Word        Likelihood of being a Science Word  Grade 1 Frequency  Grade Assignment  Science Academic Word (Yes/No)  General Academic Word (Yes/No)
Larva       0.98                                6                  2                 Yes                             No
Nutrient    0.98                                6                  3                 No                              Yes
Hypothesis  0.94                                7                  3                 No                              Yes
Telescope   0.92                                6                  2                 Yes                             No
Repel       0.92                                9                  6-8               Yes                             Yes
Vibrate     0.90                                6                  2                 Yes                             No
Dinosaur    0.88                                6                  2                 Yes                             No
Zebra       0.86                                6                  2                 Yes                             No
Kilogram    0.85                                6                  2                 Yes                             No
Tube        0.81                                6                  3                 Yes                             No

There are a few different cases here, so we'll go over them. The words nutrient, hypothesis, and repel are all identified as general academic words. For two of them (nutrient and hypothesis), our model chose not to assign them to the science academic word list as well. You might choose to include them in the science list or to keep them on both the science and general lists. The other thing you see is that all of the words are assigned to grades other than grade 1. Again, we chose to assign proportionally appropriate numbers of words to each grade, so many words that are identified as grade 2 words do occur in grade 1, but less frequently than the words that were assigned to grade 1. They are therefore assigned to grade 2 instead. You might choose to have overlapping lists of words by grade, depending on your purpose. Or you might choose to move these words from 2nd grade into 1st grade and find more words for 2nd grade.
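The filter-then-sort step and the general-versus-science cases discussed above can be reproduced with a short Python sketch. The rows are copied from the table above; the `Row` field names and the function names are hypothetical labels chosen for this example and are not the actual WordBank column names.

```python
from typing import NamedTuple


class Row(NamedTuple):
    # Field names are illustrative, not the real WordBank column headers.
    word: str
    science_likelihood: float
    grade1_freq: int
    grade: str
    science_list: bool
    general_list: bool


# Rows copied from the table above.
ROWS = [
    Row("larva", 0.98, 6, "2", True, False),
    Row("nutrient", 0.98, 6, "3", False, True),
    Row("hypothesis", 0.94, 7, "3", False, True),
    Row("telescope", 0.92, 6, "2", True, False),
    Row("repel", 0.92, 9, "6-8", True, True),
    Row("vibrate", 0.90, 6, "2", True, False),
    Row("dinosaur", 0.88, 6, "2", True, False),
    Row("zebra", 0.86, 6, "2", True, False),
    Row("kilogram", 0.85, 6, "2", True, False),
    Row("tube", 0.81, 6, "3", True, False),
]


def grade1_science_candidates(rows, min_grade1_freq=5):
    """Words seen more than min_grade1_freq times in grade 1 but assigned
    to another grade, ordered from most to least 'sciency'."""
    kept = [r for r in rows
            if r.grade1_freq > min_grade1_freq and r.grade != "1"]
    return sorted(kept, key=lambda r: r.science_likelihood, reverse=True)


def general_only(rows):
    """Words on the general academic list but not on the science list."""
    return [r.word for r in rows if r.general_list and not r.science_list]


print([r.word for r in grade1_science_candidates(ROWS)])
print(general_only(ROWS))
```

Raising `min_grade1_freq` narrows the candidates to words that appear more often in grade 1 books, and `general_only` picks out the words you might want to add to a science list yourself.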

Speaking of finding more words to include, let's now look at words that were not assigned to any list. We will filter the word-list yes/no columns to "No", again require more than five occurrences in 1st grade, and again sort by likelihood of being a science word:
Word        Likelihood of being a Science Word  Grade 1 Frequency  Science Academic Word (Yes/No)  General Academic Word (Yes/No)
Flipchart   0.41                                38                 No                              No
Rain        0.38                                46                 No                              No
Food        0.38                                124                No                              No
Instrument  0.37                                6                  No                              No
Water       0.36                                272                No                              No
Bubble      0.36                                6                  No                              No
Unlock      0.36                                13                 No                              No
Inventor    0.36                                6                  No                              No
Hatch       0.36                                9                  No                              No
Hot         0.35                                50                 No                              No

What we see now are several "marginally sciency" words, meaning their likelihood of being identified as an academic science word by a trained human rater is less than 0.50 but not near 0, as it is for many words. You can imagine why words like flipchart, instrument, or inventor might be considered science words, but they did not meet all the criteria for inclusion in our lists. In these cases, the words may be used enough in other domains or in our oral language corpus to be excluded from identification as academic science words. For example, the words food, water, and rain almost certainly have only a marginal likelihood of being science words because of their use in everyday language (i.e., their occurrence in the oral language corpus). You might choose to include these words depending on your goals. Food, water, and rain can and should be discussed in a scientific context (and in 1st grade, no less) even if they are likely to be familiar from everyday conversation (i.e., they are not "academic" words).
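As a rough sketch of how such "marginal" words could be pulled out programmatically, the Python below filters the rows from the table above. The upper bound of 0.50 comes from the text, but the lower bound is not specified by WordBank; the value of 0.37 used here is purely an assumption for illustration, as are the function and parameter names.

```python
# Rows copied from the table above:
# (word, science likelihood, grade 1 frequency, science list?, general list?)
UNLISTED = [
    ("flipchart", 0.41, 38, False, False),
    ("rain", 0.38, 46, False, False),
    ("food", 0.38, 124, False, False),
    ("instrument", 0.37, 6, False, False),
    ("water", 0.36, 272, False, False),
    ("bubble", 0.36, 6, False, False),
    ("unlock", 0.36, 13, False, False),
    ("inventor", 0.36, 6, False, False),
    ("hatch", 0.36, 9, False, False),
    ("hot", 0.35, 50, False, False),
]


def marginal_science_words(rows, low, high=0.50, min_grade1_freq=5):
    """Words on no list, seen more than min_grade1_freq times in grade 1,
    whose science likelihood falls in the band low <= likelihood < high.

    `low` is a hypothetical, user-chosen floor for "not near 0".
    """
    return [word for word, likelihood, freq, sci, gen in rows
            if not sci and not gen
            and freq > min_grade1_freq
            and low <= likelihood < high]


# With an assumed floor of 0.37, four of the ten words remain.
print(marginal_science_words(UNLISTED, low=0.37))
```

Whether words in this band belong on your list is a judgment call that depends on your goals; the likelihood value alone does not tell you whether a word is marginal because it is common in everyday language.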

We hope this gives you a sense of how you can use the information in WordBank to meet your specific needs.