An old professional friend, Richard Phelps, asked me in late April to write a review of his latest book. I agreed to write a full-length review without a deadline or remuneration. The book is accurately described in the 2023 Choice Review excerpt reprinted online, although one might quibble about 2001, the year given for the events noted. The excerpt comes from an advertisement for the book by a well-known bookseller, Barnes & Noble, which notes that the Review, a journal of critical evaluations of current academic books and electronic resources, praised the book as an “Outstanding Academic Title.” The book’s title is startling. The excerpt in Barnes & Noble’s online advertisement reads as follows:
The Malfunction of US Education Policy: Elite Misinformation, Disinformation, and Selfishness [describes the] biased and inefficient information dissemination that has degraded US education research and policy since the year 2001, when a series of unfortunate disruptions began:
- first, the No Child Left Behind (NCLB) Act and federal imposition of an idiosyncratic and ineffectual testing program;
- second, the “big bang” reorganization of the US education testing industry from a stable, cooperative oligopoly run by psychometricians to a commercially competitive free-for-all with more opportunist and customer-pleasing ambitions; and
- third, the Common Core standards, which mandated homogenous lower content standards onto the still required NCLB testing structure.
Billions from the federal government and wealthy foundations have transformed many once-independent national education organizations into “cargo cult” dependents and promoters of the new order, intolerant of divergent points of view. The research and policy brain trust responsible comprised an alliance of convenience among two “citation cartels” of establishment and reform scholars and politicos, and an astonishingly cooperative and un-skeptical group of journalists. It succeeded in focusing attention on their work, while diverting attention away from a much larger universe of others’ work—by ignoring, dismissing, or demeaning it—that included a century’s worth of mostly experimental scholarship in the fields of psychology and program evaluation.
Phelps’s indictment of the testing industry and his reviews of the research in many, if not most, education articles and books are based on his experience working for Educational Testing Service (ETS), ACT, and other test developers; his experience teaching secondary mathematics; and his many conversations with the psychometricians who worked with him in the testing industry. He also holds a Ph.D. in Public Policy from the University of Pennsylvania’s Wharton School and developed the Nonpartisan Education Review (NER), an online forum where authors and discussants can address education issues free of partisan selection of manuscripts and topics. To this day, the NER remains a preferred forum for those seeking a place of security and freedom for non-mainstream ideas.
My point in recounting this background is that Phelps has the relevant knowledge and experience to judge the testing programs and research that have been used in all public schools, and in many private schools, to justify what has been taught to teachers or students in recent decades. Phelps’s book is mainly about program evaluation and the effects of tests on the program. One looks in vain among the book’s many appendices for a title suggesting research on licensure for teachers of color; none seems relevant. Yet that is a very important topic today in education policy and research, which indicates the narrowness of Phelps’s book.
For example, the following paragraph is how the current staff in charge of the Bay State’s licensing procedures describe the teacher test they are modifying. They are doing so to make this test more “culturally or linguistically sustaining” and to reduce “racial disparities” among their test-takers. The researchers have claimed that “pass rate gaps persist based on racial demographics and the primary language of test-takers,” even though no evidence or data on the race or language of test-takers are provided.
About the Massachusetts Tests for Educator Licensure:
The Massachusetts Tests for Educator Licensure (MTEL) was initiated by the Massachusetts Department of Elementary and Secondary Education in 1998 as part of our statewide education reform initiative for educators seeking PreK to grade 12 academic licenses. The MTEL includes a test of communication and literacy skills as well as tests of subject matter knowledge. The tests are designed to help ensure that Massachusetts educators can communicate adequately with students, parents/guardians, and other educators and that they are knowledgeable in the subject matter of the license(s) sought. MTEL includes tests for candidates seeking vocational technical and adult basic education licenses.
As readers will note, this justification allows staff at the Department of Elementary and Secondary Education (DESE) to revise all subject tests and the test of communication and literacy skills required of all prospective educators. The revisions will have to be addressed by all who must hold a license to become an administrator or teacher in any school and for any group of students. It is not clear in “About the MTEL” that most prospective educators will have to take both a test of skills and a test of subject matter knowledge.
So, what does this example have to do with Phelps’s book? It is an example of the rationale for Phelps’s book. As can be seen in a short memo to his state board of education written by Russell Johnston, the Acting Commissioner of Education, one of the purposes of a review of the literature in a memo is to update or review a rationale for a project the board has already voted for. Johnston first writes: “We are working to promote teaching and learning that is antiracist, inclusive, multilingual, and multicultural; that values and affirms each and every student and their families; and that creates equitable opportunities and experiences for all students, particularly those who have been historically underserved.” Who today on a governor-appointed state board of education would oppose such goals? He then notes that educators are required to “pass a test established by the board which shall consist of two parts: (A) a writing section which shall demonstrate the communication and literacy skills necessary for effective instruction and improved communication between school and parents; and (B) the subject matter knowledge for the certificate.” Johnston has artfully located in state law the need to improve both “instruction” and “communication” between school and parents.
He goes on to note:
The MTEL is predictive of educator effectiveness. In a 2020 study, the Center for Analysis of Longitudinal Data in Education Research (CALDER) found that MTEL scores are positive and statistically significant predictors of teachers’ in-service performance ratings and contributions to student test scores (i.e., value added) once the teachers enter the workforce. These findings are consistent for educators of color.
There is a superscript for a footnote here—the only one in the memo. Johnston further adds:
The MTEL is also a licensure requirement that, because of differential pass rates by race and ethnicity, may be contributing to the underrepresentation of Black and Hispanic teachers in the workforce.
The Massachusetts State Board of Education is expected to be fearful of possible bias. In his footnote, Johnston notes that the MTEL research was published by James Cowan, Dan Goldhaber, Zeyu Jin, and Roddy Theobald in 2020 under the title “Teacher Licensure Tests: Barrier or Predictive Tool?” as CALDER Working Paper No. 245-1020. Johnston has given the board the information it needs to justify his request and to see the state’s licensure tests as a possible source of bias that may account for the low proportion of “teachers of color” in the school system’s workforce. He goes on to point out:
With this context in mind – the MTEL licensure requirement, the predictive validity of the MTEL, and our commitment to cultivating a more effective and diverse workforce – we continue to work to ensure that the Commonwealth’s educator licensure assessment program is centering both access and equity in design and administration.
One study is apparently enough to attest to the usefulness—predictive validity—of an entire set of licensure tests. However, the memo has fudged the fact that “Parts A and B” are on different tests and that most new educators have had to take multiple licensing tests and will continue to do so. The researchers assert in their article that “licensing tests are a potentially costly barrier to entering the teaching profession and only modestly correlated with in-service teacher performance measures.” Indeed, there may be a problem with most licensing tests—e.g., their basis—but we do not learn the cost of tests for other professionals or why Bay State’s tests do not contain pedagogical items, whose absence the researchers note.
A close look at this CALDER Working Paper reveals the core problems faced by the Commissioner and his staff. It also reveals the rationale for Phelps’s book. First, one literacy skills test doesn’t begin to cover what licensure tests covered at their inception: subject knowledge. Second, making all the revisions will take time, as Johnston notes at the end of his memo: revisions to the licensure test under the guidelines or framework developed by the staff are estimated to take 2-3 years. (Readers do not learn how many different tests will be revised.) Third, no one is quoted as saying that the MTEL test of skills is not inclusive, or how it fails to be. Instead, the researchers write: “If teachers of color differ in other skills not well captured by licensure tests, then testing requirements may disproportionately exclude potentially effective teachers.” We are not told that this is the authors’ opinion, unsupported by evidence, or what these other skills are. Nor are we given information about why these “teachers of color” have a higher retake rate than others. However, the researchers can and do conclude that MTEL scores are more predictive of performance ratings for “teachers of color” than for “White teachers.”
We are well on the way to justifying the hiring of more teachers of color with varying pass rates or, one suspects, to getting rid of the MTEL altogether. In all this effort, the researchers have not given us employment or teacher evaluation data to show us that “educators of color” are hired by their employers for their licensure scores in order, by implication, to damn those who hire them as racist. The researchers seem to assume that all educators new to a school system are hired for their licensure score, but we see no data on hiring criteria or on the rationales for course or classroom assignment, no teacher subject test itself, and no references to professional development programs that have clearly increased the pass rates of low-performing test-takers. In fact, according to the senior associate in charge of the tests around the time they were developed or revised (Sandra Stotsky, the author of this review), most teacher subject tests in the Bay State deliberately did not include pedagogical items because these tests were initially intended to assess a teacher’s competence in teaching the subjects they might be hired to teach and the range of courses they might teach. But we learn nothing in this article about what this state’s teachers actually teach, based on systematic observations of their daily lessons or classroom curriculum, or what their students actually learn, based on the tests and quizzes their school system and teachers give them, beyond the general pass/fail numbers or percentages in public reports on achievement. In other words, we learn nothing that would serve as a basis for licensing teachers so that a school gains a “more effective and diverse workforce.”
One might begin to conclude that this article tried to be a clever piece of propaganda. Yet, in a series of online posts, the NCTQ (National Council on Teacher Quality) affirmed that pandemic-induced emergency teacher licenses granted in 2020 did not suffice. First,
Only about half of emergency-licensed teachers were employed by fall of 2021, compared with 62% of teachers with initial licenses. Survey data revealed that teachers with emergency certifications had the impression that their licenses were viewed less favorably by districts.
Second,
In the past four years, almost every state has lowered its entry standards for teaching … New research, published in April 2024 by Ben Backes, James Cowan, Dan Goldhaber, and Roddy Theobald, shows that emergency licenses did make a difference—an academically damaging one.
While this is just one study, it adds to the evidence that teacher preparation and high standards for entry really do matter. Teacher preparation—and the signal that licensure tests provide—is an important guardrail to ensure that aspiring teachers have essential knowledge before they get the keys to the classroom.
And, third,
The authors think that these outcomes likely reflect the different levels of preparation and classroom experience the varied cohorts brought to the classroom. The earliest applicants for emergency licenses had significantly more preparation than later cohorts: 60% had already passed their first set of licensure assessments and a quarter had been enrolled in an educator preparation program. In later cohorts, only about half as many teachers had previously engaged with the teacher pipeline.
These findings underscore the need for evaluation if states make changes to requirements to enter teaching, and they must respond according to the full data and findings. Early analyses of emergency licensure policies were quite promising, but they did not fully reflect the preparation levels of applicants who may ultimately take advantage of such policies—and the consequences for students.
As readers will note, we have shifted from suggesting that licensure tests discriminate against prospective teachers of color to implying that research on such licensure is valuable but must adhere to high standards. It remains unclear who has argued that these high standards are unimportant. Additionally, the purpose of the essay is ambiguous. Johnston’s memo details why the Bay State’s licensing requirements might be biased against teachers of color and discusses the development of the Culturally and Linguistically Sustaining Communication and Literacy Skills (CLS2) Framework. This framework is intended to guide future revisions to the Communication and Literacy Skills Massachusetts Test for Educator Licensure (MTEL), despite the absence of data on the academic performance of students taught by holders of emergency licenses.
Furthermore, the eleven appendices in Phelps’s book do not appear to address research on licensure for teachers of color. Instead, they cover a range of topics unrelated to this issue. These include discussions on the enablers of educational reforms, sources of citations in C.C. Ross’ Measurement in Today’s Schools, research literatures dismissed by celebrity scholars or journalists, funders and strategic partners, comparisons of literature searches for testing effects, bellwether-linked organizations and their funder lists and board members, career paths of education press figures, media expertise and source counts, Michael Petrilli’s contributions and expertise, expert sources of journalists Barnum and Barshay, and the evolution of the Education Writers Association.
Without knowing if there is relevant, dismissed, or unexamined research in the literature review of the 2024 study, one can only conclude that emergency licensure may be less effective than regular licensure for teachers of any background and their students.
The chapters in Phelps’s book address such topics as which smaller groups or foundations belong to the two major competing educational research groups he identifies as the “establishment” and the “reformers.” We do not find out the specific policies that the winning researchers and organizations could put in place at the time, often with government money and support. The Common Core standards, which were adopted by most states, including Massachusetts, in 2010, stressed skills, not content. They sought to make the reading of nonfiction as important as, if not more important than, the reading of literary texts in all the early grades and, so far as I know, succeeded. The Common Core included many different policies. I was told by the Chairwoman of the state board of education, Maura Banta, that the Common Core had “higher expectations” than the standards I had supported, even though there were no data on student performance to support the Common Core. Each of the major educational policies underlying the curriculum shaped by the Common Core had successfully avoided a thorough literature review that would have revealed the limitations of the support for that policy.
I speak as an Affirmative Action era person. We must not lower heaven while we are raising hell. That is to say, we must not lower academic, licensure, or other performance standards to include the so-called “underrepresented” à la DEI. This attitude does a disservice to all stakeholders, especially the recipients of such “special consideration.” We must devise assistance strategies for those who are motivated to meet high standards. That’s where the main and only efforts need to be.
“We are working to promote teaching and learning that is antiracist, inclusive, multilingual, and multicultural; that values and affirms each and every student and their families; and that creates equitable opportunities and experiences for all students, particularly those who have been historically underserved.”
Sadly, knowledge of subject and ability to convey it are secondary, if even that….