The Tyranny of Metrics
by Jerry Z. Muller
“This book argues that while they are a potentially valuable tool, the virtues of accountability metrics have been oversold, and their costs are often underappreciated.” There are chapters on the dysfunction of “metric fixation” in colleges and universities; schools; medicine; policing; the military; business; and philanthropy. Problems include gaming the system, costs exceeding benefits, and diverting effort from the core mission. A major theme is metrics as a substitute for competent judgment.
“The most characteristic feature of metric fixation is the aspiration to replace judgment based on experience with standardized measurement… With all that time spent reporting, meeting, and coordinating, there is little time left for actual doing. This drain on time and effort is exacerbated by the tendency of executives under the spell of metric fixation to distrust the experienced judgment of those under them.”
“There is often an unexamined faith that amassing data and sharing it widely within the organization will result in improvements of some sort—even if much information has to be denuded of nuance and context to turn it into easily transferred ‘data.’”
“Judgment is a sort of skill at grasping the unique particularities of a situation, and it entails a talent for synthesis rather than analysis… A feel for the whole and a sense for the unique are precisely what numerical metrics cannot supply.”
Colleges and Universities
The longest chapter deals with colleges and universities, where the author feels the pain most directly. Muller is a history professor at Catholic University of America.
“In an attempt to obtain ‘value,’ successive British administrations have created a series of government agencies charged with evaluating the country’s universities… There are audits of teaching quality, such as the ‘Teaching Quality Assessment,’ evaluated largely on the extent to which various procedures are followed and paperwork filed, few of which have much to do with actual teaching… A mushroom-like growth of administrative staff has occurred… The search for more data means more data managers, more bureaucracy, more expensive software systems. Ironically, in the name of controlling costs, expenditures wax.”
“In academia as elsewhere, that which gets measured gets gamed.”
“In addition to expenditures that do nothing to raise the quality of teaching or research, the growing salience of rankings has led to ever new varieties of gaming through creaming and improving numbers through omission or distortion of data.” The book explains how American law schools manipulate their U.S. News & World Report rankings.
“Colleges, both public and private, are measured and rewarded based in part on their graduation rates, which are one of the criteria by which colleges are ranked, and in some cases, remunerated… There is pressure on professors—sometimes overt, sometimes tacit—to be generous in awarding grades. An ever-larger portion of the teaching faculty comprises adjunct instructors—and an adjunct who fails a substantial portion of her class (even if their performance merits it) is less likely to have her contract renewed.”
“When individual faculty members, or whole departments, are judged by the number of publications, whether in the form of articles or books, the incentive is to produce more publications, rather than better ones… Only citations in the journals within the author’s discipline were counted. That too was problematic, since it tended to shortchange works of trans-disciplinary interest.”
An unintended consequence of K-12 testing-and-accountability legislation is that “students too often learn test-taking strategies rather than substantive knowledge… Because students in English are taught to answer multiple choice and short-answer questions based on brief passages, the students are worse at reading extended texts and writing extended essays.”
“An emphasis on measured performance through standardized tests creates another perverse outcome, as Campbell’s Law predicts: it destroys the predictive validity of the tests themselves. Tests of performance are designed to evaluate the knowledge and ability that students have acquired in their general education. When that education becomes focused instead on developing the students’ performance on the tests, the test no longer measures what it was created to evaluate.”
“The costs of trying to use metrics to turn schools into gap-closing factories are therefore not only monetary. The broader mission of schools to instruct in history and in civics is neglected as attention is focused on attempting to improve the reading and math scores of lower-performing groups.”
Do the reported numbers mean what you think they mean? The WHO’s World Health Report 2000 ranked the U.S. healthcare system 37th among the nations of the world. “Scott W. Atlas, a physician and healthcare analyst, has scrutinized and contextualized those claims, which turn out to be more than a little misleading. Most of us assume that the WHO rankings measured the overall level of health. But actual health outcomes accounted for only 25 percent of the ranking scale. Half of the points awarded were for egalitarianism: 25 percent for ‘health distribution,’ and another 25 percent for ‘financial fairness,’ where ‘fairness’ was defined as having everyone pay the same percent of their income for healthcare. That is, only a system in which the richer you are, the more you pay for healthcare was deemed fair. The criterion, in short, was ideological. The fact that there was a number attached (37th) gave it the appearance of objectivity and reliability. But in fact, the overall performance ranking is deceptive.”
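The arithmetic behind that point can be made concrete with a small sketch. The weights for health outcomes, health distribution, and financial fairness follow the percentages quoted above; the remaining 25 percent is lumped into an "other" bucket because the passage does not itemize it. The two systems and all of their component scores are invented purely for illustration and are not WHO data.

```python
# Weights as described in the quoted passage. "other" stands for the
# remaining 25 percent, which the passage does not itemize (assumption).
WEIGHTS = {
    "health_outcomes": 0.25,
    "health_distribution": 0.25,
    "financial_fairness": 0.25,
    "other": 0.25,
}

# Hypothetical component scores (0-100) for two made-up systems.
SYSTEMS = {
    "System A": {"health_outcomes": 90, "health_distribution": 60,
                 "financial_fairness": 50, "other": 80},
    "System B": {"health_outcomes": 80, "health_distribution": 80,
                 "financial_fairness": 80, "other": 70},
}


def composite(scores, weights=WEIGHTS):
    """Weighted sum of the component scores."""
    return sum(weights[name] * scores[name] for name in weights)


def rank(systems, key):
    """Return system names ordered from best to worst under `key`."""
    return sorted(systems, key=lambda name: key(systems[name]), reverse=True)


by_composite = rank(SYSTEMS, composite)
by_outcomes = rank(SYSTEMS, lambda scores: scores["health_outcomes"])

print("Ranked by composite index:", by_composite)
print("Ranked by health outcomes:", by_outcomes)
```

With these made-up numbers, System A has the better health outcomes but ranks second on the composite, because three quarters of the index measures something else. A single ordinal rank hides which components drove the result.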
“Indeed, the ease of measuring may be inversely proportional to the significance of what is measured. To put it another way, ask yourself, is what you are measuring a proxy for what you really want to know? If the information is not very useful or not a good proxy for what you’re really aiming at, you’re probably better off not measuring it.”
“What are the costs of acquiring the metrics? … Every moment you or your colleagues or employees are devoting to the production of metrics is time not devoted to the activities being measured.”
“Measurements are more likely to be meaningful when they are developed from the bottom up, with input from teachers, nurses, and the cop on the beat. That means asking those with tacit knowledge that comes from direct experience to provide suggestions about how to develop appropriate performance standards… Remember, that a system of measured performance will work to the extent that the people being measured believe in its worth.”
Muller summarizes with 11 unintended but predictable negative consequences of metric fixation: “Goal displacement through diversion of effort to what gets measured; promoting short-termism; costs in employee time; diminishing utility… [as] the marginal costs of assembling and analyzing the metrics exceed the marginal benefits; rule cascades [in response to gaming and cheating]; rewarding luck; discouraging risk-taking; discouraging innovation; discouraging cooperation [by promoting competition]; degradation of work; and costs to productivity.”
“In the end, there is no silver bullet, no substitute for actually knowing one’s subject and one’s organization, which is partly a matter of experience and partly a matter of unquantifiable skill. Many matters of importance are too subject to judgment and interpretation to be solved by standardized metrics. Ultimately, the issue is not one of metrics versus judgment, but metrics as informing judgment, which includes knowing how much weight to give to metrics, recognizing their characteristic distortions, and appreciating what can’t be measured. In recent debates, too many politicians, business leaders, policymakers, and academic officials have lost sight of that.”
Muller, Jerry Z. The Tyranny of Metrics. Princeton: Princeton University Press, 2018.