Algorithmic Essay-Grading: Teacher’s Savior Or Bane Of Learning?

A contest is underway at data-crunching competition site Kaggle that challenges people to create “an automated scoring algorithm for student-written essays.” It’s the latest chapter in a generations-long conflict over the nature of teaching, and one more step in what looks like an inevitable progression. Automated grading is already prevalent in simpler tasks like multiple-choice and math testing, but computers have yet to seriously put a dent in the most time-consuming of grading tasks: essays.

Millions of students write dozens of essays every year, and teachers often take home hundreds to read at a time. Beyond saddling teachers with hours of often-unlogged work, the sheer volume makes it difficult to grade consistently and fairly. Are robo-readers the answer? Mark Shermis at the University of Akron thinks it’s at least worth a shot.

The competition is structured as you might expect, and is actually nearing its conclusion: it’s been running for a few months and ends on April 30th. So far there are over 150 entrants and more than 1,000 submissions. The contest provides entrants with a database of essays and their human-assigned scores to “train” the engines, then tests those engines, naturally, on a new set of essays whose scores are withheld. Presumably the engine that produces the most reliably human-like results will take home the first prize: $60,000. Second place gets $30,000 and third $10,000. The contest is sponsored by the William and Flora Hewlett Foundation.
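For the curious, the basic shape of such an entry is a standard supervised-learning loop: fit a model on the graded essays, predict scores for the ungraded ones, and measure agreement with human raters. Below is a minimal sketch in Python; the file name, the TF-IDF features, and the ridge-regression model are illustrative assumptions, not the contest’s actual pipeline or any entrant’s method.

```python
# A minimal sketch of the train-then-test loop the contest implies. The data
# file, feature choice (TF-IDF), and model (ridge regression) are illustrative
# assumptions, not the contest's actual pipeline.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

# Hypothetical file of graded essays: one column of text, one of human scores.
essays = pd.read_csv("graded_essays.csv")  # columns: essay, score

# Turn raw text into word- and phrase-frequency features.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), max_features=20000)
X = vectorizer.fit_transform(essays["essay"])
y = essays["score"]

# Hold out a slice of graded essays to stand in for the unseen test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit a simple regression from text features to the human-assigned score.
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)

# Predict scores for the held-out essays, rounded onto the rubric's scale.
predicted = model.predict(X_test).round().astype(int).clip(y.min(), y.max())

# Judge agreement with the human raters; quadratic weighted kappa is a
# standard measure for this kind of ordinal agreement.
print(cohen_kappa_score(y_test, predicted, weights="quadratic"))
```

The interesting work, of course, is in everything this sketch glosses over: which features of an essay actually track quality, and how closely a model’s judgments can be made to mirror a human reader’s.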

It’s interesting enough as a data-analysis project, but it’s also likely to be a major point of contention over the next decade or so. The increasing systematization of education is something many teachers and parents decry; the emphasis on standardized tests is abhorrent to many, while the human component of essay grading is considered indispensable.

Replace human readers with robots? “It’s horrifying,” says Harvard College Writing Program director Thomas Jehn, speaking to Reuters. “I like to know I’m writing for a real flesh-and-blood reader who is excited by the words on the page. I’m sure children feel the same way.”

Fair enough. But if the results are the same, is there really a conflict? Ideally, these machine readers would produce the same grade, for the same reasons. Is a TA scanning each essay, marking off the salient keywords, and checking for obvious failures doing a better job? It probably depends on the TA. And on the professor or teacher, the student, the length and topic of the essay, and so on.

There’s a counter-argument, then: much of essay grading is a mechanical process that humans have to perform, and one they often can’t perform consistently. Like assembly-line work, it’s something a robot can do faster and cheaper. This has real benefits, not least of which is freeing humans to do more human work. TAs could spend more time on one-on-one tutoring, and teachers could work harder on lesson plans and the actual process of teaching.
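To make the “mechanical” claim concrete, here is a toy version of that keyword-scanning pass in Python. The rubric terms, length threshold, and score scale are invented for illustration; no real grading engine is this simple.

```python
# A toy version of the "mechanical" pass described above: scan for rubric
# keywords and flag obvious failures. The rubric terms and thresholds are
# invented for illustration only.
RUBRIC_KEYWORDS = {"thesis", "evidence", "photosynthesis", "chlorophyll"}
MIN_WORDS = 250  # hypothetical "obviously too short" cutoff

def mechanical_score(essay: str) -> int:
    """Crude 0-4 score from length and keyword coverage."""
    words = essay.lower().split()
    if len(words) < MIN_WORDS:  # obvious failure: far too short
        return 0
    hits = sum(1 for kw in RUBRIC_KEYWORDS if kw in words)
    return min(4, 1 + hits)
```

The point isn’t that this is good grading; it’s that a rushed human skim often amounts to something not much deeper, and a machine can at least do that part the same way every time.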

The essay-grading portion is only “phase one” of the project’s plan, though. Phase two would focus on shorter answers, and phase three on charts, graphs, and representations of math and logic. It’s exciting, but it’s one of those areas of advancement that makes many people uncomfortable. It could be said, though, that we feel uncomfortable precisely because these are the areas that need the most attention.