AI, Big Data and the Writing “Machine”

During my daily reading (on the internet, that is), I came across an article that intrigued me. I like to delve into the latest technology solutions and trends, and this particular article happened to cover two things I am always interested in reading about: tech and sports.

AI writer

The Sportswriting Machine, an article that appeared in The New Yorker in March, details how automated sportswriting could have the potential to replace the sports beat writer. The Associated Press has already taken a step in this direction: in early March it announced that it would rely on “automation technology” to cover college sports (e.g. baseball and lower-division basketball and football) to which it has traditionally been unable to send reporters. The AP said it would work with Wordsmith, a platform developed by Automated Insights that uses natural language generation to analyze the data from a game and write a “coherent and familiar-looking” recap. It operates by using an algorithm that finds narrative in numbers and, as the company puts it, lets your data tell its story.

According to Robbie Allen, the founder and C.E.O. of Automated Insights, “We could pinpoint, for example, the double in the eighth inning of a baseball game that led to the key run that won the game,” a determination based on the quantitative information gathered during the game. “For basketball, Wordsmith looks at play-by-play information, win probability at different points, plays that might have made the biggest difference in a game.” The software might identify something like a key missed free throw, the kind of detail a sportswriter would look for when writing the lede of a story.
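To make the idea of “finding narrative in numbers” concrete, here is a minimal sketch of how a program might pick out the key play from play-by-play data and build a lede around it. This is not Wordsmith's actual algorithm, which I have not seen; the Play structure, the function names, the team names, and the sample win-probability figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Play:
    """One play-by-play entry (hypothetical structure, not Wordsmith's)."""
    inning: int
    description: str
    win_prob_before: float  # home team's win probability before the play
    win_prob_after: float   # home team's win probability after the play


def key_play(plays):
    """Return the play with the largest absolute swing in win probability."""
    return max(plays, key=lambda p: abs(p.win_prob_after - p.win_prob_before))


def ordinal(n):
    """Format 8 as '8th', 1 as '1st', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def lede(home, away, home_runs, away_runs, plays):
    """Build a one-sentence opening around the key play."""
    play = key_play(plays)
    winner, loser = (home, away) if home_runs > away_runs else (away, home)
    high, low = max(home_runs, away_runs), min(home_runs, away_runs)
    return (
        f"{play.description} in the {ordinal(play.inning)} inning proved "
        f"decisive as {winner} beat {loser} {high}-{low}."
    )


# Made-up sample data for an imaginary game.
sample_plays = [
    Play(3, "A solo home run", 0.50, 0.62),
    Play(8, "A two-out double", 0.45, 0.83),
    Play(9, "A game-ending strikeout", 0.95, 1.00),
]

print(lede("Home", "Visitors", 4, 3, sample_plays))
```

In this toy version the “narrative” is simply the single largest swing in win probability. A real system presumably weighs many more signals, and this sketch has no way to notice the sideline conversation a beat reporter would ask about afterward.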

What’s more notable is that the algorithm takes only a split second to put together a game recap once the game is final. One consideration, though, is that such game analysis could come across as short and even impersonal, almost as if it were written by Siri. Steve Wiseman, the sports editor and Duke men’s basketball beat reporter for the Herald-Sun, said, “If I notice that certain players are talking during an important timeout, I will ask them about it later in the locker room, and try to figure out what they had been saying.” He continued, “Getting that kind of insight is what being a sportswriter is all about.” Thus, for those who look for that personal angle in a game recap, it might just be missing, although that is likely a matter of individual perception.

Before it turned to sports coverage, the AP faced a challenge in financial reporting, as explained in the 2014 Automated Insights case study The Associated Press Leaps Forward. Every quarter, public companies in the US release corporate earnings, and Associated Press reporters had to pore over the reports, extracting the relevant financial numbers and composing stories based on them. Two problems were apparent. First, the AP could produce only 300 such stories per quarter, which left thousands of potential company earnings stories unwritten. Second, these stories were not only taking up reporters’ time; they were, in the words of USA Today’s Roger Yu, “the quarterly bane of the existence of many business reporters.”

Wordsmith

A statement, that the Associated Press is going robotic, appeared in the USA Today article as the AP found its answer in the Wordsmith platform. In this case, Wordsmith transforms raw earnings data from Zacks Investment Research into a publishable AP story in just a fraction of a second. As a result of using Wordsmith, the AP produced nearly 4,300 quarterly earnings stories, about fourteen times the output of its manual efforts. The stories are said to retain the same quality and accuracy that readers expect from any of AP’s human-written articles, because the Wordsmith team specifically configured the natural language generation engine to write in AP style. The results were reported in January 2015, when AP’s Assistant Business Editor explained automation’s overall benefits.

I looked for more evidence of AI writing and came upon a recent post on Top10.press, Top 10 Amazing (Somewhat Terrifying) Facts about AI, in which one fact points to how artificial intelligence can write. It gives an interesting example, which appeared last year on the Los Angeles Times website, detailing an earthquake that took place on the morning of March 17th. The report came less than three minutes after the earthquake:

“A shallow magnitude 4.7 earthquake was reported Monday morning five miles [8km] from Westwood, California, according to the US Geological Survey. The temblor occurred at 6.25am Pacific time at a depth of 5.0 miles. According to the USGS, the epicenter was six miles from Beverly Hills, California, seven miles from Universal City, California, seven miles from Santa Monica, California, and 348 miles from Sacramento, California. In the past 10 days, there have been no earthquakes magnitude 3.0 and greater centered nearby. This information comes from the USGS Earthquake Notification Service and this post was created by an algorithm written by the author.”

While this may have looked like a wire report quickly drafted by a press agency, it was actually written by a computer. The software took data pulled from seismographs, turned it into figures, and plugged those figures into a story, selecting the relevant information and drafting it in everyday English. This technology was developed in part by Larry Birnbaum, a professor of journalism and the head of the Intelligent Information Laboratory at Northwestern University. He is one of the developers of the Quill system, defined as “Data-Driven Communications at Machine Scale.”
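The “plug the figures into a story” step can be sketched in a few lines. This is only my own illustration of template-based text generation, not the code behind the Los Angeles Times report or the Quill system; the field names and the depth threshold for the word “shallow” are assumptions, while the sample values come from the report quoted above.

```python
def depth_label(depth_miles):
    """Pick an adjective for the depth (the 10-mile cutoff is an assumption)."""
    return "shallow" if depth_miles < 10 else "deep"


def quake_report(event):
    """Fill a fixed sentence template with figures from one earthquake record."""
    return (
        f"A {depth_label(event['depth_miles'])} magnitude "
        f"{event['magnitude']:.1f} earthquake was reported "
        f"{event['day_part']} {event['distance_miles']} miles from "
        f"{event['place']}, according to the US Geological Survey. "
        f"The temblor occurred at {event['local_time']} Pacific time "
        f"at a depth of {event['depth_miles']:.1f} miles."
    )


# Field names are hypothetical; the values are taken from the quoted report.
event = {
    "magnitude": 4.7,
    "depth_miles": 5.0,
    "distance_miles": 5,
    "place": "Westwood, California",
    "day_part": "Monday morning",
    "local_time": "6.25am",
}

print(quake_report(event))
```

The wording lives entirely in the template and only the figures change, which would explain how a report like this can appear within minutes of the event.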

Quill is the creation of Narrative Science (Birnbaum is its Chief Scientific Advisor), whose work is in creating exceptional software to enhance human productivity. Quill is the company’s advanced natural language generation (Advanced NLG) platform. It goes beyond reporting the numbers and creates, according to the company, perfectly written narratives that convey meaning for any intended audience.

Quill

Narrative Science with Quill helps companies automatically transform data into narratives, and the artificial intelligence software dramatically reduces the time and energy people spend analyzing, interpreting and explaining data. Quill produces efficient narratives with insights that might otherwise be buried in Big Data.

Touching once again on the sportswriting perspective, Norman Chad, a sportswriter and syndicated columnist who is seen on ESPN, wrote an article, Couch Slouch: Rise of the sportswriting machines, that highlights both Automated Insights and Narrative Science, claiming that they “are aiming to make sportswriters the next dinosaurs. These companies transfer data into written reports, rendering obsolete the ink-stained wretch sitting in a press box eating free hot dogs.” He asks the question: Can an algorithm really replace a beat writer? His answer: I guess so; never misses a deadline, less chance of libel, no bloated expense reports.

In my mind, especially as one who grew up reading the sports section every day, with the baseball (mainly New York Mets) game recap to go with the box score, I don’t see automation taking over the full process and replacing good sportswriters. However, in this fast-accelerating world of news and technology, it appears that automation can benefit sports reporting in many situations.

I do of course see great advantage in artificial intelligence for writing, as demonstrated by the applications from Automated Insights and Narrative Science, and I look forward to writing more on these developments.

Just as long as they’re not looking to replace bloggers, that is…

The New Yorker: The Sportswriting Machine

Washington Post: Couch Slouch: Rise of the sportswriting machines

Case study: The Associated Press Leaps Forward