Computer science team wins global contest with AI model that translates English to code

IBM will use Vanderbilt model as end-user scripting assistant in its open-source Command Line AI Project

One day, a deep learning model will translate a sentence you type in English into computer code that does what the sentence describes.

A team of Vanderbilt computer scientists has brought that day closer with a top prize-winning model showcased in a competition at one of the two leading international conferences in artificial intelligence and deep learning, the 2020 Neural Information Processing Systems Conference, which ended Dec. 12.

The NLC2CMD competition is run by IBM and is based on IBM’s open-source Project CLAI (Command Line AI), a research and development platform that aims to provide AI on the command line. In the competition, participants build natural language processing models that take a description in English and convert it to the corresponding command in Bash, a Unix command language.

“Basically, it’s artificial intelligence writing computer software,” said Jules White, associate professor of computer science, and member of the three-person Magnum team from Vanderbilt that includes CS graduate research assistants Quchen Fu and Zhongwei Teng. White’s research lab is named the Magnum Research Group.

“Quchen and Zhongwei are working toward a future where, contrary to the current one, software engineers aren’t as in demand as now. You would just tell the computer what you wanted software for and it would generate the correct code to achieve your goal.

“Of course, we are a LOOOOOONNNNG way from that right now, but it is a fun future to work toward,” White said.

The idea is to create natural language processors that let users type what they want to do in English instead of searching for unfamiliar commands for the task at hand. The model translates for them.

The competition awarded two sets of prizes, one for accuracy and one for efficiency. The two teams with the highest accuracy scores, Magnum followed by Hubris (Bell Labs), each won a grand prize of $2,500. AICore (Samsung) won the efficiency prize.

Magnum also will be invited to replace Tellina in the CLAI skill catalog. Tellina is an end-user scripting assistant that can be queried via natural language. It translates a natural language sentence typed by the user into a short, executable script.

“Essentially, they are going to replace the ‘brain’ behind this system with our model,” Fu said. “IBM aims to build an AI assistant and our model can be seen as one of the skills—an intelligent English-to-Bash translator.”

The competition featured 16 teams from around the world. Overall, the competition received more than 200 submissions, split over the development (38), validation (119) and test (47) phases. Conference organizers reported that the 2020 virtual conference was “exceptional and historic” and drew more than 23,000 attendees. The IBM Research Blog reported more than 13,000 attendees in 2019.

“This is an outstanding achievement by our graduate students and a significant win for our team and for Vanderbilt,” White said. “Deep learning is one of the hottest topics in CS research.”

MORE >> Coders worldwide help computers understand natural language, Dec. 11, 2020 IBM Research Blog

Contact:  Brenda Ellis, 615 343-6314