An undergraduate machine learning course led Huan Ling to author groundbreaking research in the field – before he even began his fourth year of studies at the University of Toronto’s department of computer science.
Ling and his co-author Sanja Fidler, an assistant professor in U of T Mississauga’s department of mathematical and computational sciences, have brought “humans into the loop” of training a neural network with natural language feedback, an area Fidler says hasn’t been deeply explored.
Ling says an image is usually described as someone doing something at some place. The algorithm he and Fidler created tries to predict such descriptions automatically.
Ling took it one step further by building a reinforcement learning model so that individuals can “teach” the algorithm how to improve, using natural language – just as a parent teaches a child.
“Now, you have a link between the learning algorithm and a human via language,” says Fidler. “You can influence how a 'robot' learns just by describing the mistakes it is making.”
These corrections shape a reinforcement learning model, improving the algorithm’s overall performance.
“Reinforcement learning is driven by some reward – a scalar – that tells you how well you're doing,” says Fidler. “So now instead of that numerical reward, we can directly use language.”
Fidler is one of the co-founders of the Vector Institute. Launched earlier this year, Vector builds on the expertise of the U of T computer science department’s pioneering work in neural networks and deep learning, and positions Toronto as Canada’s epicenter for AI.
“I still feel nervous when I recall the time I approached Sanja with my ideas. I didn't even know recurrent neural networks,” says Ling, whose interest in machine learning was first piqued when he studied Introduction to Machine Learning with U of T’s Raquel Urtasun and Richard Zemel.
From the papers Fidler suggested he read, Ling chose a U of T and MIT paper on aligning book text with movies.
“Within a week he had a demo on image captioning running. I was like, ‘Wow! Who is this student?’” Fidler says, noting Ling's work has already received attention from academic peers. Fidler has also shared Ling’s work at a recent Google summit and will give a presentation at Facebook this fall.
There’s been no study break for Ling this summer. He’s excited to see what the future holds for artificial intelligence in Toronto.
He’s not only adding to his work on the image captioning project, but he’s also collaborating with another student on segmenting objects – like bikes, cars and people – in photos using the Cityscapes Dataset for autonomous driving. Ling says the future lies in applying these techniques to the real world.
The research was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC). The researchers also acknowledge NVIDIA for donating the GPUs used for this research.