Q&A with Yoshua Bengio

by Graham Taylor
Aug 1 / 18
This Q&A is part of CIFAR’s series on building a research lab.

Graham Taylor is a CIFAR Azrieli Global Scholar in CIFAR’s Learning in Machines & Brains program and an Associate Professor at the School of Engineering, University of Guelph and the Vector Institute. Yoshua Bengio is Co-Director of CIFAR’s Learning in Machines & Brains program, Full Professor in the Department of Computer Science and Operations Research, and Canada Research Chair in Statistical Learning Algorithms at Université de Montréal.

Graham Taylor (GT): Can you tell me about your first faculty position?

Yoshua Bengio (YB): At Université de Montréal I was the only one doing machine learning or neural nets. I was very motivated and said “yes” to every student that came to me: I should’ve been a bit more selective.

I took advantage of the network outside my department, people I had worked with during my postdoc. I got in touch with Geoff [Hinton] and people in Toronto. Networking with people doing similar things is important, especially since I was the only one doing that stuff here.

I also got lucky in that people here recognized my potential. They immediately gave me some teaching relief.

Program Co-Director, Yoshua Bengio


GT: Right from the beginning?

YB: Yes. So, two courses instead of three for the first seven years. Then I got the Canada Research Chair and it became one. I think that’s probably the most challenging thing: being overloaded with teaching and starting a lab. I regretted working so hard when my second baby was born; I was overwhelmed with the faculty position. In retrospect, I could’ve managed a better life balance.

GT: Did anybody mentor you internally?

YB: No, there wasn’t anyone. Maybe I should’ve reached out to some of the older professors, not necessarily ones doing the same thing. If I had been less shy I would’ve tried to get more feedback. I didn’t realize it was possible. New faculty should just reach out, make the connection. Departments should make that happen. Senior faculty are happy to do it, even if they might not take the initiative.

GT: So just ask, even if it’s not formalized at the university?

YB: That’s right.

GT: Looking at these external networks, what’s the best advice you received as a faculty member?

YB: In my first decade as a professor I had quite a few interactions with Geoff Hinton, even though they were remote. That was very helpful in terms of getting me to focus on things that matter.

One thing I would’ve done differently is not disperse myself in different directions, going for the idea of the day and forgetting about longer term challenges. It’s hard when you’re establishing yourself because you’re anxious about publishing enough to get tenure. But you have to spend at least some of your time focusing on the longer term. The success of your career depends on it. If you’re too much in survival mode, which is an easy thing to fall into, you may miss something important. The discussions I had with Geoff helped me do that.

GT: Do you think this is exacerbated now that applied machine learning is in the spotlight, with companies or collaborators coming to you with projects where you apply it to a very specific problem? Again the danger is to disperse yourself.

YB: I think each person has to find their path. In order to become good at something and make breakthroughs you have to become an expert. So if you’re going to do applications, and especially if you’re a young faculty member, then you should focus on one and become the strongest person on earth in this subject.

That’s one danger with applied machine learning. It can be used for so many things, right? In the ’90s I was touching many applications.

CIFAR Azrieli Global Scholar, Graham Taylor

GT: So it wasn’t much different than it is now?

YB: In the ’90s there was a lot of industrial interest in neural nets and machine learning. The Canadian system encouraged collaborations with industry; it was tempting to do that to get funding for basic science. So I used some of the contract money to fund longer-term investigations. I don’t think it’s what was expected, but the system implicitly encourages it by underfunding basic research. We should give stronger value to long-term questions because this is what has led us to the amazing progress we have seen in AI.

Let me add something about grants.

One thing I didn’t realize when I started is that when you write a proposal to an organization like the Natural Sciences and Engineering Research Council of Canada (NSERC), they are less concerned that you do what you said you would. They want you to do good work that you can report on years later.

In research it’s hard to predict what will be the hot thing, what problems you’ll encounter. It’s important to be adaptive and NSERC allows that. It’s not necessarily true when you have a contract with a company for something that’s going to help them. But you shouldn’t feel that you’re tied to what you said, so long as you satisfy whoever is giving you the money. In the case of NSERC it’s basically just do good science, publish and remain flexible in your research path.

GT: What are some of the best practices for managing your lab? You have many students, lots of equipment, many projects.

YB: Get help. Early on I didn’t have a lot of money for postdocs. You choose postdocs who already know the stuff, so they don’t need two years to get up to speed and can help manage a larger team. We don’t learn when we do a PhD how to manage a team or lead a group, but we should. One thing that has benefited me is to recognize natural leadership ability in the students. I’ve very often used internships, so I have undergrads or grad students from other places coming to the lab, and PhD students who want to manage and are good at it. Sometimes it’s just one student; some end up supervising multiple students.

We shouldn’t underestimate the ability of younger people to do a better job than their elders as managers.

GT: With the types of post-PhD job offers right now, is it becoming more difficult to hire high quality postdocs?

YB: Yes and no. I don’t think I’m a good example.

GT: Right, because people are attracted here. Someone getting started though?

YB: One thing that has worked for me is to find somebody who has the right background in math or physics and has dabbled in machine learning: these people can learn the skills very fast. So if you consider two- or three-year postdocs, these kinds of people are worth recruiting. They wouldn’t be able to get a job at Google Brain or something because they haven’t yet demonstrated proof of their machine learning abilities. But it’s a gamble that has worked well for me in certain instances.

GT: What can you say to convince postdocs to take on a postdoc position as opposed to going out to industry right away?

YB: If they come to a machine learning lab and participate in published research it will greatly increase their value for industry. So even if later on that’s what they want to do, they’ll be able to get better positions, better pay if they start by doing a postdoc. Now of course it depends. Some PhD students don’t need postdocs to get top positions, right? In that case the factor might be something like, “Well if you want to do an academic career,” because money is not the only thing that people care about. Then doing a postdoc is actually a good thing: to learn more about how academia works, be involved in management roles and even grant writing and things like that.

Industry is not one uniform thing. There’s a huge difference between a job in a basic research lab and a job where you won’t be given as much freedom and you’ll be a skilled technician for either other researchers or for applied research with heavy development work and so on. There are lots of industrial jobs and few of them are actually of the kind that a postdoc might prefer.

GT: But their number might be growing, with Google Brain, DeepMind, FAIR and these types of places?

YB: Yes, even new companies like Element AI have these kinds of groups.

GT: In terms of personnel, what do you think is the ideal size?

YB: It depends on the professor, and on the experience of that person in managing groups. I started with three people and now I have a humungous group. But I could never have done this in one shot. So I gradually learned how to manage more people and built infrastructure and funding. You should grow at your own speed. Some people will never be comfortable with more than five or six students and that’s fine. Of course when you have more students you can produce more papers. But then you spend less time with each student so it might not be as rewarding for them.

One thing that worked well with the larger groups that I’ve had for 15 years or so is to facilitate collaboration. In other words to not isolate each student with his or her particular project, but make it flexible for them to strike new collaborations. Then they are not in a one-on-one relationship with the professor; they are part of a big network.

GT: Peer to peer communication.

YB: Exactly. It’s more socially rewarding. If there’s one person they depend on for feedback who’s not very available, they don’t feel good. But if they have these ten other colleagues, it creates a much richer network.

GT: So what are some mechanisms for creating an environment? I imagine some would be physical, some would mean subtly changing the lab culture.

YB: Physical proximity: asking students to spend their day in the lab and not work at home. Giving them freedom to collaborate and strike new projects outside of what you’ve suggested, even with other professors. Minimizing barriers between students associated with different professors. It makes the group larger and frees students to enter in discussions with other faculty.

And then we organize regular events like reading groups, seminars and outside activities.

GT: I recently had a conversation about co-supervision with someone who had worked in the U.S. This person said that there, the first topic among the supervisors is how funding will be split up. In Canada that doesn’t really come up.

YB: Students cost a lot less here than the U.S. It’s a big factor, right? But maybe it’s a cultural thing as well. It’s important to let go of the funding aspect and prioritize working together, not necessarily having expectations. Here’s another thing that’s connected both to grants and to collaborations: I don’t have a lot of faith in pre-arranged marriage. In other words, collaborations don’t happen because we’ve made a formal agreement that you and I are going to co-supervise the student and the student is going to work on that. Really it’s happening because we have these common interests and we have regular exchanges and things happen, right?

And hopefully you’re running onto some ideas. This is what needs to be encouraged. If you have a pre-arranged marriage it could work but it might also be like a prison that forces a particular project and relationship between person A and person B. Maybe the relationship would be more natural between person A and person C.

In the main lab for example we had this informal notion that there’s a big pool of money and you can collaborate with whoever you want and you’ll be funded anyway.

GT: Is that because the principal investigators involved in the project decided to pool resources?

YB: My funding became the common pool. I don’t say that this is the model, but when there’s a senior professor it’s usually easier to get funding. So we wrote grants where I was doing the asking. Same thing with contracts. At the end of the day if people don’t feel as constrained by money questions what dominates is having fun and exploring together.



GT: It’s an important point for new faculty to consider if they are thinking about where to go: somewhere there’s nobody else working in the area? Or somewhere with existing faculty and a senior member, where you could get this kind of collaboration going?

YB: Yeah. It’s much easier when you go to an existing group so long as you can get along with the main people there.

So if you like the spirit of the group it’s a lot easier for junior faculty to get started. They don’t have to be stressed about funding and they can bounce ideas off others, get co-supervisions, get feedback early on for their work.

GT: I’d like to talk about recruiting students. Maybe it was different prior to this recent deep learning craze. How do you respond when you don’t have a tonne of interest? How do you deal with an immense amount of interest?

YB: First, it’s not just doing the research, it’s making it known. Going to workshops and conferences, visiting other labs. You don’t have to wait to be invited. You can say, “Hey, I’ll be in the area. Like me to give a talk?” Then you get to be known in the network. If you do formal collaborations through grants, again, there’s going to be this networking aspect.

That’s a tremendous help for recruiting and of course there is also teaching. One of the side effects of teaching is you get to be known by students. Especially your graduate class or maybe final year undergrad. You can talk about the things you care about and establish relationships. You can see how good students are and try to coax them to your lab.

So I had summer interns from undergrad and other places. This way, you can see whether a person has potential for research. It’s much safer to do that than to recruit somebody out of a single interview because that person is going to stay for five years, right?

Now of course I think there were like 700 applications here last year and there’s probably going to be 1,000 this year, and an intake of maybe a couple of dozen. So you have to be organized. You need secretarial help, you need people who are going to write scripts to do this as automatically as possible. It’s an effort to set these things up but if you don’t it’s much worse, because you’re going to be overwhelmed by the number of applications and can’t answer every email personally.

GT: So the final thing we’ll discuss is industry collaboration. How do you know if a collaboration is the right fit?

YB: You don’t, and you only realize a couple of years later. Usually what happens is a mismatch between expectations on both sides, so you have to be careful about that and to educate the people in industry about what academia can do for them. It’s important they understand that academics are not cheap labour. They’re not producing products, but can have amazing ideas that may transform business. So industry needs to understand that this is only part of the investment. They also have to have people inside who will go from the algorithms and prototypes into the products, otherwise the collaboration is doomed to fail. It’s tempting not to talk about that because that means more costs for the company. But you have to.

GT: Have you ever had to disengage from a relationship with a company?

YB: No. What happens is both sides agree that it’s not worth renewing the contract.

GT: Because I guess they usually would have a fixed time?

YB: Yeah so those contracts would be one, two, three years typically.

GT: In closing is there anything we’ve missed that you would like to share with new faculty members?

YB: Listen to your gut. Many people lack the self-confidence necessary for that and they miss opportunities. As researchers our main job is to have useful ideas that advance knowledge. These ideas always come from somewhere hidden in our brain and we must cultivate our ability to give that idea-generation process enough time. You need space in your week to think, to not work on programming or writing or even reading. Just think about the big questions that bug you.

GT: Thank you. I think a lot of the things you said were some of the most important lessons I learned over a few years on the job. But for somebody to see this in year one or two could be very valuable. Like you said, you’re not trained as a PhD to lead a lab.
