It was a great conference. The organizers had to break with tradition to accommodate the rapid growth in submissions and attendance, but despite my nostalgia, I feel the changes were beneficial. In particular, leveraging parallel tracks and eliminating poster spotlights allowed for more presentations while ending the day before midnight, and the generous space allocation per poster really improved the poster session. The workshop organizers apparently thought of everything in advance: I didn’t experience any hiccups (although we only had one microphone, so I got a fair bit of exercise during discussion periods).
Here are some high-level themes I picked up on.
Openness. Two years ago Amazon started opening up their research, and they are now a major presence at the conference. This year at NIPS, Apple announced they would be opening up their research practices. Clearly, companies are finding it in their best interests to fund open basic research, which runs counter to folk-economic reasoning that basic research appears to be a pure public good and therefore will not be funded privately due to the free-rider problem. A real economist would presumably say that is simplistic undergraduate thinking. Still I wonder, to what extent are companies being irrational? Conversely, what real-world aspects of basic research are not well modeled as a public good? I would love for an economist to come to NIPS to give an invited talk on this issue.
Simulation. A major theme I noticed at the conference was the use of simulated environments. One reason was articulated by Yann LeCun during his opening keynote: (paraphrasing) “simulation is a plausible strategy for mitigating the high sample complexity of reinforcement learning.” But another reason is scientific methodology: for counterfactual scenarios, simulated environments are the analog of datasets, in that they allow for a common metric, reproducible experimentation, and democratization of innovation. Simulators are of course not new and have had waves of enthusiasm and pessimism in the past, and there are a lot of pitfalls which basically boil down to overfitting the simulator (both in a micro sense of getting a bad model, and in a macro sense of focusing scientific attention on irrelevant aspects of a problem). Hopefully we can learn from the past and be cognizant of the dangers. There’s more than a blog post’s worth of content to say about this, but here are two things I heard at the dialog workshop along these lines: first, Jason Williams suggested that relative performance conclusions based upon simulation can be safe, but that absolute performance conclusions are suspect; and second, Antoine Bordes advocated for using an ensemble of realizable simulated problems with dashboard scoring (i.e., multiple problems for which perfect performance is possible, which exercise apparently different capabilities, and for which there is currently no single approach that is known to handle all the problems).
Without question, simulators are proliferating. I noticed the following discussed at the conference this year:
and I probably missed some others.
By the way, the alternatives to simulation aren’t perfect either: some of the discussion in the dialog workshop was about how the incentives of crowdsourcing induce unnatural behaviour in participants of crowdsourced dialogue experiments.
GANs. The frenzy of GAN research activity from other conferences (such as ICLR) colonized NIPS in a big way this year. This is related to simulation, albeit more towards the mitigating-sample-complexity theme than the scientific-methodology theme. The quirks of getting the optimization to work are being worked out, which should enable some interesting improvements in RL in the near term (in addition to many nifty pictures). Unfortunately for NLU tasks, generating text from GANs is currently not as mature as generating sounds or images, but there were some posters addressing this.
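Part of why text is harder: drawing a discrete token is not a differentiable operation, so the discriminator's gradient has no path back into the generator. One relaxation, named in the title of one of the papers listed further down, is the Gumbel-softmax distribution. Here is a minimal pure-Python sketch of a single relaxed sample. It is an illustration only, not code from any of the papers, and the function name and example probabilities are made up.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng=random):
    """Return one relaxed one-hot sample from unnormalized log-probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    perturbed = []
    for logit in logits:
        u = rng.random()
        while u == 0.0:  # random() can return 0.0; avoid log(0)
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) via inverse transform
        perturbed.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(perturbed)
    exps = [math.exp(p - top) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-token vocabulary with probabilities 0.7, 0.2, 0.1.
logits = [math.log(p) for p in (0.7, 0.2, 0.1)]
rng = random.Random(0)
soft = gumbel_softmax_sample(logits, temperature=1.0, rng=rng)    # smooth vector
sharp = gumbel_softmax_sample(logits, temperature=0.01, rng=rng)  # close to one-hot
```

The argmax of each sample follows the original categorical distribution (the Gumbel-max trick), and lowering the temperature pushes the sample toward a one-hot vector while the gradients get noisier.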
Interpretable Models. The idea that a model should be able to “explain itself” is very popular in industry, but this is the first time I have seen interpretability receive significant attention at NIPS. Impending EU regulations have certainly increased interest in the subject. But there are other reasons as well: as Irina Rish pointed out in her invited talk on (essentially) mindreading, recent advances in representation learning could better facilitate scientific inquiry if the representations were more interpretable.
Papers I noticed
Would you trust a single reviewer on Yelp? I wouldn’t. Therefore, I think we need some way to crowdsource what people thought were good papers from the conference. I’m just one jet-lagged person with two eyeballs (btw, use a bigger font, people! It gets harder to see the screen every year …), plus everything comes out on arXiv first, so if I have already read it I don’t even notice it at the conference. That makes this list weird, but here you go.
- Generating Text via Adversarial Training, GANS for Sequences of Discrete Elements with the Gumbel-softmax Distribution, and Adversarial Evaluation of Dialogue Models. I’m interested in techniques that are relevant to simulating or evaluating dialogue systems.
- Building Machines That Learn and Think Like People. The talk was great, so I want to dig into the paper. The talk explored how humans are leveraging lots of priors that we probably want to build into our systems, with some specific observations resulting in actionable research directions. (This appears relevant to dialog, since this line of research might explain the pseudo-intelligibility of statements like “the blorf flazzed the peezul.”)
- Learning values across many orders of magnitude. At first blush this might appear to be optimization minutiae, but this problem is pervasive in counterfactual setups, and I’m a big fan of scale invariance as a useful prior.
- Reward Augmented Maximum Likelihood for Neural Structured Prediction. If you squint, this reads as another way to use a world model to mitigate the sample complexity of reinforcement learning (e.g., what if edit distance was just the initial model of the reward?).
- Safe and Efficient Off-Policy Reinforcement Learning. This is an important setting. The particular adjustment is reminiscent of a previously proposed estimator in this area, but nonetheless this is interesting.
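To make the "edit distance as the initial reward model" reading of Reward Augmented Maximum Likelihood concrete, here is a rough sketch of its exponentiated payoff distribution: target outputs are weighted in proportion to exp(reward / temperature), with the reward taken to be negative edit distance to the reference. The candidate list, sentences, and function names are invented for illustration; restricting to a small hand-picked candidate list is a simplifying assumption, not how the paper draws its samples.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two sequences (of tokens or characters)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def exponentiated_payoff(candidates, reference, temperature):
    """Normalized weights proportional to exp(-edit_distance / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rewards = [-edit_distance(c, reference) for c in candidates]
    best = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - best) / temperature) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical reference and candidates, tokenized on whitespace.
reference = "the cat sat".split()
candidates = [s.split() for s in ("the cat sat", "the dog sat", "a dog ran")]
target_weights = exponentiated_payoff(candidates, reference, temperature=1.0)
```

Training would then sample targets according to these weights and apply ordinary maximum likelihood to them. As the temperature goes to zero, all of the weight collapses onto the reference and plain maximum likelihood is recovered.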
Also this paper was not at the conference, as far as I know, but I found out about it during the coffee break and it’s totally awesome:
- Understanding deep learning requires rethinking generalization. TL;DR: convnets can shatter the standard image training sets when the pixels are permuted or even randomized! Of course, generalization is poor in this case, but it indicates they are way more flexible than their “local pixel statistics composition” architecture suggests. So why do they work so well?