Beware the Important Problem (aka, Scope Creep in Research)

When doing research, it's easy to say "I want to work on the important problems. So I'll find the problems that seem like they'll do the most good, and try to solve them."

I'm familiar with the notion that some problems are too hard to solve. This happens all the time in mathematics. After spending quite a bit of time on such problems, I've learned the hard way that (as Richard Hamming says) "the important problems should be restricted to the problems that have a plausible line of attack". You should keep around a lot of problems you want to tackle, and when a plausible line of attack on one comes up, pursue it and see if it works out. It's sometimes helpful to spend time on the hard problems so you know what they "feel like" and what approaches people have commonly tried, and so you can recognize when a plausible approach shows up. But generally you shouldn't waste too much time on them.

However, when you're approaching fuzzier problems, it can be harder to detect when a problem is too hard. Many of your approaches will sort of work, and seem sort of like viable attacks. But they'll also feel like they have plenty of things missing, and feel pretty far from the actual problem you want to solve.

I think there's a different technique that's useful in these cases: "important problems" are often problems that, if solved, would solve a lot of other things. Of course, that's usually a sign that they're very hard and that all the obvious approaches are unlikely to work. But the "solves a lot of other things" part is worth a closer look.

Sometimes, each of the things you wanted the big solution for can be solved in simpler ways. Each of those smaller problems adds constraints that do two helpful things:

- Limit the domain of possible approaches

- Make it easier to tell when you have what you want

And sometimes, if you think more carefully about it, you'll realize that a solution to the larger problem wouldn't actually have helped after all.

In the cases where it would have helped, a much better idea is often to attack those smaller problems. They'll seem less important, and possibly less interesting, than the big problem, sure. But as you attack the approachable problems, you're biting off chunks of the bigger one. If you do this enough, eventually either:

- The bigger problem turns out not to be needed for the vast majority of applications, and you're in a good situation,

or

- The smaller problems turn out to be very difficult too

And if you don't solve the smaller problems, I think that's still okay. Wrestling with a difficult problem can still give you some useful insights:

- You'll get a good "feel" for what is actually difficult about the problem, which helps you recognize potential approaches in the future

- Multiple different problems that reduce to each other give you different lenses with which to approach the "underlying difficult thing"

- You can start to recognize more quickly when you've encountered a difficult problem, accept that it's hard, move on, and save time (reductions to uncomputability, NP-completeness, and hardness-of-approximation bounds are my favorite hammers for this purpose; see the sketch just after this list)
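
As a concrete instance of that last hammer, here's the standard shape of an NP-hardness reduction, sketched for a hypothetical decision problem P (nothing below is specific to any particular problem in this post):

```latex
% Standard NP-hardness reduction pattern, for a hypothetical problem P.
% Step 1: pick a known NP-complete problem, e.g. 3-SAT.
% Step 2: give a polynomial-time map f from 3-SAT instances to P instances
%         that preserves the answer:
\[
  \varphi \text{ is satisfiable} \iff f(\varphi) \text{ is a yes-instance of } P.
\]
% Step 3: conclude that a polynomial-time algorithm for P would decide 3-SAT,
% so unless P = NP no such algorithm exists; accept it's hard and move on.
```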

Essentially, the point is that for fuzzier problems, "viable directions of attack" are best phrased as "simpler situations where it's easier to know whether you have the right answer, and where the added constraints make finding that answer tractable".

An example I've recently encountered of this is society simulation. Obviously I knew the problem was very difficult: it has been the domain of sociology and plenty of other fields for a very long time. It's not even necessarily a problem one can solve in general, because the second you use your model to influence the world, that model becomes part of the world and is no longer comprehensive.

Yet I still recognized that a better answer to this problem would be really helpful for a lot of other problems I cared about. I got so distracted by the "important problem" of understanding how opinions move around in society that I ended up neglecting plenty of the ideas I had proposed that were plausible attacks on the original problems I cared about. Those ideas weren't viable directions of attack on the bigger problem, so they seemed like they weren't worth investigating further.

This sounds somewhat obvious in hindsight, but I think that this is a really easy trap to fall into if you aren't careful. It's very important to specify precisely what problem you want to solve:

- what are the constraints?

- what does a good solution look like?

- what pieces don't you need?

and you should avoid trying to solve anything more general than that. Be careful not to pre-specify the outcome, though: answering these questions often involves some exploration, and that's natural and okay. But it's important to keep coming back to them.

If you don't, it's very easy to fall into a mode where you poke at something more general than you actually need and end up going in circles. You'll have plenty of plausible approaches to your original problem, but overlook them because they don't solve the more general things you don't need. Constantly question your assumptions about what you need; as your understanding of the problem grows, you'll start to see how many pieces of it are just excess you can ignore in your particular case. Of course, if you eventually produce a solution to your particular problem, it's worth seeing how well it generalizes, but if it doesn't, that's okay.

This is very reminiscent of feature creep in game programming. Feature creep is the problem where you're making a game and have all these ideas for things that would be cool; you decide you want to implement all of them, and the game gets progressively more bloated, messy, and less cohesive. If you aren't careful, this can turn a simple game into a 10-year project. Minimum viable products, small prototypes, and filtering features based on "are they fun" all help, but it's a constant battle you need to keep fighting.

There's also the related phenomenon where you think your software is going to look like X and need Y configurable features, so you pre-design a pretty architecture that lets you nicely swap out the Y features. Inevitably it turns out you didn't actually need the Y features; you needed Z features that are non-trivial to swap out without an entire refactor, and all the work setting up the nicely configurable Y features was a waste of time. In a previous post I think I appreciated the importance of exploration, but I underestimated the second piece of this, which is everything around "minimum viable product", "fail fast", and "avoid scope creep". It's hard to have scope creep in formal mathematical problems because "what a solution looks like" is precisely defined, but in fuzzier research problems scope creep is just as much of an issue as it is in software development.
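
To make that failure mode concrete, here's a minimal, hypothetical Python sketch. Everything in it (SaveFormat, JsonSave, Game, the choice of save formats as the Y axis and the physics step as the Z axis) is invented for illustration, not taken from any real project:

```python
# Hypothetical illustration of premature configurability.
from abc import ABC, abstractmethod
import json


# The "Y feature" I guessed I'd need to swap: save-file formats.
class SaveFormat(ABC):
    @abstractmethod
    def dump(self, state: dict) -> bytes: ...

    @abstractmethod
    def load(self, raw: bytes) -> dict: ...


class JsonSave(SaveFormat):
    def dump(self, state: dict) -> bytes:
        return json.dumps(state).encode()

    def load(self, raw: bytes) -> dict:
        return json.loads(raw.decode())


# ...plus a registry, a config entry, tests for each format...
SAVE_FORMATS: dict[str, SaveFormat] = {"json": JsonSave()}


class Game:
    def __init__(self, save_format: str = "json"):
        # The pluggable axis that, in practice, never varied:
        self.save = SAVE_FORMATS[save_format]
        # Meanwhile the "Z feature" that actually needed to change
        # (say, the physics step) is hard-wired throughout:
        self.gravity = 9.8

    def physics_step(self, pos: float, vel: float, dt: float) -> tuple[float, float]:
        # Swapping this integrator means touching every call site:
        # the refactor the plugin layer never bought me.
        vel -= self.gravity * dt
        return pos + vel * dt, vel
```

All the configurability budget went to the axis that never varied, while the axis that actually varied got none; figuring out which axis is which is exactly what small prototypes are for.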
