Letting the problem shape the direction you go
In my November 6 blog post, I talked about my general approach for trying to solve problems and formalize fuzzy ideas. In summary, it has two pieces that you alternate between:
- Do lots of reading, thinking, and brainstorming about tools and perspectives you could use
- Create formal, concrete proposals. Implement them, try them out. They'll probably have an issue somewhere, or be missing some important part of the idea you care about, but that's okay. You repeat this process and continually get closer to your goal.
However, I've come to realize that this description is incomplete. It's a good approach when the tools already exist and what you're trying to do is only a few steps away from them. The reading and thinking help you understand what tools are available and how they relate to your problem, and the formalizing is about trying those tools out and seeing if they actually do what you want.
Yet, sometimes the tools don't already exist. The problem you want to tackle is too many steps away for you to build the intermediate chain.
Initially, it seems like the right response is "okay, just work on things that seem like they're getting you closer, and eventually you'll get there". But I'm not sure this is right.
In game design, you eventually realize that it's really hard to "design something to be fun". What will be fun is very hard to predict in advance, and is often counter-intuitive and surprising. Instead, a much more practical approach is to prototype a ton of ideas and try them out. If they are fun, great, explore that idea more. If they aren't, that's okay; it's better to learn that early on. You slowly start building an intuition for what kinds of things seem likely to be fun, but it's very hard to calibrate that intuition, and it's important to constantly check it against reality by prototyping. This video explains the point really well.
In software development, it can be very tempting when initially building a system to plan out everything in advance: to design a beautiful architecture that does everything you need it to do and is easily configurable for whatever comes up. Yet once you build something, you'll inevitably realize it's not exactly what you (or the client) wanted, and you'll need to make a lot of changes. If you try to pre-design the architecture and the set of things that can be changed, you'll find you've spent a lot of time on relatively unimportant features, while the features you actually end up needing are very difficult to implement without rebuilding much of the system. There are various design methodologies that can help prevent this from becoming a serious problem, and much of the learning that goes on in software is about how to build things so they work nicely and reliably with what comes later. Still, oftentimes you just need to try things out, prototype quickly, and "fail fast" if a direction is not worth pursuing. In doing so, you start to gain an intuition for the clean way to approach certain kinds of software problems, and that expertise then helps you develop good software in whatever domain you specialize in. Yet this knowledge does not always transfer: a significantly different domain often ends up having very different patterns, and much of the learning and intuition-building about where the dragons lie needs to be done again.
Let the wind guide you
A cheesy phrase, but the point I'm trying to make is that oftentimes, exploration is important. With large search spaces, it can be very difficult to know ahead of time whether you are headed in the right direction. If you keep asking "how close am I to my goal of doing X", you'll probably end up digging yourself into a progressively more convoluted hole, because X wasn't where you actually needed to get. Even if you know you are trying to do X (such as solving some open problem), if there aren't clear steps to getting there, it's probable that the intermediate path involves many steps that seem completely unrelated to X.
Instead, you should let the problem itself have a large say in the direction you go. For game design, guiding your decisions by what is "fun" can lead to much better games. For software design, guiding the design process by what features end up being needed and useful can help prevent bloat and poor architecture decisions. It's still important to have a loose feel for where you are trying to go, but there's an important balance here. I'm reminded of Stewart Butterfield, who keeps trying to make games, only for his companies to accidentally make software like Flickr and Slack instead.
I'm starting to realize something similar is true in ML research as well.
I've found that in my process of doing research, I'm generating tons and tons of ideas. I probably have a small novel's worth of research ideas at this point, and it keeps growing. Because I'm new to AI research, almost all of them are probably trivial or bad ideas, but that's okay. In mathematics, software, and game design, I'm used to being able to quickly test out my ideas, so this isn't really a problem. But in ML, the iteration time between experiments is much longer, which makes it harder to develop an intuition for what works and what doesn't.
It's also reminding me of my early days making games. My games would be these hodgepodges of tons of ideas all crammed together, not really having a cohesive whole, and not driven at all by what is "fun".
Because many of the problems I want to work on are multiple steps away from where we currently are, it's plausible that some of the pieces needed will seem completely unrelated to those end goals. While there's utility in understanding the stepping stones built in other fields and seeing how they might relate, there's also quite a bit of utility in just focusing on smaller, potentially unrelated goals.
I wish I could end this blog post on a conclusive note, but I'm not really there yet. While I understand that I need to let the problems guide me more, I'm still trying to get a good feel for what the proxy signals for "this is a promising direction" actually are. My impression is that this is what people mean when they say experienced researchers have good intuition for promising directions, but I don't know much about what that intuition entails.
I plan on doing more reading about the history of science, and writing by other researchers, to see if there are any insights there. There's a large amount of past experience about what constitutes a promising path that I might be able to draw on to help build my intuition. After all, it's easy to tell what's promising in hindsight.
But I also think it's important to just start doing things. Partially with the goal of developing that intuition, I've started working towards running lots of experiments. I'm loosely prioritizing my ideas using four criteria (there's a rough sketch of this scoring below):
- Do I find it interesting?
- Can I make progress in that direction?
- Can the experiments be run quickly?
- Will the results be insightful? (and will other people care about them)
I'm also trying to focus on the particular ideas around the research direction I discussed earlier. But I imagine things may change over time, and that's okay.
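To make the prioritization a bit more concrete, here's a minimal, hypothetical sketch of how I think about weighing those four criteria: score each idea roughly on each one and sort by a weighted sum. The idea names, scores, and weights below are invented purely for illustration, not taken from any real backlog or tool.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    interest: int      # Do I find it interesting? (1-5)
    tractability: int  # Can I make progress in that direction? (1-5)
    speed: int         # Can the experiments be run quickly? (1-5)
    insight: int       # Will the results be insightful / will others care? (1-5)

def priority(idea: Idea, weights=(1.0, 1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the four criteria; equal weights by default."""
    scores = (idea.interest, idea.tractability, idea.speed, idea.insight)
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical example ideas, just to show the ranking mechanics.
ideas = [
    Idea("toy-scaling-sweep", interest=4, tractability=5, speed=5, insight=3),
    Idea("new-architecture-variant", interest=5, tractability=2, speed=2, insight=4),
]

for idea in sorted(ideas, key=priority, reverse=True):
    print(f"{idea.name}: {priority(idea):.1f}")
```

Of course, the real weighing is much fuzzier than a weighted sum; the point of writing it down this way is just to force myself to notice when an idea scores low on "can I run it quickly" or "will anyone care".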
PS: Many of these ideas were inspired by Kenneth Stanley's Why Greatness Cannot Be Planned: The Myth of the Objective, which I highly recommend.