Posts

Letting the problem shape the direction you go

In my November 6 blog post, I talked about my general approach for trying to solve problems and formalize fuzzy ideas. In summary, it has two pieces that you alternate between:

- Do lots of reading, thinking, and brainstorming about tools and perspectives you could use
- Create formal, concrete proposals. Implement them, try them out.

They'll probably have an issue somewhere, or be missing some important part of the idea you care about, but that's okay. You repeat this process and continually get closer to your goal. However, I've come to realize that this description is incomplete. It's a good approach when the tools already exist, and what you are trying to do is just a few steps away from the existing set of tools. The reading and thinking help you understand what tools are available and how they relate to your problem, and the formalizing is about trying out tools and seeing if they actually do what you want. Yet sometimes the tools don't already exist. The pr...

Research Direction

The research direction I've chosen is the one I've been discussing previously: I want to try to make synthetic language tasks that help transfer performance to real-world language tasks. In this post I'm going to motivate this problem from a few different perspectives: Transfer Learning and Understanding Inductive Biases.

Transfer Learning

When you want to teach a model a skill, one way to do this is by feeding the model labeled (input, output) pairs. If you don't give it enough data, it'll be confused about what you wanted and won't properly learn the task. Eventually you can give it enough data and it'll figure out what you are trying to get it to do, allowing it to generalize to data it hasn't seen. One way to think of this is that the model starts out with a hypothesis space of "here are all possible things they might be trying to teach me", and as you feed it data, it can eliminate hypotheses. Eventually it's thrown...
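As a toy illustration of that hypothesis-elimination picture (my own sketch, not code from the post), here is a learner that starts with a small hypothesis space and throws out any hypothesis inconsistent with each labeled (input, output) pair it sees:

    # Hypothetical example: candidate functions the "teacher" might intend.
    hypotheses = {
        "identity": lambda x: x,
        "double": lambda x: 2 * x,
        "square": lambda x: x ** 2,
        "negate": lambda x: -x,
    }

    data = [(2, 4), (3, 9)]  # labeled (input, output) pairs

    for x, y in data:
        # Keep only the hypotheses consistent with this example.
        hypotheses = {name: h for name, h in hypotheses.items() if h(x) == y}
        print(f"after ({x}, {y}): {sorted(hypotheses)}")
    # after (2, 4): ['double', 'square']
    # after (3, 9): ['square']

The first pair is ambiguous (both doubling and squaring fit), so the learner needs more data to pin down the intended task; that's the sense in which each example eliminates hypotheses.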

Biweekly update

I'm working on a fairly detailed post about linguistic biases and how to formalize them as open-ended tasks, but it isn't quite ready yet. In the meantime, using this blog more as a journal entry, I'll try to give a summary of my thoughts and research directions. Ideally, what I want to do is create a series of curricula that are open-ended and teach an agent "intelligence". Because that's pretty ambitious, I'm okay with some open-ended curricula that teach some important aspects of intelligence and help augment the curriculum of language models in various ways. There are a few reasons this is still my research direction. My main point is safety: (I'm still doing a lot of reading about AI safety, so my thoughts on these topics are constantly changing and I'm not an expert on them. These are just my current impressions.) One way to create AI would be to feed a model very high resolution EEG signals and have it predict what the underlying...

Research Processes and Inductive Linguistic Biases

In general, I’ve been thinking about two main questions:

- Good ways of conducting research
- Formalizing the notion of “breaking language learning into synthetic tasks that capture most of the problem”

I’ve received a few pieces of advice for research processes that I’ve been reflecting on. There are two ways to go about “getting something to work”. One is to fix the task, then throw every tool you can think of at it until your model does what you want. The other way is to use a standard algorithm on fairly default settings, and keep adjusting the task, problem setup, etc. until things work. Either approach is warranted in different cases. The first makes more sense on clearly defined tasks we know we want to do well on but haven’t managed to do well yet. The second is much better for exploratory work (“trying to take a vague idea of behavior you want to see and condensing that into a formal setup”), as it helps you keep a clearer perspective and doesn’t muddy the waters of potential th...

10/12-10/23

The last two weeks in the OpenAI Scholars program have been really great. I've met a lot of really cool people and have been learning a ton. I decided that to start out, I wanted to get experience with using PyTorch to implement various things, roughly from scratch. The fast.ai course of 2019 (especially part two) was really helpful for showing me how to do this, but doing it myself has been really useful in drilling in the knowledge of how things work. Along the way, I've learned a few interesting things. The sample standard deviation is biased for small sample sizes. The intuitive explanation is that it's calculated by subtracting the sample mean from each of your samples, squaring the differences, and averaging them (then taking the square root). The sample mean is likely to be closer to your samples than the true mean is (since it's an average of your samples). Thus, the differences are likely to be smaller, and your standard deviation will be an underestimate. One interesting poi...
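To see that bias concretely, here's a small sketch of my own (not from the post) using NumPy: draw many 5-element samples from a standard normal distribution, whose true standard deviation is 1, and compare the uncorrected estimator with the Bessel-corrected one.

    # Toy demonstration (my sketch): the uncorrected sample standard
    # deviation systematically underestimates the true value for small n.
    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 5, 100_000
    samples = rng.standard_normal((trials, n))  # true std is 1

    uncorrected = samples.std(axis=1, ddof=0).mean()  # divides by n
    corrected = samples.std(axis=1, ddof=1).mean()    # divides by n - 1

    print(f"uncorrected:      {uncorrected:.3f}")  # noticeably below 1
    print(f"Bessel-corrected: {corrected:.3f}")    # closer to 1

Note that even the Bessel-corrected estimate comes out slightly below 1: dividing by n - 1 unbiases the variance, but the square root is concave, so the standard deviation itself remains a little biased.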