Showing posts from February, 2021

Encoding cellular automata into gradient descent

After downscoping my project to just "training a model on its own outputs", I've been writing up a first draft on Overleaf. I found it more difficult initially than writing a blog post, because I felt like it was "supposed to be good", whereas in a blog I don't really care who reads it, so I can just write out my thoughts like I'm explaining them to someone. Eventually I learned that worrying about whether your writing is good is exactly what makes it hard to write, and it's better to just get a bad rough draft down and then polish it into what you want. I have to keep reminding myself of this, but it does make the writing workable. Honestly, though, I feel like I downscoped my project a little too much. I have a detailed analysis written up (which I'll eventually post) of why I downscoped the way I did, and unfortunately I feel like further research in this direction would take more time than I have. But I feel like my writeup won't take the full month I have remaining,...

Beware the Important Problem (aka, Scope Creep in Research)

When doing research, it's easy to say "I want to work on the important problems. In that case, I'll find the problems that seem like they'll do the most good, and try to solve them." I'm familiar with the notion that some problems are too hard to solve; this happens all the time in mathematics. After spending quite a bit of time on them, I've learned the hard way that (as Richard Hamming says) "the important problems should be restricted to the problems that have a plausible line of attack". You should keep around a lot of problems you want to tackle, and when a plausible line of attack comes up, pursue it and see if it works out. Sometimes it's helpful to spend some time on the hard problems so you know what they "feel like" and what approaches people have commonly tried, so you can recognize a plausible approach when it shows up. But generally you shouldn't waste too much time on them. However, when you are approachi...

On AI Alignment

(This isn't my bi-weekly progress blog post; this is a less polished series of ramblings about AI safety I've been musing over for the past few weeks. Feel free to skip it; my bi-weekly blog post is here.) I use writing to work out my thoughts. This post is in two pieces: a summary of my thoughts and what I think the takeaways should be, and a longer "working out why things are the way they are" section. I don't expect anyone to read the "working out" sections; the summary should be the most useful and interesting part. This post is still in progress; I keep tweaking it as I think of new things. There are a few assumptions within alignment that can lead to vastly different predictions about what the right research direction is. Here's the rough process I've been through in understanding these opinions: Assumption 1: There will be a slow takeoff, so we need to do "black box alignment", where multiple people build systems simultaneously. It's ...