Posts

Showing posts from January, 2021

Feedback Loops in Opinion Formation

The particular problem I want to approach is "how do we talk about systems changing human values and opinions, through the feedback loops that exist in those systems?" (Note: since I wrote this post, I have realized how much Scope Creep matters in research, and my project has been pruned to Training Machine Learning Models on data produced by those models.)

This post is a summary of my initial thoughts on the problem, and some of the directions I plan on taking my research during the next two months. Nothing here should be read as conclusive or final; this is mostly a record of my thoughts in progress.

Research Directions

Here are a few ways I can think of to approach this problem, alongside their pros and cons. I know pro-con analysis isn't great, but in this case it seemed like the most sensible way to present things.

Making a formal model of human cultural evolution

Pros:
- Allows running intervention experiments to verify hypotheses
- Can help pave the pat

New Research Direction

AI Safety

I spent the holidays reading AI Safety literature. I was somewhat familiar with the general concepts before, but this helped me get a deeper understanding of what some of the particular directions are.

In doing this reading, I've become more convinced that scaling up AI is a bad idea. I think that scaling up AI, combined with techniques that improve sample complexity, has a very real chance of producing some kind of AI that is superhuman in intelligence, and that seems to be OpenAI's main research direction. I have the impression that they are fairly considerate about safety in general, but I am still not sure that scaling up is actually a good idea.

Basically, I feel this way because there is not yet a good answer to the question of "how to do AI safety". Stuart Russell's preference learning direction seems like it may eventually get to a decent place, but Paul Christiano's approach for informed oversight still seems like it has some fundamental issues. There'