AI Research Blog

Showing posts from January, 2021

Feedback Loops in Opinion Formation

January 29, 2021

The particular problem I want to approach is: how do we talk about systems changing human values and opinions through the feedback loops that exist in those systems? (Note: since I wrote this post, I have realized how much scope creep matters in research, and my project has been pruned to training machine learning models on data produced by those models.) This post is a summary of my initial thoughts on the problem and some of the directions I plan to take my research during the next two months. Nothing here should be read as conclusive or final; this is mostly a record of my thoughts in progress.

Research Directions

Here are a few ways I can think of to approach this problem, alongside their pros and cons. I know pro-con analysis isn't great, but in this case it seemed like the most sensible way to present things.

Making a formal model of human cultural evolution

Pros:
- Allows running intervention experiments to verify hypotheses
- Can help pave the pat...

New Research Direction

January 15, 2021

AI Safety

I spent the holidays reading AI safety literature. I was somewhat familiar with the general concepts before, but this helped me get a deeper understanding of what some of the particular directions are. In doing this reading, I've become more convinced that scaling up AI is a bad idea. I think that scaling up AI, combined with techniques that improve sample complexity, has a real chance of producing some kind of AI that is superhuman in intelligence, and that seems to be OpenAI's main research direction. I have the impression that they are fairly considerate about safety in general, but I am still not sure that scaling up is actually a good idea. I feel this way because there is not yet a good answer to the question of "how to do AI safety". Stuart Russell's preference learning direction seems like it may eventually get to a decent place, but Paul Christiano's approach to informed oversight still seems to have some fundamental issues. There...