I'm an admin of this site; I work full-time on trying to help people on LessWrong refine the art of human rationality.
Longer bio: www.lesswrong.com/posts/aG74jJkiPccqdkK3c/the-lesswrong-team-page-under-construction#Ben_Pace___Benito
I think it's fairly self-evident that you should have exceedingly high standards for projects intending to build AGI (OpenAI, DeepMind, others). It's really hard to reduce existential risk from AI, and I think much thought around this has been naive and misguided.
(Two examples of this outside of OpenAI include: senior AI researchers talking about military use of AI instead of misalignment, and senior AI researchers responding to the problems of specification gaming by saying "objectives can be changed quickly when issues surface" and "existential threats to humanity have to be explicitly designed as such".)
An obvious reason to think OpenAI's impact will be net negative is that they seem to be trying to reach AGI as fast as possible, and trying a route different from DeepMind and other competitors, so are in some world shortening the timeline until AI. (I'm aware that there are arguments about why a shorter timeline is better, but I'm not sold on them right now.)
There are also more detailed conversations, about alignment, what the core of the problem actually is, and other strategic questions. I expect (and take from occasional things I hear) I have substantial disagreements with OpenAI decision-makers, which I think alone is sufficient reason for me to feel doomy about humanity's prospects.
That said, I'm quite impressed with their actions around release practices and also their work in becoming a profit-capped entity. I felt like they were a live player with these acts and were clearly acting against their short-term self-interest in favour of humanity's broader good, with some relatively sane models around these specific aspects of what's important. Those were both substantial updates for me, and make me feel pretty cooperative with them.
And of course I'm very happy indeed about a bunch of the safety work they do and support. The org gives lots of support and engineers to people like Paul Christiano, Chris Olah, etc., which I think is better than those people probably would get counterfactually, and I'm very grateful that the organisation provides this.
Overall I don't feel my opinion is very robust, and it could easily change. Here are some examples of things that I think could substantially change my opinion:
See all the discussion under the OpenAI tag. Don't forget SSC's post on it either.
I mostly think we had a good discussion about it when it launched (primarily due to Ben Hoffman and Scott Alexander deliberately creating the discussion).
Aww alas. Another time :)
Bayesop's Fables is a great name, I'm stealing it.
All of them get tagged AI. Not all of the technical content gets tagged AI risk – for example, when Scott Garrabrant writes curious things like prisoner's dilemma with costs to modelling, this is related to embedded agency, but it's not at all clearly relevant to AI risk, only indirectly at best. The ones that are explicitly about AI risk, such as What Failure Looks Like or The Rocket Alignment Problem, get tagged that way.
I think AI Risk is open to improvement as a name, but it's definitely a more narrow category than AI. AI includes reviews of AI textbooks, explanations of how certain ML architectures work, and just anything relating to AI. AI risk is about the downside risk and analysis of what the risk looks like.
Hm, I like this, I feel resolved against 'aspiring rationalist', which was always losing anyway because it's a longer and less catchy phrase.
Nonetheless, thanks for the links :)
Curated. Short and sweet summary of a research approach being pursued in AI alignment, that I think has not been written up like this before (especially on LessWrong).