This is a link to a question asked on the EA Forum by Aryeh Englander. (Please post responses / discussion there.)
Does the following seem like a reasonable brief summary of the key disagreements regarding AI risk?
Among those experts (AI researchers, economists, careful knowledgeable thinkers in general) who appear to be familiar with the arguments:
Reply to: Meta-Honesty: Firming Up Honesty Around Its Edge-Cases
Eliezer Yudkowsky, listing advantages of a "wizard's oath" ethical code of "Don't say things that are literally false", writes—
Repeatedly asking yourself of every sentence you say aloud to another person, "Is this statement actually and literally true?", helps you build a skill for navigating out of your internal smog of not-quite-truths.
I mean, that's one hypothesis about the psychological effects of adopting the wizard's code.
A potential problem with this is that human natural language contains a lot of ambiguity. Words can
...

On the one hand, this post does a great job of connecting to previous work, leaving breadcrumbs and shortening the inferential distance. On the other hand, what is this at the end?
But one thing I'm pretty sure won't help much is clever logic puzzles about implausibly sophisticated Nazis.
I have no idea what this is talking about.
I actually think that 2020 could be the year of the Linux desktop
Linux has had the advantages it has for twenty years...so why now?
(Cross-posted from Facebook.)
I don’t really stand by the last half of the points above, i.e. the last ~third of the longer review. I think there’s something important to say here about the relationship between common knowledge and deontology, but I didn’t really say it. I hope to get the time to try again to say it.
CFAR recently launched its 2019 fundraiser, and to coincide with that, we wanted to give folks a chance to ask us about our mission, plans, and strategy. Ask any questions you like; we’ll respond to as many as we can from 10am PST on 12/20 until 10am PST the following day (12/21).
Topics that may be interesting include (but are not limited to):
Do you think that Elon doesn't get his employees to do what's best for his companies?
This is a cross post from http://250bpm.com/blog:128.
In the past I've reviewed Eliezer Yudkowsky's "Inadequate Equilibria" book. My main complaint was that while it explains the problem of suboptimal Nash equilibria very well, it doesn't propose any solutions. Instead, it says that we should be aware of such coordination failures and we should expect ourselves to fare better than the official institutions in such cases. What Yudkowsky is saying (if I understand him correctly) is that given that the treatment of short bowel syndrome in babies is stuck in an inadequate eq...
This essay provides some fascinating case studies and insights about coordination problems and their solutions, from a book by Elinor Ostrom. Coordination problems are a major theme in LessWrongian thinking (for good reasons) and the essay is a valuable addition to the discussion. I especially liked the 8 features of sustainable governance systems (although I wish we got a little more explanation for "nested enterprises").
However, I think that the dichotomy between "absolutism (bad)" and "organically grown institutions (good)" that the essay creates needs
...

The recent adversarial collaboration on spiritual experiences on Slate Star Codex includes this paragraph:
It was also discovered that people in the United States, Australia, the United Kingdom, and Scandinavia do not tend to share their spiritual experiences with others. Hood et al. wonder if this is why such spiritual experiences are thought to be uncommon (as fewer people in these societies might have heard reports of others’ spiritual experiences).
This naturally led me to wonder: what spiritual experiences have LessWrong readers had that they are willing to share, since the readers...
I believe I've had kensho experiences too. This easily meets the criteria of "spiritual experience" and "mystical perception", though it has no hallucinatory component.
An antimeme is a meme with the following three characteristics:
I call these "antimemes" because they exhibit behavior opposite that of regular memes. The typical
...

I hadn't noticed that utilitarianism and ethical vegetarianism check these boxes. I wrote this series hoping for exactly this kind of insight. Thanks!
Your comment on the cross-cultural application of utilitarianism makes this extra insightful. I have edited the original post to acknowledge that antimemes are not always culture-specific.
To celebrate all the possibilities of humanity during these holidays, have a possible calendar of the year 12020 of the human era (link to full calendar here).
Minor fact: in the Gregorian calendar, the days of the week cycle exactly every 400 years, so the non-time-travellers among you can use this for 2020 as well...
(previous holiday specials can be found here and here)
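The 400-year fact is easy to verify: one Gregorian cycle contains exactly 146,097 days, which is a multiple of 7. A quick check in Python (the specific dates are arbitrary, chosen only for illustration):

```python
import datetime

# A 400-year Gregorian cycle has 97 leap days:
# 100 multiples of 4, minus 4 multiples of 100, plus 1 multiple of 400.
days_per_cycle = 400 * 365 + 97
assert days_per_cycle % 7 == 0  # 146097 = 7 * 20871

# Hence any date falls on the same weekday as the same date 400 years later.
d1 = datetime.date(2020, 7, 14)
d2 = datetime.date(2420, 7, 14)
assert d1.weekday() == d2.weekday()
print(d1.strftime("%A"))  # the shared weekday
```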
Do you happen to be making a reference to the Holocene calendar? (Which was popularized by this Kurzgesagt video.) It advocates resetting the zero-year to be 10,000 years earlier, placing it before most of human civilization.
In the past 3-4 years, I went through a prolonged and painful life crisis in which I systematically deconstructed my existing worldview and slowly moved away from Evangelical Christianity into something Rationalist or Rationalist-adjacent. In the past 4 months, I've started hanging around the Berkeley Rationality community and am now dating someone embedded therein. At this point my partner is still my main connection to the specific values and practices of the community, and given that my worldview is currently being fleshed-out, she has an outsized influence on what my future beliefs and val
...

Thanks for the welcome!
This is super helpful. It sounds like you've lived the thing that I'm only hypothesizing about here. Hopefully "Can't wait for round three" isn't sarcastic. This first round for me was extremely painful, but it sounds like round 2 was possibly more pleasant for you.
I like the framework you're using now, and I'm gonna try to condense it into my own words to make sure I understand what you mean. Basically, you're trying to optimize around keeping the various and conflicting hopes, needs, fears, etc. within you at least relatively cool
...

In my short-form, I write:
[...] This is way more obvious and way more clear in Inadequate Equilibria. Take a problem, a question, and deconstruct it completely. It was concise and to the point; I think it's one of the best things Eliezer has written. I cannot recommend it enough.
Just finished Inadequate Equilibria. Now, I'm reading:
Edited above comment with fuller details :)
LessWrong is currently reviewing the posts from 2018, and I'm trying to figure out how voting should happen. The new hotness that all your friends are talking about is quadratic voting, and after thinking about it for a few hours, it seems like a pretty good solution to me.
I'm writing this post primarily for people who know more about this stuff to show me where the plan will fail terribly for LW, to suggest UI improvements, or to suggest an alternative plan. If nothing serious is raised that changes my mind in the next 7 days, we'll build a straightforward UI and do it.
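For readers unfamiliar with the mechanism, here is a minimal sketch of the usual quadratic voting cost rule (the item names and point budget below are illustrative assumptions, not the actual LessWrong design): casting n votes on a single item costs n² points, which makes concentrating votes expensive relative to spreading them across items.

```python
def qv_cost(ballot):
    """Point cost of a quadratic-voting ballot: the sum of squared votes.
    Negative vote counts represent downvotes and cost the same as upvotes."""
    return sum(v * v for v in ballot.values())

budget = 100  # illustrative per-voter point budget
ballot = {"post_a": 5, "post_b": 3, "post_c": -2}
cost = qv_cost(ballot)  # 25 + 9 + 4 = 38
assert cost <= budget
print(cost)  # 38
```

The quadratic cost is what distinguishes this from ordinary approval-style voting: five votes on one post cost 25 points, while five votes spread over five posts cost only 5.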
The second paragraph in the linked post says:
Many people find the Hugo voting system (called “Instant Runoff Voting“) very complicated.
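For concreteness, here is a minimal sketch of how instant-runoff voting works (this assumes every ballot ranks every candidate, and is an illustration rather than the Hugos' exact rules): repeatedly eliminate the candidate with the fewest first-choice votes, letting those ballots count for their next surviving preference, until someone holds a strict majority.

```python
from collections import Counter

def instant_runoff(ballots):
    """ballots: a list of preference-ordered candidate lists (full rankings).
    Returns the winner under instant-runoff voting."""
    alive = {c for ballot in ballots for c in ballot}
    while True:
        # Each ballot counts for its highest-ranked surviving candidate.
        tally = Counter(next(c for c in ballot if c in alive) for ballot in ballots)
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > len(ballots):  # strict majority reached
            return leader
        alive.remove(min(tally, key=tally.get))  # drop the weakest candidate

ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
print(instant_runoff(ballots))  # C is eliminated first; its ballots flow to B, who wins
```

The procedure itself is short; what people find complicated is presumably reasoning about how their lower preferences interact with elimination order, not the mechanics.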
Growing up, the bedrooms in the house had clear names: Jeff's room, Rose's room, Alice's room, Rick and Suzie's room, the Au Pair room, and the guest room. But people have moved around a lot: later occupants of "my" room have included Rose, Stevie, then later me, Julia, and Lily, and then even later Alice, Alex, and their children. Other rooms had a similar range of people rotating through ("the Wyman St home for itinerant folk-dancing youth") and referring to rooms became really difficult.
Around this time last year we decided to give names to the rooms: England, Scotland, Wales, and ...
For what it's worth, I tried something like the "I won't let the world be destroyed"->"I want to make sure the world keeps doing awesome stuff" reframing back in the day and it broadly didn't work. This had less to do with cautious/uncautious behavior and more to do with status quo bias. Saying "I won't let the world be destroyed" treats "the world being destroyed" as an event that deviates from the status quo of the world existing. In contrast, saying "There's so much fun we could ha...
Most models of agency (in game theory, decision theory, etc) implicitly assume that the agent is separate from the environment - there is a “Cartesian boundary” between agent and environment. The embedded agency sequence goes through a long list of theoretical/conceptual problems which arise when an agent is instead embedded in its environment. Some examples:
Ooh, that is very insightful. The word-boundary problem around "values" feels fuzzy and ill-defined, but that doesn't mean that the thing we care about is actually fuzzy and ill-defined.
An introduction to a recent paper by myself and Ryan Carey. Cross-posting from Medium.
For some intellectual tasks, it’s easy to define success but hard to evaluate decisions as they’re happening. For example, we can easily tell which Go player has won, but it can be hard to know the quality of a move until the game is almost over. AI works well for these kinds of tasks, because we can simply define success and get an AI system to pursue it as best it can.
For other tasks, it’s hard to define success, but relatively easy to judge solutions when we see them, for example, doing a backflip. Getti
...

This looks really interesting to me. I remember when the Safety via Debate paper originally came out; I was quite curious to see more work around modeling debate environments and getting a better sense of how well we should expect it to perform in what kinds of situations. From what I can tell, this makes a rigorous attempt at 1-2 models.
I noticed that this is more intense mathematically than most other papers I'm used to in this area. I started going through it but was a bit intimidated. I was wondering if you might suggest tips for reading through it and und
...

Forgive me if some of this is repetitive; I can’t remember what I’ve written in which draft and what’s actually been published, much less tell what’s actually novel. Eventually there will be a polished master post describing my overall note-taking method and leaving out most of how it was developed, but it also feels useful to discuss the journey.
When I started taking notes in Roam (a workflowy/wiki hybrid), I would:
Just realized the "it" in "I'm curious what it looks like." probably referred to "my DB", not "the feedback". I'd love to either user test my DB on you (you play with it while I watch) or have you beta test the description I'm writing, if you're interested.
Quick context: Epistemic spot checks started as a process in which I did quick investigations of a few of a book’s early claims to see if it was trustworthy before continuing to read it, in order to avoid wasting time on books that would teach me wrong things. Epistemic spot checks worked well enough for catching obvious flaws (*cou*Carol Dweck*ugh*), but have a number of problems. They emphasize a trust/don’t-trust binary over model building, and provability over importance. They don’t handle “severely flawed but deeply insightful” well at all. So I started trying to create something better.
Be...
I don't immediately see how they're related. Are you thinking people participating in the markets are answering based on proxies rather than truly relevant information?
Continuation of: No Individual Particles
Followup to: The Generalized Anti-Zombie Principle
Suppose I take two atoms of helium-4 in a balloon, and swap their locations via teleportation. I don't move them through the intervening space; I just click my fingers and cause them to swap places. Afterward, the balloon looks just the same, but two of the helium atoms have exchanged positions.
Now, did that scenario seem to make sense? Can you imagine it happening?
If you looked at that and said, "The operation of swapping two helium-4 atoms produces an identical configuratio...
This "explanation" leaves lingering doubt. It doesn't dissolve all the questions that I have about personal identity. Ok, I'm a factor in a subspace of an amplitude distribution: I get that and I'm okay with that. But there are still unresolved issues of anticipation.
Let's say I record in sufficient fidelity the amplitude distribution factor which represents "me" at this point in time. Then after I am dead some machine is used to recreate this amplitude distribution to sufficient fidelity as to re-create me, as I exist now. That person will come into being
...

Off-topic riff on "Humans are Embedded Agents Too"
One class of insights that comes with Buddhist practice might be summarized as "determinism": the universe does what it is going to do no matter what the illusory self predicts. Related to this is the larger Buddhist notion of "dependent origination", that everything (in the Hubble volume you find yourself in) is causally linked. This deep deterministic interdependence of the world is hard to appreciate from our subjective experience, because the creation of ontology crea...
Valid. I was primarily summarizing the risk part though, rather than the solutions.