Recent Discussion

Some argue that Elon Musk's plan to build a self-sustaining colony on Mars is effectively impossible for the foreseeable future without discontinuous tech progress. (See e.g. Jess Riedel's claim here, which Robin Hanson agrees with.)

Is there a simple explanation for why this might be so? I'd be especially interested in Fermi estimates of key bottleneck resources.

Impossible? That's a really tough standard to meet. When you weaken it to "effectively impossible" and further "without X", you need to quantify those better before you can estimate. If you mean "one in a billion chance", that's about as likely as something that happens to 7-8 living humans today. If you only mean "one in a thousand", I'd argue that the "impossible" label stops being applicable.

Then you get into the definitions of self-sustaining. Honestly, it probably _is_ possible to cr... (read more)

CellBioGuy (5 points, 17h): You drastically underestimate the difficulty of genetics.
Covid-19: My Current Model
136 points · 5d · 19 min read

The post will be a summary of my current key views on various aspects of what is going on, especially in places where I see many or most responsible-looking people getting it importantly wrong.

This post is not making strong evidence-based arguments for these views. This is not that post. This is me getting all this out there, on the record, in a place one can reference.

Risks Follow Power Laws

It is impossible to actually understand Covid-19 if you think of some things as ‘risky’ and other things as ‘safe’ and group together all the things in each category. And yet, that’s exact

... (Read more)

I don't think "too much Covid content" is the major problem here. Rather, the major problem with this essay is that it mostly states Zvi's updates, without sharing the data and reasoning that led him to make those updates. It's not going to convince anyone who doesn't already trust Zvi.  

This is perhaps an acceptable trade-off if people have to move fast and make decisions without being able to build their own models. But it's an emergency measure that's very costly long-term. 

And for the most important decisions, it is especially important that

... (read more)
Rob Bensinger (4 points, 4h): Would you feel similarly concerned about a hypothetical curated essay that instead said 'the WHO has done a reasonably good job and should have its funding increased' (in the course of a longer discussion almost entirely focused on other points) while providing just as little evidence?

If so, then I disagree: in a dialogue about world affairs with people I respect, where someone has thirty important and relevant beliefs but only has time to properly defend five of them, I'd usually rather that they at least mention a bunch of the other beliefs than that they stay silent. I think it's good for those unargued beliefs to then be critiqued and hashed out in the comments, but curation-level content shouldn't require that the author conceal their actual epistemic state just because they don't expect to be able to convince a skeptical reader.

If not, then I think I'd disagree a lot more strongly, and I'm a bit confused. Suppose we have a scale of General Institutional Virtue, where a given institution might deserve an 8/10, or a 5/10, or a 1/10. I don't see a reason to concentrate our prior on, say, 'most institutions are 8/10' or 'most institutions are 3/10'; claiming that something falls anywhere on the scale should warrant similar skepticism. Perhaps the average person on the street thinks the WHO's Virtue is 7/10; but by default I don't think LessWrong should put much weight on popular opinion in evaluating arguments about institutional efficacy, so I don't think we should demand less evidence from someone echoing the popular view than from someone with a different view. (Though it does make sense to be more curious about an unpopular view than a popular one, because unpopular views are likely to reflect less widely known evidence.)
Sherrinford (3 points, 1h): You disagree with both answers I could possibly, in your view, give to a question that you ask. But the hypothetical alternative you give is not the mirror image of Zvi's essay, nor is it the mirror image of what I referred to from Zvi's essay. Zvi claims that governments are "lying liars" and that they have "no ability to plan or physically reason". In the reasoning for your first disagreement possibility, you effectively say that your prior on any of Zvi's statements being right is high because you respect him and also because he properly defends some of the many statements he makes. By contrast, my own prior on the quality of Zvi's less political assessments in this text is lower, because the statement that governments have "no ability to plan" is false.

Concerning the WHO, Zvi's statement about the response to the pandemic is that it is "not that different" from "attacking, restraining and killing innocent people". This is far from my understanding of what governments or the WHO are doing. So my priors on other statements in the essay are not high after reading the politics sections, and this does not depend on the average person on the street's opinion, or on what Zvi writes in other texts.

I agree that LessWrong should not put much weight on popular opinion in evaluating arguments per se. Though depending on what you mean by your suggestion to be more curious about an unpopular view, I may disagree there. Unpopular views can very well be unpopular because they are wrong. The statement that the moon is made of toothpaste is unpopular for that reason. Of course, Zvi's essay is very popular on LessWrong, though I would not say that that is sufficient to tell me whether what he states is right. To my impression, Zvi is not just writing a curated essay that says the WHO should be dissolved, but actually makes many statements that go through on the nod for people who believe them in advance.
However, I believe people who do not believe the statements in advance m
Rob Bensinger (2 points, 20m): Sorry, those two weren't the only answers I imagined you might give, I just didn't want to make the comment longer before letting you respond. My next guess was going to be that your objection was stylistic: that Zvi was using a lot of hyperbole and heated rhetoric that's a poor fit for curated LW content, even if a more boring and neutral version of the same core claims would be fine. I think that's part of what's going on (in terms of why the two of us disagree).

I think another part of what's going on is that I feel like I have good context for ~all the high-level generalizations and institutional criticisms Zvi is bringing in, and why one might hold such views, from reading previous Zvi-posts, reading lots of discussion of COVID-19 over the last few months, and generally being exposed to lots of rationalist and tech-contrarian-libertarian arguments over the years, such that it doesn't feel super confusing or novel as a package and I can focus more on particular parts I found interesting and novel. (Like critiques of particular organizations, or new lenses I can try out and see whether they cause a different set of actions and beliefs to feel reasonable/'natural', and if so whether those actions and beliefs seem good.)

This isn't to say that Zvi's necessarily right on all counts and you're wrong, and I think a discussion like this is exactly the way to properly bridge different people's contexts and priors about the world. And given the mix of 'this seems super wrong' and 'the style seems bad' and 'there aren't even hyperlinks I can use to figure out what Zvi means or where he's coming from', I get why you'd think this isn't curation-worthy content. I don't want to go down all the object-level discussion paths necessary to reach consensus about this myself, though if someone else wants to, I'll be happy about that.
Open & Welcome Thread - June 2020
13 points · 3d · 1 min read

If it’s worth saying, but not worth its own post, here's a place to put it. (You can also make a shortform post)

And, if you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ.

The Open Thread sequence is here.

How much rioting is actually going on in the US right now?

If you trust leftist (i.e. most US) media, the answer is "almost none, virtually all protesting has been peaceful, nothing to see here, in fact how dare you even ask the question, that sounds suspiciously like something a racist would ask". 

If you take a look on the conservative side of the veil, the answer is "RIOTERS EVERYWHERE! MINNEAPOLIS IS IN FLAMES! MANHATTAN IS LOST! TAKE YOUR KIDS AND RUN!" So how much rioting has there actually been? How much damage (very roughly)? How many deaths? A

... (read more)
Dagon (2 points, 3h): I do not intend to claim that I'm particularly great at this, and I certainly don't think I have sufficient special knowledge for 1-1 planning. I'm happy to listen and make lightweight comments if you think it'd be helpful.

My plans are half-formed, and include maintaining some foundational capabilities that will help in a large class of disasters that require travel. I have bank accounts in two nations and currencies, and I keep some cash in a number of currencies. Some physical precious metals or hard-to-confiscate digital currency is a good idea too. I have friends and coworkers in a number of countries (including over a border I can cross by land), who I visit enough that it will seem perfectly normal for me to want to travel there. I'm seriously considering real-estate investments in one or two of those places, to make it even easier to justify travel if it becomes restricted or suspicious. I still think that the likelihood is low that I'll need to go, but there may come a point where the tactic of maintaining rolling refundable tickets becomes reasonable: buy a flight out at 2 weeks and 4 weeks, and every 2 weeks cancel the near one and buy a replacement further one.

This is harder to advise. I'm older than most people on LW, and have been building software and saving/investing for decades, so I have resources that can help support what seem to be important causes, and I have a job that has (indirect, but clear) impact on keeping the economy and society running. I also support and participate in protests and visibility campaigns to try to make it clear to the less-foresightful members of society that tightening control isn't going to work. This part is more personal, less clearly impactful toward my goals, and takes a huge amount of time, effort, and personal risk. It's quite possible that I'm doing it more for the social connections with friends and peers than for purely rational goal-seeking.
I wouldn't fault anyone for preferring to put their effort
CarlShulman (6 points, 4h): Re hedging, a common technique is having multiple fairly different citizenships and foreign-held assets, i.e. such that if your country becomes dangerously oppressive, you or your assets wouldn't be handed back to it. E.g. many Chinese elites pick up a Western citizenship for them or their children, and wealthy people fearing change in the US sometimes pick up New Zealand or Singapore homes and citizenship. There are many countries with schemes to sell citizenship, although often you need to live in them for some years after you make your investment. Then emigrate if things are starting to look too scary, before emigration is restricted.
Bjartur Tómas (3 points, 5h): +1 for this. Would love to talk to other people seriously considering exit. Maybe we could start a Telegram or something.

In Inaccessible Information, Paul Christiano lays out a fundamental challenge in training machine learning systems to give us insight into parts of the world that we cannot directly verify. The core problem he lays out is as follows.

Suppose we lived in a world that had invented machine learning but not Newtonian mechanics. And suppose we trained some machine learning model to predict the motion of the planets across the sky -- we could do this by observing the position of the planets over, say, a few hundred days, and using this as training data for, say, a recurrent neural network. And suppos

... (Read more)

I thought this was a great summary, thanks!

Yes it’s true that much of MIRI’s research is about finding a solution to the design problem for intelligent systems that does not rest on a blind search for policies that satisfy some evaluation procedure. But it seems strange to describe this approach as “hope you can find some other way to produce powerful AI”, as though we know of no other approach to engineering sophisticated systems other than search.

I agree that the success of design in other domains is a great sign and reason fo... (read more)

Ben Pace (4 points, 2h): I really appreciate this post. Re-explaining Paul’s new post clearly and simply in your own words helped me a great deal; I now feel that I’ll have a far easier time engaging with Paul’s post if I want to. (Your feeling about MIRI’s approach came across fairly clearly too in the context of what you’d set up.)
Pattern (3 points, 14h): Is your website open source?
gwern (3 points, 6h): Yes, if you can figure it out.

I'm looking to start a blog for myself. Is it likely I, a fairly strong CS student with no web dev experience, can figure it out in a reasonable amount of time?

Spiracular's Shortform Feed
1y1 min readShow Highlight

I was just thinking about how to work myself up to posting full-posts, and this seemed like exactly the right level of difficulty and exposure for what I'm currently comfortable with. I'm glad that a norm for this exists!

This is mostly going to consist of old notebook-extracts regarding various ideas I've chewed on over the years.

While I don't agree with this narrative, I really enjoyed the story, thanks for writing it!

Pasteur's quadrant
7 points · 2h · 5 min read

In my recent post on the case study of the transistor, we saw that the research that led to its invention did not fall neatly into the categories of “basic” vs. “applied”, but in fact cycled rapidly between them.

An entire book—Pasteur’s Quadrant, by Donald Stokes—is dedicated to the thesis that “basic” vs. “applied” is a false dichotomy that is harming science funding.

The core idea of Pasteur’s Quadrant is that basic and applied research are not opposed, but orthogonal. Instead of a one-dimensional spectrum, wi... (Read more)

I've been optimizing various aspects of my investment setup recently, and will write up some tips and tricks that I've found in the form of "answers" here. Others are welcome to share their own here if they'd like. (Disclaimer: I’m not a lawyer, accountant, or investment advisor, and everything here is for general informational purposes only.)

This can prevent you from being able to deduct the interest as investment interest expense on your taxes due to interest tracing rules (you have to show the loan was not commingled with non-investment funds in an audit), and create a recordkeeping nightmare at tax time.

As researchers and philosophers discuss the path towards human-equivalent / superhuman general artificial intelligence, they frequently examine the concept of control. Control of our world has always been of central importance to our species, so it’s no surprise that we’d look to extend our control in the future. However, most of the currently discussed control methods miss a crucial point about intelligence – specifically the fact that it is a fluid, emergent property, which does not lend itself to control in the ways we’re used to. These methods treat the problem ... (Read more)

Special thanks to Kate Woolverton, Paul Christiano, Rohin Shah, Alex Turner, William Saunders, Beth Barnes, Abram Demski, Scott Garrabrant, Sam Eisenstat, and Tsvi Benson-Tilsen for providing helpful comments and feedback on this post and the talk that preceded it.

This post is a collection of 11 different proposals for building safe advanced AI under the current machine learning paradigm. There's a lot of literature out there laying out various different approaches such as amplification, debate, or recursive reward modeling, but a lot of that literature focuses primarily on outer alignment at

... (Read more)

Thanks for the great post. It really provides an awesome overview of the current progress. I will surely come back to this post often and follow pointers as I think about and research things.

Just before I came across this, I was thinking of hosting a discussion about "Current Progress in ML-based AI Safety Proposals" at the next AI Safety Discussion Day (Sunday June 7th).

Having read this, I think that the best thing to do is to host an open-ended discussion about this post. It would be awesome if you can and want to join. More details can be found... (read more)

adamShimi (1 point, 10h): Thanks for the answers.

  • About the guarantees: now that you point it out, the two sentences indeed have different subjects.
  • About the 3: makes sense that myopia is the most important part.
  • For evaluation vs imitation: I think we might mean two different things by "richer". I mean that the content of the signal itself has more information and more structure, whereas I believe you mean that it applies to more situations and is more general. Is that a good description of your intuition, or am I wrong here?
  • For the difference between reward learning + maximization and imitation: you're right, I forgot that most people and systems are not necessarily optimal for their observable reward function. Even if they are, I guess the way the reward generalizes to a new environment might differ from the way the imitation generalizes.
Effective children education (Question)
37 points · 2d · 2 min read

I am trying to find out the most cost-effective approaches to (early) education. I have a 4-year-old daughter, which gives me ~2 more years to figure this out, and I am trying to put together as much material as I can. Given the age of my daughter, I’d like to “solve” something like K-12 for now, but I guess some things may be applicable at any age.

I am familiar with Bryan Caplan's main theses formulated in the Case Against Education or Robin Hanson's Elephant in the Brain arguing that education is mostly about signalling and stuff. I therefore partly understand what's wrong and I am

... (Read more)

Hey Raj. Thanks a lot for an insightful post, it's definitely the sort of thing I was looking for, regardless of whether I immediately agree with it or not.

1 - self-learning: How I read it so far is that instead of selecting "the way" first and optimizing it later, it might be a good idea to focus first on learning how to learn by yourself, recognizing what's most effective in any given case, be it via the internet or an actual human resource such as a tutor.

By the way, my main motivation for her to know English was the access to much better mat

... (read more)
Ericf (1 point, 4h): The question of "how best to educate this one specific person" (the OP's question) is very, very different from the question of "how to educate the 67% of the population within one SD of average*" (and also different from the question of "how to educate over 90% of the population").

*Different challenges exist when considering general intelligence, self-motivation/executive function, and family resources as the metric of interest. Many "how to fix education" plans seem to assume that all kids are "close enough to how I was (or my kid is)".
Covid-19 6/4: The Eye of the Storm
33 points · 1d · 6 min read

Still standing by this: Covid-19: My Current Model

Previous update posts: Covid-19 5/29: Dumb Reopening, Covid 5/14: Limbo Under

Remember last week when I opened with this?

I remember when people on Twitter had constant reminders that today was, indeed, only Wednesday, or whatever day it happened to be. Time moved that slowly.

Time has sped up again.

Well, yeah. Not so much anymore.

In March and April I found myself constantly checking Twitter and the financial markets for news, frantically hunting for ways to get a handle on what was happening in the world, worried everything would fall apart. Wo... (Read more)

From Facebook:

Jai Dhyani: This seems like an extremely overconfident prediction and I don't think it accurately reflects popular opinion regarding pandemic response.

Rob Bensinger: What are the main things you think Zvi's wrong about? What do you think will happen?

Jai Dhyani: A series of predictions to which I assign each individually 75%+ chance: Social distancing is going to remain popular. Reopening will continue at a slow and steady pace. Large indoor gatherings will continue to be mostly avoided. Continued increases in testing capacity will slow spread

... (read more)
habryka (12 points, 14h): Mod note: I decided to move the second half of this comment to the Open Thread, because Zvi explicitly requested that comments should stay on-topic. Here is a link to the new comment.
Larks (6 points, 17h): I attempted to produce a rough estimate of this here (excerpted below):
Bob Jacobs (3 points, 8h): Continuing my streak of hating on terms this community loves: I hate the term 'motte-and-bailey'. Not because the fallacy itself is bad, but because you are essentially indirectly accusing your interlocutor of switching definitions on purpose. In my experience this is almost always an accident, but even if it wasn't, you still shouldn't immediately brand your interlocutor as malicious. I propose we use the term 'defiswitchion' (combining 'definition' and 'switch'), since it is actually descriptive, easier to understand for people who hear it for the first time, and doesn't indirectly accuse your interlocutor of using dirty debate tactics.

I'll stick with motte-and-bailey (though actually, I use "bait-and-switch" more often). In my experience, most of the time it's useful to point out to someone, it _is_ intentional, or at least negligent. Very often this is my response to someone repeating a third-party argument point, and the best thing to do is to be very explicit that it's not valid.

I'll argue that the accusation is inherent in the thing. Introducing the term "defiswitchion", requires that you explain it, and it _still_ is accusing your correspondent of sloppy or motivated unclarity.

  1. People can short stocks of companies and then sabotage or assassinate important people to make the stocks drop.
  2. Political goals seem ripe for assassination. For example, the US recently killed Iran’s general Soleimani. Couldn’t they do this without taking responsibility (i.e. with plausible deniability)? If so, aren’t further assassinations beneficial?
  3. Political goals also seem ripe for sabotage. Currently, sometimes nations do this with cyberattacks. But I haven’t heard of physical sabotage attempts. For example, poisoning the water supply of enemy cities, burning their government buildings, cu
... (Read more)

I'm not so sure it's feasible to carry out an assassination with enough secrecy that no one could know you did it. It's hard to keep a secret if the world's best intelligence agencies are all highly motivated to figure it out! Now, your word choice was "no one could necessarily _prove_ you did it," but even if it could not be proven in, say, an international tribunal, if other countries knew that my country did it, they could retaliate.

J C (1 point, 13h): Even with the organizing technology, there are no outspoken people who have decided to act as leaders of these movements. No Martin Luther Kings, Malcolm Xs, Huey Newtons, or Fred Hamptons. My suggestion is that no one wants to be those guys because they all got assassinated; thus these movements sadly remain unorganized and leaderless.
ChristianKl (2 points, 13h): Anybody who's very outspoken doesn't survive in today's highly politically correct environment of the left. If you take a person like Sahra Wagenknecht, who's outspoken and a leader on the left: she had enough thought that was independent from left-wing orthodoxy to not ally effectively.
Answer by RedMan (2 points, 16h): At least one group of people appears to have accepted at least some of your argument. Furthermore, assassinations fall into three categories: where the assassin takes credit afterwards (for intimidation, bragging to supporters, etc.), where a third party is blamed (to prevent reprisals being directed at the source), and where it is unclear that an assassination was performed ("wow, IBM got screwed hard by that plane crash"). From the perspective in the OP, it is clear that there is a detection challenge. The most useful categories (to an assassin) are the third and the second; the least useful is the first. An external observer will see only the first category, and potentially a subset of the second, but is unlikely to see many members of the third category. Maybe they're very common, and you're just not seeing the obvious.

Let's say you have an idea that you think might be interesting to investigate, possibly a new aspect in AI safety, maybe some new algorithm.

If you're an experienced researcher, you probably have plenty of intuition to think through it, consider the possible outcomes and decide whether it's worth investigating.

If you have a decent academic network, you can probably bring it up even in casual conversations with people who are as good or better than you in the field to get a sense of their intuitions.

What if you have none of those things? Is there an online forum for such discussi... (Read more)

Also a PhD here - read, read, read. You need to know what's been done to see what the gaps are and how your project would fit in. You will also build up that intuition.

Sure, it's also helpful to be able to bounce ideas around your network, but the less well-formed the idea is, the more it needs to go to friends who aren't just going to shoot you down, or to get the benefit of the doubt as "early-stage." You need to get the idea formed to the point where someone can feel comfortable pointing out issues, which will take ind... (read more)

Status-Regulating Emotions
19 points · 2d · 2 min read

Eliezer Yudkowsky wrote an interesting comment on What Universal Human Experiences Are You Missing Without Realizing It?

It was the gelling of the HPMOR hatedom which caused me to finally realize that I was blind, possibly I-don’t-have-that-sense blind, to the ordinary status-regulation emotions that, yes, in retrospect, many other people have, and that evolutionary psychology would logically lead us to expect exists.

…It was only afterward that I looked back and realized that nobody ever hates Hermione, or Harry, on account of either of them acting like they have more status than someone else

... (Read more)

(I'm basing this on what I feel like – unlike you, Isusr, and Eliezer, I feel this emotion very strongly.)

I agree that Justin's answer is missing the point. I also think your description isn't quite right. You assume that what is inappropriate is based on social norms. That does not need to be true.

For example, I am not at all angry at the success of HPMoR because I think the success is appropriate. But my blood still boils in other cases where people are successful. And success isn't even required – I can get angry at someone even attempting to do somethi

... (read more)

Some back of the napkin math. Suppose we:

  • Value a QALY at $50k. 
  • Use an expected lifespan of 10k years. Perhaps you expect a 1% chance of living 1M years due to the possibility of a friendly superintelligence or something.

That would mean:

  • The value of your life would be $500M.
  • A 1% chance of death would cost $5M.
  • A 100x smaller chance of death of 0.01% would cost $50k.
  • Decreasing your chance of death 100x would be worth ~$5M.
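The napkin math above can be sketched as a few lines of code (a quick check, using the assumed $50k/QALY value and the 1% chance of living 1M years from above; variable names are mine, not from the original):

```python
# Back-of-the-napkin value-of-life arithmetic, using the assumptions above:
# a QALY valued at $50k, and an expected lifespan of 10,000 years
# (a 1% chance of living 1,000,000 years).
QALY_VALUE = 50_000                # dollars per quality-adjusted life year
EXPECTED_YEARS = 1_000_000 // 100  # 1% chance of 1M years -> 10,000 years

life_value = QALY_VALUE * EXPECTED_YEARS  # $500,000,000
cost_1pct_death = life_value // 100       # 1% chance of death: $5,000,000
cost_001pct_death = life_value // 10_000  # 0.01% chance of death: $50,000
value_of_100x_reduction = cost_1pct_death - cost_001pct_death

print(f"${life_value:,}")                 # $500,000,000
print(f"${cost_1pct_death:,}")            # $5,000,000
print(f"${value_of_100x_reduction:,}")    # $4,950,000
```

The 100x risk reduction comes out to exactly $4,950,000, which the list above rounds to ~$5M.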

There seem to be various ways to decrease your chance of death from the coronavirus by 100x or more by going from "normal careful" to extremely careful.

1% is a pretty high estimate; however, it’s okay to value your life to an arbitrary degree. Yes, that breaks down outside certain bounds, but it’s okay to take precautions. It’s a scary situation. Just don’t forget to see to your emotional needs too.

I hope that you’re doing well. It’s nice to run into you.

  • Jason Kleinberg
Three characteristics: impermanence
25 points · 12h · 18 min read

This is the sixth post of the "a non-mystical explanation of the three characteristics of existence" series.


Like no-self and unsatisfactoriness, impermanence seems like a label for a broad cluster of related phenomena. A one-sentence description of it, phrased in experiential terms, would be that “All experienced phenomena, whether physical or mental, inner or outer, are impermanent”.

As an intellectual claim, this does not sound too surprising: few people would seriously think that either physical things or mental experiences last forever. However, there ... (Read more)

Endorphins feel vibration-y without any additional investigation; my guess is that the normal sample rate on pleasurable sensations is just higher. You get more sensory clarity with pleasurable sensations, which is part of what makes them pleasurable in the first place.

Inaccessible informationΩ
61 points · 3d · 14 min read · Ω 26

Suppose that I have a great model for predicting “what will Alice say next?”

I can evaluate and train this model by checking its predictions against reality, but there may be many facts this model “knows” that I can’t easily access.

For example, the model might have a detailed representation of Alice’s thoughts which it uses to predict what Alice will say, without being able to directly answer “What is Alice thinking?” In this case, I can only access that knowledge indirectly, e.g. by asking what Alice would say under different conditions.

I’ll call information like “What is Alice thinki... (Read more)

Thanks for this post Paul. I wrote a long-form reply here.

Buddhists talk a lot about the self, and also about suffering. They claim that if you come to investigate what the self is really made of, then this will lead to a reduction in suffering. Why would that be?

This post seeks to answer that question. First, let’s recap a few things that we have been talking about before.

The connection between self and craving

In “a non-mystical explanation of ‘no-self’”, I talked about the way in which there are two kinds of goals. First, we can manipulate something that does not require a representation of ourselves. For example,... (Read more)

If a model were trying to prevent itself from being falsified, that would predict that we look away from things that we're not sure about rather than towards them.

That sounds like the dark room problem. :) That kind of thing does seem to sometimes happen, as people have varying levels of need for closure. But there seem to be several competing forces going on, one of them being a bias towards proving the hypothesis true by sampling positive evidence, rather than just avoiding evidence.

Model A: I will eat a cookie and this will lead to an immediate
... (read more)