Adele Lopez

Comments

What happens if you drink acetone?

I think you missed the most interesting effect, which is that ingesting it would put you into some sort of ketosis (or, at higher levels, ketoacidosis).

Escalation Outside the System

On one hand, I think you're mostly right about this not being an actual proposal, but I also think that people saying stuff like this would (and will) use guillotines if/when they have the opportunity and think they can get away with it.

Forecasting Thread: AI Timelines

That 30% where we get our shit together seems wildly optimistic to me!

Forecasting Thread: AI Timelines

Roughly my feelings: https://elicit.ought.org/builder/trBX3uNCd

Reasoning: I think lots of people have updated too much on GPT-3, and that the current ML paradigms are still missing key insights into general intelligence. But I also think enough research is going into the field that it won't take too long to reach those insights.

How much can surgical masks help with wildfire smoke?

I think more or less any kind of mask is effective for COVID-19, largely because of the fluid dynamics:

If the air has the particles you want to avoid evenly distributed throughout (as with smoke), then this model predicts you'll miss out on most of the benefit of a mask which does the appropriate filtering. So it's probably not worth using surgical masks for smoke.

Do we have updated data about the risk of ~ permanent chronic fatigue from COVID-19?

Anecdotal evidence suggests that it is fairly common: https://www.reddit.com/r/COVID19positive/ -- 2 of the top 5 posts from today are from people complaining about experiencing this, and both are full of comments from people personally relating to it. There is obviously going to be selection bias here, but it seems like a good starting point for estimating a lower bound if you can't find enough good studies.

Many-worlds versus discrete knowledge

Say that there is some code which will run two instances of you, one where you see a blue light, and one where you see a green light. The code is run, and you see a blue light, and another you sees a green light. The you that sees a blue light gains indexical knowledge about which branch of the code they're in. But there's no need for the code to have a "reality" index parameter to allow them to gain that knowledge. You implicitly have a natural index already: the color of light you saw. I don't see why someone living in a Many-Worlds universe wouldn't be able to do the equivalent thing.
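Here's a toy sketch of what I mean (just an illustration; the details are made up). Each instance can index itself purely by its own observation, and no "reality" parameter ever appears in the code:

```python
# Toy sketch: two instances of the same observer program are run, one seeing
# "blue" and one seeing "green". Neither needs a hidden "reality index";
# the observation itself already serves as the index.

def observer(light_color):
    # All the instance knows at first is its own observation.
    knowledge = {"which light did I see?": light_color}
    # That observation picks out which "branch" this instance is in.
    return knowledge

# The code runs both instances; there is no extra bookkeeping variable
# marking one of them as "the real one".
branches = [observer("blue"), observer("green")]

for k in branches:
    print(k)  # each instance has gained purely indexical knowledge
```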

So I guess I would say that in some sense, once you've figured out the rules, measurements don't give you any knowledge about the wave function, they just give you indexical knowledge.

Adele Lopez's Shortform

It seems that privacy could potentially "tame" a not-quite-corrigible AI. With a full model of you, the AGI might receive a request, deduce that strongly activating a certain set of neurons would be the most robust way to make you feel the request was fulfilled, and then design an electrode set-up to accomplish that. Whereas the same AI with a weak model wouldn't be able to think of anything like that, and might resort to fulfilling the request in a more "normal" way. This doesn't seem that great, but it does seem to me that something like this is actually part of what makes humans relatively corrigible.

Adele Lopez's Shortform

Privacy as a component of AI alignment

[realized this is basically just a behaviorist genie, but posting it in case someone finds it useful]

What makes something manipulative? If I do something with the intent of getting you to do something, is that manipulative? A simple request seems fine, but if I have a complete model of your mind, and use it to phrase things so that you do exactly what I want, that seems to have crossed an important line.

The idea is that using a model of a person that is *too* detailed is a violation of human values. In particular, it violates the value of autonomy, since your actions can now be controlled by someone using this model. And I believe that this is a significant part of what we are trying to protect when we invoke the colloquial value of privacy.

In ordinary situations, people can control how much privacy they have relative to another entity by limiting their contact with them to certain situations. But with an AGI, a person may lose a very large amount of privacy from seemingly innocuous interactions (we're already seeing the start of this with "big data" companies improving their advertising effectiveness by using information that doesn't seem that significant to us). Even worse, an AGI may be able to break the privacy of everyone (or a very large class of people) by using inferences based on just a few people (leveraging perhaps knowledge of the human connectome, hypnosis, etc.).

If we could reliably point to the specific models an AI is using, and have it honestly share its model structure with us, we could potentially limit the strength of its model of human minds. Perhaps even have it use a hardcoded model limited to knowledge of the physical conditions required to keep a person healthy. This would mitigate issues such as deliberate deception or mindcrime.

We could also potentially allow it to use more detailed models in specific cases. For example, we could let it use a detailed mind model to figure out what is causing depression in a specific person, but it would have to use the limited model in any other context, and for any planning around that task. I'm not sure that particular example would work, but I think there are potentially safe ways to have it use context-limited mind models.
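Here's a rough sketch of what such context-gating could look like (purely illustrative; the model classes, contexts, and whitelist are all made up):

```python
# Illustrative sketch of "context-limited mind models": the planner is only
# handed the detailed human model for explicitly whitelisted contexts, and
# falls back to a coarse, hardcoded model everywhere else.

DETAILED_MODEL_WHITELIST = {"diagnose_depression"}  # made-up example context

class LimitedHumanModel:
    """Coarse model: only the physical conditions needed to keep a person healthy."""
    def predict(self, action):
        return {"physically_safe": action not in {"deprive_sleep", "restrict_food"}}

class DetailedHumanModel:
    """Rich psychological model; only allowed inside whitelisted contexts."""
    def predict(self, action):
        return {"physically_safe": True, "predicted_mood_change": 0.1}

def get_human_model(context):
    # The gate is enforced outside the planner, so planning in ordinary
    # contexts simply never sees the detailed model.
    if context in DETAILED_MODEL_WHITELIST:
        return DetailedHumanModel()
    return LimitedHumanModel()

# Planning for an ordinary request only ever gets the coarse model.
model = get_human_model("fulfill_user_request")
print(model.predict("send_reminder"))
```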

Adele Lopez's Shortform

Half-baked idea for low-impact AI:

As an example, imagine a board that's lodged directly into a wall (no other support structures). If you make it twice as wide, it will be twice as stiff; if you make it twice as thick, it will be eight times as stiff; and if you make it twice as long, it will be eight times more compliant.
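(For reference, those factors follow from the standard stiffness formula for an end-loaded cantilever of width w, thickness t, length L, and Young's modulus E:)

```latex
k \;=\; \frac{3EI}{L^{3}}, \qquad I \;=\; \frac{w\,t^{3}}{12}
\quad\Longrightarrow\quad
k \;\propto\; \frac{w\,t^{3}}{L^{3}}
```

So the scaling exponents are 1 in width, 3 in thickness, and -3 in length.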

In a similar way, different action parameters will have scaling exponents (or, more generally, scaling functions). So one way to decrease the risk of high-impact actions would be to make sure that the scaling exponent of the impact with respect to each parameter is bounded above by a certain amount.
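As a very rough sketch of what "bounding the scaling exponent" might mean operationally (everything here, the impact function and the bound, is just a placeholder):

```python
import math

# Rough sketch: numerically estimate the local scaling exponent
# d log(impact) / d log(parameter) of an action parameter, and refuse
# actions whose exponent exceeds a chosen bound.

def scaling_exponent(impact_fn, x, eps=1e-4):
    """Estimate d log(impact)/d log(x) via a central difference in log-space."""
    up = math.log(impact_fn(x * (1 + eps)))
    down = math.log(impact_fn(x * (1 - eps)))
    return (up - down) / (math.log(1 + eps) - math.log(1 - eps))

MAX_EXPONENT = 2.0  # placeholder bound on how sharply impact may scale

def action_allowed(impact_fn, x):
    return scaling_exponent(impact_fn, x) <= MAX_EXPONENT

# An impact that grows cubically in the parameter gets rejected;
# a linear one gets through.
print(action_allowed(lambda x: x**3, 1.0))   # False (exponent ~3)
print(action_allowed(lambda x: 5 * x, 1.0))  # True  (exponent ~1)
```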

Anyway, to even do this, you still need to make sure the agent's model is honestly evaluating the scaling exponent. And you would still need to define this stuff a lot more rigorously. I think this idea is more useful in the case where you already have an AI with high-level corrigible intent and want to give it a general "common sense" about the kinds of experiments it might think to try.

So it's probably not that useful, but I wanted to throw it out there.
