NunoSempere

Comments

Is there an easy way to turn a LW sequence into an epub?

Use the LW GraphQL API (https://www.lesswrong.com/posts/LJiGhpq8w4Badr5KJ/graphql-tutorial-for-lesswrong-and-effective-altruism-forum) to query for the html of the posts, and then use something like pandoc to translate said html into latex, and then to epub.

Link to the GraphQL API

The command needed to get a particular post:

    {
      post(input: {
        selector: {
          _id: "ZyWyAJbedvEgRT2uF"
        }
      }) {
        result {
          htmlBody
        }
      }
    }
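For illustration, the query above can be sent from Python's standard library; a sketch, where the endpoint URL is an assumption based on the tutorial linked above:

```python
# Sketch: fetch a post's htmlBody from the LessWrong GraphQL API.
import json
import urllib.request

def build_payload(post_id: str) -> bytes:
    """Wrap the GraphQL query in the JSON body the API expects."""
    query = (
        '{ post(input: {selector: {_id: "%s"}}) '
        '{ result { htmlBody } } }' % post_id
    )
    return json.dumps({"query": query}).encode()

def fetch_html(post_id: str) -> str:
    """POST the query to the (assumed) endpoint and return the post's HTML."""
    req = urllib.request.Request(
        "https://www.lesswrong.com/graphql",  # assumed endpoint
        data=build_payload(post_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["post"]["result"]["htmlBody"]
```

From there, something like `pandoc post.html -o post.epub` handles the conversion step.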
Aggregating forecasts

Geometric mean of the odds = mean of the evidences.

Suppose you have probabilities in odds form, 1:2^a and 1:2^b, corresponding to a and b bits respectively. Then the geometric mean of the odds is 1:sqrt(2^a * 2^b) = 1:2^((a+b)/2), corresponding to (a+b)/2 bits: the midpoint of the evidences.

For some more background as to why bits are the natural unit of probability, see for example this Arbital article, or search Probability Theory: The Logic of Science. Bits are additive: you can just add or subtract bits as you encounter new evidence, and this is a pretty big "wink wink, nod, nod, nudge, nudge" as to why they'd be the natural unit.

In any case, if person A has seen a bits of evidence, of which a' are unique, and person B has seen b bits of evidence, of which b' are unique, and they have both seen s' bits of shared evidence, then you'd want to add them, to end up at a' + b' + s', or a + b - s'. So maybe in practice (a+b)/2 = s' + (a'+b')/2 ~ a' + b' + s' when a' + b' is small (or overestimated, which imho often seems to be the case: people overestimate the importance of their own private information; there is also some literature on this).

This corresponds to the intuition that if someone is at 5%, and someone else is at 3% for totally unrelated reasons, the aggregate should be lower than either. And this would be a justification for Tetlock's extremizing.

Anyways, in practice, you might estimate s' as the historical base rate (to which you and your forecasters have access), and take a' and b' as the deviations from that.
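The aggregation above can be made concrete; a minimal sketch in Python, using the hypothetical 5% and 3% forecasts:

```python
import math

def odds(p: float) -> float:
    """Probability -> odds, o = p / (1 - p)."""
    return p / (1 - p)

def prob(o: float) -> float:
    """Odds -> probability."""
    return o / (1 + o)

p_a, p_b = 0.05, 0.03

# Geometric mean of the odds = midpoint of the evidences (in bits):
geo = prob(math.sqrt(odds(p_a) * odds(p_b)))  # lands between 3% and 5%

# If the two forecasters' evidence were fully independent, you'd add
# their bits (i.e. multiply their odds) instead, landing below both:
indep = prob(odds(p_a) * odds(p_b))
```

Multiplying the odds is what pushes the aggregate below both inputs, matching the intuition above; the plain geometric mean only splits the difference.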

Forecasting Newsletter: July 2020.

Thanks.

The major friction for me is that some of the formatting makes it feel overwhelming. Maybe use bold headings instead of bullet points for each new entry? Not sure.

Fair point; will consider.

ozziegooen's Shortform

> The name comes straight from the Latin though

From the Greek as it happens. Also, alethephobia would be a double negative, with a-letheia meaning a state of not being hidden; a more natural neologism would avoid that double negative. Also, the Greek concept of truth differs somewhat from our own conceptualization. Bad neologism.

Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns

Notes

  • The field of AGI research plausibly commenced in 1956 with the Dartmouth conference. What happens if one uses Laplace's rule? Then it's a priori pretty implausible that AGI will happen soon, given that it hasn't happened yet.

  • How do information cascades work in this context? How many researchers would I expect to have read and to recall a reward gaming list (1, 2, 3, 4)?

  • Here is A list of good heuristics that the case for AI x-risk fails. I'd expect these, being pretty good heuristics, to keep having an effect on AGI researchers, steering them away from considering x-risks.

  • Rohin probably doesn't actually have enough information or enough forecasting firepower to predict at 0.1% that it hasn't happened, and be calibrated. He probably does have the expertise, though. I did some experiments a while ago, and "I'd be very surprised if I were wrong" translated for me to a 95%. YMMV.

  • An argument would go: "The question looks pretty fuzzy to me, having moving parts. Long tails are good in that case, and other forecasters who have found some small piece of evidence are over-updating." Some quotes:

    There is strong experimental evidence, however, that such self-insight is usually faulty. The expert perceives his or her own judgmental process, including the number of different kinds of information taken into account, as being considerably more complex than is in fact the case. Experts overestimate the importance of factors that have only a minor impact on their judgment and underestimate the extent to which their decisions are based on a few major variables. In short, people's mental models are simpler than they think, and the analyst is typically unaware not only of which variables should have the greatest influence, but also which variables actually are having the greatest influence. (Source: Psychology of Intelligence Analysis, Chapter 5)

    Our judges in this study were eight individuals, carefully selected for their expertise as handicappers. Each judge was presented with a list of 88 variables culled from the past performance charts. He was asked to indicate which five variables out of the 88 he would wish to use when handicapping a race, if all he could have was five variables. He was then asked to indicate which 10, which 20, and which 40 he would use if 10, 20, or 40 were available to him.

    We see that accuracy was as good with five variables as it was with 10, 20, or 40. The flat curve is an average over eight subjects and is somewhat misleading. Three of the eight actually showed a decrease in accuracy with more information, two improved, and three stayed about the same. All of the handicappers became more confident in their judgments as information increased. (Source: Behavioral Problems of Adhering to a Decision Policy)

  • I'm not sure to what extent this is happening with forecasters here: finding a particularly interesting and unique nugget of information and then over-updating. I'm also not sure to what extent I actually believe that this question is fuzzy and so long tails are good.
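The Laplace's-rule point in the first note can be checked numerically; a sketch, where using 2020 as the reference year is my assumption:

```python
# Laplace's rule of succession: after n consecutive years without the
# event, the probability it happens next year is 1 / (n + 2), and the
# probability it happens within the next m years is 1 - (n+1)/(n+m+1).
def next_year(n: int) -> float:
    return 1 / (n + 2)

def within_m_years(n: int, m: int) -> float:
    return 1 - (n + 1) / (n + m + 1)

n = 2020 - 1956            # years since the Dartmouth conference
p1 = next_year(n)          # probability of AGI in the next year
p20 = within_m_years(n, 20)  # probability of AGI within twenty years
```

On these assumptions the per-year probability comes out around 1.5%, and the twenty-year probability under 25%, which is the sense in which Laplace's rule makes it "a priori pretty implausible" that AGI happens soon.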

Here is my first entry to the competition. Here is my second and last entry to the competition. The change is that I've assigned some probability (5%; I'd personally assign 10%) to it having already happened.

Some notes about that distribution:

  • Note that this is not my actual distribution; it is my guess as to how Rohin will update.
  • My guess doesn't move Rohin's distribution much; I expect that Rohin will not in fact change his mind a lot.
  • In fact, this is not exactly my guess as to how Rohin will update. That is, I'm not maximizing expected accuracy; I'm ~maximizing the chance of getting first place (subject to spending little time on this).

Some quick comments for forecasters:

  • I think that the distinction between the forecasters' beliefs and Rohin's is being neglected. Some of the snapshots predict huge updates, which really don't seem likely.

An online prediction market with reputation points

Hey! I think this is cool. May I suggest "How many people in Kings County, NY, will be confirmed to have died from COVID-19 during September?" as a question?

I have a forecasting newsletter with ~150 subscribers; I'll make sure to mention this post when it gets sent at the end of this month.

What are the best tools for recording predictions?

Foretold has a public API; requests can be made to it from anything that can send HTTP requests. This would require some work.

What are the best tools for recording predictions?

Personally, I've used Foretold, Google Sheets, CSVs, an R script, and my own bash script (PredictResolveTally), which writes to a CSV.

Personally, I like my own setup best (it does work at the five-second level), but I think you'd be better off just using a CSV, and then analyzing your results every so often with the programming language of your choice. For the analysis part, this is a Python library I'm looking forward to using.
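A minimal sketch of the CSV route; the column names and the scoring rule (Brier) are my own choices here, not a standard:

```python
import csv

# Hypothetical schema: outcome is "1", "0", or "" while unresolved.
FIELDS = ["date", "question", "probability", "outcome"]

def record(path: str, row: dict) -> None:
    """Append one prediction, writing a header if the file is new or empty."""
    try:
        is_new = open(path).readline() == ""
    except FileNotFoundError:
        is_new = True
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)

def brier(rows) -> float:
    """Mean squared distance between forecasts and resolved 0/1 outcomes."""
    scored = [(float(r["probability"]), int(r["outcome"]))
              for r in rows if r["outcome"] != ""]
    return sum((p - o) ** 2 for p, o in scored) / len(scored)
```

Every so often, load the file with `csv.DictReader` and score the resolved rows.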

Assessing Kurzweil predictions about 2019: the results

Browsing Wikipedia, I found that a similar effort was the 1985 book Tools for Thought (available here), though I haven't read it.
