Oh I see. The quoted section seemed confused enough that I didn't read the following paragraph closely, but the following paragraph had the basically-correct treatment. My apologies; I should have read more carefully.
Problem is, there isn't necessarily a modular procedure used to identify yourself. It may just be some sort of hard-coded index. A Solomonoff inductor will reason over all possible such indices by reasoning over all programs, and throw out any which turn out not to be consistent with the data. But that behavior is packaged with the inductor, which is not itself a program.
I'm about 80% on board with that argument.
The main loophole I see is that number-of-embedded-agents may not be decidable. That would make a lot of sense, since embedded-agent-detectors are exactly the sort of thing which would help circumvent diagonalization barriers. That does run into the second part of your argument, but notice that there's no reason we need to detect all the agents using a single program in order for the main problem setup to work. They can be addressed one-by-one, by ad-hoc programs, each encoding one (world model, agent location) hypothesis.
(Personally, though, I don't expect number-of-embedded-agents to be undecidable, at least for environments with some kind of private random bit sources.)
Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.
The problem setup doesn't necessarily require asserting the existence of n people. It just requires setting up a universe in which n people happen to exist. That could take considerably less than K(n) bits, if person-detection is itself fairly expensive. We could even index directly to the Solomonoff inductor's input data without attempting to recognize any agents; that would circumvent the K(number of people) issue.
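One way to picture "index directly into the input data" is a hypothesis of the form (world-generator, offset). Here's a minimal sketch; the generator's dynamics and all names are made up for illustration, not anything from the original discussion:

```python
# Toy sketch: a hypothesis for a Solomonoff-style predictor can be a pair
# (world-generator, offset), where the offset points at where the
# inductor's observations sit inside the generated stream. No agent
# detection is involved, so K(number of people) never enters the picture.

def world_stream(n):
    # made-up deterministic "universe": a stream of n bits
    bits = []
    x = 1
    for _ in range(n):
        x = (x * 75) % 65537  # toy dynamics, chosen arbitrarily
        bits.append(x & 1)
    return bits

def hypothesis(offset, length):
    # predict the next `length` observations as the slice of the world's
    # bit-stream starting at `offset`
    stream = world_stream(offset + length)
    return stream[offset:offset + length]
```

The cost of such a hypothesis is roughly (bits for the generator) + (bits for the offset), with no person-detector anywhere in it.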
"Okay, so you're saying the actual hypotheses that predict my observations, which I should assign probability to according to their complexity, are things like 'T1 and I'm person #1' or 'T2 and I'm person #10^10'?" says the Solomonoff inductor."Exactly.""But I'm still confused. Because it still requires information to say that I'm person #1 or person #10^10. Even if we assume that it's equally easy to specify where a person is in both theories, it just plain old takes more bits to say 10^10 than it does to say 1."
"Okay, so you're saying the actual hypotheses that predict my observations, which I should assign probability to according to their complexity, are things like 'T1 and I'm person #1' or 'T2 and I'm person #10^10'?" says the Solomonoff inductor.
"But I'm still confused. Because it still requires information to say that I'm person #1 or person #10^10. Even if we assume that it's equally easy to specify where a person is in both theories, it just plain old takes more bits to say 10^10 than it does to say 1."
I think this section is confused about how the question "T1 or T2" gets encoded for a Solomonoff inductor.
Given the first chunk in the quote above, we don't have two world models; we have one world model for each person in T1, plus one world model for each person in T2. Our models are (T1 & person 1), (T1 & person 2), ..., (T2 & person 1), .... To decide whether we're in T1 or T2, our Solomonoff inductor will compare the total probability of all the T1 hypotheses to the total probability of all the T2 hypotheses.
Assuming T1 and T2 have exactly the same complexity, presumably (T1 & person N) should have roughly the same complexity as (T2 & person N). That's not necessarily the case - T1/T2 may contain information which makes encoding some numbers cheaper or more expensive - but it does seem like a reasonable approximation for building intuition.
Anyway, point is, "it just plain old takes more bits to say 10^10 than it does to say 1" isn't relevant here. There's no particular reason to compare the two hypotheses (T1 & person 1) vs (T2 & person 10^10); that is not the correct formalization of the T1 vs T2 question.
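The comparison can be made concrete with a toy numeric sketch. All the numbers and the prior here are my own illustration: real Solomonoff induction is uncomputable, and the crude index code below isn't prefix-free, so treat this as intuition only:

```python
# Toy illustration: compare total prior mass of T1-hypotheses vs
# T2-hypotheses, where hypothesis (T & person i) costs roughly
# K(T) + (bits to write i), and prior mass is 2^-(total bits).

def index_bits(i):
    # crude stand-in for the cost of specifying person #i
    return i.bit_length()

def total_mass(k_theory, num_people):
    # sum of 2^-(K(T) + K(i)) over all people i in the theory
    return sum(2.0 ** -(k_theory + index_bits(i))
               for i in range(1, num_people + 1))

# Suppose T1 and T2 have equal complexity, but T2 contains many more people.
mass_t1 = total_mass(k_theory=100, num_people=10)
mass_t2 = total_mass(k_theory=100, num_people=10**6)

# The T1-vs-T2 question compares these totals, NOT the single pair of
# hypotheses (T1 & person 1) vs (T2 & person 10**6).
print(mass_t2 > mass_t1)  # prints True
```

Note that each doubling of the population adds roughly the same chunk of mass under this crude code, so the big-universe theory accumulates more total probability even though each individual (T2 & person i) hypothesis is more expensive than (T1 & person 1).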
I was under the impression that movie producers DO hire experts for this sort of thing. At the very least, I know they hire science consultants for scientific accuracy; I assume they often do the same for historical accuracy.
Let me try another explanation.
The main point is: given a system, we don't actually have that many degrees of freedom in what abstractions to use in order to reason about the system. That's a core component of my research: the underlying structure of a system forces certain abstraction-choices; choosing other abstractions would force us to carry around lots of extra data.
However, if we have the opportunity to design a system, then we can choose what abstraction we want and then choose the system structure to match that abstraction. The number of degrees of freedom expands dramatically.
In programming, we get to design very large chunks of the system; in math and the sciences, less so. It's not a hard dividing line - there are design problems in the sciences and there are problem constraints in programming - but it's still a major difference.
In general, we should expect that looking for better abstractions is much more relevant to design problems, simply because the possibility space is so much larger. For problems where the system structure is given, the structure itself dictates the abstraction choice. People do still screw up and pick "wrong" abstractions for a given system, but since the space of choices is relatively small, it takes a lot less exploration to converge to pretty good choices over time.
There is a major difference between programming and math/science with respect to abstraction: in programming, we don't just get to choose the abstraction, we get to design the system to match that abstraction. In math and the sciences, we don't get to choose the structure of the underlying system; the only choice we have is in how to model it.
Given a fundamental difference that large, we should expect that many intuitions about abstraction-quality in programming will not generalize to math and the sciences, and I think that is the case for the core argument of this post.
The main issue is that reality has structure (especially causal structure), and we don't get to choose that structure. In programming, abstraction is a social convenience to a much greater extent; we can design the systems to match the chosen abstractions. But if we choose a poor abstraction in e.g. physics or biology, we will find that we need to carry around tons of data in order to make accurate predictions. For instance, the abstraction of a "cell" in biology is useful mainly because the inside of the cell is largely isolated from the outside; interaction between the two takes place only through a relatively small number of defined chemical/physical channels. It's like a physical embodiment of function scope; we can make predictions about outside-the-cell behavior without having to track all the details of what happens inside the cell.
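The function-scope comparison can be made literal in code. A minimal sketch, with entirely hypothetical names and numbers:

```python
class Cell:
    """Toy model: outside code interacts with the cell only through a few
    defined channels, just like callers interact with a function only
    through its arguments and return value."""

    def __init__(self):
        # internal state - hidden behind the "membrane"
        self._atp = 100

    def absorb_glucose(self, amount):
        # one defined channel across the membrane (numbers are made up)
        self._atp += 30 * amount

    def secrete_protein(self):
        # outside observers can predict behavior from this interface
        # alone, without tracking the internal ATP bookkeeping in detail
        if self._atp >= 10:
            self._atp -= 10
            return "protein"
        return None
```

To predict outside-the-cell behavior, we only need to track the interface calls, not the internal state - which is exactly why the "cell" abstraction is forced on us by the system's structure rather than freely chosen.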
To draw a proper analogy between abstraction-choice in biology and programming: imagine that you were performing decompilation. You take in assembly code, and attempt to produce equivalent, maximally-human-readable code in some higher-level language. That's basically the right analogy for abstraction-choice in biology.
Picture that, and hopefully it's clear that there are far fewer degrees of freedom in the choice of abstraction, compared to normal programming problems. That's why people in math/science don't experiment with alternative abstractions very often compared to programming: there just aren't that many options which make any sense at all. That's not to say that progress isn't made from time to time; Feynman's path-integral formulation of quantum mechanics was a big step forward. But there's not a whole continuum of similarly-decent formulations of quantum mechanics like there is a continuum of similarly-decent programming languages; the abstraction choice is much more constrained.
I understood "ritual" here as not just a blackbox process, but a blackbox process which has undergone cultural selection - i.e. metic knowledge. If we "treat baking as a ritual" in that sense, it would mean carefully following some procedure acquired from someone else, on the assumption that some parts are really important and we don't have a good way to tell which.
Telomere shortening is an interesting case. (I'm going to give my current understanding here without trying to dig up references, so take it all with a grain of salt.)
It's clearly a plausible root cause - it's a change which could stick around on long enough timescales to account for aging. On the other hand, it is possible for telomeres to turn over: telomerase is active in stem cells, so telomere length should at least not be an issue for cell types which regularly turn over - the telomeres turn over with the cells, which are ultimately replaced from the stem cells. For long-lived cells, there's a stronger case that telomere shortening could be an issue.
Telomeres do get shorter with age, BUT they get shorter even in cell types which turn over regularly. That's a bit of a red flag - either the telomeres aren't being fully replaced by telomerase in the stem cells (in which case the stem cells ought to die a lot sooner), or some mechanism besides accumulated loss over the lifetime is shortening them. The alternative mechanism which jumps out to me is DNA damage: oxidative damage in particular has been observed to rapidly shorten telomeres. DNA damage and oxidative damage rates are generally observed to be much higher in aged cells of most types, so that would explain why telomeres are shorter in older organisms.
In terms of actual experiments, telomerase-boosters have been experimented with a fair bit, and my understanding is that they don't have much effect on age-related diseases (though of course there's the usual pile of low-N studies which find barely-significant and blatantly p-hacked results).
Other things will be covered later in this sequence.