OK, let's decompose this.
First off, for the record, I was a scholarship swimmer at university and have a degree in Biological Science. With that in mind:
"The claim that Thomas jumped from being basically bottom-of-the-barrel in men’s sports to competing at topmost levels of women’s is so incredibly widespread that it is not even questioned at all."
RESPONSE: True, but I don't make much of the term "bottom of the barrel"; that comes from people who don't understand swim rankings, and who don't realize that any swimmer in the top 1000 in the world is a pretty damn good swimmer. They're just not a "world class" or "elite" swimmer.
(BTW, I totally understand your frustration at trying to noodle this out; it's easy to find the Top 100 in world events, not so easy to find anything beyond the top 100, and even more difficult to pull non-elite rankings in national competitions.)
Before I go on, let's add some information that helps break down some of the confusion.
1) The NCAA (and USA High Schools) race in 25 yard long pools, aka "Short Course Yards", or SCY for short. The rest of the world races in 50 METER long pools, aka "Long Course Meters" or LCM for short. This makes comparisons dicey.
2) It is not unusual for some swimmers who are pretty good in SCY pools to be quite a bit less good, in terms of ranking, when swimming LCM. I don't know if this is the case for Thomas, but it's possible.
3) There is a ton of swimming talent in the world. Just because a swimmer qualifies in the top 16 at the NCAAs does not mean they are "elite". In fact, you probably have to be in the top four or five at the NCAAs to be competitive in a world competition; the 16th place swimmer is probably not in the world top 200.
So, it's possible that that meme was created in good faith by somebody who didn't know they were making an apples to oranges comparison. So, let's unkink the debate by sticking to NCAA only, with the caveat that the quality of competition at the Olympics or Worlds is significantly higher than the NCAAs.
Anyway, working in yards, this site lists Will Thomas' best performances in 2019. For comparison against the NCAA only, note that the 1650 yard freestyle is a distance which is competed ONLY in NCAA championship and invitational meets. (Dual meets swim the 1000 yard distance, which makes it kind of useless for comparisons.)
Thomas' best 2019 times, along with the NCAA championship qualifying times:
200 Free … 1:39.31
200 Free NCAA A/B qualifying times: 1:32.05/1:36.32
500 Free … 4:18.72
500 Free NCAA A/B qualifying times: 4:11.82/4:23.34
1650 Free … 14:54.76
1650 Free NCAA A/B qualifying times: 14:37.31/15:26.19
NCAA 2022 1650 Free Winning Time: 14:12.52
NCAA 2022 1650 Free 16th Place Time: 14:51.36
So, note that Will Thomas did not make either the A or the B cut time in the 200, and made only the B cuts in the 500 and 1650. Definitely better at the longer events than the shorter. (There is a reason why Thomas' agenda changed in the 2022 Championships, which I'll mention later.)
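For anyone who wants to check that cut comparison rather than take my word for it, here is a quick Python sketch. The times are copied straight from the list above; the helper names (`to_seconds`, `cut_made`) are just my own for illustration. Keep in mind the last comparison sets a 2019 swim against the 2022 championship field, so treat it as a rough yardstick only.

```python
def to_seconds(t: str) -> float:
    """Convert a swim time written as 'M:SS.ss' (or 'SS.ss') to seconds."""
    minutes, _, seconds = t.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


def cut_made(best: str, a_cut: str, b_cut: str) -> str:
    """Return 'A', 'B', or 'none' depending on which qualifying cut the time beats."""
    swum = to_seconds(best)
    if swum <= to_seconds(a_cut):
        return "A"
    if swum <= to_seconds(b_cut):
        return "B"
    return "none"


# event: (Thomas' best 2019 time, NCAA A cut, NCAA B cut), all in yards
TIMES = {
    "200 Free": ("1:39.31", "1:32.05", "1:36.32"),
    "500 Free": ("4:18.72", "4:11.82", "4:23.34"),
    "1650 Free": ("14:54.76", "14:37.31", "15:26.19"),
}

results = {event: cut_made(*row) for event, row in TIMES.items()}
for event, cut in results.items():
    print(f"{event}: cut made = {cut}")

# Seconds behind the 16th place time in the 2022 NCAA 1650 final
gap_to_16th = round(to_seconds("14:54.76") - to_seconds("14:51.36"), 2)
print(f"1650 Free: {gap_to_16th:.2f} s slower than 16th place in 2022")
```

That comes out as no cut in the 200 and a B cut in both the 500 and the 1650, with the 1650 best sitting about 3.4 seconds behind the 2022 16th place time.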
"And a huge feature of this table, aside from a single outlier, is that Thomas competed on a similar level before and after switching categories from male to female. Notice, her best ranks were 6th (UPDATE: 65th*, her second best time would be 34th in that case) and 7th place, which became 1st and 3rd after transitioning. Now, whether these are due to some natural advantage trans women have could be subject to discussion, but I think that a jump from 7th place to 1st is a much more reasonable one than from the 462nd."
OK, but this didn't happen as you describe it, unfortunately.
First off, you have to throw out the 1000 yard distance. It's meaningless because it's not competed at the NCAA championship level, and there's no analogous event at the world level. So, instead, let's look at the NCAA Championship results from 2019:
There's no mention of Thomas anywhere. However, Thomas' personal bests, taken from the Penn site, would have placed 35th in the 500 free and 23rd in the 1650.
"Now, there remains the case of the 481st ranking in the 200 Freestyle, in which she’d later rank 3rd. This result seems highly incongruent with the rest of the results. I do not really have the data to make any reasonable claims about this number, but I think it may be due to the small sample size. Or for the same reason that her time in 50 Freestyle in 2022/2021 was so low compared to all of the rest. It may have also been Lia just switching her focus from 500 and 100 freestyle to 500 and 200 freestyle as she transitioned, or something."
It's not focus. There are two reasons for this, keeping in mind that as Will Thomas, she was not particularly good at the shorter distances, and was unable to make NCAA qualifying times at those distances. Clearly a better performer at the longer distances.
First, the percentage difference between male and female times is larger in SCY pools than in LCM pools. Why? Because you do twice the number of turns, and those turns are huge advantages for men, who can generate much more force from the lower body than women can, partially due to (on average) a higher percentage of fast twitch muscle fiber, and partially due to their more efficient lower body skeletal geometries. On a turn, the advantage goes to the swimmer who can generate more power off the wall and dolphin kick harder before surfacing.
Secondly, the percentage difference in times between men and women is also larger at shorter distances, again because of the aforementioned physiology.
So, if you put both of those together, it's not surprising that Thomas moved events down from the longer to the middle distances, because it maximized her skeletal geometry advantages, along with the fact that HRT weakens muscle but does not change the ratio of fast to slow twitch fibers. If the meet had been competed in an LCM pool, I suspect she would have stuck to the longer events.
"But nevertheless, this piece of data, along with the 217th place in 2020/2019, comes alongside multiple placements within the top 10 for men, which do enough to show that she was certainly capable, even back then, of achieving incredible times."
"Incredible" is hyperbolic. Thomas' times as Will were good-not-great, from a national perspective. Will Thomas never came close to a top 10 national finish among men. Thomas didn't even make the A qualifying times at the NCAA, not even in his best events.
"Now, here is the question, why would somebody just spread misinformation like that, and why did people so uncritically accept it. This part will be mostly opinion, just a heads up."
Well, I think a large part of the problem is that you've misinterpreted the data. As I previously said, you tried to do something that is very hard to do, compiling multiple data sources, and if you're not a swimmer, you lack the context for determining what's meaningful and what's not.
"However, from the example above this “jump” from 462nd place to 1st was, at best, a misrepresentation of statistics (either purposeful or accidental)."
Could be both. A world ranking of 450-ish is very possible for Will Thomas in the 200m free. It was by far not his best event. A person who doesn't understand how rankings work might easily then have ended up creating an apples-to-oranges meme without ill intent.
HOWEVER, the point remains that a swimmer who never placed in the NCAAs or any world event as a male, and who never made an NCAA "A" qualifying time as a male, WON the 500 free in the NCAAs as a female. THAT is the data point that matters, for anyone interested in the "fairness" part of this debate.
"Because, frankly, if this statistic was true, I would also be much more hesitant about accepting trans women into women’s sports. But, well, it simply is not."
Well, it kinda is. A swimmer who, as a male, couldn't even make the A qualifying time in the 500 free, transitions, and wins the event as female. This is not going to strike very many people as "fair".
"I just have a particular spot for weird-looking statistics that makes me want to debunk them."
You and me both. But in this case, you kinda needed to be an ex-swimmer, or a swimming junkie, or both, to make sense of the data. I think you tried your best to make your case in good faith, but making sense of the data was just not in your wheelhouse.