I like the point about Clone Chip being huge for the BaBW matchup. You're absolutely right that Archer was a key cog in the BaBW decks that were prevalent around that time, and Clone Chip takes Archer from being backbreaking to a tempo/econ hit. Archer was still good, just not amazing, and that could easily be why Kate's winrate vs BaBW climbed so steadily.
I also want to comment on the CT thing a bit more, since a few people on Reddit asked about it as well. I took a few minutes just now to change the Runner filtering to >= 40 cards and run all of the code again, just to triple check that excluding CT hadn't affected anything. The player rating distribution was essentially unchanged (average 1429 vs 1428, stdev 148 vs 149). The distribution plot is virtually identical.
This implies that illegal decks are pretty uncommon -- i.e. that not a lot of people are running 40-card Andy or Kate decks on OCTGN. That's also useful information. (I later counted these games and found that in the 210k dataset there are 35 games played with 45-card IDs using decks that have 40-45 cards.)
Given all of that, I'll simply use this new version of the dataset for Parts 2 and 3. It doesn't change anything and illegal decks weren't a problem anyway!
What follows are more details for the interested. Back when I initially looked at CT's matchup numbers and concluded she didn't have enough games to merit inclusion in the articles, I didn't bother saving any numbers for her, so I played around with the data a little bit to look at them.
Including CT in the data resulted in the skilled players set going up to 695 from 678 -- i.e. 2.4% of the skilled players were playing CT. But at least some of those players really like her, because the dataset of games played by the skilled players went from 71k to 81k games. I suspect (but haven't confirmed) that the huge outlier player I noticed who has played 2k+ games is -- or at least was -- a CT fan.
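As a quick sanity check on those figures, the arithmetic can be reproduced in a few lines of Python. The game counts are the rounded 71k and 81k values quoted above, so the second percentage is only approximate.

```python
# Skilled-player and game counts quoted above (game counts are rounded).
players_without_ct = 678
players_with_ct = 695
games_without_ct = 71_000
games_with_ct = 81_000

extra_players = players_with_ct - players_without_ct
extra_games = games_with_ct - games_without_ct

# Share of each enlarged set that was added by including CT.
player_share = extra_players / players_with_ct
game_share = extra_games / games_with_ct

print(f"extra players: {extra_players} ({player_share:.1%})")
print(f"extra games:   {extra_games} ({game_share:.1%})")
```

With the rounded inputs, including CT adds roughly 12% of the games but only 2.4% of the players, which is the imbalance described above.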
CT looks pretty popular if you only look at total OCTGN games played:
    RunID                          Games
1   Anarch | Noise                 13610
2   Anarch | Reina Roja             2812
3   Anarch | Whizzard               6332
4   Criminal | Andromeda           12785
5   Criminal | Gabriel Santiago    14243
6   Shaper | Chaos Theory           9869
7   Shaper | Exile                  1591
8   Shaper | Kate McCaffrey        14322
9   Shaper | Rielle "Kit" Peddler   4454
10  Shaper | The Professor          1148
However, CT's play has dropped off tremendously since Opening Moves, which is why she doesn't have enough data to merit doing the matchup trends:
    RunID                  Pack                  Games
1   Shaper | Chaos Theory  Trace Amount              4
2   Shaper | Chaos Theory  Cyber Exodus           1350
3   Shaper | Chaos Theory  A Study in Static      1827
4   Shaper | Chaos Theory  Humanity's Shadow       597
5   Shaper | Chaos Theory  Future Proof           1399
6   Shaper | Chaos Theory  Creation and Control    562
7   Shaper | Chaos Theory  Opening Moves          1835
8   Shaper | Chaos Theory  Second Thoughts         430
9   Shaper | Chaos Theory  Mala Tempora            304
10  Shaper | Chaos Theory  True Colors             629
11  Shaper | Chaos Theory  Fear and Loathing       751
12  Shaper | Chaos Theory  Double Time             181
In contrast, here's Kate:
    RunID                    Pack                  Games
1   Shaper | Kate McCaffrey  Trace Amount            425
2   Shaper | Kate McCaffrey  Cyber Exodus            544
3   Shaper | Kate McCaffrey  A Study in Static      1385
4   Shaper | Kate McCaffrey  Humanity's Shadow       481
5   Shaper | Kate McCaffrey  Future Proof           2036
6   Shaper | Kate McCaffrey  Creation and Control    962
7   Shaper | Kate McCaffrey  Opening Moves          3703
8   Shaper | Kate McCaffrey  Second Thoughts        1152
9   Shaper | Kate McCaffrey  Mala Tempora            606
10  Shaper | Kate McCaffrey  True Colors            1261
11  Shaper | Kate McCaffrey  Fear and Loathing      1513
12  Shaper | Kate McCaffrey  Double Time             254
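The per-pack counts can be cross-checked against the overall totals, and the drop-off after Opening Moves put into a single number, with a short Python sketch. The counts are copied from the two tables above; the helper name `share_after` is made up for illustration, and the code assumes the rows are listed in release order.

```python
# Packs in the order the tables list them (assumed to be release order).
PACKS = [
    "Trace Amount", "Cyber Exodus", "A Study in Static", "Humanity's Shadow",
    "Future Proof", "Creation and Control", "Opening Moves", "Second Thoughts",
    "Mala Tempora", "True Colors", "Fear and Loathing", "Double Time",
]
# Games per pack, copied from the tables above.
CT = [4, 1350, 1827, 597, 1399, 562, 1835, 430, 304, 629, 751, 181]
KATE = [425, 544, 1385, 481, 2036, 962, 3703, 1152, 606, 1261, 1513, 254]


def share_after(games, pack):
    """Fraction of an ID's games recorded in packs that come after `pack`."""
    cutoff = PACKS.index(pack) + 1
    return sum(games[cutoff:]) / sum(games)


# Totals should match the overall table: 9869 for CT, 14322 for Kate.
print(sum(CT), sum(KATE))
print(f"CT after Opening Moves:   {share_after(CT, 'Opening Moves'):.1%}")
print(f"Kate after Opening Moves: {share_after(KATE, 'Opening Moves'):.1%}")
```

By this measure about 23% of CT's games fall after Opening Moves, compared with about 33% of Kate's.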
CT did pick up steam from MT through F&L, so it's possible that by the time the next data dump comes out, she'll have enough data to be included. Like TWIY*, she isn't far off.
I actually think it would be quite interesting to see which players are loyal to their IDs and which players switch around a lot, and whether that's correlated with player skill. I'd have to think about how to do that, though, and it certainly won't be until after I get Parts 2 and 3 written!
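One possible starting point, as a rough sketch rather than a worked-out method: define a player's "loyalty" as the share of their games played with their most-used ID, then correlate that with rating. Everything in the example is hypothetical. The `games` records, the `ratings` values, and the function names are invented for illustration and are not taken from the OCTGN dataset.

```python
from collections import Counter, defaultdict
from math import sqrt


def loyalty_by_player(games):
    """Map each player to the share of their games played with their most-used ID."""
    counts = defaultdict(Counter)
    for player, runner_id in games:
        counts[player][runner_id] += 1
    return {p: max(c.values()) / sum(c.values()) for p, c in counts.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up toy data: (player, Runner ID) per game, plus a rating per player.
games = [
    ("alice", "Kate"), ("alice", "Kate"), ("alice", "CT"), ("alice", "Kate"),
    ("bob", "Noise"), ("bob", "Andromeda"),
    ("carol", "CT"), ("carol", "CT"),
]
ratings = {"alice": 1450, "bob": 1400, "carol": 1600}

loyalty = loyalty_by_player(games)
players = sorted(loyalty)
r = pearson([loyalty[p] for p in players], [ratings[p] for p in players])
print(loyalty)
print(f"correlation between loyalty and rating: {r:.3f}")
```

A real version would probably need a minimum number of games per player before the loyalty share means much, and the toy correlation here says nothing about the actual data.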