"When you optimise characters on a tree inferred from the very same characters to draw conclusion about evolutionary pathways (the change of morphs through time, which you model with the tree), it's circular."
Oh, sorry! I agree with this, of course. I didn't understand that you were thinking three steps ahead! I was only thinking of making apomorphy lists or change lists – restatements of the tree, not uses of the tree for further research.
"To de-circularise you need to make a reconstruction without the traits you want to map. Which, in your set-up, is impossible because then you expect a worse tree."
That would be the case if all known characters that are parsimony-informative for the taxon sample were included in the matrix. That's an ideal that probably no analysis has reached. In the present case, a lot of known parsimony-informative characters have yet to be added, as you can see from the first "figure" in this post: https://theropoddatabase.blogspot.com/2019/07/phylogeny-of-lori-analysis-1-philosophy.html – maybe the fact that the characters number exactly 700 is not an accident, but a deliberate restriction to avoid even more delays to publication? The most drastic example I found is that 104 such characters from Brusatte et al. (2014) are not yet included.
"Adding more and more taxa with less and less information, you end up with more and more pseudo-monophyla: wrong internodes, branches that are just artefacts, such as clades based on positively selected convergences."
Yes! You have to add characters, too!
"Adding more characters won't help, because you just add more "?" for that taxon."
It is a very important fact that adding characters can change which trees are most parsimonious for the taxa that can be scored for them. That in turn can change which trees are most parsimonious for all taxa, and that in turn can change the optimizations of other characters: characters for which another taxon may be scored and by which that taxon may be placed. Many papers have analyzed mixed matrices and found that adding molecular data to an otherwise morphological matrix changes the positions of fossil taxa for which molecular data are wholly unknown.
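Here is a toy illustration of that chain of effects. It is a minimal sketch, not anything taken from a real matrix: Python, Fitch parsimony, and an exhaustive search over all 15 trees for one outgroup (O), three living taxa (A, B, C) and one fossil (F). The taxa, the four "morphological" characters and the three "molecular" characters (scored "?" for F) are all made up for the purpose.

```python
# Toy data (hypothetical): O = outgroup, A-C = living taxa, F = fossil.
# Four "morphological" characters, scored for every taxon.
MORPH = {"O": "0000", "A": "1101", "B": "1101", "C": "0010", "F": "1110"}
# Three "molecular" characters, unknown ("?") in the fossil.
MOLEC = {"O": "000", "A": "000", "B": "111", "C": "111", "F": "???"}
COMBINED = {t: MORPH[t] + MOLEC[t] for t in MORPH}


def insert_everywhere(tree, taxon):
    """Yield every rooted tree made by attaching taxon to one branch of tree."""
    yield (tree, taxon)
    if isinstance(tree, tuple):
        left, right = tree
        for new_left in insert_everywhere(left, taxon):
            yield (new_left, right)
        for new_right in insert_everywhere(right, taxon):
            yield (left, new_right)


def all_rooted_trees(taxa):
    """All rooted binary trees on taxa, as nested 2-tuples of names."""
    trees = [taxa[0]]
    for taxon in taxa[1:]:
        trees = [new for tree in trees for new in insert_everywhere(tree, taxon)]
    return trees


def fitch(tree, states):
    """Fitch pass for one character: (state set, or None if all missing; steps)."""
    if isinstance(tree, str):
        return (None if states[tree] == "?" else {states[tree]}), 0
    (lset, lsteps), (rset, rsteps) = fitch(tree[0], states), fitch(tree[1], states)
    steps = lsteps + rsteps
    if lset is None or rset is None:  # "?" is compatible with any state
        return (rset if lset is None else lset), steps
    common = lset & rset
    return (common, steps) if common else (lset | rset, steps + 1)


def tree_length(tree, matrix):
    """Total number of steps over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(
        fitch(tree, {t: row[i] for t, row in matrix.items()})[1]
        for i in range(n_chars)
    )


def most_parsimonious(matrix):
    """Exhaustive search with O fixed as the outgroup."""
    ingroup = [t for t in matrix if t != "O"]
    trees = [("O", t) for t in all_rooted_trees(ingroup)]
    lengths = [tree_length(t, matrix) for t in trees]
    best = min(lengths)
    return best, [t for t, n in zip(trees, lengths) if n == best]


def leaves(tree):
    return {tree} if isinstance(tree, str) else leaves(tree[0]) | leaves(tree[1])


def sister_of(tree, taxon):
    """Set of taxa forming the sister group of taxon."""
    if isinstance(tree, str):
        return None
    left, right = tree
    if left == taxon:
        return leaves(right)
    if right == taxon:
        return leaves(left)
    return sister_of(left, taxon) or sister_of(right, taxon)


for name, matrix in (("morphology only", MORPH), ("combined", COMBINED)):
    length, mpts = most_parsimonious(matrix)
    sisters = [sorted(sister_of(t, "F")) for t in mpts]
    print(f"{name}: {length} steps, {len(mpts)} MPT(s), sister of F: {sisters}")
```

With the made-up morphology alone, the single shortest tree (5 steps) puts F next to A + B. After adding three characters that F can't even be scored for, the single shortest tree (10 steps) puts F next to C, because the new characters pull B and C together and thereby change how the old characters optimize.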
"Worst case scenario is you add a shared apomorphy and two parallelisms, enforcing the wrong clade"
That can happen if the two parallelisms are compatible with each other and if the other 500 parallelisms don't cancel them out.
That, in turn, happens when there are redundant (or less extremely correlated) characters in the matrix. And that's a widespread problem that there's literature about; it is a lot of work to tackle.
"It shows what your data prefers as a backbone tree"
No, because the selection criterion (percentage of missing data) is arbitrary and meaningless.
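To be concrete about what that criterion is: it's just the share of "?" cells in each taxon's row, plus a cut-off somebody has to pick. A minimal sketch in Python, where the taxon names, the scores and the 50% threshold are all made up:

```python
# Hypothetical toy matrix: taxon -> character states, "?" = missing.
matrix = {
    "Taxon_A": "0110?1",
    "Taxon_B": "01????",
    "Taxon_C": "1?0011",
}


def percent_missing(row):
    """Percentage of cells scored as missing in one taxon's row."""
    return 100.0 * row.count("?") / len(row)


# Rank taxa from most to least completely scored.
for taxon in sorted(matrix, key=lambda t: percent_missing(matrix[t])):
    print(f"{taxon}: {percent_missing(matrix[taxon]):.1f}% missing")

threshold = 50.0  # assumed cut-off; any other value is just as defensible
kept = [t for t in matrix if percent_missing(matrix[t]) <= threshold]
print("kept for the 'backbone':", kept)
```

The cut-off is a free parameter in this sketch, and the number says nothing about which characters a taxon is scored for.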
It might be interesting to instead remove the taxa with the highest percentages of character conflict. Those might be the ones that have undergone the most homoplasy. Actually, now I'm curious: has that ever been done?
"It's simply not true that more data means better inferences no matter of the (evolutionary) quality of the data. Not even in the case of genes"
I know; that's an issue of character correlation, not to mention things like heterotachy in the case of model-based approaches.
"There no reason to assume a molecular clade must be wrong, because morphology tells a different story."
I never claimed there was! And frankly you won't find many people (anymore, as opposed to 20 years ago) who do make that claim.
BTW, in your blog post you claim that "there has (to my knowledge) not been a single morphology-based tree that was fully congruent to a molecular tree with sufficient taxon and gene sampling". There are lots; they just aren't mentioned much because they're expected. The cases where there are discrepancies get a lot of attention, and they are invariably those where the morphology is really difficult. Roland picked an example, crocodylian phylogeny, where it looks like paedomorphosis in gharials (and possibly elsewhere) distorts the morphological tree by creating redundant characters.
"but will not get you any closer to the true tree, you are looking for."
Morphological sarcopterygian phylogeny has steadily approached the molecular tree. 25 years ago, the morphological consensus joined the molecular one on the position of lungfish closer to us than Latimeria. 9 years ago, the morphological consensus joined the molecular one on lissamphibian monophyly with respect to Amniota, and the molecular consensus settled on the existing morphological one on batrachian monophyly (frogs + salamanders) excluding the caecilians. And while such an agreement is still lacking on the position of the turtles, their morphological position is now much, much closer to the molecular one than either of the two morphological positions that were competing 10 years ago. Meanwhile, of the two molecular positions that competed back then, the morphologically absurd-looking one (inside Archosauria) is no longer found.
"note: that most topological changes you can observe are only because one relies on strict consensus trees of MPTs, when you would have started with a network, you would have seen that there are just two or three alternatives, and the MPTs picked in the first analysis the one, and in the next with a different taxon set, the other"
This does indeed seem to happen a lot, in my unquantified experience, with trees made from successive updates of the same matrix: there are several nearly equally strong signals in the data, and all but the largest updates just bring out one or another. I agree that it is not enough to present the MPTs (or their consensus) as "the result"; there also needs to be – as there is in the present case – some presentation and assessment of alternatives that are nearly as strongly supported.
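A small sketch of how the strict consensus alone hides this, again in Python with made-up trees. It is not the network approach you recommend, just a count of how many MPTs contain each clade; the three trees and their taxa are hypothetical.

```python
from collections import Counter

# Three hypothetical MPTs as nested tuples of taxon names.
mpts = [
    ((("A", "B"), "C"), ("D", "E")),
    (("A", ("B", "C")), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
]


def clades(tree):
    """Return (leaf set of tree, set of all clades with two or more leaves)."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    members, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        members |= child_leaves
        found |= child_clades
    found.add(members)
    return members, found


all_taxa = clades(mpts[0])[0]
# Count in how many MPTs each clade occurs, ignoring the trivial root clade.
counts = Counter(
    clade for tree in mpts for clade in clades(tree)[1] if clade != all_taxa
)

strict = {c for c, n in counts.items() if n == len(mpts)}
alternatives = {c: n for c, n in counts.items() if n < len(mpts)}

print("strict consensus clades:", sorted("".join(sorted(c)) for c in strict))
for clade, n in sorted(alternatives.items(), key=lambda kv: sorted(kv[0])):
    print(f"alternative {''.join(sorted(clade))}: in {n} of {len(mpts)} MPTs")
```

The strict consensus shows only an unresolved A/B/C trichotomy next to D + E; the list of alternatives shows that the "lack of resolution" is really three specific competing groupings.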
"Just give it a try!"
I, for one, will definitely look into it!
"so much for Simmon's old claim ML-BS is inferior to MP when facing missing data)"
Simmons in his two papers found that ML and Bayesian inference are inferior to parsimony when missing data has a very specific distribution. That distribution can occur in molecular or mixed supermatrices (where different taxa are sequenced for different genes), but hardly in morphological ones. When missing data is instead distributed by rate of evolution (definitely not gonna happen in morphology), Bayesian inference outperforms all. I've discussed this in my big paper.
Anyway, I'm looking forward to your post! (And I still owe you answers to your answers to my comments on a post of yours from back in January. I'll try to get to that this weekend... by now we've probably rehashed half of that here...)