Noting the low resolution capacity of most gene regions and being aware of plastid differentiation within and between species, I would have assumed that, depending on the overall mutation and fixation rate within a lineage, each gene can only mirror aspects of the true maternal tree. Topological conflict must be expected, since fast-evolving (non-)coding regions will perform better at the tips and conservative regions better in the deeper parts of the phylogeny. Plant phylogeneticists love(d) matK, for example, because it provided a good trade-off. However, even when you take only portions of the trnK/matK coding unit, you may already infer differing topologies for each part.
The fact that the single-gene trees fail or conflict does not necessarily mean that they do not share the same evolutionary history. It could just be a consequence of molecular tree inference not being able to elucidate the true tree, either because the data violate the model used (which can also explain differences between amino-acid and nucleotide inferences) or because they do not provide discriminative signal.
Just run a simple experiment: take one tree with, e.g., 10 tips and simulate along this true tree stretches of 1000 nucleotides using strongly differing mutation rates, e.g. 100 simulated 1000-nt-long genes (same model, such as HKY). Then infer the single-gene trees, and you will get a certain number of false positives, i.e. branches that are not in the true tree (if I remember correctly, increasing towards very low and very high mutation rates), some with astonishingly high support.
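A minimal sketch of such an experiment, using only the Python standard library, might look as follows. Everything specific in it is an assumption made for illustration: the 10-tip topology, the equal branch lengths, the base frequencies and kappa of the HKY model, and the chosen rates. To stay short, it swaps in neighbour-joining on uncorrected p-distances for a proper model-based inference and does no bootstrapping, so it only counts wrong splits and says nothing about their support.

```python
import random

BASES = "ACGT"
PI = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}  # assumed base frequencies
KAPPA = 2.0                                     # assumed transition/transversion rate ratio
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
# Hypothetical 10-tip "true" tree as nested tuples; every branch gets the same length.
TRUE_TREE = ((("t1", "t2"), ("t3", ("t4", "t5"))), ((("t6", "t7"), "t8"), ("t9", "t10")))
TIPS = {f"t{i}" for i in range(1, 11)}

def hky_rates():
    """HKY rates q[i][j], scaled to one expected substitution per unit of time."""
    q = {i: {j: PI[j] * (KAPPA if (i, j) in TRANSITIONS else 1.0)
             for j in BASES if j != i} for i in BASES}
    mean = sum(PI[i] * sum(q[i].values()) for i in BASES)
    return {i: {j: r / mean for j, r in q[i].items()} for i in BASES}

Q = hky_rates()

def evolve(base, t, rng):
    """Simulate one site along a branch of length t (event-by-event)."""
    while True:
        t -= rng.expovariate(sum(Q[base].values()))
        if t < 0:
            return base
        targets = list(Q[base])
        base = rng.choices(targets, weights=[Q[base][j] for j in targets])[0]

def simulate(tree, rate, n_sites, rng, seq=None, out=None):
    """Evolve a random root sequence down the tree; returns {tip: sequence}."""
    if out is None:
        out = {}
        seq = rng.choices(BASES, weights=[PI[b] for b in BASES], k=n_sites)
    if isinstance(tree, str):
        out[tree] = seq
    else:
        for child in tree:
            simulate(child, rate, n_sites, rng, [evolve(b, rate, rng) for b in seq], out)
    return out

def p_distances(aln):
    """Uncorrected pairwise distances (proportion of differing sites)."""
    names = sorted(aln)
    return names, {a: {b: sum(x != y for x, y in zip(aln[a], aln[b])) / len(aln[a])
                       for b in names} for a in names}

def neighbor_joining(names, dist):
    """Plain neighbour-joining; returns the topology only, as nested tuples."""
    nodes = list(names)
    d = {(a, b): dist[a][b] for a in nodes for b in nodes}
    while len(nodes) > 3:
        n = len(nodes)
        r = {a: sum(d[a, b] for b in nodes) for a in nodes}
        a, b = min(((x, y) for i, x in enumerate(nodes) for y in nodes[i + 1:]),
                   key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        new = (a, b)
        for c in nodes:
            if c not in (a, b):
                d[new, c] = d[c, new] = 0.5 * (d[a, c] + d[b, c] - d[a, b])
        d[new, new] = 0.0
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    return tuple(nodes)

def splits(tree, all_tips, ref):
    """Non-trivial bipartitions, each stored as the side that lacks tip `ref`."""
    found = set()
    def walk(node):
        if isinstance(node, str):
            return {node}
        below = set().union(*(walk(c) for c in node))
        side = all_tips - below if ref in below else below
        if 1 < len(side) < len(all_tips) - 1:
            found.add(frozenset(side))
        return below
    walk(tree)
    return found

def wrong_split_fraction(rate, n_genes=10, n_sites=1000, seed=1):
    """Fraction of inferred splits that are absent from the true tree."""
    rng = random.Random(seed)
    true = splits(TRUE_TREE, TIPS, "t1")
    wrong = total = 0
    for _ in range(n_genes):
        names, dist = p_distances(simulate(TRUE_TREE, rate, n_sites, rng))
        inferred = splits(neighbor_joining(names, dist), TIPS, "t1")
        wrong += len(inferred - true)
        total += len(inferred)
    return wrong / total

if __name__ == "__main__":
    for rate in (0.0005, 0.005, 0.05, 0.5, 2.0):  # substitutions per site per branch
        print(rate, wrong_split_fraction(rate))
```

Running the script prints, for each assumed rate, the fraction of inferred splits that contradict the true tree, which allows a quick check of whether errors pile up at the slow and the fast end. For the full experiment one would raise n_genes to 100 and replace the neighbour-joining step with the inference method and bootstrap procedure actually used for the single-gene trees.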
Another complication is that, in the case of coding genes, mutations potentially do not get fixed but are repaired to maintain the functionality of the genes. Related to this, mutations are apparently much more probable in certain regions than in others (easy to observe in large, well-sampled genes like matK and in commonly used intergenic spacers). Noting the often homoplastic indel patterns within and between well-established evolutionary lineages, one should not exclude the possibility that, incidentally, convergent mutations outnumber phylogenetically sorted mutations. When combining genes, one quickly eliminates such phylogenetic noise, but when using a single gene you may infer wrong bipartitions (clades in the rooted tree) with varying support. Note that the BS-preferred split is not necessarily the one in the inferred tree. In this context, it would be interesting to see x-y plots of the BS values comparing two data sets (e.g. single-gene vs. complete plastome). Such a plotting option is implemented, e.g., in RAxML. Also, it would be nice to have some BS consensus networks: are they star-like, or are there consistently competing alternatives of equal, low (< 70) BS support?
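The x-y comparison of BS values suggested above can be tabulated with very little code once each bootstrap replicate tree has been reduced to its set of splits. The sketch below assumes exactly that input format (each tree a set of frozensets, every split written as the side lacking one fixed reference taxon); the function names and the five-taxon toy data are invented for illustration and are not taken from any existing tool.

```python
from collections import Counter

def split_support(boot_trees):
    """Percentage of bootstrap replicate trees that contain each split."""
    counts = Counter(s for tree in boot_trees for s in tree)
    return {s: 100.0 * c / len(boot_trees) for s, c in counts.items()}

def paired_support(boot_a, boot_b):
    """Rows (split, BS in A, BS in B) for every split seen in either set.

    A split missing from one data set gets 0 there, so alternatives that
    compete with the splits of the other data set stay visible.
    """
    a, b = split_support(boot_a), split_support(boot_b)
    return sorted(((s, a.get(s, 0.0), b.get(s, 0.0)) for s in set(a) | set(b)),
                  key=lambda row: (-row[1], -row[2], sorted(row[0])))

# Toy input (invented): taxa A-E, splits written as the side without E.
AB, AC, ABC = frozenset("AB"), frozenset("AC"), frozenset("ABC")
single_gene = [{AB, ABC}, {AC, ABC}, {AB, ABC}, {AC, ABC}]
plastome = [{AB, ABC}] * 4

if __name__ == "__main__":
    for split, x, y in paired_support(single_gene, plastome):
        print("".join(sorted(split)), x, y)  # x = single-gene BS, y = plastome BS
```

Each row is one point of the x-y plot. Points far off the diagonal mark splits on which the two data sets disagree, and points with moderate support on one axis and zero on the other are the competing alternatives that a BS consensus network would show as boxes.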
All in all, one would not expect each plastid gene to give the same tree, even if they share the same evolutionary history. Quite the contrary. Many phylogeneticists only like (or have) to pretend that there is no issue with internal conflict and topological instability when using plastid data, to ease the review process. This makes this paper extremely valuable, because it addresses something to which the field has long turned a blind eye.
On the other hand, if the genes were the product of different evolutionary histories, I would expect more evidence for heteroplasmy, recombinant sequences (we have that, e.g., in the case of the nuclear-encoded rRNA cistron), and consistent topological paralogy (multi-labelled trees). Even in oaks, where we have a lot of hybridisation/introgression and resolution issues, and where the plastomes largely ignore species, we haven't found anything like that (yet).
Recombinant plastomes could prove a point here: since the plastid ring is passed on as one or two copies via the chloroplasts carried by the oocyte and then duplicated as a whole during cytogenesis, different gene histories could only become embedded in a plastome via intra-cellular recombination.