Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on December 13th, 2024 and was peer-reviewed by 2 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on January 29th, 2025.
  • The first revision was submitted on March 31st, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • A further revision was submitted on April 29th, 2025 and was reviewed by the Academic Editor.
  • The article was Accepted by the Academic Editor on April 29th, 2025.

Version 0.3 (accepted)

· Apr 29, 2025 · Academic Editor

Accept

Dear Dr. Songhua,
You have substantially improved your manuscript following the reviewers' comments and responded to all of their comments. Your manuscript is now ready for publication.
All the best,
Alexander Ereskovsky

[# PeerJ Staff Note - this decision was reviewed and approved by Jennifer Vonk, a PeerJ Section Editor covering this Section #]

Version 0.2

· Apr 23, 2025 · Academic Editor

Minor Revisions

Dear Dr. Songhua,

One reviewer has marked your manuscript as requiring a minor revision. We are therefore returning it to you for minor revision.

Please pay particular attention to all comments and corrections of the reviewer.
We will be waiting for your revised version of the manuscript.

Best regards,
Alexander Ereskovsky

Reviewer 2

Basic reporting

I appreciate the authors' revisions based on my prior suggestions, which are well received. However, I still worry that many of my prior suggestions were not implemented, with no response as to why those suggestions might not have been important or were ignored. I understand many of my initial criticisms may have seemed substantial, but I still believe they are necessary to make this review fit for publication. I will highlight a few of the more glaring prior ones again, as well as new notes, below. I do hope the authors take the time to carefully read and consider these criticisms, in addition to my prior criticisms that were left unaddressed, and look forward to their responses.

I appreciate the initial reorganizing of the sections. However, I still find many of the points follow a seemingly random order without important and necessary context. In addition, a fair amount of the text is unclear, lacking coherence, and inaccurate based on the citations and/or topic sentences or section titles. A few choice examples, of many, which I cannot note in painstaking detail:

In the introduction, it is not obvious why songbirds are a good model for human language acquisition etc. It seems the only rationale is it is easy to manipulate the behavior and get a readout, leaving out many of the more direct similarities between song learning and human language acquisition. In fact, the only real similarities presented seem to be that they both involve dopamine.
Line 134-146: new additions really need to sum up something and have a clear purpose
Line 147-154: the first sentence has no citations, but is set up as “in songbirds”. The second sentence has a single citation, but seems specific to humans and talks about “speech”, which songbirds do not produce. The third sentence is more comparative, but is supposition more than anything.
Line 155-158: perhaps these are the most well studied, but references show large number of cells in CG (stylized GCt in some articles) and A8, as well.
Line 159: what is “anterior singing control nucleus”? You just said Area X is not about song control, but is anterior; you are saying the projections simply being there affect the learning and production of singing?
Section 4.1 “The effects of DA and dopaminergic distribution” – why are we talking so much about songbirds here? Save this section as introductions to dopamine
Line 229-232: elevation doesn’t mean it is a driver; could be a readout
Line 234: corvids never defined; a human researcher may not understand
Line 235-238: this sentence is difficult to understand, provides little context, and does not conclude much of anything
Line 245-247: should not include a pigeon citation in a songbird paper, AND there exist plenty of other data on this . . . Woolley etc. . . including some citations already in the manuscript
Line 253-257: these seem to be the same sentence, just different wording?
Line 259: “secondary auditory cortex” should have already been defined prior
Line 273: why is a citation on human music listening closing out a section on “DA-induced plasticity in the auditory cortex of songbird”?
Line 309: what is “period of song learning”? sensory or sensorimotor, as presented before?
Line 354: This topic sentence is somewhat orthogonal to the next sentence . . .
Line 359: it is specifically D1A receptor levels
Line 358-368: two pieces of evidence seem conflicting, and are not discussed or put into context. Bosikova says D1(A)R levels are negatively correlated with mean accuracy, but Lebois and Perkel say their activation is not? How can expression be related to song, but binding of the ligand to the receptor not be related?
Line 382-384: This is a sentence severely in need of context. The phrasing requires intimate knowledge of the source material to understand what it is trying to convey, and without having read the paper, means nothing.
Line 391-393: the article cited is a review that does not cite any primary literature supporting this claim, as vague as “DArgic signals” is. The only related evidence that could fit the sentence as written is that male starlings have a larger POM when the bird is both sexually active and producing higher rates of FD compared to UD song. The only other somewhat similar bit of evidence in the review is actually somewhat counter to what is stated – that less TH label in POM is correlated with FD, but not with UD song.
Line 395-399: this study did not identify A11 as producing DA, and in fact cites the studies that identified this fact. Moreover, “further study” is inappropriate when referencing the same study in the following sentence. Most importantly, no real adequate context is given to A11 as a dopaminergic brain region, especially in light of prior sections on main dopaminergic cell regions in the songbird brain, and is never mentioned again.
Line 401 paragraph: The setup implies this section will be about female responsiveness to male song. Banerjee describes a study using both males and females, but no time is given to disentangle the male vs female result. Moreover, to hearken to the prior point, this study also identifies two separate A11 cell groups
Line 419 paragraph: the opening sentence is about male singing. The second sentence is about hearing song.
Lines 433-438: perhaps I am misunderstanding. If females do not show increased striatal DA neurotransmission after hearing courtship songs, then how can female reinforcement of their mate’s song be dependent on striatal DA neurotransmission?
Line 451 paragraph: what the exact difference is between HCA and LCA is not adequately described, and the reader is left wondering. In addition, it is unclear how the conclusion that these are “dynamic [sic] alterations” is appropriate.
Line 486-487: “NS1” and “BS1” in parentheticals are not acronyms of what precede, and are not used again. Suggest describing if important, or removing if not.
Line 489: this is not my area of expertise, but how can D1R inhibit synaptic transmission if it is increasing the sEPSC frequency? Perhaps this speaks to a lack of adequate time in explaining these seemingly relatively simple electrophysiological terms.
Line 516 paragraph: many of these sentences seem speculative, and/or are in need of citation support. For instance, singing releases dopamine which strengthens this behavior – it is a very heavy-handed claim, and in fact the authors cite other papers in other sections that show it is specific receptor activity that modulates motivation to sing, not simply bulk release of dopamine. Moreover, the final sentence citing the release of dopamine strengthening human expression of ideas is conjecture at best, and editorializing at worst.
Line 530 paragraph is incredibly disjointed. The section is titled “Human speech, songbird vocalizations and dopamine” – it is confusing why the topic sentence of this paragraph is about social behavior in rats, and more so why the following sentences discuss ASD. Not enough time is given to set up the idea that ASD is a human communication disorder that can be modeled in animals, likely because of the poor structure of this paragraph.

Experimental design

I appreciate the refining of the search strategy. Some additional description of why certain found articles were not included could be helpful. However, some notes about the papers that made it into the “final thoroughly checked list of references”:

- The number of final articles included does not match the number citations included in the manuscript
- Barr et al. 2021, a and b, seem to be the same citation.
- Appeltants et al. 2000 a and b seem to be the same citation as well, which was noted in the prior review of this article.
- Zhang et al. 2023 a and b, as well, again seem to be the same article.

Validity of the findings

I am a little confused by the figures. For Figure 1, it is curious that no attempt was made to schematize the different regions expressing the different subtypes of dopamine receptors, given the extensive space in the text describing the differing distribution and function of these receptor subtypes. And Figure 2 shows a subset and a different type of data than Figure 1: in Figure 1, there are 6 brain regions that have dopamine receptors, but Figure 2 only shows 5. In addition, “PAG” appears in Fig 1 but not Fig 2, and A11 appears in Fig 2 but not Fig 1. More importantly, the flowchart does not actually show flow of anything, and leaves much to be desired in interpreting the results. It would be much more useful to see which DA-producing regions project to which DA-recipient regions, and then to overlay the putative functions of each of these producing and receiving regions. As of now, it is difficult to disentangle the meaning of the Figure and what new information / insight it adds to the manuscript in a readily digestible manner.

Additional comments

Minor notes:
Line 13-15: this is a tautological argument
Line 17: “singing control nuclei” or “song control nuclei” or “song control circuit”.
Line 21: “DA” would need to be defined prior; substitute to dopamine
Line 22-23: “singing-related behavior in songbirds’ brains” is a weird wording, as the behavior is not in the brain
Line 56: zebra should not be capitalized
Line 57: should be Bengalese; canaries should not be capitalized
Line 70: two periods
Line 96: should be “Dopamine AND Songbird” if this was the search strategy
Line 112: “equivalent” is far too strong
Line 113: “tracheosyringeal”, note
Line 115: “dominates” incorrect terminology
Line 116: “forebrain”, note
Line 134: play a role in what?
Line 134-135: citation needed
Line 190-191: “regulate motor activity” is not exactly physiology as described
Line 211: what is VGlut2? Never defined . . . also then changed to “Vglut2”
Line 212: what is “an intermediate proportion of cells”
Line 217: “crucial brain site” doesn’t mean anything
Line 220: “more abundant” than what?
Line 301: what is the “immediate early gene”?
Line 306-309: which afferents did they stimulate? Assume excitatory?
Line 385: Duffy does not “suggest” this – it is primary data
Line 453-4: “medial preoptic area”, is assumed to be the same as “medial preoptic nucleus (POM)” already introduced, but should be clarified.
Line 461: “spiny neurons” is a form of jargon that is not adequately described, and does not allow the reader to fully absorb the totality of these data and their meaning.
Line 471: assume this is supposed to be “D2R agonists”, but needs to be clarified
Line 476: this data is implied to be found in RA, but is not stated, and not cited
Line 478: this sentence appears to be the same data as is presented in line 471, but a different citation. Clarification needed.
Line 1044: “forbrain” should be “forebrain”
In various places, again, “area X” or “the X region” appears instead of “Area X”

Version 0.1 (original submission)

· Jan 29, 2025 · Academic Editor

Major Revisions

Dear Dr. Songhua,

Both reviewers have marked your paper as requiring a major revision. We are therefore returning it to you for major revision.

Please pay particular attention to the following comments of both reviewers: it is necessary to correct methodological flaws and citation inaccuracies in all parts of the manuscript. Substantial efforts are needed to enhance clarity, coherence, and accuracy to make this paper a useful resource for its intended audience. The review lacks a systematic framework for understanding. The methodology seems to be wrong or not properly described.

[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should *only* be included if the authors are in agreement that they are relevant and useful #]

Reviewer 1

Basic reporting

The topic is timely and highly relevant, as no recent reviews have comprehensively addressed dopamine's role in songbird vocal behaviors and the topic has potential significance for both basic science and clinical applications. However, it requires major revisions to correct methodological flaws, citation inaccuracies, and disorganization. Substantial efforts are needed to enhance clarity, coherence, and accuracy to make this review a useful resource for its intended audience.

1. The order of the sections is well chosen but within sections, the points follow in seemingly random order often without the necessary context to understand it if you are not already familiar with the topic or specific publication, and some of the statements are wrong or lack the appropriate citations. A lot of work is necessary to make the text correct, understandable and useful for a broad and cross-disciplinary audience.

Example only from Section 3.2 but similar criticism could be made about other sections:

o Section 3.2 does not clearly separate between song learning in juveniles and adults and jumps back and forth between the two distinct forms of learning without clearly stating the context.
o Furthermore, “significant deficits when exposed to white noise” is not commonly understood as a form of learning outside of the songbird field and requires further explanation.
o Gadagkar et al. 2016 is the main and first paper showing performance error encoded by dopamine in singing birds, and it revolutionized the field. This paper is not mentioned anywhere in the manuscript and should definitely be discussed before talking about the literature from Chen et al. (line 205-211).
o The paragraph at line 228 starts with the following sentence: DArgic signaling in the songbird pallium plays a crucial role in motor learning. First of all, motor learning was discussed before, and this sentence does not make clear what different aspect the authors look at here. Secondly, the rest of the paragraph does not speak about “pallidal” areas but about HVC and PAG. The next sentence is misleading in a different way: zebra finches can also learn from playback without activation of PAG_HVC terminals; learning is only facilitated by the activation (see the cited publication, Tanaka et al. 2018). The next sentence talks about the regulation of attention-associated learning, which seems completely out of context without background knowledge about Tanaka et al. Please first tell the reader that the facilitated learning is assumed to be due to increased attention (Tanaka et al. 2018) before transitioning to this topic. Furthermore, the cited publication here (Mandelblat-Cerf et al. 2014) is about completely different brain areas and does not mention “attention” a single time.

2. 3.1 DArgic system innervation in songbirds:
The section starts with a general overview of dopamine without clearly stating so. Based on this title one would expect information on dopamine in songbirds. Furthermore, the first sentence is wrong: DA is not produced in the basal ganglia.

3. Conclusion: After going through all this literature, it would be great to have a summary in the conclusion of the main known function of DA in the songbird before going into the open questions.

4. Figures of the key experimental findings would be helpful for emphasis and understanding.

Experimental design

5. The methodology seems to be wrong or not properly described (2. Survey methodology). The authors claim that they identified 94 publications on three platforms using the search term [Dopamine OR Songbird OR Singing behavior]. First of all, this search term is poor and far too broad for a review on ‘Neural mechanisms of dopamine modulating singing related behavior in songbirds’. Secondly, a quick search on PubMed with the provided search term gives me roughly 15,000 publications between 1994 and 2024, orders of magnitude more than what the authors claim. I thus conclude that their search strategy is not accurately depicted by their summary and that their study could not be replicated from the information provided.

6. It was not in my capacity to check every single citation that I am not familiar with. But from the many I know or checked, several ones do not support the statement made by the authors which makes me question the rest of them too. Furthermore, there are many statements made that lack a citation and many missing papers.
o The introduction and overview of the song system contain very few citations, and only single ones, often providing just one example study when there are many others that should be cited too.
o Sasaki et al. 2006 (line 251-254) did not measure accuracy nor sequential match as stated in the text.
o The cited publication Das, Goldberg was published in 2022 and not 2021
o Line 322-325 suggests that Barr and Woolley found that “developmental exposure to music influences the density of dopamine-inducing neurons in the anterior VTA”. However, Barr and Woolley did not show that and only mention “musical training during childhood” in their introduction. Alternatively, the authors refer with this sentence to other literature (not cited) and most likely studied in humans not birds (not mentioned), in which case this paragraph needs further clarification.
o Papers that also contributed significantly to the field of DA in singing behavior and do not appear in the review: Chen et al. 2018, Xiao 2018, Kubikova 2010, Gadagkar et al. 2016
o Examples of missing citations: line 96, 125, 130, 212-216, 245, 275, 319, 323, 334, 389, …
o See my comment above for section 3.2

Validity of the findings

Nothing to add besides the comments above.

Additional comments

Minor points:
1. Line 46: The cited publication (Prather 2013) is just one (random) example. It would be better to provide several example studies to make this point. More such examples in the introduction of the song system.
2. General: It is ‘song learning’ or ‘learning to sing’ not ‘singing learning’
3. Line 89: It is not a trivial question which brain areas in the songbird brain are homologous to which areas in mammals. Please use less certain statements when comparing the two.
4. Line 100: I don’t understand ‘the sending of singing’
5. Line 251-254: Please specify what accuracy and sequential match is or provide citation. Also, citation is wrong, see above.
6. Line 266: Song maintenance is used to refer to the quality of song (a lack of degradation) and not high singing rates. Please use different wording to not mislead the reader.
7. Line 274-280 is misleading. Consider moving the citation to an earlier sentence and ending with something along the lines of “As Duffy et al. suggest, these findings …”
8. Line 386-398: Please clarify that this whole paragraph is paraphrasing the study from Lukacova et al. 2016. The way it is written now implies many missing citations and Lukacova et al only applies to the last sentence.
9. Line 440: DA is a key substance and not ‘the’ key substance. Otherwise, this would need a citation.
10. Line 445: I am missing why songbirds are used (besides just having another model system) and for which questions songbirds provide an advantage over other model systems.
11. Line 448: I don’t understand what “our research findings, both domestically and internationally,” means. Could you rephrase this?
12. Line 451-453: Your current sentences sound as if everything is known about DA in Area X and nothing about HVC and RA. I suggest rephrasing this to make clear that there are still open questions about how DA acts in Area X and that there are some, but fewer, studies about DA in HVC and RA. You mentioned several studies in these areas yourself.
13. Line 461: The transition to astrocytes is abrupt and unclear.
14. Line 721: “Xiao, L., and Coauthors, 2021”: authors should be listed.

Reviewer 2

Basic reporting

The manuscript offers “an updated review” on dopamine regulation of singing behaviors in songbirds. The topic is of broad general interest, and a review of the proposed scope would be incredibly timely. However, the field has been reviewed recently, in part by this manuscript’s senior author (https://doi.org/10.16476/j.pibb.2020.0189), with only approx. 10% of the cited studies having been published in the time since this author’s last review. I have numerous concerns about the manuscript in its current state, which I have outlined below.

The abstract currently sets the expectation that a significant portion of the review will be dedicated to comparison of songbird vocal learning with human language acquisition. At best, human language acquisition, and similarities to songbird vocal learning, are only briefly broached in the main text. In some places, human data and avian data are intermingled without either noting which data comes from which model, or drawing comparisons. In addition, the abstract does not do much to set up the review of dopamine, the justification for focusing on this single molecule as one of the “mechanisms underlying vocalization behavior”, etc. I suggest reframing the abstract to dedicate more space to WHY focus on dopamine for the review, because as of now, it is not clear that the authors could not substitute the mention of “dopamine” (line 21; line 23) for any other of the many molecules related to vocal learning in one way or another, and such a change would not substantially change the expectation for the rest of the main text.

The introduction spends a large amount of time discussing songbirds and their use in research into vocal learning, but comparatively little time introducing dopamine, dopaminergic systems, why they are important, and why they merit review in the context of vocal learning. This lack of dopaminergic review would be generally fine if there was adequate time in the main text devoted to reviewing dopaminergic systems, but there is not (see below). I suggest adding more to the introduction to introduce dopamine, why it is interesting in the context of vocal learning, etc. in broad strokes - and adding more specific descriptions in these areas to the main text (see below).

In many places the text is cumbersome (e.g., line 172-175), the logic is backwards (e.g., line 175-178), paraphrasing actually provides the opposite argument to what is provided in the associated reference (e.g., line 187-189; line 195-211), or the text uses inconsistent, conflicting, vague, or undefined terminology that detracts from the reader's ease of understanding (e.g., line 222, "area X" when "Area X" was already used; line 283, "mPOA" implied as a different region from the already-presented "POM"; line 322, "paired" could mean placed in a cage with a conspecific or heterospecific, or a juvenile, or mated and in a pair-bond; line 195-197, an "immediate early gene" is not described, nor is the specific gene stated; line 95, "AFP" stands for "Anterior Forebrain Pathway", but is presented as "Anterior Brain Pathway"; line 251-254, "mean accuracy" and "sequential match" are jargon left undefined; and many more examples too numerous to include here). I suggest significant revision for clarity and precision.

Experimental design

Is the Survey Methodology consistent with a comprehensive, unbiased coverage of the subject? If not, what is missing?
I am somewhat confused by the survey methodology, as conducting the same search using the same terms and timeframe in Scopus, for instance, gives me over 50,000 results. Thus there are many potential articles missing from an unbiased coverage of the subject, including the senior author’s previous review (noted in the prior section). Curiously, the survey methodology states 90 articles were used, yet the citations list has only 88 (note, 2 of which are duplicates from Appeltants et al., which the authors state to have already removed, and an incomplete citation that the citation manager generated an invalid error from, and which is not cited in the main text). In addition, at least one article cited in this manuscript is a “perspectives” article, offering a sort of opinion piece on a recent publication in the same issue of the publishing journal, yet the article it is offering a perspective on is not in the citation list. Many of the included review articles reference other articles that are additionally missing from this manuscript. Finally, if the authors set out the expectation for a comparative review to human speech, at minimum the search strategy should include something related to human speech, other than the general “dopamine”.

Are sources adequately cited? Quoted or paraphrased as appropriate?
In many places, citations are inappropriate. Most glaring are the citations from non-songbird avians, predominantly chickens, ducks, and pigeons (e.g., line 300), or the citations of human/mammal data that are interspersed in sections on songbirds, but not noted as coming from mammals (e.g., lines 465-467). More specifically, in lines 54-57, these citations do not support claims that these are the primary animal models for vocal learning research, and the Hahn citation is a study looking at mechanisms of song perception, not song production or vocal learning. I do agree with the claims that these are perhaps the most studied systems in songbirds, though I suggest citing a paper that has actually attempted to quantify this - e.g., https://doi.org/10.1186/s12868-024-00919-3. For another example, in Line 90, this citation shows a NON-significant correlation of HVC to the two regions stated. In fact, they show a higher correlation of LMAN to Broca’s area, albeit still non-significant. Note these data are all based on gene expression profile comparisons. This citation DOES actually show a significant correlation of songbird RA to primate LMC, while the Xiao and Roberts 2021 citation (lines 91-92) does not even mention RA, mostly focusing on Area X. In lines 189-192, the Hisey reference does not have any experimental design including social isolation; the juveniles during the song learning phase are always with siblings, and largely with tutors during the majority of the learning phase. A final example (simply to hearken back to the survey methodology), in line 121-124, the citation does not support the claims stated. It is a perspectives / short communication, mostly in response to https://doi.org/10.1152/jn.01053.2004, a paper that did not make it into the authors’ citation list.
Further, the citation I provided also does not support the claim that Area X receives more dense innervation from VTA and SNc than surrounding striatum - the authors found that Area X has more dopamine content than the overlying pallium (telencephalon/”cortex”), and makes no claims this dopamine is specifically from innervation from VTA/SNc. These are not the only examples of such issues in citations not fitting the claims, but they are too numerous to include here in toto. This requires substantial revision from the authors, so that the references and claims they make are supported by the accompanying literature.

In other places, citations are not provided where they are necessary for claims made (of which there are too many to list here). When sources are appropriately cited, the vast majority of cited sentences are largely lifted from the abstracts of the associated articles. In many of these locations, it is obvious these sentences are copy/pasted with one or few words changed, often transcribing the format of those sentences and, at times, inadvertently copying the typographical errors of the original citation's text. In other locations, paraphrasing has been done in such a way that the meaning of the original sentence is lost - for instance, lines 187-189, “endogenous and exogenous” are not appropriate terminology in this context, but synonyms of the original words, and the paraphrasing loses the point that dopaminergic innervation of Area X is necessary for externally reinforced vocal learning in adults MORE strongly than that type of learning in juveniles, evidence of which is weak at best.

Is the review organized logically into coherent paragraphs/subsections?
I find that the review lacks a systematic framework for understanding, and offers an uncritical summary of a few select articles, mostly the same articles included in the senior author’s previous review, with some few newer additions. The framing in the abstract and the end of the introduction lead to a comparative expectation for the reader, at very least in comparison to human language and speech disorders. Yet, there is no section dedicated to humans, speech, or any non-songbird animal in the main text.

Other than the few sentences noting the similarity of songbird vocal control nuclei to comparable regions in humans, there is only one place in the main text otherwise referencing humans (line 115); the remainder of references to human vocal learning / speech is in the Conclusions section. It is difficult to understand how a Conclusion could be drawn without presenting data in the first place. In broader strokes, the Conclusions is far too long, and presents an astounding amount of reviewed data that merits its own section - with necessary framing, therefore, to be substantially added to the introduction. I suggest moving the bulk of the Conclusions section to the main text, and expand upon it, in a deliberate section on comparisons between human speech, songbird vocalizations, and dopamine. This would satisfy some of the above note about the abstract setting up a comparative expectation.

The main text sections do not do adequate justice to describing dopaminergic systems more broadly (from a cellular, systems, or mechanistic perspective), which makes the following sections difficult to put into context. For instance, no discussion of what the dopamine receptor subtypes actually ARE is included. Are they all ionotropic or metabotropic? G-protein coupled? Gi/Go/Gq? Are their bindings generally excitatory or inhibitory to the cell? Etc. These are all necessary for putting the associated data in the remainder of the text into context, and the authors should dedicate a section to such descriptions, and others, to aid in the reader's ease of understanding.

In some places, section titles do not adequately fit cited content. For instance, in Section 3.1 (note: the second section with the same number), the title of this section is “dopaminergic system innervation in songbirds”. The first paragraph broadly contains references that are, in some cases, particular to mammals, and in other cases very specific to humans. Little effort is made to disentangle these, but the framing of the section title leads the reader to assume all of these claims are true in songbirds, despite significant lack of evidence in some cases. Time should be dedicated to disentangling these, or, at very least, clarifying what is known in songbirds, and what might be predicted based on knowledge from other models. In other places, sections are devoted to “song learning”, and the bulk of their content implies this is developmental song learning, but some citations are from adult song modification. More broadly, the sequencing of sections leaves something to be desired. Song perception is the necessary first step before song production learning, yet the sections begin with song learning, followed by song production, and then song perception. Reorganizing these sections would aid the flow of the theoretical framework. Finally, the title of section 3.6 makes little grammatical sense, and is an entire section devoted to a single citation. I suggest reincorporating this citation and some sentences of the paragraph into another, more appropriate, section.

Validity of the findings

Is there a well developed and supported argument that meets the goals set out in the Introduction?
I find there is very little argument made, other than (paraphrasing) “dopamine is important and should be studied more”. In the main text of the manuscript, the authors draw few, if any, conclusions; the majority of the text is a collection of facts, uncritically summarized, and severely lacking any kind of perspective. Few original claims are made as to what these different cited results might mean for the system or how it works; in places where this type of language is used, it is lifted from other references, and cited, as it is not the authors’ own conclusions or additions. The main goals from the introduction seem to be 1) to review new tools for circuit manipulation and monitoring neural activity of dopamine, 2) discuss the neural mechanisms of dopaminergic modulation of singing, and 3) draw parallels between birdsong and human speech, in particular human speech disorders. The main text does little to directly address these goals. At best, point 2 is addressed in the text, though the discussion of "mechanisms" is broad and some specificity is lacking (e.g., different receptor subtypes and their mechanisms of action mentioned above; more recent evidence of VTA neurons co-releasing dopamine and glutamate; what DARPP-32/DA colocalized circuits might be doing functionally; etc.).

Does the Conclusion identify unresolved questions / gaps / future directions?
As mentioned previously, the Conclusion section does little to draw specific attention to the data provided in the main text. In some places, the authors make claims that are counter to what is presented in the main text (see: lines 452-453; lines 458-460). In other places, future directions are presented with very little context (see: lines 493-499 on scRNA-seq): the authors do not describe why this future direction is important, nor do they make any mention of the numerous works already using scRNA-seq in songbirds. As previously mentioned, I suggest moving the majority of the conclusions to a dedicated section in the main text on the comparisons to human language, speech disorders, etc. In addition, I suggest reframing future directions more specifically - e.g., is there a specific brain region, cell type, or circuit that the reviewed data indicate would be particularly fruitful for future study, and how would that advance our understanding of the system?

Additional comments

I do not mean to provide excess criticism of the article. I believe the scope, as offered in the abstract and introduction, is incredibly timely and would be of great service to the field. I am generally positive about such an article being published, and its contributions would undoubtedly be of great interest to multiple fields, not solely birdsong researchers. However, the article in its current state sadly falls quite short of fulfilling those expectations. I hope the authors take up my above suggestions in restructuring the manuscript's organization, content, and conclusions in such a way that those expectations can be fulfilled for such a potentially interesting and useful review.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.