Rookie Mistakes: Preemptive Comments on Graduate Student Empirical Research Manuscripts

L. J Zigerell

doi:10.1017/S104909651200131X

Published online by Cambridge University Press: 04 January 2013

Affiliation: University of Pittsburgh

Abstract

Political science graduate students need to develop strong skills in drafting empirical research manuscripts. Yet, many graduate student manuscripts contain similar shortcomings, which require student peers, faculty advisors, and journal referees to produce the same comments for multiple manuscripts. This article lists common comments on empirical research manuscripts, as a reference to help students revise their manuscripts before presentation to others for review, so that reviewers can focus on the more substantive elements of a manuscript, thus producing better manuscripts that are more likely to be published and thus contribute to knowledge about political phenomena.

Type: The Teacher
Information: PS: Political Science & Politics, Volume 46, Issue 1, January 2013, pp. 142–146

DOI: https://doi.org/10.1017/S104909651200131X
Copyright: Copyright © American Political Science Association 2013

Solo-authored peer-reviewed research publications send a strong signal of research abilities. But the nontrivial duration from submission to acceptance and publication limits the number of times that graduate students can submit their manuscripts before entering the job market, thus magnifying the negative impact of manuscript weaknesses that trigger a rejection. Students need resources to increase the probability that their manuscript is accepted for publication. This article, based on an idea from Postman (1988), serves as one such resource.

Postman explained the perceived success of doctors and lawyers relative to teachers as a function of perspective: doctors and lawyers focus on correcting negative outcomes, such as sickness and injustice, while teachers concentrate on producing positive outcomes on less well-defined characteristics such as intelligence. Postman recommended that teachers instead adopt a negative perspective that focuses on correcting stupidity, the much more recognizable opposite of intelligence. In that spirit, this article supplements advice on how to write well (Bem 2003; Guberman 2010; Kirshner 1996) with advice on how to not write poorly. This list of 70 comments reflects common flaws in graduate student research manuscripts; students can address similar flaws in their manuscript before requesting a review from colleagues, faculty, or anonymous journal referees so that these reviewers can focus on the more substantive elements of the manuscript.

TITLE

1. Journal referees provide feedback to the journal editor about the execution of the research and the quality of the manuscript that describes the research; journal referees also comment on whether the manuscript and its reported research are important enough to be published in that journal. Journals are ranked by impact factors that indicate the mean number of citations received by articles in the journal, so the editor who ultimately makes the reject-or-publish decision is likely to be more interested in broad and potentially influential manuscripts than in narrow and potentially noninfluential manuscripts. The manuscript title is the first opportunity to signal the breadth and importance of the manuscript, so the manuscript title should not be phrased in hyperspecific language, such as “The Effect of Public Opinion Polls on Presidential Vetoes in Freedonia, 2000–2005,” which implies a narrowly focused and less-theoretically developed manuscript concerning one type of executive-legislative interaction in one particular country during one particular time; instead, titles should indicate the most general level at which the theory can legitimately be applied, such as “The Effect of Public Opinion on Executive-Legislative Conflict.”
2. General titles can be supplemented with a subtitle to indicate the contours of the research, such as “The Effect of Public Opinion on Executive-Legislative Conflict: Presidential Vetoes in Freedonia, 2000–2005.” But the title should not include the much-less-common technique of multiple subtitles, such as “I'm Against It: The Effect of Public Opinion on Executive-Legislative Conflict: Presidential Vetoes in Freedonia, 2000–2005.” A three-headed title might not be incorrect, but a general rule for graduate students is to avoid stylistic choices that are not commonly made by others in the field.

ABSTRACT

3. Early drafts of a manuscript should include an abstract because the process of condensing a manuscript into a few sentences helps an author pinpoint what he or she is trying to accomplish in the manuscript and its reported research. This abstract should be 200 words or fewer because readers use an abstract to quickly decide whether to read the manuscript.

INTRODUCTION

4. The introduction should justify the manuscript and the reader's attention to the manuscript and make the case—not merely assert—that the research concerns an important and/or interesting phenomenon that has been covered incompletely or incorrectly in the literature.
5. The first substantive section of an empirical research manuscript is the introduction, so there is no need to label the introduction as “Introduction” unless manuscript sections are numbered or unless required by the formatting rules of the intended journal.
6. Empirical research manuscripts have a standard template: introduction, theory, hypotheses, research design, results, and conclusion. Unless the manuscript has a unique structure, there is no need for the introduction to conclude with a roadmap, such as “in the first section, I do this; in the second section, I do that.” Readers are most likely prepared for the manuscript to conclude with a conclusion that offers some concluding thoughts.

LITERATURE REVIEW

7. The literature review can be placed into its own section or integrated into the introduction or the theory section. In all cases, the literature review should not merely report a history of research on the topic of the manuscript; rather, the literature review should situate the research reported in the manuscript into the literature on the topic so readers are informed of the manuscript's relationship to the broader literature. For example, if the purpose of the manuscript is to address mixed research findings, then the literature review should support the assertion that the research is mixed.

THEORY

8. Theory is not background facts, a definition, restatement of the hypotheses, or implications of the hypotheses; theory provides an explanation for the expected correctness or incorrectness of the hypotheses.
9. The theory should be conceptualized at the most general level possible and any mention of the observations that were used to test the theory should be postponed until the research design section. For example, based on the previously mentioned hypothetical manuscript on the effect of public opinion on presidential vetoes in Freedonia between 2000 and 2005, there is no need for the theory section to mention Freedonia or the 2000–2005 time period, unless the theory is limited to that context.

HYPOTHESES

10. The theory section should provide a reason for expecting that an explanatory variable affects a dependent variable in some particular manner, so hypotheses should be directional.
11. Each hypothesis should be phrased so that the hypothesis can be completely rejected. Double-barreled hypotheses—such as the claim that presidents are more likely to issue vetoes and make recess appointments when an opposition party controls the legislature—should be split into separate hypotheses unless a joint hypothesis is intended.
12. Hypotheses should not include the words “may” or “might”: hypotheses are claims, and proposing that something “might” be true is not a claim as much as an admission of possibility.
13. Hypotheses should not include vague or undefined terms, such as “substantially,” because the correctness of such hypotheses cannot be independently assessed. For example, it is unclear whether a 5% difference is sufficient to reject the hypothesis that respondents are not substantially more likely to support policy X over policy Y.
14. Hypotheses should indicate the dimensions on which the hypotheses will be evaluated. For example, if a hypothesis concerns the level of a variable, the manuscript should indicate whether level refers to frequency, number, and/or magnitude. For example, if a hypothesis is that some factor will increase the level of protest activity, the manuscript should indicate whether this means a higher frequency of protests, a larger number of protests, more people involved in the protests, more passionate protests, or a longer duration for each protest.
15. Hypotheses should not be proposed for control variables; control variables are included in observational studies only to approximate the randomization of a controlled experiment. Manuscripts are not about the control variables.
16. Hypotheses are causal claims so there is no need to indicate that the hypotheses were tested all else equal. Readers of empirical research manuscripts should understand that the testing of causal claims necessitates that all else be held equal.
17. Hypotheses should be numbered consecutively, such as H₁, H₂, and H₃, instead of being named with number-letter combinations, such as H₁a, H₁b, and H₂, because number-letter combinations foster confusion: H₂ is the third hypothesis in the aforementioned example.
18. Each manuscript should be about one thing, so the number of hypotheses in each manuscript should be limited. Manuscripts with a large number of substantive hypotheses might best be split into multiple manuscripts.

RESEARCH DESIGN

19. The research design should be described in enough detail to permit replication of the research using only the descriptions and directions provided in the manuscript or appendices. For example, the manuscript or appendices should provide the exact text of survey measures, identify assumptions and/or weighting techniques, and describe the handling of missing data.
20. Each element of the research design does not need to be reported in the research design section of the manuscript and can be relegated to an appendix or a footnote if the element is not necessary for the reader to understand the research design. For example, details about the coding of control variables are important to include for replication purposes but do not need to be reported in the main body of the manuscript.
21. The research design should report the context under which the observations were collected, such as the dates on which experiments were performed, and any relevant real-world events that might have influenced the results, such as a political scandal or disaster.
22. Coding choices should be justified with a theoretical argument and with citations to studies that have used the coding choices in a similar context.
23. The scale for each variable should be chosen so that the results tables lack coefficients that must be placed in scientific notation.
24. The coding of variables should be justified, both in terms of fitting the statistical technique and in terms of reflecting the theory of the manuscript. For example, if a manuscript investigates whether being mugged increases political conservatism, coding the mugging variable as a count variable presumes that each mugging increases political conservatism, but coding the mugging variable as a dichotomous mugged-or-not-mugged variable presumes that the first mugging is the sole mugging that influences political conservatism.
25. The choice of methodological technique should be justified, both in terms of the analysis meeting the assumptions of the technique and in terms of the technique reflecting the theory of the manuscript. For example, predicting a count of protest activities might require the use of a zero-inflated negative binomial regression if the hurdle between zero protest activities and one protest activity is presumed to be different than the hurdle between one and two protest activities.
26. The research design should identify and justify data selection decisions, such as temporal starting and ending points for observations and any survey items omitted from a battery. For instance, the hypothetical manuscript referenced earlier should explain the restriction of observations to Freedonia and to the 2000–2005 time period.
27. The research design should provide enough information for readers to construct an equation for the model, so most manuscripts do not need to provide model equations such as Y = β₀ + β₁·education + β₂·female + β₃·religiosity + ε. Model equations might confuse or deter readers who cannot interpret the equation, and readers who can interpret the equation will realize that the equation is merely an ornamental indication that the value of the dependent variable is modeled as a function of education, gender, and religiosity.
28. The research design or results section should indicate whether statistical tests were one-tailed or two-tailed.
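Comment 28 matters because the same test statistic can fall on different sides of a preset threshold depending on the number of tails. The following Python sketch illustrates this under a normal approximation. The z-statistic of 1.8, the 0.05 threshold, and the function names are hypothetical and chosen only for illustration.

```python
import math


def one_tailed_p(z: float) -> float:
    """Upper-tail p-value for a z-statistic under a standard normal null."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a z-statistic under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical z-statistic for a coefficient in the hypothesized direction.
z = 1.8
alpha = 0.05

p_one = one_tailed_p(z)
p_two = two_tailed_p(z)

print(f"one-tailed p = {p_one:.3f}, below {alpha}: {p_one < alpha}")
print(f"two-tailed p = {p_two:.3f}, below {alpha}: {p_two < alpha}")
```

With this hypothetical value, the one-tailed p-value falls below 0.05 and the two-tailed p-value does not, so a reader who is not told which test was used cannot tell whether the result met the threshold.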

RESULTS

29. Results should be reported to a reasonable level of precision based on a common-sense evaluation of the research design. Considering the accumulated assumptions of most research designs and statistical techniques, reporting results to more than two significant digits is seldom justifiable; but even if such precision can be justified for summary statistics, little is gained by reporting results to more than two significant digits because readers likely cannot conceptualize the difference between, say, 52.8% and 52.9%.
30. The phrase statistically significant should not be shortened to significant: the phrase significant results implies that the results are important, which might not be the case for results from large datasets that are statistically significant but have little substantive effect.
31. Description of results should report the direction of the effect of a variable and not merely report that a variable mattered.
32. Description of results should report the substantive effect of a variable at reasonable levels of the independent variables. Rejecting or not rejecting the null hypothesis is important, but it is more important to provide a sense of whether the effect of a variable for a typical observation is, say, a 5% increase or a 50% increase. These point estimates should also include a sense of uncertainty in the estimate, so that readers can differentiate an estimate of 50% ± 3% from an estimate of 50% ± 33%. See these sources for ideas on the presentation of substantive results: King, Tomz, and Wittenberg (2000), Tomz, Wittenberg, and King (2001), and Long and Freese (2005). See these sources for presentations of substantive effects using counterfactual analyses: Kastellec, Lax, and Phillips (2010) and Nyhan et al. (2012).
33. The results section should report the sensitivity of results to reasonable changes in the research design, such as the use of a different justifiable estimation technique, a different justifiable operationalization, or a different justifiable missing data handling technique.
34. The results section should report the behavior of the data in light of the assumptions of the statistical technique, such as the possibility of multicollinearity among variables that measure similar phenomena. Consult regression diagnostic sources, such as the UCLA Academic Technology Services website, which offers the “Regression with Stata” web book by Chen et al. (2003) and companion web books for SAS and SPSS.
35. The results section should report goodness-of-fit measures for the models. For example, results tables for categorical dependent variables should report the proportional reduction of error and either the percent of cases in the modal category or the percent of cases correctly predicted. But every post-estimation statistic from the statistical software does not need to be reported; log likelihood is a strong candidate for omission for a study with a single model.
36. The number of observations should be reported for the dataset and for each statistic or result drawn from the dataset. Reasons for substantial decreases in the number of observations should be provided.
37. Substantive results should be presented visually, if possible, and figures should include error bars to reflect uncertainty. See these sources for ideas on the visual presentation of results: Epstein, Martin, and Schneider (2006); Epstein, Martin, and Boyd (2007); Gelman, Pasarica, and Dodhia (2002); and Kastellec and Leoni (2007).
38. The p-value indicates the probability that the observed difference or a larger difference might occur by chance if the null hypothesis of no difference is correct; this p-value can be compared to a preset level, such as 0.05, to determine whether the observed difference is statistically significant. Therefore, it is permissible to report that lower p-values provide stronger evidence against the null hypothesis, but it is incorrect to report that lower p-values indicate coefficients that are highly statistically significant, because statistical significance is a binary concept.
39. Lack of statistical significance indicates that there is not enough evidence to reject the null hypothesis of no effect, so inferences based on the direction of nonstatistically significant coefficients should be offered cautiously or not at all.
40. The behavior of control variables that behave as expected does not need to be discussed, but it might be a good idea to mention, and perhaps attempt to explain, the behavior of control variables that do not behave as expected.
41. Many readers of the manuscript will not remember the substance of the hypotheses, so the manuscript should not report that the results provide evidence for H₂; instead, the manuscript should restate the hypothesis and report whether results provide evidence for the hypothesis.
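The goodness-of-fit measures named in comment 35 can be computed directly from observed and predicted categories. The following Python sketch uses the common definition of proportional reduction of error: the share of errors made by always guessing the modal category that the model eliminates. The data and the function name `fit_summary` are hypothetical.

```python
from collections import Counter


def fit_summary(observed, predicted):
    """Return modal-category share, share correctly predicted, and PRE."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    modal_count = Counter(observed).most_common(1)[0][1]
    correct = sum(o == p for o, p in zip(observed, predicted))
    errors_null = n - modal_count  # errors from always guessing the mode
    errors_model = n - correct     # errors made by the model
    if errors_null == 0:
        raise ValueError("PRE is undefined when every case is in the modal category")
    pre = (errors_null - errors_model) / errors_null
    return modal_count / n, correct / n, pre


# Hypothetical outcomes (1 = veto, 0 = no veto) and model predictions.
observed = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

modal_share, correct_share, pre = fit_summary(observed, predicted)
print(f"modal category: {modal_share:.0%}")
print(f"correctly predicted: {correct_share:.0%}")
print(f"proportional reduction of error: {pre:.0%}")
```

Reporting the modal share alongside the share correctly predicted lets readers see how much of the model's accuracy could have been achieved with no explanatory variables at all.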

CONCLUSION

42. The conclusion should review the main findings, indicate the contribution of the research, and speculate on implications of the findings. Suggesting avenues for future research is less constructive and less important.

FOOTNOTES

43. Each footnote is an interruption, so footnotes should be limited. The most justifiable footnote is a methodological note that provides information required to replicate or assess the research design or results, such as a list of countries included in a sample. The least justifiable footnote is a substantive note that develops an argument from the main text or provides a counter-argument to an argument from the main text; if the content of a substantive footnote is necessary for the reader to understand the argument of the manuscript, then the content of the footnote should be included in the main text; otherwise, the footnote can be eliminated.

REFERENCES

44. Each source of information that is cited in the main text, footnotes, appendices, tables, or figures should appear in the reference list; and the reference list should contain only sources of information that are cited in the main text, footnotes, appendices, tables, or figures.
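One way to check a manuscript against comment 44 is a two-way comparison between the sources cited and the sources listed. The following Python sketch assumes the author has already collected author-year keys for both; the keys and the function name `cross_check` are hypothetical, and treating uncited reference-list entries as a problem is an assumption about the comment's intent.

```python
def cross_check(cited: set, listed: set) -> tuple:
    """Return (cited but not listed, listed but not cited), each sorted."""
    return sorted(cited - listed), sorted(listed - cited)


# Hypothetical author-year keys collected by hand or by a script.
cited_keys = {"Postman 1988", "Bem 2003", "Wagstaff 2010"}
reference_keys = {"Postman 1988", "Bem 2003", "Kirshner 1996"}

missing_from_list, never_cited = cross_check(cited_keys, reference_keys)
print("Cited but not in reference list:", missing_from_list)
print("In reference list but never cited:", never_cited)
```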

APPENDICES

45. Appendices should contain information that is not necessary to understand the flow of the manuscript, but is necessary for replicating or assessing the results, or is otherwise relevant to the manuscript, such as the presentation of robustness check results.

TABLES AND FIGURES

46. The independent variables in a table or figure should be listed in the order that the variables are discussed in the manuscript, with the explanatory variables listed first and controls listed second.
47. Variable names should indicate direction, such as conservatism or liberalism instead of ideology. Indicated directions are not necessary for variables with an inherent or presumed direction, such as education or income.
48. Readers might scan only the manuscript tables or figures, so the notes for each table or figure should provide enough information for the table or figure to be understandable independent of the text.
49. Tables and figures should be limited in number and in size: multiple tables with the same set of independent variables should be combined, and tables should omit redundant statistics, such as t-statistics and p-values if standard errors are already reported.
50. Graphs of phenomena with a meaningful zero point that do not start at zero foster a misperception about relative levels of the variables in relation to each other: for example, 30% appears to be half of 60% only if the axis starts at zero. Such graph axes should therefore start at zero or have a broken axis to indicate that relative lengths are misleading.
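The arithmetic behind comment 50 can be made explicit. The following Python sketch computes the ratio of drawn bar lengths for a given axis starting point. The 30% and 60% values come from the comment; the truncated starting point of 20% and the function name `apparent_ratio` are hypothetical.

```python
def apparent_ratio(smaller: float, larger: float, axis_start: float) -> float:
    """Ratio of drawn bar lengths when the axis begins at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below both plotted values")
    return (smaller - axis_start) / (larger - axis_start)


# Compare an axis that starts at zero with a hypothetical truncated axis.
for start in (0.0, 20.0):
    ratio = apparent_ratio(30.0, 60.0, start)
    print(f"axis starts at {start:.0f}%: 30% bar is {ratio:.2f} the length of the 60% bar")
```

With the axis starting at zero, the shorter bar is half the length of the longer bar; with the axis starting at 20%, it is drawn at one quarter of the length, which overstates the difference.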

IN-TEXT CITATIONS

51. In-text citations should be included for any claim that needs justification that is not justified in the manuscript itself.
52. In-text citations interrupt the flow of a sentence, so sentences should be phrased so that in-text citations appear at the end of the sentence, if possible.
53. Ordering of multiple citations in the same parentheses that is not consistently alphabetical or chronological suggests a lack of attention to the citations and to the manuscript.
54. One to three in-text citations are typically sufficient to justify a claim; long citation lists should be shortened or moved to a footnote to improve manuscript readability.
55. The placement and context of an in-text citation should clearly indicate what the citation refers to and what the citation is being used for. For example, in the sentence

… Some scholars have claimed that prominent research in this area is flawed (e.g., Wagstaff 2010)…

it is unclear if Wagstaff is a scholar who claimed in 2010 that prominent research in this area is flawed, or if Wagstaff (2010) is a prominent publication that some scholars claim is flawed.

MANUSCRIPT STYLE

56. Most political science empirical research manuscripts should be written for an audience of fellow social scientists. Manuscripts should not presume that the reader is familiar with the jargon of a particular subfield or the details of an esoteric methodological technique, but the manuscript should presume that the reader understands basic social science concepts such as control variables and p-values.
57. The manuscript should not contain errors in grammar, spelling, or punctuation; these errors indicate that the writing of the manuscript was not conducted carefully and suggest that the reported research might not have been conducted carefully, either.
58. The manuscript should be consistent in its stylistic choices, such as the number of spaces between sentences and the use of an Oxford comma.
59. Precision in the wording of empirical research manuscripts is more important than spicing up the manuscript with unnecessary variations in wording because multiple words to describe a single thing can create confusion for the reader. For example, a manuscript might use protest and demonstration to describe the same phenomenon, but the variety gained in alternating between protest and demonstration would not be worth the increase in ambiguity for readers who think that demonstration might be used to indicate a subset of protest activity.
60. Multiple direct quotations suggest that the writer has not taken the time or put forth the effort to summarize the quotations, so direct quotations should be omitted unless the precise wording of the quotation is critical to an argument or unless the precise wording is necessary to protect against criticism that the quoted words have been misrepresented.
61. There is no need to introduce an unfamiliar acronym that is not used again or is used again only sparingly: the space that the acronym saves is not worth the extra effort that the reader must exert to remember what the unfamiliar acronym represents.
62. The manuscript should be clearly written to reduce demands on the reader. For example, use of the former and the latter to refer to previous items of contrast often requires a reader to retrace a sentence or paragraph to determine the phenomena to which the former and the latter refer.
63. There is no need to introduce editorial comments, such as describing a cited publication as important or interesting; this is even less necessary when describing research being reported in the manuscript.
64. There is no need to provide a definition that is generally accepted in the literature or the nonacademic world. Terms should be defined only if the term is a specialized term that might not be widely known or if the term is defined differently than is generally accepted.
65. Informing the reader that something will be discussed later is often a clue that the manuscript should be restructured so that the concepts and material are presented in a more logical order.
66. Questions should be rephrased into statements, when possible, because the reader has little recourse to provide an answer. For example, the statement-question

… this manuscript addresses the question: what are the causes of poverty?

can be rephrased as

… this manuscript addresses the causes of poverty.

67. Manuscripts should be formatted to foster readability, which often means that the text should be aligned to the left with a ragged right edge, to avoid the inconsistent gaps of non-hyphenated full-justified text.
68. Subjects and their corresponding verbs are often better placed at the start of a sentence rather than after a lengthy introductory phrase that keeps the reader in limbo about the main idea of the sentence. Consider the alternative:

Rather than after a lengthy introductory phrase that keeps the reader in limbo about the main idea of the sentence, subjects and their corresponding verbs are often better placed at the start of a sentence.

69. Mixture of curved and straight quotation marks and apostrophes suggests that text has been uncritically copied from another location. Therefore, quotation marks in the manuscript should be revised to be consistently curved or straight.
70. Statistics that lack obvious interpretations should be placed in context. For example, if the manuscript reports that a policy is projected to increase a country's oil production by 1 million barrels per day, then the manuscript should indicate the relative effect of this increase.
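The consistency check in comment 69 can be done mechanically before submission. The following Python sketch flags text that contains both straight and curved quotation marks or apostrophes. The function names and sample sentences are hypothetical, and the check does not distinguish apostrophes from other legitimate uses of a straight mark, such as a prime symbol.

```python
STRAIGHT = {'"', "'"}
CURVED = {"\u2018", "\u2019", "\u201c", "\u201d"}


def quote_styles(text: str) -> set:
    """Return which quotation mark styles ('straight', 'curved') appear in text."""
    styles = set()
    if any(ch in STRAIGHT for ch in text):
        styles.add("straight")
    if any(ch in CURVED for ch in text):
        styles.add("curved")
    return styles


def has_mixed_quotes(text: str) -> bool:
    """True when both straight and curved marks occur in the same text."""
    return len(quote_styles(text)) == 2


# Hypothetical manuscript fragments.
consistent = "The \u201cveto\u201d variable is coded from the president\u2019s records."
mixed = "The \u201cveto\u201d variable is coded from the president's records."

print(has_mixed_quotes(consistent))
print(has_mixed_quotes(mixed))
```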

CONCLUSION

Manuscripts consistent with the preceding advice are not guaranteed publication, but the risk of rejection is reduced. Even if a manuscript is intended for an initial review by departmental colleagues, preemptively revising the manuscript to address the preceding comments permits colleagues to focus on improving the substance of the manuscript, thus producing a better manuscript that is more likely to be published and to contribute to our knowledge about political phenomena.

References

Bem, Daryl J. 2003. “Writing the Empirical Journal Article.” In The Compleat Academic: A Practical Guide for the Beginning Social Scientist, 2nd Edition, eds. Darley, J.M., Zanna, M.P., and Roediger, H.L. II, 171–201. Washington, DC: American Psychological Association.

Chen, Xiao, Ender, Philip B., Mitchell, Michael, and Wells, Christine. 2003. “Regression with Stata.” UCLA Academic Technology Services. www.ats.ucla.edu/stat/stata/webbooks/reg/.

Epstein, Lee, Martin, Andrew D., and Boyd, Christina L. 2007. “On the Effective Communication of the Results of Empirical Studies, Part II.” Vanderbilt Law Review 60: 801–46.

Epstein, Lee, Martin, Andrew D., and Schneider, Matthew M. 2006. “On the Effective Communication of the Results of Empirical Studies, Part I.” Vanderbilt Law Review 59: 1811–71.

Gelman, Andrew, Pasarica, Cristian, and Dodhia, Rahul. 2002. “Let's Practice What We Preach: Turning Tables into Graphs.” American Statistician 56 (2): 121–30.

Guberman, Ross. 2010. “Five Ways to Write Like John Roberts.” Legal Writing Pro. http://www.legalwritingpro.com/articles/john-roberts.pdf.

Kastellec, Jonathan P., Lax, Jeffrey R., and Phillips, Justin H. 2010. “Public Opinion and Senate Confirmation of Supreme Court Nominees.” Journal of Politics 72 (3): 767–84.

Kastellec, Jonathan P., and Leoni, Eduardo L. 2007. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics 5 (4): 755–71.

King, Gary, Tomz, Michael, and Wittenberg, Jason. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44 (2): 341–55.

Kirshner, Jonathan. 1996. “Alfred Hitchcock and the Art of Research.” PS: Political Science and Politics 29 (3): 511–13.

Long, J. Scott, and Freese, Jeremy. 2005. Regression Models for Categorical Outcomes Using Stata, 2nd Edition. College Station, TX: Stata Press.

Nyhan, Brendan, McGhee, Eric, Sides, John, Masket, Seth, and Greene, Steven. 2012. “One Vote out of Step? The Effects of Salient Roll Call Votes in the 2010 Election.” American Politics Research. Available online.

Postman, Neil. 1988. “The Educationist as Painkiller.” English Education 20 (1): 7–17.

Tomz, Michael, Wittenberg, Jason, and King, Gary. 2001. CLARIFY: Software for Interpreting and Presenting Statistical Results. Version 2.0. Cambridge, MA: Harvard University, June 1. http://gking.harvard.edu.
