Free Advice about the Subject Pool

Around here, the Fall semester starts in just a few weeks. This means the MSU subject pool will soon be teeming with “volunteers” eager to earn their research participation credits. Like many of my colleagues, I have often wondered about the pros and cons of relying so heavily on college sophomores in the laboratory (e.g., Sears, 1986, 2008). Regardless of your take on these issues, it is hard to imagine that subject pools will go away in the near future. Thus, I think it is important to try to learn more about the characteristics of participants in these subject pools and to think more carefully about issues that may impact the generalizability of these types of studies. I still think college student subject pools generate convenience samples even if a certain researcher disagrees.

I did a paper with my former graduate student Edward Witt and our undergraduate assistant (Matthew Orlando) about differences in the characteristics of subject pool members who chose to participate at different points in the semester (Witt, Donnellan, & Orlando, 2011). We also tested for selection effects in the chosen mode of participation by offering an online and in-person version of the same study (participants were only allowed to participate through one mode).  We conducted that study in the Spring of 2010 with a total sample size of 512 participants.

In the original report, we found evidence that more extraverted students selected the in-person version of the study (as opposed to the online version) and that average levels of Conscientiousness were lower at the end of the semester compared to the beginning. In other words, individuals with relatively lower scores on this personality attribute were more likely to show up at the end of term. We also found that we had a greater proportion of men at the end of the term compared to the start. To be clear, the effect sizes were small and some might even say trivial. Nonetheless, our results suggested to us that participants at the start of the semester are likely to be different than participants at the end of the term in some ways. This result is probably unsurprising to anyone who has taught a college course and/or collected data from a student sample (sometimes naïve theories are credible).

We repeated the study in the Fall semester of 2010 but never bothered to publish the results (Max. N with usable data = 594). (We try to replicate our results when we can.) It is reassuring to note that the major results were replicated in the sense of obtaining similar effect size estimates and levels of statistical significance. We used the same personality measure (John Johnson’s 120-item IPIP approximation of the NEO PI-R) and the same design. Individuals who self-selected into the online version of the study were less extraverted than those who selected into the in-person version (d = -.18, t = 2.072, df = 592, p = .039; Witt et al., 2011: d = -.26).   This effect held controlling for the week of the semester and gender. Likewise, we had a greater proportion of men at the end of the term compared to the start (e.g., roughly 15% of the participants were men in September versus 43% in December).
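For readers who want to see how a t statistic and a d value like the ones above relate, here is a minimal sketch. It uses the standard conversion for two independent groups with a pooled standard deviation, d = t * sqrt(1/n1 + 1/n2). The post does not report the group sizes, so the even 297/297 split below is purely an assumption for illustration (it sums to 594, matching df = 592), and the function name `d_from_t` is made up for this example.

```python
import math


def d_from_t(t, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d (pooled SD)."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


# ASSUMPTION: equal group sizes; the actual online/in-person split is not given.
d_equal_split = d_from_t(2.072, 297, 297)
print(round(d_equal_split, 3))
```

Under the equal-split assumption the magnitude comes out near .17. For a fixed total N, an unequal split yields a somewhat larger d for the same t, so the reported magnitude of .18 is in the expected range. The sign depends only on which group is subtracted from which.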

The more interesting result (to me) was that average levels of Conscientiousness were also lower at the end of the semester than at the beginning (standardized regression coefficient for week = -.12, p = .005; model also includes gender). Again, the effect sizes were small and some might say trivial. However, a different way to understand this effect is to standardize Conscientiousness within gender (women self-report higher scores) and then plot average scores by week of data collection.

The average for the first two weeks of data collection (September of 2010) was .29 (SD = 1.04) whereas the average for the last three weeks (December of 2010) was -.18 (SD = 1.00). That is a gap of about .47 in within-gender standard deviation units. Viewed in this light, the difference between the beginning of the semester and the end of the semester starts to look a bit more substantial.
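The within-gender standardization described above can be sketched roughly as follows. This is not the analysis code from the study: the field names (`week`, `gender`, `score`), the helper names, and the six data points are all invented for illustration. The idea is simply to convert each score to a z-score using the mean and SD of that participant's own gender group, then average the z-scores by week.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize_within_gender(rows):
    """Return copies of rows with a 'z' key: score z-scored within gender."""
    scores_by_gender = defaultdict(list)
    for row in rows:
        scores_by_gender[row["gender"]].append(row["score"])
    # Sample SD (n - 1); stdev() needs at least two scores per gender group,
    # and the scores in a group must not all be identical (SD of zero).
    stats = {g: (mean(v), stdev(v)) for g, v in scores_by_gender.items()}
    out = []
    for row in rows:
        group_mean, group_sd = stats[row["gender"]]
        out.append({**row, "z": (row["score"] - group_mean) / group_sd})
    return out


def mean_z_by_week(rows):
    """Average the within-gender z-scores for each week of data collection."""
    z_by_week = defaultdict(list)
    for row in rows:
        z_by_week[row["week"]].append(row["z"])
    return {week: mean(zs) for week, zs in sorted(z_by_week.items())}


# Made-up toy data (NOT the study data): one woman and one man per week.
toy = [
    {"week": 1, "gender": "F", "score": 4.0},
    {"week": 1, "gender": "M", "score": 3.6},
    {"week": 8, "gender": "F", "score": 3.7},
    {"week": 8, "gender": "M", "score": 3.3},
    {"week": 15, "gender": "F", "score": 3.4},
    {"week": 15, "gender": "M", "score": 3.0},
]
weekly = mean_z_by_week(standardize_within_gender(toy))
for week, z in weekly.items():
    print(week, round(z, 2))
```

Standardizing within gender first means that a shift in the proportion of men across the semester cannot by itself produce a change in the weekly averages, which is why the plot is a useful complement to the regression that controls for gender.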

So here is my free advice: If you want more conscientious participants, be ready to run early in the term. If you want to have an easier time recruiting men, wait till the end of the term. (Controlling for C does not wipe out the gender effect.)

I would post the data but I am going to push Ed to write this up. We have a few other interesting variables that tried to pick up on careless responding that we need to think through.

Note: Edward Witt helped me prepare this entry.


Author: mbdonnellan

Professor, Social and Personality Psychology, Texas A&M University

5 thoughts on “Free Advice about the Subject Pool”

  1. Nice post, Brent. I think findings like this underscore the importance of using “true” random assignment to conditions among those who use subject pools for experimental and quasi-experimental research designs. I’m aware of investigators who either “tack” new conditions onto a study after the study has begun (i.e., later in the semester) and/or who try to fill out missing cells in their designs by recruiting at the end of the semester. These kinds of practices have the potential to create confounds that people might not fully appreciate.

  2. Yes, I think it’s important to try and continue studies using undergrad student samples throughout each semester, throughout each year if possible. This was much harder in Toronto, where very tight caps were placed on each study run and once you “use up” your participants, you are closed to further recruitment. Not only did this create a lot of problems with studies that needed larger numbers of participants (well, don’t all studies need larger numbers of participants?), but it undoubtedly had adverse effects on the generalizability of findings across the population. Based on your findings, it almost seems like this should become standard reporting in articles – how issues of caps, etc. might have affected recruitment and those semesters/times within them that participants were recruited for the study.

    I am interested in what the argument of “a certain researcher” is – is there push-back on the idea that college students are convenience samples, or what?

  3. Twenge et al. (2008, Journal of Personality): “This is a term [convenience sampling] with an inexact definition because it is used differently across fields and is a matter of degree—perfectly random samples of people are virtually nonexistent. In psychology, it is most often applied to shopping mall surveys with low response rates, or to samples of one’s friends, and not to samples of college students from subject pools.” (p. 923)

    Twenge (2008, Social and Personality Psychology Compass): “The term ‘convenience sample’ it is most often applied to shopping mall surveys with low response rates or to samples of one’s friends, and not to samples of college students from subject pools like those used in cross-temporal meta-analysis.”

    Kali and I wrote this in one of our replies: “we note that it is standard practice to describe college student samples as convenience samples in methodological discussions (see e.g., Maxwell & Delaney, 2004, p. 49; Rosenthal & Rosnow, 2008, p. 214; Schwarz, Groves, & Schuman, 1998). Consider this quotation from a recent methods text designed for undergraduates, ‘the available pool of college research participants is usually called a convenience sample by social psychologists [italics in original]’ (Dunn, 2009, p. 92). The point is that most college student studies conducted by social and personality psychologists were not designed to make generalizations to a defined population of young people (e.g., all college students or otherwise). It is a fundamental principle of research design that probability sampling methods are required when researchers want to make valid generalizations from a sample to a population (see e.g., Pedhazur & Schmelkin, 1991, p. 321).”

  4. Sometimes misguided IRB policies make these kinds of self-selection effects even worse. At Harvard (and probably a lot of other universities), researchers are required to write up short descriptions of the study so that students can choose which studies to participate in. My colleague Jennifer Freyd wrote an editorial for the Journal of Trauma & Dissociation where she reviewed a bunch of research (including your paper) on how selection effects can operate within subject pools:

  5. Back in the day (1987 – 1992) when I was there, Jerry Wiggins typically ran a big ongoing self-report battery for students that took 2 sessions. Consistently, the best predictor of deadbeat status (failing to return for session 2) was low Agreeableness. Interestingly, Conscientiousness was not related to attending the second session.
