Recently I've been trying to be a little bit braver and make an effort to correct small mistakes like typos or phrasing errors I see when I'm on Wikipedia. This is going okay, although I still feel pretty anxious about anything more substantial than adding some commas and far too anxious to do anything of actual weight. I have been bold enough to remove minor factual inaccuracies, though! Like, deleting a sentence type of minor.
More often I find myself feeling stuck and frustrated because I see something that is obviously wrong or poorly put together and needs a substantial overhaul, but I feel completely unequipped to do it myself because Wikipedia is governed by an extremely elaborate arcane tome of rules and guidelines and formats that I find completely inscrutable, as well as an equally arcane and mysterious etiquette, and I am aware that people in the Wikipedia Editing Subculture tend to be pretty rigid about these rules.
Of particular frustration to me is something I see not only on Wikipedia but on the internet more broadly, from random posts to news articles: people misrepresenting academic papers. Through lying, misunderstanding, inaccurate paraphrasing, uncritical parroting, skimming, whatever. It frustrates me a lot because I know, and I know the misrepresenter knows, that most people aren't going to check.
For example: I remembered a person I'm following on Tumblr made reference to "reparenting" or "self-parenting" (I forget which) as a tool for helping process/recover from childhood trauma, and while the word is relatively self-explanatory I was curious to learn more about what that is, how it works, the particulars and such. So, of course Wikipedia is my go-to, since you can't use search engines for anything.
Here is the Wikipedia article for "Reparenting." It's a bad article. Under-cited, awkward phrasing, vague and confusing. Even the opening paragraph isn't good.
Reparenting is a form of psychotherapy in which the therapist actively assumes the role of a new or surrogate parental figure for the client, in order to treat psychological disturbances caused by defective, even abusive, parenting. The underlying assumption is that all mental illness results principally from such parenting, even including schizophrenia and bipolar disorder.
I mean, what a fucking mess of a paragraph. Reads like it was written by a high schooler. Even setting aside that it is wrong and omits important information, the structure is not good. "Defective" parenting? "Even including" schizophrenia and bipolar disorder?
I've added a multiple issues template to the page since I don't feel comfortable making major changes to articles and this one needs basically a complete rewrite and restructuring.
Anyway, the specific thing I wanted to talk about is under the highly-dubious "efficacy" section.
Lilian M. Wissink conducted a study to determine self-reparenting's effect on self-esteem. Human subjects were divided into two groups, a treatment group that consisted of 10 people, and a control group that consisted of 12 people. The sample group was made up of students and staff from a rural university, and only the treatment group went through self-reparenting treatment. The subjects were given questionnaires to measure their level of self-esteem before and after treatment. The result showed that subjects that received self-reparenting had significantly increased levels of self-esteem while the control group had decreased levels of self-esteem.
Top-level problem one: this presents a single study as evidence of efficacy of this type of therapy. Especially in psychology, this is no good; you really need more than one study. People are complicated, psychological studies are - to be frank - often quite poorly put together, and it's extremely difficult to have good "control groups" for psych studies because there are so many factors involved in being a person. A single study can indicate something, maybe, kinda. But if it hasn't been replicated and there are no other studies indicating the same or a similar thing, it doesn't mean much at all.
Top-level problem two: "Reparenting" as a therapy method is a therapist taking on the role of a parent in a very literal sense; they swaddle you, feed you from a baby bottle, that type of shit. So... how is any type of self-reparenting comparable at all? What could it possibly indicate about the efficacy of reparenting, when it is a fundamentally completely different thing? It has the same name, but it is not actually the same.
The APA Dictionary of Psychology has two very different definitions of reparenting:
1. a controversial therapeutic procedure used to provide a client with missed childhood experiences. The client, who typically has severe problems, is treated as a child or infant; for example, they may be fed with a spoon or bottle, hugged, sung to, and provided with other forms of nurturance. Reparenting has been unethically used to justify recreation of the birth process by wrapping a client in a blanket and having them struggle to get out.
2. in self-help and some forms of counseling, a therapeutic technique in which individuals are urged to provide for themselves the kind of parenting attitudes or actions that their own parents did not provide.
The Wikipedia article about reparenting does not mention this. This study is about the second type of reparenting, which has different assumptions, different frameworks, and different methods.
This is a bit like citing a study about the therapeutic benefits of cock and ball torture as evidence for the efficacy of cognitive behavioural therapy because they're both called CBT, or research about migraine aura as scientific evidence of the existence of psychic auras. It's disingenuous and incorrect.
Okay. Now let's get into the specifics.
Lilian M. Wissink conducted a study to determine self-reparenting's effect on self-esteem.
Bzzzzt. You do not conduct a psychological study to determine something. You conduct it to test, investigate, research, explore. This kind of overconfident language suggests intentional misrepresentation or fundamental misunderstanding of what a single psych study is capable of meaning. Especially when it's about a study this small.
Human subjects were divided into two groups, a treatment group that consisted of 10 people, and a control group that consisted of 12 people.
First: it is funny of them to specify "human subjects" like this. Anyway. Here's something I don't like: the paper contradicts itself about the number of participants in the study. In the subjects section it says:
The treatment group consisted of ten participants, including eight females and two males ranging in age from 18 to approximately 50 years. The control group, who participated in the program in August, consisted of eight females and four males of similar age. They all completed the program.
But then in the results section, it says:
Ten out of twelve participants in the treatment group completed the six-week program and the questionnaires. Four participants at different times missed one session because of personal reasons. Six participants attended each session. Although 14 people in the control group completed and returned the questionnaires, two questionnaires were not used because they were incomplete.
So, no, they did not "all complete the program." There weren't actually 10 and 12 participants, there were 12 and 14. Two people didn't finish the treatment program, two control questionnaires were thrown out as incomplete, and all four were retroactively treated as if they had never been part of the study. This is bad form and misleading. This matters: the efficacy of a treatment has to take into account people who cannot complete that treatment.
Say you have a medication. You give this medicine to 20 people for a 10-week study. The side effects of this medication are so bad that 15 people stop taking it; it makes their symptoms worse, they have to go to the hospital, it's horrible. Only five people actually complete the whole 10 weeks. Those five people who finished didn't experience any side effects and had a great time, though!
Should you say, then, that your study showed 100% of people love this medication and 0% of participants experienced any negative side effects? No, because that would be a big fat lie. You might tell it anyway if you want to sell your medicine, though.
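To put rough numbers on that, here is a minimal Python sketch using only the made-up figures from the medication example above (20 enrolled, 15 dropouts). The function name and the assumption that every dropout counts as having had side effects are mine, purely for illustration; none of this comes from the Wissink paper.

```python
def side_effect_rates(enrolled, dropped_out_from_side_effects):
    """Return (completers_only_rate, everyone_enrolled_rate) as percentages.

    Assumes, as in the made-up example, that every dropout left because of
    side effects and that no completer had any.
    """
    completed = enrolled - dropped_out_from_side_effects

    # Completers-only view: dropouts are quietly treated as never having
    # been in the study, so nobody left in the sample had side effects.
    completers_only = 100 * 0 / completed

    # Everyone-enrolled view: dropouts stay in the denominator and count.
    everyone_enrolled = 100 * dropped_out_from_side_effects / enrolled

    return completers_only, everyone_enrolled


completers_only, everyone_enrolled = side_effect_rates(20, 15)
print(f"Completers only:   {completers_only:.0f}% had side effects")
print(f"Everyone enrolled: {everyone_enrolled:.0f}% had side effects")
```

Same study, same people, and the headline number swings from 0% to 75% depending on who you decide counted as a participant.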
It is also important to clarify here that the subjects were not "divided into" two groups; this phrasing implies a more typical study structure with randomized groupings divided by the people running the study. This is not how this study was organized. Per the paper:
Subjects were given a choice of two programs: one in May (treatment group) and one in August (control group). Unfortunately, the practicalities of the research precluded the possibility of using randomized allocation.
It should also be noted that the two groups were not equivalent to each other or representative samples of the population, nor is there any information about whether or how many participants in the study actually had adverse or lacking childhood experiences - the thing this treatment is meant to actually treat.
We're also lacking in other demographic information about the subjects, which matters! This is a common mistake in psych studies and it drives me crazy. There's a lot of factors that might influence how somebody experiences a psychological study. How many are psych majors? How many have experience with reparenting? How many are white? Disabled? Middle-class? Were any applicants rejected? Why or why not?
If the subjects aren't distributed equally or randomly, you could very well end up with a control group of people who are all living in poverty and a treatment group of all rich people, for example, or a treatment group of all white people and a control group that's all Black, or any number of other things that could completely invalidate the study.
The sample group was made up of students and staff from a rural university, and only the treatment group went through self-reparenting treatment.
Unnecessary to include on Wikipedia. Obviously only the treatment group went through the treatment. That's how studies work.
The subjects were given questionnaires to measure their level of self-esteem before and after treatment.
And this is an inadequate description of the actual process of the study! How it worked is that the treatment group met weekly for six consecutive sessions of two-and-a-half hours each, and the control group did not meet at all. The control group only filled out the pre-treatment and post-treatment self-esteem questionnaires.
When you don't provide the control group with weekly sessions, you aren't studying the effect of this specific treatment method. You haven't controlled for placebo effect, the effect of simply having a weekly scheduled activity, or the effect of having a supportive group of people you can rely on!
The result showed that subjects that received self-reparenting had significantly increased levels of self-esteem while the control group had decreased levels of self-esteem.
First off, you mean "results." Secondly, the control group did not show decreased levels of self-esteem. The study says "no changes occurred in the control group" and "the control group perceived that their level of self-esteem stayed the same." This is simply a mistake.
Thirdly, the word "significant" is a pretty big sticking point in science communication, I think. Because in most circumstances, the word significant means "large and/or important." If I said I was paid a significant amount of money, that means I got paid a lot of money. In statistics, however, "significant" means - more or less - "not nothing." It is not within the margin of error; the result is unlikely to be attributable to random happenstance alone. It does not mean "large."
This is a way that many people get away with laundering misinformation without technically lying!
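A rough illustration of the gap between "statistically significant" and "large," as a Python sketch. Everything in it is hypothetical: the group means, the standard deviation, and the group size are invented numbers, not anything from the Wissink study, and it uses a simple two-sided z-test that assumes the standard deviation is known and equal in both groups.

```python
from math import sqrt
from statistics import NormalDist


def two_group_z_test(mean_a, mean_b, sd, n_per_group):
    """Two-sided z-test for a difference in means.

    Assumes both groups share a known standard deviation `sd` and have
    the same size. Returns (z, p_value, cohens_d).
    """
    standard_error = sd * sqrt(2 / n_per_group)
    z = (mean_b - mean_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Cohen's d: the difference measured in standard deviations,
    # a common way of asking "okay, but how big is it?"
    cohens_d = (mean_b - mean_a) / sd
    return z, p_value, cohens_d


# Hypothetical: a 0.3-point difference on a scale where people typically
# vary by 10 points, measured in two groups of 20,000 people each.
z, p, d = two_group_z_test(mean_a=50.0, mean_b=50.3, sd=10.0, n_per_group=20_000)
print(f"z = {z:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```

With these invented numbers the p-value lands well under the usual 0.05 cutoff, so the difference is "significant," while the effect itself is three hundredths of a standard deviation. Both statements are true at once, which is exactly the problem.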
There are some writing problems with the results section of the paper as well. For example, it refers to the treatment group and control group as "groups" but also refers to the pre-study and post-study questionnaire results as "groups," making it difficult to parse what they're trying to say.
Another research red flag: when acknowledging drawbacks and limitations of their study, the author is defensive and dismissive. Almost every limitation is accompanied by a "HOWEVER," which insists that this drawback is totally not actually a problem.
For example, it could be argued that the change in the treatment group may have occurred merely because participants were in a group that met for six weeks. Thus, a more rigorous comparison could have involved the control group meeting for a similar period of time in the absence of any therapeutic interventions (Drew & Hardman, 1985). However, ethically and practically this would not have been appropriate.
They do not elaborate on this. Why would it not have been appropriate? What are the ethical drawbacks of actually doing the study? If you do not have the practical resources to actually have a proper control group, why do the study anyway?
Additionally, it cannot be ruled out that posttest scores were higher because of demand characteristics. That is, participants may have answered the questionnaires more favorably because of loyalty to the therapist, knowing that the aim of the program was to increase self-esteem. However, attempts were made to control for this by maintaining confidentiality and by having participants fill in the post-questionnaires in the privacy of their own homes rather than with other group participants at the last group meeting.
"Loyalty to the therapist" is not the only reason that people would be inclined to answer more favorably on the questionairre. The reason that you have to be very careful and skeptical about surveys and questionairres is because the way people respond is extremely sensitive to what they think you want from them, what they think the survey is for, and how you phrase and contextualize the questions.
Self-assessment is also a very tricky research tool, because people aren't always good at assessing themselves and how they assess themselves is highly variable depending on their mood and other circumstantial factors like embarrassment, social taboos, et cetera. People are also not very good at being consciously aware of what factors are influencing their mood or thinking.
For self-esteem, consider how you might rate your self-esteem from 1-10 in the following scenarios:
- It is 6:30AM, you are hungover, and you are going to be late for your 10-hour shift at McDonald's.
- It is 1:00PM, and you are sitting on a bench in the sunshine and enjoying a pastry you baked for yourself.
- It is 9:00PM, and your lover is hugging you from behind and telling you they've never been more in love with you.
- It is 2:45AM, and you have just successfully defused a fight that was getting ugly at your favorite nightclub. The employees working all thank you profusely. One of them calls you a hero. They offer you free drinks for the rest of the week, whatever you want is on the house.
None of these numbers would be a lie. They just also would not necessarily be indicative of your general sense of self-esteem.
We don't know in what environment or on what timeline the participants filled out their questionnaires, since they went home with them. Did they read them carefully? Did they rush through them? Were they comfortable and happy, or tired and irritable? It is a mystery.
Another potential problem is that there may have been certain demand cues in the advertisement and in the information given to participants prior to signing up for the program. However, the main objective was to give potential participants information for ethical reasons.
Balancing informed consent with not biasing your study participants is tough. This is a balance that every single psych study has to strike, and is not unique to this study. I do think that this study probably did it very poorly, since it seems like they did everything else poorly and since they don't actually offer a genuine "however" that might mitigate the effect.
I want to emphasize again that the sample size here is very small. They started from a small sample size, and out of the treatment group only six participants actually went to every group session. That is not a lot of people! That is, in fact, so few people that it is not especially useful. With a sample size that small your study might be interesting or suggestive, but it does not actually rise to the level of "evidence."
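For a sense of scale, here is a back-of-the-envelope Python sketch of how fuzzy a group average is at different sample sizes. It is a simplification and not a reanalysis of the paper: it uses the normal approximation for a 95% margin of error, which actually understates the uncertainty for groups this tiny, and it expresses everything in units of the standard deviation because I am not assuming anything about the real questionnaire scores.

```python
from math import sqrt
from statistics import NormalDist


def approx_margin_of_error(n, sd=1.0, confidence=0.95):
    """Approximate margin of error for a group mean.

    Uses the normal approximation (z * sd / sqrt(n)). For very small n
    the true margin is wider than this, so treat it as a best case.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sd / sqrt(n)


# 6, 10 and 12 mirror the group sizes discussed above; 100 and 1000 are
# just there for comparison.
for n in (6, 10, 12, 100, 1000):
    margin = approx_margin_of_error(n)
    print(f"n = {n:>4}: mean known to within about +/- {margin:.2f} standard deviations")
```

Even under this generous approximation, a group of six leaves the average uncertain by most of a standard deviation in either direction, which is a lot of room for a result to be noise.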
Even if you take it as evidence, it's evidence in support of a positive effect of having weekly group therapy, not reparenting, because they didn't actually have a control group.
And even if you take it as evidence of the efficacy of reparenting, it's evidence of the APA's second definition of reparenting, which is not the form of reparenting the Wikipedia article claims to be about.
Psychological studies deserve rigorous critique and don't often receive it. Psychology is a soft science riddled with bad studies based on bad studies based on bad ideas. Garbage in, garbage out. It can be very interesting - even bad studies are interesting to read and do say something about something - but it's also really frustrating. It's frustrating both how often people push bad science and how often people accept bad science just because it was published in a journal or done by scientists. Psychology is one of the fields most affected by the replicability crisis as a result of this proliferation of bad science.
This study and its deployment on Wikipedia is a great example of junk science that gets away with being junk because it's a citation in a peer-reviewed journal. It's a reputable source, so it gets to be on Wikipedia. How many people read this thing on Wikipedia and assumed, well, it must be true, it's on Wikipedia and it has a citation? How many content mills, articles, and forum posters repeated these claims without sourcing them? Over time, "one study said" becomes "studies found" which becomes "scientists agree" which becomes a general consensus, common sense, something "everybody knows." Gyaaaaaarrghhhh!!!!!!
Off the top of my head I can think of four other instances of a similar thing going around - academic writing being misrepresented and that misrepresentation becoming "knowledge" - that I've considered whinging about as well... maybe I'll do that. I've also considered redlining some particularly dreadful pieces published in the OTW's fan studies journal, because they publish a lot of trash that has clearly not been properly proofread. I miss being in school and having an outlet for discussing things like this in a way that felt worthwhile... sighs.