It all started with a tweet.
“Fresh wave of bad reviews for yankee candles,” wrote @drewtoothpaste in December 2021, in a post that included screenshots of Yankee Candle reviews on Amazon.
“This candle had no scent when lit. Very disappointed!” read one one-star review.
“No scent very disappointed and embarrassed as this was a gift,” read another.
This wasn’t a new discovery: Over the past year, other Twitter users had noticed reviews on Amazon claiming that Yankee Candles had “no smell,” and wondered if there was a connection with COVID-19, which can cause anosmia, or loss of smell.
Based on the December 2021 tweet, the trend had continued. And when Nicholas Beauchamp, assistant professor of political science at Northeastern, noticed it, he couldn’t help himself.
“It wouldn’t be that much harder to do this properly,” he decided.
Beauchamp took the Twitter joke and turned it into a full paper—presented at this week’s “International AAAI Conference on Web and Social Media”—that examines the clear link between the “no smell” reviews and upticks in COVID-19 cases.
The work follows a rising trend of researchers using online clues known as “breadcrumbs” (such as Google searches for restaurants that deliver chicken noodle soup) to help predict the next surge in COVID-19 cases. In theory, following trends like this could give us information that other data, like the number of hospitalizations in a given period, cannot.
An analysis of Yankee Candle reviews was one way to put this idea into practice. After the first tweet went viral, Harvard University’s Katie Petrova discovered that reviews of scented candles had dropped nearly one full star in 2020, and she noticed a sharp rise in reviews mentioning “no smell.”
Beauchamp also noticed the uptick when he charted the reviews over the past four years. But there was a problem: the surges were happening in the winter months, when COVID-19 cases, common colds and candle purchases all tend to rise naturally. Could he prove that the rise in “no smell” reviews was due to COVID-19 and was not just a seasonal coincidence?
For a data analyst like Beauchamp, it was small potatoes. He downloaded a Chrome extension and let it run while he watched TV, grabbing 9,837 Amazon reviews of the top four Yankee Candles across 172 weeks between 2018 and 2021. He then calculated the percentage of reviews each week that mentioned “no smell” or “no scent,” and plotted that against COVID-19 cases over the same period. Finally, he controlled for the seasonal upticks of both candle purchases and illness.
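The weekly percentage described above can be sketched in a few lines of Python. This is only an illustration, not Beauchamp's actual code: the function name, the matching pattern and the review dates are assumptions made for the example, and only the first and third review texts are quotes from the article.

```python
import re
from collections import defaultdict
from datetime import date

# Assumed pattern: matches "no smell" or "no scent", ignoring case.
NO_SMELL = re.compile(r"\bno\s+(?:smell|scent)\b", re.IGNORECASE)

def weekly_no_smell_share(reviews):
    """Map (ISO year, ISO week) to the percent of reviews mentioning no smell.

    `reviews` is an iterable of (date, text) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in reviews:
        iso = day.isocalendar()
        week = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[week] += 1
        if NO_SMELL.search(text):
            hits[week] += 1
    return {week: 100.0 * hits[week] / totals[week] for week in sorted(totals)}

# Illustrative sample; the dates are invented for this sketch.
sample = [
    (date(2021, 12, 13), "This candle had no scent when lit. Very disappointed!"),
    (date(2021, 12, 14), "Lovely fragrance, burns evenly."),
    (date(2021, 12, 15), "No scent very disappointed and embarrassed as this was a gift"),
    (date(2021, 12, 16), "Smells great."),
    (date(2021, 12, 21), "Nice candle."),
]
shares = weekly_no_smell_share(sample)
for week, pct in shares.items():
    print(week, f"{pct:.1f}%")
```

A simple phrase match like this would miss variants such as “can’t smell anything,” so a real analysis would likely need a broader pattern.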
Beauchamp found that the correlation was more than a seasonal coincidence: after controlling for seasonality, COVID-19 cases still predicted more “no smell” reviews. For every 100,000 new COVID-19 cases per week, he found, the share of “no smell” reviews rose by a quarter of a percentage point the following week.
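To get a feel for the size of that effect, the reported figure can be applied to a hypothetical week. The sketch below assumes the relationship is linear, as the quarter-point-per-100,000 figure suggests; the function name and the 500,000-case input are made up for illustration.

```python
# Reported effect: +0.25 percentage points per 100,000 new weekly cases.
POINTS_PER_100K_CASES = 0.25

def expected_review_increase(weekly_cases):
    """Expected rise, in percentage points, in next week's share of
    "no smell" reviews, assuming a purely linear effect."""
    return POINTS_PER_100K_CASES * weekly_cases / 100_000

# Hypothetical week with 500,000 new cases.
print(expected_review_increase(500_000))
```

Under that assumption, a week with 500,000 new cases would correspond to a rise of 1.25 percentage points in “no smell” reviews the following week.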
At first, COVID-19 cases preceded the “no smell” reviews, not the other way around, which made the information less valuable for public health purposes. When Beauchamp added the most recent six months of data, however, the relationship reversed.
“When I added those next six months, which includes the Omicron wave, it is now predictive in the sense that, in theory, the reviews give us a slight heads up,” he says. He corroborated the results by conducting the same analysis with perfume reviews.
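Whether cases lead reviews or reviews lead cases can be checked by correlating the two weekly series at different offsets. The sketch below is one simple way to do that, not the method used in the paper: the function names and the two series are invented for illustration, and it does not include the seasonal controls described earlier.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def lagged_correlation(cases, reviews, lag):
    """Correlate cases[t] with reviews[t + lag].

    lag > 0 means cases come first; lag < 0 means reviews come first.
    """
    if lag >= 0:
        a, b = cases[:len(cases) - lag], reviews[lag:]
    else:
        a, b = cases[-lag:], reviews[:len(reviews) + lag]
    return pearson(a, b)

# Synthetic weekly series in which reviews trail cases by one week.
cases = [1, 2, 4, 8, 5, 3, 2, 1]
reviews = [0, 1, 2, 4, 8, 5, 3, 2]

# Find the offset (in weeks) with the strongest correlation.
best_lag = max(range(-2, 3), key=lambda k: lagged_correlation(cases, reviews, k))
print(best_lag)
```

In this made-up data the strongest correlation is at a lag of one week with cases first, the pattern Beauchamp saw initially. A best lag below zero would mean the reviews come first, which is the “slight heads up” he describes.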
Beauchamp had verified a potential new source for public health information. Ironically, though, the discovery may also have been bad for the data.
People who leave the “no smell” reviews may not know that they are ill, which can make the data quite useful. But when the tweets about the phenomenon went viral, reviewers became more self-aware. Beauchamp wrote in his paper that the data then became “contaminated with self-aware disavowals of Covid,” though this did not seem to last long.
“Virality is so ephemeral,” he says, “which is good for science purposes.”
Despite the feat, Beauchamp is self-deprecating when he talks about the project. “It’s not revolutionizing the study of either COVID or smell or social media,” he says.
But it could be a welcome addition to the data analyses that have been helping us understand COVID-19 trends. Loss of smell is a distinctive indicator of the disease, but smell isn’t talked about much online, he says, except in small corners of the internet like candle or perfume reviews.
In addition, the lack of awareness among reviewers means the data “make for a particularly clean signal unconfounded by expectations,” according to the paper, helping us track the trend of the virus before people even know they have it.
Beauchamp might be more interested in the humor of it. Maybe, he says, the joke tweet can even tell us something about class, and the relationship between well-educated liberals on Twitter and unaware candle reviewers.
“It starts with a viral tweet, and it ends with a punchline,” he says. “Mostly, I consider the paper to be an extended joke tweet.”
Written by Jessica Taylor Price.