A Philosopher's Blog

Twitter Mining

Posted in Ethics, Philosophy, Technology by Michael LaBossiere on July 11, 2014

In February 2014, Twitter made all its tweets available to researchers. As might be suspected, this massive body of data is a potential treasure trove for researchers. While one might picture researchers going through the tweets for the obvious content (such as what people eat and drink), this data can be mined in some potentially surprising ways. For example, the spread of infectious diseases can be tracked via an analysis of tweets. This sort of data mining is not new: some years ago I wrote an essay on the ethics of mining data and used, as an example, Target’s analysis of data to determine when customers were pregnant (so as to send targeted ads). What is new is that all the tweets are now available to researchers, thus providing a vast heap of data (and probably a lot of crap).

As might be imagined, there are some ethical concerns about the use of this data. Some might suspect that this creates a brave new world for ethics, but this is not the case. While the availability of all the tweets is new and the scale is certainly large, this scenario is old hat for ethics. First, tweets are public communications that are on par morally with yelling statements in public places, posting statements on physical bulletin boards, putting an announcement in the paper and so on. While the tweets are electronic, this is not a morally relevant distinction. As such, a researcher delving into the tweets is morally the same as a researcher looking at a bulletin board for data or spending time in public places to see the number of people who go to a specific store.

Second, tweets can (often) be linked to a specific person and this raises the stock concern about identifying specific people in the research. An example would be identifying Jane Doe as being likely to have an STD based on an analysis of her tweets. While Twitter provides another context in which this can occur, identifying specific people in research without their consent seems to be well established as being wrong. For example, while a researcher has every right to count the number of people going to a strip club via public spaces, to publish a list of the specific individuals visiting the club in her research would be morally dubious, at best. As another example, a researcher has every right to count the number of runners observed in public spaces. However, to publish their names without their consent in her research would also be morally dubious at best. Engaging in speculation about why they run and linking that to specific people would be even worse (“based on the algorithm used to analyze the running patterns, Jane Doe is using her running to cover up her affair with John Roe”).

One counter is, of course, that anyone with access to the data and the right sorts of algorithms could find out this information for herself. This would simply be an extension of the oldest method of research: making inferences from sensory data. In this case the data would be massive and the inferences would be handled by computers, but the basic method is the same. Presumably people do not have a privacy right against inferences based on publicly available data (a subject I have written about before). Speculation would presumably not violate privacy rights, but could enter into the realm of slander, which is distinct from a privacy matter.

However, such inferences would seem to fall under privacy rights with regard to the professional ethics governing researchers. That is, researchers should not identify specific people without their consent, whether they are making inferences or not. To use an analogy, if I infer that Jane Doe and John Roe’s public running patterns indicate they are having an affair, I have not violated their right to privacy (assuming this also covers affairs). However, if I were engaged in running research and published this in a journal article without their permission, then I would presumably be acting in violation of research ethics.

The obvious counter is that as long as a researcher is not engaged in slander (that is, intentionally saying untrue things that harm a person), then there would be little grounds for moral condemnation. After all, as long as the data was publicly gathered and the link between the data and the specific person is also in the public realm, then nothing wrong has been done. To use an analogy, if someone is in a public park wearing a nametag and engages in specific behavior, then it seems morally acceptable to report that. This would be similar to the ethics governing journalism: public behavior by identified individuals is fair game. Inferences are also fair game, provided that they do not constitute slander.

In closing, while Twitter has given researchers a new pile of data, the company has not created any new moral territory.


5 Responses


  1. Ian James said, on August 8, 2014 at 3:02 pm

    …while a researcher has every right to count the number of people going to a strip club via public spaces, to publish a list of the specific individuals visiting the club in her research would be morally dubious—at best.

    Why so? The only difference would be that some people – those on the street at the time – would have direct sensory evidence of who entered the club while those reading the research would acquire that same evidence indirectly. If the method of data capture is accurate enough then the data in both instances are the same.

    The question of morality, being a matter of opinion and invariably flawed to some extent, resides not with the data but with the act of entering a strip club. Would the psychological force of knowing that a greater number of people are potentially aware of the act deter any particular individual from entering? If so, were all information accessible to all the people all the time, would it lead to a less ‘morally dubious’ society?

    The best one can do is not to have any prejudices or preconceived ideas or principles – oh, moral principles, fixed codes of conduct, “what must be done” and “what must not be done,” and preconceived ideas with regard to morals, with regard to progress, and then all the social and mental conventions – there’s no obstacle worse than that.
    ~ The Mother, The Mother’s Agenda.

    • Michael LaBossiere said, on August 8, 2014 at 3:25 pm

      That is a good question.

      As you correctly point out, the researchers would simply be presenting information from a public space. As such, there would be no violation of privacy, and publishing the list could not be immoral on those grounds.

      So, the ethics of the matter would need to hinge on research ethics. That is, whether or not it would be moral for a professional researcher to publish such a list in a professional context (like a journal article). My leaning is that publishing such a list would generally be against professional ethics, mainly because it would seem to be gossip or shaming that would lack scientific merit. If the list did have relevance to the research, then it could be warranted.

      That said, my inclination could be in error or perhaps fueled more by etiquette or professionalism than by ethics. That is, publishing such a list in a professional journal would be impolite and unprofessional (but not unethical). To use an analogy, using sexy pictures of strippers to “punch up” an academic journal article would not be immoral, but would presumably be unprofessional (but would probably help with selling issues).

    • T. J. Babson said, on August 8, 2014 at 11:38 pm

      How would you determine *who* entered the strip club? Facial recognition? Would you be liable for damages if you misidentified someone?

      • Michael LaBossiere said, on August 12, 2014 at 4:39 pm

        There would be many options. For example, a researcher could take photos then send her grad students to scour Facebook. Or, as you suggested, facial recognition software (perhaps courtesy of Facebook). I would suppose that a person could be subject to a lawsuit for listing someone incorrectly; this might fall under a libel or slander suit.

      • Ian James said, on August 13, 2014 at 9:29 am

        The Devil’s in the detail as usual…
        But Devils make excellent Griefers who work in mysterious ways, His wonders to perform.
