The Ethics of Data Science
Foundations
This course is also part of the core sequence in the Data Science program. In that context,
the course provides an introduction to social, ethical, and moral issues surrounding
data and society. It blends social and historical perspectives on data with ethics, policy, and
case examples to help students develop a workable understanding of current ethical and policy
issues in data science. The course examines the ethics and morality of studying human
subjects, documenting workflows, and communicating results. We will critically assess issues
surrounding privacy, surveillance, discrimination, transparency, responsibility, and trust
throughout the data lifecycle - from collection and creation, to storage and analysis, to the
application and sharing of data.
2. Analyze the data-driven decisions of political and media elites in the American political
context with an eye toward their consequences on the attitudes and behavior of the
American public
3. Identify where politically-relevant data collection, analytics, and algorithms depend on
human judgment or assumption in order to assess the ethical, privacy, and policy
concerns of a given course of action in data-driven political decision-making
4. Apply your understanding of the course material to real world situations.
The subject matter of a course serves as a tool to help you develop skills to become a better
thinker and communicator. Therefore, to serve their function in a liberal arts curriculum, these
courses are designed to facilitate critical thinking and communication skills. Long after you’ve
forgotten what “non-response bias” is, for example, I hope that what will endure is improvement
in these foundational learning goals:
Assess and synthesize information. Use the course material to arrive at informed
opinions.
General Expectations
I recognize that we are immersed in unprecedented global, societal, and local uncertainty (more
on that below). I aim to make this experience worth your time, and I ask for the same in return. I
expect that you will stay on top of the course material according to the schedule provided and
come prepared to participate actively in our synchronous sessions. I am eager to help you
succeed. If you take your education seriously and communicate with me (with plenty of advance
notice) about obstacles or challenges that may affect your performance in the course, I am
happy to work with you to find mutually agreeable solutions. It is imperative in a remote learning
environment to keep lines of communication clear and open. If you discover that you are
struggling, I want to hear about it so that we can work together to get you back on track.
Writing Expectations
My standards for writing are high and I expect students to produce concise and precise prose.
Because of this, I do my best to make my expectations clear at the outset of the course and
offer you low stakes opportunities to get feedback on your writing early in the semester. I
encourage you to take advantage of the fact that I like to help students improve their writing;
office hours are a great time to meet with me to discuss your work before or after due dates. I
also recommend that you consult at least one of the following writing guides if you are
consistently receiving negative feedback about the quality of your writing.
Strunk Jr., William I. and E.B. White. 1999. The Elements of Style, 4th Edition. Longman.
Course Structure
This course will keep you busy! You are receiving three credit hours for the course, and you
should anticipate spending between nine and 12 hours on this course each week. (The
standard calculation is one contact hour and two-three hours of additional work for every credit
hour.) You will need to manage your time efficiently and responsibly.
Course Units
The experience will be divided into five units. We will have a different substantive focus for each
unit, but each unit will follow a similar schedule, blending together several different types of
activities. These include:
Asynchronous Components
Each unit will include material through which you will work independently. This will include
approximately 100-120 minutes of asynchronous content designed to substitute for one class
session a week (two class sessions a unit). This will also include ~4 hours of reading (or ~2
hours of reading/week). Some units will have slightly more, some will have slightly less.
Course Readings: You will be assigned a set of academic journal articles or book
chapters to read for each unit. I strongly recommend you read them in the week in which
they are assigned.
Lectures: I will prerecord a set of lectures (each ~15 minutes in length) that connect the
course readings for that week to the broader academic literature on the topic. You need
to watch these in advance of the discussion at the end of the week. I strongly
recommend you watch them in the week in which they are assigned.
Other Multimedia: Some units will include podcasts, documentaries, or video clips of
speakers.
Unit Note Outline: a structured handout I provide to help you synthesize the unit
material. While I will not collect or grade these, the note outlines will be essential for you
as you prepare for the midterm and final exam.
Pause and Connect Group Blog: each unit will include two or more activities where you
connect with your peers to share your ideas. This will utilize the group blog feature on
Blackboard.
Clarification Discussion Board: a discussion board where you can post questions and
clarifications you need from the readings, lecture, and supplementary course material.
Synchronous Components
Although we will not be meeting in person, I hope to use a combination of Zoom and Google
Docs to structure our remote synchronous sessions in a worthwhile way.
Group Office Hours: I encourage you to attend synchronously on Tuesdays during our
assigned class time, but with a handful of exceptions, attendance on Tuesdays will be
OPTIONAL. We will convene at 12:30 for you to ask questions about the course
materials or for us to talk about connections between the course material and events in
the news. When there are no more student-driven questions or discussion topics, we'll
break. I will record the session and make it available after the fact. Students who would
like to meet to work on their group projects can stay on the class Zoom or move to a
different Zoom. You should keep your calendars open on Tuesdays for the duration
of the semester because you will occasionally be required to attend. Those dates
are noted on the syllabus but may change.
Weekly Discussion: You are required to attend the weekly course session during our
assigned time on Thursdays. We’ll meet as a whole class—sometimes breaking out to
smaller groups—to address questions you have about the course material and to work
together to synthesize what we’re learning. Attendance is REQUIRED and I will NOT
record the session.
Week 2
Complete weekly course readings and take notes. Watch/listen to asynchronous materials.
Attend group OH. Meet with long-term project group.
Reconcile your lecture and reading notes on the unit note outline. Complete Pause & Connect activity.
Attend class.
Polish unit note outline. Work on long-term course assignments.
Course Assignments
All course assignments are described in detail in separate handouts. Some assignments have
interim, ungraded deadlines.
Midterm (20%) March 25th
Final (25%) May 14th
Course Policies
Grades
I reserve A’s for excellent work. B’s are for solid, above-average work while C’s are for work of
average quality. D’s indicate work that is below average, and F’s indicate work that is
substantially below expectations.
100-93 A     89-87 B+    79-77 C+
92-90 A-     86-83 B     76-73 C
             82-80 B-    72-70 C-    etc.
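As a rough illustration of how the scale above maps a numeric score to a letter, here is a minimal Python sketch. The names GRADE_FLOORS and letter_grade are hypothetical and not part of any course tool. The sketch assumes that a fractional score falls into the lower band (for example, 92.5 maps to A-), since the syllabus does not state a rounding rule, and it returns None below 70 because the scale above lists those bands only as "etc."

```python
from typing import Optional

# Lower bound of each band exactly as listed in the grade scale.
# Bands below C- are not spelled out in the syllabus, so they are omitted here.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
]

def letter_grade(score: float) -> Optional[str]:
    """Return the letter for a numeric score, or None if the band is not listed."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return None  # below 70: the scale only says "etc."

print(letter_grade(86))
```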
Attendance
Class attendance in synchronous sessions is required, though participation points will not be
awarded simply for showing up. You have two unexcused absences for our Thursday
synchronous sessions. Each unexcused absence after the second will result in a two-point
deduction in your participation grade. Habitual tardiness to class bothers me and extreme cases
can affect your participation grade; if you anticipate that you will be late with some frequency
(for example, you have a class right before ours), please make me aware of the situation.
If you, a member of your family, or someone you live with contracts COVID-19, please alert me
immediately. I will accommodate these attendance policies in the event of your own illness or
the illness of someone in your family or residence.
Late Policies—Exams
The only valid reasons for missing and rescheduling an exam are university-approved
reasons (a documented illness, religious observance, death in the family or similarly grave
family emergency, or a W&M-sponsored commitment that you have discussed with me before
the assignment is due) or, during final exams only (as W&M allows), having several exams in
a row. If you are sick enough to miss a test, you are sick enough to go to the doctor. You must
1) email me before the exam to let me know about your illness; and 2) make every effort to take
the test in the most expeditious manner possible. I prefer to give students the benefit of the
doubt, but if I perceive that you are taking advantage of the situation, you will be subject to a
penalty.
If you miss an exam for another reason, you can take a makeup exam for which the maximum
grade you can earn is a C (75%).
For the individual writing assignment, you can formally request an extension of up to one week
in length. Up to two weeks before the due date, you will receive only a 1% deduction on the
assignment for making the request. An additional 1% deduction is added each day you delay
your request within the two-week window. Therefore, if you ask for a change on the day the
assignment is due, the maximum grade you can receive is an 85%.
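The extension arithmetic above can be sketched in a few lines of Python. This is only an illustration under one reading of the policy: a request made 14 or more days ahead costs 1%, and each day of delay inside the two-week window adds another 1%. The function name max_grade_after_extension_request is hypothetical.

```python
def max_grade_after_extension_request(days_before_due: int) -> int:
    """Maximum grade (percent) after requesting an extension.

    days_before_due is the number of whole days between the request and the
    due date; 0 means the request is made on the due date itself.
    """
    if days_before_due < 0:
        raise ValueError("requests after the due date are not covered by this sketch")
    base_deduction = 1   # 1% for any request made two weeks or more ahead
    window_days = 14     # the two-week window before the due date
    # One extra point for each day the request is delayed inside the window.
    days_of_delay = max(0, window_days - days_before_due)
    return 100 - (base_deduction + days_of_delay)

print(max_grade_after_extension_request(14))
print(max_grade_after_extension_request(0))
```

Under this reading, a request made on the due date carries a 15% deduction, which matches the 85% maximum stated above.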
Because I give you this option in advance, I do not grant extensions without penalty on
assignments except in the case of the university-approved reasons outlined above. (The earlier
you let me know about a situation that may affect your ability to turn in your paper on time, the
better.) Computer malfunctions will not be considered a legitimate excuse for the late
submission of assignments, so plan accordingly.
All assignments should be submitted to Blackboard by 11:59 p.m. on the date they are due.
Assignments turned in after the deadline are subject to a 5% penalty for each day (or fraction
thereof) they are late until the maximum grade possible is a 60. Weekend days count. So, if you
turn in an assignment the day after the assignment is due (even at 1:00 a.m.), the maximum
grade possible is a 95. An assignment turned in two days after the due date will receive a
maximum score of 90; three days late will receive a maximum of 85; four days late, 80, etc. If
you are submitting your paper late, you must email it to me for time-stamping purposes.
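For anyone who wants to check the late-penalty arithmetic, here is a minimal Python sketch. The function name max_grade_when_late is hypothetical, and the sketch assumes lateness is measured in hours past the 11:59 p.m. deadline, with any fraction of a day counted as a full day.

```python
import math

def max_grade_when_late(hours_late: float) -> int:
    """Maximum grade possible for a submission that is hours_late past the deadline."""
    if hours_late <= 0:
        return 100  # on time: no penalty
    # Any fraction of a day counts as a whole day; weekend days count too.
    days_late = math.ceil(hours_late / 24)
    # 5 points per day, but the maximum grade never drops below 60.
    return max(60, 100 - 5 * days_late)

print(max_grade_when_late(1.0))    # about 1:00 a.m. the day after the deadline
print(max_grade_when_late(24 * 8))
```

The floor of 60 is reached at eight days late and stays there for anything later.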
I will not accept assignments after the Friday of the last week of regularly scheduled
classes.
Extra Credit
Extra credit will rarely, if ever, be available. Consequently, it is imperative that you do your best
on each and every assignment.
Grade Appeals
If you are dissatisfied with your grade on an assignment, you can choose between two options.
If you want to talk about your work and discuss ways you can improve on future assignments, I
am happy to meet with you in office hours or by appointment. You cannot appeal your grade
after we have this conversation. Therefore, if you are positive that you want to appeal your
grade, you need to write a one-page double-spaced explanation of why you think your work
merits a higher grade. After reading your appeal, I will re-grade your assignment. Your grade
can go up, stay the same, or go down. We will then schedule a meeting to talk about your work.
The College of William & Mary has had an honor code since at least 1779. Academic integrity is
at the heart of the College, and we all are responsible for upholding the ideals of honor and
integrity. I assume that students take the Honor Code and plagiarism as seriously as I do and
that academic misconduct will not become an issue in this class. The student-led honor system
is responsible for resolving any suspected violations of the Honor Code, and we will report all
suspected instances of academic dishonesty to the honor system. The Student Handbook
(www.wm.edu/studenthandbook) includes your responsibilities as a student and the full Code.
Your full participation and observance of the Honor Code is expected. To read the Honor Code,
see www.wm.edu/honor.
I will make clear for the exams what resources/aids you may use while still being compliant with
the Honor Code. I do NOT intend to use Honorlock or any other proctoring technology for the
exams.
Misc. Policies
“These Unprecedented Times” Accommodations
We are in the middle stages of a global pandemic. Our country is deeply politically fractured.
Collectively we are grappling with a much-needed and overdue racial reckoning. In combination,
these factors greatly increase the uncertainty and discomfort we experience in our daily lives.
How do we account for this in our learning experience?
First, I recommend that you appropriately define what success this semester will look like to you given
these circumstances. Some of you see major benefits in remote learning, while others find it
hard to stay motivated or connected. Some of you will be deeply personally affected by COVID-
19, some of you will simply be deeply inconvenienced. Some of you will find that real world
matters are more important to you than academic matters, while some of you will find solace in
pouring yourself into schoolwork. Acknowledge your reality and set your goals accordingly.
Second, communication with me is imperative. I’d love to hear more about your priorities, goals,
and circumstances when we meet in office hours at the beginning of the semester. You must
keep me posted as situations develop this semester. I can’t help you if I don’t know what is going on in
your life that will affect your performance in this course. I will be as flexible as I can, and the
earlier you communicate the better.
Third, I expect you to comply with the University’s Healthy Together Community Statement.
Although we will not be meeting face-to-face this semester, you stand a better chance of
keeping yourself, the people you care about, and our broader community safe if you abide by
these policies and norms.
Technology
I prefer that students keep their cameras on during synchronous sessions. I understand there
are reasons this may present a challenge for some students. If you need to keep your camera
off, I will respect your choice. In return, I ask that you compensate in some way to signal that
you are as engaged with me, your peers, and the course material as are the students with their
cameras turned on: writing longer or higher quality blog posts or using the clarification
discussion board on Blackboard; making use of the chat function or the reaction buttons on
Zoom; or coming more regularly to individual or group office hours.
It will be obvious to me and your peers if you are doing something on your computer other than
engaging in class. Please be conscious of the signals you are sending to me and to your peers
with your body language and eye contact.
Turn off your cell phones before signing into a synchronous session. If you are expecting an
important call, tell me before class, keep your phone on vibrate, and turn your camera off when
you receive the call.
Omnibus Project
You will be required to participate in the Social Science Research Methods Center’s Omnibus
Project. The project is a collaborative subject pool for survey and experimental research
conducted by students and faculty. To help introduce you to the field of political science, you will
have the opportunity to participate as a subject in one or more research projects this semester.
An alternative writing assignment will be offered to students who do not want to participate in
the Omnibus Project or are not old enough to participate. The total time required will be
approximately one hour.
Contact Policies
My preference is to meet during office hours on Zoom for all substantive questions. Email is my
preferred form of communication for anything about which I will need a record (an excused
absence, setting up a meeting time outside of office hours, etc.) If we need to schedule a phone
call, I will email you my cell number.
Add/Drop Policies
The add/drop deadline is Friday, February 5th. The last day to withdraw is Monday, March 29th.
Disability Services
William & Mary accommodates students with disabilities in accordance with federal laws and
university policy. Any student who feels they may need an accommodation based on the impact
of a learning, psychiatric, physical, or chronic health diagnosis should contact Student
Accessibility Services staff at 757-221-2512 or at [email protected] to determine if accommodations
are warranted and to obtain an official letter of accommodation. For more information, please
see www.wm.edu/sas.
Course Materials
Books to Purchase
Wachter-Boettcher, Sara. Technically Wrong: Sexist Apps, Biased Algorithms, and Other
Threats of Toxic Tech
Posted on Blackboard
Course Schedule
Note: I reserve the right to make modifications to this schedule, but I will not increase the workload.
Unit #1
People as Data: Data Generating Processes from Humans
Unit Introduction
In this unit, we will study the way that people make data as it relates to their political attitudes and behaviors. Traditionally,
quantitative social scientists have collected data about people using survey research. However, the advent of digital technologies
and the Internet has created a plethora of new ways for people to generate data, whether or not they are aware they are doing so.
After we assess the evolving methods used to collect systematic information about people’s political behavior, we will think about the
role of human choice and bias in the way that data are collected, and how we should interpret data given what we know about the
processes that generated it. Finally, we’ll consider some of the privacy implications involved in the creation of human-generated “big
data.”
Core Questions
1. How have the data collection procedures in the field of political behavior evolved over time?
2. What are the strengths and weaknesses of unobtrusively collected data? Of social media data in particular?
3. What principles should guide the way we collect “big data” in the digital political sphere?
4. What privacy concerns are raised with the collection and analysis of “big data” in the digital political sphere?
5. In what ways are biases built into the platforms and algorithms that are used to collect and analyze behavior in the digital
political sphere?
Core Outcomes
By the end of this unit, you should be able to:
CO1 Explain the key terms and concepts from the Unit 1 Note Outline
CO2 Analyze the role of human choice and bias in the creation and interpretation of political data generating processes
CO3 Identify the ethical considerations and privacy concerns involved in the collection of politically-relevant digital trace
data
CO4 Formulate questions for our speakers about the processes, decisions, and interpretations they make in their daily
work
Tuesday and Thursday sessions run 12:30-1:50 p.m.
Week 1 (February 1st)
Tuesday 2/2: Group OH
Thursday 2/4: Discussion
Asynchronous Content: 1.1 Foundations of Data Collection in American Political Behavior
• Atkeson 2010. “The State of Survey Research as a Research Tool in American Politics.” (12 pages)
Week 2 (February 8th)
Tuesday 2/9: Group OH
Thursday 2/11: Discussion
Asynchronous Content: 1.5 Garbage In, Garbage Out
• Read Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher (~200 pages, but easy, light reading)
• Listen to Data Science Ethics podcast (16 minutes): http://datascienceethics.com/podcast/collect-carefully/
Unit #2
Campaigns and Big Data
Unit Introduction
In this unit, we will study the way that political campaigns in the United States use data to communicate with voters in order to
influence their electorate’s political attitudes and behavior. A series of reforms around the turn of the 21st century made possible, for
the first time, digital voter registration records available by state. Paired with increased access to other forms of public and
proprietary records, and increased data processing capabilities, political campaigns underwent a revolution in their ability to identify
and target potential voters. After we assess the evolving methods used by campaigns, we will think about the possible new frontiers
created by pairing this information with digital trace data. Finally, we’ll consider the normative implications of the use and abuse of
this data for our understanding of democratic representation.
Core Questions
1. How have electoral campaigns historically identified and targeted their electorate?
2. How are campaigns making use of newer forms of data to refine their targeting approaches?
3. In what ways has consumer and proprietary data shaped our understanding of political attitudes and behaviors?
4. What assumptions are built into data-driven campaign strategies?
5. What are the normatively desirable outcomes of campaigns’ strategic behavior? The normatively undesirable outcomes?
Core Outcomes
By the end of this unit, you should be able to:
CO1 Explain the key terms and concepts from the Unit 2 note handout
CO2 Analyze how the data available to political elites shapes their interpretation of the attitudes and behavior of the
American public
CO3 Identify the implications for democratic representation in the application of politically-relevant data in campaign
analytics
CO4 Formulate questions for our speakers about the processes, decisions, and interpretations they make in their daily
work
attendance for group projects
2.2 The Evolution of Voter Targeting
• Kreiss and McGregor: “Technology Firms Shape Political
Communication: The Work of Microsoft, Facebook, Twitter, and Google
with Campaigns During the 2016 Presidential Cycle” (21 pages)
• Watch interview excerpts with Meg Schwenzfeier, Data Science Lead at
Biden for America 2020 (~12 minutes)
Week 2 (February 22nd)
Tuesday 2/23: Group OH (required attendance for group projects)
Thursday 2/25: Discussion
Asynchronous Content: 2.4 Predictions from Big Data
• Kosinski et al. 2013. “Private Traits and Attributes are Predictable from Digital Records of Human Behavior.” Proceedings of the National Academy of Sciences (4 pages)
• Watch interview excerpts with Michael Frias, CEO of Catalist (~22 minutes)
“SPRING BREAK” – No asynchronous work or class session the week of March 1st
Unit #3
Market Forces in Media Production
Unit Introduction
In this unit, we will explore the production of news, applying theories from economics to explain the evolution of the media ecosystem
over time. We’ll consider factors that have remained relatively consistent over the changing mediums of the 20th and 21st centuries—
such as news values and the properties of news as market goods—as well as factors that have changed, such as the
democratization of news production, the identity of editorial gatekeepers, and the availability of information about audience
preferences. Our case study will focus on the idea of “clickbait” and audience metrics, assessing the way that data have altered the
incentives for news production and examining how journalistic ethics might need to adapt to accommodate those changes.
Core Questions
1. How does the market inform the production of news?
2. What is “news” in the social media era and who produces it?
3. What are the consequences of changes in news production?
4. Can the media fulfill its role as the “fourth branch”?
Core Outcomes
By the end of this unit, you should be able to:
CO1 Explain the key terms and concepts from the Unit 3 note handout
CO2 Analyze how the data available to media elites about demand affects the supply of news that is produced and
disseminated
CO3 Identify the implications for democratic representation of changes in the way that news is produced and
disseminated
CO4 Formulate questions for our speakers about the processes, decisions, and interpretations they make in their daily
work
• Munger, Kevin. 2020. “All the News That’s Fit to Click: The Economics of
Clickbait Media”
• Just. 2013. “What’s News: A View from the Twenty-First Century.”
Chapter 7 in the Oxford Handbook of American Public Opinion and the
Media.
• Watch interview excerpts with Josh Lederman, reporter for NBC News
Midterm Week (March 22nd)
Tuesday 3/23: Midterm Q&A
Thursday 3/25: Midterm
Asynchronous Content: Study for midterm
Unit #4
Psychology meets Computer Science: Media Consumption in the Digital Media Era
Unit Introduction
What kinds of effects does the media have on the way people think about politics, and how has that changed in the era of social
media? Which factor is a more powerful influence on the information we encounter online: our own choices or the choices that
algorithms make for us? In this unit we will explore how human psychology affects the way people process political information and
the manner in which human choice interacts with the algorithms underpinning digital media technologies. We’ll assess the normative
implications of this interaction as it relates to 1) echo chambers and polarization, 2) algorithmic harm more broadly, and 3) the
potential of algorithm audits to identify problematic or discriminatory algorithms.
Core Questions
1. What kinds of effects does the media have on public opinion?
2. What motivates political news consumption?
3. What is algorithmic harm? How can it be recognized and how might it be remedied?
4. How do algorithms and human psychology interact to affect news exposure?
Core Outcomes
By the end of this unit, you should be able to:
CO1 Explain the key terms and concepts from the Unit 4 note handout
CO2 Analyze how algorithms alter a user’s news experience on social media
CO3 Identify the implications of the use of algorithms for the effect of news on public opinion
CO4 Formulate questions for our speakers about the processes, decisions, and interpretations they make in their daily
work
Feed Era: Effects from Digital, Social, and Mobile Media”
Week 2 (April 5th)
Tuesday: SPRING BREAK DAY
Thursday 4/8: Discussion
Asynchronous Content: 4.3 Algorithms and Algorithmic Harm
• Data and Discrimination: Collected Essays
• Eslami et al. 2016. “First I ‘like’ it, then I hide it: Folk Theories of Social Feeds.”
• Read interview with Safiya Umoja Noble
• Listen to Cathy O’Neil keynote talk (first 40 minutes)
Unit #5
Pathologies of Media
Core Questions
1. What concerns do Americans have about the media, and how have those concerns evolved over time?
2. What are the consequences of declining trust in media?
3. What are the most sinister current and future threats to our media ecosystem?
4. Who should be responsible for addressing those threats? And how do they do so, without infringing on people’s liberty or
causing deleterious unintended consequences?
Core Outcomes
By the end of this unit, you should be able to:
CO1 Explain the key terms and concepts from the Unit 5 note handout
CO2 Analyze how “bad actors” influence the media ecosystem
CO3 Identify the implications of the use of cutting-edge technology for how people process political information
CO4 Formulate questions for our speakers about the effects of the media ecosystem on the way they do their jobs
Week 2 (April 19th)
Tuesday 4/20: Group OH
Thursday 4/22: Discussion
Asynchronous Content: 5.5 Censorship, Repression, and Propaganda
• Tucker, Joshua et al. 2017. “From Liberation to Turmoil: Social Media and Democracy.” Journal of Democracy 28(4): 46-59.
Correction. In Social Media and Democracy: The State of the Field, eds.
Joshua Tucker and Nathaniel Persily. New York, NY: Cambridge University
Press.
• Watch interview excerpts with Zarine Kharazian, The Atlantic Council
5.7 Factchecking
• Reading TBD
• Watch interview excerpts with Angie Holan, Politifact
Tuesday, April 27th, Thursday April 29th, and Tuesday May 4th
• Group Projects