Freshman Seminar 154
Center for Digital Humanities
Center for Statistics & Machine Learning
Spring 2019, Thursdays 1:30 – 4:20 PM

Come and explore the wide world of data. Learn to read datasets, first through examples drawn from historical and literary sources and then by creating your own datasets with classmates. Build foundational skills for working with data while reflecting critically on those methods and tools through the lenses of race, class, gender, and power. By Dean’s Date you will join the community of data researchers.

Hearty acknowledgements are due to Jean Bauer.

Course Policies

Please read these policies carefully. All information on this syllabus is subject to change. Any changes will be announced in advance.

Assignments

Weekly readings will explore many perspectives on data. Assignments will include short responses, exercises, and reflections, as well as two projects working with data, one at the middle of the term and one at the end. Unless otherwise noted, assignments are due by the start of class.

Readings

All course readings are available online and should be completed before the start of class.

Attendance

Mandatory and crucial. Attendance and active participation are essential components of the learning process. My class is structured to encourage your success through in-class discussions, exercises, and conversations with your peers. If you have difficulty speaking up in class, please talk to me.

Because absences are sometimes unavoidable, you will exchange email addresses with classmates. Keep these in case you need to ask about work you missed in class. I consider it your responsibility to find out from classmates (not me) what took place. Because our in-class conversations and exercises are so important, you may miss two classes during the semester without penalty; after that, one point will be deducted from your participation grade for each missed meeting.

Class Contacts

Name:_______________________ E-mail: _____________________________

Name:_______________________ E-mail: _____________________________

Tardiness

Given the precious little time we have together, it is important that we make full use of every class period. This includes the beginning of class. Consistent tardiness will adversely impact your grade. Being tardy three times will register as one missed class.

Evaluation Policies

Assignments throughout the class will be a mix of collaborative and individual work. I will provide feedback on every assignment. While completing any of your work, you are welcome and encouraged to visit office hours. I ask that you wait 24 hours before discussing any grade with me. Because the major projects make up a large portion of your course grade, I will circulate a more detailed schedule and evaluation rubrics in class. Please note that every major step of the final assignment will require a critical reflection on how it fulfills the project designs, goals, and rubrics. To show that you have read this syllabus thoroughly, please email me a GIF of a dinosaur to receive extra credit.

Extra Credit

If you attend a campus event or talk relevant to our class, you may submit a one-page reflection on how the experience adds to our class learning. Some of these opportunities will be announced in class. Each reflection will count for one third of a letter grade toward participation.

Readings

This course is built around the readings assigned for each class meeting. Please come to class prepared to talk about the readings, including the debates, materials, and audiences that each author engages. For more, see Paul Edwards’s handy guide, “How to Read a Book”: http://pne.people.si.umich.edu/PDF/howtoread.pdf

Plagiarism & Academic Honesty

It is your duty to be familiar with the Princeton University Principles of General Conduct and Regulations, along with Rights, Rules, Responsibilities, 2017 edition (www.princeton.edu/pub/rrr/). I quote: “The central purposes of a university are the pursuit of truth, the discovery of new knowledge through scholarship and research, the teaching and general development of students, and the transmission of knowledge and learning to society at large. Free inquiry and free expression within the academic community are indispensable to the achievement of these goals. The freedom to teach and to learn depends upon the creation of appropriate conditions and opportunities on the campus as a whole as well as in classrooms and lecture halls. All members of the academic community share the responsibility for securing and sustaining the general conditions conducive to this freedom.”

Assignments

  • Participation        20%
  • Weekly exercises     20%
  • Proposals            10%
  • In-class workshop    10%
  • Reflections          10%
  • Unessays             30%
  • Total               100%
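
As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored on a 0–100 scale, which this syllabus does not specify; the names WEIGHTS and course_score are hypothetical and are not part of any official grading tool.

```python
# Weights copied from the list above (percent of the final grade).
WEIGHTS = {
    "Participation": 20,
    "Weekly exercises": 20,
    "Proposals": 10,
    "In-class workshop": 10,
    "Reflections": 10,
    "Unessays": 30,
}


def course_score(scores, weights=WEIGHTS):
    """Return the weighted average of component scores.

    Assumption: every score is on a 0-100 scale. The syllabus does not
    say how components are scored or converted to letter grades.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must add up to 100")
    return sum(weights[name] * scores[name] for name in weights) / 100


# Example: full marks for participation, 80 on everything else.
example = {name: 80 for name in WEIGHTS}
example["Participation"] = 100
print(course_score(example))
```

The point of the sketch is only that the unessays, at 30%, move the total more than any other single component.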

Some advice from past students

  • “Manage your time, work on projects and assignments daily, even if only for a little while.”
  • “Don’t BS your first attempts and peer projects. The projects that I got the most out of were the ones that I worked really hard on the first stab and my group dug in together. If you put in the effort in the beginning your life will be easier and you will get a better grade.”
  • “Just make sure you plan ahead.”
  • “Participate in class, otherwise you will be bored.”
  • “Work hard on the project rough drafts so that you lighten your load and aren’t up all night the night before something is due.”
  • “Don’t wait until the last minute.”
  • “Begin the research process early.”
  • “Do it.” 

Class Session Leaders

Students will co-teach many of our course sessions. You will work in small groups to prepare and lead a class session. This work will count toward your participation grade.

Part of the responsibility of leading a class session will be gaining a greater familiarity with the materials, ideas, and debates around a topic. While I do not expect you to learn everything there is to know about a topic in a few weeks, I will expect that you have thought intensively and expansively about what we might know and discuss.

Timeline:

  • 2 weeks prior: meet with me outside of class
  • 1 week prior: send out a related reading to the rest of the class
  • Tuesday before class: send me your class plans
  • Lead your class session
  • 1 week after: submit your debrief statement (paper or email)

Responsibilities:

  • Find and share an additional reading that relates to the topic of the day.
  • Find & share one data source, tool, or application that relates to the topic of the day.
  • Take primary responsibility for facilitating class discussion.
  • Bring a list of at least 5-6 discussion questions (email them to me so I can print them out).

Debrief statement:

After the class is over, please write a statement (2+ pages) that reflects on your experiences and ideas. While you are welcome to take up any range of topics, please make sure to respond to these questions:

  • Describe what each group member contributed to the preparation.
  • What are the major takeaways from the class session?
  • What did we leave out of the class session?
  • What might we have done differently?
  • Do you have any unanswered questions?


Course Calendar

*Note: URLs are provided, but most readings are easy to find by searching online for the author and title.

Week 1 – What is data?

  • Introductions, policies, and planning

Week 2 – Categorization and Classification of Data, part 1

Readings

  • Foucault, preface from The Order of Things (Course Site)
  • Gitelman and Jackson, “Introduction,” “Raw Data” Is an Oxymoron (Course Site)
  • Mimi Onuoha, “The Library of Missing Datasets” https://github.com/MimiOnuoha/missing-datasets

Week 3 – Categorization and Classification of Data, part 2

Readings

Week 4 – Data Curation

Readings

Lab: Tools for processing data (e.g., GitHub, OpenRefine, Breve, Excel, regex)

Week 5 – Data Visualization

Readings

Lab: Data visualization tools and Unessays

Class session leaders: ______________________________________________________

Week 6 – Data Translation

Readings

  • Moritz Stefaner, “Data Cuisine” (watch the entire video) https://truth-and-beauty.net/appearances/talks/eyeo-2016
  • Catherine D’Ignazio and Lauren F. Klein, “Feminist Data Visualization” www.kanarinka.com/wp-content/uploads/2015/07/IEEE_Feminist_Data_Visualization.pdf

Lab: Unessays (cont’d)

Week 7 – Networks

Readings

Lab: Social network analysis with Palladio, Gephi, Cytoscape, or others.

Class session leaders: ______________________________________________________

*Midterm projects are due by the start of class.*

Week 8 – Maps

Readings

Before class, please browse these mapping projects (links embedded):

Lab: Google Maps and Google Earth tutorial, among others.

Class session leaders: ______________________________________________________

Week 9 – Designing with Data

Readings

  • Chimero, excerpts from The Shape of Design (Course Site)

Lab: Tactics for data & design practices

Class session leaders: ______________________________________________________

Week 10 – Studio workshop

  • No reading. We will complete an intensive in-class lab assignment and may be joined by other people from around campus.

Week 11 – Data, Lately

  • Rather than pre-assign readings, together we will create a reading list for this day based on the most recent news about data and its consequences.

Week 12 – Data in the World

  • In-Class Presentations and closing conversations

All final projects must be submitted by email by midnight on Dean’s Date. Each group member is also required to submit the additional reflection essay. We may be joined by guests from around campus.


Dataset Collections & Repositories

  1. Australian GLAM (Galleries, Libraries, Archives, Museums) datasets
  2. Awesome OpenAccess Data Projects
  3. Data Collections and Datasets
  4. A collection of museum, gallery, library, archive, archaeology and assorted sources for machine-readable data
  5. Data Is Plural Newsletter Archive
  6. The Magazine of Early American Datasets (MEAD)
  7. Library of Congress Labs – experimental tools, art, applications, and visualizations
  8. African American Digital Projects – http://bit.ly/Black-DH-List
  9. http://data.gov
  10. http://datarefuge.org
  11. Bureau of Labor Statistics https://www.bls.gov/
  12. NYC Open Data (many other cities have similar portals)
  13. Open Data Philly
  14. UN Data
  15. Google Public Data Directory
  16. ProPublica DataStore
  17. CDC Data & Statistics
  18. Awesome Public Datasets
  19. Makeover Monday Data Challenges – Datasets
  20. The European Backpackers Index 2018
  21. Viz for Social Good (browse around for links to various datasets)
  22. DataHub Core Datasets
  23. Yelp Dataset Challenge
  24. Modern Data Catalog
  25. Pitchfork Reviews 1999-2019 (warning: not yet in csv)
  26. Stanford Large Network Dataset Collection
  27. Various Golden Globe, Nobel, and Oscar Award Datasets
  28. Bigfoot Field Researchers – Comprehensive Sightings Database
  29. Kaggle CSV Datasets for competitions
  30. Newspaper presidential endorsements, 1980-present
  31. Weapons confiscated at airports by TSA in 2015
  32. Data from the Pentagon’s surplus-equipment-to-local-law-enforcement program
  33. Network Repository. An Interactive Scientific Data Repository
  34. Inside AirBnB Data
  35. Grand Comics Database
  36. BuzzFeedNews Data
  37. The Marvel Universe Social Network
  38. ICIJ Offshore Leaks Database
  39. Russian Ads on Facebook for US Elections
  40. Student Loan Debt Per Graduate by School by State 2017

Resources for working with data

Guides and references

Approaching & Refining Data

Creating Graphs & Charts

Tools for building unessays

Timelines

Maps

These tools generate maps with points (note: we find static maps often reach more people)

Digital Stories / Interactive Narratives

Story Maps

(interactive maps that move through a sequence of locations)

Text Analysis

Social Network Analysis

  • Palladio – basic network graphs for quick prototypes
  • Gephi – network graphs (note: desktop software with a learning curve)

Geocoding

Geocoding means converting a list of place names into latitude/longitude coordinates.
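
To make the idea concrete, here is a minimal Python sketch of geocoding as a lookup. It assumes a tiny hand-made gazetteer (the GAZETTEER dictionary below is hypothetical, and its coordinates are approximate); real geocoding tools consult much larger place-name databases, but the basic step of matching a name to a coordinate pair is the same.

```python
# Hypothetical mini-gazetteer for illustration only.
# Coordinates are approximate (latitude, longitude) in decimal degrees.
GAZETTEER = {
    "princeton, nj": (40.35, -74.66),
    "philadelphia, pa": (39.95, -75.17),
    "new york, ny": (40.71, -74.01),
}


def normalize(name):
    """Lowercase the name and collapse extra spaces so small variants still match."""
    return " ".join(name.lower().split())


def geocode(place_names, gazetteer=GAZETTEER):
    """Return one (name, latitude, longitude) row per input name.

    Names missing from the gazetteer get None for both coordinates,
    so they can be reviewed and fixed by hand.
    """
    rows = []
    for name in place_names:
        lat, lon = gazetteer.get(normalize(name), (None, None))
        rows.append((name, lat, lon))
    return rows


for row in geocode(["Princeton,  NJ", "Atlantis"]):
    print(row)
```

Keeping unmatched names in the output, rather than dropping them, makes the gaps in the data visible instead of silently hiding them.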

Data Collection / Scraping