Panel: Combining Survey Social Science with Data Science Methods: Fragile Families Challenge and Beyond

Distinguished Panelists:
Matthew Salganik (Princeton University)
Colter Mitchell (University of Michigan)
Jeremy Freese (Stanford University)

Friday, 6 October, 3:10 PM
ISR-Thompson 1430

Abstract: New data science methods and mass collaborations pose both exciting opportunities and important challenges for social science research. This panel will explore the relationship between these new approaches and traditional survey methodology. Can they coexist, or even enrich one another? Dr. Matthew Salganik is one of the lead organizers of the Fragile Families Challenge, which uses data science approaches such as predictive modeling, mass collaboration, and ensemble techniques in the context of the long-running Fragile Families and Child Wellbeing panel survey. Dr. Jeremy Freese is co-PI of the General Social Survey and of a project on collaborative research in the social sciences. Dr. Colter Mitchell is research faculty at the Institute for Social Research and has done innovative work combining biological data and methods with Fragile Families and other survey data sets.

Please promote this event widely.

Graduate students are invited to an informal lunch with Drs. Salganik and Freese from 12 – 1 PM on the same day. Please RSVP so we know how much food to order.

Faculty may schedule meetings with Drs. Salganik and Freese by emailing jwlock (@umich.edu).

Sponsored by
Computational Social Science Rackham Interdisciplinary Workshop
Population Studies Center (PSC) Freedman Fund
Michigan Institute for Data Science (MIDAS)

Event Poster

Other events on campus this Fall

Michigan’s a vibrant place, and there are many CSS-related events this fall besides those our workshop organizes. We’ve gathered a partial list of them here:

* Free CSCAR workshops are only announced one month in advance, so stay tuned for more.

Resources from Data Camp Online!

Resources from this year’s ICOS Big Data Camp are now online! Data Camp 2017 was taught and attended by numerous CSS Workshop members.

On their home page, you will find

  • Slides from all of the talks
  • Example code and data from all lessons
  • Links to important tools and their setup instructions (e.g. SQL, Python, BASH, Jupyter…)

On their resources page, you will find

  • Example papers
  • Free online books
  • Relevant blogs
  • Many, many data sets on a huge variety of things
  • Useful tools
  • API examples and documentation

Data Camp will run again next summer, so stay tuned for announcements!

Python Skills Workshops

We are excited to announce two more Python skills workshops in partnership with CSCAR! To attend, participants should register as soon as the workshops become available on the CSCAR website. Registration is free for UM-affiliated people.

Data Science with Social Science data: an introduction to Python’s Pandas
Thursday, March 30th, 2-4 pm, MLB 2001A
Register
This workshop introduces participants to Python’s NumPy, Pandas DataFrames, Matplotlib and StatsModels using an advertising dataset. Participants will use these tools to model (OLS) associations between advertising expenditures and product sales in example data. We will start with an introductory explanation of Anaconda and the Jupyter notebook environment (although not required for the participant, the instructor will be using these tools). We will proceed with topics including: reading data files; creation, indexing and slicing of Pandas DataFrames; creation and handling of Matplotlib objects; and creation and interpretation of models using Python’s StatsModels. Although not required, we recommend that participants have a basic knowledge of Python.
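As a rough illustration of the workflow described above, the sketch below builds a Pandas DataFrame, slices it, and fits an OLS model with StatsModels. The small table, its column names (`tv`, `radio`, `sales`), and all of its values are invented for illustration; they are not the workshop's dataset or materials.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical advertising data (made-up values, not the workshop dataset).
ads = pd.DataFrame({
    "tv":    [230.1, 44.5, 17.2, 151.5, 180.8, 8.7, 57.5, 120.2],
    "radio": [37.8, 39.3, 45.9, 41.3, 10.8, 48.9, 32.8, 19.6],
    "sales": [22.1, 10.4, 9.3, 18.5, 12.9, 7.2, 11.8, 13.2],
})

# Indexing and slicing: pick rows with a boolean mask and columns by name.
big_tv = ads.loc[ads["tv"] > 100, ["tv", "sales"]]

# OLS: regress sales on both kinds of advertising expenditure.
model = smf.ols("sales ~ tv + radio", data=ads).fit()

# Estimated intercept and slopes; model.summary() gives the full table.
print(model.params)
```

In practice the DataFrame would come from a file, for example via `pd.read_csv`, rather than being typed in by hand.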

Data Science with Social Science data: building predictive models using Python’s Scikit-learn
Thursday, April 6th, 2-4 pm, MLB 2001A
We will use Python’s Pandas DataFrames, Matplotlib and Scikit-learn to analyze census data. Participants will use Scikit-learn tools to predict whether income exceeds a particular dollar amount based on the census data. This workshop covers the essential steps to building a predictive model in Python. We will start with an introductory explanation of Anaconda and the Jupyter notebook environment (although not required for the participant, the instructor will be using these tools). We will proceed with topics including: data analysis; creation and manipulation of Pandas DataFrames and Matplotlib objects; and creation and interpretation of predictive models using Python’s Scikit-learn. Although not required, we recommend that participants have a basic knowledge of Python and Pandas DataFrames.
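The steps named above can be sketched roughly as follows. This is a minimal example under stated assumptions, not the workshop's code: the census-style table, its column names, the binary `high_income` label, and the choice of logistic regression are all hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical census-style records (made-up values).
# high_income is 1 when income exceeds the chosen dollar amount, else 0.
census = pd.DataFrame({
    "age":             [25, 38, 52, 46, 23, 31, 60, 44, 29, 35, 50, 41],
    "education_years": [12, 16, 18, 14, 11, 12, 16, 18, 13, 12, 14, 16],
    "hours_per_week":  [40, 50, 45, 40, 20, 35, 50, 60, 40, 38, 45, 55],
    "high_income":     [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1],
})

X = census[["age", "education_years", "hours_per_week"]]
y = census["high_income"]

# Hold out a quarter of the rows so the model is scored on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier and evaluate it on the held-out rows.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"held-out accuracy: {accuracy:.2f}")
```

With only a dozen rows the accuracy figure means little; the point is the order of the steps: split, fit, predict, evaluate.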

Mini-Conference CFP

We are excited to announce that the Computational Social Science RIW will be hosting a mini-conference on Thursday and Friday, April 6-7, 2017!

Submission Guidelines:

  • The conference is open to people from the UM community at all levels and departments. 
  • We particularly encourage submissions from graduate students looking to get feedback on their works in progress.
  • Work in early stages (e.g. project proposal or data analysis) is welcome; that’s often when we need feedback.
  • Late-stage work is also welcome; this is a great opportunity to get feedback on that conference or journal paper before you send it out.
  • Work about CSS (e.g. STS or research on data privacy) is welcome, even if you don’t personally use CSS methods.

Important Dates:

  • Sunday, February 5: CFP released.
  • Wednesday, March 15, 11:59pm: Proposal and registration deadline.
  • Thursday, April 6th, 2pm: Skills workshop: Python for data analysis (with CSCAR)
  • Thursday, April 6th, 6pm: Invited panel (more details soon!)
  • Friday, April 7th, 11am-4pm: Open sessions and round tables (central campus)

Registration: https://goo.gl/forms/iDZd7nRM6asNSq5h1

Please share this invitation with anyone who may be interested.

List of Summer CSS Opportunities

There are a host of things one can do this summer related to computational social science, both within and without the university. We have compiled a partial list below (and we welcome suggestions!):

Within UM:

  • ICOS Big Data Summer Camp. The camp will run again this summer here at UM. A number of our members have been participants or leaders there in the past and found it to be a great way to build skills and network with other Michigan people. Information from last year’s workshop is here.
  • ICPSR Summer Courses teach a variety of CSS-relevant skills and methods, including network analysis, machine learning, text analysis, math for social scientists, R, and more! They also have scholarships.

Outside UM:

Women in Data Science Event

Dear Computational Social Sciences Workshop Members,

On Feb. 3, 2017, an all-day Women in Data Science (WiDS) conference will be held at the University of Michigan in conjunction with the main event at Stanford University. This one-day technical conference will combine local faculty presentations with the live stream of the opening keynote and select technical vision talks from the WiDS Conference at Stanford University. WiDS features female speakers from multiple domains, covering machine learning, data visualization, bioinformatics, geosciences, and more. Our goals are to educate, inspire and motivate students, faculty, and industry partners to enter or stay in the data science field. Stanford has an exciting lineup of speakers and will live-stream around the globe, with Michigan as one of the partners.

Part of the activities here at UMICH will be faculty technical vision talks and a panel addressing Data Science challenges and technical issues. The UM faculty speakers include Stephanie Teasley, Amy Cohn, Emily Mower Provost, Mingyan Liu, and Yi Lu Murphey (UM Dearborn, Chair); the panel will be moderated by Anna Gilbert. These faculty will showcase their exciting research and discuss the current and future challenges in Data Science. As this is an event to showcase the breadth and strength of women in data science, we will keep our focus on the research challenges at hand from a UMICH perspective.

The day will start off with a networking breakfast, followed by the five technical talks and then a panel for questions and answers. After the panel, there will be a luncheon during which we will begin the live stream from Stanford University. At 3:30pm we will relocate to Rackham for a MIDAS Seminar by Dr. Yao Xie, Georgia Tech, which will be followed by a networking reception.

Please consider attending and help us spread the word about the conference. As seating is limited to 96, registration is required. While the conference speakers are women, the conference is open to all faculty, students, and staff.

Thank you so much for your consideration. Please let me know if you have any questions.

Warm regards,

Moira C. Dowling, MPH

Sr. Project Manager

Michigan Institute for Data Science (MIDAS)

Resources for Python Data Science Skills Sessions

Resources including data and sample code for our Data Science Skills workshop series are posted in our GitHub as they become available. If you are familiar with GitHub, you may clone and use the repositories as normal. If you’re not familiar with git, don’t worry! You can just click the files and view or download them manually from the website.

https://github.com/UM-CSS

Skills Sessions: Data Science with Social Science Data

The CSS organizers are proud to present two workshops, which will be held in January in collaboration with CSCAR:

Data Science with Social Science Data

This workshop covers the essential steps of data analysis in Python, using social science data as a case study. The workshop is divided into two parts. The first session includes an introduction to Python’s NumPy and Pandas data analysis libraries. This session requires no previous experience with Python. We will cover common steps involved in any data analysis: from loading the data to running a regression and interpreting outcomes.

The second session requires some background knowledge of Python, which the first session provides. It covers more advanced features, from various potential preprocessing steps to using Scikit-learn’s machine learning tools to analyze the data. As in the first session, we will be using an example from the social sciences.

The two sessions will be held in a computer lab and participants will be able to work either individually or in small groups on a few practice exercises.

For event dates and details, please check our calendar and your email.

Princeton Resources for CSS

Our videoconference with Princeton Sociology Professor Matthew Salganik about his new book, Bit by Bit, left us with a wealth of information and resources. Check them out below!

Princeton workshops and tutorials (with data, code, slides…):

Relevant conferences:

A few examples of published work in CSS: