MSSISS 2017

MSSISS 2017 extended and enhanced the success of previous symposia. An inaugural Thursday night event showcased a junior faculty keynote (Assistant Professor Eric Schwartz), combined speed oral and poster presentations, and undergraduate posters. We were also honored to have Professor Dimitris Bertsimas, the Boeing Professor of Operations Research at MIT, challenge traditional thinking in our field and propose the use of advanced computing and optimization as practable (practically tractable) solutions to modern challenges.

Professor Bertsimas

Machine Learning and Statistics via a modern optimization lens (slides)

The field of Statistics has historically been linked with Probability Theory. However, some of the central problems of classification, regression and estimation can naturally be written as optimization problems. While continuous optimization approaches have had a significant impact in Statistics, mixed integer optimization (MIO) has played a very limited role, primarily based on the belief that MIO models are computationally intractable.

The period 1991–2015 has witnessed (a) algorithmic advances in MIO, which, coupled with hardware improvements, have resulted in an astonishing 2 trillion factor speedup in solving MIO problems, and (b) significant advances in our ability to model and solve very high-dimensional robust and convex optimization models.

In this talk, we demonstrate that modern convex, robust and especially mixed integer optimization methods, when applied to a variety of classical Machine Learning (ML)/Statistics (S) problems, can lead to certifiably optimal solutions for large-scale instances that often have significantly improved out-of-sample accuracy compared to heuristic methods used in ML/S. Specifically, we report results on

  • The classical variable selection problem in regression currently solved by Lasso heuristically.
  • We show that robustness, and not sparsity, is the major reason for the success of Lasso, in contrast to widely held beliefs in ML/S.
  • A systematic approach to design linear and logistic regression models based on MIO.
  • Optimal trees for classification solved by CART heuristically.
  • Robust classification including robust logistic regression, robust optimal trees and robust support vector machines.
  • Sparse matrix estimation problems: Principal Component Analysis, Factor Analysis and Covariance matrix estimation.

In all cases we demonstrate that optimal solutions to large-scale instances (a) can be found in seconds, (b) can be certified to be optimal in minutes and (c) outperform classical approaches. Most importantly, this body of work suggests that linking ML/S to modern optimization leads to significant advances.

Bio

Dimitris Bertsimas is currently the Boeing Professor of Operations Research and the co-director of the Operations Research Center at the Massachusetts Institute of Technology. His research interests include analytics, optimization and their applications in a variety of industries. He has co-authored more than 200 scientific papers and recently published the book “The Analytics Edge”. He is a member of the US National Academy of Engineering and an INFORMS fellow. He has received several research awards including the Philip Morse lectureship award (2013), the William Pierskalla award for best paper in health care (2013), the best paper award in Transportation Science (2013), the Farkas prize (2008), the Erlang prize (1996), the SIAM prize in optimization (1996), the Bodossaki prize (1998) and the Presidential Young Investigator award (1991-1996).

Professor Schwartz

How is data science helping Flint recovery efforts? A data-driven approach to infrastructure improvement

The Flint water crisis highlights a number of serious problems: a public health outbreak, inadequate urban infrastructure, operational failures, political mistrust, and environmental injustice. But a key challenge that has received less attention in Flint’s recovery is a lack of information: Who is most at risk? What predicts that risk? Where should resources be allocated? These problems bear a surprising similarity to those in management science and customer analytics. Households differ in their lead contamination, but without testing every home’s water, how can we predict which homes are at greatest risk? Further, city officials face a dynamic resource allocation problem: given uncertain records and costly construction, which homes’ pipes should be replaced next? Support for recovery continues, as Congress appropriated $100 million for Flint out of $9 billion for nationwide water infrastructure, but the efficient use of these funds is critical. To contribute to the recovery efforts, we assembled rich datasets, including thousands of water samples, information on pipe materials, and city records. Working with local government, we have been able to more accurately estimate the greatest risks, to develop a clearer picture of the source of the problems, and to more efficiently direct resources towards recovery. Specifically, we employ ensembles of classifiers and active learning, and we have developed apps for coordination with contractors and residents. We illustrate our approach, involving statistical machine learning tools and data collection efforts, as a replicable method for other cities to follow. We contend that Flint can serve as a national model for how to improve water infrastructure with a data-driven approach.

Bio

Eric Schwartz is an Assistant Professor of Marketing at the Ross School of Business at the University of Michigan. He draws on the areas of statistical machine learning and optimization, ranging from Bayesian statistics to dynamic programming and adaptive experiments. Some current projects lie at the intersection of optimization in management science and public policy. He earned his Ph.D. in Marketing from the Wharton School.

 

MSSISS 2017 Presentation Awards:

Best Oral Presentation:

Selin Merdan & Christine Barnett (IOE) – Data Analytics for Optimal Staging Decisions for Newly-Diagnosed Prostate Cancer Patients

Oral Presentation Honorable Mention:

Morteza Noshad (EECS) – Direct Estimation of Information Divergence Using Nearest Neighbor Ratios

Best Speed Session:

Arya Fahari (EECS) – On the Search for Lead Pipes in Flint

ASA Prize for Best Poster Presentation:

Wesley Marrero (IOE) – Projections of Non-Alcoholic Steatohepatitis Related Liver Transplantation Waitlist Additions

Departmental Poster Presentation Winners:

  • Brian Segal (Biostatistics) – Tests of Matrix Structure for Construct Validation
  • David Hong (EECS) – Asymptotic Performance of PCA for High-Dimensional Heteroscedastic Data
  • Nicholas Seewald (Statistics) – Sample Size Considerations for the Analysis of Continuous Repeated-Measures Outcomes in Sequential Multiple-Assignment Randomized Trials
  • Iago Santos Muraro (Survey Methodology) – Optimal Timing for Incentive Changes in a Long-Standing Panel Survey with High Calling Volume

Best Undergraduate Poster Presentation:

Katherine Li (Statistics) – ReVibe: Recalling everyday moments with context

MSSISS 2017 Full Schedule

MSSISS 2017 Program  

MSSISS 2017 Student Organizing Committee

NAME DEPARTMENT EMAIL/WEBSITE
Krithika Suresh Biostatistics ksuresh@umich.edu
John Lipor EECS lipor@umich.edu
Tom Logan IOE website
Jesús Arroyo Statistics jarroyor@umich.edu
Colleen McClain Survey Methodology camcclai@umich.edu

MSSISS 2017 Faculty Advisory Committee

NAME DEPARTMENT EMAIL
Peter Song Biostatistics pxsong@umich.edu
Clayton Scott EECS clayscot@umich.edu
Eunshin Byon IOE ebyon@umich.edu
Susan Murphy Statistics samurphy@umich.edu
Brady West Survey Methodology bwest@umich.edu