ASA Conference on Statistical Practice 2018, Saturday 2 of 4, Causal Inference

CSP Conf Logo

Highlights from Conference on Statistical Practice.

Friday 2/16/2018

Saturday 2/17/2018

11:00 AM Causal Inference

Causal Inference with Multilevel Data Structures, Luke Keele, Georgetown University

Problem: observational studies often are the only realistic option, with bias than prevents normal statistical methods from being reliable. When observational studies have clustering occurring among who receives or doesn’t receive a treatment, that introduces a multi-layered issue that makes it yet more complex.

He thinks multilevel matching is more ideal than multilevel regression for clustered observational studies. He has an R package multiMatch/matchMulti that does this matching work and has several options to handle more complex cases.

The second study shown was cool. It looked at an actual RCT, but with bias in schools that choose to participate. He removed the randomly assigned control group, and tried to see if the randomly assigned treatment group could show the same effect in comparison to the non-participating group after adjusting for selection bias. Since the effect size was known, it is a cool example case to see how well the matching algorithm works to get at the treatment effect in an observational study setting.

Since students could be matched or schools could be matched, it is a multi-level situation requiring more complex matching either by student or school or both. Don’t remember the results though.

A Decision Tool for Causal Inference and Observational Data Analysis Methods in Comparative Effectiveness Research, Douglas Landsittel, University of Pittsburgh

Propensity scores are a prediction problem, but predict on treatment - as opposed to on the response. So they predict if someone received the treatment. In a randomized control trial, an individual’s probability of treatment or control is the same. He suggested propensity scores may be a perfect application of machine learning, since a black box is fine. I’ve heard this before, and think random forest does make sense, to handle automatic non-linearity managing, and getting to a number quickly.

He emphasized people use propensity score methods not just propensity scores. Once having a propensity score, there are four approaches to using them. Of 4 approaches, adjust is the worst. One method is inverse of propensity probability as a weight. I thought this looked interesting. There is a lot of work on pros and cons of these methods. He thinks it depends on area of common support and distribution of propensity scores.

Up next: 2:00 PM Deploying Quantitative Models as ‘Visuals’ in Popular Data Visualization Platforms

Written on