Proceedings of Machine Learning Research
Proceedings of The KDD'21 Workshop on Causal Discovery
Held in Singapore on 15 August 2021
Published as Volume 150 by the Proceedings of Machine Learning Research on 02 August 2021.
Volume Edited by:
Thuc Duy Le
Jiuyong Li
Greg Cooper
Sofia Triantafyllou
Elias Bareinboim
Huan Liu
Negar Kiyavash
Series Editors:
Neil D. Lawrence
Mark Reid
https://proceedings.mlr.press/v150/
Fri, 20 Aug 2021 05:57:31 +0000

Dirac Delta Regression: Conditional Density Estimation with Clinical Trials
Personalized medicine seeks to identify the causal effect of treatment for a particular patient as opposed to a clinical population at large. Most investigators estimate such personalized treatment effects by regressing the outcome of a randomized clinical trial (RCT) on patient covariates. The realized value of the outcome may however lie far from the conditional expectation. We therefore introduce a method called Dirac Delta Regression (DDR) that estimates the entire conditional density from RCT data in order to visualize the probabilities across all possible outcome values. DDR transforms the outcome into a set of asymptotically Dirac delta distributions and then estimates the density using non-linear regression. The algorithm can identify significant differences in patient-specific outcomes even when no population level effect exists. Moreover, DDR outperforms state-of-the-art algorithms in conditional density estimation by a large margin even in the small sample regime. An R package is available at https://github.com/ericstrobl/DDR.
Mon, 02 Aug 2021 00:00:00 +0000
https://proceedings.mlr.press/v150/strobl21a.html

A Recursive Markov Boundary-Based Approach to Causal Structure Learning
Constraint-based methods are one of the main approaches to causal structure learning and are particularly valued because they are asymptotically guaranteed to find a structure that is Markov equivalent to the causal graph of the system. On the other hand, they may require a number of conditional independence (CI) tests that is exponential in the number of variables of the system. In this paper, we propose a novel recursive constraint-based method for causal structure learning that significantly reduces the required number of CI tests compared to the existing literature. The proposed approach uses Markov boundary information to identify a specific variable that can be removed from the set of variables without affecting the statistical dependencies among the other variables. Having identified such a variable, we discover its neighborhood, remove that variable from the set of variables, and recursively learn the causal structure over the remaining variables. We further provide a lower bound on the number of CI tests required by any constraint-based method. Comparing this lower bound to our achievable bound demonstrates the efficiency of the proposed approach. Our experimental results show that the proposed algorithm outperforms the state of the art on both synthetic and real-world structures.
Mon, 02 Aug 2021 00:00:00 +0000
https://proceedings.mlr.press/v150/mokhtarian21a.html

Interactive Causal Structure Discovery in Earth System Sciences
Causal structure discovery (CSD) models are making inroads into several domains, including Earth system sciences. Their widespread adoption is, however, hampered by the fact that the resulting models often do not take into account the domain knowledge of the experts and that it is often necessary to modify the resulting models iteratively. We present a workflow that is required to take this knowledge into account and to apply CSD algorithms in Earth system sciences. At the same time, we describe open research questions that still need to be addressed. We present a way to interactively modify the outputs of the CSD algorithms and argue that the user interaction can be modelled as a greedy finding of the local maximum-a-posteriori solution of the likelihood function, which is composed of the likelihood of the causal model and the prior distribution representing the knowledge of the expert user. We use a real-world data set for examples constructed in collaboration with our co-authors, who are the domain experts. We show that finding maximally usable causal models in the Earth system sciences or other similar domains is a difficult task that contains many interesting open research questions. We argue that taking the domain knowledge into account has a substantial effect on the final causal models discovered.
Mon, 02 Aug 2021 00:00:00 +0000
https://proceedings.mlr.press/v150/melkas21a.html

Preface: The 2021 ACM SIGKDD Workshop on Causal Discovery
Preface to the 2021 KDD Workshop on Causal Discovery (CD 2021)
Mon, 02 Aug 2021 00:00:00 +0000
https://proceedings.mlr.press/v150/le21a.html

Estimating individual-level optimal causal interventions combining causal models and machine learning models
We introduce a new statistical causal inference method to estimate the individual-level optimal causal intervention, that is, the value to which we should set a certain variable of an individual to obtain a desired value of another variable. This is defined as an optimization problem that minimizes the error between a desired value and the value that would have been attained under the setting for the individual. To solve the optimization problem, we first train a machine learning model to predict the value of an objective variable and then estimate the causal structure of the variables. We then combine the machine learning model and the causal structure into a single causal model to estimate the counterfactual value of the predicted objective variable. This is effective in achieving a more accurate estimation of the individual-level optimal causal intervention. We further propose a gradient descent algorithm to compute the optimal causal intervention. Our method is generally applicable to continuous variables that are linearly or non-linearly related. In experiments, we evaluate the effectiveness of our method using artificial data generated by non-linear causal structures and real data.
Mon, 02 Aug 2021 00:00:00 +0000
https://proceedings.mlr.press/v150/kiritoshi21a.html