<h1> <a href="https://econml.azurewebsites.net/"> <img src="doc/econml-logo-icon.png" width="80px" align="left" style="margin-right: 10px;", alt="econml-logo"> </a> EconML: A Python Package for ML-Based Heterogeneous Treatment Effects Estimation </h1>

EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal to combine state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. The promise of EconML:

Implement recent techniques in the literature at the intersection of econometrics and machine learning
Maintain flexibility in modeling the effect heterogeneity (via techniques such as random forests, boosting, lasso and neural nets), while preserving the causal interpretation of the learned model and often offering valid confidence intervals
Use a unified API
Build on standard Python packages for Machine Learning and Data Analysis

One of the biggest promises of machine learning is to automate decision making in a multitude of domains. At the core of many data-driven personalized decision scenarios is the estimation of heterogeneous treatment effects: what is the causal effect of an intervention on an outcome of interest for a sample with a particular set of features? In a nutshell, this toolkit is designed to measure the causal effect of some treatment variable(s) T on an outcome variable Y, controlling for a set of features X, W and how does that effect vary as a function of X. The methods implemented are applicable even with observational (non-experimental or historical) datasets. For the estimation results to have a causal interpretation, some methods assume no unobserved confounders (i.e. there is no unobserved variable not included in X, W that simultaneously has an effect on both T and Y), while others assume access to an instrument Z (i.e. an observed variable Z that has an effect on the treatment T but no direct effect on the outcome Y). Most methods provide confidence intervals and inference results.

For detailed information about the package, consult the documentation at https://econml.azurewebsites.net/.

For information on use cases and background material on causal inference and heterogeneous treatment effects see our webpage at https://www.microsoft.com/en-us/research/project/econml/

<details> <summary><strong><em>Table of Contents</em></strong></summary>

News
Getting Started
- Installation
- Usage Examples
For Developers
- Running the tests
- Generating the documentation
Blogs and Publications
Citation
Contributing and Feedback
Community
References

</details>

News

If you'd like to contribute to this project, see the Help Wanted section below.

July 3, 2024: Release v0.15.1, see release notes here

<details><summary>Previous releases</summary>

February 12, 2024: Release v0.15.0, see release notes here

November 11, 2023: Release v0.15.0b1, see release notes here

May 19, 2023: Release v0.14.1, see release notes here

November 16, 2022: Release v0.14.0, see release notes here

June 17, 2022: Release v0.13.1, see release notes here

January 31, 2022: Release v0.13.0, see release notes here

August 13, 2021: Release v0.12.0, see release notes here

August 5, 2021: Release v0.12.0b6, see release notes here

August 3, 2021: Release v0.12.0b5, see release notes here

July 9, 2021: Release v0.12.0b4, see release notes here

June 25, 2021: Release v0.12.0b3, see release notes here

June 18, 2021: Release v0.12.0b2, see release notes here

June 7, 2021: Release v0.12.0b1, see release notes here

May 18, 2021: Release v0.11.1, see release notes here

May 8, 2021: Release v0.11.0, see release notes here

March 22, 2021: Release v0.10.0, see release notes here

March 11, 2021: Release v0.9.2, see release notes here

March 3, 2021: Release v0.9.1, see release notes here

February 20, 2021: Release v0.9.0, see release notes here

January 20, 2021: Release v0.9.0b1, see release notes here

November 20, 2020: Release v0.8.1, see release notes here

November 18, 2020: Release v0.8.0, see release notes here

September 4, 2020: Release v0.8.0b1, see release notes here

March 6, 2020: Release v0.7.0, see release notes here

February 18, 2020: Release v0.7.0b1, see release notes here

January 10, 2020: Release v0.6.1, see release notes here

December 6, 2019: Release v0.6, see release notes here

November 21, 2019: Release v0.5, see release notes here.

June 3, 2019: Release v0.4, see release notes here.

May 3, 2019: Release v0.3, see release notes here.

April 10, 2019: Release v0.2, see release notes here.

March 6, 2019: Release v0.1, welcome to have a try and provide feedback.

</details>

Getting Started

Installation

Install the latest release from PyPI:

pip install econml

To install from source, see For Developers section below.

Usage Examples

Estimation Methods

<details> <summary>Double Machine Learning (aka RLearner) (click to expand)</summary>

Linear final stage

from econml.dml import LinearDML
from sklearn.linear_model import LassoCV
from econml.inference import BootstrapInference

est = LinearDML(model_y=LassoCV(), model_t=LassoCV())
### Estimate with OLS confidence intervals
est.fit(Y, T, X=X, W=W) # W -> high-dimensional confounders, X -> features
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05) # OLS confidence intervals

### Estimate with bootstrap confidence intervals
est.fit(Y, T, X=X, W=W, inference='bootstrap')  # with default bootstrap parameters
est.fit(Y, T, X=X, W=W, inference=BootstrapInference(n_bootstrap_samples=100))  # or customized
lb, ub = est.effect_interval(X_test, alpha=0.05) # Bootstrap confidence intervals

Sparse linear final stage

from econml.dml import SparseLinearDML
from sklearn.linear_model import LassoCV

est = SparseLinearDML(model_y=LassoCV(), model_t=LassoCV())
est.fit(Y, T, X=X, W=W) # X -> high dimensional features
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05) # Confidence intervals via debiased lasso

Generic Machine Learning last stage

from econml.dml import NonParamDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

est = NonParamDML(model_y=RandomForestRegressor(),
                  model_t=RandomForestClassifier(),
                  model_final=RandomForestRegressor(),
                  discrete_treatment=True)
est.fit(Y, T, X=X, W=W) 
treatment_effects = est.effect(X_test)

</details> <details> <summary>Dynamic Double Machine Learning (click to expand)</summary>

from econml.panel.dml import DynamicDML
# Use defaults
est = DynamicDML()
# Or specify hyperparameters
est = DynamicDML(model_y=LassoCV(cv=3), 
                 model_t=LassoCV(cv=3), 
                 cv=3)
est.fit(Y, T, X=X, W=None, groups=groups, inference="auto")
# Effects
treatment_effects = est.effect(X_test)
# Confidence intervals
lb, ub = est.effect_interval(X_test, alpha=0.05)

</details> <details> <summary>Causal Forests (click to expand)</summary>

from econml.dml import CausalForestDML
from sklearn.linear_model import LassoCV
# Use defaults
est = CausalForestDML()
# Or specify hyperparameters
est = CausalForestDML(criterion='het', n_estimators=500,       
                      min_samples_leaf=10, 
                      max_depth=10, max_samples=0.5,
                      discrete_treatment=False,
                      model_t=LassoCV(), model_y=LassoCV())
est.fit(Y, T, X=X, W=W)
treatment_effects = est.effect(X_test)
# Confidence intervals via Bootstrap-of-Little-Bags for forests
lb, ub = est.effect_interval(X_test, alpha=0.05)

</details> <details> <summary>Orthogonal Random Forests (click to expand)</summary>

from econml.orf import DMLOrthoForest, DROrthoForest
from econml.sklearn_extensions.linear_model import WeightedLasso, WeightedLassoCV
# Use defaults
est = DMLOrthoForest()
est = DROrthoForest()
# Or specify hyperparameters
est = DMLOrthoForest(n_trees=500, min_leaf_size=10,
                     max_depth=10, subsample_ratio=0.7,
                     lambda_reg=0.01,
                     discrete_treatment=False,
                     model_T=WeightedLasso(alpha=0.01), model_Y=WeightedLasso(alpha=0.01),
                     model_T_final=WeightedLassoCV(cv=3), model_Y_final=WeightedLassoCV(cv=3))
est.fit(Y, T, X=X, W=W)
treatment_effects = est.effect(X_test)
# Confidence intervals via Bootstrap-of-Little-Bags for forests
lb, ub = est.effect_interval(X_test, alpha=0.05)

</details> <details> <summary>Meta-Learners (click to expand)</summary>

XLearner

from econml.metalearners import XLearner
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor

est = XLearner(models=GradientBoostingRegressor(),
              propensity_model=GradientBoostingClassifier(),
              cate_models=GradientBoostingRegressor())
est.fit(Y, T, X=np.hstack([X, W]))
treatment_effects = est.effect(np.hstack([X_test, W_test]))

# Fit with bootstrap confidence interval construction enabled
est.fit(Y, T, X=np.hstack([X, W]), inference='bootstrap')
treatment_effects = est.effect(np.hstack([X_test, W_test]))
lb, ub = est.effect_interval(np.hstack([X_test, W_test]), alpha=0.05) # Bootstrap CIs

SLearner

from econml.metalearners import SLearner
from sklearn.ensemble import GradientBoostingRegressor

est = SLearner(overall_model=GradientBoostingRegressor())
est.fit(Y, T, X=np.hstack([X, W]))
treatment_effects = est.effect(np.hstack([X_test, W_test]))

TLearner

from econml.metalearners import TLearner
from sklearn.ensemble import GradientBoostingRegressor

est = TLearner(models=GradientBoostingRegressor())
est.fit(Y, T, X=np.hstack([X, W]))
treatment_effects = est.effect(np.hstack([X_test, W_test]))

</details> <details> <summary>Doubly Robust Learners (click to expand) </summary>

Linear final stage

from econml.dr import LinearDRLearner
from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier

est = LinearDRLearner(model_propensity=GradientBoostingClassifier(),
                      model_regression=GradientBoostingRegressor())
est.fit(Y, T, X=X, W=W)
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05)

Sparse linear final stage

from econml.dr import SparseLinearDRLearner
from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier

est = SparseLinearDRLearner(model_propensity=GradientBoostingClassifier(),
                            model_regression=GradientBoostingRegressor())
est.fit(Y, T, X=X, W=W)
treatment_effects = est.effect(X_test)
lb, ub = est.effect_interval(X_test, alpha=0.05)

Nonparametric final stage

from econml.dr import ForestDRLearner
from sklearn.ensemble import

EconML

News

Getting Started

Installation

Usage Examples

Estimation Methods

编辑推荐精选

扣子-AI办公

堆友

码上飞

Vora

Refly.AI

酷表ChatExcel

TRAE编程

AIWritePaper论文写作

博思AIPPT

潮际好麦

探索AI的无限可能

推荐工具精选

TRAE编程

扣子-AI办公

码上飞

商汤小浣熊

讯飞绘文

讯飞绘镜

iTerms

AI云服务特惠

火山引擎

阿里云

腾讯云

华为云

百度智能云

AWS

关注微信公众号