### Stefan Wager - Efficient Policy Learning

### 4 Answers

I saw a "no overlap" condition on a slide, but didn't catch the explanation. Could you remind me what it means?

-pi appeared to be described as the "anti-policy", and by implication (or my inference) the "worst policy". This seems reasonable, but is it really true that swapping the policy for everyone gives the worst policy?
In the notation from the slides, Q(-pi) = -Q(pi), so assuming that Pi is symmetric (that is, -pi is in Pi whenever pi is), the claim is true: if pi maximizes Q over Pi, then -pi minimizes it. (I should have stressed the symmetry assumption more.)

written 12 months ago by Stefan Wager
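
The sign-flip identity in the answer above can be checked numerically. The sketch below is only an illustration: since the slides are not reproduced here, it assumes that policies take values in {-1, +1} and that Q(pi) is the average of pi(X) * tau(X). The treatment effect function `tau`, the helper `Q`, and the class of threshold policies are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
X = rng.uniform(-1.0, 1.0, size=n)

# Hypothetical conditional treatment effect, chosen only for illustration.
tau = X + 0.25

def Q(pi_values):
    """Assumed policy value: sample average of pi(X) * tau(X), with pi in {-1, +1}."""
    return float(np.mean(pi_values * tau))

# A symmetric class of threshold policies: every rule appears with its negation.
policies = []
for c in np.linspace(-1.0, 1.0, 21):
    pi = np.where(X > c, 1.0, -1.0)
    policies.append(pi)
    policies.append(-pi)

values = np.array([Q(pi) for pi in policies])
best = policies[int(np.argmax(values))]

print("Q(best)        =", Q(best))
print("Q(-best)       =", Q(-best))
print("min over class =", values.min())
```

Because the class contains the negation of every rule, the negated best policy attains the minimum value over the class. If the negations were left out of `policies`, that would no longer be guaranteed.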

Is your method sensitive to the weights? If the weights are poorly estimated, is your method still robust? Doubly robust methods require that at least one of the weighting or the imputation does a good job.
Yes, that's right. I didn't mention it in the talk, but the assumption we really need is that the product of the RMSEs for estimating the propensity and outcome models is o(n^-0.5). In the talk, I had simply assumed that both can be estimated at rate o(n^-0.25), which implies the product condition.

written 12 months ago by Stefan Wager
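
To illustrate the "either weighting or imputation" property raised in the question, here is a small simulation using the standard augmented inverse-propensity weighted (AIPW) score. This is a generic textbook construction and not necessarily the exact estimator from the talk. The data-generating process, the helper `aipw`, and the deliberately wrong nuisance models (a constant propensity of 0.5, outcome models fixed at zero) are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
X = rng.normal(size=n)

# Illustrative data-generating process; the true average effect is 1.0.
e_true = 0.1 + 0.8 / (1.0 + np.exp(-X))   # propensity stays inside (0.1, 0.9)
W = rng.binomial(1, e_true)
mu0_true = X                               # E[Y | X, W=0]
mu1_true = 1.0 + 1.5 * X                   # E[Y | X, W=1]
Y = np.where(W == 1, mu1_true, mu0_true) + rng.normal(size=n)

def aipw(Y, W, mu0, mu1, e):
    """Average of the standard AIPW (doubly robust) scores."""
    scores = mu1 - mu0 + W * (Y - mu1) / e - (1 - W) * (Y - mu0) / (1 - e)
    return float(np.mean(scores))

# Deliberately misspecified nuisance models.
e_bad = np.full(n, 0.5)
mu_bad = np.zeros(n)

good_outcome_bad_weights = aipw(Y, W, mu0_true, mu1_true, e_bad)
bad_outcome_good_weights = aipw(Y, W, mu_bad, mu_bad, e_true)
both_bad = aipw(Y, W, mu_bad, mu_bad, e_bad)

print("outcome model right, weights wrong:", good_outcome_bad_weights)
print("outcome model wrong, weights right:", bad_outcome_good_weights)
print("both wrong:                        ", both_bad)
```

In this setup the first two estimates should land close to the true effect of 1.0, while the third is visibly biased. The simulation only shows the either/or consistency property; the product-of-RMSEs condition in the answer is the sharper statement about how fast the two nuisance estimates must converge jointly.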

eta < P[W=1 | X=x] < 1-eta

for all x. This condition enforces the idea that the observational study is at least a little bit randomized, which is necessary for causal inference to be possible.
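
As a rough illustration of how this condition might be checked in practice, the sketch below tests whether a vector of estimated propensity scores stays strictly inside (eta, 1 - eta). The function name `check_overlap`, the default `eta=0.05`, and the simulated propensities are assumptions made for this example.

```python
import numpy as np

def check_overlap(e_hat, eta=0.05):
    """Return (ok, fraction_violating) for the condition eta < e_hat < 1 - eta."""
    e_hat = np.asarray(e_hat, dtype=float)
    violating = (e_hat <= eta) | (e_hat >= 1.0 - eta)
    return bool(not violating.any()), float(violating.mean())

rng = np.random.default_rng(2)
X = rng.normal(size=5_000)

# Propensity bounded away from 0 and 1: overlap holds with eta = 0.05.
e_bounded = 0.1 + 0.8 / (1.0 + np.exp(-X))

# Deterministic assignment (W = 1 exactly when X > 0): no overlap at all.
e_deterministic = (X > 0).astype(float)

print(check_overlap(e_bounded))        # (True, 0.0)
print(check_overlap(e_deterministic))  # (False, 1.0)
```

The deterministic rule is the extreme case of "no overlap": for every x, treatment is either certain or impossible, so the data contain no comparison between treated and untreated units at the same x.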