Discovering and exploiting the causal structure in the environment is a
crucial challenge for intelligent agents. Here we explore whether causal
reasoning can emerge via meta-reinforcement learning. We train a recurrent
network with model-free reinforcement learning to solve a range of problems
that each contain causal structure. We find that the trained agent can perform
causal reasoning in novel situations in order to obtain rewards. The agent can
select informative interventions, draw causal inferences from observational
data, and make counterfactual predictions. Although formal algorithms for
causal reasoning are well established, we show that such reasoning can also
arise from model-free reinforcement learning, and we suggest that causal
reasoning in complex settings may benefit from the more end-to-end,
learning-based approaches presented here. This work also offers new strategies
for structured
exploration in reinforcement learning, by providing agents with the ability to
perform — and interpret — experiments.
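
The abstract gives no implementation detail, but the setup it describes (a recurrent network trained with model-free reinforcement learning across a distribution of tasks, each with hidden causal structure) can be illustrated. Below is a minimal sketch, not the paper's code: the toy task (a bandit whose hidden "cause" arm pays off reliably), the REINFORCE objective, and all names and hyperparameters (RecurrentPolicy, N_ARMS, GAMMA) are illustrative assumptions.

```python
# Minimal meta-RL sketch, assuming a toy task distribution where each episode
# hides one "cause" arm that pays off reliably. NOT the paper's code; the
# task, names, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

N_ARMS, T, EPISODES, GAMMA = 5, 20, 3000, 0.9

class RecurrentPolicy(nn.Module):
    """LSTM policy fed its previous action (one-hot) and reward, so that
    adaptation to each new task happens in the hidden state, not the weights."""
    def __init__(self, n_arms, hidden=48):
        super().__init__()
        self.lstm = nn.LSTMCell(n_arms + 1, hidden)
        self.head = nn.Linear(hidden, n_arms)

    def forward(self, x, state):
        h, c = self.lstm(x, state)
        return torch.distributions.Categorical(logits=self.head(h)), (h, c)

policy = RecurrentPolicy(N_ARMS)
opt = torch.optim.Adam(policy.parameters(), lr=3e-3)

for ep in range(EPISODES):
    cause = torch.randint(N_ARMS, (1,)).item()  # hidden structure, resampled per task
    state, x = None, torch.zeros(1, N_ARMS + 1)
    log_probs, rewards = [], []
    for t in range(T):
        dist, state = policy(x, state)
        a = dist.sample()
        # Acting on the hidden cause pays off with high probability.
        p = 0.9 if a.item() == cause else 0.1
        r = float(torch.rand(1).item() < p)
        log_probs.append(dist.log_prob(a))
        rewards.append(r)
        x = torch.zeros(1, N_ARMS + 1)  # next input: last action + last reward
        x[0, a.item()], x[0, -1] = 1.0, r
    # Model-free REINFORCE update: discounted returns-to-go with a
    # mean-return baseline for variance reduction.
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + GAMMA * G
        returns.insert(0, G)
    ret = torch.tensor(returns)
    loss = -(torch.stack(log_probs).squeeze() * (ret - ret.mean())).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```

The key meta-learning ingredient is feeding the previous action and reward back into the recurrent input: the weight updates are model-free and task-agnostic, while within-episode adaptation to each sampled structure is carried entirely by the LSTM's hidden dynamics.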
