People frequently face challenging decision-making problems in which outcomes
are uncertain or unknown. Artificial intelligence (AI) algorithms exist that
can outperform humans at learning such tasks. Thus, there is an opportunity for
AI agents to assist people in learning these tasks more effectively. In this
work, we use a multi-armed bandit as a controlled setting in which to explore
this direction. We pair humans with a selection of agents and observe how well
each human-agent team performs. We find that team performance can beat both
human and agent performance in isolation. Interestingly, we also find that an
agent’s performance in isolation does not necessarily correlate with the
human-agent team’s performance. A drop in agent performance can lead to a
disproportionately large drop in team performance, or in some settings can even
improve team performance. Pairing a human with an agent that performs slightly
better than them can make them perform much better, while pairing them with an
agent that performs the same can make them perform much worse. Further,
our results suggest that people have different exploration strategies and might
perform better with agents that match their strategy. Overall, optimizing
human-agent team performance requires going beyond optimizing agent
performance, to understanding how the agent’s suggestions will influence human
decision-making.
