Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

There is a great need for models capable of simulating the effect of interventions on collective action. Here we introduce a new model, based on multi-agent reinforcement learning, in which many properties of collective action emerge endogenously. In this model, heterogeneity of agent objectives creates rich dynamics of coordination and conflict, both within and between groups. There are multiple equilibria that are desirable for all players, but each group prefers one particular equilibrium over all others. This creates a between-group coordination problem that can be solved by establishing a convention. Within each group, working towards the group's preferred equilibrium poses both a start-up problem and a free-rider problem. We investigate the effects of group size, intensity of intrinsic preference, and salience on the emergence dynamics of coordination conventions. Our simulations show that agents establish and switch between conventions, even working against their own preferred outcome when doing so is necessary for effective coordination.