2472 - Probing relationships between reinforcement learning and simple behavioral strategies to understand probabilistic reward learning
Reinforcement learning (RL) and win-stay/lose-shift are widely used models of decision making that describe how individuals learn about and respond to reward and loss in dynamic environments. Although mutually informative, these accounts are often conceptualized as independent processes, obscuring potential relationships between win-stay/lose-shift tendencies and RL parameters. By simulating the win-stay/lose-shift tendencies of an RL agent across its parameter space, we demonstrate novel relationships between win-stay/lose-shift tendencies and RL parameters that challenge the conventional interpretation of lose-shift as a metric of loss sensitivity. We also provide a methodology for directly relating RL parameters to behavioral strategy: using the simulated win-stay/lose-shift tendencies to construct and maximize a truncated multivariate normal distribution over RL parameters given observed win-stay/lose-shift tendencies, we show that win-stay/lose-shift tendencies can be used to approximate RL parameters. In both noiseless and noisy simulated data, this approximation method yields parameter recovery comparable to the conventional maximum likelihood estimation method. For empirical data, however, it provides a more reliable approximation of RL parameters than maximum likelihood estimation.
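The first step described above, measuring the win-stay/lose-shift tendencies of a simulated RL agent at a given point in parameter space, can be sketched as follows. This is a minimal illustration, not the paper's exact design: it assumes a two-armed bandit task, a delta-rule Q-learner with softmax choice, and illustrative parameter names (`alpha` for learning rate, `beta` for inverse temperature) and reward probabilities.

```python
import math
import random

def simulate_wsls(alpha, beta, p_reward=(0.7, 0.3), n_trials=10000, seed=0):
    """Simulate a two-armed bandit Q-learning agent (delta rule + softmax)
    and return its empirical win-stay and lose-shift probabilities.
    Task structure and parameters are illustrative assumptions."""
    rng = random.Random(seed)
    q = [0.5, 0.5]                  # initial action values
    prev_choice, prev_reward = None, None
    stays_after_win, wins = 0, 0
    shifts_after_loss, losses = 0, 0
    for _ in range(n_trials):
        # softmax probability of choosing arm 1
        p1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        choice = 1 if rng.random() < p1 else 0
        reward = 1 if rng.random() < p_reward[choice] else 0
        if prev_choice is not None:
            if prev_reward:          # previous trial was a win
                wins += 1
                stays_after_win += (choice == prev_choice)
            else:                    # previous trial was a loss
                losses += 1
                shifts_after_loss += (choice != prev_choice)
        # delta-rule value update
        q[choice] += alpha * (reward - q[choice])
        prev_choice, prev_reward = choice, reward
    return stays_after_win / wins, shifts_after_loss / losses
```

Sweeping `alpha` and `beta` over a grid and recording the resulting (win-stay, lose-shift) pairs yields the simulated mapping from RL parameters to behavioral strategy; inverting that mapping via a fitted truncated multivariate normal is the approximation step the abstract describes.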