When a rat runs a maze and finds a reward, the first thing he does is pause to enjoy it, especially when it’s a spot of Nesquik chocolate. But inside his brain, the work isn’t done. In a study published August 25 in Neuron, researchers show that the rat’s memories of reaching the reward play forward and backward in his hippocampus, but it’s the backward replay that increases with the size of the reward, possibly as a way to reinforce learning.

“You don’t learn everything you need to learn during the actual experience itself,” says senior author David Foster, associate professor of neuroscience at Johns Hopkins University School of Medicine, in a release. “Experience is expensive. When rats run around, it’s dangerous. So experience should be treated as valuable, processed offline, and remembered.”

As the animal pauses and enjoys his drink, his hippocampus engages in a series of very fast reverse simulations. If it took two seconds to run down a track to get a drink of chocolate, the simulation that runs in the hippocampus will take just a tenth of a second, about 20 times faster. “We had the notion that this could have something to do with memory because the animal is quite literally replaying sequences of places from his past,” says Foster.

Fast-motion replays involve neurons called place cells that represent locations in the current environment. Replays in place cells were initially found in rats during sleep, but years ago Foster found that the same patterns also occur when the animals are awake. Soon after, it became clear that they run as both forward and backward simulations. This study is the first to determine how the animal’s experience influences the direction of the replays.

Foster and lead author Ellen Ambrose, a graduate student in Foster’s lab, enlisted five rats to participate in two experiments. The setups were the same. The rats ran back and forth along a straight path and found liquid chocolate at either end. In one experiment, the amount of chocolate was increased during certain runs. In the other, the amount of reward was decreased.

The rats were outfitted with electrodes to record the activity of more than a hundred closely spaced hippocampal place cells during their explorations. By recording from so many cells at once, the team was able to capture rapid bursts of cell-to-cell communication rippling through the place cells, indicating a fast-motion replay running forward or in reverse.

Many studies that record from rats as they navigate mazes focus on linking neural activity to behavior, but this study focuses on the neural processing that occurs during chocolate-sipping breaks. “When the animal arrives at the food, only then does he truly understand what the previous experiences meant,” says Foster. “The offline processing that happens during these brief rest periods has meaning and may contribute to memory and other functions.”

Foster’s team found that the number of reverse replays rose and fell with the size of the reward, but the number of forward replays remained constant. The findings suggest that the two forms of fast-motion simulation play different roles. Reverse replay appears to be related to reward-based learning, while forward replay might be about planning for the future.

While it is too invasive to record reverse fast-motion playbacks in healthy humans, the sharp-wave ripple oscillations that occur during replays have been observed in epilepsy patients who have electrodes implanted in their brains for surgical treatment. Non-invasive imaging studies have also shown that humans have place cells that represent the current environment, so it is possible that human brains employ a similar mechanism. In fact, reverse replays are also used in machine learning. “This is an ideal algorithm for learning about rewards,” says Foster.
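
To see why replaying an experience backward can suit reward learning, consider a toy sketch in Python. It is not the method used in the study, and every name and number in it is an assumption made for illustration: a six-position track with a reward at the far end, a simple temporal-difference value update, and a hypothetical helper called `td_sweep`. The sketch applies the same update to each step of one run, once in the order the steps happened and once in reverse.

```python
def td_sweep(values, path, reward, order, alpha=1.0, gamma=0.9):
    """Apply one temporal-difference update per step of `path`.

    Illustrative only: `alpha` (learning rate) and `gamma` (discount)
    are assumed values, not parameters taken from the study.
    """
    values = list(values)  # work on a copy
    transitions = list(zip(path[:-1], path[1:]))
    if order == "reverse":
        transitions.reverse()  # replay the run from the reward backward
    goal = path[-1]
    for state, next_state in transitions:
        # The reward is only received on the step that reaches the goal.
        r = reward if next_state == goal else 0.0
        target = r + gamma * values[next_state]
        values[state] += alpha * (target - values[state])
    return values


track = list(range(6))   # positions 0..5, reward at position 5 (assumed)
start = [0.0] * len(track)

forward = td_sweep(start, track, reward=1.0, order="forward")
reverse = td_sweep(start, track, reward=1.0, order="reverse")

print("forward:", forward)
print("reverse:", reverse)
```

In this sketch, a single forward pass assigns value only to the last position before the reward, because the earlier steps are updated before anything is known about the outcome. A single reverse pass starts at the reward and carries its value back along the whole route, so every position on the way is credited at once.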