A novel Large Memory Model (LM2) architecture augments the Transformer with an auxiliary memory module, yielding large gains on multi-hop inference and numerical reasoning tasks. On the BABILong benchmark it averages a 37.1% improvement over the memory-augmented RMT baseline and 86.3% over Llama-3.2, while maintaining strong performance on general tasks.
https://arxiv.org/abs/2502.06049
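
For intuition, here is a minimal PyTorch sketch of what such an auxiliary memory module could look like: a learned bank of memory slots that token states read from via cross-attention and write back to through learned gates. This is an illustrative reconstruction from the abstract, not the authors' code; all module names, gate choices, and dimensions are hypothetical.

```python
import torch
import torch.nn as nn

class MemoryModule(nn.Module):
    """Sketch of an auxiliary memory bank read/written via cross-attention."""

    def __init__(self, d_model: int, n_slots: int = 16, n_heads: int = 4):
        super().__init__()
        # Learned initial memory: n_slots vectors of width d_model (hypothetical sizes).
        self.init_memory = nn.Parameter(torch.randn(n_slots, d_model) * 0.02)
        # Read path: token states query the memory slots.
        self.read_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Write path: memory slots query the token states.
        self.write_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.input_gate = nn.Linear(d_model, d_model)   # how much memory to inject
        self.forget_gate = nn.Linear(d_model, d_model)  # how much old memory to keep

    def forward(self, x, mem=None):
        # x: (batch, seq_len, d_model) hidden states from a decoder block.
        if mem is None:
            mem = self.init_memory.unsqueeze(0).expand(x.size(0), -1, -1)
        # Read: retrieve from memory and inject into the residual stream, gated.
        read, _ = self.read_attn(x, mem, mem)
        x_out = x + torch.sigmoid(self.input_gate(x)) * read
        # Write: blend new information into the slots through a forget gate.
        write, _ = self.write_attn(mem, x, x)
        g = torch.sigmoid(self.forget_gate(mem))
        mem_new = g * mem + (1.0 - g) * write
        return x_out, mem_new

# Toy usage: process two segments, carrying memory state across them.
if __name__ == "__main__":
    block = MemoryModule(d_model=64)
    seg1 = torch.randn(2, 10, 64)
    seg2 = torch.randn(2, 10, 64)
    out1, mem = block(seg1)
    out2, mem = block(seg2, mem)
    print(out2.shape, mem.shape)  # (2, 10, 64) and (2, 16, 64)
```

Carrying the gated memory state across segments, rather than relying on attention over the full context, is what lets this style of model keep reasoning over inputs far longer than its window.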
#aiarchitecture #machinelearning #memorysystems #performanceanalysis #neuralnetworks