New independent evaluations reveal that Meta's latest Llama 4 models, Maverick and Scout, perform well in standard tests but struggle with complex long-context tasks.

The article "Meta's Llama 4 models show promise on standard tests, but struggle with long-context tasks" appeared first on THE DECODER.