AI & ML
FACTS Grounding: A new benchmark for evaluating the factuality of large language models
Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in pro…
Why it matters
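This signals a broader shift in benchmarking, toward measuring how accurately LLMs ground their responses rather than only how capable they are. The real question is whether FACTS Grounding moves the needle for practitioners.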