The internet search engine of the future will be powered by artificial intelligence. One can already choose from a host of AI-powered or AI-enhanced search engines—though their reliability often still leaves much to be desired. However, a team of computer scientists at the University of Massachusetts Amherst recently published and released a novel system for evaluating the reliability of AI-generated searches.
Called “eRAG,” the method is a way of putting the AI and search engine in conversation with each other, then evaluating the quality of search engines for AI use. The work is published as part of the Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval.
“All of the search engines that we’ve always used were designed for humans,” says Alireza Salemi, a graduate student in the Manning College of Information and Computer Sciences at UMass Amherst and the paper’s lead author.
“They work pretty well when the user is a human, but the search engine of the future’s main user will be one of the AI Large Language Models (LLMs), like ChatGPT. This means that we need to completely redesign the way that search engines work, and my research explores how LLMs and search engines can learn from each other.”
The basic problem that Salemi and the senior author of the research, Hamed Zamani, associate professor of information and computer sciences at UMass Amherst, confront is that humans and LLMs have very different informational needs and consumption behavior.
For instance, if you can’t quite remember the title and author of that new book that was just published, you can enter a series of general search terms, such as, “what is the new spy novel with an environmental twist by that famous writer,” and then narrow the results down, or run another search as you remember more information (the author is a woman who wrote the novel “Flamethrowers”), until you find the correct result (“Creation Lake” by Rachel Kushner—which Google returned as the third hit after following the process above).
But that’s how humans work, not LLMs. They are trained on specific, enormous sets of data, and anything that is not in that data set—like the new book that just hit the stands—is effectively invisible to the LLM.
Furthermore, LLMs are not particularly reliable with hazy requests, because the LLM needs to be able to ask the search engine for more information; but to do so, it needs to know what additional information to ask for.
Computer scientists have devised a way to help LLMs evaluate and choose the information they need, called “retrieval-augmented generation,” or RAG. RAG is a way of augmenting LLMs with the result lists produced by search engines. The question, of course, is how to evaluate how useful the retrieved results are to the LLM.
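As a rough illustration of what "augmenting an LLM with a result list" can mean in practice, the sketch below pastes retrieved passages into the prompt that is sent to the model. The function name `build_rag_prompt`, the prompt wording and the two toy passages are assumptions made for this example; they are not taken from the paper or from any particular RAG system.

```python
from typing import List


def build_rag_prompt(question: str, results: List[str], max_results: int = 3) -> str:
    """Combine a question with the top search results into a single prompt.

    Hypothetical helper: real systems differ in how they format and
    truncate retrieved passages.
    """
    # Number the passages so the model (and a human reader) can refer to them.
    passages = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(results[:max_results], start=1)
    )
    return (
        "Use the numbered passages to answer the question.\n\n"
        f"Passages:\n{passages}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Toy result list standing in for what a search engine might return.
results = [
    "Creation Lake is a novel by Rachel Kushner.",
    "Rachel Kushner also wrote the novel Flamethrowers.",
]
prompt = build_rag_prompt("Who wrote Creation Lake?", results)
print(prompt)
```

The resulting string would then be passed to whichever LLM the agent uses; the quality of the answer depends heavily on which passages the search engine put in the list.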
So far, researchers have come up with three main ways to do this. The first is to crowdsource relevance judgments from a group of humans. However, this is a very costly method, and humans may not have the same sense of relevance as an LLM.
The second is to have an LLM generate the relevance judgments, which is far cheaper, but the accuracy suffers unless one has access to one of the most powerful LLMs. The third way, which is the gold standard, is to evaluate the end-to-end performance of retrieval-augmented LLMs.
But even this third method has its drawbacks. “It’s very expensive,” says Salemi, “and there are some concerning transparency issues. We don’t know how the LLM arrived at its results; we just know that it either did or didn’t.” Furthermore, there are a few dozen LLMs in existence right now, and each of them works in a different way, returning different answers.
Instead, Salemi and Zamani have developed eRAG, which is similar to the gold-standard method but far more cost-effective: it is up to three times faster, uses 50 times less GPU power and is nearly as reliable.
“The first step towards developing effective search engines for AI agents is to accurately evaluate them,” says Zamani. “eRAG provides a reliable, relatively efficient and effective evaluation methodology for search engines that are being used by AI agents.”
In brief, eRAG works like this: a person uses an LLM-powered AI agent to accomplish a task. The AI agent submits a query to a search engine, and the search engine returns a fixed number of results (say, 50) for the LLM to consume.
eRAG runs each of the 50 documents through the LLM individually and scores the output against the correct answer, which shows which specific documents the LLM found useful for generating the correct output. These document-level scores are then aggregated to evaluate the quality of the search engine for the AI agent.
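The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the interface of the published eRAG package: `toy_generate` is a made-up stand-in for an LLM call, `exact_match` is one possible downstream metric, the four documents are invented, and mean and precision-at-k are just two of the ways document-level scores could be aggregated.

```python
from statistics import mean
from typing import Callable, List


def exact_match(prediction: str, expected: str) -> float:
    """Assumed downstream metric: 1.0 if the output matches the expected answer."""
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def score_documents(
    query: str,
    documents: List[str],
    expected: str,
    generate: Callable[[str, str], str],
    metric: Callable[[str, str], float] = exact_match,
) -> List[float]:
    """Feed each retrieved document to the generator on its own and score the output."""
    return [metric(generate(query, doc), expected) for doc in documents]


def precision_at_k(scores: List[float], k: int) -> float:
    """Aggregate binary document-level scores over the top-k ranked results."""
    top = scores[:k]
    return sum(top) / len(top) if top else 0.0


def toy_generate(query: str, document: str) -> str:
    """Hypothetical stand-in for an LLM: it only 'knows' what the document says."""
    return "Creation Lake" if "Creation Lake" in document else "unknown"


# Invented ranked result list, in the order a search engine might return it.
documents = [
    "Creation Lake is a spy novel by Rachel Kushner.",
    "Rachel Kushner wrote the novel Flamethrowers.",
    "Reviewers discuss the environmental themes of Creation Lake.",
    "A travel guide to lakes and rivers.",
]

scores = score_documents(
    "What is the new spy novel with an environmental twist?",
    documents,
    expected="Creation Lake",
    generate=toy_generate,
)
print(scores)                     # one score per document, in ranked order
print(mean(scores))               # overall share of useful documents
print(precision_at_k(scores, 2))  # usefulness of the top two results
```

Because each document is scored by the same LLM that will consume it, the per-document scores reflect what is useful to that model rather than what a human judge would consider relevant.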
While there is currently no search engine that can work with all the major LLMs that have been developed, the accuracy, cost-effectiveness and ease with which eRAG can be implemented are a major step toward the day when all our search engines run on AI.
This research has been awarded a Best Short Paper Award by the Association for Computing Machinery’s International Conference on Research and Development in Information Retrieval (SIGIR 2024). A public Python package containing the code for eRAG is available at https://github.com/alirezasalemi7/eRAG.
More information: Alireza Salemi et al, Evaluating Retrieval Quality in Retrieval-Augmented Generation, Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (2024). DOI: 10.1145/3626772.3657957