Google Research Paper Reveals a Shortcoming in Search

A recent Google research paper on Long Form Question Answering illustrates how difficult it is to answer questions that need longer, more nuanced answers. While the researchers were able to improve the state of the art for this kind of question answering, they also acknowledged that their results still need significant improvement.
I read this research paper last month when it was published and have been wanting to share it because it focuses on solving a shortcoming in search that isn’t discussed much at all.
I hope you find it as fascinating as I did!
What Search Engines Get Right
This research centers on Long Form Open-Domain Question Answering, an area of Natural Language Processing that continues to see improvements.
What search engines are good at is called Factoid Open-Domain Question Answering, or simply Open-Domain Question Answering.
Open-Domain Question Answering is a task in which an algorithm responds to a question with an answer in natural language.
What color is the sky? The sky is blue.
Long Form Question Answering (LFQA)
The research paper states that Long-form Question Answering (LFQA) is important but challenging, and that progress on this kind of question answering is not as far along as it is for Open-domain Question Answering.
According to the research paper:
“Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP) that involves retrieving documents relevant to a given question and using them to generate an elaborate paragraph-length answer.
While there has been remarkable recent progress in factoid open-domain question answering (QA), where a short phrase or entity is enough to answer a question, much less work has been done in the area of long-form question answering.
LFQA is nevertheless an important task, especially because it provides a testbed to measure the factuality of generative text models. But, are current benchmarks and evaluation metrics really suitable for making progress on LFQA?”
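To picture what that retrieve-then-generate pipeline looks like, here is a minimal sketch in Python. It is purely illustrative: the tiny corpus, the word-overlap retriever, and the generate_answer() stub are hypothetical placeholders, not the learned retriever and large generative model the researchers actually use.

```python
# Minimal sketch of a retrieve-then-generate (LFQA-style) pipeline.
# The tiny corpus, the word-overlap retriever, and generate_answer()
# are illustrative placeholders, not the system from the paper.

import re
from collections import Counter

CORPUS = [
    "Fire is a rapid oxidation process that releases heat and light.",
    "Combustion requires fuel, oxygen, and an ignition source.",
    "The visible flame is glowing gas and incandescent soot particles.",
]

def tokenize(text: str) -> list[str]:
    """Lowercase and split text into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the question."""
    question_words = Counter(tokenize(question))
    def overlap(doc: str) -> int:
        return sum(question_words[word] for word in tokenize(doc))
    return sorted(corpus, key=overlap, reverse=True)[:k]

def generate_answer(question: str, evidence: list[str]) -> str:
    """Stand-in for a generative model that would write a
    paragraph-length answer conditioned on the retrieved evidence."""
    return f"Q: {question}\nDraft answer based on evidence: " + " ".join(evidence)

if __name__ == "__main__":
    question = "What exactly is fire, in detail?"
    print(generate_answer(question, retrieve(question, CORPUS)))
```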
Search Engine Question Answering
Question answering by search engines typically consists of a searcher asking a question and the search engine returning a relatively short text of information.
A question like “What’s the phone number of XYZ store?” is a typical example of what search engines are good at answering, especially because the answer is objective and not subjective.
Long Form Question Answering is harder because the questions demand answers in the form of paragraphs, not short texts.
Facebook is also working on long form question answering and came up with interesting solutions, like using a question and answer subreddit called Explain Like I’m 5 to build a dataset called ELI5. Facebook also admits that there is more work to do. (Introducing Long-form Question Answering)
Examples of Long Form Questions
Once you read these examples of long form questions, it becomes clearer how search engines have trained us to ask a limited set of queries. It might even seem shocking how almost infantile our questions are compared to long form questions.
The Google research paper offers these examples of long form questions:
- What goes on in those tall tower buildings owned by major banks?
- What exactly is fire, in detail? How can light and heat come from something we can’t really touch?
- Why do Britain and other English empire countries still bow to monarchs? What real purpose does the queen serve?
Facebook offers these examples of long form questions:
- Why are some restaurants better than others if they serve basically the same food?
- What are the differences between bodies of water like lakes, rivers, and seas?
- Why do we feel more jet lagged when traveling east?
Are Searchers Trained to Ask Short Questions for Factoids?
Google (and Bing) have a difficult time answering these long form types of questions. This may impact their ability to surface content that provides complex answers for complex questions.
Maybe people don’t ask these questions because poor responses have trained them not to. But if search engines were able to answer these kinds of questions, people would begin to ask them.
It’s a whole wide world of questions and answers that are missing from our search experience.
If I shorten the question “Why are some restaurants better than others if they serve basically the same food?” to “Why are some restaurants better than others?”, Google and Bing still fail to provide an adequate answer.
The top Google search result for that question comes from the (HTTP insecure) blog of a Canadian Indian restaurant.
Google cites this section of the restaurant’s blog post in the SERP:
“People pay for the overall experience and not just the food and that is why some restaurants charge much more than others. Restaurant customers expect the prices to reflect the type of food, level of service and the overall atmosphere of the restaurant.”
What if the person had Popeye’s Fried Chicken versus KFC in mind when they asked that question?
There’s a certain amount of subjectivity that can creep into answering these kinds of questions, which demand a long and coherent answer.
I can’t help thinking that there’s a better answer out there somewhere. But Google and Bing are unable to surface that kind of content.
Google Uses Signals to Identify High Quality Content
In a How Search Works explainer published in September 2020, Google admits that it does not use the content itself to determine whether it is reliable or trustworthy.
Google explains that it relies on signals instead, in a blog post titled “How Google Delivers Reliable Information in Search”:
“…when it comes to high-quality, trustworthy information… We often can’t tell from the words or images alone if something is exaggerated, incorrect, low-quality or otherwise unhelpful.
Instead, search engines largely understand the quality of content through what are commonly called “signals.” You can think of these as clues about the characteristics of a page that align with what humans might interpret as high quality or reliable.
For example, the number of quality pages that link to a particular page is a signal that a page may be a trusted source of information on a topic.”
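As a rough illustration of what a link-based signal might look like, here is a toy Python sketch that counts how many distinct pages link to each page in a small, made-up link graph. Google’s real signals are far more sophisticated; the URLs and the counting logic below are assumptions for the example only.

```python
# Toy illustration of a link-based quality signal: count how many
# distinct pages link to each page. The link graph is invented for
# this example; real search engines combine many, far richer signals.

from collections import defaultdict

# page -> pages it links out to (hypothetical URLs)
OUTLINKS = {
    "restaurant-blog.example/why-prices-vary": ["guide.example/dining"],
    "news.example/food": ["guide.example/dining"],
    "forum.example/thread-42": ["guide.example/dining",
                                "restaurant-blog.example/why-prices-vary"],
}

def inbound_link_counts(outlinks: dict[str, list[str]]) -> dict[str, int]:
    """Return how many distinct pages link to each target page."""
    sources_by_page: dict[str, set[str]] = defaultdict(set)
    for source, targets in outlinks.items():
        for target in targets:
            sources_by_page[target].add(source)
    return {page: len(sources) for page, sources in sources_by_page.items()}

if __name__ == "__main__":
    counts = inbound_link_counts(OUTLINKS)
    for page, count in sorted(counts.items(), key=lambda item: -item[1]):
        print(f"{count} inbound link(s) -> {page}")
```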
Unfortunately, that part of Google’s algorithm is unable to provide a correct answer to these kinds of long form questions.
And that’s an interesting and important fact to understand, because it helps to be aware of the limits of search technology today.
What About Passage Ranking?
Passage ranking is about surfacing the short answer buried inside a long web page, for ordinary short queries that need an objective answer.
Martin Splitt used the example of finding a relevant answer about tomatoes in a web page that is mostly about gardening in general.
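One rough way to picture it in code: split the long page into passages and score each one against the query, so the tomato paragraph can surface from a page that is mostly about general gardening. The word-overlap scoring in this sketch is an assumption for illustration, not Google’s actual passage ranking.

```python
# Simplified sketch of passage ranking: score each passage of a long
# page against a short query and surface the best match. The naive
# word-overlap scoring is a stand-in for whatever Google actually uses.

GARDENING_PAGE_PASSAGES = [
    "Gardening is a rewarding hobby that gets you outdoors year round.",
    "Good soil preparation and regular watering matter for most plants.",
    "Tomatoes need at least six hours of full sun and staking as they grow.",
]

def best_passage(query: str, passages: list[str]) -> str:
    """Return the passage sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(passages,
               key=lambda passage: len(query_words & set(passage.lower().split())))

if __name__ == "__main__":
    print(best_passage("how much sun do tomatoes need", GARDENING_PAGE_PASSAGES))
```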
Passage ranking cannot solve the hard questions that Google currently cannot answer.
Both Google and Bing generally fail to answer LFQA-type queries because this is an area where search engines still need to improve.
Hurdles to Progress
The research paper itself acknowledges that shortcoming in the title:
“Hurdles to Progress in Long-form Question Answering“
The research paper concludes by stating that its approach to solving this task “achieves state of the art performance” but that there are still issues to resolve and more research that needs to be done.
This is how the paper concludes:
“We present a “retrieval augmented” generation system that achieves state of the art performance on the ELI5 long-form question answering dataset. However, an in-depth analysis reveals several issues not only with our model, but also with the ELI5 dataset & evaluation metrics. We hope that the community works towards solving these issues so that we can climb the right hills and make meaningful progress.”
Questions and Speculation
It’s not possible to provide a definitive answer, but one has to wonder whether there are web pages out there missing out on traffic because neither Google nor Bing is able to surface their long form content in answer to long form questions.
Also, some publishers mistakenly over-write their articles in a quest to be authoritative. Is it possible that those publishers are over-writing themselves out of search traffic from queries that demand shorter answers, since search engines can’t deliver the nuanced answers available in longer documents?
There’s no way of knowing these answers for certain.
But one thing this research paper makes clear is that long-form question answering is a shortcoming in search engines today.
Citations
- Google AI Blog Post: Progress and Challenges in Long-Form Open-Domain Question Answering
- Research Paper (PDF): Hurdles to Progress in Long-form Question Answering
- Facebook Web Page About LFQA: Introducing Long-form Question Answering
Google’s Next-Gen AI Chatbot, Gemini, Faces Delays: What to Expect When It Finally Launches

In an unexpected turn of events, Google has chosen to postpone the much-anticipated debut of its revolutionary generative AI model, Gemini. Initially poised to make waves this week, Gemini’s unveiling has now been rescheduled for early next year, in January.
Gemini is set to redefine the landscape of conversational AI, representing Google’s most potent endeavor in this domain to date. Positioned as a multimodal AI chatbot, Gemini boasts the capability to process diverse data types. This includes a unique proficiency in comprehending and generating text, images, and various content formats, even going so far as to create an entire website based on a combination of sketches and written descriptions.
Originally, Google had planned an elaborate series of launch events spanning California, New York, and Washington. Regrettably, these events have been canceled due to concerns about Gemini’s responsiveness to non-English prompts. According to anonymous sources cited by The Information, Google’s Chief Executive, Sundar Pichai, personally decided to postpone the launch, acknowledging the importance of global support as a key feature of Gemini’s capabilities.
Gemini is expected to surpass the renowned ChatGPT, powered by OpenAI’s GPT-4 model, and preliminary private tests have reportedly shown promising results. Fueled by significantly greater computing power, Gemini has reportedly outpaced GPT-4 in compute, measured in FLOPS (floating point operations per second), thanks to its access to a multitude of high-end AI accelerators through the Google Cloud platform.
SemiAnalysis, a research firm that publishes on Substack, wrote in an August blog post that Gemini appears poised to “blow OpenAI’s model out of the water.” The extensive compute power at Google’s disposal has evidently contributed to Gemini’s superior performance.
Google’s Vice President and Manager of Bard and Google Assistant, Sissie Hsiao, offered insights into Gemini’s capabilities, citing examples like generating novel images in response to specific requests, such as illustrating the steps to ice a three-layer cake.
While Google’s current generative AI offering, Bard, has showcased noteworthy accomplishments, it has struggled to achieve the same level of consumer awareness as ChatGPT. Gemini, with its unparalleled capabilities, is expected to be a game-changer, demonstrating impressive multimodal functionalities never seen before.
During the initial announcement at Google’s I/O developer conference in May, the company emphasized Gemini’s multimodal prowess and its developer-friendly nature. An application programming interface (API) is under development, allowing developers to seamlessly integrate Gemini into third-party applications.
As the world awaits the delayed unveiling of Gemini, the stakes are high, with Google aiming to revolutionize the AI landscape and solidify its position as a leader in generative artificial intelligence. The postponed launch only adds to the anticipation surrounding Gemini’s eventual debut in the coming year.
Google Sends Bard to Summer School for Math and Coding

Google is stepping up its AI efforts this summer by sending Bard, its high-profile chatbot, to summer school. The aim? To boost the bot’s math and coding smarts. These developments are excellent news: when Bard first debuted, it was admittedly not a finished product. But Google has been steadily plugging away at it, and has now implemented implicit code execution for logical prompts, plus a handy Google Sheets integration, to take it to the next level.
Thanks to implicit code execution, Bard can respond to inquiries requiring calculation or computation by running Python code snippets in the background. What’s more, coders can take this generated code and modify it for their own projects. Though Google stops short of guaranteeing the accuracy of Bard’s answers, this feature is said to improve the accuracy of responses to math and word problems by an impressive 30%.
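To make that concrete, here is the kind of short Python snippet a chatbot might generate and execute behind the scenes for a calculation-style word problem. It is an illustrative guess at the pattern, not code produced by Bard itself.

```python
# Example of the kind of short Python snippet a chatbot might generate
# and run behind the scenes for a word problem such as:
# "A train travels 180 km in 2.5 hours. What is its average speed,
#  and how far would it go in 4 hours at that speed?"

distance_km = 180
time_hours = 2.5

average_speed_kmh = distance_km / time_hours        # 72.0 km/h
distance_in_4_hours_km = average_speed_kmh * 4      # 288.0 km

print(f"Average speed: {average_speed_kmh} km/h")
print(f"Distance covered in 4 hours: {distance_in_4_hours_km} km")
```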
In addition to this, Bard can now export directly to Sheets when asked about tables. So, you don’t need to worry about copying and pasting, which comes with the risk of losing formatting or data.
From the company’s I/O keynote address, it is clear that Google is focused on making the most of what Bard can offer. As the company continues to speak highly of the chatbot, we can expect more features and capabilities as the summer goes on.
Google Bard vs. ChatGPT: which is the better AI chatbot?

Google Bard and ChatGPT are two of the most prominent artificial intelligence (AI) chatbots available in 2023. But which is better? Both offer natural language responses to natural language inputs, using machine learning and millions of data points to craft useful, informative responses. Most of the time. These AI tools aren’t perfect yet, but they point to an exciting future of AI assistant search and learning tools that will make information all the more readily available.
As similar as these chatbots are, they also have some distinct differences. Here’s how ChatGPT and Google Bard measure up against one another.
Which is better, Google Bard or ChatGPT?
This is a tricky question to answer because, at the time of writing, you can only use Google Bard if you’re part of a select group of early beta testers. As for its competition, you can use ChatGPT right now, completely for free. You may have to contend with a waitlist, but if you want to skip it, there’s a paid Plus version for those who want a more complete tool.
Still, when Google Bard becomes more widely available, it should offer credible competition for ChatGPT. Both use natural language models — Google Bard uses Google’s internal LaMDA (Language Model for Dialogue Applications), whereas ChatGPT uses an older GPT-3 language model. Google Bard bases its responses to questions on more recent data, with ChatGPT mainly trained on data that was available prior to 2021. This is similar to how Microsoft’s Bing Chat works.
We’ll have to reserve judgment on which is the more capable AI chatbot until we get time to play with Google Bard ourselves, but it looks set to be a close contest when it is more readily available.
Are Google Bard and ChatGPT available yet?
As mentioned, ChatGPT is available in free and paid-for tiers. You might have to sit in a queue for the free version for a while, but anyone can play around with its capabilities.
Google Bard is currently only available to limited beta testers and is not available to the wider public.

What’s the difference between Google Bard and ChatGPT?
ChatGPT and Google Bard are very similar natural language AI chatbots, but they have some differences, and are designed to be used in slightly different ways — at least for now. ChatGPT has been used for answering direct questions with direct answers, mostly correctly, but it’s caused a lot of consternation among white collar workers, like writers, SEO advisors, and copy editors, since it has also demonstrated an impressive ability to write creatively — even if it has faced a few problems with accuracy and plagiarism.
Still, Microsoft has integrated ChatGPT into its Bing search engine to give users the ability to ask direct questions of the search engine, rather than searching for terms or keywords to find the best results. It has also built it into its Teams communications tool, and it’s coming to the Edge browser in a limited form. The Opera browser has also pledged to integrate ChatGPT in the future.
| ChatGPT | Google Bard |
| --- | --- |
| Accessible through the ChatGPT site. Only text responses are returned to queries. | Integrated with Google Search. You only need to change a Google setting to get your regular search results when using Google Bard AI, and vice versa. |
| Produces answers from its training data, which dates from 2021 and before. | Google’s Apprentice Bard AI will be able to answer real-time questions. |
| Based on GPT (Generative Pre-trained Transformer). | Based on LaMDA (Language Model for Dialogue Applications). |
| Has a free plan and a paid plan (ChatGPT Plus). | Service is free. |
| Has a built-in plagiarism tool, the GPT-2 Output Detector. | No built-in plagiarism detection tool. |
| Available now. | Still in beta testing. |
Google Bard was mainly designed to augment Google’s own search tool; however, it is also destined to become an automated support tool for businesses that can’t afford human support teams, offered to customers through a trained AI responder. It is likely to be integrated into the Chrome browser and its Chromium derivatives before long, and Google is also expected to open up Google Bard to third-party developers in the future.
Under the hood, Google Bard uses Google’s LaMDA language model, while ChatGPT uses its own GPT-3 model. ChatGPT is based on slightly older data, restricted in its current GPT-3 model to data collected prior to 2022, while Google Bard is also trained on more recent data. However, that doesn’t necessarily make it more accurate, as Google Bard has faced problems with incorrect answers to questions, even in its initial unveiling.
ChatGPT also has a built-in plagiarism checker, which Google Bard lacks, while Google Bard doesn’t yet have the creative applications of ChatGPT.