GOOGLE

Google Research Paper Reveals a Shortcoming in Search

Published

April 15, 2021

A recent Google research paper on Long Form Question Answering illustrates how difficult it is to answer questions that need longer and nuanced answers. While the researchers were able to improve the state of the art of this kind of question answering, they also admitted that their results needed significant improvements.

I read this research paper last month when it was published and have been wanting to share it because it focuses on solving a shortcoming in search that isn’t discussed much at all.

I hope you find it as fascinating as I did!

What Search Engines Get Right

This research centers on Long Form Open-Domain Question Answering, an area in which natural language processing continues to see improvements.

What search engines are good at is called Factoid Open-domain Question Answering, or simply Open-domain Question Answering.

Open Domain Question Answering is a task wherein an algorithm responds with an answer to a question in natural language.

What color is the sky? The sky is blue.

Long Form Question Answering (LFQA)

The research paper states that Long-form Question Answering (LFQA) is important but challenging, and that progress on it lags behind progress on factoid Open-domain Question Answering.

According to the research paper:

“Open-domain long-form question answering (LFQA) is a fundamental challenge in natural language processing (NLP) that involves retrieving documents relevant to a given question and using them to generate an elaborate paragraph-length answer.
While there has been remarkable recent progress in factoid open-domain question answering (QA), where a short phrase or entity is enough to answer a question, much less work has been done in the area of long-form question answering.
LFQA is nevertheless an important task, especially because it provides a testbed to measure the factuality of generative text models. But, are current benchmarks and evaluation metrics really suitable for making progress on LFQA?”
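To make the two steps in that description concrete (retrieve relevant documents, then generate from them), here is a minimal Python sketch. It is not the paper's system: the word-overlap scoring, the function names, and the sample documents are all assumptions chosen for illustration, and the generation step is reduced to assembling the prompt that a generator model would receive.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(question: str, document: str) -> int:
    """Count question tokens that also occur in the document (with multiplicity)."""
    q = Counter(tokenize(question))
    d = Counter(tokenize(document))
    return sum(min(count, d[token]) for token, count in q.items())


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k highest-scoring documents; ties keep their input order."""
    ranked = sorted(documents, key=lambda doc: overlap_score(question, doc), reverse=True)
    return ranked[:k]


def build_prompt(question: str, retrieved: list[str]) -> str:
    """Assemble the text a generator model would be conditioned on."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Question: {question}\nContext:\n{context}\nAnswer:"


# Hypothetical mini-corpus, invented for this example.
DOCS = [
    "Fire is a chemical reaction that releases heat and light.",
    "Banks rent office space in tall towers.",
    "Jet lag is worse when traveling east.",
]
QUESTION = "What exactly is fire, and why does it give off light and heat?"
print(build_prompt(QUESTION, retrieve(QUESTION, DOCS)))
```

A real LFQA system would swap the overlap score for a learned retriever and hand the prompt to a generative model. The hard part the paper describes is whether the generated paragraph actually uses what was retrieved.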

Search Engine Question Answering

Question answering by search engines typically consists of a searcher asking a question and the search engine returning a relatively short text of information.

A question like “What’s the phone number of XYZ store?” is typical of what search engines are good at answering, especially because the answer is objective and not subjective.

Long Form Question Answering is harder because the questions demand answers in the form of paragraphs, not short texts.

Facebook is also working on long form question answering and came up with interesting solutions, such as building a dataset called ELI5 from the question and answer subreddit Explain Like I’m 5. Facebook also admits that there is more work to do. (Introducing Long-form Question Answering)

Examples of Long Form Questions

Once you read these examples of long form questions it’s going to be clearer how we’ve been trained by search engines to ask a limited set of queries. It might even seem shocking how almost infantile our questions are compared to long form questions.

The Google research paper offers these examples of long form questions:

What goes on in those tall tower buildings owned by major banks?
What exactly is fire, in detail? How can light and heat come from something we can’t really touch?
Why do Britain and other English empire countries still bow to monarchs? What real purpose does the queen serve?

Facebook offers these examples of long form questions:

Why are some restaurants better than others if they serve basically the same food?
What are the differences between bodies of water like lakes, rivers, and seas?
Why do we feel more jet lagged when traveling east?

Are Searchers Trained to Ask Short Questions for Factoids?

Google (and Bing) have a difficult time answering these long form types of questions. This may impact their ability to surface content that provides complex answers for complex questions.

Maybe people don’t ask these questions because poor responses have trained them not to. But if search engines were able to answer these kinds of questions, people would begin to ask them.

It’s a whole wide world of questions and answers that are missing from our search experience.

If I shorten the phrase “Why are some restaurants better than others if they serve basically the same food?” to “Why are some restaurants better than others?” Google and Bing still fail to provide an adequate answer.

The top Google search result for that question comes from the (insecure HTTP) blog of a Canadian Indian restaurant.

Google cites this passage from the Indian restaurant’s blog in the SERP:

“People pay for the overall experience and not just the food and that is why some restaurants charge much more than others. Restaurant customers expect the prices to reflect the type of food, level of service and the overall atmosphere of the restaurant.”

What if the person had Popeye’s Fried Chicken versus KFC in mind when they asked that question?

There’s a certain amount of subjectivity that can creep into answering these kinds of questions, which demand a long and coherent answer.

I can’t help thinking that there’s a better answer out there somewhere. But Google and Bing are unable to surface that kind of content.

Google Uses Signals to Identify High Quality Content

In a How Search Works explainer that Google published in September 2020, Google admits that it often cannot tell from the content alone whether a page is reliable or trustworthy.

Google explains that it uses signals in a blog post titled, “How Google Delivers Reliable Information in Search.”

“…when it comes to high-quality, trustworthy information… We often can’t tell from the words or images alone if something is exaggerated, incorrect, low-quality or otherwise unhelpful.
Instead, search engines largely understand the quality of content through what are commonly called “signals.” You can think of these as clues about the characteristics of a page that align with what humans might interpret as high quality or reliable.
For example, the number of quality pages that link to a particular page is a signal that a page may be a trusted source of information on a topic.”

Unfortunately, that part of Google’s algorithm is unable to provide a correct answer to these kinds of long form questions.

And that’s an interesting and important fact to understand because it helps to be aware of what the limits are to search technology today.

What About Passage Ranking?

Passage Ranking is about ranking long web pages that contain the short answers for normal short queries needing an objective answer.

Martin Splitt used the example of finding a relevant answer about tomatoes in a web page that is mostly about gardening in general.
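The tomato example can be sketched in a few lines of Python. This is only a toy illustration of the idea of scoring passages inside one page, not Google's method: the blank-line passage split, the shared-word score, and the sample page text are all assumptions made up for this example.

```python
import re


def tokens(text: str) -> set[str]:
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_passage(query: str, page: str) -> str:
    """Split a page on blank lines and return the passage sharing the most words with the query."""
    passages = [p.strip() for p in page.split("\n\n") if p.strip()]
    query_words = tokens(query)
    # max() returns the first passage in page order when scores are tied.
    return max(passages, key=lambda p: len(query_words & tokens(p)))


# Hypothetical gardening page, invented for this example.
PAGE = (
    "Good soil and regular watering matter for every garden bed.\n\n"
    "Tomatoes need plenty of sun each day and steady watering.\n\n"
    "Prune roses in late winter before new growth appears."
)
print(best_passage("how much sun do tomatoes need", PAGE))
```

Even in this toy form, the limitation is visible: the approach finds a short answer that already exists on the page. It does not compose a nuanced, paragraph-length answer.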

Passage ranking cannot solve the hard questions that Google currently cannot answer.

Both Google and Bing generally fail to answer LFQA type queries because this is an area that search engines still need to improve.

Hurdles to Progress

The research paper itself acknowledges that shortcoming in the title:

“Hurdles to Progress in Long-form Question Answering”

The research paper concludes by stating that its approach to solving this task “achieves state of the art performance” but that there are still issues to resolve and more research that needs to be done.

This is how the paper concludes:

“We present a “retrieval augmented” generation system that achieves state of the art performance on the ELI5 long-form question answering dataset. However, an in-depth analysis reveals several issues not only with our model, but also with the ELI5 dataset & evaluation metrics. We hope that the community works towards solving these issues so that we can climb the right hills and make meaningful progress.”

Questions and Speculation

It’s not possible to provide a definitive answer but one has to wonder if there are web pages out there that are missing out on traffic because both Google and Bing are not able to surface their long form content in answer to long form questions.

Also, some publishers mistakenly overwrite their articles in a quest to be authoritative. Is it possible that those publishers are overwriting themselves out of search traffic from queries that demand shorter answers, since search engines can’t deliver the nuanced answers available in longer documents?

There’s no way of knowing these answers for certain.

But one thing this research paper makes clear is that long-form question answering is a shortcoming in search engines today.

Citations

Google AI Blog Post
Progress and Challenges in Long-Form Open-Domain Question Answering

PDF Version of Research Paper
Hurdles to Progress in Long-form Question Answering

Facebook Web Page About LFQA
Introducing Long-form Question Answering

Searchenginejournal.com

AI

Exploring the Evolution of Language Translation: A Comparative Analysis of AI Chatbots and Google Translate

Published

February 26, 2024

Max

According to an article on PCMag, while Google Translate makes translating sentences into over 100 languages easy, regular users acknowledge that there’s still room for improvement.

In theory, large language models (LLMs) such as ChatGPT are expected to bring about a new era in language translation. These models consume vast amounts of text-based training data and real-time feedback from users worldwide, enabling them to quickly learn to generate coherent, human-like sentences in a wide range of languages.

However, despite the anticipation that ChatGPT would revolutionize translation, previous experience shows that such expectations are often wrong. To put these claims to the test, PCMag conducted a blind test, asking fluent speakers of eight non-English languages to evaluate the translation results from various AI services.

The test compared ChatGPT (both the free and paid versions) to Google Translate, as well as to other competing chatbots such as Microsoft Copilot and Google Gemini. The evaluation involved comparing the translation quality for two test paragraphs across different languages, including Polish, French, Korean, Spanish, Arabic, Tagalog, and Amharic.

In the first test conducted in June 2023, participants consistently favored AI chatbots over Google Translate. ChatGPT, Google Bard (now Gemini), and Microsoft Bing outperformed Google Translate, with ChatGPT receiving the highest praise. ChatGPT demonstrated superior performance in converting colloquialisms, while Google Translate often provided literal translations that lacked cultural nuance.

For instance, ChatGPT accurately translated colloquial expressions like “blow off steam,” whereas Google Translate produced more literal translations that failed to resonate across cultures. Participants appreciated ChatGPT’s ability to maintain consistent levels of formality and its consideration of gender options in translations.

The success of AI chatbots like ChatGPT can be attributed to reinforcement learning from human feedback (RLHF), which allows these models to learn from human preferences and produce culturally appropriate translations, particularly for non-native speakers. However, it’s essential to note that while AI chatbots outperformed Google Translate, they still had limitations and occasional inaccuracies.

In a subsequent test, PCMag evaluated different versions of ChatGPT, including the free and paid versions, as well as language-specific AI agents from OpenAI’s GPT Store. The paid version of ChatGPT, known as ChatGPT Plus, consistently delivered the best translations across various languages. However, Google Translate also showed improvement, performing surprisingly well compared to previous tests.

Overall, while ChatGPT Plus emerged as the preferred choice for translation, Google Translate demonstrated notable improvement, challenging the notion that AI chatbots are always superior to traditional translation tools.

Source: https://www.pcmag.com/articles/google-translate-vs-chatgpt-which-is-the-best-language-translator

GOOGLE

Google Implements Stricter Guidelines for Mass Email Senders to Gmail Users

Published

February 13, 2024

Entireweb News Bot

Beginning in April, senders who bombard Gmail users with unwanted mass emails will see a surge in message rejections unless they comply with the new Gmail email sender guidelines, Google cautions.

Fresh Guidelines for Dispatching Mass Emails to Gmail Inboxes

An explanatory article published by Forbes reported that new rules are being introduced to shield Gmail users from the deluge of unsolicited mass email. Initially, reports surfaced of some marketers receiving error notifications for messages sent to Gmail accounts. However, a Google representative clarified that these specific errors, denoted 550-5.7.56, were not new but stemmed from existing authentication requirements.

Moreover, Google has verified that commencing from April, they will initiate “the rejection of a portion of non-compliant email traffic, progressively escalating the rejection rate over time.” Google elaborates that, for instance, if 75% of the traffic adheres to the new email sender authentication criteria, then a portion of the remaining non-conforming 25% will face rejection. The exact proportion remains undisclosed. Google does assert that the implementation of the new regulations will be executed in a “step-by-step fashion.”
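The arithmetic in Google's example is easy to work through in Python. Because Google has not disclosed the actual proportion, the 10% rejection rate and the 10,000-message volume below are purely hypothetical placeholders, and the function name is ours.

```python
def rejected_messages(total: int, compliant_pct: int, rejection_pct: int) -> int:
    """Messages rejected when only a portion of non-compliant traffic is refused.

    Percentages are whole numbers (75 means 75%), which keeps the math in integers.
    """
    non_compliant = total * (100 - compliant_pct) // 100
    return non_compliant * rejection_pct // 100


# 10,000 messages, 75% compliant, and a HYPOTHETICAL 10% rejection rate
# applied to the non-compliant remainder (Google has not published the real rate).
print(rejected_messages(10_000, 75, 10))  # 2,500 non-compliant -> 250 rejected
```

As the rejection rate escalates over time, the second percentage grows while the first stays whatever the sender makes it. A sender who reaches 100% compliance has nothing left to reject.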

This cautious and methodical strategy seems to have already kicked off, with transient errors affecting a “fraction of their non-compliant email traffic” coming into play this month. Additionally, Google stipulates that bulk senders will be granted until June 1 to integrate “one-click unsubscribe” in all commercial or promotional correspondence.
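For readers wondering what “one-click unsubscribe” looks like in a message, the sketch below uses Python's standard `email` library. The article does not say how Google expects it to be implemented. The two headers shown are the ones defined by RFC 8058, which we assume is the mechanism in question. The addresses, subject, and unsubscribe URL are placeholders.

```python
from email.message import EmailMessage


def build_promotional_email(sender: str, recipient: str, unsubscribe_url: str) -> EmailMessage:
    """Build a promotional message carrying RFC 8058 one-click unsubscribe headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Monthly newsletter"
    # Where the mail client should send the unsubscribe request (placeholder URL).
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
    # Signals that a single POST to that URL unsubscribes the recipient.
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    msg.set_content("Hello from our newsletter.")
    return msg


message = build_promotional_email(
    "news@example.com",
    "reader@example.org",
    "https://example.com/unsubscribe?token=PLACEHOLDER",
)
print(message["List-Unsubscribe-Post"])
```

The headers alone are not enough. The server behind the URL also has to honor the POST request and actually remove the recipient.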

Exclusively Personal Gmail Accounts Subject to Rejection

These alterations exclusively affect bulk emails dispatched to personal Gmail accounts. Entities sending out mass emails, specifically those transmitting a minimum of 5,000 messages daily to Gmail accounts, will be mandated to authenticate outgoing emails and “refrain from dispatching unsolicited emails.” The 5,000 message threshold is tabulated based on emails transmitted from the same principal domain, irrespective of the employment of subdomains. Once the threshold is met, the domain is categorized as a permanent bulk sender.
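The point about subdomains is easier to see with numbers. The Python sketch below totals one day's messages by principal domain and flags any domain that reaches 5,000. It is an illustration only: taking the last two labels of the domain as the “principal domain” is a simplifying assumption (it would be wrong for suffixes such as .co.uk), the sender addresses and counts are invented, and the permanence of the bulk-sender status is not modeled.

```python
from collections import Counter

BULK_THRESHOLD = 5000  # daily message count named in the article


def principal_domain(sender_address: str) -> str:
    """Naively reduce a sender address to the last two labels of its domain."""
    domain = sender_address.rsplit("@", 1)[-1].lower()
    return ".".join(domain.split(".")[-2:])


def bulk_senders(daily_counts: dict[str, int]) -> set[str]:
    """Sum per-address daily counts by principal domain; return domains at or above the threshold."""
    totals = Counter()
    for address, count in daily_counts.items():
        totals[principal_domain(address)] += count
    return {domain for domain, total in totals.items() if total >= BULK_THRESHOLD}


# Hypothetical one-day send volumes to Gmail accounts.
COUNTS = {
    "news@mail.example.com": 3000,
    "offers@promo.example.com": 2500,
    "hello@small.org": 100,
}
print(bulk_senders(COUNTS))
```

Neither subdomain reaches 5,000 on its own, but together they send 5,500 messages from example.com, so the domain as a whole counts as a bulk sender.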

These guidelines do not apply to messages sent to Google Workspace accounts. All senders, however, including those who send from Google Workspace, must meet the updated requirements.

Augmented Security and Enhanced Oversight for Gmail Users

A Google spokesperson emphasized that these requisites are being rolled out to “fortify sender-side security and augment user control over inbox contents even further.” For the recipient, this translates to heightened trust in the authenticity of the email sender, thus mitigating the risk of falling prey to phishing attempts, a tactic frequently exploited by malevolent entities capitalizing on authentication vulnerabilities. “If anything,” the spokesperson concludes, “meeting these stipulations should facilitate senders in reaching their intended recipients more efficiently, with reduced risks of spoofing and hijacking by malicious actors.”

GOOGLE

Google’s Next-Gen AI Chatbot, Gemini, Faces Delays: What to Expect When It Finally Launches

Published

December 4, 2023

Max

In an unexpected turn of events, Google has chosen to postpone the much-anticipated debut of its revolutionary generative AI model, Gemini. Initially poised to make waves this week, the unveiling has now been rescheduled for early next year, specifically in January.

Gemini is set to redefine the landscape of conversational AI, representing Google’s most potent endeavor in this domain to date. Positioned as a multimodal AI chatbot, Gemini boasts the capability to process diverse data types. This includes a unique proficiency in comprehending and generating text, images, and various content formats, even going so far as to create an entire website based on a combination of sketches and written descriptions.

Originally, Google had planned an elaborate series of launch events spanning California, New York, and Washington. Regrettably, these events have been canceled due to concerns about Gemini’s responsiveness to non-English prompts. According to anonymous sources cited by The Information, Google’s Chief Executive, Sundar Pichai, personally decided to postpone the launch, acknowledging the importance of global support as a key feature of Gemini’s capabilities.

Gemini is expected to surpass the renowned ChatGPT, powered by OpenAI’s GPT-4 model, and preliminary private tests have shown promising results. Gemini is reported to have a significant advantage over GPT-4 in raw computing power, measured in FLOPS (floating-point operations per second), owing to its access to a multitude of high-end AI accelerators through the Google Cloud platform.

SemiAnalysis, a research firm affiliated with Substack Inc., expressed in an August blog post that Gemini appears poised to “blow OpenAI’s model out of the water.” The extensive compute power at Google’s disposal has evidently contributed to Gemini’s superior performance.

Google’s Vice President and Manager of Bard and Google Assistant, Sissie Hsiao, offered insights into Gemini’s capabilities, citing examples like generating novel images in response to specific requests, such as illustrating the steps to ice a three-layer cake.

While Google’s current generative AI offering, Bard, has showcased noteworthy accomplishments, it has struggled to achieve the same level of consumer awareness as ChatGPT. Gemini, with its unparalleled capabilities, is expected to be a game-changer, demonstrating impressive multimodal functionalities never seen before.

During the initial announcement at Google’s I/O developer conference in May, the company emphasized Gemini’s multimodal prowess and its developer-friendly nature. An application programming interface (API) is under development, allowing developers to seamlessly integrate Gemini into third-party applications.

As the world awaits the delayed unveiling of Gemini, the stakes are high, with Google aiming to revolutionize the AI landscape and solidify its position as a leader in generative artificial intelligence. The postponed launch only adds to the anticipation surrounding Gemini’s eventual debut in the coming year.
