Research Shows Tree Of Thought Prompting Better Than Chain Of Thought

Researchers discovered a way to defeat the safety guardrails in GPT4 and GPT4-Turbo, unlocking the ability to generate harmful and toxic content, essentially beating a large language model with another large language model.

The researchers found that tree-of-thought (ToT) reasoning, used to repeat and refine a line of attack, was effective for jailbreaking another large language model.

The ToT approach succeeded against GPT4, GPT4-Turbo, and PaLM-2 with a remarkably low number of queries: on average, fewer than thirty per jailbreak.

Tree Of Thoughts Reasoning

A Google research paper from around May 2022 introduced Chain of Thought prompting.

Chain of Thought (CoT) is a prompting strategy that makes a generative AI follow a sequence of steps in order to solve a problem and complete a task. The CoT method is often accompanied by examples that show the LLM how the steps work in a reasoning task.

So, rather than just asking a generative AI like ChatGPT to do a task, the chain of thought method instructs the AI to follow a path of reasoning that’s composed of a series of steps.

Tree of Thoughts (ToT) reasoning, sometimes referred to as Tree of Thought (singular), is essentially a variation on and improvement of CoT, but the two are different things.

The difference is that rather than prompting a generative AI to follow a single path of reasoning, ToT is built on a process that allows for multiple paths, so that the AI can stop, self-assess, and then come up with alternate steps.

Tree of Thoughts reasoning was introduced in May 2023 in a research paper titled Tree of Thoughts: Deliberate Problem Solving with Large Language Models (PDF).

The research paper describes Tree of Thought:

“…we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving.

ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices.

Our experiments show that ToT significantly enhances language models’ problem-solving abilities…”
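
To make the contrast between a single path and multiple paths concrete, here is a minimal sketch in Python. It is not code from the paper and does not call an LLM: a toy number puzzle stands in for the reasoning task, `propose` stands in for generating candidate thoughts, and `score` stands in for self-evaluation. All of those names are hypothetical. Setting `beam_width=1` keeps a single line of reasoning (chain-like), while a wider beam keeps several alternatives alive.

```python
from typing import List, Optional, Tuple

# A "thought" here is a partial solution: (current value, steps taken so far).
State = Tuple[int, Tuple[str, ...]]


def propose(state: State) -> List[State]:
    """Generate candidate next thoughts by applying one arithmetic step."""
    value, steps = state
    return [
        (value + 3, steps + ("+3",)),
        (value * 2, steps + ("*2",)),
        (value - 1, steps + ("-1",)),
    ]


def score(state: State, target: int) -> int:
    """Stand-in for self-evaluation: values closer to the target score higher."""
    return -abs(target - state[0])


def tree_search(start: int, target: int, beam_width: int, max_depth: int) -> Optional[State]:
    """Expand thoughts level by level, keeping only the best `beam_width` paths."""
    frontier: List[State] = [(start, ())]
    for _ in range(max_depth):
        candidates = [c for s in frontier for c in propose(s)]
        for candidate in candidates:
            if candidate[0] == target:
                return candidate
        # Keep the most promising paths and drop the rest.
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam_width]
    return None


# One path only: the greedy choice overshoots and runs out of steps.
print(tree_search(1, 10, beam_width=1, max_depth=3))  # None
# Several paths: an alternative that looked slightly worse early on succeeds.
print(tree_search(1, 10, beam_width=3, max_depth=3))  # (10, ('+3', '+3', '+3'))
```

In this toy, the single-path search commits to the best-looking step each time and cannot recover, whereas the wider search keeps a second-best option that turns out to reach the goal.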

Tree Of Attacks With Pruning (TAP)

This new method of jailbreaking large language models is called Tree of Attacks with Pruning (TAP). TAP uses two LLMs, one for attacking and the other for evaluating.

TAP is able to outperform other jailbreaking methods by significant margins, only requiring black-box access to the LLM.

A black box, in computing, is where one can see what goes into an algorithm and what comes out. But what happens in the middle is unknown, thus it’s said to be in a black box.

Tree of thoughts (ToT) reasoning is used against a targeted LLM like GPT-4 to repeatedly try different prompts, assess the results, and change course if an attempt is not promising.

This is called a process of iteration and pruning. Each prompting attempt is analyzed for its probability of success. If a path of attack is judged to be a dead end, the LLM will “prune” that path and begin another, better series of prompting attacks.

This is why it’s called a “tree”: rather than following the linear process of reasoning that is the hallmark of chain of thought (CoT) prompting, tree of thought prompting is non-linear because the reasoning process branches off into other areas of reasoning, much like a human might do.

The attacker issues a series of prompts, and the evaluator assesses the responses to those prompts. The evaluator then decides what the next path of attack will be by judging whether the current path is still relevant, and it also estimates the likely success of prompts that have not yet been tried.

What’s remarkable about this approach is that this process reduces the number of prompts needed to jailbreak GPT-4. Additionally, a greater number of jailbreaking prompts are discovered with TAP than with any other jailbreaking method.

The researchers observe:

“In this work, we present Tree of Attacks with Pruning (TAP), an automated method for generating jailbreaks that only requires black-box access to the target LLM.

TAP utilizes an LLM to iteratively refine candidate (attack) prompts using tree-of-thoughts reasoning until one of the generated prompts jailbreaks the target.

Crucially, before sending prompts to the target, TAP assesses them and prunes the ones unlikely to result in jailbreaks.

Using tree-of-thought reasoning allows TAP to navigate a large search space of prompts and pruning reduces the total number of queries sent to the target.

In empirical evaluations, we observe that TAP generates prompts that jailbreak state-of-the-art LLMs (including GPT4 and GPT4-Turbo) for more than 80% of the prompts using only a small number of queries. This significantly improves upon the previous state-of-the-art black-box method for generating jailbreaks.”
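
The point about pruning reducing the number of queries can be illustrated with a small, abstract sketch. This is not the TAP implementation: plain integers stand in for candidates, `refine` stands in for the step that proposes variations, `looks_promising` stands in for the cheap assessment done before anything is sent, and `query_target` stands in for the expensive call whose count we want to keep low. All names and the toy rules are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def search_with_pruning(
    seeds: List[int],
    refine: Callable[[int], Iterable[int]],
    looks_promising: Callable[[int], bool],
    query_target: Callable[[int], bool],
    max_depth: int,
) -> Tuple[Optional[int], int]:
    """Return (first successful candidate or None, number of target queries used)."""
    queries = 0
    frontier = list(seeds)
    for _ in range(max_depth):
        candidates = [c for s in frontier for c in refine(s)]
        # Prune BEFORE querying: discarded candidates cost no target queries.
        kept = [c for c in candidates if looks_promising(c)]
        for candidate in kept:
            queries += 1
            if query_target(candidate):
                return candidate, queries
        frontier = kept
    return None, queries


# Toy rules (hypothetical): success means hitting 18, and only divisors
# of 18 are considered promising enough to be worth a query.
refine = lambda n: (n + 1, n * 3)
is_goal = lambda n: n == 18

with_pruning = search_with_pruning([1], refine, lambda n: 18 % n == 0, is_goal, max_depth=4)
without_pruning = search_with_pruning([1], refine, lambda n: True, is_goal, max_depth=4)

print(with_pruning)     # (18, 7)
print(without_pruning)  # (18, 10)
```

Both runs find the same result, but the pruned run spends fewer queries on the expensive step because unpromising branches are never sent and never expanded.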

Tree Of Thought (ToT) Outperforms Chain Of Thought (CoT) Reasoning

Another interesting conclusion reached in the research paper is that, for this particular task, ToT reasoning outperforms CoT reasoning, even when pruning is added to the CoT method so that off-topic prompts are pruned and discarded.

ToT Underperforms With GPT 3.5 Turbo

The researchers discovered that GPT 3.5 Turbo didn’t perform well as the evaluator in TAP, revealing the limitations of GPT 3.5 Turbo. In fact, it performed exceedingly poorly: the success rate dropped from 84% to only 4.2%.

This is their observation about why GPT 3.5 underperforms:

“We observe that the choice of the evaluator can affect the performance of TAP: changing the evaluator from GPT4 to GPT3.5-Turbo reduces the success rate from 84% to 4.2%.

The reason for the reduction in success rate is that GPT3.5-Turbo incorrectly determines that the target model is jailbroken (for the provided goal) and, hence, preemptively stops the method.

As a consequence, the variant sends significantly fewer queries than the original method…”
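
The mechanism described in that quote, a judge that declares success too early and thereby ends the run, can be shown with a tiny sketch. The functions and numbers below are hypothetical and only illustrate the stopping logic. They are not taken from the paper.

```python
from typing import Callable, Tuple


def run_until_judged_success(
    judge: Callable[[int], bool],
    truly_succeeds: Callable[[int], bool],
    max_queries: int = 30,
) -> Tuple[int, bool]:
    """Return (queries sent, whether the run had really succeeded when it stopped)."""
    for query in range(1, max_queries + 1):
        # The loop stops as soon as the judge SAYS the goal is reached.
        if judge(query):
            return query, truly_succeeds(query)
    return max_queries, False


# Assumption for the toy: the goal is only really reached from query 10 onward.
truly_succeeds = lambda q: q >= 10
accurate_judge = truly_succeeds        # agrees with the ground truth
overeager_judge = lambda q: q >= 2     # reports success far too early

print(run_until_judged_success(accurate_judge, truly_succeeds))   # (10, True)
print(run_until_judged_success(overeager_judge, truly_succeeds))  # (2, False)
```

The overeager judge sends far fewer queries, but only because it stops before anything has actually succeeded, which matches the pattern the researchers describe.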

What This Means For You

While it’s amusing that the researchers use the ToT method to beat an LLM with another LLM, it also highlights the usefulness of ToT for generating surprising new directions in prompting in order to achieve higher levels of output.

TL;DR Takeaways:

  • Tree of Thought prompting outperformed Chain of Thought methods
  • GPT 3.5 performed significantly worse than GPT 4 in ToT
  • Pruning is a useful part of a prompting strategy
  • Research showed that ToT is superior to CoT in an intensive reasoning task like jailbreaking an LLM

Read the original research paper:

Tree of Attacks: Jailbreaking Black-Box LLMs Automatically (PDF)

Featured Image by Shutterstock/THE.STUDIO

Google Quietly Ends Covid-Era Rich Results

Google removed the Covid-era structured data associated with the Home Activities rich results that allowed online events to be surfaced in search since August 2020, publishing a mention of the removal in the search documentation changelog.

Home Activities Rich Results

The structured data for the Home Activities rich results allowed providers of online livestreams, pre-recorded events and online events to be findable in Google Search.

The original documentation has been completely removed from the Google Search Central webpages and now redirects to a changelog notation explaining that Home Activities rich results are no longer available for display.

The original purpose was to allow people to discover things to do from home while in quarantine, particularly online classes and events. Google’s rich results surfaced details of how to watch, a description of the activity, and registration information.

Providers of online events were required to use Event or Video structured data. Publishers and businesses that have this kind of structured data should be aware that this kind of rich result is no longer surfaced. It’s not necessary to remove the structured data if doing so is a burden, because publishing structured data that isn’t used for rich results won’t hurt anything.
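
For readers unsure what Event structured data for an online event looks like, here is a minimal sketch that builds a schema.org Event as JSON-LD in Python. The property names come from the schema.org vocabulary, but the article does not say which properties Google required for Home Activities, so the selection below is an assumption. The helper name and the example values are hypothetical.

```python
import json


def online_event_jsonld(name: str, description: str, start_iso: str, url: str) -> dict:
    """Build a schema.org Event describing an online-only event."""
    return {
        "@context": "https://schema.org",
        "@type": "Event",
        "name": name,
        "description": description,
        "startDate": start_iso,
        # Marks the event as taking place online rather than at a venue.
        "eventAttendanceMode": "https://schema.org/OnlineEventAttendanceMode",
        "location": {"@type": "VirtualLocation", "url": url},
    }


markup = online_event_jsonld(
    name="Beginner Yoga Livestream",           # hypothetical example values
    description="A 45-minute class you can follow from home.",
    start_iso="2020-09-01T18:00:00+00:00",
    url="https://example.com/live/yoga",
)

# JSON-LD is normally embedded in the page inside a script tag like this.
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```

Markup like this can stay on a page after the rich result is retired, since it simply goes unused.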

The changelog for Google’s official documentation explains:

“Removing home activity documentation
What: Removed documentation on home activity structured data.

Why: The home activity feature no longer appears in Google Search results.”

Read more about Google’s Home Activities rich results:

Google Announces Home Activities Rich Results

Read the Wayback Machine’s archive of Google’s original announcement from 2020:

Home activities

Featured Image by Shutterstock/Olga Strel

Google’s Gary Illyes: Lastmod Signal Is Binary

In a recent LinkedIn discussion, Gary Illyes, Analyst at Google, revealed that the search engine takes a binary approach when assessing a website’s lastmod signal from sitemaps.

The revelation came as Illyes encouraged website owners to upgrade to WordPress 6.5, which now natively supports the lastmod element in sitemaps.

When Mark Williams-Cook asked if Google has a “reputation system” to gauge how much to trust a site’s reported lastmod dates, Illyes stated, “It’s binary: we either trust it or we don’t.”

No Shades Of Gray For Lastmod

The lastmod tag indicates the date of the most recent significant update to a webpage, helping search engines prioritize crawling and indexing.

Illyes’ response suggests Google doesn’t factor in a website’s history or gradually build trust in the lastmod values being reported.

Google either accepts the lastmod dates provided in a site’s sitemap as accurate, or it disregards them.

This binary approach reinforces the need to implement the lastmod tag correctly and only specify dates when making meaningful changes.
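
One way to keep lastmod honest is to change the date only when the page’s main content actually changes. The sketch below is a hedged illustration of that idea, not how WordPress or Google works internally: the function names are hypothetical, and a content hash is used as a simple stand-in for “meaningful change” (it treats any edit to the hashed text as a change, so you would hash only the main content, not templates or ads). The XML follows the public sitemaps.org format.

```python
import hashlib
import xml.etree.ElementTree as ET
from datetime import date
from typing import Iterable, Tuple

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def content_fingerprint(main_content: str) -> str:
    """Hash the main content so changes can be detected between builds."""
    return hashlib.sha256(main_content.encode("utf-8")).hexdigest()


def next_lastmod(previous_lastmod: date, previous_fingerprint: str,
                 main_content: str, today: date) -> date:
    """Keep the old date unless the main content differs from the last build."""
    if content_fingerprint(main_content) == previous_fingerprint:
        return previous_lastmod
    return today


def build_sitemap(entries: Iterable[Tuple[str, date]]) -> str:
    """Serialize (url, lastmod) pairs as a sitemap XML string."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="unicode")


old_fingerprint = content_fingerprint("Original article body")
lastmod = next_lastmod(date(2024, 4, 2), old_fingerprint,
                       "Original article body", today=date(2024, 5, 1))
print(build_sitemap([("https://example.com/post", lastmod)]))
```

Because the content is unchanged in this example, the sitemap keeps the earlier date instead of claiming a fresh update.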

Illyes commended the WordPress developer community for their work on version 6.5, which automatically populates the lastmod field without extra configuration.

Accurate Lastmod Essential For Crawl Prioritization

While convenient for WordPress users, the native lastmod support is only beneficial if Google trusts you’re using it correctly.

Inaccurate lastmod tags could lead to Google ignoring the signal when scheduling crawls.

Illyes’ confirmation of Google’s stance shows there’s no room for error when using this tag.

Why SEJ Cares

Understanding how Google acts on lastmod can help ensure Google displays new publish dates in search results when you update your content.

It’s an all-or-nothing situation – if the dates are deemed untrustworthy, the signal could be disregarded sitewide.

With the information revealed by Illyes, you can ensure your implementation follows best practices to the letter.


Featured Image: Danishch/Shutterstock

How to Persuade Your Boss to Send You to Ahrefs Evolve

There’s one thing standing between you and several days of SEO, socializing, and Singaporean sunshine: your boss (and their Q4 budget 😅).

But don’t worry—we’ve got your back. Here are 5 arguments (and an example message) you can use to persuade your boss to send you to Ahrefs Evolve.

About Ahrefs Evolve

  • 2 days in sunny Singapore (Oct 24–25)
  • 500 digital marketing enthusiasts
  • 18 top speakers from around the world

Learn more and buy tickets.

SEO is changing at a breakneck pace. Between AI Overviews, Google’s rolling update schedule, their huge API leak, and all the documents released during their antitrust trial, it’s hard to keep up. What works in SEO today?

You could watch a YouTube video or two, maybe even attend an hour-long webinar. Or, much more effective: you could spend two full days learning from a panel of 18 international SEO experts, discussing your takeaways live with other attendees.

Evolve speakers from around the world.

Our world-class speakers are tackling the hardest problems and best opportunities in SEO today. The talk agenda covers topics like:

  • Responding to AI Overviews: Amanda King will teach you how to respond to AI Overviews, Google Gemini, and other AI search functions.
  • Surviving (and thriving) Google’s algo updates: Lily Ray will talk through Google’s recent updates, and share data-driven recommendations for what’s working in search today.
  • Planning for the future of SEO: Bernard Huang will talk through the failures of AI content and the path to better results.

(And attendees will get video recordings of each session, so you can share the knowledge with your teammates too.)

View the full talk agenda here.

There’s no substitute for meeting with influencers, peers, and partners in real life. 

Conferences create serendipity: chance encounters and conversations that can have a huge positive impact on you and your business. By way of example, these are some of the real benefits that have come my way from attending conferences:

  • Conversations that lead to new customers for our business,
  • Invitations to speak at events,
  • New business partnerships and co-marketing opportunities, and
  • Meeting people that we went on to hire.

There’s a “halo” effect that lingers long after the event is over: the people you meet will remember you for longer, think more highly of you, and be more likely to help you out, should you ask.

(And let’s not forget: there’s a lot of information, particularly in SEO, that only gets shared in person.)

The “international” part of Evolve matters too. Evolve is a different crowd to your local run-of-the-mill conference. It’s a chance to meet with people from markets you wouldn’t normally meet—from Australia to Indonesia and beyond.

Evolve attendees by home country.

If you’re an Ahrefs customer (thank you!), you’ll learn tons of tips, tricks and workflow improvements from attending Evolve. You’ll have opportunities to:

  • Attend talks from the Ahrefs team, showcasing advanced features and strategies that you can use in your own business.
  • Pick our brains at the Ahrefs booth, where we’ll offer informal 1:1 coaching sessions and previews of upcoming releases (like our new content optimization tool 🤫).
  • Join dedicated Ahrefs training workshops, hosted by the Ahrefs team and Ahrefs power users (tickets for these workshops will be sold separately).

As a manager myself, there are two questions I need answered when approving expenses:

  • Is this a reasonable cost?
  • Will we see a return on this investment?

To answer those questions: early bird tickets for Evolve start at $570. For context, “super early bird” tickets for MozCon (another popular SEO conference) this year were almost twice as much: $999.

There’s a lot included in the ticket price too:

  • World-class international speakers,
  • 5-star hotel venue,
  • 5-star hotel food (two tea breaks with snacks & lunch),
  • Networking afterparty, and
  • Full talk recordings to later share with your team.

SEO is a crucial growth channel for most businesses. If you can improve your company’s SEO performance after attending Evolve (and we think you will), you’ll very easily see a positive return on the investment.

Traveling to tropical Singapore (and eating tons of satay) is great for you, but it’s also great for your team. Attending Evolve is a chance to break with routine, reignite your passion for marketing, and come back to your job reinvigorated.

This would be true for any international conference, but it goes double for Singapore. It’s a truly unique place: an ultra-safe, high-tech city that brings together dozens of different cultures.

Little India in Singapore

You’ll discover different beliefs, working practices, and ways of business—and if you’re anything like me, come back a richer, wiser person for the experience.

If you’re nervous about pitching your boss on attending Evolve, remember: the worst that can happen is a polite “not this time”, and you’ll find yourself in the same position you are now.

So here goes: take this message template, tweak it to your liking, and send it to your boss over email or Slack… and I’ll see you in Singapore 😉

Email template

Hi [your boss’ name],

Our SEO tool provider, Ahrefs, is holding an SEO and digital marketing conference in Singapore in October. I’d like to attend, and I think it’s in the company’s interest:

  • The talks will help us respond to all the changes happening in SEO today. I’m particularly interested in the talks about AI and recent Google updates. 
  • I can network with my peers. I can discover what’s working at other companies, and explore opportunities for partnerships and co-marketing.
  • I can learn how we can use Ahrefs better across the organization.
  • I’ll come back reinvigorated with new ideas and motivation, and I can share my top takeaways and talk recordings with my team after the event.

Early bird tickets are $570. Given how important SEO is to the growth of our business, I think we’ll easily see a return from the spend.

Can we set up time to chat in more detail? Thanks!
