
NEWS

The real threat of fake voices in a time of crisis


As federal agencies take increasingly stringent actions to try to limit the spread of the novel coronavirus pandemic within the U.S., how can individual Americans and U.S. companies affected by these rules weigh in with their opinions and experiences? Because many of the new rules, such as travel restrictions and increased surveillance, require expansions of federal power beyond normal circumstances, our laws require the federal government to post these rules publicly and allow the public to contribute their comments to the proposed rules online. But are federal public comment websites — a vital institution for American democracy — secure in this time of crisis? Or are they vulnerable to bot attack?

In December 2019, we published a new study to see firsthand just how vulnerable the public comment process is to an automated attack. Using publicly available artificial intelligence (AI) methods, we successfully generated 1,001 comments of deepfake text, computer-generated text that closely mimics human writing, and submitted them to the Centers for Medicare & Medicaid Services’ (CMS) website for a proposed federal rule that would institute mandatory work reporting requirements for citizens on Medicaid in Idaho.

The comments we produced using deepfake text constituted over 55% of the 1,810 total comments submitted during the federal public comment period. In a follow-up study, we asked people to identify whether comments were from a bot or a human. Respondents were only correct half of the time — the same probability as random guessing.

[Image: survey question showing a deepfake text comment. Image Credits: Zang/Weiss/Sweeney]

The example above is deepfake text generated by the bot that all survey respondents thought was from a human.

We ultimately informed CMS of our deepfake comments and withdrew them from the public record. But a malicious attacker would likely not do the same.

Previous large-scale fake comment attacks on federal websites have occurred, such as the 2017 attack on the FCC website regarding the proposed rule to end net neutrality regulations.

During the net neutrality comment period, firms hired by industry group Broadband for America used bots to create comments expressing support for the repeal of net neutrality. They then submitted millions of comments, sometimes even using the stolen identities of deceased voters and the names of fictional characters, to distort the appearance of public opinion.

A retroactive text analysis of the comments found that 96-97% of the more than 22 million comments on the FCC’s proposal to repeal net neutrality were likely coordinated bot campaigns. These campaigns used relatively unsophisticated and conspicuous search-and-replace methods — easily detectable even on this mass scale. But even after investigations revealed the comments were fraudulent and made using simple search-and-replace-like computer techniques, the FCC still accepted them as part of the public comment process.
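Campaigns like this are conspicuous precisely because templated comments share most of their wording. As an illustrative sketch (not the investigators' actual method), near-duplicate search-and-replace comments can be flagged by comparing word n-gram fingerprints:

```python
from itertools import combinations

def word_shingles(text, n=3):
    """Lowercased word n-grams used as a fingerprint of a comment."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Overlap of two shingle sets: 1.0 = identical, 0.0 = disjoint."""
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_templated(comments, threshold=0.5):
    """Return index pairs of comments that look like search-and-replace copies."""
    shingles = [word_shingles(c) for c in comments]
    return [(i, j) for i, j in combinations(range(len(comments)), 2)
            if jaccard(shingles[i], shingles[j]) >= threshold]

comments = [
    "I strongly oppose the plan to repeal net neutrality protections for consumers",
    "I strongly oppose the proposal to repeal net neutrality protections for consumers",
    "Please keep the open internet rules exactly as they are today",
]
print(flag_templated(comments))  # → [(0, 1)]
```

Only the first two comments, which differ by a single swapped word, are flagged; a genuinely distinct comment shares almost no shingles with either.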

Even these relatively unsophisticated campaigns were able to affect a federal policy outcome. However, our demonstration of the threat from bots submitting deepfake text shows that future attacks can be far more sophisticated and much harder to detect.

The laws and politics of public comments

Let’s be clear: The ability to communicate our needs and have them considered is the cornerstone of the democratic model. As enshrined in the Constitution and defended fiercely by civil liberties organizations, each American is guaranteed a role in participating in government through voting, through self-expression and through dissent.

[Image: survey questions on search-and-replace FCC comments. Image Credits: Zang/Weiss/Sweeney]

When it comes to new rules from federal agencies that can have sweeping impacts across America, public comment periods are the legally required method for members of the public, advocacy groups and corporations that would be most affected by proposed rules to express their concerns to the agency; the agency is then required to consider these comments before deciding on the final version of the rule. This requirement for public comments has been in place since the passage of the Administrative Procedure Act of 1946. In 2002, the e-Government Act required the federal government to create an online tool to receive public comments. Over the years, multiple court rulings have required federal agencies to demonstrate that they actually examined the submitted comments and to publish any analysis of relevant materials and justification of decisions made in light of public comments [see Citizens to Preserve Overton Park, Inc. v. Volpe, 401 U.S. 402, 416 (1971); Home Box Office, supra, 567 F.2d at 36 (1977); Thompson v. Clark, 741 F.2d 401, 408 (CADC 1984)].

In fact, the only reason we had a CMS public comment website to test for vulnerability to deepfake text submissions was that, in June 2019, the U.S. Supreme Court ruled in a 7-1 decision that CMS could not skip the public comment requirements of the Administrative Procedure Act when reviewing proposals from state governments to add work reporting requirements to Medicaid eligibility rules within their states.

Political science research suggests that the impact of public comments on a federal agency's final rule can be substantial. For example, in 2018, Harvard University researchers found that banks that commented on Dodd-Frank-related rules proposed by the Federal Reserve obtained $7 billion in excess returns compared to non-participants. When the researchers examined the comments submitted on the “Volcker Rule” and the debit card interchange rule, they found that comments from different banks exerted significant influence during the “sausage-making process” from the initial proposed rule to the final rule.

Beyond commenting directly under their official corporate names, we’ve also seen how an industry group, Broadband for America, submitted millions of fake comments in 2017 in support of the FCC’s rule to end net neutrality in order to create the false perception of broad political support for the rule among the American public.

Technology solutions to deepfake text on public comments

While our study highlights the threat of deepfake text to disrupt public comment websites, this doesn’t mean we should end this long-standing institution of American democracy. Rather, we need to identify how technology can be used for innovative solutions that accept public comments from real humans while rejecting deepfake text from bots.

There are two stages in the public comment process — (1) comment submission and (2) comment acceptance — where technology can be used as potential solutions.

In the first stage of comment submission, technology can be used to prevent bots from submitting deepfake comments in the first place, raising the cost for an attacker, who would instead need to recruit large numbers of humans. One technological solution that many are already familiar with is the CAPTCHA box at the bottom of internet forms that asks us to identify a word — either visually or audibly — before we can click submit. CAPTCHAs add an extra step that makes the submission process substantially more difficult for a bot. While these tools can be improved for accessibility for disabled individuals, they would be a step in the right direction.
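As a toy illustration of the idea (a challenge must be solved before a comment is accepted), consider this sketch; real CAPTCHAs use distorted images or audio rather than plain arithmetic, and the function names here are hypothetical:

```python
import random

def issue_challenge(rng=random):
    """Issue a simple arithmetic challenge; real CAPTCHAs distort images/audio."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return {"prompt": f"What is {a} + {b}?", "answer": a + b}

def accept_submission(challenge, response, comment):
    """Only accept the comment if the challenge was solved."""
    if response != challenge["answer"]:
        return None  # reject: likely a bot
    return {"comment": comment}

ch = issue_challenge()
print(ch["prompt"])
print(accept_submission(ch, ch["answer"], "I support this rule."))  # → {'comment': 'I support this rule.'}
```

The point is not the puzzle itself but the extra round-trip: each submission now costs the attacker a solved challenge rather than a free HTTP request.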

However, CAPTCHAs would not prevent an attacker willing to pay for low-cost labor abroad to solve any CAPTCHA tests in order to submit deepfake comments. One way to get around that may be to require strict identification to be provided along with every submission, but that would remove the possibility for anonymous comments that are currently accepted by agencies such as CMS and the Food and Drug Administration (FDA). Anonymous comments serve as a method of privacy protection for individuals who may be significantly affected by a proposed rule on a sensitive topic such as healthcare without needing to disclose their identity. Thus, the technological challenge would be to build a system that can separate the user authentication step from the comment submission step so only authenticated individuals can submit a comment anonymously.
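One way such a separation could work, sketched here under simplifying assumptions (the class and its methods are hypothetical; a production design would likely use cryptographic blind signatures so issuance and submission cannot be correlated), is to have the portal issue a single-use random token at authentication time and record only the token, never the identity:

```python
import secrets

class AnonymousCommentPortal:
    """Decouple authentication from submission: after a user authenticates,
    the portal issues a single-use random token and records only the token,
    not who it was issued to. A comment is accepted if it carries an unspent
    token, so each verified person gets exactly one anonymous comment."""

    def __init__(self):
        self.unspent_tokens = set()
        self.comments = []

    def authenticate(self, user_id, password):
        # Stand-in for a real identity check (login, ID verification, etc.).
        if not (user_id and password):
            return None
        token = secrets.token_hex(16)
        self.unspent_tokens.add(token)  # note: user_id is NOT stored with it
        return token

    def submit(self, token, comment):
        if token not in self.unspent_tokens:
            return False                 # unknown or already-used token
        self.unspent_tokens.remove(token)
        self.comments.append(comment)    # stored with no link to identity
        return True

portal = AnonymousCommentPortal()
tok = portal.authenticate("alice", "hunter2")
print(portal.submit(tok, "This rule would affect my healthcare."))  # → True
print(portal.submit(tok, "Second comment with same token."))        # → False
```

Because the token set holds no identity information, the stored comments are anonymous, yet every accepted comment traces back to exactly one authentication event.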

Finally, in the second stage of comment acceptance, better technology can be used to distinguish between deepfake text and human submissions. While our study found that our sample of over 100 people surveyed were not able to identify the deepfake text examples, more sophisticated spam detection algorithms in the future may be more successful. As machine learning methods advance over time, we may see an arms race between deepfake text generation and deepfake text identification algorithms.
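As a hedged illustration of what even a very simple detector looks like (this toy naive Bayes classifier, trained on a handful of made-up comments, is far weaker than the more sophisticated spam detection the text anticipates):

```python
import math
from collections import Counter

def train(texts, labels):
    """Word counts per class for a tiny naive Bayes text classifier."""
    counts = {"bot": Counter(), "human": Counter()}
    for text, label in zip(texts, labels):
        counts[label].update(text.lower().split())
    return counts

def classify(counts, text):
    """Pick the class with the higher log-likelihood, with add-one smoothing."""
    vocab = set(counts["bot"]) | set(counts["human"])
    scores = {}
    for label, c in counts.items():
        total = sum(c.values()) + len(vocab)
        scores[label] = sum(math.log((c[w] + 1) / total)
                            for w in text.lower().split())
    return max(scores, key=scores.get)

texts = ["as a concerned citizen i strongly support this proposed rule",
         "as a concerned citizen i strongly oppose this proposed rule",
         "my mom lost coverage last year and this rule scares me",
         "i worked two jobs while on medicaid and reporting was hard"]
labels = ["bot", "bot", "human", "human"]
model = train(texts, labels)
print(classify(model, "i strongly support this proposed rule"))  # → bot
print(classify(model, "my mom was on medicaid last year"))       # → human
```

A classifier like this catches only wording-level regularities; modern deepfake text mimics those regularities, which is exactly why the arms race favors ever-stronger detection models.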

The challenge today

While future technologies may offer more comprehensive solutions, the threat of deepfake text to our American democracy is real and present today. Thus, we recommend that all federal public comment websites adopt state-of-the-art CAPTCHAs as an interim measure of security, a position that is also supported by the 2019 U.S. Senate Subcommittee on Investigations’ Report on Abuses of the Federal Notice-and-Comment Rulemaking Process.

In order to develop more robust future technological solutions, we will need to build a collaborative effort between the government, researchers and our innovators in the private sector. That’s why we at Harvard University have joined the Public Interest Technology University Network along with 20 other education institutions, New America, the Ford Foundation and the Hewlett Foundation. Collectively, we are dedicated to helping inspire a new generation of civic-minded technologists and policy leaders. Through curriculum, research and experiential learning programs, we hope to build the field of public interest technology and a future where technology is made and regulated with the public in mind from the beginning.

While COVID-19 has disrupted many parts of American society, it hasn’t stopped federal agencies under the Trump administration from continuing to propose new deregulatory rules that can have long-lasting legacies that will be felt long after the current pandemic has ended. For example, on March 18, 2020, the Environmental Protection Agency (EPA) proposed new rules about limiting which research studies can be used to support EPA regulations, which have received over 610,000 comments as of April 6, 2020. On April 2, 2020, the Department of Education proposed new rules for permanently relaxing regulations for online education and distance learning. On February 19, 2020, the FCC re-opened public comments on its net neutrality rules, which in 2017 saw 22 million comments submitted by bots, after a federal court ruled that the FCC ignored how ending net neutrality would affect public safety and cellphone access programs for low-income Americans.

Federal public comment websites offer the only way for the American public and organizations to express their concerns to the federal agency before the final rules are determined. We must adopt better technological defenses to ensure that deepfake text doesn’t further threaten American democracy during a time of crisis.

TechCrunch


What can ChatGPT do?


ChatGPT Explained

ChatGPT is a large language model developed by OpenAI that is trained on a massive amount of text data. It is capable of generating human-like text and has been used in a variety of applications, such as chatbots, language translation, and text summarization.

One of the key features of ChatGPT is its ability to generate text that is similar to human writing. This is achieved through the use of a transformer architecture, which allows the model to understand the context and relationships between words in a sentence. The transformer architecture is a type of neural network that is designed to process sequential data, such as natural language.
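The heart of that architecture is the attention operation, in which each position's output is a similarity-weighted average over all positions. A minimal sketch of scaled dot-product attention in plain Python (real transformers apply learned projection matrices to produce queries, keys and values, omitted here):

```python
import math

def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: each position's output is a weighted
    average of all value vectors, weighted by query-key similarity."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs

# Three toy token embeddings attend to each other.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(self_attention(x, x, x))
```

Each output row mixes information from every token, which is how the model captures relationships between words regardless of their distance in the sentence.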

Another important aspect of ChatGPT is its ability to generate text that is contextually relevant. This means that the model is able to understand the context of a conversation and generate responses that are appropriate to the conversation. This is accomplished by the use of a technique called “masked language modeling,” which allows the model to predict the next word in a sentence based on the context of the previous words.
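A caveat: GPT-family models are actually trained with next-word (causal) prediction rather than masked language modeling, but the underlying idea of predicting a word from its context can be illustrated with a toy bigram model:

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which across a corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following

def predict_next(following, prev_word):
    """Most frequent continuation seen after prev_word (None if unseen)."""
    options = following.get(prev_word.lower())
    return options.most_common(1)[0][0] if options else None

corpus = ["the model generates text",
          "a model generates text",
          "the model understands context"]
model = train_bigrams(corpus)
print(predict_next(model, "model"))  # → generates
```

A real language model conditions on the entire preceding context with billions of parameters rather than on one word with a count table, but the prediction task is the same in spirit.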

One of the most popular applications of ChatGPT is in the creation of chatbots. Chatbots are computer programs that simulate human conversation and can be used in customer service, sales, and other applications. ChatGPT is particularly well-suited for this task because of its ability to generate human-like text and understand context.

Another application of ChatGPT is language translation. By training the model on a large amount of text data in multiple languages, it can be used to translate text from one language to another. The model is able to understand the meaning of the text and generate a translation that is grammatically correct and semantically equivalent.

In addition to chatbots and language translation, ChatGPT can also be used for text summarization. This is the process of taking a large amount of text and condensing it into a shorter, more concise version. ChatGPT is able to understand the main ideas of the text and generate a summary that captures the most important information.
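A crude version of summarization, scoring sentences by average word frequency and keeping the top scorer, looks like this (purely illustrative; ChatGPT summarizes abstractively, generating new text rather than selecting existing sentences):

```python
from collections import Counter

def summarize(text, n_sentences=1):
    """Extractive summary: score each sentence by the average corpus-wide
    frequency of its words, then keep the top sentences in original order."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    freqs = Counter(text.lower().replace(".", " ").split())

    def score(sentence):
        toks = sentence.lower().split()
        return sum(freqs[t] for t in toks) / len(toks)

    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return ". ".join(s for s in sentences if s in top) + "."

text = ("ChatGPT generates text. ChatGPT generates summaries of text. "
        "Some people like long walks.")
print(summarize(text))  # → ChatGPT generates text.
```

The sentence built from the most common words wins, which captures the "most important information" heuristic the paragraph describes in its simplest possible form.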

Despite its many capabilities and applications, ChatGPT is not without its limitations. One of the main challenges with using language models like ChatGPT is the risk of generating text that is biased or offensive. This can occur when the model is trained on text data that contains biases or stereotypes. To address this, OpenAI has implemented a number of techniques to reduce bias in the training data and in the model itself.

In conclusion, ChatGPT is a powerful language model that is capable of generating human-like text and understanding context. It has a wide range of applications, including chatbots, language translation, and text summarization. While there are limitations to its use, ongoing research and development is aimed at improving the model’s performance and reducing the risk of bias.

** The above article was written 100% by ChatGPT, as an example of the advanced text that an automated AI can produce.



Google December Product Reviews Update Affects More Than English Language Sites? via @sejournal, @martinibuster


Google announced that its Product Reviews update was rolling out to English-language pages, with no mention of if or when it would roll out to other languages. John Mueller answered a question about whether it is rolling out to other languages.

Google December 2021 Product Reviews Update

On December 1, 2021, Google announced on Twitter that a Product Review update would be rolling out that would focus on English language web pages.

The focus of the update was for improving the quality of reviews shown in Google search, specifically targeting review sites.

A Googler tweeted a description of the kinds of sites that would be targeted for demotion in the search rankings:

“Mainly relevant to sites that post articles reviewing products.

Think of sites like "best TVs under $200".com.

Goal is to improve the quality and usefulness of reviews we show users.”


Google also published a blog post with more guidance on the product review update that introduced two new best practices that Google’s algorithm would be looking for.

The first best practice was a requirement of evidence that a product was actually handled and reviewed.

The second best practice was to provide links to more than one place that a user could purchase the product.

The Twitter announcement stated that it was rolling out to English language websites. The blog post did not mention what languages it was rolling out to nor did the blog post specify that the product review update was limited to the English language.

Google’s Mueller Thinking About Product Reviews Update

[Screenshot: Google's John Mueller trying to recall whether the December Product Review update affects more than the English language]

Product Review Update Targets More Languages?

The person asking the question was rightly under the impression that the product review update only affected English language search results.


But he asserted that he was seeing search volatility in German-language results that appeared to be related to Google’s December 2021 Product Review Update.

This is his question:

“I was seeing some movements in German search as well.

So I was wondering if there could also be an effect on websites in other languages by this product reviews update… because we had lots of movement and volatility in the last weeks.

…My question is, is it possible that the product reviews update affects other sites as well?”

John Mueller answered:

“I don’t know… like other languages?

My assumption was this was global and across all languages.

But I don’t know what we announced in the blog post specifically.

But usually we try to push the engineering team to make a decision on that so that we can document it properly in the blog post.

I don’t know if that happened with the product reviews update. I don’t recall the complete blog post.

But it’s… from my point of view it seems like something that we could be doing in multiple languages and wouldn’t be tied to English.

And even if it were English initially, it feels like something that is relevant across the board, and we should try to find ways to roll that out to other languages over time as well.

So I’m not particularly surprised that you see changes in Germany.

But I also don’t know what we actually announced with regards to the locations and languages that are involved.”

Does Product Reviews Update Affect More Languages?

While the tweeted announcement specified that the product reviews update was limited to the English language, the official blog post did not mention any such limitation.

Google’s John Mueller offered his opinion that the product reviews update is something that Google could do in multiple languages.

One must wonder if the tweet was meant to communicate that the update was rolling out first in English and subsequently to other languages.

It’s unclear if the product reviews update was rolled out globally to more languages. Hopefully Google will clarify this soon.

Citations

Google Blog Post About Product Reviews Update

Product reviews update and your site

Google’s New Product Reviews Guidelines

Write high quality product reviews

John Mueller Discusses If Product Reviews Update Is Global

Watch Mueller answer the question at the 14:00 Minute Mark


Searchenginejournal.com



Survey says: Amazon, Google more trusted with your personal data than Apple is



MacRumors reveals that more people feel comfortable having their personal data in the hands of Amazon and Google than in Apple’s. Companies that the public really doesn’t trust with their personal data include Facebook, TikTok, and Instagram.

The survey asked over 1,000 internet users in the U.S. how much they trusted certain companies such as Facebook, TikTok, Instagram, WhatsApp, YouTube, Google, Microsoft, Apple, and Amazon to handle their user data and browsing activity responsibly.

Amazon and Google are considered by survey respondents to be more trustworthy than Apple

Those surveyed were asked whether they trusted these firms with their personal data “a great deal,” “a good amount,” “not much,” or “not at all.” Respondents could also answer that they had no opinion about a particular company. 18% of those polled said that they trust Apple “a great deal,” which topped the 14% received by Google and Amazon.

However, 39% said that they trust Amazon “a good amount,” with Google picking up 34% in that same category. Only 26% of those answering said that they trust Apple “a good amount.” The first two responses, “a great deal” and “a good amount,” are considered positive replies for a company. “Not much” and “not at all” are considered negative responses.

By adding up the scores in the positive categories, Apple tallied a score of 44% (18% of respondents trusted Apple with their personal data “a great deal” and 26% “a good amount”). But that placed the tech giant third, after Amazon’s 53% and Google’s 48%. After Apple, Microsoft finished fourth with 43%, YouTube (which is owned by Google) was fifth with 35%, and Facebook was sixth at 20%.
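The combined positive score is simply the sum of the two positive response shares, which can be checked directly from the figures above:

```python
# Combined positive score = "a great deal" + "a good amount" response shares (%).
positives = {"Apple": (18, 26), "Amazon": (14, 39), "Google": (14, 34)}
combined = {company: a + b for company, (a, b) in positives.items()}
print(combined)  # → {'Apple': 44, 'Amazon': 53, 'Google': 48}
```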

Rounding out the remainder of the nine firms in the survey, Instagram placed seventh with a positive score of 19%, WhatsApp was eighth with a score of 15%, and TikTok was last at 12%.

Looking at the scoring for the two negative responses (“not much” or “not at all”), Facebook had a combined negative score of 72%, making it the least trusted company in the survey. TikTok was next at 63%, with Instagram following at 60%. WhatsApp and YouTube were both in the middle of the pack at 53%, followed by Google and Microsoft at 47% and 42%, respectively. Apple and Amazon had the lowest combined negative scores, at 40% each.

74% of those surveyed called targeted online ads invasive

The survey also found that a whopping 82% of respondents found targeted online ads annoying, and 74% called them invasive. Just 27% found such ads helpful. This doesn’t exactly track with the 62% of iOS users who have used Apple’s App Tracking Transparency feature to opt out of being tracked while browsing websites and using apps. Tracking allows third-party firms to send users targeted ads online, something they cannot do to users who have opted out.

The 38% of iOS users who decided not to opt out of being tracked might have done so because they find it convenient to receive targeted ads about a certain product that they looked up online. But is ATT actually doing anything?

Marketing strategy consultant Eric Seufert said last summer, “Anyone opting out of tracking right now is basically having the same level of data collected as they were before. Apple hasn’t actually deterred the behavior that they have called out as being so reprehensible, so they are kind of complicit in it happening.”

The Financial Times says that iPhone users are being lumped together by certain behaviors instead of unique ID numbers in order to send targeted ads. Facebook chief operating officer Sheryl Sandberg says that the company is working to rebuild its ad infrastructure “using more aggregate or anonymized data.”

Aggregated data is a collection of individual data that is used to create high-level data. Anonymized data is data that removes any information that can be used to identify the people in a group.

When consumers were asked how often they think their phones or other tech devices listen in on them in ways they didn’t agree to, 72% answered “very often” or “somewhat often.” 28% responded “rarely” or “never.”
