
NEWS

Privacy not a blocker for “meaningful” research access to platform data, says report


European lawmakers are eyeing binding transparency requirements for Internet platforms in a Digital Services Act (DSA) due to be drafted by the end of the year. But the question of how to create governance structures that provide regulators and researchers with meaningful access to data so platforms can be held accountable for the content they’re amplifying is a complex one.

Platforms’ own efforts to open up their data troves to outside eyes have been chequered to say the least. Back in 2018, Facebook announced the Social Science One initiative, saying it would provide a select group of academics with access to about a petabyte’s worth of sharing data and metadata. But it took almost two years before researchers got access to any data.

“This was the most frustrating thing I’ve been involved in, in my life,” one of the involved researchers told Protocol earlier this year, after spending some 20 months negotiating with Facebook over exactly what it would release.

Facebook’s political Ad Archive API has similarly frustrated researchers. “Facebook makes it impossible to get a complete picture of all of the ads running on their platform (which is exactly the opposite of what they claim to be doing),” said Mozilla last year, accusing the tech giant of transparency-washing.

Facebook, meanwhile, points to European data protection regulations and privacy requirements attached to its business following interventions by the US’ FTC to justify painstaking progress around data access. But critics argue this is just a cynical shield against transparency and accountability. Plus of course none of these regulations stopped Facebook grabbing people’s data in the first place.

In January, Europe’s lead data protection regulator penned a preliminary opinion on data protection and research which warned against such shielding.

“Data protection obligations should not be misappropriated as a means for powerful players to escape transparency and accountability,” wrote EDPS Wojciech Wiewiorówski. “Researchers operating within ethical governance frameworks should therefore be able to access necessary API and other data, with a valid legal basis and subject to the principle of proportionality and appropriate safeguards.”

Nor is Facebook the sole offender here, of course. Google brands itself a ‘privacy champion’ on account of how tight a grip it keeps on access to user data, heavily mediating the data it releases in areas where it claims ‘transparency’. For years, Twitter, too, routinely disparaged third-party studies that sought to understand how content flows across its platform — saying its API didn’t provide full access to all platform data and metadata, so the research couldn’t show the full picture. Another convenient shield against accountability.

More recently the company has made some encouraging noises to researchers, updating its developer policy to clarify the rules and offering up a COVID-related dataset — though the included tweets remain self-selected. So Twitter’s mediating hand remains on the research tiller.

A new report by AlgorithmWatch seeks to grapple with the knotty problem of platforms evading accountability by mediating data access — suggesting some concrete steps to deliver transparency and bolster research, including by taking inspiration from how access to medical data is mediated, among other discussed governance structures.

The goal: “Meaningful” research access to platform data. (Or, as the report title puts it: Operationalizing Research Access in Platform Governance: What to Learn from Other Industries?)

“We have strict transparency rules to enable accountability and the public good in so many other sectors (food, transportation, consumer goods, finance, etc). We definitely need it for online platforms — especially in COVID-19 times, where we’re even more dependent on them for work, education, social interaction, news and media consumption,” co-author Jef Ausloos tells TechCrunch.

The report, which the authors are aiming at European Commission lawmakers as they ponder how to shape an effective platform governance framework, proposes mandatory data sharing frameworks with an independent EU-institution acting as an intermediary between disclosing corporations and data recipients.

It’s not the first time an online regulator has been mooted, of course — but the entity being suggested here is more tightly configured in terms of purpose than some of the other Internet overseers being proposed in Europe.

“Such an institution would maintain relevant access infrastructures including virtual secure operating environments, public databases, websites and forums. It would also play an important role in verifying and pre-processing corporate data in order to ensure it is suitable for disclosure,” they write in a report summary.

Discussing the approach further, Ausloos argues it’s important to move away from “binary thinking” to break the current ‘data access’ trust deadlock. “Rather than this binary thinking of disclosure vs opaqueness/obfuscation, we need a more nuanced and layered approach with varying degrees of data access/transparency,” he says. “Such a layered approach can hinge on types of actors requesting data, and their purposes.”

A market research purpose might only get access to very high-level data, he suggests, whereas medical research by academic institutions could be given more granular access — subject, of course, to strict requirements (such as a research plan, ethical board review approval and so on).

“An independent institution intermediating might be vital in order to facilitate this and generate the necessary trust. We think it is vital that that regulator’s mandate is detached from specific policy agendas,” says Ausloos. “It should be focused on being a transparency/disclosure facilitator — creating the necessary technical and legal environment for data exchange. This can then be used by media/competition/data protection/etc authorities for their potential enforcement actions.”

Ausloos says many discussions on setting up an independent regulator for online platforms have proposed too many mandates or competencies — making it impossible to achieve political consensus. Whereas a leaner entity with a narrow transparency/disclosure remit should be able to cut through noisy objections, is the theory.

The infamous example of Cambridge Analytica certainly looms large over the ‘data for research’ space — the disgraced data company paid a Cambridge University academic to use an app to harvest and process Facebook user data for political ad targeting. And Facebook has thought nothing of turning this massive platform data misuse scandal into a stick to beat back regulatory proposals aimed at cracking open its data troves.

But Cambridge Analytica was a direct consequence of a lack of transparency, accountability and platform oversight. It was also, of course, a massive ethical failure — given that consent for political targeting was not sought from people whose data was acquired. So it doesn’t seem a good argument against regulating access to platform data. On the contrary.

With such ‘blunt instrument’ talking points being lobbed into the governance debate by self-interested platform giants, the AlgorithmWatch report brings both welcome nuance and solid suggestions on how to create effective governance structures for modern data giants.

On the layered access point, the report suggests the most granular access to platform data would be the most highly controlled, along the lines of a medical data model. “Granular access can also only be enabled within a closed virtual environment, controlled by an independent body — as is currently done by Findata [Finland’s medical data institution],” notes Ausloos.

Another governance structure discussed in the report — as a case study from which to draw learnings on how to incentivize transparency and thereby enable accountability — is the European Pollutant Release and Transfer Register (E-PRTR). This regulates pollutant emissions reporting across the EU, and results in emissions data being freely available to the public via a dedicated web-platform and as a standalone dataset.

“Credibility is achieved by assuring that the reported data is authentic, transparent and reliable and comparable, because of consistent reporting. Operators are advised to use the best available reporting techniques to achieve these standards of completeness, consistency and credibility,” the report says on the E-PRTR.

“Through this form of transparency, the E-PRTR aims to impose accountability on operators of industrial facilities in Europe towards the public, NGOs, scientists, politicians, governments and supervisory authorities.”

While EU lawmakers have signalled an intent to place legally binding transparency requirements on platforms — at least in some less contentious areas, such as illegal hate speech, as a means of obtaining accountability on some specific content problems — they have simultaneously set out a sweeping plan to fire up Europe’s digital economy by boosting the reuse of (non-personal) data.

Leveraging industrial data to support R&D and innovation is a key plank of the Commission’s tech-fuelled policy priorities for the next five+ years, as part of an ambitious digital transformation agenda.

This suggests that any regional move to open up platform data is likely to go beyond accountability — given EU lawmakers are pushing for the broader goal of creating a foundational digital support structure to enable research through data reuse. So if privacy-respecting data sharing frameworks can be baked in, a platform governance structure that’s designed to enable regulated data exchange almost by default starts to look very possible within the European context.

“Enabling accountability is important, which we tackle in the pollution case study; but enabling research is at least as important,” argues Ausloos, who does postdoc research at the University of Amsterdam’s Institute for Information Law. “Especially considering these platforms constitute the infrastructure of modern society, we need data disclosure to understand society.”

“When we think about what transparency measures should look like for the DSA we don’t need to reinvent the wheel,” adds Mackenzie Nelson, project lead for AlgorithmWatch’s Governing Platforms Project, in a statement. “The report provides concrete recommendations for how the Commission can design frameworks that safeguard user privacy while still enabling critical research access to dominant platforms’ data.”

You can read the full report here.

TechCrunch


OpenAI Introduces ChatGPT Plus with Monthly Subscription of $20


OpenAI ChatGPT

OpenAI, the leading artificial intelligence research laboratory, has launched a new product – ChatGPT Plus. The new product is an advanced version of its previous language model, ChatGPT, and is available for a monthly subscription of $20. The company aims to provide a more sophisticated and efficient conversational AI tool to its users through this new product.

ChatGPT Plus is a state-of-the-art language model that uses advanced deep learning algorithms to generate human-like responses to text inputs. The model has been trained on a massive corpus of text data, allowing it to generate coherent and contextually relevant responses. The model is designed to handle a wide range of conversational topics and can be integrated into various applications, such as chatbots, customer support systems, and virtual assistants.

One of the main advantages of ChatGPT Plus over its predecessor, ChatGPT, is its ability to generate responses in a more human-like manner. The model has been fine-tuned to incorporate more advanced language processing techniques, which enable it to better understand the context and tone of a conversation. This makes it possible for the model to generate more nuanced and appropriate responses, which can greatly improve the user experience.

In addition to its advanced language processing capabilities, ChatGPT Plus also offers improved performance in terms of response generation speed and efficiency. The model has been optimized to run on faster hardware and has been fine-tuned to generate responses more quickly. This makes it possible for the model to handle a larger volume of requests, making it an ideal solution for businesses with high traffic websites or customer support centers.

The monthly subscription fee of $20 makes ChatGPT Plus an affordable option for businesses of all sizes, regardless of budget. This puts advanced conversational AI technology within reach of even small businesses, which can use it to improve their customer engagement and support.

OpenAI has also made it easy to integrate ChatGPT Plus into various applications. The company has provided a comprehensive API that allows developers to easily integrate the model into their applications. The API supports a wide range of programming languages, making it possible for developers to use the technology regardless of their preferred programming language. This makes it possible for businesses to quickly and easily incorporate conversational AI into their operations.
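To make the integration claim concrete, here is a hypothetical sketch of how a developer might construct a request to a chat-completion style HTTP API from Python. The endpoint URL, model name and payload field names are assumptions based on OpenAI's published chat-completions convention, not details confirmed by this article — check the official API reference before relying on them. The request is only built, never sent, so the placeholder key is fine:

```python
import json
import urllib.request

# Assumed endpoint and payload shape, following OpenAI's published
# chat-completions convention; verify against the current API docs.
API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(prompt, api_key, model="gpt-3.5-turbo"):
    """Construct (but do not send) an authenticated JSON request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # placeholder key below
        },
    )

req = build_request("Summarize our refund policy in one sentence.",
                    api_key="sk-PLACEHOLDER")
print(req.full_url)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would require a real API key; everything up to that point is plain standard-library Python, which is part of why such APIs are straightforward to integrate from any language with an HTTP client.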

In conclusion, OpenAI’s launch of ChatGPT Plus is a significant development in the field of conversational AI. With its advanced language processing capabilities, improved performance, affordable pricing and easy integration, the product is accessible to businesses of all sizes looking to improve their customer engagement and support. ChatGPT Plus is set to shake up the conversational AI industry by bringing advanced technology within reach of a much broader market.

Visit OpenAI.com to read more and to get the latest news about ChatGPT.



What can ChatGPT do?


ChatGPT Explained

ChatGPT is a large language model developed by OpenAI that is trained on a massive amount of text data. It is capable of generating human-like text and has been used in a variety of applications, such as chatbots, language translation, and text summarization.

One of the key features of ChatGPT is its ability to generate text that is similar to human writing. This is achieved through the use of a transformer architecture, which allows the model to understand the context and relationships between words in a sentence. The transformer architecture is a type of neural network that is designed to process sequential data, such as natural language.
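As an illustration of the mechanism described above — not OpenAI's actual implementation, whose details are proprietary — the core transformer operation, scaled dot-product self-attention, can be sketched in a few lines of NumPy. Each token's output becomes a weighted mixture of every token's value vector, which is how the model relates words across a sentence:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X: (seq_len, d_model) token embeddings.
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices.
    Returns (seq_len, d_k): each row mixes all value vectors,
    weighted by query-key similarity.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # pairwise token affinities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V

# Toy example: 5 tokens with 8-dimensional embeddings, projected to 4 dims.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
Wq, Wk, Wv = (rng.standard_normal((8, 4)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 4)
```

A real transformer stacks many such attention layers (with multiple heads, feed-forward blocks and positional information), but this toy version shows the key idea: relationships between all pairs of words are computed in one matrix operation rather than sequentially.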

Another important aspect of ChatGPT is its ability to generate text that is contextually relevant. This means that the model is able to understand the context of a conversation and generate responses that are appropriate to it. This is accomplished through autoregressive language modeling, in which the model learns to predict the next word in a sentence based on the context of the previous words.

One of the most popular applications of ChatGPT is in the creation of chatbots. Chatbots are computer programs that simulate human conversation and can be used in customer service, sales, and other applications. ChatGPT is particularly well-suited for this task because of its ability to generate human-like text and understand context.

Another application of ChatGPT is language translation. By training the model on a large amount of text data in multiple languages, it can be used to translate text from one language to another. The model is able to understand the meaning of the text and generate a translation that is grammatically correct and semantically equivalent.

In addition to chatbots and language translation, ChatGPT can also be used for text summarization. This is the process of taking a large amount of text and condensing it into a shorter, more concise version. ChatGPT is able to understand the main ideas of the text and generate a summary that captures the most important information.

Despite its many capabilities and applications, ChatGPT is not without its limitations. One of the main challenges with using language models like ChatGPT is the risk of generating text that is biased or offensive. This can occur when the model is trained on text data that contains biases or stereotypes. To address this, OpenAI has implemented a number of techniques to reduce bias in the training data and in the model itself.

In conclusion, ChatGPT is a powerful language model that is capable of generating human-like text and understanding context. It has a wide range of applications, including chatbots, language translation, and text summarization. While there are limitations to its use, ongoing research and development is aimed at improving the model’s performance and reducing the risk of bias.

** The above article was written entirely by ChatGPT, as an example of the advanced text that can now be produced by an automated AI.



Google December Product Reviews Update Affects More Than English Language Sites? via @sejournal, @martinibuster


Google’s Product Reviews update was announced as rolling out to English language pages, with no mention of if or when it would roll out to other languages. Google’s John Mueller has now answered a question about whether it affects other languages.

Google December 2021 Product Reviews Update

On December 1, 2021, Google announced on Twitter that a Product Review update would be rolling out that would focus on English language web pages.

The focus of the update was on improving the quality of reviews shown in Google search, specifically targeting review sites.

A Googler tweeted a description of the kinds of sites that would be targeted for demotion in the search rankings:

“Mainly relevant to sites that post articles reviewing products.

Think of sites like “best TVs under $200”.com.

Goal is to improve the quality and usefulness of reviews we show users.”


Google also published a blog post with more guidance on the product review update that introduced two new best practices that Google’s algorithm would be looking for.

The first best practice was a requirement of evidence that a product was actually handled and reviewed.

The second best practice was to provide links to more than one place that a user could purchase the product.

The Twitter announcement stated that it was rolling out to English language websites. The blog post did not mention what languages it was rolling out to nor did the blog post specify that the product review update was limited to the English language.

Google’s Mueller Thinking About Product Reviews Update

Screenshot of Google's John Mueller trying to recall if December Product Review Update affects more than the English language

Product Review Update Targets More Languages?

The person asking the question was rightly under the impression that the product review update only affected English language search results.


But he said he was seeing search volatility in the German language that appeared to be related to Google’s December 2021 Product Reviews update.

This is his question:

“I was seeing some movements in German search as well.

So I was wondering if there could also be an effect on websites in other languages by this product reviews update… because we had lots of movement and volatility in the last weeks.

…My question is, is it possible that the product reviews update affects other sites as well?”

John Mueller answered:

“I don’t know… like other languages?

My assumption was this was global and and across all languages.

But I don’t know what we announced in the blog post specifically.

But usually we try to push the engineering team to make a decision on that so that we can document it properly in the blog post.

I don’t know if that happened with the product reviews update. I don’t recall the complete blog post.

But it’s… from my point of view it seems like something that we could be doing in multiple languages and wouldn’t be tied to English.

And even if it were English initially, it feels like something that is relevant across the board, and we should try to find ways to roll that out to other languages over time as well.

So I’m not particularly surprised that you see changes in Germany.

But I also don’t know what we actually announced with regards to the locations and languages that are involved.”

Does Product Reviews Update Affect More Languages?

While the tweeted announcement specified that the product reviews update was limited to the English language, the official blog post did not mention any such limitation.

Google’s John Mueller offered his opinion that the product reviews update is something that Google could do in multiple languages.

One must wonder if the tweet was meant to communicate that the update was rolling out first in English and subsequently to other languages.

It’s unclear if the product reviews update was rolled out globally to more languages. Hopefully Google will clarify this soon.

Citations

Google Blog Post About Product Reviews Update

Product reviews update and your site

Google’s New Product Reviews Guidelines

Write high quality product reviews

John Mueller Discusses If Product Reviews Update Is Global

Watch Mueller answer the question at the 14:00 Minute Mark

Searchenginejournal.com

Continue Reading
