What ChatGPT Is and Isn't?

What is ChatGPT?
#

ChatGPT is a web application based on GPT-3, specifically the GPT-3.5 “text-davinci-003” model developed by OpenAI. The ChatGPT model is optimized for conversation: it responds to the text “prompts” that users provide. ChatGPT is a type of generative AI based on a machine learning model. We can consider ChatGPT a weak AI (see Types of Artificial Intelligence).

GPT-3 is a massive generative natural-language model. It has 175 billion parameters and was trained on text in multiple languages, Catalan among them. Had the training been done on a single GPU, it would have taken an estimated 355 years, and the estimated cost of training it on a low-cost cloud platform is $4,600,000. https://lambdalabs.com/blog/demystifying-gpt-3
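As a rough sanity check, the two figures above can be related with a few lines of Python. This is only a back-of-the-envelope sketch: it reads the 355 years as time on a single GPU, and the hourly price of $1.50 is an assumed low-cost cloud rate chosen for illustration, not a number stated in this article.

```python
# Back-of-the-envelope check relating the 355-year and ~$4.6M figures.
GPU_YEARS = 355              # single-device training time quoted in the text
HOURS_PER_YEAR = 365 * 24    # ignoring leap years
USD_PER_GPU_HOUR = 1.50      # ASSUMPTION: illustrative low-cost cloud price


def training_cost_usd(gpu_years: float, usd_per_gpu_hour: float) -> float:
    """Cost of renting one GPU for the given number of years."""
    return gpu_years * HOURS_PER_YEAR * usd_per_gpu_hour


cost = training_cost_usd(GPU_YEARS, USD_PER_GPU_HOUR)
print(f"Estimated cost: ${cost:,.0f}")  # prints "Estimated cost: $4,664,700"
```

Under these assumptions the result lands in the same range as the $4,600,000 quoted above, which suggests the cost figure is simply the 355 years priced per GPU-hour.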

(Image source: https://blog.accubits.com/getting-started-with-gpt-3-model-by-openai/)

ChatGPT is Software as a Service (SaaS)
#

ChatGPT is provided as a service: it’s “Software as a Service”, not software we can install on our own server or cloud infrastructure. Therefore, the owners of the service (OpenAI, reportedly 49% owned by Microsoft since January 2023) decide the terms of service: who has access, for what purposes, and what quality of service and availability they offer.

Terms and conditions
#

ChatGPT’s terms of service https://openai.com/terms/ state that:

You agree and instruct us to use Content to develop and improve the Services. You can read more here about how Content may be used to improve model performance. We understand that in some cases you may not want your Content used to improve the Services. You can opt out from having Content used for improvement by contacting support@openai.com with your organization ID. Please note that in some cases this may limit our Services’ ability to better address your specific use case.

In other words, the data we enter into ChatGPT in the form of a Prompt will be collected, analyzed, and used for research, service improvement, and very likely to train future versions of GPT-N.

OpenAI publishes its terms and conditions clearly, doesn’t abuse legalese, and doesn’t use fine print. I recommend reading the terms of service https://openai.com/terms/ and the privacy policy https://openai.com/privacy/.

Authorship rights and responsibility for ChatGPT’s outputs
#

A very important aspect of using generative AI tools like ChatGPT, DALL-E, Stable Diffusion, Whisper, or VALL-E is the implicitly or explicitly established agreement on the authorship rights of what is generated (text, image, audio, or video). This determines the copyright and usage rights that derive for the parties involved.

OpenAI’s generic terms for its tools define two important concepts: Input and Output. Input is the prompt written by users, while Output is the content generated by the machine learning model. Input and Output together make up the Content.

Let’s see what the terms say as translated by ChatGPT itself.

_(a) Your Content. You may provide input to the Services (“Input”) and receive output generated and returned by the Service based on the Input (“Output”). Input and Output are collectively “Content.” Between the parties and to the extent permitted by applicable law, you own all Input and, subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to the Output. OpenAI may use Content as necessary to provide and maintain the Services, comply with applicable law, and enforce our policies. You are responsible for Content, including ensuring that it does not violate any applicable law or these Terms._

In other words, the author of a prompt holds the rights to the outputs generated by an OpenAI model as long as they hold the rights to the inputs that produced them. If they don’t hold those rights, they would be in breach of OpenAI’s terms of service; at that point I’m not the one to give an opinion, and you should consult a lawyer.

OpenAI also addresses the nature of the content and the possibility that certain outputs repeat:

(b) Similarity of Content. Due to the nature of machine learning, Output may not be unique among users and the Services may generate the same or similar output for OpenAI or a third party. For example, you could provide input to a model such as “What color is the sky?” and receive output such as “The sky is blue.” Other users may ask similar questions and receive the same answer. Responses that are requested by and generated for other users are not considered your Content.

That is, the originality of the output depends on the originality of the input. Lawyers are going to have a field day in the coming years. Or maybe a sibling of ChatGPT specialized in legal advice… but I digress.

Cost and access
#

Since December 2022, and still as of January 2023, ChatGPT has been available for free at https://chat.openai.com

Earlier GPT-3 models were available for testing via a more complex interface and by API (Application Programming Interface). To access them, after a trial period you had to pay for usage by purchasing “token” packages. It’s not clear how long ChatGPT will be available for free use.

There’s already a waitlist open for a “pro” tier of ChatGPT. It’s speculated that the price will be $42 per month, with the free tier remaining available.

API access to ChatGPT for application developers will soon begin to be managed through Microsoft’s cloud service, Azure. https://indianexpress.com/article/technology/tech-news-technology/openais-chatgpt-api-microsoft-azure-api-businesses-8386106/ As a result, we’ll surely start seeing specialized tools that make use of various GPT-3 models.

One example is the “Codex” model, a GPT-3 variant trained on source code available on GitHub (the largest repository of free software projects on the web, acquired by Microsoft in 2018). For a few months now GitHub has been offering Copilot, a programming assistant based on this model that can generate code, translate programs between programming languages, optimize code, and comment programs, among other applications.

What’s ChatGPT for and why is it free?
#

The goal of ChatGPT is to make it easy for a large number of people (on the order of hundreds of millions) to use this model, thereby obtaining data for OpenAI’s researchers, as OpenAI’s CEO points out in the Moonshots podcast:

Moonshots. Peter Diamandis. “The Rise of OpenAI with Sam Altman” https://www.diamandis.com/podcast/sam-altman

ChatGPT isn’t open source
#

GPT-3 isn’t open source; other Machine Learning Models are, but GPT-3 and DALL-E aren’t. On this point ChatGPT says:

_GPT-3 is a language processing model developed by OpenAI. The model and the data used to train it are owned by OpenAI, and therefore it is not open source. OpenAI has decided not to open it because it considers it a valuable technology and believes it is important to protect it in order to continue funding its development and improvement.
In addition, there are concerns about the possible risks associated with unintentional or malicious use of GPT-3 technology, such as the automated generation of fake news or the automation of social engineering. OpenAI has adopted a restrictive licensing strategy to avoid these concerns and ensure that GPT-3 is used responsibly._

Ethical aspects
#

The results of transformer-based generative models can be convincing enough to pass the Turing test. In June 2022 a Google engineer, who was subsequently fired, stated that he was convinced the LaMDA model was self-aware and therefore had rights analogous to human rights. https://www.livescience.com/google-sentient-ai-lamda-lemoine

In January 2023 rumors about the number of parameters in GPT-4 were circulating on social networks. However, OpenAI CEO Sam Altman denied them and emphasized that GPT-4 won’t be an artificial general intelligence (see Types of Artificial Intelligence).

ChatGPT is a project in continuous evolution. The service is often updated and many researchers have reported changes in its behavior. After a short while using ChatGPT we can observe that an effort has been made to make its responses politically correct. Some examples are:

The insistence that it is software, that it has no agency, and that it is not an Artificial Intelligence

Ethical reflections from the creators of GPT-3
#

The creators of GPT-3 presented the project in the paper “Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., … & Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems, 33.” In the paper they devote a fairly extensive section to discussing the possible misuses of text-generating systems like GPT-3.

Summarized by GPT-3:

_The malicious uses of language models can be somewhat difficult to anticipate because they often involve repurposing language models in a very different setting or for a different purpose than researchers had in mind.
To help with this, we can think in terms of traditional security risk assessment frameworks, which outline key steps such as identifying threats and potential impacts, assessing likelihood, and determining risk as a combination of probability and impact.
We discuss three factors: potential misuse applications, risk actors, and external incentive structures. The potential for misuse of language models increases as the quality of text synthesis improves. GPT-3’s ability to generate multiple paragraphs of synthetic content that people find difficult to distinguish from human-written text is a point of concern in this regard._

Potential misuse applications of language models include disinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing, and social engineering pretexting.
The potential for misuse increases with improvements in the quality of text synthesis.

The authors identify different types of potential “threat actors” based on their level of skill and resources. These range from actors with low or moderate skills and resources to highly skilled and well-resourced groups, such as state-sponsored ones, which they call advanced persistent threats (APTs).

Low- and medium-skilled actors do not currently pose an immediate threat, but improvements in reliability could change this.
APT actors do not discuss their operations in public, but no differences have been detected in these actors’ actions since the release of GPT-2.

Threat actors’ behavior is shaped by external incentive structures such as scalability, reduced deployment cost, and ease of use, which influence which new tactics, techniques, and procedures (TTPs) they adopt.
AI researchers are expected to develop increasingly reliable and steerable language models, which would pose challenges for the scientific community and create the need to work on security solutions.
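The risk-assessment framing in the summary above (risk as a combination of probability and impact) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Threat` class, the multiplicative scoring rule, and in particular the likelihood and impact numbers, which are placeholders and not values taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    """A hypothetical misuse scenario with placeholder scores."""
    name: str
    likelihood: float  # estimated probability, from 0.0 to 1.0
    impact: float      # estimated severity, from 1 (minor) to 5 (severe)

    @property
    def risk(self) -> float:
        # One common convention: risk = likelihood x impact.
        return self.likelihood * self.impact


def rank_by_risk(threats: list[Threat]) -> list[Threat]:
    """Return the threats ordered from highest to lowest risk."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# ASSUMPTION: illustrative scores only, not data from any study.
threats = [
    Threat("spam", likelihood=0.8, impact=2),
    Threat("phishing", likelihood=0.5, impact=4),
    Threat("disinformation", likelihood=0.3, impact=5),
]

for threat in rank_by_risk(threats):
    print(f"{threat.name}: {threat.risk:.2f}")
```

With these placeholder scores, phishing ranks first because it combines a moderate likelihood with a high impact. A real assessment would need evidence-based estimates for both factors.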

The creators of GPT-3 continue their analysis with a discussion of the biases the system may have and how to mitigate them. And they conclude with a section on the system’s energy aspects.

The ChatGPT Hype
#

Generative AI tools capable of creating text from prompts have been available for almost two years now. However, their popularity was largely limited to communities interested in AI and technological innovation. The following figure compares the Google search popularity of GPT-3, the best model so far, with that of a relatively mainstream term like _“ferrari”_.

By contrast, when we add the term _“ChatGPT”_ to the comparison, we get a shock.

The term “ferrari” is a good benchmark because it maintains fairly stable interest and allows us to see how ChatGPT sparks very high global attention. But how does ChatGPT compare to a term with first-order hype during January 2023 like “Shakira”, who has been very trendy due to her latest song, which has become a phenomenon in music and popular culture?

Well, Google Trends suggests that searches for “ChatGPT” are attracting attention of the same order as searches for the Colombian singer, who until recently lived in Barcelona.

Let’s recall the meaning of “hype” according to ChatGPT:

Hype is a term used to describe a large amount of advertising and exaggerated enthusiasm for a product, idea, or trend. In general, it refers to an increase in popularity or attention that has been given to something without a real basis or justification. Thus, the term hype usually refers to a situation in which oversized expectations have been generated about a product or service, in a way that is not consistent with its reality or its capabilities.

Whether or not there is a real basis or justification, ChatGPT is undoubtedly one of the most popular search terms on the internet.
