Generative AI

There has been a lot of buzz around Generative AI. But what exactly is this so-called Generative AI, and have you ever wondered why there is so much buzz around it?

As the name suggests, Generative AI generates content. It does not simply capture data from some web page; it creates its own content according to the prompt it is given. The content it creates can be in any form: text, audio, video, images, and so on.

Generative AI was introduced long ago in chatbots, but those early systems were stateless (they could not remember the previous conversation). Later, with the help of Transformers, Generative Adversarial Networks (GANs), and similar architectures, it was developed to an extent where it can remember the previous conversation and generate data accordingly in any format. GANs are generally used to generate images. Generative AI lacks cognitive thinking; it only gives us content according to the data it has been trained on. To generate text we use LLMs (Large Language Models), which are essentially transformers. Transformers are very large neural networks with billions of parameters, pre-trained on huge amounts of data.

ARCHITECTURE OF GENERATIVE AI (see image source [1] in the References)



Training Generative AI to generate text:

Initially, the necessary data is gathered. Once collected, it is segmented into chunks. The chunks then go through a transformer-based embedding model, which converts the data into numerical representations. These numerical representations get stored in the vector DB.
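To make that flow concrete, here is a toy sketch in Python. The hash-based "embedding" and the in-memory list standing in for a vector DB are placeholders for illustration only; a real setup would use a proper embedding model and vector database (both are covered under Components below).

```python
import numpy as np

def toy_embed(text: str, dim: int = 8) -> np.ndarray:
    # Stand-in for a real embedding model: maps a chunk of text to a dense vector.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.random(dim)

raw_data = "Generative AI creates new content according to the prompt it is given. " * 10

# 1. Split the raw data into chunks.
chunks = [raw_data[i:i + 100] for i in range(0, len(raw_data), 100)]

# 2. Convert each chunk into a numerical representation and 3. store it.
vector_db = [(chunk, toy_embed(chunk)) for chunk in chunks]   # (text, vector) pairs
print(len(vector_db), "chunks stored")
```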

How Generative AI generates content when a prompt is given:

When we give a prompt, the system gathers all the relevant data from the vector database and sends that data through the LLM to generate the relevant content. While retrieving the relevant data from the database, it calculates the vector distance and generates the content accordingly. To put it simply, the vector distance here is nothing but the distance between the vector of the prompt and the vectors of the stored chunks: the smaller the distance, the more relevant the chunk.
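Here is a small illustration of what that vector distance means, assuming we already have vectors for the prompt and for a few stored chunks (the numbers below are made up):

```python
import numpy as np

def cosine_distance(a, b):
    # 0.0 means the vectors point in the same direction; larger means less similar.
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Made-up embeddings: one for the prompt and three for stored chunks.
prompt_vec = np.array([0.2, 0.8, 0.1])
chunk_vecs = {
    "chunk about transformers": np.array([0.25, 0.75, 0.05]),
    "chunk about image generation": np.array([0.9, 0.1, 0.3]),
    "chunk about embeddings": np.array([0.3, 0.7, 0.2]),
}

# The chunk with the smallest distance to the prompt is the most relevant one.
for text, vec in sorted(chunk_vecs.items(), key=lambda kv: cosine_distance(prompt_vec, kv[1])):
    print(f"{cosine_distance(prompt_vec, vec):.3f}  {text}")
```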

Components: 

Text Splitter: This is used to break down huge amounts of data into smaller chunks.
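For illustration, a minimal character-based splitter could look like the sketch below; real text splitters are usually smarter about sentence and paragraph boundaries.

```python
def split_text(text, chunk_size=200, overlap=20):
    # Slide a window over the text; overlapping chunks help keep context at the edges.
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

document = "Generative AI creates new content according to the prompt it is given. " * 30
print(len(split_text(document)), "chunks")
```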

Embedding Model: Most embedding models are transformer based. Embeddings are made by assigning each item in the data to a dense vector. Since similar items get similar vectors by construction, embeddings can be used to find related items or to understand the context or intent of the data. There are many different embedding models available.
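As an example, here is a small sketch using the sentence-transformers library; the model name is just one commonly used choice, not a recommendation.

```python
# Requires: pip install sentence-transformers
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # one popular general-purpose model

sentences = [
    "Generative AI creates new content from a prompt.",
    "GANs are generally used to generate images.",
]
embeddings = model.encode(sentences)   # one dense vector per sentence
print(embeddings.shape)                # (2, 384) for this particular model
```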



Vector DB: This stores all the embeddings of the text.
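As a sketch, an in-memory vector index can be built with a library like FAISS; the random vectors below just stand in for real chunk embeddings.

```python
# Requires: pip install faiss-cpu numpy
import faiss
import numpy as np

dim = 384                                   # must match the embedding model's output size
index = faiss.IndexFlatL2(dim)              # exact nearest-neighbour search on L2 distance

chunk_embeddings = np.random.rand(10, dim).astype("float32")   # stand-ins for real embeddings
index.add(chunk_embeddings)                 # store every chunk vector in the index

prompt_embedding = np.random.rand(1, dim).astype("float32")    # stand-in for the prompt vector
distances, ids = index.search(prompt_embedding, 3)             # the 3 closest chunks
print(ids[0], distances[0])
```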

LLM: LLMs are models that have been trained on huge volumes of data. They have billions of parameters and are capable of understanding and responding to text; they are trained to understand human language. There are many LLMs available to work with, such as GPT (Generative Pre-trained Transformer), BLOOM, RoBERTa (Robustly Optimized BERT Pretraining Approach), and BERT (Bidirectional Encoder Representations from Transformers). LLMs use transformer models.
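As a quick illustration, here is a minimal text-generation call using the Hugging Face transformers library; GPT-2 is used only because it is small and openly available, not because it is the best model.

```python
# Requires: pip install transformers torch
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Generative AI is"
result = generator(prompt, max_new_tokens=30, num_return_sequences=1)
print(result[0]["generated_text"])
```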

We have heard the term Transformers many times by now, so let us discuss transformers first.

Transformers: At the end of the day, transformers are nothing but huge neural networks with billions of parameters. They are trained on a huge corpus of data to understand the context of text, and they have demonstrated State Of The Art (SOTA) performance in the field of NLP.


I know that the transformer architecture shown above looks complicated; I will explain the architecture of transformers in my next blog.

Brief intro to Parameters, SOTA, and NLP:

Parameters: As we all know, neural networks are nothing but connections of multiple neurons, and computationally these neurons are just numerical values. To shape the network's understanding, or to enhance what the neurons learn, some computation involving many variables is needed; those variables are the parameters. They are what allow a model to learn the data properly, and they determine how the input data is converted into output.
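To see what counting parameters actually means, here is a tiny PyTorch network; the layer sizes are arbitrary and only there to make the arithmetic visible.

```python
# Requires: pip install torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 64),   # 10*64 weights + 64 biases
    nn.ReLU(),
    nn.Linear(64, 1),    # 64*1 weights + 1 bias
)

total = sum(p.numel() for p in model.parameters())
print(total)   # 640 + 64 + 64 + 1 = 769 learnable parameters
```

An LLM works the same way, just with billions of such values instead of a few hundred.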

State Of The Art (SOTA): This basically means the most advanced technology, or the most modern algorithms and models.

Natural Language Processing (NLP): This is used to make computers or other devices understand and interpret human language.

Fine Tuning: This is the process of adapting a pre-trained model to perform specific tasks or to understand a particular domain more effectively.

There are many ways to fine-tune LLMs (a rough sketch of the supervised case is shown after this list):

Self-supervised: Here we give domain-specific data to the LLM to learn from.
Supervised: Here the data we give is labelled, typically input–output pairs (often laid out as tables).
Reinforcement Learning: Here the learning is reward based: the model is adjusted according to feedback on its responses.
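For the supervised case, a rough (untested) sketch with the Hugging Face Trainer might look like this; the tiny dataset and the hyperparameters are placeholders, not a working recipe.

```python
# Requires: pip install transformers datasets torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token          # GPT-2 has no padding token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A tiny made-up "domain" corpus standing in for real training data.
texts = ["Invoice 1042 was settled in the third quarter.",
         "Fleet utilisation rose sharply after the new routing policy."]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=64), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-gpt2",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    # mlm=False means plain causal language modelling (predict the next token).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```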

Have you ever wondered what would happen if we ask ChatGPT or any GenAI tool about something that does not exist in the data it has been trained on?

It would either tell us that it has no knowledge of the topic or give us some rubbish answer. Don't you think this proves that it has no cognitive thinking? But a lot of people are thinking of designing AI that would have cognitive thinking like us humans; that is what is called Artificial General Intelligence (AGI). Let us see how far the GPTs will develop.

General Knowledge: 

A company called Bloomberg had 40 years of financial data, including both public and private data. They used all 40 years of that financial data to train a GPT with many billions of parameters. The end product, called BloombergGPT, turned out to be a game changer in the finance industry.



References:

[1] Image source 1: Nandakishore J. (2023, June 13). Emerging architecture for generative AI on textual data. Medium. https://nandakishorej8.medium.com/emerging-architecture-for-generative-ai-on-textual-data-a71aaea0087f
