Generative AI By Shazia Zahoor

contact: shaaz.z9@gmail.com

Introduction to AI

Welcome!

You’re presumably here because you want to learn more about artificial intelligence (AI). Maybe you’ve heard about AI in the media and want to know more; or maybe you’re going to be adopting AI at work or in school, and want to know more about what to expect.

This training module is designed to provide a high-level overview of some core capabilities of artificial intelligence (AI) and give you an intuition of how they work. It’s not a deeply technical module, and we won’t be writing any code or getting into the mathematical details of machine learning models. Instead, we’ll focus on understanding the kinds of things that AI can do, and the basic principles on which it is based.

So, let’s go! Move on to the next unit and we’ll start our exploration of AI.

Generative AI

Key points to understand about generative AI include:

Generative AI is a branch of AI that enables software applications to generate new content; often natural language dialogs, but also images, video, code, and other formats.
The ability to generate content is based on a language model, which has been trained with huge volumes of data – often documents from the Internet or other public sources of information.
Generative AI models encapsulate semantic relationships between language elements (that’s a fancy way of saying that the models “know” how words relate to one another), and that’s what enables them to generate a meaningful sequence of text.
There are large language models (LLMs) and small language models (SLMs) – the difference is based on the volume of data and the number of variables in the model. LLMs are very powerful and generalize well, but can be more costly to train and use. SLMs tend to work well in scenarios that are more focused on specific topic areas, and usually cost less.

Generative AI scenarios

Common uses of generative AI include:

Summarizing or explaining complex documents.

1.Implementing chatbots and AI agents that assist human users.

2.Creating new documents or other content (often as a starting point for further iterative development)

3.Automated translation of text between languages.

4.Summarizing or explaining complex documents.

Computer vision

Key points to understand about computer vision include:

Computer vision is accomplished by using large numbers of images to train a model.
Image classification is a form of computer vision in which a model is trained with images that are labeled with the main subject of the image (in other words, what it’s an image of) so that it can analyze unlabeled images and predict the most appropriate label – identifying the subject of the image.
Object detection is a form of computer vision in which the model is trained to identify the location of specific objects in an image.
There are more advanced forms of computer vision – for example, semantic segmentation is an advanced form of object detection where, rather than indicate an object’s location by drawing a box around it, the model can identify the individual pixels in the image that belong to a particular object.
You can combine computer vision and language models to create a multi-modal model that combines computer vision and generative AI capabilities.

Computer vision scenarios

Common uses of computer vision include:

Robotics and self-driving vehicles.

Auto-captioning or tag-generation for photographs.

Visual search.

Monitoring stock levels or identifying items for checkout in retail scenarios.

Security video monitoring.

Authentication through facial recognition.

Speech

Key points to understand about speech include:

Speech recognition is the ability of AI to “hear” and interpret speech. Usually this capability takes the form of speech-to-text (where the audio signal for the speech is transcribed into text).
Speech synthesis is the ability of AI to vocalize words as spoken language. Usually this capability takes the form of text-to-speech in which information in text format is converted into an audible signal.
AI speech technology is evolving rapidly to handle challenges like ignoring background noise, detecting interruptions, and generating increasingly expressive and human-like voices.

AI speech scenarios

Common uses of AI speech technologies include:

Personal AI assistants in phones, computers, or household devices with which you interact by talking.
Automated transcription of calls or meetings.
Automating audio descriptions of video or text.
Automated speech translation between languages.

Natural language processing

Key points to understand about natural language processing (NLP) include:

NLP capabilities are based on models that are trained to do particular types of text analysis.
While many natural language processing scenarios are handled by generative AI models today, there are many common text analytics use cases where simpler NLP language models can be more cost-effective.
Common NLP tasks include:
- Entity extraction – identifying mentions of entities like people, places, organizations in a document
- Text classification – assigning document to a specific category.
- Sentiment analysis – determining whether a body of text is positive, negative, or neutral and inferring opinions.
- Language detection – identifying the language in which text is written.

Note: In this module, we’ve used the term natural language processing (NLP) to describe AI capabilities derive meaning from “ordinary” human language. You might also see this area of AI referred to as natural language understanding (NLU).

Natural language processing scenarios

Common uses of NLP technologies include:

Implementing chatbots that can answer frequently asked questions or orchestrate predictable conversational dialogs that don’t require the complexity of generative AI.

Analyzing document or transcripts of calls and meetings to determine key subjects and identify specific mentions of people, places, organizations, products, or other entities.

Analyzing social media posts, product reviews, or articles to evaluate sentiment and opinion.

Extract data and insights

Key points to understand about using AI to extract data and insights include:

The basis for most document analysis solutions is a computer vision technology called optical character recognition (OCR).
While an OCR model can identify the location of text in an image, more advanced models can also interpret individual values in the document – and so extract specific fields.
While most data extraction models have historically focused on extracting fields from text-based forms, more advanced models that can extract information from audio recording, images, and videos are becoming more readily available.

Data and insight extraction scenarios

Common uses of AI to extract data and insights include:

Automated processing of forms and other documents in a business process – for example, processing an expense claim.
Large-scale digitization of data from paper forms. For example, scanning and archiving census records.
Indexing documents for search.
Identifying key points and follow-up actions from meeting transcripts or recordings.

Responsible AI

Key points to understand about responsible AI include:

Fairness: AI models are trained using data, which is generally sourced and selected by humans. There’s substantial risk that the data selection criteria, or the data itself reflects unconscious bias that may cause a model to produce discriminatory outputs. AI developers need to take care to minimize bias in training data and test AI systems for fairness.
Reliability and safety: AI is based on probabilistic models, it is not infallible. AI-powered applications need to take this into account and mitigate risks accordingly.
Privacy and security: Models are trained using data, which may include personal information. AI developers have a responsibility to ensure that the training data is kept secure, and that the trained models themselves can’t be used to reveal private personal or organizational details.
Inclusiveness: The potential of AI to improve lives and drive success should be open to everyone. AI developers should strive to ensure that their solutions don’t exclude some users.
Transparency: AI can sometimes seem like “magic”, but it’s important to make users aware of how the system works and any potential limitations it may have.
Accountability: Ultimately, the people and organizations that develop and distribute AI solutions are accountable for their actions. It’s important for organizations developing AI models and applications to define and apply a framework of governance to help ensure that they apply responsible AI principles to their work.

Responsible AI examples

Some example of scenarios where responsible AI practices should be applied include:

An AI-powered college admissions system should be tested to ensure it evaluates all applications fairly, taking into account relevant academic criteria but avoiding unfounded discrimination based on irrelevant demographic factors.
An AI-powered robotic solution that uses computer vision to detect objects should avoid unintentional harm or damage. One way to accomplish this goal is to use probability values to determine “confidence” in object identification before interacting with physical objects, and avoid any action if the confidence level is below a specific threshold.
A facial identification system used in an airport or other secure area should delete personal images that are used for temporary access as soon as they’re no longer required. Additionally, safeguards should prevent the images being made accessible to operators or users who have no need to view them.
A web-based chatbot that offers speech-based interaction should also generate text captions to avoid making the system unusable for users with a hearing impairment.
A bank that uses an AI-based loan-approval application should disclose the use of AI, and describe features of the data on which it was trained (without revealing confidential information).