LLM Overview Slides | LLM & RAG Guide

Slide deck: 95 slides. Titled slides: NLP Tasks with LLM (slide 5), Software Applications (slide 6).

Large Language Models (LLMs)


LLMs are a type of artificial intelligence (AI) capable of processing and generating human-like text in response to a wide range of prompts and questions. Trained on massive datasets of text and code, they can perform various tasks such as:

Generating text in many formats: poems, code, scripts, musical pieces, emails, letters, etc.
Answering open-ended, challenging, or unusual questions informatively: drawing on the knowledge they acquired during training.
Translating languages: converting text from one language to another.
Writing long-form creative content: stories, scripts, and similar pieces that can be hard to distinguish from human-written content.

Retrieval Augmented Generation (RAG)


RAG is an approach that combines the strengths of LLMs with external knowledge sources. It works in three steps:

Retrieval: When given a prompt, RAG searches through an external database of relevant documents to find information related to the query.
Augmentation: The retrieved information is then used to enrich the context provided to the LLM. This can be done by incorporating facts, examples, or arguments into the prompt.
Generation: Finally, the LLM uses the enhanced context to generate a response that is grounded in factual information and tailored to the specific query.
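
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the documents are toy data, retrieval uses simple word-count cosine similarity instead of learned embeddings, and `generate` takes a hypothetical `llm` callable that stands in for whatever model client you use.

```python
import math
import re
from collections import Counter

# Toy knowledge base (invented for illustration); a real system would use
# a document store or vector database.
DOCUMENTS = [
    "The cafeteria opens at 8 am on weekdays.",
    "Expense reports are due on the last Friday of each month.",
    "Support tickets are answered within two business days.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Retrieval: return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def augment(query, passages):
    """Augmentation: place the retrieved passages in the prompt as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def generate(prompt, llm):
    """Generation: `llm` is a hypothetical callable (prompt -> text)."""
    return llm(prompt)
```

A call such as `generate(augment(q, retrieve(q, DOCUMENTS)), llm)` chains the three stages; swapping the word-count retriever for an embedding search changes only `retrieve`.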

RAG offers several advantages over using an LLM on its own:

Improved factual accuracy: By anchoring responses in real-world data, RAG reduces the risk of generating false or misleading information.
Greater adaptability: As external knowledge sources are updated, RAG can access the latest information, making it more adaptable to changing circumstances.
Transparency: RAG facilitates a clear understanding of the sources used to generate responses, fostering trust and accountability.

However, RAG also has its challenges:

Data quality: The accuracy and relevance of RAG's outputs depend heavily on the quality of the external knowledge sources.
Retrieval efficiency: Finding the most relevant information from a large database can be computationally expensive.
Integration complexity: Combining two different systems (retrieval and generation) introduces additional complexity in terms of design and implementation.

Prompt Engineering


Prompt engineering is a crucial technique for guiding LLMs towards generating desired outputs. It involves crafting prompts that:

Clearly define the task: Specify what the LLM should do with the provided information.
Provide context: Give the LLM enough background knowledge to understand the prompt and generate an appropriate response.
Use appropriate language: Frame the prompt in a way that aligns with the LLM's capabilities and training data.
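
As a small illustration of the three points above, the sketch below assembles a prompt from an explicit task, background context, and a requested output format. The function name `build_prompt` and the section labels are assumptions made for this example, not a standard.

```python
def build_prompt(task, context, text, output_format):
    """Assemble a prompt with an explicit task, background context, and output format."""
    sections = [
        f"Task: {task.strip()}",                    # clearly define the task
        f"Context: {context.strip()}",              # provide background knowledge
        f"Output format: {output_format.strip()}",  # state the expected shape of the answer
        "Input:",
        text.strip(),
    ]
    return "\n\n".join(sections)

prompt = build_prompt(
    task="Summarize the text for a new employee.",
    context="The reader has no background in finance.",
    text="Quarterly revenue rose while operating costs stayed flat.",
    output_format="Three short bullet points in plain language.",
)
```

Keeping the sections separate makes it easy to vary one element, such as the output format, while holding the rest of the prompt fixed.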



Advantages of using RAG


Better Accuracy: When factual correctness is crucial, RAG is a strong fit. It retrieves information from external sources, so the AI assistant can check its responses against them and provide well-sourced answers.
Domain Knowledge: Consider an AI assistant for medical diagnosis, legal questions, or up-to-date tax law. RAG can query domain sources, for example medical databases, to enhance its responses and keep them aligned with established knowledge in that domain.
Reduce Hallucination: LLMs sometimes fabricate information, a phenomenon called hallucination. RAG mitigates this risk by grounding the response in retrieved data.
Building Trust: By citing sources, RAG fosters trust with users. Users can verify the information and see the reasoning behind the response.

Disadvantages of using RAG


Added Latency: The retrieval step adds a delay to each response. If real-time response is essential, a pre-trained LLM without retrieval might be sufficient.
Ambiguous Queries: RAG works best when the user's query and context are clear. If the conversation is ambiguous, the retrieved information might not be relevant.
Privacy Concerns: If the AI assistant deals with sensitive user data, RAG might raise privacy concerns. Queries sent to external retrieval services could expose user information.



When to fine-tune an LLM


Consider fine-tuning a large language model (LLM) when you want it to perform better at a specific task or adapt to a particular domain. Here are example scenarios where fine-tuning is a good fit:

Domain-Specific Nuances: If you need an LLM for financial analysis, legal documents, or medical documents, fine-tuning is the better choice. For example, the word "capital" has a specific meaning in the finance domain. A pre-trained LLM understands general language, but it may not grasp legal, financial, or medical terms and jargon. Fine-tuning on finance Q&A or on legal documents adapts the LLM to that specific domain.

Instruction Fine-Tuning: In this technique, the training data pairs instructions or demonstrations with the desired outputs. It is useful for tasks where you want the LLM to follow a certain style or format, like writing safety instructions in a specific tone.

Specialized Tasks: Imagine you want an LLM to write different kinds of creative content, like poems or code. Fine-tuning on a dataset of poems can improve its poetry generation skills, while fine-tuning on code samples can enhance its code writing abilities.
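
To make the instruction fine-tuning scenario concrete, the sketch below writes training records as JSON Lines. The field names `instruction`, `input`, and `output` are an assumption chosen for illustration (a layout used by several open instruction datasets); the exact schema depends on the training tool you use, and the two records are invented examples.

```python
import json

# Invented example records for a "safety instructions in a specific tone" task.
EXAMPLES = [
    {
        "instruction": "Rewrite the warning as a calm, direct safety instruction.",
        "input": "dont touch the hot plate!!",
        "output": "Do not touch the hot plate. Allow it to cool before handling.",
    },
    {
        "instruction": "Rewrite the warning as a calm, direct safety instruction.",
        "input": "goggles or you'll regret it",
        "output": "Wear safety goggles at all times in the lab.",
    },
]

REQUIRED_FIELDS = {"instruction", "input", "output"}

def to_jsonl(records):
    """Serialize records to JSON Lines, checking that each has the required fields."""
    lines = []
    for i, record in enumerate(records):
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"record {i} is missing fields: {sorted(missing)}")
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)
```

Validating the fields before training catches malformed records early, which matters more as the dataset grows.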

However, fine-tuning isn't always the right answer. Start with a standard LLM and prompt engineering, then try RAG, and only then consider fine-tuning. Here are cases where you should not fine-tune:

General Use Cases: If you need a broad LLM for various tasks, a pre-trained model will do a better job. Pre-trained models are versatile, trained on diverse data, and can handle many tasks well enough without specific fine-tuning.

Limited Data: Fine-tuning works well when you have a lot of data related to your specific task or domain. If you add only a few records, as most demos do, fine-tuning might not be effective and could even harm the model's performance.

Knowledge Integration: If your goal is to add proprietary, recent, or otherwise specific knowledge to the LLM, retrieval-augmented generation (RAG) is the better approach. In RAG, relevant information is retrieved from a knowledge base and passed to the LLM. You can start with an out-of-the-box RAG setup and optimize it further, for example by summarizing the knowledge or creating embeddings with metadata.
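
The guidance above (prompting first, then RAG, then fine-tuning) can be written down as a small decision helper. This is only a sketch of the reasoning: the function name, its parameters, and especially the `min_examples` threshold of 1000 are placeholders invented for this example, not recommended values.

```python
def choose_approach(prompting_is_enough, needs_external_knowledge,
                    labeled_examples, min_examples=1000):
    """Return the first approach to try, following the order described in the text.

    min_examples is an arbitrary placeholder; pick a threshold that fits your task.
    """
    if prompting_is_enough:
        return "prompt engineering"   # a standard LLM already does the job
    if needs_external_knowledge:
        return "RAG"                  # proprietary or recent knowledge is missing
    if labeled_examples >= min_examples:
        return "fine-tuning"          # enough task-specific data is available
    return "prompt engineering"       # too little data: keep iterating on prompts
```

For instance, a team with a few dozen labeled records and a need for up-to-date internal documents would be pointed to RAG rather than fine-tuning.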



Fine Tuning Steps



How can LLMs be fine-tuned for summarization?
LLMs such as GPT-3 can be adapted for summarization using the following approaches:
Supervised training - The simplest approach is to fine-tune the LLM on a large dataset of text-summary pairs. The model is trained to generate the corresponding summary given the input text.
This requires a sizable supervised dataset, which can be expensive to create. Public datasets such as CNN/DailyMail can be used.

Self-supervised training - The LLM is first trained with the original text as input and its first few sentences as the "summary". This creates weak supervision from the data itself.
The model is then fine-tuned on a smaller set of human-written summaries to improve accuracy. This approach requires less labeled data.
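
The weak-supervision step described above can be sketched as follows. The sentence splitter is a deliberately naive regular expression and the default of two lead sentences is an assumption for this example; both would need tuning on real data.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

def make_weak_pair(article, n_lead=2):
    """Build a (text, pseudo-summary) pair using the first n_lead sentences as the summary."""
    sentences = split_sentences(article)
    return {"text": article, "summary": " ".join(sentences[:n_lead])}

pair = make_weak_pair(
    "The council approved the budget. Schools receive most of the funds. "
    "Road repairs start in spring."
)
```

Pairs produced this way are noisy, which is why the text follows them with a pass over human-written summaries.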

Reinforcement learning - The LLM is first trained with an autoencoding objective, that is, to reproduce the input text. It then generates summaries and receives rewards based on their quality and conciseness.
The model learns to generate better summaries through trial and error to maximize these rewards. However, this requires defining a good reward function.
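
As a sketch of what such a reward function might look like, the example below combines a quality term (how many reference words the summary covers) with a conciseness penalty (how far the summary exceeds a target length ratio). The formula, `max_ratio`, and `length_weight` are all assumptions made for illustration; practical reward functions are usually learned or far more elaborate.

```python
import re

def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def summary_reward(source, summary, reference, max_ratio=0.3, length_weight=0.5):
    """Reward = reference-word coverage minus a penalty for overly long summaries."""
    ref_words = set(tokens(reference))
    summary_tokens = tokens(summary)
    if not ref_words or not summary_tokens:
        return 0.0
    # Quality: fraction of distinct reference words that appear in the summary.
    quality = len(ref_words & set(summary_tokens)) / len(ref_words)
    # Conciseness: penalize length beyond max_ratio of the source length.
    ratio = len(summary_tokens) / max(len(tokens(source)), 1)
    excess = max(0.0, ratio - max_ratio)
    return quality - length_weight * excess
```

With this shape, copying the whole source scores lower than a short summary that covers the same reference words, which is the behavior the conciseness term is meant to encourage.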

Filtering and post-processing - Generated summaries from the LLM can be filtered and refined using techniques like:
• Extracting sentences with the highest similarity to human references
• Removing repetitive sentences
• Combining overlapping content into a single sentence, etc.
This requires minimal fine-tuning of the base LLM but provides less control over the summary style.
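
One of the techniques listed above, removing repetitive sentences, can be sketched with a simple word-overlap test. The Jaccard measure and the 0.8 threshold are assumptions chosen for this example; an embedding-based similarity would be a reasonable alternative.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

def jaccard(a, b):
    """Jaccard similarity between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def remove_repetitive(summary, threshold=0.8):
    """Drop any sentence whose word set is too similar to an earlier kept sentence."""
    kept, kept_word_sets = [], []
    for sentence in split_sentences(summary):
        words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        if any(jaccard(words, earlier) >= threshold for earlier in kept_word_sets):
            continue  # near-duplicate of a sentence we already kept
        kept.append(sentence)
        kept_word_sets.append(words)
    return " ".join(kept)
```

The first occurrence of each sentence is kept, so the summary's original ordering is preserved.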

Prompting - The LLM can be prompted to generate a summary using natural language instructions. For example:
"In 2-3 short paragraphs, summarize the main points of the following text:"
This relies more on the pre-trained LLM's abilities and requires less labeled data, but accuracy tends to be lower.

In short, there are a variety of approaches to adapting LLMs for summarization, from fully supervised to minimally supervised. The choice depends on the available data, the required accuracy, and any custom needs.



Verify LLM and AI Assistant Answers

If you are using an AI assistant, you should cross-check the facts and numbers it gives.

Check in Vector DB: If you are using a vector DB with RAG, you can inspect the values that retrieval returned. This helps ensure that the generated response is in line with the values stored in the vector DB.
Use a Second LLM: An additional approach is to ask a narrower question of a second LLM (or the same one) and compare the answers. For example, if a page of text says that the company Dataknobs has revenue of $78M, you can ask the narrower question "How much revenue does Dataknobs have?". Consider the additional cost of the second call: a response may contain more than one fact, and each fact may need its own call.
Call a Search Engine: You can run a query on a search engine programmatically and check the response. Depending on the domain, this may or may not work, and it may require parsing the search engine's results.
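
The first check above, comparing the answer with what retrieval returned, can be partly automated for numeric facts. The sketch below flags figures in an answer that do not appear in any retrieved passage. The regular expression covers only simple forms such as `$78M`, `120`, or `4.5%`, and the function names are hypothetical; the revenue figure reuses the illustrative example from the text.

```python
import re

# Matches simple figures: optional "$", digits with optional commas/decimals,
# optional suffix (M, B, K or %). This is an illustrative pattern, not exhaustive.
NUMBER = re.compile(r"\$?\d[\d,]*(?:\.\d+)?[MBK%]?")

def extract_figures(text):
    """Return the set of numeric figures found in the text, with commas removed."""
    return {m.group(0).replace(",", "") for m in NUMBER.finditer(text)}

def unsupported_figures(answer, retrieved_passages):
    """List figures in the answer that appear in none of the retrieved passages."""
    supported = set()
    for passage in retrieved_passages:
        supported |= extract_figures(passage)
    return sorted(extract_figures(answer) - supported)

passages = ["Dataknobs reported revenue of $78M."]
flagged = unsupported_figures("Dataknobs has revenue of $87M and 120 employees.", passages)
```

An empty result does not prove the answer is correct; it only shows that every figure in it can be traced to a retrieved passage.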





How to evaluate an LLM


Perplexity: Measures how well a language model predicts a sample of text, computed as the exponential of the average negative log-likelihood per token. Lower perplexity indicates better performance.
BLEU Score: BLEU (Bilingual Evaluation Understudy) is commonly used to evaluate the quality of machine-translated text by comparing it to human-generated translations.
ROUGE Score: ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is used to evaluate the quality of summaries produced by a language model by comparing them to reference summaries.
Human Evaluation: Human judges assess the quality of text generated by the language model based on criteria such as fluency, coherence, and relevance.
Word Error Rate (WER): Measures the difference between the generated words and a reference text as the number of word substitutions, deletions, and insertions divided by the number of reference words. Lower WER indicates better performance.
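
Three of the automatic metrics above are simple enough to compute by hand, which can help build intuition before reaching for a library. The sketch below uses whitespace tokenization and takes per-token probabilities as a plain list, both simplifying assumptions; the ROUGE function covers only unigram recall (ROUGE-1), not the full metric family.

```python
import math
from collections import Counter

def perplexity(token_probs):
    """Exponential of the average negative log-probability the model gave each token."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion of a reference word
                cur[j - 1] + 1,          # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1] / len(ref)

def rouge1_recall(reference, candidate):
    """Fraction of reference unigrams that also appear in the candidate (clipped counts)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    if not ref_counts:
        raise ValueError("reference must contain at least one word")
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())
```

A model that assigns probability 0.25 to every token has perplexity 4, matching the intuition that it is as uncertain as a uniform choice among four options.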




