LLMs


Large Language Models and Applications
GPT-4 just released!
Model size is increasing exponentially
1. Behind LLMs: Language Models
Theories of Language Models
Three approaches for language modelling

• Sentence Correction (Denoising)

• Text Completion

• Text Translation
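The three approaches above differ mainly in what the model sees and what it must produce. A minimal sketch of the (input, target) pairs, assuming whitespace tokenization and a hypothetical `[MASK]` placeholder; none of the helper names come from the slides:

```python
MASK = "[MASK]"  # hypothetical placeholder token, not taken from the slides


def denoising_pair(tokens, corrupt_positions):
    """Sentence correction: hide some tokens, the target is the clean sentence."""
    corrupted = [MASK if i in corrupt_positions else tok
                 for i, tok in enumerate(tokens)]
    return corrupted, list(tokens)


def completion_pair(tokens, prefix_len):
    """Text completion: the input is a prefix, the target is the continuation."""
    return list(tokens[:prefix_len]), list(tokens[prefix_len:])


def translation_pair(source_tokens, target_tokens):
    """Text translation: the input and target are different sequences."""
    return list(source_tokens), list(target_tokens)


sentence = "the cat sat on the mat".split()
noisy, clean = denoising_pair(sentence, {1, 4})
prefix, continuation = completion_pair(sentence, 3)
source, target = translation_pair(sentence, "le chat est assis sur le tapis".split())
```

Denoising keeps input and target the same length, completion splits one sequence in two, and translation maps between two independent sequences.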

Theories of Language Models
Parametric architectures for sentence denoising: Encoder

Theories of Language Models
Parametric architectures for text completion: Decoder

Theories of Language Models
Parametric architectures for text translation: Encoder-Decoder
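One way to see the difference between the encoder, decoder, and encoder-decoder architectures is through which positions may attend to which. The sketch below assumes Transformer-style attention, which the slides do not state explicitly; the function names are illustrative only:

```python
def bidirectional_mask(n):
    """Encoder: every position may attend to every other position."""
    return [[True] * n for _ in range(n)]


def causal_mask(n):
    """Decoder: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def cross_attention_mask(target_len, source_len):
    """Encoder-decoder: each target position sees the whole encoded source."""
    return [[True] * source_len for _ in range(target_len)]


encoder_view = bidirectional_mask(3)        # denoising: full context
decoder_view = causal_mask(3)               # completion: left context only
seq2seq_view = cross_attention_mask(2, 3)   # translation: target reads source
```

In an encoder-decoder model the target side would typically also use a causal mask over its own positions, in addition to the cross-attention shown here.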

Theories of Language Models
Training Language Models

• Pre-training

• Supervised Training

• Reinforcement Learning

2. More is Different, Language Models Likewise

LLM is different: A paradigm shift
• Harder to handle: Training cost

LLM is different: A paradigm shift
• Easier to use: From fine-tuning to prompt engineering

LLM is different: A paradigm shift
• Emerging capabilities: in-context learning (ICL), chain-of-thought (CoT), and multimodal (MM) reasoning
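In-context learning and chain-of-thought prompting both work by changing the prompt rather than the model weights. A minimal sketch of how such a prompt might be assembled; the `Q:`/`A:` layout, the dictionary keys, and the helper name are assumptions for illustration, not a format prescribed by the slides:

```python
def build_few_shot_prompt(demonstrations, question, chain_of_thought=False):
    """Concatenate solved examples, then the new question for the model to answer."""
    blocks = []
    for demo in demonstrations:
        answer = demo["answer"]
        if chain_of_thought:
            # CoT: show the intermediate reasoning before the final answer.
            answer = f"{demo['rationale']} The answer is {demo['answer']}."
        blocks.append(f"Q: {demo['question']}\nA: {answer}")
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


demos = [{
    "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
    "rationale": "He starts with 3 and adds 2, so 3 + 2 = 5.",
    "answer": "5",
}]
icl_prompt = build_few_shot_prompt(demos, "Ann has 4 pens and loses 1. How many are left?")
cot_prompt = build_few_shot_prompt(demos, "Ann has 4 pens and loses 1. How many are left?",
                                   chain_of_thought=True)
```

The only difference between the two prompts is whether the demonstrations include their reasoning steps.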

LLM is different: A paradigm shift
• Solving real-world problems with general intelligence

3. Examples and applications of LLMs

ChatGPT: Reinforcement Learning from Human Feedback
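A common ingredient of reinforcement learning from human feedback is a reward model trained on pairs of responses ranked by people. The sketch below shows the usual pairwise preference loss, -log sigmoid(r_chosen - r_rejected); the slide gives no formula, so treat this as an assumed, simplified form with illustrative names:

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), written as softplus of the negated margin."""
    return softplus(-(reward_chosen - reward_rejected))


# The loss is small when the preferred response already scores higher,
# and large when the reward model ranks the pair the wrong way round.
good_ranking = pairwise_preference_loss(2.0, 0.0)
bad_ranking = pairwise_preference_loss(0.0, 2.0)
```

The trained reward model then supplies the reward signal for the reinforcement learning stage.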

Kosmos-1: Multimodal Large Language Models

PaLM-E: Embodied Language Models

Visual ChatGPT: Large Language Model + Visual Models

Galactica: Language Model + Research Data

Applications
• Education

• Customer service / advisor

• Knowledge Management

• Recommendation

• Virtual Assistant

4. Future Research Directions about LLMs

“Replicating” LLMs
Approaches: Improving Effectiveness

• Curating training corpus

• Human Evaluation for Ethics

• Learning from existing LMs

Alpaca: Learning from existing LLMs

“Replicating” LLMs
Approaches: Improving Efficiency

• Training: Efficient / Staged Training

• Hardware: GPU / Cloud / Federated Learning

• Software: Compiling / Coding

“Replicating” LLMs
Pros:

• Encourages open-source

• Business security

• Customizable for specific use

“Replicating” LLMs
Cons:

• Expensive with risks

• Ethical considerations

• Limited data availability

“Leveraging” LLMs
Approaches: borrowing wisdom from LLMs

• In-Context Learning

• Knowledge Distillation

• Systems with LLMs

• Prompt Engineering
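Of the approaches listed above, knowledge distillation has a compact numerical form. The sketch below uses the classic soft-label variant, where a student matches a teacher's temperature-softened output distribution; the slides do not say which variant is meant, and the function names and default temperature are assumptions:

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher means softer)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between the two softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
untrained_student = [0.0, 0.0, 0.0]
```

With LLMs, distillation is also often done at the data level, by fine-tuning a smaller model on text generated by a larger one.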

MathPrompter: Prompt LM and verify result
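The "prompt and verify" idea can be sketched as follows: ask the model for several independent solutions to a templated question, check that they agree on random inputs, and only then compute the real answer. The two solution functions below are hand-written stand-ins for LLM outputs, and all names are hypothetical:

```python
import random


def candidates_agree(candidates, variable_names, trials=5, seed=0, tolerance=1e-9):
    """Return True if every candidate gives the same value on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        values = {name: rng.randint(1, 100) for name in variable_names}
        results = [solve(**values) for solve in candidates]
        if any(abs(r - results[0]) > tolerance for r in results[1:]):
            return False
    return True


# Stand-ins for two solutions a model might return for the template
# "Each of A customers pays B dollars. What is the total revenue?"
def algebraic_solution(A, B):
    return A * B


def pythonic_solution(A, B):
    total = 0
    for _ in range(A):
        total += B
    return total


def verified_answer(candidates, actual_values):
    """Answer only when the candidates reach consensus; otherwise return None."""
    if not candidates_agree(candidates, list(actual_values)):
        return None  # no consensus: re-prompt instead of answering
    return candidates[0](**actual_values)


answer = verified_answer([algebraic_solution, pythonic_solution], {"A": 12, "B": 7})
```

Agreement on random inputs does not prove correctness, but it filters out many answers where the model's separate derivations contradict each other.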

“Leveraging” LLMs
Pros:

• Easy to start

• Relatively low cost

Cons:

• Competitiveness

• Performance variance
“Replacing” LLMs
Approaches:

• Augmented Language Models

• New Architecture for Language Models

Pros & cons:

• High risk, high reward

Augmented Language Models: Toolformer
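An augmented language model can emit tool calls inline in its text, which a runtime then executes and fills in. The sketch below handles a single hypothetical `Calculator` tool with a bracketed call syntax modelled loosely on Toolformer's; the exact syntax, the `->` separator, and the helper names are assumptions:

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculator(expression):
    """Evaluate +, -, *, / arithmetic without calling eval()."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)


CALL_PATTERN = re.compile(r"\[Calculator\(([^\]]*)\)\]")


def execute_tool_calls(text):
    """Replace each [Calculator(expr)] with [Calculator(expr) -> result]."""
    def fill(match):
        result = calculator(match.group(1))
        return f"[Calculator({match.group(1)}) -> {result:.2f}]"
    return CALL_PATTERN.sub(fill, text)


generated = "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed the test."
augmented = execute_tool_calls(generated)
```

The model itself decides where to place the call; the runtime only parses and executes it.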

New Architecture for Language Models: AFT
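Assuming AFT here refers to the Attention Free Transformer (the slide does not expand the acronym), its core operation replaces the query-key dot product with an element-wise gate over a weighted average of the values. For queries Q, keys K, values V, and learned pairwise position biases w, the full variant can be written as:

```latex
\[
Y_t = \sigma_q(Q_t) \odot
      \frac{\sum_{t'=1}^{T} \exp\!\left(K_{t'} + w_{t,t'}\right) \odot V_{t'}}
           {\sum_{t'=1}^{T} \exp\!\left(K_{t'} + w_{t,t'}\right)}
\]
```

Here \sigma_q is a sigmoid applied to the query, \odot is the element-wise product, and T is the sequence length. Because the keys and values are combined before the query is applied, no T-by-T attention matrix over heads has to be materialized.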

5. Summary

A new Moore’s Law is coming!

Thanks!
Email: [email protected]
Homepage: zihanwang314.github.io

