LLMs
Applications
GPT-4 just released!
Model size is increasing exponentially
1. Behind LLMs: Language Models
Theories of Language Models
Three approaches for language modelling
· Sentence Denoising
· Text Completion
· Text Translation
Parametric architectures for sentence denoising: Encoder
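A minimal sketch of the denoising setup an encoder is trained on: some input tokens are hidden and the model must recover them from the full surrounding context. The `[MASK]` token, the `corrupt` helper, and the masking rate are illustrative assumptions, not taken from the slides.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def corrupt(tokens, mask_rate=0.15, seed=0):
    """Return (noisy_tokens, targets); targets maps position -> original token."""
    rng = random.Random(seed)
    noisy, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            noisy.append(MASK)   # hide this token from the encoder
            targets[i] = tok     # the encoder is trained to predict it back
        else:
            noisy.append(tok)
    return noisy, targets


tokens = "the model reads the whole sentence at once".split()
noisy, targets = corrupt(tokens, mask_rate=0.3, seed=1)
print(noisy)
print(targets)
```

Because the encoder sees tokens on both sides of each masked position, this objective suits understanding tasks more than left-to-right generation.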
Parametric architectures for text completion: Decoder
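A minimal sketch of text completion as a decoder performs it: generate one token at a time, each conditioned only on the tokens to its left. A bigram count table stands in for the neural decoder here; `train_bigram`, `complete`, and the toy corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        toks = sentence.split()
        for prev, nxt in zip(toks, toks[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(prompt, counts, max_new_tokens=5):
    """Greedy left-to-right completion: each step sees only earlier tokens."""
    toks = prompt.split()
    for _ in range(max_new_tokens):
        options = counts.get(toks[-1])
        if not options:
            break  # no known continuation for the last token
        toks.append(options.most_common(1)[0][0])
    return " ".join(toks)


corpus = [
    "large language models complete text",
    "language models complete text left to right",
]
counts = train_bigram(corpus)
print(complete("large language", counts, max_new_tokens=3))
```

A real decoder conditions on the whole prefix rather than only the last token, but the generation loop has the same shape.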
Parametric architectures for text translation: Encoder-Decoder
Training Language Models
· Pre-training
· Supervised Training
· Reinforcement Learning
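The three stages above are commonly written as the objectives below. This is a sketch under standard assumptions, since the slides do not give formulas: $x$ is a text sequence or prompt, $y$ a target response, $p_\theta$ (equivalently the policy $\pi_\theta$) the model, and $r$ a reward function.

```latex
% Pre-training: next-token prediction on unlabeled text
\mathcal{L}_{\text{pre}}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})

% Supervised training: the same loss on (prompt x, response y) pairs
\mathcal{L}_{\text{sup}}(\theta) = -\sum_{t=1}^{|y|} \log p_\theta(y_t \mid x, y_{<t})

% Reinforcement learning: maximize the expected reward of sampled responses
J(\theta) = \mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)} \left[ r(x, y) \right]
```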
2. More is Different, Language Models Likewise
LLM is different: A paradigm shift
• Harder to handle: Training cost
• Easier to use: From fine-tuning to prompt engineering
• Emergent capabilities: in-context learning (ICL), chain-of-thought (CoT), and multimodal (MM) reasoning
• Solving real-world problems with general intelligence
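To make the prompt-engineering and ICL / CoT points above concrete, here is a minimal sketch of how a few-shot prompt is assembled: the task is taught through demonstrations placed in the input, with no weight updates. The `build_prompt` helper, the `Q:`/`A:` layout, and the example questions are illustrative assumptions.

```python
def build_prompt(demos, question, chain_of_thought=False):
    """Format demonstrations followed by the new, unanswered question."""
    parts = []
    for d in demos:
        answer = d["answer"]
        if chain_of_thought:
            # CoT: show the reasoning steps before the final answer
            answer = f'{d["rationale"]} The answer is {d["answer"]}.'
        parts.append(f'Q: {d["question"]}\nA: {answer}')
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)


demos = [
    {
        "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
        "rationale": "He starts with 3 and adds 2, so 3 + 2 = 5.",
        "answer": "5",
    }
]
new_question = "A shelf holds 4 books and 6 more are added. How many books are there?"
print(build_prompt(demos, new_question, chain_of_thought=True))
```

With `chain_of_thought=False` the same demonstrations show only final answers, which is plain in-context learning.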
3. Examples and applications of LLMs
ChatGPT: Reinforcement Learning from Human Feedback
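A minimal sketch of the reward-modelling step typically used in RLHF, assuming the common pairwise formulation: annotators pick the better of two responses, and the reward model is trained so the chosen response scores higher. The function name and the scalar inputs are illustrative; the slide does not specify a loss.

```python
import math


def pairwise_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), computed in a numerically stable way."""
    diff = reward_chosen - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    # For negative diff, rewrite log(1 + exp(-diff)) to avoid overflow
    return -diff + math.log1p(math.exp(diff))


print(pairwise_loss(2.0, 0.0))  # small loss: ranking matches the human preference
print(pairwise_loss(0.0, 2.0))  # large loss: ranking contradicts it
```

The trained reward model then supplies the reward signal for the reinforcement learning stage.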
Kosmos-1: Multimodal Large Language Models
PaLM-E: Embodied Language Models
Visual ChatGPT: Large Language Model + Visual Models
Galactica: Language Model + Research Data
Applications
• Education
• Knowledge Management
• Recommendation
• Virtual Assistant
4. Future Research Directions about LLMs
“Replicating” LLMs
Approaches: Improving Effectiveness
Alpaca: Learning from existing LLMs
“Replicating” LLMs
Approaches: Improving Efficiency
Pros:
• Encourages open-source
• Business security
Cons:
• Ethical considerations
“Leveraging” LLMs
Approaches: borrowing wisdom from LLMs
• In-Context Learning
• Knowledge Distillation
• Prompt Engineering
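Of the approaches above, knowledge distillation is the one most directly expressed as a loss. The sketch below assumes the usual formulation, in which a small student matches the temperature-softened output distribution of a large teacher; the function names, logits, and temperature are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # identical outputs
print(distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))  # mismatched outputs
```

When only generated text is available from the teacher rather than logits, the student is instead trained on that text with an ordinary supervised loss.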
MathPrompter: Prompt LM and verify result
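A minimal sketch of the prompt-then-verify idea: obtain two independently produced solutions to the same templated problem, check that they agree on random inputs, and only then evaluate on the real values. Both candidate functions are hard-coded, hypothetical stand-ins for what a language model would return; no model is called here.

```python
import random


def algebraic_candidate(a, b):
    # Hypothetical stand-in for an LLM-written algebraic expression
    return a * b + a


def program_candidate(a, b):
    # Hypothetical stand-in for an LLM-written Python function
    total = 0
    for _ in range(b + 1):
        total += a
    return total


def candidates_agree(f, g, trials=5, seed=0):
    """Compare two candidate solutions on random variable assignments."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.randint(1, 20), rng.randint(1, 20)
        if f(a, b) != g(a, b):
            return False
    return True


def answer(values):
    """Return an answer only when the candidates reach consensus."""
    if candidates_agree(algebraic_candidate, program_candidate):
        return algebraic_candidate(*values)
    return None  # no consensus: do not trust either candidate


print(answer((3, 4)))
```

Agreement on random inputs does not prove correctness, but it filters out many cases where one derivation is wrong.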
“Leveraging” LLMs
Pros:
• Easy to start
Cons:
• Competitiveness
• Performance variance
“Replacing” LLMs
Approaches:
Augmented Language Models: Toolformer
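A minimal sketch of the tool-augmentation mechanism: the model emits an inline API call in its text, an external executor runs it, and the result is written back into the text. The bracket syntax follows the Toolformer style, with `->` used in place of the arrow; the `calc` and `execute_calls` helpers and the sample sentence are illustrative assumptions.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calc(expr):
    """Evaluate +, -, *, / on numeric literals without using eval()."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)


CALL = re.compile(r"\[Calculator\(([^)]*)\)\]")


def execute_calls(text):
    """Replace each [Calculator(expr)] with the call plus its result."""
    return CALL.sub(
        lambda m: f"[Calculator({m.group(1)}) -> {calc(m.group(1)):.2f}]", text
    )


text = "Out of 1400 participants, 400 [Calculator(400 / 1400)] passed the test."
print(execute_calls(text))
```

The model itself only has to learn where a call is useful and what arguments to pass; the arithmetic is delegated to the tool.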
New Architecture for Language Models: AFT
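Assuming AFT here refers to the Attention Free Transformer, its core operation can be sketched as follows. $Q$, $K$, $V$ are the usual linear projections of the input, $w$ is a learned table of pairwise position biases, $\sigma_q$ is a sigmoid, and $\odot$ is element-wise multiplication.

```latex
Y_t = \sigma_q(Q_t) \odot
      \frac{\sum_{t'=1}^{T} \exp\left(K_{t'} + w_{t,t'}\right) \odot V_{t'}}
           {\sum_{t'=1}^{T} \exp\left(K_{t'} + w_{t,t'}\right)}
```

Unlike dot-product attention, no $T \times T$ matrix of query-key products is formed: keys and values are combined first, and the query then gates the result element-wise.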
5. Summary
A new Moore’s Law is coming!
Thanks!
Email: [email protected]
Homepage: zihanwang314.github.io