GPT-3 · speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/GPT3 (v6).pdf · 2020. 6. 28. · OpenAI Elliot...
![Page 1: GPT-3speech.ee.ntu.edu.tw/~tlkagk/courses/DLHLP20/GPT3 (v6).pdf · 2020. 6. 28. · OpenAl Elliot Turner @eturner303 Reading the OpenAl GPT-3 paper. Impressive performance on many](https://reader035.fdocument.pub/reader035/viewer/2022071501/611fce52577572277d17292a/html5/thumbnails/1.jpg)
GPT-3
Hung-yi Lee 李宏毅
GPT-3
Bigger Model
Megatron
Turing NLG: 17B
https://www.microsoft.com/en-us/research/blog/turing-nlg-a-17-billion-parameter-language-model-by-microsoft/
GPT-3 has 175B parameters! (10 times larger than Turing NLG)
Suppose ELMo's parameter count is a ruler 30 cm long.
Then GPT-3 would be taller than Taipei 101.
GPT-3 has roughly 2,000 times as many parameters as ELMo.
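The ruler analogy can be checked with a little arithmetic. The sketch below uses only the slide's own figures (a 30 cm ruler and the roughly 2,000x ratio) plus one outside fact, Taipei 101's height of 508 m. The constant and function names are illustrative and not from the slides.

```python
# Illustrative check of the ruler analogy; names here are not from the slides.
RULER_M = 0.30        # ELMo's parameter count drawn as a 30 cm ruler
PARAM_RATIO = 2000    # approximate GPT-3 / ELMo parameter ratio quoted on the slide
TAIPEI_101_M = 508    # height of Taipei 101 in metres (outside fact, not on the slide)


def scaled_height_m(ruler_m: float, ratio: float) -> float:
    """Length of the 'GPT-3 ruler' if ELMo is a ruler of length ruler_m."""
    return ruler_m * ratio


gpt3_height_m = scaled_height_m(RULER_M, PARAM_RATIO)
print(f"GPT-3 ruler: {gpt3_height_m:.0f} m, Taipei 101: {TAIPEI_101_M} m")
print("Taller than Taipei 101:", gpt3_height_m > TAIPEI_101_M)
```

Under these assumptions the scaled ruler comes to about 600 m, which is consistent with the slide's claim.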
GPT-3 is a model from the Dark Continent
https://www.zhihu.com/question/398114261
https://github.com/openai/gpt-3/issues/1
Pre-train: text without annotation → a model that can read text
Fine-tune: task-specific data with annotation → one task-specific model per task
The Ambition of the GPT Series
Task description
A few examples
Few-shot Learning
One-shot Learning
Zero-shot Learning
(no gradient descent)
“In-context” Learning
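The three settings above differ only in how many solved examples are placed in the input text, and none of them updates the model's weights. The sketch below shows how such prompts could be assembled. The `build_prompt` helper, the `=>` separator, and the translation pairs are illustrative assumptions, and no model is called.

```python
from typing import List, Tuple


def build_prompt(task_description: str,
                 examples: List[Tuple[str, str]],
                 query: str) -> str:
    """Assemble an in-context prompt: description, k solved examples, then the query.

    k = 0 gives zero-shot, k = 1 one-shot, k > 1 few-shot. No gradient step is
    involved; the examples exist only as text in the model's input.
    """
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is expected to continue from here
    return "\n".join(lines)


# Hypothetical demonstrations for an English-to-French translation task.
demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
task = "Translate English to French:"

zero_shot = build_prompt(task, [], "plush giraffe")
one_shot = build_prompt(task, demos[:1], "plush giraffe")
few_shot = build_prompt(task, demos, "plush giraffe")
print(few_shot)
```

The only thing that changes between settings is the length of `examples`, which is why the slide groups all three under "in-context" learning.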
Average of 42 tasks
Closed Book QA
Turing Advice Challenge
raster order
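Raster order means reading pixels row by row, left to right and top to bottom, so that a 2-D image becomes a 1-D sequence that a sequence model can predict one element at a time. Here is a minimal sketch in plain Python. The tiny image and the function names are made up for illustration.

```python
from typing import List


def to_raster_sequence(image: List[List[int]]) -> List[int]:
    """Flatten a 2-D grid of pixel values row by row (left to right, top to bottom)."""
    return [pixel for row in image for pixel in row]


def from_raster_sequence(seq: List[int], width: int) -> List[List[int]]:
    """Rebuild the 2-D grid from a raster-order sequence, given the row width."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


# Hypothetical 2x3 image; each number stands in for one pixel value.
image = [[1, 2, 3],
         [4, 5, 6]]
sequence = to_raster_sequence(image)
print(sequence)  # [1, 2, 3, 4, 5, 6]
```

Once the image is a sequence, generating the next pixel is the same kind of problem as generating the next token of text.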
Source of image: https://openai.com/blog/image-gpt/
My Favorite Ones