[Paper Review] Recipes for building an open-domain chatbot (feat. BlenderBot 1, ParlAI, Facebook, chatbot, open-domain, bb1)


코딩일기


daje 2022. 8. 1. 20:42


Prerequisites

1. Transformers

-. Title : Attention Is All You Need

-. link : https://arxiv.org/pdf/1706.03762.pdf

-. review : 2021.10.04 - [Paper Reviews] - [Paper Review] Attention is all you need (feat. Transformer)

2. BERT

-. Title : BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

-. link : https://arxiv.org/pdf/1810.04805.pdf

-. review : planned

3. Poly-encoders

-. Title : Poly-encoders: Transformer Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring

-. link : https://arxiv.org/pdf/1905.01969.pdf

-. review : planned

4. BART

-. Title : BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

-. link : https://arxiv.org/pdf/1910.13461.pdf

-. review : planned

0. Abstract
1. Introduction
2. Model architectures
- 2.1 Retriever
- 2.2 Generator
- 2.3 Retrieve and Refine
  - Dialogue Retrieval
  - Knowledge Retrieval
3. Training Objectives
- 3.1 Ranking for Retrieval
- 3.2 Likelihood Training for Generation
- 3.3 α-blending for Retrieve and Refine
- 3.4 Unlikelihood training for Generation
4. Decoding
- 4.1 Beam Search
- 4.2 Sampling
- 4.3 Response Length
  - Minimum length
  - Predictive length
- 4.4 Subsequence Blocking
5. Training Details
- Pre-training Ranking models
- Pre-training Generative models
- Fine-tuning
6. Training Data
- 6.1 Pre-training
  - pushshift.io Reddit
- 6.2 Fine-tuning
  - ConvAI2
  - Empathetic Dialogues(ED)
  - Wizard of Wikipedia(WoW)
  - Blended Skill Talk
7. Safety Characteristics
8. Evaluation Methods
- ACUTE-Eval
- Self-Chat ACUTE-Eval
9. Related Work
10. Results & Analysis
- 10.1 Automatic Evaluations
  - Retriever
  - Generator
  - Retrieve and Refine(RetNRef)
  - Safety
- 10.2 Self-Chat Evaluations
  - Retrieval vs. Generator vs. RetNRef
  - Generator Decoding choices
  - Small vs. Large models
  - Pre-training vs. Fine-Tuning
  - Persona context vs. No context given
  - Likelihood vs. Unlikelihood
- 10.3 Full(Human-Bot Chat) Evaluations
  - Retrieval vs. Generator vs. RetNRef
  - Comparison to Meena
  - Model vs. Human-human Chat Comparisons
  - Response Length
- 10.4 Example Successful Conversations
- 10.5 Failure Cases and Model Extensions
  - Vocabulary Usage
  - Nontrivial Repetition
  - Contradiction and Forgetfulness
  - Knowledge and Factual Correctness
  - Conversation Length and Memory
  - Deeper Understanding
  - Further Notes on Evaluation
11. Released code and models
12. Discussion

0. Abstract

- Prior work has shown that scaling neural models in the number of parameters and the size of the training data gives improved results.

- Good conversation requires a number of skills that an expert conversationalist blends in a seamless way.

- We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy, and we build new models following these recipes.

- Human evaluations show our best models are superior to existing approaches in multi-turn dialogue.

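- Prior work has concentrated on building neural networks with more data and more parameters.

- The authors argue that a good conversation must be an engaging one that reflects expert-level conversational skill, empathy, personality, and so on.

- They say that building a model capable of this kind of conversation requires appropriate training data and a suitable generation strategy.

- They also say that the models were assessed with human evaluations (ACUTE-Eval, see section 8).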

1. Introduction

- Pre-training on large corpora is important.

- Beyond simply scaling models, the two main takeaways from our study are Blending Skills and Generation Strategies.

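Depending on how the decoding strategy is configured, even two models with the same perplexity can produce very different results.

In particular, the authors note that the length of the bot's utterances has a large effect on how humans judge its responses.

Earlier work reported that beam search performs poorly, but the authors say that setting a minimum beam length lets it produce good responses.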
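
To make the minimum beam length idea concrete, here is a minimal sketch in Python. It is not the paper's or ParlAI's implementation: `TOY_MODEL`, `beam_search`, and all the probabilities are hypothetical and chosen only for illustration. The toy model strongly prefers to end the reply right away, and the only change between the two runs is whether the end-of-sequence token is blocked until `min_length` tokens have been generated. No length normalization is applied.

```python
import math

EOS = "<eos>"

# Hypothetical toy "model": next-token probabilities that depend only on the
# previous token. It strongly prefers to end the reply after a single word.
TOY_MODEL = {
    "<bos>": {"ok": 0.6, "i": 0.4},
    "ok": {EOS: 0.9, "i": 0.1},
    "i": {"like": 0.7, EOS: 0.3},
    "like": {"dogs": 0.6, EOS: 0.4},
    "dogs": {EOS: 1.0},
}


def beam_search(model, beam_size=2, min_length=0, max_length=6):
    """Return the best token list; EOS is blocked before min_length tokens."""
    beams = [(0.0, ["<bos>"])]  # (cumulative log-probability, tokens so far)
    finished = []
    for _ in range(max_length):
        candidates = []
        for score, tokens in beams:
            n_generated = len(tokens) - 1  # do not count <bos>
            for token, prob in model[tokens[-1]].items():
                new_score = score + math.log(prob)
                if token == EOS:
                    if n_generated < min_length:
                        continue  # minimum-length constraint: no early EOS
                    finished.append((new_score, tokens[1:]))
                else:
                    candidates.append((new_score, tokens + [token]))
        if not candidates:
            break
        # Keep only the beam_size highest-scoring unfinished hypotheses.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]
    if not finished:
        return []  # no hypothesis ended within max_length steps
    return max(finished, key=lambda f: f[0])[1]


if __name__ == "__main__":
    print(beam_search(TOY_MODEL, min_length=0))  # ['ok']
    print(beam_search(TOY_MODEL, min_length=3))  # ['i', 'like', 'dogs']
```

The model and its perplexity are identical in both runs. Only the decoding constraint changes, and it alone moves the output from a one-word reply to a longer, more informative one.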
