NLP_Understanding Natural Language Processing_Lecture 3

Understanding Natural Language Processing, Lecture 3

1. Dialogue Modeling Problem

2. Recurrent Language Model

x = ‘I am going to the school’ (x1, x2, … xt)

p(x) = p(x1, x2, …, xt)  (joint probability)

= p(xt | x1, x2, …, xt-1) * p(x1, x2, …, xt-1)

= p(school | I, am, going, to, the) * p(I, am, going, to, the)

= conditional probability * marginal probability

Applying the same step repeatedly (chain rule):

p(x) = p(school | I, am, going, to, the) * p(the | I, am, going, to) * … * p(am | I) * p(I)
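As a quick worked illustration of this factorization (the probability values below are invented for the example, not taken from the lecture), the joint probability is just the product of the per-word conditionals:

```python
# Toy illustration of the chain-rule factorization above.
# The individual conditional probabilities here are made-up numbers,
# purely to show how the product is formed.
conditionals = {
    "p(I)":                              0.20,
    "p(am | I)":                         0.30,
    "p(going | I, am)":                  0.25,
    "p(to | I, am, going)":              0.40,
    "p(the | I, am, going, to)":         0.35,
    "p(school | I, am, going, to, the)": 0.15,
}

p_x = 1.0
for name, p in conditionals.items():
    p_x *= p                      # multiply all conditionals together
print(f"p(x) = {p_x:.6f}")
```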

An RNN can compute this probability easily: the hidden layer carries the previous words, i.e. the context information.

In an RNN, the next-word distribution is softmax(Wh + b).
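A minimal sketch of how a recurrent language model puts this into code: the hidden state plays the role of the context, and softmax(Wh + b) over that state gives the next-word distribution. This is a generic PyTorch illustration with an assumed vocabulary and assumed sizes, not the lecture's code.

```python
# Minimal recurrent language model sketch: at each step the hidden state h
# summarizes the previous words (context information) and the next-word
# distribution is softmax(W h + b).
import torch
import torch.nn as nn

vocab = {"<s>": 0, "I": 1, "am": 2, "going": 3, "to": 4, "the": 5, "school": 6}
V, embed_size, hidden_size = len(vocab), 10, 32

embed = nn.Embedding(V, embed_size)
rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
W = nn.Linear(hidden_size, V)          # softmax(W h + b) over the vocabulary

sentence = ["I", "am", "going", "to", "the", "school"]
ids = torch.tensor([[vocab["<s>"]] + [vocab[w] for w in sentence]])  # prepend a start token

hidden_states, _ = rnn(embed(ids[:, :-1]))               # h_t summarizes x_1 .. x_t
log_probs = torch.log_softmax(W(hidden_states), dim=-1)  # log p(x_{t+1} | x_1 .. x_t)

# p(x) = product of the per-step conditionals; accumulate in log space.
targets = ids[:, 1:]
log_p_x = log_probs.gather(-1, targets.unsqueeze(-1)).sum()
print("log p(x) =", log_p_x.item())    # weights are untrained, so the value is arbitrary
```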

3. Neural Machine Translation

4. Naive Seq2Seq Model

5. Context Based Model

6. Persona-based Model

[5] input data / output data

[6] data -> number

The part before this is just the data-generation step.
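A small sketch of the data -> number step; the vocabulary, special tokens ('<pad>', '<go>'), and padding length here are assumptions for illustration, not the lecture's exact code.

```python
# Convert raw sentences into lists of integer ids (data -> number).
sentences = ["yes what's up", "I am going to the school"]

vocab = {"<pad>": 0, "<go>": 1}
for sent in sentences:
    for word in sent.split():
        vocab.setdefault(word, len(vocab))   # assign the next free id to new words

def encode(sent, max_len=10):
    ids = [vocab[w] for w in sent.split()]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))  # pad to a fixed length

print(encode("yes what's up"))
```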

Notes on the naive seq2seq code (see the sketch after this list):

- This part is there because of throwaway tokens like TUESDAY; after the thought vector, the hidden value is taken as the initial state.
- target: the output data without the 'go' token.
- Embeddings are initialized with random_uniform.
- input embedding -> LSTM dynamic RNN.
- last state of encoding -> initial state of decoding; all we want from the encoder is that output.
- logits -> loss function (via a fully connected layer?).
- output shape: batch size x y size.
- last state -> sequence length 29, embed_size 10.
- There are three things: input, output, target.
- actual target: everything but the 'go' character, e.g. "yes what's up".
- output (the input to the decoder): everything but the last character.
- The prediction is the maximum value (argmax).
- Difference between target and output: the dimension (one-step shift).
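Putting these notes together, here is a hedged sketch of the naive seq2seq pipeline: embed the encoder input, run an LSTM, feed the encoder's last state in as the decoder's initial state, form the decoder input as 'go' plus the target without its last token, project the decoder outputs through a fully connected layer to logits for the loss, and take the argmax as the prediction. The lecture apparently uses TensorFlow 1.x calls (dynamic_rnn, random_uniform); this PyTorch version only mirrors the structure, and all sizes and token ids are illustrative.

```python
# Hedged sketch of the naive seq2seq model described in the notes above.
import torch
import torch.nn as nn

V, embed_size, hidden_size = 100, 10, 64     # embed_size 10 as in the notes

embed = nn.Embedding(V, embed_size)          # weights start random (cf. random_uniform)
encoder = nn.LSTM(embed_size, hidden_size, batch_first=True)
decoder = nn.LSTM(embed_size, hidden_size, batch_first=True)
project = nn.Linear(hidden_size, V)          # fully connected layer -> logits
loss_fn = nn.CrossEntropyLoss()

src = torch.randint(2, V, (1, 29))           # encoder input, sequence length 29
tgt = torch.randint(2, V, (1, 8))            # full target sentence ids
GO = 1
dec_input = torch.cat([torch.full((1, 1), GO, dtype=torch.long), tgt[:, :-1]], dim=1)
dec_target = tgt                             # target: everything but the 'go' token

_, last_state = encoder(embed(src))                   # last state of encoding ("thought vector")
dec_out, _ = decoder(embed(dec_input), last_state)    # -> initial state of decoding
logits = project(dec_out)                             # (batch, target length, vocab)

loss = loss_fn(logits.reshape(-1, V), dec_target.reshape(-1))
prediction = logits.argmax(dim=-1)                    # prediction is the maximum value per step
print(loss.item(), prediction.shape)
```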