Embedding

์›Œ๋“œ ์ž„๋ฒ ๋”ฉ (Word Embedding)

  • ๋‹จ์–ด๋ฅผ ๋ฐ€์ง‘ํ‘œํ˜„ ํ˜•ํƒœ์˜ ๋ฒกํ„ฐ๋กœ ํ‘œํ˜„ํ•˜๋Š” ๋ฐฉ๋ฒ•

  • ์›Œ๋“œ ์ž„๋ฒ ๋”ฉ ๊ณผ์ •์„ ํ†ตํ•ด ๋‚˜์˜จ ๋ฒกํ„ฐ๋ฅผ ์ž„๋ฒ ๋”ฉ ๋ฒกํ„ฐ(embedding vector)๋ผ๊ณ  ํ•จ

  • Word2vec, Glove, FastText์™€ ๊ฐ™์€ ๋ฐฉ๋ฒ•๋ก ์ด ์žˆ์Œ

์›-ํ•ซ ์ธ์ฝ”๋”ฉ

์›Œ๋“œ ์ž„๋ฒ ๋”ฉ

๊ณ ์ฐจ์›(์ „์ฒด ๋‹จ์–ด ๊ฐœ์ˆ˜)

์ €์ฐจ์›, ์‚ฌ์šฉ์ž ์ง€์ •

ํฌ์†Œ ๋ฒกํ„ฐ

๋ฐ€์ง‘ ๋ฒกํ„ฐ

1, 0์œผ๋กœ ํ‘œํ˜„

์‹ค์ˆ˜๋กœ ํ‘œํ˜„

์œ ์‚ฌ๋„ ๊ณ„์‚ฐ ๋ถˆ๊ฐ€๋Šฅ

์œ ์‚ฌ๋„ ๊ณ„์‚ฐ ๊ฐ€๋Šฅ

ํฌ์†Œ ํ‘œํ˜„(Sparse Representation)

  • ๋ฒกํ„ฐ ๋˜๋Š” ํ–‰๋ ฌ์˜ ๊ฐ’์ด ๋Œ€๋ถ€๋ถ„ 0์œผ๋กœ ํ‘œํ˜„๋˜๋Š” ๋ฐฉ๋ฒ•

  • ์›-ํ•ซ ์ธ์ฝ”๋”ฉ(one-hot encoding)๋ฐฉ์‹

์˜ˆ์‹œ) 1๋งŒ๊ฐœ์˜ ๋‹จ์–ด๊ฐ€ ์žˆ๊ณ , ๊ฐ•์•„์ง€์˜ ์ธ๋ฑ์Šค๋Š” 5์˜€์„ ๋•Œ, ๊ฐ•์•„์ง€ = [ 0 0 0 0 1 0 0 0 0 0 0 0 โ€ฆ ์ค‘๋žต โ€ฆ 0] # ์ด ๋•Œ 1 ๋’ค์˜ 0์˜ ์ˆ˜๋Š” 9995๊ฐœ.

๋ฐ€์ง‘ํ‘œํ˜„(Dense Representation)

  • ์‚ฌ์šฉ์ž๊ฐ€ ์„ค์ •ํ•œ ๊ฐ’์œผ๋กœ ๋ชจ๋“  ๋‹จ์–ด์˜ ๋ฒกํ„ฐ ํ‘œํ˜„์˜ ์ฐจ์›์„ ๋งž์ถ”๋Š” ๋ฐฉ๋ฒ•

  • ๋ฒกํ„ฐ์˜ ์ฐจ์›์ด ์กฐ๋ฐ€ํ•ด์กŒ๋‹ค๊ณ  ํ•˜์—ฌ ๋ฐ€์ง‘ ๋ฒกํ„ฐ(dense vector)๋ผ๊ณ  ํ•จ

์˜ˆ์‹œ) ๊ฐ•์•„์ง€ = [0.2 1.8 1.1 -2.1 1.1 2.8 ... ์ค‘๋žต ...] (์ด ๋ฒกํ„ฐ์˜ ์ฐจ์›์€ 128)

Word2Vec

  • Based on the assumption that words appearing in similar positions have similar meanings (e.g., "dog" and "cat" are similar)

  • Trains on center word–context word pairs, so no labeling is required -> unsupervised learning

  • Two architectures: CBOW (Continuous Bag of Words) and Skip-gram

CBOW (Continuous Bag of Words) vs Skip-gram

  • CBOW: predicts the center word from the surrounding (context) words

  • Skip-gram: predicts the surrounding words from the center word
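The difference shows up in how training pairs are generated from a sentence. A minimal sketch (the sentence and window size are illustrative):

```python
def skipgram_pairs(tokens, window=2):
    """(center, context) pairs: the center word predicts each context word."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=2):
    """(context words, center) pairs: the context jointly predicts the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, center))
    return pairs

tokens = "the quick brown fox jumps".split()
print(skipgram_pairs(tokens)[:3])  # [('the', 'quick'), ('the', 'brown'), ('quick', 'the')]
print(cbow_pairs(tokens)[0])       # (['quick', 'brown'], 'the')
```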

๋„ค๊ฑฐํ‹ฐ๋ธŒ ์ƒ˜ํ”Œ๋ง(Negative Sampling)

  • ์ฃผ๋ณ€ ๋‹จ์–ด-์ค‘์‹ฌ ๋‹จ์–ด ๊ด€๊ณ„๋ฅผ ๊ฐ€์ง€๊ณ  ์ง€์ •ํ•œ ์œˆ๋„์šฐ ์‚ฌ์ด์ฆˆ ๋‚ด์— ์กด์žฌํ•˜๋ฉด 1, ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด 0์œผ๋กœ ์ด์ง„๋ถ„๋ฅ˜ ๋ฌธ์ œ๋กœ ๋ณ€๊ฒฝํ•˜์—ฌ ํ•™์Šตํ•˜๋ฉด ๋” ๋น ๋ฅด๊ฒŒ ํ•™์Šตํ•  ์ˆ˜ ์žˆ์Œ

  • ์ „์ฒด ๋‹จ์–ด๊ฐ€ ์•„๋‹ˆ๋ผ, ์ผ๋ถ€์— ๋Œ€ํ•ด์„œ๋งŒ ํ•™์Šตํ•˜๋„๋ก ์ƒ˜ํ”Œ๋ง

  • ์—ฌ๊ธฐ์„œ ์›๋„์šฐ๋Š” ์ค‘์‹ฌ ๋‹จ์–ด๋ฅผ ์˜ˆ์ธกํ•˜๊ธฐ ์œ„ํ•ด์„œ ์•ž, ๋’ค๋กœ ๋ช‡ ๊ฐœ์˜ ๋‹จ์–ด๋ฅผ ๋ณผ์ง€์— ๋Œ€ํ•œ ๋ฒ”์œ„์ด๋‹ค

Word2Vec ๋ชจ๋ธ ํ‰๊ฐ€

  • ์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„(Cosine Similarity)

    • ์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„๋Š” ๋‘ ํŠน์„ฑ ๋ฒกํ„ฐ ๊ฐ„์˜ ์œ ์‚ฌ ์ •๋„๋ฅผ ์ฝ”์‚ฌ์ธ ๊ฐ’์œผ๋กœ ํ‘œํ˜„ํ•œ ๊ฒƒ

    • ์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„๋Š” -1์—์„œ 1๊นŒ์ง€์˜ ๊ฐ’์„ ๊ฐ€์ง„๋‹ค

    • '-1' ์€ ์„œ๋กœ ์™„์ „ํžˆ ๋ฐ˜๋Œ€, '0' ์€ ์„œ๋กœ ๋…๋ฆฝ, '1' ์€ ์„œ๋กœ ๊ฐ™์€ ๊ฒฝ์šฐ๋ฅผ ์˜๋ฏธ

  • ์œ ํด๋ฆฌ๋“œ ๊ฑฐ๋ฆฌ(Euclidean Distance)

  • ์ž์นด๋“œ ์œ ์‚ฌ๋„(Jaccard Similarity

  • Word Analogy

    • ์œ ์ถ”๋ฅผ ํ†ตํ•œ ํ‰๊ฐ€๋กœ ์œ ์ถ”์— ๋Œ€ํ•œ ๋ฐ์ดํ„ฐ๊ฐ€ ์กด์žฌํ•ด์•ผ ํ…Œ์ŠคํŠธ๋ฅผ ํ•  ์ˆ˜ ์žˆ์Œ

GloVe

  • Count-based methods cannot capture analogies between word meanings, while prediction-based methods fail to reflect global corpus statistics

  • GloVe therefore combines both count-based and prediction-based approaches

  • Trains so that the dot product of the embedded center-word and context-word vectors approximates the (log) co-occurrence probability over the entire corpus

FastText

  • A word embedding technique released by Facebook Research

  • Splits words into character n-grams for training

    • With the n-gram range set to 2–5: assumption = {as, ss, su, …, ass, ssu, sum, …, mptio, ption, assumption}

  • At inference time, if the input word is in the vocabulary, its vector is returned directly; if it is not (OOV, Out-of-Vocabulary), the sum of the input word's n-gram vectors is returned

ELMo (Embeddings from Language Model)

  • A word embedding technique that reflects context

  • Words with the same spelling can have different meanings (in Korean, '배' means ship in "a '배' is floating on the sea" and pear in "a '배' grew on the tree")

  • The beginning of pre-trained models

Universal sentence encoder

  • A pretrained model released by Google that encodes sentences into high-dimensional vectors

  • Performs better on long sentences than on short ones

  • Several versions exist, including a multilingual version that also supports Korean

  • There is a fast CNN-based version and a slower but better-performing Transformer-based version
