  • [Paper] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
    ML engineer/Papers & CS generals 2023. 1. 17. 00:04

    🕓 4 mins read

    https://arxiv.org/abs/2212.09741

     

    One Embedder, Any Task: Instruction-Finetuned Text Embeddings

    We introduce INSTRUCTOR, a new method for computing text embeddings given task instructions: every text input is embedded together with instructions explaining the use case (e.g., task and domain descriptions). Unlike encoders from prior work that are more

    arxiv.org

    Today's paper is a simple engineering one.

    NLP research keeps drifting toward LLMs (Large Language Models), but in practice models that large are still too cost-inefficient to train or serve, and not every company can afford to join the capital race over giant models.

    This paper works at a relatively small, ordinary model scale (even the largest version is around 1.5B parameters, which is trainable without a Google TPU cluster). The English models have been released in several sizes, and the approach looks reproducible for a Korean model as well.

     

    # Key Points

    The released model is named InstructOR (Instruction-based Omnifarious Representations), but the model itself is in fact unchanged from previously released ones. This is an engineering paper about how feeding and training on the data differently improved performance. (Technical novelty is quite limited, but ranking first on a benchmark leaderboard still makes it meaningful.)

    InstructOR is a single encoder model that, without any finetuning, produces embeddings suited to each task just by changing the instruction.
    - Ranked 1st on the MTEB leaderboard as of this writing  https://huggingface.co/spaces/mteb/leaderboard
    - Dense retrieval is the main target task, but it can also be used for a variety of other tasks such as reranking and generation evaluation
    - The core idea is to use the GTR models, which are publicly available in several sizes, as-is, and to train and run them with an instruction prompt attached.

     

    Since the architecture uses the GTR model as-is, let's look at a diagram from the original paper.

    GTR paper: https://arxiv.org/abs/2112.07899

    This is the same shape as the older Siamese Convolutional Dual encoder models, right?
    Two encoders with shared parameters each encode one side of paired data, such as a query and an answer (or document), into a dense vector. The cosine similarity between the two vectors is then trained as a regression or binary classification target over relevant / irrelevant (or positive / negative) classes.
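
The pair training described above can be sketched as follows. This is a minimal illustration, not GTR's or InstructOR's actual training code: `toy_encoder` is a hypothetical hashed bag-of-words stand-in for the shared Transformer encoder, and binary cross-entropy on a scaled cosine similarity (with an arbitrary `scale`) is just one assumed way to realize the binary classification variant.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def toy_encoder(text, dim=8):
    """Hypothetical stand-in for the shared encoder (hashed bag of words).

    A real dual encoder would be a Transformer. The point here is only that
    the same function, with the same parameters, encodes both sides.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # Sum of character codes keeps the toy deterministic across runs.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def pair_loss(query, passage, label, scale=5.0):
    """Binary cross-entropy on the scaled cosine similarity of a pair.

    label is 1.0 for a relevant pair and 0.0 for an irrelevant one.
    """
    sim = cosine_similarity(toy_encoder(query), toy_encoder(passage))
    prob = 1.0 / (1.0 + math.exp(-scale * sim))  # sigmoid
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


query = "what is a dual encoder"
print(pair_loss(query, query, 1.0))  # small loss: identical texts labeled relevant
print(pair_loss(query, query, 0.0))  # large loss: identical texts labeled irrelevant
```

Because both texts pass through the same `toy_encoder`, lowering this loss can only happen by changing the one shared set of parameters, which is what makes the structure "siamese".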

    Up to this point the architecture is no different at all from an existing siamese network; what changes is that datasets from a much wider range of tasks are cast into this paired structure.

    GTR itself is initialized by taking only the encoder layers from T5 (which solved a variety of tasks with a seq2seq structure), so you can think of the lineage as improving(?) step by step in the order T5 -> GTR -> InstructOR.

    A seq2seq multitask model like T5 has to actually generate its output, which makes training much harder. GTR and InstructOR do not generate a sequence; they only map the input and output sequences into a single vector space, so the problem is somewhat easier. (An output space made of just two 768-dimensional vectors is surely far simpler than one made of sequences of $t$ steps of 768-dimensional vectors, right?)

    InstructOR is trained like GTR, feeding multitask datasets into the two encoder towers for query / passage. The difference is that an "instruction", a kind of prompt text, is prepended to each text so that the role of that text is given to the model explicitly.
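
As a rough sketch of what "prepending an instruction" means for the encoder input: the helper `with_instruction`, the plain string concatenation, and the instruction wording below are all assumptions made for illustration, not text or code taken from the paper.

```python
def with_instruction(instruction, text):
    """Join an instruction and the raw text into one encoder input string."""
    return f"{instruction.strip()} {text.strip()}"


# Hypothetical instructions: one per encoder tower.
query_instruction = "Represent the question for retrieving supporting documents:"
passage_instruction = "Represent the document for retrieval:"

query_input = with_instruction(query_instruction, "Who wrote Hamlet?")
passage_input = with_instruction(
    passage_instruction, "Hamlet is a tragedy by William Shakespeare."
)

print(query_input)
print(passage_input)
```

Each tower then encodes its own prefixed string, so the role of the text (question vs. document) is part of the input rather than something the model has to guess.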


    In other words, if BERT was trained so that the embedding of the same "token (word)" changes dynamically with the context of the sequence, InstructOR makes the embedding of the same "sequence (sentence/document)" change dynamically with the task.


    That is the paper's core contribution. Stopping there would leave it reading like a lab report, so the authors run ablation studies over each condition to show which factors contribute to model performance.

    Examples of instruction (prompt) sentences per task

     

    ## Point 1.

    This graph is meant to identify which factors account for the performance gain from instructions.

    Some tasks are "symmetric": the query and passage sides have the same or similar form, as in duplicate sentence detection or sentence classification. Others are "asymmetric": the two sides differ, as in document retrieval, where one side is a short sentence or keywords and the other is a long document.
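
One way to picture the symmetric / asymmetric distinction is by whether the two sides of a pair share an instruction. The sketch below assumes that symmetric tasks reuse one instruction for both sides while asymmetric tasks use a different one per side. The `TaskSpec` class, the task names, and the instruction wording are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for the instructions of one task."""
    name: str
    query_instruction: str
    passage_instruction: str

    @property
    def is_symmetric(self) -> bool:
        # Both sides look alike, so they share a single instruction.
        return self.query_instruction == self.passage_instruction


def symmetric_task(name: str, instruction: str) -> TaskSpec:
    """Both sides of the pair get the same instruction."""
    return TaskSpec(name, instruction, instruction)


def asymmetric_task(name: str, query_instruction: str, passage_instruction: str) -> TaskSpec:
    """The short query and the long passage get different instructions."""
    return TaskSpec(name, query_instruction, passage_instruction)


duplicate_detection = symmetric_task(
    "duplicate-question-detection",
    "Represent the question for duplicate detection:",
)
document_retrieval = asymmetric_task(
    "document-retrieval",
    "Represent the question for retrieving documents:",
    "Represent the document for retrieval:",
)

for task in (duplicate_detection, document_retrieval):
    kind = "symmetric" if task.is_symmetric else "asymmetric"
    print(f"{task.name}: {kind}")
```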

    From the graph above we can see that instructions brought a large performance gain in every case, and that when the two types of task are mixed during training, a plain siamese network structure without instructions does not train well.

     

    Among the training datasets there is one called SuperNI. It is not only large, at about 180,000 examples, but also has task descriptions labeled in detail in instruction form. (It obviously looks like data that would help a lot in training InstructOR, doesn't it? haha)

    The graph above shows that when SuperNI is included, the model's performance improves toward a more even distribution.
    Without SuperNI the best performance is about as high, but the spread is wider, so the model is visibly less stable.

     

    ## Point 2.

    So how should the instruction be written, if we really want it to embed sentences/documents dynamically for each task, as the authors claim?

    According to the paper, the instruction should state the intended task, and what you want the embedding vector to represent, as specifically and explicitly as possible.

    The graph above shows that even marking the task with just a tag improves performance, and that performance keeps improving as the instruction for the task becomes clearer.

    (What puzzled me here, since the paper does not explain it: shouldn't N/A perform the same as GTR? I suppose the drop has to be attributed to the wider range of datasets used in training. Surely they didn't use a different model size.)
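
To make the levels of instruction detail in this section concrete, here is a small sketch that builds the encoder input for one question at four levels, from no instruction (N/A) up to a detailed one. The level names, the tag format, and all instruction wording are hypothetical; none of it is quoted from the paper.

```python
# Hypothetical instruction variants for a single retrieval query, ordered
# from least to most explicit. None of the wording is quoted from the paper.
INSTRUCTION_LEVELS = {
    "none": "",
    "tag": "<retrieval-query>",
    "simple": "Represent the question:",
    "detailed": "Represent the science question for retrieving supporting documents:",
}


def build_input(level: str, text: str) -> str:
    """Prefix the text with the instruction for the given level, if any."""
    instruction = INSTRUCTION_LEVELS[level]
    return f"{instruction} {text}".strip()


question = "Why is the sky blue?"
inputs = {level: build_input(level, question) for level in INSTRUCTION_LEVELS}

for level, value in inputs.items():
    print(f"{level:>8}: {value}")
```

The only thing that changes between levels is how much of the task the prefix spells out, which is the variable the ablation in this section is about.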

     

    # Personal Thoughts

    The paper has little technical novelty, but it gives a moderately sized embedding model that can be used generally across many tasks zero-shot or with only a little finetuning. Isn't that a great idea?

    I have long thought that the end point of machine learning is that system performance is decided by how well you can build embedding vectors that capture meaning, context, and metadata, so research like this is always welcome.

    Honestly, I'm a bit tired of the top-league(?) research trend that keeps chanting more capital! more computing power!! moneyyy! (although it is fascinating to watch..) Still, it seems hard to catch up with or overturn the big-model trend through optimization / algorithmic improvements alone. AI research is tough these days haha

    Instead, I think industry (condolences to the students who have to write papers..) should put its effort into building a jackknife that is cheaper and efficient enough. (If I had to pick a metaphor, big models are something like a chainsaw.. maybe? Give it a few more years and they will be lightsabers. You don't need a lightsaber to cut a steak, do you?)

     

    # References

    - Earlier work, but related papers that also aim at single-model multitask (LLMs are a different line of work altogether, so they are excluded.)

    (T5) EXT5 : https://arxiv.org/pdf/2111.10952.pdf

    GTR : https://arxiv.org/abs/2112.07899 


Designed by naubull2.