  • [Python] Misconceptions About Pickle and Fixing the "Can't pickle local object" Error
    ML engineer/Papers & CS generals 2023. 2. 16. 00:42

    🕓 5 mins read

    # Misconceptions about Pickle

    Anyone who uses Python libraries runs into pickle sooner or later, even without calling it directly. It is used not only to save and load files or objects, but also to share or pass objects between processes. If you don't know how pickle works, an AttributeError or PicklingError like the ones below can be baffling. (Nothing looks wrong, so why on earth!?)

    PicklingError: Can't pickle <class 'class.method.var'>: it's not the same object as class.method.var
    AttributeError: Can't pickle local object 'class.<locals>.some_var'

    This is an easy trap for people who are new to Python, and for developers who picked up the wrong idea about storing data from the ever-popular(?) "saving Python data with pickle" style of post. As the Python documentation states, pickle is not a data storage tool but an "object serialization" tool.

    https://docs.python.org/ko/3/library/pickle.html

    Put simply, serialization is the process of converting data into a byte stream. Because the result looks like data saved to a binary file, the misunderstanding is understandable. The point to note, however, is that pickle is a binary protocol that serializes an "object structure", not "data".
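To make the "object structure, not data" point concrete, here is a minimal sketch. The `Point` class is a hypothetical example, not something from the Python docs or this post. The pickled bytes hold the attribute values plus a reference to the class by module and name; the class body itself is not stored.

```python
import pickle


class Point:
    """Hypothetical example class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


payload = pickle.dumps(Point(1, 2))

# The byte stream names the defining module and the class ...
has_class_name = b"Point" in payload
has_module_name = Point.__module__.encode() in payload

# ... and unpickling looks that name up again to rebuild the instance.
restored = pickle.loads(payload)
print(has_class_name, has_module_name, type(restored) is Point, vars(restored))
```

For an opcode-by-opcode view of the same payload, the standard library's `pickletools.dis(payload)` is worth a try.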

    ## Problems with pickling

    • Because the object structure is what gets serialized, unpickling fails if the implementation of that object is not available in the scope where it is unpickled (pickle looks the class up again by module and name).
      If you pickle a "CustomClass" instance, it is only natural that it cannot be unpickled somewhere that has no idea what "CustomClass" is.

    More importantly, as sharp-eyed readers will have gathered from the example above, unpickling does not simply read data back; it is a process that "executes" code. Unless the pickle comes from a verified source, merely unpickling it can run malicious code, so the protocol is not secure.
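A harmless sketch of what "unpickling executes code" means in practice. The names `record` and `Payload` are hypothetical; an attacker would return something like `os.system` from `__reduce__` where this example returns a function that only appends to a list.

```python
import pickle

calls = []


def record(message):
    """Harmless stand-in for whatever an attacker would really call."""
    calls.append(message)
    return message


class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call record('...')".
        return (record, ("executed during unpickling",))


data = pickle.dumps(Payload())
assert calls == []  # nothing has run yet

result = pickle.loads(data)  # this line calls record(...)
print(calls, result)
```

Note that `pickle.loads` returns whatever the callable returns, so the result here is not even a `Payload` instance. The loader has no say in what gets called.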

    The Python documentation carries a stern warning about this as well.

    https://docs.python.org/ko/3/library/pickle.html

    • Because unpickling "executes" the stream, a pickle is only portable between environments that can run it the same way: the reading interpreter must support the protocol version that was used, and the pickled classes must be importable with compatible definitions.
      - e.g. a pickle written with protocol 5 (added in Python 3.8) cannot be read by Python 3.7, even though Python 3.8 can read pickles written by Python 3.7.
    • Because it is a binary protocol, the stored object is naturally not human-readable.
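On the version point, what mostly decides whether another interpreter can read a pickle is the protocol number chosen at dump time. A minimal sketch with made-up sample data: protocol 2 is readable by every Python 3 release, while `pickle.HIGHEST_PROTOCOL` may be too new for older interpreters.

```python
import pickle

data = {"weights": [0.1, 0.2], "epoch": 3}  # made-up sample data

# Newest protocol this interpreter can write; older interpreters may not read it.
newest = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# Protocol 2 is old enough to be readable by every Python 3 release.
portable = pickle.dumps(data, protocol=2)

# For protocol >= 2 the stream starts with the PROTO opcode (0x80)
# followed by the protocol number, so a file can be inspected before loading.
print(newest[1], portable[1], pickle.DEFAULT_PROTOCOL)

assert pickle.loads(newest) == pickle.loads(portable) == data
```

Even with a compatible protocol, custom classes still have to be importable on the loading side, so this only addresses half of the portability problem.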

    ### huggingface model hub

    Until just a few months ago, the huggingface model hub also relied on pickle, so you could not safely call from_pretrained() on a model unless you trusted it. (Malicious code could be planted in it, after all.)

    - Of course, once this became widely known, huggingface added a suitably scary(?) warning and began scanning models to some extent when they are uploaded.

    https://huggingface.co/docs/hub/security-pickle

     


    It explains in detail why and how dangerous loading a pickle is, along with the precautions to take, so it is well worth reading if you are interested.

     

    # Types of pickle errors

    In practice, these local object errors mostly occur when the object being pickled is a local object, that is, one whose definition cannot be found in the global scope.
    (Given that unpickling actually executes a binary that recreates the object, an error is only to be expected.)
    Below are a few typical error types and how to resolve them. (Most are resolved by declaring the name global or by restructuring the code.)
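One quick way to tell whether something counts as a "local object" is to look at its `__qualname__`. A minimal sketch with hypothetical names (`outer`, `Local`, `TopLevel`); both exception types are caught because this family of errors shows up as either one in the messages quoted in this post.

```python
import pickle


def outer():
    class Local:  # defined inside a function body
        pass
    return Local


class TopLevel:  # defined at module level
    pass


local_cls = outer()
print(local_cls.__qualname__)  # contains "<locals>"
print(TopLevel.__qualname__)

try:
    pickle.dumps(local_cls())
    failed = False
except (AttributeError, pickle.PicklingError) as exc:
    failed = True
    print(type(exc).__name__, exc)

# The module-level class round-trips without trouble.
assert isinstance(pickle.loads(pickle.dumps(TopLevel())), TopLevel)
```

If `<locals>` appears in the qualified name, pickle cannot reach the object from the module's top level, which is what the examples below run into.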

    ex1. Attribute Error - can't pickle local objects

    The most common type: this error occurs when you pickle.dump an object whose class is defined in the local scope of a function.

    import pickle

    option = 1  # stand-in for a runtime option (not defined in the original snippet)

    def get_model_class(params):
        if params == 1:
            class ModelA(object):
                def __init__(self):
                    pass
            return ModelA
        else:
            class ModelB(object):
                def __init__(self):
                    pass
            return ModelB

    model_class = get_model_class(option)
    model = model_class()
    with open("model.pkl", "wb") as f:
        # Fails: ModelA only exists inside get_model_class
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

    This is simple code of the kind you come across fairly often: it picks a model class according to some condition or runtime option and creates a model instance from it.

    Traceback (most recent call last):
      File "/Users/naubull2/test.py", line 20, in <module>
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)
    AttributeError: Can't pickle local object 'get_model_class.<locals>.ModelA'

    As explained at the start, this code raises an error.
    The definition of the ModelA class lives in the local scope of the get_model_class function.
    The global scope therefore has no code that defines ModelA, so if pickle allowed the instance to be pickled, it would be saving an object that it already knows can never be unpickled.

    To repeat: pickle does not store the instance as binary; it stores a binary that "creates" the instance.

    So how should we fix this?

    1. The quick fix: make the classes global

    The simplest option is to add global declarations so that the definition of the model's constructor is visible from the scope that does the pickling.

    import pickle

    option = 1  # stand-in for a runtime option (not defined in the original snippet)

    def get_model_class(params):
        # Bind the class names in the module's global scope instead of locally
        global ModelA
        global ModelB
        if params == 1:
            class ModelA(object):
                def __init__(self):
                    pass
            return ModelA
        else:
            class ModelB(object):
                def __init__(self):
                    pass
            return ModelB

    model_class = get_model_class(option)
    model = model_class()
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

    2. Move the class definitions to module level

    However, ModelA and ModelB look like classes that will be reused, so defining them inside a function and then making them global is a really odd structure. Moving the class definitions out of the function solves the problem just as well.

    Functionally, too, get_model_class only needs to pick which model class to create according to the option; it is not a builder function. Keeping the class definitions outside get_model_class is clearer both structurally and functionally.

    import pickle

    option = 1  # stand-in for a runtime option (not defined in the original snippet)

    class ModelA(object):
        def __init__(self):
            pass

    class ModelB(object):
        def __init__(self):
            pass

    def get_model_class(params):
        # Only selects a class; the definitions live at module level
        if params == 1:
            return ModelA
        else:
            return ModelB

    model_class = get_model_class(option)
    model = model_class()
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

    3. Split the class definitions into a separate module

    Are we done, then? Think about loading the pickled model back: at pickle.load time, the loading code needs to know what the constructor of the stored ModelA looks like in order to create it.

    If you try to load the model.pkl saved above from some other module, you get the following error.

    Traceback (most recent call last):
      File "/Users/naubull2/main.py", line 3, in <module>
        d = pickle.load(open("model.pkl", "rb"))
    AttributeError: Can't get attribute 'ModelA' on <module '__main__' from '/Users/naubull2/main.py'>

    ModelA was unknown in the place where the binary that creates the instance was executed, so of course it fails.
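The same failure can be reproduced in a single file. A minimal sketch, assuming a hypothetical `CustomClass`; deleting the name after pickling stands in for loading the file from a program that never defined it.

```python
import pickle


class CustomClass:
    """Hypothetical class, used only for illustration."""

    def __init__(self):
        self.value = 42


payload = pickle.dumps(CustomClass())

# Simulate a program that does not know CustomClass.
del CustomClass

try:
    pickle.loads(payload)
    error_name = None
except AttributeError as exc:
    # pickle could not find 'CustomClass' on the module named in the payload.
    error_name = type(exc).__name__
    print(error_name, exc)
```

The payload itself is intact; only the name lookup on the loading side fails, which is why making the definition importable is enough to fix it.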

    The class definitions are needed both where the model is saved and where it is loaded. If you split them into a separate module from the start, any code that needs to pickle dump / load can simply import that module, which is structurally the safest and cleanest approach.

    # models.py
    class ModelA(object):
        def __init__(self):
            pass

    class ModelB(object):
        def __init__(self):
            pass

    ##########################################################
    # save_model.py
    import pickle
    from models import ModelA, ModelB

    option = 1  # stand-in for a runtime option (not defined in the original snippet)

    def get_model_class(params):
        if params == 1:
            return ModelA
        else:
            return ModelB

    model_class = get_model_class(option)
    model = model_class()
    with open("model.pkl", "wb") as f:
        pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

    ##########################################################
    # main.py
    import pickle
    from models import ModelA, ModelB

    with open("model.pkl", "rb") as f:
        model = pickle.load(f)
    ...

    ๋‚˜๋จธ์ง€ ์œ ํ˜•๋“ค๋„ ์›์ธ์€ ๋ชจ๋‘ ๋™์ผํ•œ๋ฐ, ์ƒํ™ฉ์ด ๋‹ค๋ฅธ ๊ฒƒ๋ฟ์ž…๋‹ˆ๋‹ค.

    ex2. ๋ฉ€ํ‹ฐํ”„๋กœ์„ธ์‹ฑ ํ™˜๊ฒฝ์—์„œ์˜ can't pickle local objects

    ์ด์ œ, pickle์ด ์•ˆ ๋˜๋Š” ๊ฒฝ์šฐ์˜ ์ฃผ์š” ์›์ธ์„ ์•Œ์•˜๋Š”๋ฐ, ๊ทธ๋Ÿผ ์„ ํƒ์ ์œผ๋กœ ํด๋ž˜์Šค๋ฅผ ๊ณ ๋ฅด๊ณ  ์ธ์Šคํ„ด์Šค๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๊ฒฝ์šฐ ์™ธ์— ์–ด๋–ค ๊ฒฝ์šฐ์— ์ฃผ๋กœ ๋ฐœ์ƒํ• ๊นŒ์š”?

    multiprocessing์„ ์ด์šฉํ•  ๋•Œ, targetํ•จ์ˆ˜๋ฅผ ๋กœ์ปฌ ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•  ๊ฒฝ์šฐ์—๋„ ๋ฐœ์ƒ ๊ฐ€๋Šฅํ•ฉ๋‹ˆ๋‹ค.

    ๊ทœ๋ชจ๊ฐ€ ์ œ๋ฒ• ์žˆ๋Š” ์˜คํ”ˆ์†Œ์Šค ํ”„๋กœ์ ํŠธ์—์„œ๋„ ์ข…์ข… ๋ณด์ด๋Š” ์˜ค๋ฅ˜ ์œ ํ˜•์ด๋‹ˆ.. pickle์˜ ์ •์ฒด๋ฅผ ์•Œ์•„๋„ ํ”ผํ•˜๊ธฐ๊ฐ€ ์‰ฝ์ง€ ์•Š์€ ์˜ค๋ฅ˜์ธ ๊ฒƒ ๊ฐ™๋„ค์š”.
    https://github.com/pyg-team/pytorch_geometric/issues/366

    import multiprocessing

    def main():
        a = "string"

        def task1():
            print(a)

        # target is a local function, so it cannot be pickled for the child process
        job1 = multiprocessing.Process(target=task1)
        job1.start()
        job1.join()

    if __name__ == "__main__":
        main()

    Internally, the multiprocessing library also uses pickle when it hands the target function to the new process.
    In the example above, task1() is a local function that the worker process has no way of finding, so a pickle error is raised.

    Traceback (most recent call last):
      File "/Users/naubull2/test.py", line 10, in <module>
        main()
      File "/Users/naubull2/test.py", line 8, in main
        job1.start()
      ...
      File "/Users/naubull2/opt/anaconda3/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
        reduction.dump(process_obj, fp)
      File "/Users/naubull2/opt/anaconda3/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
        ForkingPickler(file, protocol).dump(obj)
    AttributeError: Can't pickle local object 'function.<locals>.task1'

    As with fixes 1 and 2, this can be solved by moving the function to module level or declaring it global.
    Splitting task1() into a separate module, as in fix 3, is also an option, but that is a case-by-case decision. Jobs are often simple enough to be defined right where they are run, and in that case pulling a single task1() function out into its own module would feel unnatural.
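A minimal sketch of the module-level variant, under the assumption that the value is passed in through `args` (a real one-element tuple) where the original captured it from the enclosing function. The `task_is_picklable` check is only there for illustration.

```python
import multiprocessing
import pickle


def task1(text):
    """Module-level target: child processes can find it by name."""
    print(text)


def main():
    job1 = multiprocessing.Process(target=task1, args=("string",))
    job1.start()
    job1.join()
    return job1.exitcode


# A module-level function pickles as a reference to its name.
task_is_picklable = pickle.loads(pickle.dumps(task1)) is task1

if __name__ == "__main__":
    # The guard keeps spawned children from re-running main() on import.
    print("exit code:", main())
```

Passing the value explicitly also removes the hidden dependency on the enclosing function's variables, which a separate process could not see anyway.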

    ex3. can't pickle local objects when using a lambda function

    By now it should be clear that a lambda, being a throwaway function, will not pickle either.

    import pickle
    var = pickle.dumps(lambda x, y: x+y)

    It raises the following error.

    Traceback (most recent call last):
      File "/Users/naubull2/test.py", line 2, in <module>
        var=pickle.dumps(lambda x, y: x+y)
    _pickle.PicklingError: Can't pickle <function <lambda> at 0x104c82040>: attribute lookup <lambda> on __main__ failed

    Not that there is often a reason to pickle a lambda, but if you must, defining a small named helper function and pickling that instead is about the best workaround.
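A minimal sketch of that workaround. The helper name `add` is hypothetical; for this particular lambda the standard library's `operator.add` happens to do the same job and is also picklable.

```python
import operator
import pickle


def add(x, y):
    """Named, module-level replacement for `lambda x, y: x + y`."""
    return x + y


restored_add = pickle.loads(pickle.dumps(add))
restored_op = pickle.loads(pickle.dumps(operator.add))  # stdlib equivalent
print(restored_add(1, 2), restored_op(1, 2))
```

Both work for the same reason as before: pickle stores a reference to a name it can look up again at module level, which an anonymous lambda does not have.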

    # Conclusion

    As a rule, it is best not to put blind faith in pickle. Still, pickle is a high-performance serializer, so if you have to use it, make sure you understand exactly what it is.

    Even if you never call pickle yourself, other libraries may use it as their serialization mechanism. As the examples above show, the usual cause of these errors is a scope problem, and I hope this post helps you resolve them.


    dev_naubull2: Answers lie in the details.
Designed by naubull2.