
[Python] Misconceptions about pickle and fixing the "Can't pickle local object" error

naubull2 2023. 2. 16. 00:42

🕓 5 mins read

# Misconceptions about pickle

When you use Python libraries, you will run into pickle sooner or later even if you never call it yourself. It is used not only to save and load files or objects, but also to share or pass objects between processes. If you don't know how pickle works, hitting an AttributeError or PicklingError like the ones below can be baffling. (Nothing looks wrong, so why!?)

PicklingError: Can't pickle <class 'class.method.var'>: it's not the same object as class.method.var

AttributeError: Can't pickle local object 'class.<locals>.some_var'

This trips up people who are new to Python, as well as developers who picked up the wrong habit from the popular(?) "saving Python data with pickle" style of post. As the Python documentation states, pickle is not a data storage tool but an "object serialization" tool.

https://docs.python.org/ko/3/library/pickle.html

Serialization, simply put, is the process of converting data into a byte stream. Since the result looks like saving to a binary file, the misunderstanding is understandable. The point to note, though, is that pickle is a binary protocol that serializes "object structure", not just "data".
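To make "object structure, not data" a bit more concrete, here is a minimal sketch that pickles a standard-library object and inspects the resulting bytes. `Fraction` is only an illustrative choice; any class would show the same thing.

```python
import pickle
import pickletools
from fractions import Fraction

value = Fraction(1, 3)
data = pickle.dumps(value)

# The stream refers to the class by module and name;
# it does not embed the source code of the class itself.
print(b"fractions" in data, b"Fraction" in data)

# Human-readable listing of the opcodes the unpickler will run.
pickletools.dis(data)

# Loading works here because fractions.Fraction can be imported.
restored = pickle.loads(data)
print(restored == value)
```

The opcode listing shows a reference to the class plus the arguments needed to rebuild the object, which is why the loading side must be able to import the same class.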

## Problems with pickling

  • Because the object structure is what gets serialized, unpickling fails when the implementation of that object cannot be found in the loading environment.
    If you pickled a "CustomClass" instance, it obviously cannot be unpickled somewhere that has no idea what "CustomClass" is.

More importantly, as sharp readers may have noticed from the example above, unpickling does not simply read data back; it "executes" instructions that rebuild the object. So unless the pickle comes from a verified source, just unpickling it can run malicious code, which makes it an unsafe protocol.

The Python documentation carries a scary warning about this too.

https://docs.python.org/ko/3/library/pickle.html
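A harmless sketch of why loading counts as "executing": a class can tell pickle, through `__reduce__`, which callable to invoke when it is rebuilt. The `Sneaky` class is hypothetical, and the built-in `sorted` stands in for whatever function an attacker would choose instead.

```python
import pickle

class Sneaky:
    """Hypothetical class that asks pickle to 'rebuild' it by calling a function."""

    def __reduce__(self):
        # (callable, args): the unpickler calls sorted([3, 1, 2]) while loading.
        return (sorted, ([3, 1, 2],))

payload = pickle.dumps(Sneaky())

# No Sneaky instance comes back; we get the return value of the call instead.
result = pickle.loads(payload)
print(result)  # [1, 2, 3]
```

Nothing in `pickle.loads` asked for `sorted` to run; the byte stream did. That is the reason pickles from untrusted sources should never be loaded.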

  • Pickles are not guaranteed to be portable between environments. The format itself is backward compatible across Python releases, but a pickle written with a newer protocol cannot be read by an older interpreter, and loading also fails when the class or library definitions differ between the two sides.
    - ex. An object pickled with protocol 5 (added in Python 3.8) cannot be unpickled on Python 3.7.
  • Since it is a binary protocol, the stored object is naturally not human-readable.
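If a pickle has to be read by an older interpreter, one option is to pin the protocol explicitly instead of using `pickle.HIGHEST_PROTOCOL`. A minimal sketch; the choice of protocol 4 and the `record` dictionary are assumptions for illustration.

```python
import pickle

# What this interpreter uses by default, and the newest protocol it supports.
print(pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL)

record = {"name": "model", "epochs": 3}

# Pin a protocol that every consumer is known to support
# (protocol 4 was added in Python 3.4; pick whatever fits your oldest reader).
data = pickle.dumps(record, protocol=4)

# Streams written with protocol 2 or later start with the PROTO opcode (0x80)
# followed by the protocol number.
print(data[:2])  # b'\x80\x04'

print(pickle.loads(data) == record)
```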

### huggingface model hub

Until just a few months ago, the huggingface model hub also relied on pickle, so you were not supposed to run from_pretrained() carelessly unless you trusted the model. (Someone could have planted malicious code in it, after all.)

- Of course, once this became known, huggingface added a scary(?) warning of its own and started scanning models to some extent when they are uploaded.

https://huggingface.co/docs/hub/security-pickle

 


It explains in detail why and how dangerous loading a pickle is, along with the precautions to take, so it is worth a read if you are interested.
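One precaution described in the Python pickle documentation is to restrict which globals the unpickler may look up, by overriding `Unpickler.find_class`. The sketch below follows that idea; the allow-list contents and the `restricted_loads` helper are illustrative choices, and this reduces risk rather than making untrusted pickles safe.

```python
import builtins
import io
import pickle

# Assumed allow-list for illustration: only these builtins may be loaded.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

print(restricted_loads(pickle.dumps([1, 2, range(3)])))

try:
    restricted_loads(pickle.dumps(len))  # builtins.len is not on the list
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```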

 

# Types of pickle errors

In most cases, these local object errors occur when the thing being pickled is a local object (one whose definition cannot be found at module scope).
(Given that unpickling really means running instructions that recreate the object, it makes sense that this is an error.)
Here are a few typical error types and how to fix them. (Most go away once you declare the object global or restructure the code.)

ex1. AttributeError - can't pickle local objects

This is the most common case: the error occurs when you pickle.dump an instance of a class defined in a function's local scope.

import pickle

def get_model_class(params):
    if params == 1:
        class ModelA(object):
            def __init__(self):
                pass
        return ModelA
    else:
        class ModelB(object):
            def __init__(self):
                pass
        return ModelB

option = 1  # e.g. taken from a runtime flag
model_class = get_model_class(option)
model = model_class()

with open("model.pkl", "wb") as f:
    pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)  # raises the error below

This is simple code that you will often come across: it picks a model class based on some condition or runtime option and creates a model instance.

Traceback (most recent call last):
  File "/Users/naubull2/test.py", line 20, in <module>
    pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)
AttributeError: Can't pickle local object 'get_model_class.<locals>.ModelA'

As explained at the start, this code raises an error.
The definition of the ModelA class lives in the local scope of the get_model_class function.
So no code defining ModelA exists at the global scope. If pickling this instance were allowed, you would be saving an object knowing that it could never be unpickled.

To say it once more: pickle does not store the instance itself as binary; it stores binary instructions for "creating" the instance.
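The `<locals>` part of the error message comes from the class's `__qualname__`, which is the dotted path pickle would have to follow from the module to find the class again. A small sketch (the class bodies are placeholders) that compares a local class with a module-level one:

```python
import pickle

def get_model_class():
    class ModelA:  # defined in the function's local scope
        pass
    return ModelA

class ModelB:  # defined at module level
    pass

local_cls = get_model_class()
print(local_cls.__qualname__)  # get_model_class.<locals>.ModelA
print(ModelB.__qualname__)     # ModelB

# A path containing "<locals>" cannot be resolved at load time,
# so pickle refuses to write it in the first place.
try:
    pickle.dumps(local_cls())
    pickled = True
except (AttributeError, pickle.PicklingError) as exc:
    pickled = False
    print(exc)
```

The exact exception type and message can differ between Python versions, which is why the sketch catches both.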

So how should we fix this?

1. Simply make the class global

The quickest fix is to add a global declaration so that the definition of the model class is visible from the scope where pickling happens.

import pickle

def get_model_class(params):
    # Bind the class names at module scope so pickle can look them up
    global ModelA
    global ModelB
    if params == 1:
        class ModelA(object):
            def __init__(self):
                pass
        return ModelA
    else:
        class ModelB(object):
            def __init__(self):
                pass
        return ModelB

option = 1  # e.g. taken from a runtime flag
model_class = get_model_class(option)
model = model_class()

with open("model.pkl", "wb") as f:
    pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

2. Move the class definitions to module level

However, ModelA and ModelB look like highly reusable classes, and defining them inside a function only to make them global is a really odd structure. Wouldn't it be simpler to just pull the class definitions out of the function?

Functionally as well, get_model_class only needs to pick which model class to instantiate based on the option. It is not a builder function, so keeping the class definitions outside get_model_class is clearer both structurally and functionally.

import pickle

class ModelA(object):
    def __init__(self):
        pass

class ModelB(object):
    def __init__(self):
        pass

def get_model_class(params):
    # Only selects a class; the definitions live at module level
    if params == 1:
        return ModelA
    else:
        return ModelB

option = 1  # e.g. taken from a runtime flag
model_class = get_model_class(option)
model = model_class()

with open("model.pkl", "wb") as f:
    pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

3. Split the class definitions into a separate module

So are we done? Think about loading the pickled model back. At pickle.load time, the loading code needs to know what the class looks like in order to recreate the stored ModelA instance.

If you try to load the model.pkl saved above from some other module, you get the following error.

Traceback (most recent call last):
  File "/Users/naubull2/main.py", line 3, in <module>
    d = pickle.load(open("model.pkl", "rb"))
AttributeError: Can't get attribute 'ModelA' on <module '__main__' from '/Users/naubull2/main.py'>

We ran the instructions for creating an instance in a place that knows nothing about ModelA, so of course it fails.

The class definitions are needed both where the model is saved and where it is loaded. If you split them into a separate module from the start, each side only has to import that module to dump or load the pickle, which is structurally the safest and cleanest approach.

# models.py
class ModelA(object):
    def __init__(self):
        pass

class ModelB(object):
    def __init__(self):
        pass

##########################################################
# save_model.py
import pickle
from models import ModelA, ModelB

def get_model_class(params):
    if params == 1:
        return ModelA
    else:
        return ModelB

option = 1  # e.g. taken from a runtime flag
model_class = get_model_class(option)
model = model_class()

with open("model.pkl", "wb") as f:
    pickle.dump(model, f, pickle.HIGHEST_PROTOCOL)

##########################################################
# main.py
import pickle
from models import ModelA, ModelB  # the classes must be importable where the pickle is loaded

with open("model.pkl", "rb") as f:
    model = pickle.load(f)
...

The remaining cases all share the same root cause; only the situation differs.

ex2. can't pickle local objects in a multiprocessing environment

Now that we know the main reason pickling fails, where else does it tend to show up besides choosing a class and creating an instance?

It can also happen with multiprocessing when the target function is a local function.

This type of error shows up now and then even in fairly large open-source projects, so it seems hard to avoid even when you know what pickle really is.
https://github.com/pyg-team/pytorch_geometric/issues/366

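import multiprocessing

def main():
    a = "string"

    def task1():  # local function: cannot be pickled by reference
        print(a)

    job1 = multiprocessing.Process(target=task1)
    job1.start()  # fails here when the start method is "spawn"
    job1.join()

if __name__ == "__main__":
    main()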

The multiprocessing library also uses pickle internally when it hands the target function to the worker (with the spawn start method, as the traceback below shows).
The local function task1() from the example above cannot be pickled for the worker process, so a pickle error is raised.

Traceback (most recent call last):
  File "/Users/naubull2/test.py", line 10, in <module>
    main()
  File "/Users/naubull2/test.py", line 8, in main
    job1.start()
    ...
  File "/Users/naubull2/opt/anaconda3/lib/python3.9/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/Users/naubull2/opt/anaconda3/lib/python3.9/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'function.<locals>.task1'

Just like fix 2, moving the function to module level solves this. Declaring it global as in fix 1 is less reliable here: with spawn, the worker re-imports the main module and only sees names that exist at import time, not ones created when main() runs.
Splitting task1() into a separate module as in fix 3 is another option, but that is a case-by-case decision. A job is often simple enough to be defined right where it is launched, and in that case pulling a single task1() function out into its own module would feel unnatural.
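A minimal sketch of the module-level variant, assuming the value the task needs is passed through `args` instead of being captured from the enclosing function. The `message` parameter and the returned exit code are additions for illustration.

```python
import multiprocessing

def task1(message):
    """Module-level target, so the worker can look it up by name."""
    print(message)

def main():
    a = "string"
    # args must be a tuple, hence the trailing comma.
    job1 = multiprocessing.Process(target=task1, args=(a,))
    job1.start()
    job1.join()
    return job1.exitcode

if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters with spawn, because the worker re-imports the main module and must not launch the job again while doing so.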

ex3. can't pickle local objects when using a lambda function

By this point you can probably guess that a lambda, being an anonymous throwaway function, will not pickle either.

import pickle

var = pickle.dumps(lambda x, y: x+y)

It raises the following error.

Traceback (most recent call last):
  File "/Users/naubull2/test.py", line 2, in <module>
    var=pickle.dumps(lambda x, y: x+y)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x104c82040>: attribute lookup <lambda> on __main__ failed

Not that you would often need to pickle a lambda... but if you do, the workable fix is to define a small named helper function and pickle that instead.
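A sketch of that helper-function approach; the name `add` is just an illustrative replacement for the lambda above.

```python
import pickle

def add(x, y):
    """Named, module-level replacement for `lambda x, y: x + y`."""
    return x + y

# The function is stored by reference (module name plus "add"), not by its code.
data = pickle.dumps(add)

# Loading looks the function up again by that name.
restored = pickle.loads(data)
print(restored(2, 3))  # 5
```

As with classes, the loading side still needs to be able to import a function with the same module and name.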

# Conclusion

It is best not to put blind faith in pickle, but pickle itself is a high-performance serializer, so if you have to use it, make sure you understand what it really is.

Also, even if you never use it directly, other libraries may rely on pickle as their serialization mechanism. As the examples above show, the usual culprit is a scope problem, so I hope this post helps you resolve your pickle errors.