reinforcement learning 1

210914 - DRF, RL (openai gym post recommendation), drawing

I got DRF working. Because auth is managed by default, it would not work until I configured it. Still, my understanding of views.py, urls.py, and even the serializer has (fortunately) improved somewhat. Once I receive the design draft and start going back and forth between the frontend and the backend while loading in some data, my Django practice should improve naturally. Also, today I ran OpenAI Gym end to end on Colab for RL. As expected, env.render() was the problem. It is probably easiest to just assume this part only works on Windows (more precisely, it works wherever a display is available). Code I referred to: https://www.anyscale.com/blog/an-introduction-to-reinforcement-learning-with-openai-gym-r..
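As a rough sketch of the "it works if there is a display" observation, the check below guesses whether a display is reachable and picks a render mode accordingly. The helper names `has_display` and `choose_render_mode` are hypothetical and not part of Gym. The environment-variable check is only a heuristic I am assuming here. `"human"` and `"rgb_array"` are the usual Gym render mode names.

```python
import os
import sys


def has_display(environ=None, platform=None):
    """Guess whether a GUI display is reachable (hypothetical helper, heuristic only)."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    # Windows and macOS desktops normally have a display attached.
    if platform.startswith("win") or platform == "darwin":
        return True
    # On Linux (which Colab runs on), X11 and Wayland sessions set these variables.
    return bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))


def choose_render_mode(environ=None, platform=None):
    """Return "human" (opens a window) if a display exists, else "rgb_array" (raw frames)."""
    return "human" if has_display(environ, platform) else "rgb_array"


if __name__ == "__main__":
    # On a headless Colab runtime this would typically print "rgb_array".
    print(choose_render_mode())
```

With `"rgb_array"`, the frames come back as arrays that can be shown with an image or plotting library in the notebook instead of in a window, which is one common way around the Colab limitation.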

Life 2021.09.14