Docker is all the rage right now. Well, containers in general. I’ve looked into Docker for work before and it just didn’t fit with what the company needed at the time. I always wanted to take a look at Docker for my own personal projects, and recently I got the chance to do just that. Late one night I decided it would be a good time to do some cleanup on one of my project servers, namely the one that hosts the Reddit Media Scraper (RMS). As the name implies, it scrapes Reddit… for media… yeah… Anyway, this is just a personal project to teach myself how to access RESTful APIs (it uses not only Reddit’s API but also those of Imgur, Gfycat, and some others), web scraping, and functional Python programming. Well, as I was cleaning out files in my less-than-alert state, I accidentally destroyed the entire RMS directory. Yarp.
Since I had to restore all the app files, I decided to just wrap it all up into a Docker container and learn something in the process. I had already learned a little about Docker the first time I looked into it, so I just had to learn how to build an image using a Dockerfile, which didn’t seem very hard to do. I looked up the Dockerfile reference documentation, started up Sublime Text, and in 5 minutes I had it done. It was pretty basic: I used the official ubuntu image, installed python3 and python3-pip, copied the code, installed the requirements, and set the command to run on container start. Tested it out and it worked! Well… it sort of worked.
I had the title of every link that was processed printed to stdout, and some titles had non-ASCII characters, which caused a UnicodeEncodeError when the app ran inside the container. OK, not a problem, I’ll just convert it. So that’s what I tried first… only to realize that the problem was the container not being set up for UTF-8, not a problem with RMS or Python 3. Some quick googling revealed that by default Docker containers are set for ASCII. The suggestion I found was to change the container from ASCII to UTF-8 by setting the environment variable LANG=UTF-8:
docker container run --name rms -e LANG=UTF-8 rms
That didn’t work for me… After some more googling and testing I eventually found that I had to set LANG=C.UTF-8 for it to work. I updated the Dockerfile to set that ENV and gave it a test. It worked! It took me about 2 hours total, from getting things working manually (i.e. running /bin/bash in the Ubuntu container) to writing the Dockerfile and finally getting everything working and deployed successfully. Pretty cool. I can see why developers like Docker so much.
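If you want to see the encoding failure without spinning up a container, here is a minimal Python sketch. It is not RMS code: `print_title` and the sample title are made up for illustration. It wraps an in-memory buffer in a text stream with a chosen encoding, which is roughly the situation stdout is in when the locale resolves to ASCII versus UTF-8.

```python
import io
import sys


def print_title(title, encoding):
    """Print a title to an in-memory text stream that uses the given encoding.

    Hypothetical helper for illustration only; the stream stands in for
    stdout in an environment whose locale resolves to that encoding.
    Returns the bytes that were written.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    print(title, file=stream)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    title = "Café at sunset"  # made-up title with one non-ASCII character

    # An ASCII-only stream cannot represent "é", so printing raises.
    try:
        print_title(title, "ascii")
    except UnicodeEncodeError as exc:
        print("ASCII stream failed:", exc.reason)

    # The same title goes through fine once the stream is UTF-8.
    print("UTF-8 stream wrote:", print_title(title, "utf-8"))

    # Diagnostic: the encoding this process's real stdout is using.
    print("Current stdout encoding:", sys.stdout.encoding)
```

The last line is handy as a quick check inside a container, with and without the LANG setting. What it reports depends on the environment and the Python version, so treat it as a diagnostic rather than a fixed result.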