GridFS, Nginx and Python Imaging Library

After spending quite a bit of time on the server setup and beginning development, I realized that I had missed a key component for some of the basic site functionality.

Image uploads.

MongoDB supports this through GridFS, a specification for storing large objects in the database. While benchmark results show that serving images directly from the file system is faster, I decided to give it a shot anyway.

There is also an Nginx module that lets you serve content directly from GridFS.

In a previous post, I used aptitude to install Nginx from the universe repository. Since the addition of the nginx-gridfs module required recompiling Nginx, I decided to upgrade to the latest stable Nginx release (1.0.5).

It was a bit trickier than I expected, but these are my notes on compiling Nginx on Ubuntu 10.04 LTS while trying to mimic the install delivered from the Ubuntu repositories (with some minor improvements).

In my earlier post I covered much of the initial server build, which included most of the essential components for compiling from source. The only addition is the PCRE libraries.

apt-get -y install libpcre3 libpcre3-dev

A few folders need to be created manually prior to the install. I would have assumed that they would be created during installation, as they are specified in the configure statement, but they are not.

mkdir /var/lib/nginx
mkdir /var/lib/nginx/body
mkdir /var/lib/nginx/proxy
mkdir /var/lib/nginx/fastcgi
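
As a hedged alternative, the same folders can be created in one re-runnable step with mkdir -p, which creates missing parent folders and does not complain if a folder already exists. PREFIX is a variable I made up for this sketch so it can be tried in a scratch directory without root; it is not part of the original steps.

```shell
# PREFIX is an illustrative variable, not part of the original steps.
# It defaults to a scratch directory so the sketch can run without root;
# set PREFIX="" to create the real /var/lib/nginx paths (needs root).
PREFIX="${PREFIX-$(mktemp -d)}"

for dir in body proxy fastcgi; do
    # -p also creates /var/lib/nginx itself and is safe to re-run
    mkdir -p "$PREFIX/var/lib/nginx/$dir"
done

ls "$PREFIX/var/lib/nginx"
```

Running it a second time should be harmless, which is handy when the steps live in a provisioning script.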

Now, grab the nginx-gridfs module. It is needed prior to compiling Nginx.

cd /usr/local/src/
git clone https://github.com/mdirolf/nginx-gridfs/
cd nginx-gridfs
git submodule init
git submodule update

That should leave you with a folder in /usr/local/src/ that contains the nginx-gridfs module. You will use this path during the configure step.

Let’s start the Nginx compilation process:

cd /usr/local/src/
wget http://nginx.org/download/nginx-1.0.5.tar.gz
tar xzvf nginx-1.0.5.tar.gz
cd nginx-1.0.5
./configure \
--sbin-path=/usr/sbin \
--conf-path=/etc/nginx/nginx.conf \
--http-log-path=/var/log/nginx/access.log \
--error-log-path=/var/log/nginx/error.log \
--pid-path=/var/run/nginx.pid \
--lock-path=/var/lock/nginx.lock \
--http-client-body-temp-path=/var/lib/nginx/body \
--http-proxy-temp-path=/var/lib/nginx/proxy \
--http-fastcgi-temp-path=/var/lib/nginx/fastcgi \
--with-debug \
--with-http_flv_module \
--with-http_ssl_module \
--with-http_dav_module \
--with-http_gzip_static_module \
--with-http_realip_module \
--with-mail \
--with-mail_ssl_module \
--with-ipv6 \
--add-module=/usr/local/src/nginx-gridfs/
make && make install
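
One way to confirm that the module made it into the build is to look at the configure arguments that nginx -V prints (it writes them to stderr). The snippet below is only a sketch: has_configure_arg is a helper name I made up, and the sample string is a shortened illustration rather than real output.

```shell
# has_configure_arg is a hypothetical helper: it succeeds when the given
# argument appears as a whole word in a configure-arguments string.
has_configure_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

# Illustrative, shortened sample. On the server use: args="$(nginx -V 2>&1)"
args="configure arguments: --with-http_ssl_module --add-module=/usr/local/src/nginx-gridfs/"

if has_configure_arg "$args" "--add-module=/usr/local/src/nginx-gridfs/"; then
    echo "nginx-gridfs was passed to configure"
else
    echo "nginx-gridfs is missing from the configure arguments"
fi
```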

You should now have a working copy of Nginx, with its configuration in /etc/nginx and the binary in /usr/sbin.

Part of the Nginx install from Ubuntu includes a pretty decent tie-in with the services management functionality in the OS. The base compilation of Nginx does not include any of this, so you need to add it.

There is an excellent init script on GitHub provided by JasonGiedymin. A few of the parameters need to be modified, as the file and folder locations are different. Since these steps have been taken from my Linode StackScript, I am using the sed utility to make these changes.

cd /usr/local/src/
git clone https://github.com/JasonGiedymin/nginx-init-ubuntu/
cp ./nginx-init-ubuntu/nginx /etc/init.d/
sed -i 's#NGINX_CONF_FILE="/usr/local/nginx/conf/nginx.conf#NGINX_CONF_FILE="/etc/nginx/nginx.conf#g' /etc/init.d/nginx
sed -i 's#DAEMON=/usr/local/sbin/nginx#DAEMON=/usr/sbin/nginx#g' /etc/init.d/nginx
sed -i 's#lockfile=/var/lock/subsys/nginx#lockfile=/var/lock/nginx.lock#g' /etc/init.d/nginx
chmod +x /etc/init.d/nginx
/usr/sbin/update-rc.d -f nginx defaults
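
The sed commands use # as the delimiter instead of the usual /, so the slashes in the paths do not need escaping. Here is a small dry run of the idea on two sample lines, without touching /etc/init.d/nginx. The sample lines are modelled on the patterns searched for above (not copied from the init script), and the lock file is pointed at the absolute path passed to --lock-path.

```shell
# Sample input modelled on the search patterns; not the real init script.
result="$(printf '%s\n' \
    'DAEMON=/usr/local/sbin/nginx' \
    'lockfile=/var/lock/subsys/nginx' |
  sed -e 's#DAEMON=/usr/local/sbin/nginx#DAEMON=/usr/sbin/nginx#g' \
      -e 's#lockfile=/var/lock/subsys/nginx#lockfile=/var/lock/nginx.lock#g')"

printf '%s\n' "$result"
# prints:
# DAEMON=/usr/sbin/nginx
# lockfile=/var/lock/nginx.lock
```

Piping sample text through sed like this is a cheap way to check a substitution before running it with -i against the real file.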

Now we can easily start and stop Nginx, and run a number of other helpful commands.

service nginx start

Now that it is installed, we need to configure it to work with gunicorn and Django.

mkdir /etc/nginx/conf.d
mkdir /etc/nginx/sites-available
mkdir /etc/nginx/sites-enabled
mv /etc/nginx/nginx.conf /etc/nginx/nginx.conf-bak
touch /etc/nginx/nginx.conf

This is the Nginx conf file, located in /etc/nginx/nginx.conf. You will notice that a lot of the paths and files used in the configure statement above are used in this file.

user www-data;
worker_processes  1;
error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;
events {
    worker_connections  1024;
}
http {
    include       /etc/nginx/mime.types;
    access_log    /var/log/nginx/access.log;
    sendfile        on;
    tcp_nopush     on;
    keepalive_timeout  65;
    tcp_nodelay        on;
    gzip  on;
    gzip_disable "MSIE [1-6]\.(?!.*SV1)";
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

The /etc/nginx/sites-enabled folder contains the configuration files for each site you want to host. In this particular case there is only one site hosted on the server, but this layout mirrors how the Ubuntu package is delivered.

Create the configuration file, then set up a symlink between the sites-available and sites-enabled paths.

touch /etc/nginx/sites-available/site_name.conf
ln -s /etc/nginx/sites-available/site_name.conf /etc/nginx/sites-enabled/site_name.conf
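
The point of the two folders is that a site can be switched on or off by adding or removing a symlink, while the real file stays in sites-available. A minimal sketch of that idea follows, run against a scratch directory. NGINX_ETC, enable_site and disable_site are names I made up for illustration.

```shell
# NGINX_ETC is an illustrative variable, not part of the original steps.
# It defaults to a scratch directory; set NGINX_ETC=/etc/nginx on the server.
NGINX_ETC="${NGINX_ETC-$(mktemp -d)}"
mkdir -p "$NGINX_ETC/sites-available" "$NGINX_ETC/sites-enabled"

# enable_site and disable_site are hypothetical helper names.
enable_site() {
    # -f replaces an existing link so the command can be re-run safely
    ln -sf "$NGINX_ETC/sites-available/$1" "$NGINX_ETC/sites-enabled/$1"
}
disable_site() {
    # Removes only the link; the file in sites-available is kept
    rm -f "$NGINX_ETC/sites-enabled/$1"
}

touch "$NGINX_ETC/sites-available/site_name.conf"
enable_site site_name.conf
ls -l "$NGINX_ETC/sites-enabled"
```

Calling disable_site site_name.conf would take the site out of the include path again; Nginx still needs a reload to notice either change.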

Here is a sample configuration file (the site_name.conf created above). This particular file is intended to proxy requests from Nginx to gunicorn.

server {
    listen   80;
    server_name site_name;
    access_log /var/log/nginx/site_name.access.log;
    error_log /var/log/nginx/site_name.error.log;
    location / {
        proxy_pass http://127.0.0.1:8000/;
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        client_max_body_size 10m;
        client_body_buffer_size 128k;
        proxy_connect_timeout 90;
        proxy_send_timeout 90;
        proxy_read_timeout 90;
        proxy_buffers 32 4k;
    }
    location /static/ {
        alias    /srv/site_name/public/static/;
        expires 24h;
    }
    location /gridfs/ {
        gridfs site_name;
    }
}
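
Before restarting Nginx with new configuration files, it is worth letting it syntax-check them with the -t flag. The sketch below assumes the /etc/nginx/nginx.conf path used in this post; the status variable is just an illustrative name, and the check is guarded so the snippet is harmless on a machine without Nginx.

```shell
# Syntax-check the configuration; -c names the file to test.
# Guarded so the snippet does nothing harmful if nginx is not on the PATH.
if command -v nginx >/dev/null 2>&1; then
    nginx -t -c /etc/nginx/nginx.conf && status="ok" || status="invalid"
else
    status="nginx-not-found"
fi

echo "config check: $status"
```

If the check passes, the running server can pick up the changes with a restart through the init script.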

While I have not tested the gridfs proxy yet, the sample provided above has been taken from the module documentation.

We intend to store images in MongoDB using the ImageField field type provided as part of the Django model specification. This field type requires the Python Imaging Library (PIL).

PIL is available through aptitude, but Django will not recognize it unless it is installed in the virtualenv. This is easy to solve by downloading the source and installing it once the virtualenv has been activated.

source bin/activate
wget http://effbot.org/downloads/Imaging-1.1.7.tar.gz
tar xzvf Imaging-1.1.7.tar.gz
cd Imaging-1.1.7
python setup.py install
cd .. && rm -r Imaging-1.1.7 && rm Imaging-1.1.7.tar.gz
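
A quick way to check that the interpreter in the active virtualenv can actually see PIL is to try the import from the shell. This is only a sketch; pil_status is an illustrative variable name, and it assumes python on the PATH is the virtualenv's interpreter.

```shell
# Try the import with whichever python is first on the PATH
# (the virtualenv's interpreter once bin/activate has been sourced).
if python -c 'from PIL import Image' 2>/dev/null; then
    pil_status="available"
else
    pil_status="missing"
fi

echo "PIL is $pil_status for: $(command -v python || echo 'no python found')"
```

If it reports missing while the virtualenv is active, the install most likely went into the system Python instead.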

Now that PIL is installed, the database can be synchronized without throwing an error message about a missing PIL.

Hopefully, this is the last time I have to revisit the server components, as I should have everything required to complete the rest of the project. The next few posts should cover project development efforts.
