Thursday, March 5, 2015

How to Setup Django on an AWS EC2 Instance Using VirtualEnv

Setting up an EC2 instance is as easy as following the launch wizard provided by Amazon, so it will not be covered in this tutorial (to set up EC2 for yourself, you can follow Amazon's guide found here). After following the instructions for setting up an Ubuntu server, you should be ready to follow the rest of this tutorial. Though the server used is Ubuntu, the steps will be fairly similar for any Linux distribution (just use that distro's package manager instead). You can also follow this tutorial for connecting to an EC2 instance from Windows if you are unfamiliar with the process.

Update Ubuntu


The first thing that you should always do when starting a new server is update it. This provides all of the security updates and functional fixes that have been released since Amazon took the image of Ubuntu that you are using.

To do so, enter the commands:

sudo apt-get update

sudo apt-get upgrade

The first command refreshes the list of available packages, and the second downloads and installs the newer versions. Normally this step doesn't take more than a minute or two, and once you are done you can usually continue setting up the server without having to restart (a nice advantage over the typical Windows update cycle).

Update the Distribution of Python to Python 3


NOTE: If you want to just stick with Python 2 instead of going through these steps, then skip ahead to the next section and treat python 3 references as though it were python 2 by simply removing the 3 in the call.

Before you do anything with your current installation's python, you need to take note (write on a piece of paper or store on a notepad document) of what version is the default version of python. You do so by running:

python --version

The above command will return something like Python 3.2.3. Make sure to make note of this as it will be important going forward.

My personal opinion is that Python 2, while great, should be deprecated and the world should move to Python 3. I won't go into an explanation here as to why I feel this way, but it is easy enough to do and writing things in Python 3 will ensure that your code will continue to work when (if ever) they decide to finally stop supporting Python 2.

To install python 3, you should run the following (this may not be necessary for server version 12.10 and up, see here):

sudo apt-get install python3

A quick Google search brought up some stackoverflow.com answers that seem to indicate that it is a bad idea to switch the system default python version to Python 3 (primarily because Python 2 and 3 are not compatible) and will likely break some scripts that rely on Python 2. I am not a Linux expert by any means, and since this is a practical tutorial on how to set up a server that will work, we will do as we are told (see more about it here). 

To set up Python 3, we will create an alias by doing the following command:

echo 'alias python=python3' >> ~/.bashrc

The above command will edit the default bash shell setup (or, if you are not familiar with Linux at all, the command prompt you see through a terminal to your Linux box) with an alias to the word python that will point to python3. You will not see the change until you reconnect. To make sure it works, reconnect to the server and enter:

python --version

It should now be Python 3.x.x
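If you would rather check from inside Python itself, here is a minimal sketch. The file name check_version.py and the function name describe_interpreter are my own choices for illustration, not anything the tutorial depends on. It also prints the 3.x number that later steps ask for.

```python
import sys


def describe_interpreter(version_info=sys.version_info):
    """Return (major, 'major.minor') for a version_info-style tuple."""
    major, minor = version_info[0], version_info[1]
    return major, "{0}.{1}".format(major, minor)


if __name__ == "__main__":
    major, short = describe_interpreter()
    # str.format keeps this runnable on Python 2 as well, so it can report
    # the problem instead of crashing when the alias is missing.
    print("Running Python {0}".format(short))
    if major < 3:
        print("This is still Python 2; check the alias in ~/.bashrc")
```

Save it as check_version.py and run python check_version.py after reconnecting.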


Symlink Your Python Executable


We will also want to set up a symlink (basically an alias for a path) so that ~/bin/python points to the Python 3 interpreter. Do the following:

Make the directory that will hold the link:

mkdir -p ~/bin

Then run:

ln -s /usr/bin/python3 ~/bin/python

Essentially we just made a path called /home/ubuntu/bin/python that points to the Python interpreter in /usr/bin/python3. Neither command needs sudo because both only touch your home directory. This link will be useful in the next step.

NOTE: If you didn't setup Python 3, then you will need to replace python3 with python2.

Install Python Package Manager


Pip is the way many people prefer to manage packages specific to python. There are other package managers, but pip is so easy to use that it doesn't make a lot of sense not to use it, and this tutorial is specific to pip, so we will need to install it before we continue.

To install pip, run:

sudo apt-get install python-pip

Pip should now be installed and ready to use.

Setup a VirtualEnv


Several places online encourage the use of virtualenv to run your Django instance. Since it is not a very hard thing to do, we will set it up as well. I will not go into detail as to why it is a good idea, but you can Google the reasons yourself if you want. We will follow the install instructions found at http://docs.python-guide.org/en/latest/dev/virtualenvs/.

To install virtualenv, call:

sudo pip install virtualenv

Since we set a symlink in the previous step to Python 3, we can use it to set up the virtualenv with the Python 3 interpreter:

virtualenv -p ~/bin/python venv

Note that it is possible to sidestep the symlinking and do something like virtualenv -p /usr/bin/python3 venv, but symlinking provides the advantage of being able to change the python version without having to update this call. Essentially, if we wanted for some reason to move back to Python 2 (or if you never went to Python 3 using the steps above), we could set the symlink to /usr/bin/python2 and the virtualenv call wouldn't know the difference (unless we broke compatibility by going back to python 2). It is therefore a good idea to make your calls using symlinks, as it leaves more flexibility to change things in the future. If you are doing this manually, it is probably fine to link directly to the python version you want, but writing things this way makes them much easier to port into scripts later.

Now we need to activate the virtualenv:

source venv/bin/activate

Now that it is set up, you should see (venv) on the left side of your command prompt, indicating that you are in a virtual environment. The environment itself stays on disk, but its activation only lasts as long as the shell is alive, so you will need to run the above command each time you want to work in your venv after closing the shell (or after running deactivate). Go ahead and enter deactivate for the next step.
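If the (venv) marker is missing, or you are inside a script, you can also ask the interpreter whether it is running from a virtual environment. This is only a sketch and the function name is mine. It relies on sys.real_prefix, which older virtualenv releases set, and on sys.base_prefix, which Python 3.3 and newer provide.

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter is running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases make sys.base_prefix differ from
    # sys.prefix; fall back to sys.prefix where base_prefix does not exist.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("prefix: {0}".format(sys.prefix))
    print("inside a virtual environment: {0}".format(in_virtual_environment()))
```

Run it once before and once after source venv/bin/activate to compare the two answers.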

NOTE: Python 3 comes with a virtual environment package built in called, conveniently, venv. I didn't know this until after I started writing this tutorial. It is basically the same as virtualenv, and it is likely easier to use. I would read about it here: https://www.python.org/dev/peps/pep-0405/, or the documentation here.
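For the curious, here is a rough sketch of what using the built-in venv module looks like from Python. I am assuming Python 3.4 or newer (where the with_pip argument exists), and the function name create_environment is just something I made up for the example.

```python
import os
import venv


def create_environment(env_dir):
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False keeps the sketch independent of ensurepip, which some
    # Ubuntu releases package separately; pass with_pip=True to bootstrap pip.
    venv.create(env_dir, with_pip=False)
    return os.path.join(env_dir, "pyvenv.cfg")
```

Calling create_environment("/home/ubuntu/venv3") would give you a directory you can activate with source just like the one virtualenv made. The path is only an example.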

Setup PostgreSQL


PostgreSQL is the recommended database for Django as it is the most supported of all the databases. You will need to install it on your system by doing the following:

sudo apt-get install postgresql

We now need to log in to the server and setup the postgres user (probably a good idea to try and set up a different user other than postgres since postgres is the superuser for your database, but for now we can just use postgres). Do so by entering the psql (postgres database management prompt) by typing the following:

sudo -u postgres psql

You should see a prompt that looks like:

postgres=#

This is the command prompt for postgres and will allow us to perform operations on the database. First we set up the user password:

ALTER USER postgres PASSWORD '<password here>';

The prompt should respond with the words ALTER ROLE.

Now we will create our database:

CREATE DATABASE <db_name_here>;

You should then see CREATE DATABASE to confirm that it was created.

Now you will need to install a few other packages so Django will be able to talk to the server. Run the command:

sudo apt-get install postgresql-server-dev-x.x

where x.x is the version number of your PostgreSQL database. You can find out the version of PostgreSQL by stopping the service:

sudo /etc/init.d/postgresql stop

The output will show you the version that was stopped. Restart it by running:

sudo /etc/init.d/postgresql start

Then run the command:

sudo apt-get install python3-dev

which will install some python files that are not included in the original python 3 install. Next you will reactivate your virtual environment:

source venv/bin/activate

and then run:

pip install psycopg2

which installs the actual interface between Django and the PostgreSQL server. I don't know the reasons behind why all of these files are needed except that often times, when developing, developers will use files found in a package to help them speed up development time. That is what is going on here and thus requires us to install so many additional packages.
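Once psycopg2 is in place, Django still has to be told about the database and password we created at the psql prompt. That lives in your project's settings.py. Below is a sketch of what the entry might look like; the placeholder values mirror the ones used above, and the host and port are my assumption of a default local PostgreSQL install.

```python
# Sketch of the database section of settings.py. Replace the placeholder
# strings with the values entered at the psql prompt.
DATABASES = {
    "default": {
        # Backend name used by Django 1.7; newer Django releases name it
        # "django.db.backends.postgresql".
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "db_name_here",
        "USER": "postgres",
        "PASSWORD": "password here",
        # Assumed defaults for a PostgreSQL server on the same machine.
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```

Connecting through localhost makes Django log in with the password we set, rather than relying on the Linux user that apache runs as.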

Install Django Inside VirtualEnv


Start up the virtual environment again:

source venv/bin/activate

Now run the command:

pip install django

Note that you no longer have to use sudo in front of pip to install packages. This is one of the best benefits of using virtualenv. 

Next we will make a symlink to the python 3.x site-packages directory, to be used later in the apache setup:

sudo ln -s ~/venv/lib/python3.x/site-packages  /var/lib/python/site-packages

where the x in 3.x is the name of the directory for the Python 3 version you are using.
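If you are not sure what that directory is called, you can ask the interpreter inside the activated virtual environment. This is a small sketch using the standard sysconfig module; the function name is mine.

```python
import sysconfig


def site_packages_dir():
    """Return the directory where this interpreter installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print(site_packages_dir())
```

Run it with the venv active and the printed path should be the one to use as the first argument of the ln -s call.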

Setup Your Django Project


If you have a Django project that you have already built, then you have a variety of ways to get it onto your server. We will focus on the case where you already have a Django project built and leave the other case up to you (see Django's excellent documentation on how to get started building a Django app, though I recommend building it on your local computer first). If you don't know how to put files onto a server, you can follow my tutorial here (the part on connecting with WinSCP is towards the bottom and is a little outdated, but should be sufficient). I will not go over how to get the project onto your machine, since my previous tutorial details how to do so. We will start off assuming that you have your project on your server.

To begin, make a symlink to your django files wherever they may be (for ease of finding later in this tutorial):

sudo ln -s /path/to/django/files/directory /var/lib/<projectName>

We will need this path when we have setup apache on our server.

After Django is installed and your project setup, we will need to setup our Apache server. We could use just about any other server, such as nginx or lighttpd, but will stick with Apache because of its popularity and because I know it already. An added bonus is that Django has documented how to setup Django with Apache, so why not make it simple?

Setup Apache Server - Install


First, if inside your virtual environment, deactivate your virtual environment:

deactivate

Now enter the command:

sudo apt-get install apache2

Pretty straightforward. You can navigate to your server's url and you should see the Apache default page, which looks something like this:

It works!

This is the default web page for this server.
The web server software is running but no content has been added, yet.

Once this is complete, you will need to install a module for apache called mod_wsgi. Mod_wsgi needs to be compiled against the same version of python that your scripts will run on. This creates a major headache, and I couldn't find an easy way with package managers to simply point the install to the right python version. Therefore, if you are running a version of Ubuntu that doesn't have Python 3 as the default version, you will need to build it yourself (hopefully you checked what the system default python is as instructed above; if not, then it is up to you to figure it out).

To setup your mod_wsgi to work with apache in python 3, do the following (taken from this stackoverflow.com answer):

Install more packages needed to modify apache2 mods:

sudo apt-get install apache2-dev

Change directories to a common place to store source code:

cd /usr/local/src

Download and install the mod_wsgi code from the code repository. The following steps are all needed to take the code from the repository and make it into something Linux can use:

sudo apt-get install make

sudo wget https://modwsgi.googlecode.com/files/mod_wsgi-3.4.tar.gz

sudo tar -zxvf mod_wsgi-3.4.tar.gz

cd mod_wsgi-3.4/

sudo ./configure --with-python=/usr/bin/python3.x

where x is the version of Python 3 on your machine.

sudo make

sudo make install

Now you have mod_wsgi in a compiled binary format that can be loaded into apache. To tell apache to load this module, we will have to edit the apache configuration, which we cover in the next section.

NOTE: If you decided to stick with Python 2, things are a lot easier. To install, simply run:

sudo apt-get install libapache2-mod-wsgi

You are now ready to setup Django to run with Apache.

UPDATE: If you are using the latest edition of Ubuntu, use the libapache2-mod-wsgi-py3 package instead (see https://launchpad.net/ubuntu/trusty/+package/libapache2-mod-wsgi-py3). This package is for Python 3.

Setup Apache Server - Configure to Run with Django


One reason I really recommend Django over other web frameworks (at least for python users) is that the documentation is excellent. Django comes with a tutorial of deploying Django with Apache. I will attempt to distill the finer points here, but you can always see the Django tutorial at https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/modwsgi/.

First thing we need to do is navigate to the /etc/apache2 directory:

cd /etc/apache2

If you look in the directory (ls command), you will see several different files, each of them dealing with different aspects of apache's configuration. The apache2.conf is the main apache configuration file and any configuration changes you make there will be used. However, it is generally a good idea to leave custom changes to apache's configuration outside of the main config file. Therefore, apache comes with a httpd.conf file, which is a user-defined configuration file that is added to the main apache2.conf file. It is good practice to edit this file as it helps to segment the changes you made with what comes standard with apache. All we need to do is make sure that the apache2.conf file includes httpd.conf.

Open the apache2.conf file (if you don't know how, there are a number of ways to do so, each of which can be tricky to use if you don't know Linux). Since this is a very basic tutorial, we will use the text editor vim to open our file. It has some nice features to help read text files from a terminal, and it is widely regarded as one of the most useful text editors on Linux. Just note that if you are new to vim, only enter the commands you see here or else you will be totally confused as to what is going on.

Enter:

sudo vim apache2.conf

Your screen should now have a bunch of blue text. What you are reading are the instructions for how to use apache. Take some time to read it as it does give some useful information, but for our purposes we are just going to use the up and down arrow keys (you can also use the page up and down keys to scroll a whole page) to find what we want.

After scrolling down for a bit you should see:

Include httpd.conf

If this line is in there then you are ready to edit the httpd.conf file. Exit out of your current view by typing the keys

:q

and then pressing enter. This will return you back to your regular command prompt. If that line is not there, then enter the following sequence of commands:

  1. Press the i key.
  2. Write Include httpd.conf on its own line.
  3. Press the Esc key.
  4. Enter the character sequence :wq
  5. Press Enter.

You have successfully added the httpd.conf file to your apache2.conf file.

Now open the httpd.conf file as follows:

sudo vim httpd.conf

Using the same basic pattern described above for editing a file in vim, write the following config information in your httpd.conf file (do not save after this, more will be written):


WSGIDaemonProcess <projectName> python-path=/var/lib/<projectName>:/var/lib/python/site-packages
WSGIProcessGroup <projectName>
WSGIScriptAlias / /var/lib/<projectName>/<projectName>/wsgi.py
Alias /static/ /var/lib/<projectName>/static/

Note that <projectName> should be the name of your Django project.

The above configuration tells apache to run the Django project that we set up previously and to serve your static files, like your css and js files, from the Django static folder directly. Django strongly discourages using Django itself to serve static files in production, which is why we tell apache where to look for them. This presupposes that the static file URLs in your html start with /static/.
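On the Django side, the matching settings might look something like the sketch below. The PROJECT_ROOT name and its value are placeholders of mine that stand in for the /var/lib/<projectName> symlink; STATIC_URL and STATIC_ROOT are the actual Django setting names.

```python
# Hypothetical project root; substitute the real name for projectName so it
# matches the /var/lib/<projectName> symlink created earlier.
PROJECT_ROOT = "/var/lib/projectName"

# URL prefix the templates use for static files; matches the Alias line.
STATIC_URL = "/static/"

# Directory that apache serves for /static/ requests.
STATIC_ROOT = PROJECT_ROOT + "/static/"
```

With these in settings.py, running python manage.py collectstatic inside the virtual environment copies the static files from your apps into that folder.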

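If Python 3 is your python version, then do the following step; skip it if you stuck with Python 2. We are going to add the configuration that tells apache to load the mod_wsgi module we compiled. Apache reads the file from top to bottom and must load the module before it can understand the WSGI directives, so write this line above the ones you just entered: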

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so

Save the file as discussed in steps 3 through 5 in the vim editing example above. You are done setting up apache.
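If you want to confirm that mod_wsgi itself works before involving Django, one option is to temporarily point WSGIScriptAlias at a tiny WSGI script. The sketch below is my own version of the usual hello world test; the file name hello_wsgi.py is hypothetical, but the function must be called application because that is the name mod_wsgi looks for by default.

```python
def application(environ, start_response):
    """Minimal WSGI app: answer every request with a short plain-text body."""
    body = b"Hello from mod_wsgi\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings as the response body.
    return [body]
```

If the browser shows the greeting after a restart, then apache and mod_wsgi are fine and any remaining errors are coming from the Django side.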

Restart Apache


After you have completed all of this, you are ready to go. Restart apache by issuing the following command:

sudo service apache2 restart

Navigate to your server's homepage and you should now be seeing the homepage of your django app!

Troubleshooting


I hope that this helps out aspiring developers in the future. Some of the pitfalls I faced when first setting up a Django app are as follows:
  1. Couldn't Find Anything with PIP - I didn't have my https port open on AWS. It seems like an obvious cause for some of the issues I was facing, but it wasn't immediately apparent when pip was failing that it just couldn't reach the package repository. PIP uses https to get the packages you need on your system, so it requires that https be open. I didn't know it at the time, and I spent hours working with the extremely unhelpful error messages before I figured out the solution.
  2. Apache is Returning 500 Errors - It took me a while to figure out why my first install of Django wasn't working, and I had to do a lot of searching just to figure out where the log files for apache were so I could see what was happening. On Ubuntu (the location depends on the Linux distribution) you can find the log files under /var/log/apache2. The most useful troubleshooting log is, naturally, error.log. Use the tail command to see what happened last: tail -100 /var/log/apache2/error.log
  3. My Apache error.log tells me that permission is denied with file /path/to/file/__pycache__ - Apache runs as user www-data and as such has very limited space in which it can edit files on the system (for obvious security reasons). You will need to change the ownership of the folder that stores your file (most likely /var/lib/<projectName> as we set up above). To do so, enter the following: sudo chown -R www-data /var/lib/<projectName> assuming that the __pycache__ is within this directory. Then run the command sudo chmod -R 775 /var/lib/<projectName> . Note that this last command is somewhat insecure but should be sufficiently secure for now. Most security settings will have to be adjusted when you decide to really get serious about security, so we won't bother with it now.
  4. Makemigrations is Giving Me Permission Denied Errors - Permissions need to be edited to allow the ubuntu user (the default login user) to make edits as well. Do the following: sudo chown -R www-data:ubuntu /var/lib/<projectName> . For good measure, run the command sudo chmod -R 775 /var/lib/<projectName> .
  5. I can't see my media files - This one was really obvious but somehow I missed the explanation Django provided. You need to create another configuration entry in httpd.conf that points /media/ calls to your media folder (wherever that may be). Google "Django set up media files" for more information.
  6. VirtualEnv is getting a permission denied error. - This problem occurred because I failed to symlink my python interpreter correctly. Linux doesn't always (perhaps not so often) give good error messages, and I had to bang my head for a while on this one. After I removed the symlink to my python interpreter and remade it the correct way, everything worked. But, to be thorough, here is a resource you can use.
  7. Apt-get not working because lock can't be removed. - Another stupid problem caused by me being impatient and ending a process before it finished, leaving Linux unable to recover. Basically you just have to end the apt-get processes and then, if that doesn't work, remove the lock file. See more here.
  8. I'm having trouble migrating my models to the database. - Remember that anything you do with Django has to be run inside the virtual environment. Before running migrations or other Django management commands, you must run source ~/venv/bin/activate .
  9. There is a problem with my virtual environment saying that I do not have setuptools installed when attempting to install psycopg2. - Follow the answer here: https://www.reddit.com/r/learnpython/comments/3jlbep/error_msg_pip_setuptools_must_be_installed_to/
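For the media files pitfall in item 5, the Django half of the fix is a pair of settings in settings.py. Here is a sketch; PROJECT_ROOT and the media folder location are assumptions of mine that follow the same layout used for static files, while MEDIA_URL and MEDIA_ROOT are the real Django setting names.

```python
# Hypothetical project root; substitute the real name for projectName so it
# matches the /var/lib/<projectName> symlink created earlier.
PROJECT_ROOT = "/var/lib/projectName"

# URL prefix that Django uses when it builds links to uploaded files.
MEDIA_URL = "/media/"

# Directory on disk where Django stores uploaded files.
MEDIA_ROOT = PROJECT_ROOT + "/media/"
```

The apache half would then be a line in httpd.conf along the lines of Alias /media/ /var/lib/<projectName>/media/, by analogy with the static entry. I am assuming your media folder sits next to your static folder.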

Sources


I used a plethora of sources to make this work. As I have mentioned before, I am not a system administration guru, but I am fairly decent with Linux. That being said, several things about setting up a server were not immediately straightforward to me, since error messages on Linux can be extremely unhelpful at times. Below I have listed several of the sources that helped me get to where I am now.
  1. http://www.tonido.com/blog/index.php/2013/11/25/working-with-virtualenv-on-django-projects/#.VPjDHvnF_DQ
  2. https://www.digitalocean.com/community/tutorials/how-to-run-django-with-mod_wsgi-and-apache-with-a-virtualenv-python-environment-on-a-debian-vps
  3. https://virtualenv.pypa.io/en/latest/userguide.html#usage
  4. https://docs.djangoproject.com/en/1.7/topics/install/
  5. https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/modwsgi/
  6. https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-14-04
  7. http://www.postgresql.org/download/linux/ubuntu/
  8. http://stackoverflow.com/questions/1951742/how-to-symlink-a-file-in-linux
  9. http://ubuntuforums.org/showthread.php?t=2141770
  10. http://askubuntu.com/questions/197626/where-is-a-postgresql-9-1-database-stored-in-ubuntu-12-04
  11. http://www.postgresql.org/message-id/006201c74b23$17cce130$9b0014ac@wbaus090
  12. http://askubuntu.com/questions/15433/unable-to-lock-the-administration-directory-var-lib-dpkg-is-another-process
  13. https://www.digitalocean.com/community/tutorials/how-to-read-and-set-environmental-and-shell-variables-on-a-linux-vps
  14. http://stackoverflow.com/questions/16618071/export-a-variable-to-the-environment-from-a-bash-script-without-sourcing-it
  15. http://askubuntu.com/questions/320996/make-default-python-command-to-use-python-3
  16. http://askubuntu.com/questions/401132/how-can-i-install-django-for-python-3-x
  17. https://docs.djangoproject.com/en/1.7/faq/install/
  18. http://stackoverflow.com/questions/5846167/how-to-change-default-python-version
  19. http://askubuntu.com/questions/244544/how-do-i-install-python-3-3
  20. http://docs.python-guide.org/en/latest/dev/virtualenvs/
  21. http://stackoverflow.com/questions/22938679/error-trying-to-install-postgres-for-python-psycopg2
  22. http://stackoverflow.com/questions/20913125/mod-wsgi-for-correct-version-of-python3
  23. http://askubuntu.com/questions/483744/config-status-error-cannot-find-input-file-makefile-in
  24. https://code.google.com/p/modwsgi/wiki/CheckingYourInstallation
  25. http://httpd.apache.org/docs/2.2/mod/mod_so.html
  26. https://launchpad.net/ubuntu/trusty/+package/libapache2-mod-wsgi-py3

