Wednesday, December 5, 2018

Switching to Software Engineering as a Career

Recently my wife's friend reached out to me about her husband's desire to switch careers. Primarily he wanted to know how I went from being a Quality Assurance Engineer to a Software Engineer and had a variety of questions. Below is the email he sent me (names and a few other details changed), followed by my response. One detail the email does not mention is that he also has a family with children.

Email

Hey Marc, 

This is Octavian, and your wife gave your email to my wife so that I could ask you a few questions about ways to transition into the software industry. I have a bachelors from <University> in Mechanical Engineering and am having a hard time finding a job in a place that we want to live. We are currently living in <the world somewhere> and it is just too expensive to live in this state, so we are really hoping to move back to Utah if possible. I know that there are a lot of opportunities in Utah for software development, and would be interested in learning more about possibly switching careers. 

Your wife said that you think the best way would be to teach myself how to code. What do you think are the most important coding languages to learn? What is being used the most, what are companies wanting their employees to know the most these days? Also, what type of things do you need to know for the coding challenges that you mentioned before? Do you have any suggestions for ways  to teach myself? Apps, videos, websites? Also, Andy said you are not a fan of coding bootcamps, but I was wondering more about your thoughts on that. And do you think that if I can effectively teach myself to code, is it really likely that I could get a job at a good company? Would they overlook not having formal education in programming? Thank you for being willing to help, I know it is a lot of questions, but I appreciate it. 

Thanks,
Octavian

Response

Hey Octavian,

I am happy to share what knowledge I have, but be aware that I am not an expert in the area of tech recruitment or what companies will be looking for. Also, my opinions and knowledge have come from my own experience and from what I have picked up speaking with others. For that reason, I would recommend taking everything I say as a good starting point, but definitely vet the ideas before making a decision. For example, I am not a fan of bootcamps (which I will get to shortly), but that doesn't mean you shouldn't do them. Indeed, it may actually be beneficial to do so (as we will discuss).

Bootcamps

There are several good bootcamps out there that will teach you the basics of programming and make you fairly competent in a small area of expertise (most of them teach website creation). Bootcamps are a great primer and launching point for many people who have never previously done anything technical or struggle with the basics of logic needed for software development. However, there are several cons to be aware of:
  • They are expensive. Expect to pay $10k+ for any bootcamp worth attending.
  • Many Bootcamps will help get you a job, but is it a job worth having?
    • Anecdotal evidence - we had three contractors come from a job placement program at Dev Mountain. Their job was manual quality assurance testing.
      • Manual QA Testing is losing favor in many companies and may not be something people hire for much in the future
      • Quality Assurance requires a specific mindset and personality to handle the mundane repetition inherent in the job
      • My time as a QA engineer often left me with nothing to do, and I was required to find something meaningful to fill that time with (I spent it learning to program)
      • Most QA jobs don't utilize at all the skills you learned in bootcamp, so you pretty much just paid a finders fee to the bootcamp (who is also paid a fee by the company you are placed with most of the time)
    • If the job is not in QA, is it going to be a job that pays well? I don't have any data on the types of jobs or placement rates or placement locations, so do some research to see what life after the camp offers
  • Most grads of the camp are not actually well prepared for the type of work they go into (again, anecdotal as most people I have talked to go into QA jobs, which skillset is very different)
  • Bootcamps tend to focus on the "hottest" tech versus the sound principles needed to succeed as an engineer
After considering those points above, I would mention that I don't think money spent on most education from a reputable source is ever wasted. In this case, the idea of "reputable" comes into question for me and I am not convinced that it has worked out as many hoped. More research on this would probably be good.

Alternatives to Bootcamps

The alternative approaches to bootcamps are also largely based on my own experience and are anecdotal in nature. That being said, it doesn't mean that there isn't merit in my advice. Some of the best engineers here at my job are self-taught, and some have never even graduated from college. They also have been doing it for years and years, so that experience has propelled them to where they are today, but that doesn't mean you can't spend the time to get there yourself.

University - Still an option?

I know that for the most part people in your situation will not want to consider going back to a university. It is true that it is probably going to be more expensive and will be fraught with other issues. I wouldn't rule out the idea, though, of getting a BS in computer science or software engineering. It will mean a four year degree, but your odds of landing in a field that you actually want to be in are better.

Another option that I have seen which I am a fan of is going back to school for a Masters degree in something like Information Systems, Data Science, or Computer Science. These degrees may require some undergrad work before you can apply, but they will certainly make you far more employable (and at a higher salary I might add) than a bootcamp would. I have seen it done and it is not a bad idea. In my opinion (for what it's worth), the accreditation a University brings to your degree is worth far more than the certificate you will get from the bootcamp, not to mention a much deeper understanding of basic programming and software development principles. Also, you can get internships as a student at bigger companies that will pad your resume faster than a bootcamp will (and the likely jobs you would get from a bootcamp won't look as good). So I wouldn't count this option out without further investigation into costs and capacity to attend.

Learning to Code on Your Own

Learning to code by yourself is probably my favorite option for those who are truly driven to make a switch to software development but still want to maintain a quality of life for their family. My personal journey from being a Quality Assurance Engineer to a Software Engineer required that I spent a lot of time learning the fundamentals of programming that I didn't receive in school. It may be worth noting that my formal education in software development was minimal, really only consisting of three official classes and a few projects that I more or less struggled through on my own because of teacher incompetence.

Why do I think this is my favorite option? For the following reasons:
  • As mentioned above, I found teacher incompetence in school to be some of the biggest hangups I had later in getting a job. My teachers were not CS professors, so my experience is skewed, but it does highlight an issue you may run into going to University or Bootcamp.
  • So much of what you learn in these schools will be done by yourself anyways, meaning that you will pretty much end up being self taught regardless of the option you choose.
    • Worth mentioning that there is great value in having someone to look over what you have done and critique it. You can probably hire someone for a fee to critique your code and save lots of money in the process.
  • The projects you work on can be much more personally relevant and, therefore, more fulfilling and fun. Academia is so caught up in projects that are easy to grade that they don't make it fun or personally relevant.
  • Free online courses will be able to cut out most of the cruft that you don't actually need that a university degree will give you, and will be more in depth and useful than a bootcamp.
  • You can start now and don't have to worry about schedules. Online learning is great at fitting into anyone's schedule.
  • The experience is tailor made to your interests/career aspirations. Want to go into machine learning? Then skip all the stuff on website development and go straight to the goods.
  • Both paid and free options are available for you to learn, so you can use your budget more wisely to focus on learning needs.
Cons of this approach are plentiful, so here are a few that I think worth mentioning:
  • You are only as good as your ability to prove yourself. In other words, if you aren't given a chance to show what you can do, you won't ever get a job. This is where an accredited degree or bootcamp would come in most useful.
  • Your first job may be a much riskier job than you would otherwise receive. In other words, you may get a job at a small company doing very basic website design and getting paid little to do it. But, remember that experience is worth its weight in gold.
  • The industry is still relatively young and so it is uncertain what will happen in the future in regards to those who are self-taught with no experience. It may be that suddenly the industry will require certificates or degrees, so there is always that risk.
  • Without guidance from someone who is "in the know" of the industry it will be harder to know what you will need to focus on. Getting that sort of help for free or even cheap can be difficult.
  • Finding a mentor to do it for free is going to be hard. Paying someone to do it may also be difficult. Mentorship of some sort is very valuable so it may be good to think about the benefits a University or Bootcamp could provide if you can't find a mentor (unless you are confident you can do it alone).

Languages

Perhaps the thing I am most often asked is "what language should I learn?" when someone is interested in transitioning into this field, and so it is no surprise that you ask it too. The answer is, as always, "it depends." What are you looking to do? Likely you have no idea what field of software engineering you want to go into (they are quite different actually). Also, the idea that a particular language is going to get you a job is more or less irrelevant - a good programmer can pick up the majority of the syntax and useful features of a language within a month, and can be a useful contributor to a project in a new language even sooner than that. 

But, you have to start somewhere, and getting familiar with a language does have some bonuses when it comes time to get a job. Therefore, here are my loose recommendations:
  • Web Development: Javascript
  • Data Mining/Machine Learning: Python
  • Application Development: Java
  • Server Administration: Python
  • Embedded Devices: C or Java (not really sure, I don't know what language is used on these things)
  • Scripting Jobs (such as data mining jobs): Python
You may think from the list above that a good general purpose language to start with may be Python, and in some ways that is true. I warn you now though that Python does some wonky things that don't translate very well into other languages. I would recommend that you start with Javascript if you don't know what you want to go into as it has several advantages over the others (such as the size of the potential job market, and the fact that it bridges several different types of paradigms when it comes to programming in general). But do not let a particular field's chosen language stop you from learning the other languages - they each have some benefits and ideas over the others that will be useful in learning the skills you will need.

How to Learn and Why Coding Challenges are a Good Starting Point

I mentioned that coding challenges are a good place to learn basic programming principles, for two reasons:
  1. They will test your ability to extend the knowledge you learn from the courses you take
  2. The majority of the interviews you will have (for better or worse) will test you on your skill in coding challenges and not on your ability to actually be a software engineer
Point 2 is a surprise for most people, but it is an unfortunate reality in our industry that doesn't seem to be going away anytime soon. These challenges are good at being able to assess your ability to think outside the box; but, like most jobs, the majority of your work is not going to require you to think outside of the box too much. Therefore, be cautious in how much time you spend in these challenges because they don't translate well into the actual work you will be expected to do.

That being said, coding challenges do a good job at also teaching you skills by simply trying to solve a hard problem and failing at it over and over and over. The best teacher in software development is failure and seeing how you failed so you can avoid it again.

Other ways to learn are obviously to take online courses, watch videos, read tutorials, and work on a personally relevant project. I have been working on several little side projects for years now and none of them have really ever taken off, but I can trace most of my skill and new experience from these projects. These projects have forced me to learn things that my everyday job just doesn't give me opportunities to explore.

Listed below are a few things I have found useful in my self-learning journey:
  • coursera.org
    • Has lots of great courses that play out like a traditional University course
    • You can earn certs from this site if you would like (of questionable value when it comes time for a job)
    • Several of the classes have interactive homework assignments that will provide feedback (all automated, so of varying usefulness)
    • Can be enrolled in as many as you like and can finish at your own pace
    • Many of the courses have others taking it at the same time that you can ask questions
    • https://www.coursera.org/learn/algorithms-part1 has been one of the most influential classes on my ability to program I have ever taken (by Princeton)
  • codecademy.com
    • Bite sized coding tutorials offered in several languages
    • much quicker to get through than the online lectures of coursera.org
    • more immediate feedback when trying to learn the code
    • pretty simple stuff, doesn't really handle more advanced concepts
  • youtube.com
    • great walk-throughs and discussions by experts
    • hard to determine who the expert is
  • egghead.io
    • paid courses
    • seems to focus on web development
  • udacity or udemy
    • same sort of thing, paid courses on a variety of topics
  • wikipedia
    • a great resource to introduce several topics of interest
  • hackerrank.com
    • a coding challenge website
    • good at giving you interesting problems scaled to your level
    • it is, of course, coding challenges and so see above
  • coderbyte.com
  • online textbooks
I would start off with the very simple things like codecademy.com or something similar and then move on to looking for more advanced topics to master. Perhaps the most important thing you can do is just use google to ask about the following topics (and to look for courses/lectures on these):
  • Data Structures and Algorithms
  • Object Oriented Design
  • Data Primitives
  • Functional Programming
  • Service Oriented Architecture
  • Distributed System Design
  • Application Programming Interface
  • Programming/Coding/Software Best Practices
I would also be sure to get a github account and start looking for projects that you may want to contribute to (such as Instructure's Canvas). Open Source software is a great way to learn, both by writing it yourself and by seeing how other people have tackled problems.

A Few Other Things

There are a few other things that most people don't really know about that would be invaluable in your ability to procure employment. These are just a few things and don't necessarily give you a great edge, but they are common tools and skills that you will use in the industry.
  • Version control tools
    • most common one is git (used on github). Tutorials are plentiful on using this tool. It is awkward at first to learn, but it is certainly a great tool and skill to have.
  • Project management boards
    • Trello or Jira are good examples (use Trello though, can use it now in your everyday life and is free and gives you a good grasp of how work is organized in the industry)
  • Learn SQL
    • Technically it is a language, but more importantly it will force you to understand database technologies.
    • Not listed under language because really this is a whole area of study that would give you a real edge if you were at least proficient
    • Most developers are only ok with it, but being good or even great will help in lots of ways
  • Data modeling
    • goes in hand with database tech and sql
    • a way to learn how to store data into efficient, queryable datastores
    • be sure it is a relational data model course you are taking
      • There are such things as no sql databases, but those should be avoided until you know what they are good for
      • Perhaps a primer on no sql would be good, but don't spend too much time learning about them
  • Cloud computing
    • Amazon Web Services is a good source to start as you can set up an entire stack for free
      • If you don't know what I mean by stack, that should be something to research and learn (the system stack, or computing stack). It deals with architecture, which is useful
    • Google, Azure (Microsoft), and IBM all are competitors and offer some sort of free tier stuff
    • Note that this may not be applicable in some job descriptions, but the majority of things will run in the cloud that you would be interested in doing, so learn it
  • Developer Meetups
    • a good place to network and learn from a host of experienced people (and possibly meet mentors) is to find local programming meetups
    • these tend to be language/framework specific, so may be a little too advanced initially
    • great way to develop your skills
  • Learn Linux
    • Much of computing is done with Linux, so it would be good to be familiar with it
    • Linux is similar to MacOS, so it shouldn't be too difficult to jump to
    • Windows Subsystem for Linux is a thing now if you are a Windows fan
    • Really there is no reason to not learn it and it would be good to know in any job you do
    • By learn I mean be familiar with the bash shell, the computing model it uses (such as permissions, file structures, what an inode is, etc).
    • Lots of classes and tutorials online that would be useful

Final Thoughts

I know that this seems like a lot, and in reality it is. It is a difficult proposition to get into computer programming, but right now is the best time to do it. The perks of the industry are great, the jobs are exciting and fun, and it is a comfortable way to live with several jobs offering flexible hours and remote work opportunities. Utah is chock-full of jobs, from small, risky ones to large, enterprise-level ones. The interviews range from being ridiculously hard to being baffling in incompetency (but still will never return your call). Even if you manage to learn these skills, it is likely that it will take months before anyone hires you, especially if you try to learn it by yourself. Using a university or a bootcamp has the benefit of connections, which can very much be worth the thousands of dollars you can spend. 

I hope this helped. If you have any other questions, feel free to ask. I like helping when I can :)

Thanks,
Marc

Monday, March 23, 2015

Easy View-Based Routing in Django

If you are like many (myself included), you are not a particularly big fan of the idea that the route definition for your app is located inside a single file called urls.py. Sure, it allows you to include the urls from any location, allowing for a more modular approach to how you store these apps, but often it is annoying to break up your routes across multiple different files if you are building a simple website that may need lots of different inputs and don't want to use query params in the url.

One solution is to produce complex regexes that return a different set of variables for a url pattern (e.g. you want to be able to query a url by both the entity's id and its English name: /foo/{foo_id}/$ or /foo/{foo_name}/$). Though this use case may not be universal, I find it annoying to have to create a complex, error-prone regex or define a simplistic one for each needed url.
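
To make the annoyance concrete, here is a rough sketch of the kind of combined regex described above, using only Python's re module. The group names foo_id and foo_name and the character classes are assumptions picked for illustration, not anything Django prescribes.

```python
import re

# One pattern that accepts either a numeric id or a lowercase name.
# The character classes are illustrative assumptions.
FOO_ROUTE = re.compile(r'^foo/(?:(?P<foo_id>\d+)|(?P<foo_name>[a-z][a-z-]*))/$')


def match_foo(path):
    '''Returns the captured groups for path, or None if it does not match.'''
    found = FOO_ROUTE.match(path)
    return found.groupdict() if found else None


by_id = match_foo('foo/42/')
by_name = match_foo('foo/widget/')
```

Note that the group that did not match still shows up in the result with a value of None, so a view behind a pattern like this has to handle both arguments every time.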

Another downside of the Django urls.py is that it requires you to spell out a new url for every single view in your code. Several plugins are available to attempt to solve this problem for you, but I have not liked any of them. The reasons I don't really like them are that they 1) require an external dependency for something that should be really simple, and 2) usually take control of the internal routing design that Django ships with. Neither of these is really that bad since Django is very customizable and in a sense is made for this sort of behavior; however, I do not like it. Therefore, I have created a way to get the url functionality I wanted without having to download a third party app (and it is all contained within about 200 lines of code, including comments and documentation).

This tutorial supposes that you are familiar with Django routing. If you aren't, it might help to read the Django tutorial on routing first.

You can see the source code for this also at my github page.

Requirements For Custom Router


Before I walk through how to create your own custom router, I want to list a few requirements I had for this module:
  1. Foremost it had to be self-contained, meaning that it couldn't rely upon another python package to work. 
  2. Any routing had to be done through the default Django route system as it has already proven to be sufficiently performant.
  3. Error checking of urls was a must (why Django doesn't come with some sort of error checking beyond the regex compiling module is beyond me). Error checking in this case means making sure that the base path of the url (a path without any regex patterns included) is different for each new url. Thus you wouldn't ever have the infuriating problem of wondering why you aren't seeing what you expect on very similar but slightly different urls.
  4. Had to be able to keep the same functionality of original url() function that ships with Django.
  5. Must be able to create custom routes as well as automatic routes for the application. That means that it will produce <app>/<view>/* urls, where the star will be treated as variables split by the /, as well as any other custom url you wish.
  6. View routing must be attached to the view. I hate having to go back to the urls.py to figure out what the url is I am looking at to see the grouping I had used for the variable I need.
  7. Must be easy to use and require very little configuration.
Most of the requirements above were met with my first attempt within an hour's worth of work. It is really quite simple, and I was able to make 149 different unique urls with the following urls.py definition:
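from django.conf.urls import patterns, include, url
from django.contrib import admin

# accounts and routes are this project's own modules; import them from wherever they live.
urlpatterns = patterns('',
    url(r'^admin/', include(admin.site.urls)),
    url(r'^accounts/', include(accounts)),
    url(r'', include(routes.urls)),
)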
Looks much cleaner than the normal 150 lines of excessive regex that would be required. The next section will detail how to write the router used, with the full code pasted at the bottom.

Using Django for Inspiration


The basic requirements of making this router self-contained and not circumventing the normal Django routing require that all the route creation occurs before Django ever serves up a page. Several things happen when Django is first loaded, and the event we are interested in most is how the settings.py is set up to be global. If you are not familiar with the Singleton pattern, just know it is a way to make one - and only one - instance of an object. Python makes this very easy because modules are, in Python, objects themselves. This allows for very simple singletons, and it is a pattern that Django uses to make the settings global and singular (see the Django github for the settings module for more details).

Because Django loads things once and only once at the initialization of the system, it does not have to worry about linking to sources and hoping that they have not been moved, while still having the advantage of storing things in an encapsulated object and not inside the global namespace. This provides the benefits of OO programming with the accessibility of functional programming. Elegant and simple. We will do the same for our router.
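
As a small illustration of the module-as-singleton idea (a sketch using the standard library's json module as a stand-in, not Django's actual settings machinery), importing a module twice hands back the very same object:

```python
import importlib
import sys

# Python caches loaded modules in sys.modules, so a second import
# returns the object that the first import created.
first = importlib.import_module('json')
second = importlib.import_module('json')

same_object = first is second
cached = sys.modules['json'] is first

# State attached to the module is therefore visible to every importer.
first.demo_flag = True      # hypothetical attribute, for illustration only
seen_elsewhere = getattr(second, 'demo_flag', False)
del first.demo_flag         # clean up after the demonstration
```

This is why a single routes object created at the bottom of a module can act as the one shared table for the whole project.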

So the question arises: How do we get Django to scan the views for the routes that we will be adding? At this point you may give up and think that this isn't really worth it (I certainly did), but the reality is that this problem is not a hard problem to solve. To get Django to scan, all you need to do is import the modules that hold the views into the urls.py module. That is it! Python will scan those modules for you. Using this fact will make almost all of our requirements possible. So, first thing we need to do is import the modules that hold our views into urls.py.

Importing the modules we need solves the most basic problem of getting the routes into a position to be added to urls.py. However, this does not establish an easy way of creating app/module/* routes since simply importing the files would require that we do lots of writing of code to add them and removes the automatic nature of adding the routes to urls.py. We could simply write the app/module/* in the view that will be calling it, but that just seems against the auto-magicness of python. We can do better, and we will.

Setting Up a View Based Routing Paradigm 


Since one of our goals is to make routing definitions found on the view that it is routed to, we will take advantage of the fact that class-based views are the new way of making views in Django. Don't worry, the old functional views paradigm will also be supported, but it will be clunkier or less flexible. To make this work, we will simply define a new variable on the view:

routes = ...

What should routes be set to? I went through a number of ideas, including an overly complicated tuple of tuples, which in the end was abandoned for a pythonic (and much easier) way of doing things: dictionaries! Because we decide to use dictionaries right off the bat, we are free to change the structure as we go along and are no longer weighed down by needing to keep track of the order in which the parts of our route definition are given. If you ever think of trying to define configurations using a static list, just step back, slap yourself awake, and realize that dictionaries are the best thing you could possibly use in this situation.

That being said, I decided upon a routing definition structure as follows:

routes = {
    "pattern": '',
    "map": [(...)],
    "kwargs": {},
}

To explain each in turn, remember that we are making it so that each route can define variables in the url to be of one type or another (i.e. id-based or name-based for lookups). Therefore, we need to allow for a way to define the pattern once and then add new regex groups to that pattern. This also allows us to error-check the pattern to make sure that we are not unwittingly registering the same url pattern more than once.

Pattern

The pattern is simply a string that uses the syntax of the string format() function to replace the curly braces (i.e. {}) inside that string with the positional args passed in to format(). For example, if we want to make routes of foo/{foo_id}/ and foo/{foo_name}/ we make a pattern:

foo/{}/

Simple!

Map

Because we are defining string templates with pattern, we are going to create a list of tuples that can be used to substitute those curly braces with the desired regex pattern. Each tuple must have the same number of regex patterns as the number of {} in the string in order for this to work, or it will throw an error. The map variable for our pattern above should therefore look like the following (note the trailing comma in each tuple, without which Python treats the parentheses as wrapping a plain string):

[('foo_id',), ('foo_name',)]

This produces two distinct url patterns:

foo/foo_id/

and 

foo/foo_name/

Note that these are not regex patterns, but simple strings. This is intended to keep it simple, but you can add a regex pattern in there to capture a variable just fine.
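
The expansion described above can be sketched in a few lines. The helper name expand is hypothetical and is not part of the router itself; the regex groups in the second call are likewise just examples.

```python
def expand(pattern, pmap):
    '''Fills the {} slots in pattern once for each entry in pmap.'''
    return [pattern.format(*entry) for entry in pmap]


pattern = 'foo/{}/'

# Plain strings, as in the example above. The trailing commas matter:
# ('foo_name') is just a string, while ('foo_name',) is a one-item tuple.
plain = expand(pattern, [('foo_id',), ('foo_name',)])

# Regex groups slot in the same way; these two groups are illustrative.
captured = expand(pattern, [(r'(?P<foo_id>\d+)',), (r'(?P<foo_name>[a-z-]+)',)])
```

If an entry supplies fewer values than the pattern has {} slots, str.format raises an IndexError, which is one place a mismatch shows up early.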

Kwargs

The typical name for passing around key-value arguments is to name the dictionary that Python creates as kwargs. To keep with this tradition, we will name our configuration dictionary the same. Any arguments that you need to pass on to the underlying view should be given in the kwargs. This directly correlates to the kwargs argument of the url dispatcher in Django.

Miscellaneous Notes

Because we are using a dictionary to setup our configuration, we now can make use of the 'key in dict' pattern where we can search if a configuration has been defined. By doing so we can omit and add configurations for a route as needed. 

One such example is the name variable defined in a url dispatch object. The name given to a url dispatch object is used in reverse url lookups within templates and can come in handy in many ways. Because name is such a common variable to use in so many applications, it has a high chance of clashing with predefined variables in the kwargs section of our route definition. To circumvent naming collisions (and to make the intent of adding a name for reverse lookup to a url more clear), I later added the 'django_url_name' route configuration option to my router. I was able to add this in to my code without it affecting any other aspect of the routes setup. The same can be said about any new routing configuration that you may want to add in the future.
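
A minimal sketch of the 'key in dict' idea follows. The function name read_route_config and the fallback values (a single empty map entry, an empty kwargs dict) are assumptions for illustration and may differ from what the router really does.

```python
def read_route_config(route):
    '''Pulls the known options out of a route dict, with fallbacks.'''
    if 'pattern' not in route:
        raise ValueError('A route needs at least a pattern.')
    config = {
        'pattern': route['pattern'],
        # Assumed defaults: one empty map entry and no extra view kwargs.
        'map': route['map'] if 'map' in route else [()],
        'kwargs': route['kwargs'] if 'kwargs' in route else {},
    }
    # Optional extras are only copied when the view actually defines them.
    if 'django_url_name' in route:
        config['django_url_name'] = route['django_url_name']
    return config


bare = read_route_config({'pattern': 'about/'})
named = read_route_config({'pattern': 'foo/{}/', 'map': [('foo_id',)],
                           'django_url_name': 'foo-detail'})
```

Adding another option later only takes one more 'key in route' check, which leaves existing route definitions untouched.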

Creating the Routes Table

Now to the good stuff. The routes table will make use of the singleton pattern described above. This was heavily inspired by Django, including the workaround of LazyRoutes (which will be discussed later). At the bottom of our routes.py, add the line:

routes = Routes()

Routes() here is referencing a class that we have not yet defined. Go ahead and define it anywhere above that line:

class Routes(object):

Now that we have defined the singleton (routes =  Routes()), and the class, we are ready to start adding the structure of our router.

I will not go into full detail about all the different ways you can set up your Routes() object, but I will cover three aspects of it: the initialization (which will be where we automatically add the app/module/* routes as well as any class-based view routes), error checking of routes, and the add() function.
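
For the initialization step, the router needs some way to find the views that carry a routes definition once their modules have been imported. One possible approach, sketched here with the standard inspect module, is shown below. The helper find_routed_views and the stand-in views module are hypothetical, built by hand so the sketch runs without Django, and this is not necessarily how my router does it.

```python
import inspect
import types


def find_routed_views(module):
    '''Returns (class, routes) pairs for classes in module that define a routes dict.'''
    found = []
    for _, candidate in inspect.getmembers(module, inspect.isclass):
        routes = getattr(candidate, 'routes', None)
        if isinstance(routes, dict):
            found.append((candidate, routes))
    return found


class FooDetail(object):
    routes = {'pattern': 'foo/{}/', 'map': [('foo_id',), ('foo_name',)]}


class Helper(object):
    pass  # no routes attribute, so the scan skips it


# A hand-built stand-in for an imported views module.
views = types.ModuleType('views')
views.FooDetail = FooDetail
views.Helper = Helper

routed = find_routed_views(views)
```

In a real project, classes imported into a views module from elsewhere would be picked up too, so a check on each class's __module__ may be worth adding.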

The first thing to do is make sure that the routes are unique. To do this, we will add a set object to the Routes class definition (not to the routes instance). This is significant since we will be using this to keep track of all routes added, either by Routes or LazyRoutes (again, discussed later). To do so, you will write a class that looks like this:

class Routes(object):
    tracked = set()
    routes = []

You will also want to add the routes list at this point, since we want a single place for both Routes and LazyRoutes to store their urls. Now, when you add a pattern to the routes table, it can check the tracked set to see if it has already been defined by calling

pattern in tracked

If pattern is in tracked already, then this gives us a chance to throw a meaningful error, one that can be used to denote the duplicate pattern. I have wrapped all this in a function contained within the Routes class as follows:

def _check_if_format_exists(self, route):
    '''
    Checks if the unformatted route already exists.

    @route the unformatted route being added.
    '''
    if route in self.tracked:
        raise ValueError("Cannot have duplicates of unformatted routes: {} already exists.".format(route))
    else:
        self.tracked.add(route)

Now that the routes base pattern is unique, we can with confidence add route patterns to each view and know that we won't inadvertently step on a route we already defined.
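
To make the class-level behavior concrete, here is a small self-contained sketch (a trimmed-down stand-in, not the full router) showing that two separate instances share the same tracked set, so a duplicate pattern is caught no matter which instance adds it.

```python
class Routes(object):
    tracked = set()   # class attribute: one set shared by every instance
    routes = []       # class attribute: one list shared by every instance

    def _check_if_format_exists(self, route):
        if route in self.tracked:
            raise ValueError("Cannot have duplicates of unformatted routes: {} already exists.".format(route))
        self.tracked.add(route)


first = Routes()
second = Routes()
first._check_if_format_exists('foo/{}')

try:
    second._check_if_format_exists('foo/{}')   # same pattern, different instance
    duplicate_caught = False
except ValueError:
    duplicate_caught = True
```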

With the above function we now have an adequate check to use in our add function. We create our add function in such a way that we can pass the pattern, the map, the function to call, and the kwargs to add to the url. Note that I said the calling function. Here we define a way to add function based views and their routes. My add function looks like the following:

# Nested inside Routes.add(), so func, kwargs, and self come from the enclosing scope.
def add_url(pattern, pmap, ending=False, opts={}):
    url_route = '^{}{}'.format(pattern.format(*pmap), '/$' if ending else '')
    if "django_url_name" in opts:
        url_obj = url(url_route, func, kwargs, name=kwargs['django_url_name'])
    else:
        url_obj = url(url_route, func, kwargs)
    self.routes.append(url_obj)

Since Django's url dispatcher simply stores a reference to the function to call when routing a url to a view, we can safely add any function that fits the url dispatch function parameters. If you look at the source code for Django's View object, you will see that the as_view() function that is required to be passed in to a url dispatcher object literally returns another function called view. This function fits the old function-based view pattern of:

def view(request, *args, **kwargs)

Since the class-based views are just passing this function as the view, it is therefore clear to see that any function with this pattern can be passed safely. Note that in order for it to work it must return a django.http.HttpResponse object, but you could register a function with this pattern and almost get away without it. So, because we know that all we really need is a function, we now have the ability to add any view function to our routes. Isn't that great!?! An example of what I mean is as follows:
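
Here is a toy sketch of that idea. MiniView is a stand-in I wrote for illustration and is NOT Django's actual View class; it only mimics the shape of as_view() returning an inner function. The string return values are placeholders too, since a real view must return an HttpResponse.

```python
class MiniView(object):
    """Stand-in for a class-based view; NOT Django's actual View class."""

    @classmethod
    def as_view(cls):
        def view(request, *args, **kwargs):
            # Build a fresh instance per request and delegate to it.
            return cls().get(request, *args, **kwargs)
        return view

    def get(self, request, *args, **kwargs):
        return 'class-based response for {}'.format(request)


def show_me_all(request, *args, **kwargs):
    return 'function-based response for {}'.format(request)


# Both entries are plain callables with the same (request, *args, **kwargs) shape.
handlers = [MiniView.as_view(), show_me_all]
responses = [handler('req-1') for handler in handlers]
```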

Here is your view function:

def showMeAll(request, *args, **kwargs):
    ...

All you need to do to add it is to first import routes, and then add it as follows:

routes.add_url('foo/{}', [('foo_id',),('foo_name',)], True, {})

The ending variable of the add_url() definition is to add the '/$' at the end of the url, thus eliminating mistakes that arise from not including the pattern end clause and preventing the need to make sure all urls end with a /. The opts is simply the kwargs as described in the class-based view route table.
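
To see what the formatting step produces, here is a minimal sketch that pulls that single line out into a standalone function. The name build_url_route and the regex fragment passed in are my own assumptions for illustration; neither appears in the router.

```python
def build_url_route(pattern, pmap, ending=False):
    """Mirror the formatting line used in add_url()."""
    return '^{}{}'.format(pattern.format(*pmap), '/$' if ending else '')


# One mapping fills the single {} placeholder in the pattern.
with_ending = build_url_route('foo/{}', (r'(?P<foo_id>\d+)',), ending=True)
without_ending = build_url_route('foo/{}', (r'(?P<foo_id>\d+)',))
```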

The LazyRoutes Object

The LazyRoutes object pertains to the automatic loading of view-based classes into the routes table along with the automatic addition of app/module/* routes. The detail behind how to make an automatic loader is also heavily influenced by how Django registers apps and Models. The way Django loads apps is confusing, and it took a little bit of trial and error to finally figure it out, but if you want to see more about it, here is the link. 

From what I can make of it, since Django imports everything it needs at once before it ever serves up a page, there are times when recursive import statements can become a problem. For example, in our situation we need to load the views modules when we have custom routes to add to the urls.py, but we also want to automatically add the app/module/* routes along with these custom routes. To automatically create these routes, we need to create them through introspection upon the creation of the Routes object. Simple enough, so what is the problem?

The problem arises by the fact that, when scanning the modules for views, we may come across a line of code like the following:

routes.add_url(...)

This is for a functional route, one that cannot be added through introspection upon creation. What are we to do? We do what Django does and create a LazyRoutes object. Technically this is not a real lazy object because it creates the url objects when they are found, but the idea is that they are not added to the Routes instance that is still being created. Instead, the LazyRoutes object stores url patterns in the Routes.routes list (remember that the class definition Routes is its own object and is not an instance of Routes). From the outside, LazyRoutes appears to add the routes defined by routes.add_url() after the Routes object has been created and has finished making the app/module/* routes. In reality, it was adding them to the master list all along, but this is a detail you will rarely, if ever, need to know.

The added bonus of having a LazyRoutes object is that we can also completely circumvent the automatic Routes behavior if we ever wish: just stick with LazyRoutes, and only the explicitly defined routes will be accessible. Your style of coding will determine whether you use it in this way.
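
A stripped-down sketch of the mechanism, assuming a fake discovery step that just appends a marker string (the real router does the introspection described in the next section):

```python
class Routes(object):
    routes = []   # shared master list (class attribute)

    def __init__(self):
        # Stand-in for the automatic app/module/* discovery step.
        self.routes.append('auto:app/module/view')

    def add(self, pattern):
        self.routes.append(pattern)


class LazyRoutes(Routes):
    def __init__(self):
        # Skip the discovery step entirely; only explicit routes are added.
        pass


lazy_routes = LazyRoutes()
lazy_routes.add('custom/route')      # lands in the shared list immediately
routes_before_discovery = list(Routes.routes)

routes = Routes()                    # discovery runs now and appends to the same list
```

Because the list lives on the class, both objects see the same routes regardless of the order in which they were created.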

How to Introspectively Create App/Module/* Routes

Finally, we will discuss the trickiest part of this whole routing setup. So far it has been very easy, no? Hopefully you will have figured out that to add a class-based view would mean iterating over the routes dictionary you defined in the view class, but if not, here is a hint that that is what you should do. This aspect makes it so that you don't really have to even do that, for as you will see, you can add a function called add_view to your Routes definition that will take advantage of the introspective magic we are about to cover to add these views automatically (meaning you don't have to register them with routes.add_url()).

Python comes with a bevy of really cool tools for introspection (something made very easy since it is an interpreted language). Since Django already requires us to list the importable app names of our application, we will just use this list: settings.INSTALLED_APPS. Using this list we will attempt to load the modules as defined in the settings.INSTALLED_APPS list with the importlib module. This module comes with a handy feature called import_module(), which takes a string (aka the string listed in the settings.INSTALLED_APPS) and attempts to import it. Once imported it returns the module object that it found.
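
A quick sketch of import_module() on its own. The standard-library package json is used purely as a stand-in for an app name that would normally come from settings.INSTALLED_APPS, so that the example runs without a Django project.

```python
import importlib

# 'json' stands in for an entry of settings.INSTALLED_APPS.
app_names = ['json']

loaded = {}
for name in app_names:
    loaded[name] = importlib.import_module(name)   # string in, module object out

json_module = loaded['json']
```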

Now is a good time for me to state that I truly love the idea that everything (and they mean everything) is an object in Python. The module object is literally an object that describes a module that has been loaded, or in this case, the app's module. From that we can get the path to the module, and using some other python tools, we can grab the name of all the other modules inside this app. This means that we can load every module in an app and never need know what the app structure looks like beforehand! Isn't that cool? Because of this, we can make use of another nifty tool: inspect.
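
Here is a minimal sketch of that discovery step, again using the standard-library json package as a stand-in for an app. Skipping files that start with a double underscore is my own choice for the example; the router in the Source section does not do that.

```python
import glob
import importlib
import os

package = importlib.import_module('json')      # stand-in for a loaded app
package_dir = package.__path__[0]              # directory the package lives in

module_names = sorted(
    os.path.splitext(os.path.basename(path))[0]
    for path in glob.iglob(os.path.join(package_dir, '*.py'))
    if not os.path.basename(path).startswith('__')   # skip __init__.py and friends
)

# Relative import of each discovered module, anchored at the package name.
submodules = [importlib.import_module('.' + name, package.__name__)
              for name in module_names]
```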

inspect is a module in Python that allows you to examine live objects such as modules, classes, and functions. In this case we are going to inspect all the modules of an application that we have loaded from the settings.INSTALLED_APPS list above. It is probably a good idea to filter out the django.* apps since they won't have defined the routes as we have here, but that is up to you.

The function from the inspect module that we are interested in is getmembers(). This is a really neat function because you can pass in a module and a predicate (a function that returns True for the members you want to keep) to find what you are interested in. We are interested in finding all the view classes of our modules. Doing it this way allows us to define a view wherever we like inside our application - we needn't be limited to a single views.py module. To find just the classes, do the following:

inspect.getmembers(module, inspect.isclass)

This will return a list of (name, class) tuples. The approach taken here to see whether a class definition descends from View is to create an instance and test that (the built-in issubclass() could also perform the check on the class object directly, without an instance). Again, the everything-is-an-object paradigm means that the class definition is an object too, but it is simply of type type (the reason for this is far beyond the scope of this tutorial). Since inspect.getmembers() returns tuples with the class definition object in the second position, and since these objects are used to create an instance of the class, all we need to do is get the second position member and call it. An example below:

klasses = inspect.getmembers(module, inspect.isclass)
inst = klasses[0][1]()

The parentheses at the end create the object inst, which is an instance of the class that was defined by the second position of the first item in the klasses list (that is a mouthful, read it a few times to make sure you understand). We can now check that inst is of type View by doing the following:

isinstance(inst, View)  # View must be imported first: from django.views.generic.base import View

Checking to make sure that a class is a view is necessary because now we can take the other information we have about the module, the app, and the view name and create the app/module/view route. Also, since we already have the view with its route table, we have all the information to add the custom routing that is defined on the view. Pretty sweet!
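
The whole class-finding step can be tried out without Django. In this sketch View is a stand-in class I define locally (not django.views.generic.base.View), and the module is a throwaway object built with types.ModuleType rather than a real views module.

```python
import inspect
import types


class View(object):
    """Stand-in for django.views.generic.base.View so the sketch runs without Django."""


class FooView(View):
    routes = ()


class NotAView(object):
    pass


# Build a throwaway module object holding the classes, as if it had been imported.
module = types.ModuleType('fake_views')
module.FooView = FooView
module.NotAView = NotAView

found = []
for name, klass in inspect.getmembers(module, inspect.isclass):
    inst = klass()                      # instantiate, then test the instance
    if isinstance(inst, View):
        found.append(name)

# Alternative that needs no instance: ask the class object directly.
found_without_instances = [
    name for name, klass in inspect.getmembers(module, inspect.isclass)
    if issubclass(klass, View)
]
```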

Note that it is now a pretty trivial matter to do something very similar to look for a views.py module where you can define all your function-based views and add them to the routes table as well. This makes it so that you won't have to define any routes.add() outside of your routes folder. Doing this makes it much more automagical, but it is really up to you.

Conclusion


Usually my tutorials are a lot more straightforward, but I felt like for this example it would be too long and too much to go over every aspect of the code. Also, the intent of this tutorial was to try and give some idea of how to do something versus giving just one idea of how to solve the problem. It was a lot more difficult, and so it is probably pretty unclear at times what I was attempting to do. I said I would give the source code to you to view, and so I have attached it at the bottom here. But I would encourage you to view my solutions on Github. Github has automatic syntax highlighting which makes things easier to read.

I hope that I was able to explain in some detail some of the cooler aspects of this routes creation. It took me awhile to figure it out, but now that it is done I am very proud of what it can do. I am sure that there are several people who have done this, but it seems to me that I always get more joy from figuring it out on my own. Hopefully someone can use this to their advantage.

Source


'''
Created on Mar 12, 2015

@author: derigible
'''

from django.conf.urls import url, patterns
from django.conf import settings
import importlib as il
import glob, os, sys, inspect
from django.views.generic.base import View

def check_if_list(lst):
    '''
    Since strings are also iterable, this is used to make sure that the iterable is a non-string. Useful to ensure
    that only lists, tuples, etc. are used and that we don't have problems with strings creeping in.
    '''
    if isinstance(lst, str):
        raise TypeError("Must be a non-string iterable: {}".format(lst))
    if not (hasattr(lst, "__getitem__") or hasattr(lst, "__iter__")):
        raise TypeError("Must be an iterable: {}".format(lst))

class Routes(object):
    '''
    A way of keeping track of routes at the view level instead of trying to define them all inside the urls.py. The hope
    is to make it very straightforward and easy without having to resort to a lot of custom routing code. This will be
    accomplished by writing routes to a list and ensuring each pattern is unique. It will then add any pattern mappings
    to the route for creation of named variables. An optional ROUTE_AUTO_CREATE setting can be added in project settings
    that will create a route for every app/controller/view and add it to the urls.py.
    '''
    
    routes = [] #Class attribute so that lazy_routes will add to the routes table without having to add from the LazyRoutes list.
    acceptable_routes = ('app_module_view', 'module_view')
    tracked = set() #single definitive source of all routes
    
    def __init__(self):
        '''
        Initialize the routes object by creating a set that keeps track of all unformatted strings to ensure uniqueness.
        '''
        #Check if the urls.py has been loaded, and if not, then load it (for times when you want to create the urls without loading Django completely)
        proj_name_urls = __name__.split('.')[0] + '.urls'
        if proj_name_urls not in sys.modules:
            il.import_module(proj_name_urls)
        if hasattr(settings, "ROUTE_AUTO_CREATE"):
            if settings.ROUTE_AUTO_CREATE == "app_module_view":
                self._register_installed_apps_views(settings.INSTALLED_APPS, with_app = True)
            elif settings.ROUTE_AUTO_CREATE == "module_view":
                self._register_installed_apps_views(settings.INSTALLED_APPS)
            else:
                raise ValueError("The ROUTE_AUTO_CREATE option was set in settings but option {} is not a valid option. Valid options are: {}".format(settings.ROUTE_AUTO_CREATE, self.acceptable_routes))
    
    def _register_installed_apps_views(self, apps, with_app = False):
        '''
        Set the routes for all of the installed apps (except the django.* installed apps). Will search through each module
        in the installed app and will look for a view class. If a views.py module is found, any functions found in the 
        module will also be given a routing table by default. Each route will, by default, be of the value <module_name>.<view_name>. 
        If you are worried about view names overlapping between apps, then use the with_app flag set to true and routes 
        will be of the variety of <app_name>.<module_name>.<view_name>. The path after the base route will provide positional 
        arguments to the url class for anything between the forward slashes (ie. /). For example, say you have view inside 
        a module called foo, your route table would include a route as follows:
        
            ^foo/view_name/(?([^/]*)/)*
        
        Note that view functions that are not class-based must be included in the top-level directory of an app in a file
        called views.py if they are to be included. This does not make use of the Django app loader, so it is safe to put
        views in files outside of the views.py, as long as those views are class-based.
        
        Note that class-based views must also not require any parameters in the initialization of the view.
        
        To prevent select views from being registered in this manner, set the register_route variable on the view to False.
        
        All functions within a views.py module are also added with this view. That means that any decorators will also have
        their own views. If this is not desired behavior, then set the settings.REGISTER_VIEWS_PY_FUNCS to False.
            
        @param apps: the INSTALLED_APPS setting in the settings for your Django app.
        @param with_app: set to true if you want the app name to be included in the route
        '''
        def add_func(app, mod, func):
            r = "{}/{}/(?:([^/])*/+)*".format(mod,func[0])
            if with_app:
                r = "{}/{}".format(app, r)
            self.add(r, func[1], add_ending=False)
            
        for app in apps:
            if 'django' != app.split('.')[0]: #only do it for non-django apps
                loaded_app = il.import_module(app)
                for p in glob.iglob(os.path.join(loaded_app.__path__[0], '*.py')):
                    mod = p.split(os.sep)[-1][:-3]#get just the module name without the .py
                    try:
                        loaded_mod = il.import_module('.' + mod, loaded_app.__package__)
                        for klass in inspect.getmembers(loaded_mod, inspect.isclass):
                            try:
                                inst = klass[1]()
                                if isinstance(inst, View):
                                    if getattr(inst, 'register_route', True):
                                        add_func(app, mod, klass)
                                    if hasattr(inst, 'routes'):
                                        self.add_view(klass[1])
                            except TypeError: #not a View class if init is required.
                                pass
                        if mod == "views" and (hasattr(settings, 'REGISTER_VIEWS_PY_FUNCS') and settings.REGISTER_VIEWS_PY_FUNCS):
                            for func in inspect.getmembers(loaded_mod, inspect.isfunction):
                                add_func(app, mod, func)
                    except ImportError:
                        raise TypeError("Routes type found in view module when settings.ROUTE_AUTO_CREATE has been set. Switch Routes to LazyRoutes.")
        
    def add(self, route, func, var_mappings= None, add_ending=True, **kwargs):
        '''
        Add the name of the route, the value of the route as a unformatted string where the route looks like the following:
        
        /app/{var1}/controller/{var2}
        
        where var1 and var2 are arbitrary place-holders for the var_mappings. The var_mappings is a list of an iterable of values
        that match the order of the format string passed in. If no var_mappings is passed in it is assumed that the route has no mappings
        and will be left as is.
        
        Unformatted strings must be unique. Any unformatted string that is added twice will raise an error.
        
        To pass in a reverse url name lookup, you can use the key word 'django_url_name' in the kwargs dictionary.
        
        @route the unformatted string for the route
        @func the view function to be called
        @var_mappings the list of iterables used to fill in the var mappings
        @add_ending adds the appropriate /$ to the ending if True. Defaults to True
        @kwargs the kwargs to be passed into the urls function
        '''
        self._check_if_format_exists(route)
        
        def add_url(pattern, pmap, ending, opts):
            url_route = '^{}{}'.format(pattern.format(*pmap), '/$' if ending else '')
            if "django_url_name" in opts:
                url_obj = url(url_route, func, kwargs, name=kwargs['django_url_name'])
            else:
                url_obj = url(url_route, func, kwargs)
            self.routes.append(url_obj)
            
        if var_mappings:
            for mapr in var_mappings:
                check_if_list(mapr)
                add_url(route, mapr, add_ending, kwargs)
        else:
            add_url(route, [], add_ending, kwargs)
    
    def add_list(self, routes, func, prefix=None, **kwargs):
        '''
        Convenience method to add a list of routes for a func. You may pass in a prefix to add to each
        pattern. For example, each url needs the word workload prefixed to the url to make: workload/<pattern>.
        
        Note that the prefix should have no trailing slash.
        
        A route table is a dictionary after the following fashion:
        
        {
         "pattern" : '<pattern>', 
         "map" :[('<regex_pattern>',), ...],
         "kwargs" : dict
        }
        
        @routes the list of routes
        @func the function to be called
        @prefix the prefix to attach to the route pattern
        '''
        check_if_list(routes)
        for route in routes:
            route_kwargs = dict(kwargs)  # copy so one route's kwargs don't leak into the next
            if 'kwargs' in route:
                if not isinstance(route['kwargs'], dict):
                    raise TypeError("Must pass in a dictionary for kwargs.")
                route_kwargs.update(route['kwargs'])
            self.add(route["pattern"] if prefix is None else '{}/{}'.format(prefix, route["pattern"]),
                      func, var_mappings = route.get("map", []), **route_kwargs)
    
    @property
    def urls(self):
        '''
        Get the urls from the Routes object. This is a patterns object.
        '''
        return patterns(r'',*self.routes)
        
    def _check_if_format_exists(self, route):
        '''
        Checks if the unformatted route already exists.
        
        @route the unformatted route being added.
        '''
        if route in self.tracked:
            raise ValueError("Cannot have duplicates of unformatted routes: {} already exists.".format(route))
        else:
            self.tracked.add(route)
            
    def add_view(self, view, **kwargs):
        '''
        Add a class-based view to the routes table. A view that is added to the routes table must define the routes table; ie:
        
            (
                  {"pattern" : '<pattern>', 
                   "map" :[('<regex_pattern>',), ...],
                   "kwargs" : dict
                   },
                 ...
            )
        
        Kwargs can be omitted if not necessary.
        
        Optionally, if the view should have a prefix, then define the variable prefix as a string; ie
        
            prefix = 'workload'
            
            or
            
            prefix = 'workload/create'
            
        Note that the prefix should have no trailing slash.
        '''
        if not hasattr(view, 'routes'):
            raise AttributeError("routes variable not defined on view {}".format(view.__name__))
        if hasattr(view, 'prefix'):
            prefix = view.prefix
        else:
            prefix = None
        
        self.add_list(view.routes, view.as_view(), prefix = prefix, **kwargs)

class LazyRoutes(Routes):
    '''
    A lazy implementation of routes. This means that LazyRoutes won't add routes to the Routes table until after the
    routes table has been created. This is necessary when the ROUTE_AUTO_CREATE setting is added to the Django settings.py.
    All defined routes using the routes.* method must now become lazy_routes.* methods.
    '''
    
    def __init__(self):
        '''
        Do nothing, just overriding the base __init__ to prevent the initialization there.
        '''
        pass
        
lazy_routes = LazyRoutes()
routes = Routes()

Friday, March 6, 2015

How to Remove PostgreSQL from Server

At work we have a reporting server that was set up by an employee here who deemed himself the foremost expert at Linux. Needless to say, if you are touting your Linux skills, you had better be able to back them up. Turns out that he wasn't quite up to snuff with his skills, and after following a tutorial he put together for setting up a server, I found myself unsure of what PostgreSQL server I was actually using. Somehow I had both 9.1 and 9.2 running.

I was in the process of cleaning up the reporting server and decided that it would be good to only have one server of PostgreSQL running, so I discovered how to do it. In the process of cleaning it up, I also discovered that the server version you are using stores databases by default in the same parent directory as the server itself is located. In other words, if you follow this tutorial, be aware that you will lose your data if the default storage location has not been changed. Luckily it was data that really didn't need to be kept around for a long time so it wasn't that big of a deal. But be aware that this method does destroy data.

First, run the command

sudo /etc/init.d/postgresql stop

If two versions of the server pop up, then it means you are running two instances of PostgreSQL. Decide which one you want to remove and then run the following command:

sudo apt-get purge postgresql-x.x

where the x.x stands for the major.minor version number.

If you purged one of several servers that were running, then you will need to start the remaining server again:

sudo /etc/init.d/postgresql start

The server is now up and running and ready to be used.

Thursday, March 5, 2015

How to Setup Django on an AWS EC2 Instance Using VirtualEnv

Setting up an ec2 instance is as easy as following the launch wizard provided by Amazon, and as such will not be covered in this tutorial (to set up ec2 for yourself, you can follow Amazon's guide found here). After following the instructions for setting up an Ubuntu server, you should be ready to follow the rest of this tutorial. Though the server used is Ubuntu, the steps will be fairly similar for any Linux distribution (just use that distro's package manager calls instead). You can also follow this tutorial for connecting to an EC2 instance from Windows if you are unfamiliar with the process.

Update Ubuntu


The first thing that you should always do when starting a new server is update it. This provides all of the security updates and functional fixes that have been released since Amazon took the image of Ubuntu that you are using.

To do so, enter the command:

sudo apt-get update && sudo apt-get upgrade

The first half refreshes the package lists and the second half downloads and installs the updated packages. Normally this step doesn't take more than a minute or two, and once you are done you can usually continue setting up the server without having to restart (a nice advantage over the typical Windows update cycle).

Update the Distribution of Python to Python 3


NOTE: If you want to just stick with Python 2 instead of going through these steps, then skip ahead to the next section and treat python 3 references as though it were python 2 by simply removing the 3 in the call.

Before you do anything with your current installation's python, you need to take note (write on a piece of paper or store on a notepad document) of what version is the default version of python. You do so by running:

python --version

The above command will return something like Python 3.2.3. Make sure to make note of this as it will be important going forward.

My personal opinion is that Python 2, while great, should be deprecated and the world should move to Python 3. I won't go into an explanation here as to why I feel this way, but it is easy enough to do and writing things in Python 3 will ensure that your code will continue to work when (if ever) they decide to finally stop supporting Python 2.

To install python 3, you should run the following (this may not be necessary for server version 12.10 and up, see here):

sudo apt-get install python3

A quick Google search brought up some stackoverflow.com answers that seem to indicate that it is a bad idea to switch the system default python version to Python 3: Python 2 and 3 are not compatible, so doing so will likely break some scripts that rely on Python 2. I am not a Linux expert by any means, and since this is a practical tutorial on how to set up a server that will work, we will do as we are told (see more about it here).

To set up Python 3, we will create an alias by doing the following command:

echo 'alias python=python3' >> ~/.bashrc

The above command will edit the default bash shell setup (or, if you are not familiar with Linux at all, the command prompt you see through a terminal to your Linux box) with an alias to the word python that will point to python3. You will not see the change until you reconnect. To make sure it works, reconnect to the server and enter:

python --version

It should now be Python 3.x.x
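
The same check can be made from inside the interpreter, which is handy in scripts. A tiny sketch (the function name is_python3 is just something I made up for the example):

```python
import sys


def is_python3():
    """Return True when the running interpreter is Python 3."""
    return sys.version_info[0] == 3


# Same text that `python --version` prints, built from the running interpreter.
version_text = 'Python {}.{}.{}'.format(*sys.version_info[:3])
```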


Symlink Your Python Executable


We will also want to set up a symlink (basically an alias for another path) so that ~/bin/python points to Python 3. Do the following:

Make the directory:

mkdir -p ~/bin

Then run:

ln -s /usr/bin/python3 ~/bin/python

Essentially we just made a path called /home/ubuntu/bin/python that points to the Python interpreter in /usr/bin/python3. This link will be useful in the next step.
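
If symlinks are new to you, this small sketch recreates the idea inside a temporary directory so nothing on the real system is touched. The file called python3 here is only a placeholder text file standing in for /usr/bin/python3.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()

# Stand-in for /usr/bin/python3: an ordinary file inside the temp directory.
target = os.path.join(workdir, 'python3')
with open(target, 'w') as handle:
    handle.write('placeholder interpreter\n')

# Stand-in for ~/bin/python: a symlink that points at the target.
bin_dir = os.path.join(workdir, 'bin')
os.makedirs(bin_dir)
link = os.path.join(bin_dir, 'python')
os.symlink(target, link)

is_link = os.path.islink(link)       # True: the new path is a link, not a copy
resolved = os.path.realpath(link)    # follows the link back to the target file
```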

NOTE: If you didn't setup Python 3, then you will need to replace python3 with python2.

Install Python Package Manager


Pip is the preferred way by many to manage packages specific to python. There are other ways, but pip is so easy to use that it doesn't make a lot of sense not to use it. We will need to install pip (or some other package manager if we want to make this easy), before we continue. You can try another python package manager, but this tutorial is specific to pip.

To install pip, run:

sudo apt-get install python-pip

Pip should now be installed and ready to use.

Setup a VirtualEnv


Several places online encourage the use of virtualenv to run your Django instance. Since it is not a very hard thing to do, we will set it up as well. I will not go into detail as to why it is a good idea, but you can Google the reasons yourself if you want. We will follow the install instructions found at http://docs.python-guide.org/en/latest/dev/virtualenvs/.

To install virtualenv, call:

sudo pip install virtualenv

Since we set a symlink in the previous step to Python 3, we can use it to set up the virtualenv with the Python 3 interpreter:

virtualenv -p ~/bin/python venv

Note that it is possible to sidestep the symlinking and do something like virtualenv -p /usr/bin/python3, but symlinking provides the advantage of being able to change the python version without having to update this call. Essentially, if we wanted for some reason to move back to Python 2 (or if you never went to Python 3 using the steps above), we could set the symlink to /usr/bin/python2 and the virtualenv wouldn't know the difference (unless we broke compatibility by going back to python 2). It is therefore a good idea to make your calls using symlinks as it makes for more flexibility to change things in the future. In the case where you are doing this manually, though, it is probably fine to direct-link to the python version you want. It is good practice to write things in such a way as to make porting them into scripts much easier.

Now we need to activate the virtualenv:

source venv/bin/activate

Now that it is set up, you should see (venv) on the left side of your command prompt indicating that you are in a virtual environment. The activation only lasts as long as the shell is alive, so you will need to run the above command each time you want to work in your venv after closing the shell (or after running deactivate). Go ahead and enter deactivate for the next step.
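
Besides looking at the prompt, you can ask Python itself whether it is running inside a virtual environment. This is a best-effort sketch and the function name in_virtual_environment is my own; it relies on sys.real_prefix (set by older virtualenv releases) and sys.base_prefix (used by venv and newer virtualenv releases).

```python
import sys


def in_virtual_environment():
    """Best-effort check for an active virtualenv/venv."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, 'base_prefix', sys.prefix)
    return hasattr(sys, 'real_prefix') or base_prefix != sys.prefix


active = in_virtual_environment()
```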

NOTE: Python 3 comes with a virtual environment package built in called, conveniently, venv. I didn't know this until after I started writing this tutorial. It is basically the same as virtualenv, and it is likely easier to use. I would read about it here: https://www.python.org/dev/peps/pep-0405/, or the documentation here.

Setup PostgreSQL


PostgreSQL is the recommended database for Django as it is the most supported of all the databases. You will need to install it on your system by doing the following:

sudo apt-get install postgresql

We now need to log in to the server and setup the postgres user (probably a good idea to try and set up a different user other than postgres since postgres is the superuser for your database, but for now we can just use postgres). Do so by entering the psql (postgres database management prompt) by typing the following:

sudo -u postgres psql

You should see a prompt that looks like:

postgres=#

This is the command prompt for postgres and will allow us to perform operations on the database. First we set up the user password:

ALTER USER postgres PASSWORD '<password here>';

Which should be followed by the words ALTER ROLE.

Now we will create our database:

CREATE DATABASE <db_name_here>;

You should then see CREATE DATABASE to confirm that it was created.

Now you will need to install a few other packages so Django will be able to talk to the server. Run the command:

sudo apt-get install postgresql-server-dev-x.x

where x.x is the version number of your PostgreSQL database. You can find out the version of PostgreSQL by running the following:

sudo /etc/init.d/postgresql stop

This will show you the version that was stopped. Restart it by running:

sudo /etc/init.d/postgresql start
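If you would rather not stop the server just to read its version, the sketch below derives the package name from the output of psql --version. It assumes the psql client and the server were installed from the same Ubuntu package (so their versions match), and the function name server_dev_package is hypothetical.

```python
import re
import shutil
import subprocess


def server_dev_package(version_output):
    """Turn `psql --version` output into a postgresql-server-dev package name."""
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    major, minor = match.groups()
    # Before PostgreSQL 10 the package name used two numbers (for example 9.3);
    # from 10 onward only the first number is used.
    suffix = major if int(major) >= 10 else "%s.%s" % (major, minor)
    return "postgresql-server-dev-" + suffix


# Only call psql when it is actually installed on this machine.
if __name__ == "__main__" and shutil.which("psql"):
    output = subprocess.check_output(["psql", "--version"], universal_newlines=True)
    print(server_dev_package(output))
```

The printed name is what you would pass to sudo apt-get install in place of postgresql-server-dev-x.x.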

Then run the command:

sudo apt-get install python3-dev

which will install some python files that are not included in the original python 3 install. Next you will reactivate your virtual environment:

source venv/bin/activate

and then run:

pip install psycopg2

which installs the actual interface between Django and the PostgreSQL server. I don't know all of the reasons why each of these files is needed, except that developers often rely on files found in other packages to speed up development. That is what is going on here, and it is why we have to install so many additional packages.
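With the database created and psycopg2 installed, Django still has to be told where the database is, which happens in the DATABASES setting of your project's settings.py. The following is a sketch under a few assumptions: the helper build_databases and the environment variable names DJANGO_DB_NAME and DJANGO_DB_PASSWORD are hypothetical, and the server is assumed to be listening on the default local port.

```python
import os


def build_databases(db_name, password):
    """Return a Django-style DATABASES dict for the local PostgreSQL server."""
    return {
        "default": {
            # Django 1.9+ spells the backend this way; older releases
            # (such as 1.7) use "django.db.backends.postgresql_psycopg2".
            "ENGINE": "django.db.backends.postgresql",
            "NAME": db_name,        # the name used in CREATE DATABASE
            "USER": "postgres",     # the user whose password was set above
            "PASSWORD": password,
            "HOST": "localhost",
            "PORT": "5432",         # PostgreSQL's default port
        }
    }


# DJANGO_DB_NAME and DJANGO_DB_PASSWORD are hypothetical variable names;
# reading them from the environment keeps the password out of the source file.
DATABASES = build_databases(
    os.environ.get("DJANGO_DB_NAME", "mydb"),
    os.environ.get("DJANGO_DB_PASSWORD", ""),
)
```

Only the DATABASES name matters to Django; the helper function is just a convenience for this example.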

Install Django Inside VirtualEnv


Start up the virtual environment again:

source venv/bin/activate

Now run the command:

pip install django

Note that you no longer have to use sudo in front of pip to install packages. This is one of the best benefits of using virtualenv. 
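To confirm that the install landed inside the virtual environment, you can check from Python whether the django module can be found. This is a small sketch; is_installed is a name I made up for the example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("django"):
        import django
        print("Django version:", django.get_version())
    else:
        print("Django is not visible to this interpreter; is the venv active?")
```

If it reports that Django is not visible, the most likely cause is that the script was run with the system Python rather than the one in venv.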

Next we will make a symlink to the python 3.x site-packages directory, to be used later in the apache setup:

sudo ln -s ~/venv/lib/python3.x/site-packages  /var/lib/python/site-packages

where the x in 3.x is the name of the directory for the Python 3 version you are using.
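If you are not sure what to substitute for 3.x, Python can report both its own version directory name and the site-packages path. A minimal sketch follows (the two function names are mine); run it with the virtual environment active so the path points inside venv.

```python
import sys
import sysconfig


def python_dir_name():
    """Return the versioned directory name, for example python3.4."""
    return "python%d.%d" % sys.version_info[:2]


def site_packages_dir():
    """Return the directory where pure-Python packages are installed."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("version directory:", python_dir_name())
    print("site-packages path:", site_packages_dir())
```

The same version directory name is what the mod_wsgi configure step later in this tutorial expects after /usr/bin/.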

Setup Your Django Project


If you have a Django project that you have already built, then you have a variety of ways you can get it onto your server. We will focus on the case where you already have a Django project built and leave the other case up to the user to figure out (see Django's excellent documentation on how to get started with building a Django app, though I recommend building it on your local computer first). If you don't know how to put files onto a server, you can follow my tutorial here (the part on connecting with WinSCP is towards the bottom and is a little outdated, but should be sufficient). I will not go over how to get the project onto your machine, since my previous tutorial details how to do so. We will start off assuming that you have your project on your server.

To begin, make a symlink to your django files wherever they may be (for ease of finding later in this tutorial):

sudo ln -s /path/to/django/files/directory /var/lib/<projectName>

We will need this path when we set up apache on our server.

After Django is installed and your project is set up, we will need to set up our Apache server. We could use just about any other server, such as nginx or lighttpd, but will stick with Apache because of its popularity and because I know it already. An added bonus is that Django has documented how to set up Django with Apache, so why not make it simple?

Setup Apache Server - Install


First, if inside your virtual environment, deactivate your virtual environment:

deactivate

Now enter the command:

sudo apt-get install apache2

Pretty straightforward. You can navigate to your server's url and you should see the Apache default page, which looks something like this:

It works!

This is the default web page for this server.
The web server software is running but no content has been added, yet.

Once this is complete, you will need to install a plugin for apache called mod_wsgi. Mod_wsgi needs to be compiled with the same version of python that your scripts are running. This creates a major headache, and I couldn't find an easy way with package managers to simply point the install to the right python version. Therefore, if you are running a version of Ubuntu that doesn't have Python 3 as the default version, you will need to do this (hopefully you checked what the system default python is as instructed above; if not, then it is up to you to figure it out).

To set up your mod_wsgi to work with apache in python 3, do the following (taken from this stackoverflow.com answer):

Install more packages needed to modify apache2 mods:

sudo apt-get install apache2-dev

Change directories to a common place to store source code:

cd /usr/local/src

Download and install the mod_wsgi code from the code repository. The following steps are all needed to take the code from the repository and make it into something Linux can use:

sudo apt-get install make

sudo wget https://modwsgi.googlecode.com/files/mod_wsgi-3.4.tar.gz

sudo tar -zxvf mod_wsgi-3.4.tar.gz

cd mod_wsgi-3.4/

sudo ./configure --with-python=/usr/bin/python3.x

where x is the version of Python 3 on your machine.

sudo make

sudo make install

Now you have mod_wsgi in an executable binary format that can be loaded into apache. To tell apache to load this module, we will have to edit the apache configuration, which we cover in the next section.

NOTE: If you decided to stick with Python 2, things are a lot easier. To install, simply run:

sudo apt-get install libapache2-mod-wsgi

You are now ready to set up Django to run with Apache.

UPDATE: If you are using a recent edition of Ubuntu, use the libapache2-mod-wsgi-py3 package instead of compiling from source: https://launchpad.net/ubuntu/trusty/+package/libapache2-mod-wsgi-py3. This package is for Python 3.

Setup Apache Server - Configure to Run with Django


One reason I really recommend Django over other web frameworks (at least for python users) is that the documentation is excellent. Django comes with a tutorial on deploying Django with Apache. I will attempt to distill the finer points here, but you can always see the Django tutorial at https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/modwsgi/.

First thing we need to do is navigate to the /etc/apache2 directory:

cd /etc/apache2

If you look in the directory (ls command), you will see several different files, each of them dealing with different aspects of apache's configuration. The apache2.conf is the main apache configuration file and any configuration changes you make there will be used. However, it is generally a good idea to leave custom changes to apache's configuration outside of the main config file. Therefore, apache comes with an httpd.conf file, which is a user-defined configuration file that is added to the main apache2.conf file. It is good practice to edit this file as it helps to separate the changes you made from what comes standard with apache. All we need to do is make sure that the apache2.conf file includes httpd.conf.

Open the apache2.conf file (if you don't know how, there are a number of ways to do so, each of which can be tricky to use if you don't know Linux). Since this is a very basic tutorial, we will use the text editor vim to open our file. It has some nice features to help read text files from a terminal, and it is widely regarded as one of the most useful text editors on Linux. Just note that if you are new to vim, only enter the commands you see here or else you will be totally confused as to what is going on.

Enter:

sudo vim apache2.conf

Your screen should now have a bunch of blue text. What you are reading are the instructions for how to use apache. Take some time to read it as it does give some useful information, but for our purposes we are just going to use the up and down arrow keys (you can also use the page up and down keys to scroll a whole page) to find what we want.

After scrolling down for a bit you should see:

Include httpd.conf

If this line is in there then you are ready to edit the httpd.conf file. Exit out of your current view by typing the keys

:q

and then pressing enter. This will return you back to your regular command prompt. If that line is not there, then enter the following sequence of commands:

  1. Press the i key.
  2. Write Include httpd.conf on its own line.
  3. Press the Esc key.
  4. Enter the character sequence :wq
  5. Press Enter.

You have successfully added the httpd.conf file to your apache2.conf file.

Now open the httpd.conf file as follows:

sudo vim httpd.conf

Using the same basic pattern described above for editing a file in vim, write the following config information in your httpd.conf file (do not save yet, as more will be written):


WSGIDaemonProcess <projectName> python-path=/var/lib/<projectName>:/var/lib/python/site-packages
WSGIProcessGroup <projectName>
WSGIScriptAlias / /var/lib/<projectName>/<projectName>/wsgi.py
Alias /static/ /var/lib/<projectName>/static/

Note that <projectName> should be the name of your Django project.

The above configuration is telling apache to run your Django project that we set up previously and is also set to retrieve your static files, like your css and js files, from the Django static folder directly. Django strongly discourages the use of Django as a means to send static files to a user, so that is why we tell apache where to look for static files. This presupposes that your static files are pointing to ./static/ in your html.
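It may help to see what the wsgi.py file named in WSGIScriptAlias actually provides. mod_wsgi loads that file and looks for a callable named application, which it invokes for every request; Django's generated wsgi.py defines one for you. The sketch below is a bare-bones stand-in written with only the standard library, not Django's own file, and the call_once helper is a hypothetical convenience for trying it without a server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable of the kind mod_wsgi expects to find."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(app):
    """Invoke a WSGI app with a fake request and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible GET / request
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    status, body = call_once(application)
    print(status)
    print(body.decode("utf-8"), end="")
```

In a real Django project you would leave the generated wsgi.py alone; this is only meant to show the contract between apache and your code.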

If Python 3 is your python version, then do the following step; skip it if you stuck with Python 2. We are going to add the configuration now that tells apache to load the mod_wsgi executable. Write the line:

LoadModule wsgi_module /usr/lib/apache2/modules/mod_wsgi.so

Save the file as discussed in steps 3 through 5 in the vim editing example above. You are done setting up apache.

Restart Apache


After you have completed all of this, you are ready to go. Restart apache by issuing the following command:

sudo service apache2 restart

Navigate to your server's homepage and you should now be seeing the homepage of your django app!

Troubleshooting


I hope that this helps out aspiring developers in the future. Some of the pitfalls I faced when first setting up a Django app are as follows:
  1. Couldn't Find Anything with PIP - I didn't have my https port open on AWS. In hindsight it seems like an obvious reason for some of the issues I was facing, but it wasn't immediately apparent when pip was failing that it just couldn't see the pip repository. PIP uses https to get the packages you need on your system, so it requires that https be open. I didn't know it at the time, and I spent hours working with the extremely unhelpful error messages before I figured out the solution.
  2. Apache is Returning 500 Errors - It took me a while to figure out why my first install of Django wasn't working, and I had to do a lot of searching just to figure out where the log files for apache were so I could see what was happening. For our purposes (since the location of the log files really depends on the Linux distribution) you can find the log files under /var/log/apache2. The most useful troubleshooting log is, naturally, error.log. Use the tail command to see what happened last: tail -100 /var/log/apache2/error.log
  3. My Apache error.log tells me that permission is denied with file /path/to/file/__pycache__ - Apache runs as user www-data and as such has very limited permission to edit files on the system (for obvious security reasons). You will need to change the ownership of the folder that stores your files (most likely /var/lib/<projectName>, as we set up above). To do so, enter the following: sudo chown -R www-data /var/lib/<projectName> assuming that the __pycache__ is within this directory. Then run the command sudo chmod -R 775 /var/lib/<projectName> . Note that this last command is somewhat insecure but should be sufficiently secure for now. Most security settings will have to be adjusted when you decide to really get serious about security, so we won't bother with it now.
  4. Makemigrations is Giving Me Permission Denied Errors - Permissions need to be edited to allow the ubuntu user (the default login user) to make edits as well. Do the following: sudo chown -R www-data:ubuntu /var/lib/<projectName> . For good measure, run the command sudo chmod -R 775 /var/lib/<projectName> .
  5. I can't see my media files - This one was really obvious but somehow I missed the explanation Django provided. You need to create another configuration entry in httpd.conf that points /media/ calls to your media folder (wherever that may be). Google "Django set up media files" for more information.
  6. VirtualEnv is getting a permission denied error. - This problem occurred because I failed to symlink my python interpreter correctly. Linux doesn't always (perhaps not so often) give good error messages, and I had to bang my head for a while with this one. After I removed the symlink to my python interpreter and remade it the correct way, everything worked. But, to be thorough, here is a resource you can use.
  7. Apt-get not working because lock can't be removed. - Again, another stupid problem caused by me being impatient and ending a process before it finished, after which Linux couldn't recover. Basically you just have to end the apt-get processes and then, if that doesn't work, remove the lock file. See more here.
  8. I'm having trouble migrating my models to the database. - Remember that anything you do with Django has to be run inside the virtual environment. Before running migrations or other Django management, you must run source ~/venv/bin/activate .
  9. There is a problem with my virtual environment saying that I do not have setuptools installed when attempting to install psycopg2. - Follow the answer here: https://www.reddit.com/r/learnpython/comments/3jlbep/error_msg_pip_setuptools_must_be_installed_to/
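For item 2 in the list above, the same "show me the end of the log" idea can be done from Python, which is handy if you want to fold it into a script later. This is a minimal sketch; last_lines is a name I chose, and the log path is the Ubuntu location mentioned above.

```python
import os
from collections import deque


def last_lines(path, count=100):
    """Return the final `count` lines of a text file as a list of strings."""
    # A deque with maxlen keeps only the most recent lines while reading,
    # so the whole file never has to be held in memory at once.
    with open(path, "r", errors="replace") as handle:
        return list(deque(handle, maxlen=count))


if __name__ == "__main__":
    log_path = "/var/log/apache2/error.log"
    if os.access(log_path, os.R_OK):
        print("".join(last_lines(log_path)), end="")
    else:
        print("cannot read", log_path, "(try running with sudo)")
```

The newest entries are at the bottom of the output, just as with the tail command.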

Sources


I used a plethora of sources to make this work. As I have mentioned before, I am not a system administration guru, but I am fairly decent with Linux. That being said, several things about setting up a server were not immediately straightforward to me since error messages on Linux can be extremely unhelpful at times. I have listed below several of the sources that helped me get to where I am now.
  1. http://www.tonido.com/blog/index.php/2013/11/25/working-with-virtualenv-on-django-projects/#.VPjDHvnF_DQ
  2. https://www.digitalocean.com/community/tutorials/how-to-run-django-with-mod_wsgi-and-apache-with-a-virtualenv-python-environment-on-a-debian-vps
  3. https://virtualenv.pypa.io/en/latest/userguide.html#usage
  4. https://docs.djangoproject.com/en/1.7/topics/install/
  5. https://docs.djangoproject.com/en/1.7/howto/deployment/wsgi/modwsgi/
  6. https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-14-04
  7. http://www.postgresql.org/download/linux/ubuntu/
  8. http://stackoverflow.com/questions/1951742/how-to-symlink-a-file-in-linux
  9. http://ubuntuforums.org/showthread.php?t=2141770
  10. http://askubuntu.com/questions/197626/where-is-a-postgresql-9-1-database-stored-in-ubuntu-12-04
  11. http://www.postgresql.org/message-id/006201c74b23$17cce130$9b0014ac@wbaus090
  12. http://askubuntu.com/questions/15433/unable-to-lock-the-administration-directory-var-lib-dpkg-is-another-process
  13. https://www.digitalocean.com/community/tutorials/how-to-read-and-set-environmental-and-shell-variables-on-a-linux-vps
  14. http://stackoverflow.com/questions/16618071/export-a-variable-to-the-environment-from-a-bash-script-without-sourcing-it
  15. http://askubuntu.com/questions/320996/make-default-python-command-to-use-python-3
  16. http://askubuntu.com/questions/401132/how-can-i-install-django-for-python-3-x
  17. https://docs.djangoproject.com/en/1.7/faq/install/
  18. http://stackoverflow.com/questions/5846167/how-to-change-default-python-version
  19. http://askubuntu.com/questions/244544/how-do-i-install-python-3-3
  20. http://docs.python-guide.org/en/latest/dev/virtualenvs/
  21. http://stackoverflow.com/questions/22938679/error-trying-to-install-postgres-for-python-psycopg2
  22. http://stackoverflow.com/questions/20913125/mod-wsgi-for-correct-version-of-python3
  23. http://askubuntu.com/questions/483744/config-status-error-cannot-find-input-file-makefile-in
  24. https://code.google.com/p/modwsgi/wiki/CheckingYourInstallation
  25. http://httpd.apache.org/docs/2.2/mod/mod_so.html
  26. https://launchpad.net/ubuntu/trusty/+package/libapache2-mod-wsgi-py3