Scalable Web Apps: Erlang + Python

Motivation

This post describes how to program scalable web applications with Erlang and Python using computational parallelism. Caching and load balancing are well documented elsewhere and beyond the scope of this post.

Web applications, by nature, span two drastically different programming domains: The high-level web design and development domain and the low-level, high-performance, and distributed domain. Since the internet has provided a bridge between these two domains, it’s now possible to realistically provide high-level user interfaces to high-performance back-end applications.

I prefer to do as much programming in Python and the Django web framework as possible because it’s just so easy. ErlyWeb is an Erlang web framework which should provide similar features to Django, though I’ve never used it. In a web application setting, it’s easy to have too many Apache/Python processes which max out server memory or the CPU. For those tasks, I like Erlang. Erlang has parallelism and distribution primitives, which are arguably more elegant than Python’s concurrency primitives, and has SMP support.

Erlang + Python Communication

It seems like a no-brainer to use domain-specific languages to ease development efforts in the corresponding domain, but the tricky part is making multiple languages communicate when multiple domains are spanned. MochiMedia is a small company that builds many of its products around Erlang + Python and, after shopping around, I chose to use their method of interfacing between the two languages: HTTP + JSON. The data being sent between the two languages is serialized, sometimes with JSON, and then passed along via HTTP. This has a few benefits. First, since HTTP is being used, my Erlang cluster can be across the internet from my Python front-end server. Second, since I’m using an independent intermediate representation to serialize the data (JSON), any component of the application stack may be swapped out for something completely new.

Here’s how it’s done:

  1. Use MochiWeb to enable the Erlang nodes to communicate over HTTP. The latest version of Erlang and OpenSSL headers are needed to compile MochiWeb.
  2. Create a MochiWeb project skeleton. Since MochiWeb is a framework, a script is provided to help create a web server which uses MochiWeb.
  3. Modify the request handler to understand JSON. The only Erlang that really needs to be modified is in [project name]/src/[project name]_web.erl. This is where the processing code goes (e.g., map/reduce).
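
The real handler from step 3 is written in Erlang against MochiWeb, and it is not reproduced here. As a rough illustration of the contract it has to honor (JSON in the POST body, JSON in the reply), here is a Python stand-in that can also double as a local test double for the front end. Everything in it is an assumption made for the sketch: the names process, JsonHandler and start_stub_node, the {"numbers": [...]} payload, and the {"sum": ...} reply are hypothetical and not part of MochiWeb.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def process(data):
    """Hypothetical stand-in for the real computation (e.g. map/reduce)."""
    return {"sum": sum(data["numbers"])}


class JsonHandler(BaseHTTPRequestHandler):
    """Accepts a JSON POST body and answers with a JSON body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length).decode("utf-8"))
            status, result = 200, process(payload)
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed JSON or an unexpected payload shape.
            status, result = 400, {"error": str(exc)}
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def start_stub_node(port=0):
    """Start the stub on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), JsonHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_stub_node() returns the server object; its server_address[1] attribute holds the port that was chosen, and shutdown() stops it.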

On the Python side, urllib2 is enough: build a Request object and send it to Erlang with urllib2.urlopen. Django comes pre-packaged with SimpleJSON, which can serialize the body of the HTTP request:

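from urllib2 import Request, urlopen         # Python 2 standard library
from django.utils import simplejson as json  # or the stdlib json module (Python 2.6+)

def send_to_erlang(data):
    # Serialize the payload to JSON and POST it to the Erlang node.
    url = "http://erlang.nodes.tld:8000/"
    body = json.dumps(data)
    headers = {'Content-Type': 'application/jsonrequest',
               'User-Agent'  : 'Python/Project/0.1'}
    return urlopen(Request(url, body, headers))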
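
The function above targets Python 2, where urllib2 lives. For readers on Python 3, the following is a minimal sketch of the same idea using urllib.request, extended to decode the reply. It assumes the Erlang node answers with a JSON body, which the post does not state. The names build_request and call_erlang and the timeout parameter are made up for this example, and the sketch uses the registered application/json media type rather than the application/jsonrequest value shown above.

```python
import json
from urllib.request import Request, urlopen


def build_request(data, url="http://erlang.nodes.tld:8000/"):
    """Serialize `data` to JSON and wrap it in a POST request."""
    body = json.dumps(data).encode("utf-8")  # urllib needs bytes in Python 3
    headers = {"Content-Type": "application/json",
               "User-Agent": "Python/Project/0.1"}
    return Request(url, data=body, headers=headers)


def call_erlang(data, url="http://erlang.nodes.tld:8000/", timeout=10):
    """Send `data` and decode the reply, assuming the node answers in JSON."""
    with urlopen(build_request(data, url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the serialization step easy to inspect without a running cluster.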

Parallelism

Kevin Smith, of Hypothetical Labs, interviewed Bob Ippolito, CTO at MochiMedia, and the interview makes a great case study for Erlang + Python. Bob talks in depth about the engineering tasks the model helps overcome.

Computation tasks which can be executed in parallel are key to utilizing Erlang’s distributed parallelism. A relatively small message that triggers a big computation is the desired abstraction. For example, the Django web interface could wrap an Erlang distributed map/reduce implementation. The Erlang book enumerates many different paradigms for Erlang distributed parallelism, and for programmers who already know what they want to parallelize, the plists library takes care of the distribution automatically. A programmer with at least a little experience in both Erlang and Python should be able to hack their way through to a fully functional and scalable web application from here.
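
To make the map/reduce shape concrete, here is a small Python sketch. In the design described above, the distribution would happen inside the Erlang cluster (for example with plists), so this is only an illustration of the pattern from the front end's point of view, not the author's implementation. The names chunk, scatter_gather and fake_node are hypothetical. The send argument stands for whatever function posts one chunk to a back-end node and returns its decoded reply, and here a local function plays that role.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def scatter_gather(items, send, reduce_fn, chunk_size=1000, workers=4):
    """Send each chunk through `send` concurrently, then combine the replies.

    `send` is a hypothetical callable that posts one chunk to a back-end
    node and returns its decoded reply; `reduce_fn` folds the list of
    replies into a single result.
    """
    chunks = chunk(list(items), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(send, chunks))  # results keep chunk order
    return reduce_fn(partials)


# Local stand-in for a node: sums the squares of the numbers it receives.
def fake_node(numbers):
    return sum(n * n for n in numbers)


total = scatter_gather(range(10), fake_node, sum, chunk_size=3)
print(total)  # 285, the sum of the squares of 0..9
```

Threads are enough on the Python side because each worker would spend its time waiting on HTTP, while the heavy computation runs on the back end.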

  • itay

    Scalable in what way? I find the notion of combining the two compelling, and I have done similar things with them, but I am not sure what exactly you are advocating. Is it to use the Erlang servers as a reverse proxy (in a way) to the Django app? Is it to allow computation to be done on the Erlang servers that would be expensive on the Python machine? What sort of topology are you talking about?

    Sorry for the barrage of questions, I just want to know the final step of what you had in mind.

  • https://www.humani.st Luke Hoersten

    I expect my readers to decide how to scale their applications. Both the reverse proxy and spawning expensive computations would fit this Erlang + Python model well. My intent was simply to show how to use MochiWeb to make Erlang and Python communicate. There are many more articles online about why it may be desirable to use Erlang to scale web applications.

    Personally, I wrote an application which uses an Erlang cluster to run expensive computations in parallel but clearly this design could be used for other scalability issues (even database issues using Mnesia). I have another post coming specifically about the design of this application.

    Perhaps I should add more speaking specifically to this issue, though. I had the notion that this post was not complete but couldn’t put my finger on it. I had a few friends read it over but they could not pin it down either. Can you be more specific about what you’d like me to address?

    Thanks for the comment, itay!

  • http://rsaccon.com Roberto Saccon

    Hi Luke

    Combining Python and Erlang seems to be popular these days. I am also doing that right now, the guys at erlware are as well, and probably many more …

  • https://www.humani.st Luke Hoersten

    What project(s) are you working on? I’d be interested to compare ideas. I’ve been extremely impressed with the results of my projects.

  • http://jtarchie.com jtarchie

    I think you should retitle your post. I am learning Erlang currently and thought this would be an actual example of a scalable web app. All this really is is a method of passing information from Python to Erlang, which is OK, but totally not what I expected. I am interested to see the next article that you describe.

  • https://www.humani.st Luke Hoersten

    Understood, but I guess it all depends on your perspective. To me, interlanguage communication was the hardest part of the scalable app, and tools like mapreduce and plists made the actual distribution and parallelism easy. Have you tried those libraries yet?

    Seeing such a huge response requesting an example though is a good sign. I’ve got that on the way.

  • Andrew Wooster

    I’m doing almost exactly the same thing at my startup. :-)

  • https://www.humani.st Luke Hoersten

    That’s awesome. Is your startup running in stealth? I’d like to see how it’s working out for you.

  • http://www.nextthing.org/ Andrew Wooster

    Still in stealth mode, yeah. I’ll make sure I write up my experiences at some point, but it will be a while.

  • https://www.humani.st Luke Hoersten

    Well, good luck! Let me know if you need any freelance work done in Erlang/Python/Django. I’ll have a month free before I go to work with algo trading.

  • Pingback: Humanist → Talking to Erlang

  • https://www.humani.st Luke Hoersten

    I’d just like to point out Disco, which only adds to the coolness of this idea.

  • Pingback: Erlang: Links, News and Resources (1) « Angel “Java” Lopez on Blog