Google Friend Connect: Deface Your Site

After adding a few Google Friend Connect widgets to my site, I’ve determined that the relationship just won’t work out. GFC looks like trash. All the widgets must be pre-formatted to a specific pixel width and height. I’ve spent hours searching for this theme, tweaking it, crafting everything to the em, and making sure everything integrates perfectly. Imagine spending hours baking the perfect multi-layer wedding cake and then, at the end, scribbling the names of the bride and groom on a few sticky notes and slapping them on the frosted flowers. Sure, value is being added, but at what cost?

I was a bit afraid that Google had the girth to capture some of the potential mainstream demographic from Disqus, but now I realize that Google’s going for the cut-copy-paste website owners. Google used to be so good at design, but somehow they’re becoming the Web 2.0 version of good ol’ ’90s Geocities.

Disqus will soon have more mainstream accessibility with the coming addition of Facebook Connect. Facebook Connect will achieve the same “value added” as Google, linking Facebook accounts with websites, but with the elegance and tailored integration of the Disqus blog connection.

Talking to Erlang

In a previous post, Scalable Web Apps: Erlang + Python, I talked broadly about using HTTP to make external applications talk to an Erlang cluster over the internet or a local network. The following code is an example of a MochiWeb server set up to receive HTTP requests with embedded JSON. HTTP serves as a convenient wrapper around whatever data is being sent, and the data itself is serialized as JSON. The following goes in the generated MochiWeb skeleton code file called project_name_web.erl (example_web.erl for the module shown here):

%% @author Luke Hoersten <Luke@Hoersten.org>
%% @copyright 2008 Luke Hoersten.

%% @doc Web server for example.

-module(example_web).
-author('Luke Hoersten <Luke@Hoersten.org>').

-export([start/1, stop/0, loop/2]).

%% External API
start(Options) ->
    {DocRoot, Options1} = get_option(docroot, Options),
    Loop = fun (Req) -> ?MODULE:loop(Req, DocRoot) end,
    mochiweb_http:start([{name, ?MODULE}, {loop, Loop} | Options1]).

stop() ->
    mochiweb_http:stop(?MODULE).

loop(Req, _) ->
    io:format("~nReceived:~n~p~n", [Req]),
    case {Req:get(method), Req:get_header_value("Content-Type")} of
        {'POST', "application/jsonrequest"} ->
            do_stuff(mochijson2:decode(Req:recv_body())),
            %% Req:ok/1 takes a single {ContentType, Headers, Body} tuple.
            Req:ok({"text/plain",
                    [{"User-Agent", "Erlang/example/0.1"}],
                    <<"Processing Request">>});
        _ ->
            Req:not_found()
    end.

%% Internal API
do_stuff(Data) ->
    io:format("~nData: ~p~n", [Data]),
    Nodes = [{'erlang@pod5-1', 4}, {'erlang@pod5-2', 4},
             {'erlang@pod5-3', 4}, {'erlang@pod5-4', 4},
             {'erlang@pod5-5', 4}],
    plists:foreach(fun stuff:do_stuff/1, Data, {nodes, Nodes}).

get_option(Option, Options) ->
    {proplists:get_value(Option, Options), proplists:delete(Option, Options)}.

How it Works

The server is essentially a loop that uses MochiWeb to pull HTTP requests off a TCP/IP socket. If a request is an HTTP POST whose Content-Type is application/jsonrequest, the JSON body is deserialized and passed to the function do_stuff; any other request gets a “not found” response. do_stuff then uses the plists library to parallelize processing of the data over a cluster of Erlang nodes, which can talk to each other with message passing as needed. The “OK” response is sent back to the application that made the HTTP request only after the parallelized processing completes, and the example does no exception handling.
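The dispatch rule described above can be modeled in a few lines of Python. This is only a sketch of the control flow, not MochiWeb's API: handle, process_item, and the worker count are hypothetical names, and a local thread pool stands in for the plists fan-out across Erlang nodes.

```python
import json
from concurrent.futures import ThreadPoolExecutor

JSON_TYPE = "application/jsonrequest"


def process_item(item):
    """Hypothetical stand-in for stuff:do_stuff/1."""
    return item


def handle(method, content_type, body, workers=4):
    """Return (status, response_body) for one request."""
    if method == "POST" and content_type == JSON_TYPE:
        data = json.loads(body)
        # Fan the items out to a worker pool, loosely like plists:foreach.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            list(pool.map(process_item, data))
        # Reply only after every item has been processed.
        return 200, "Processing Request"
    # Anything that is not a JSON POST is rejected.
    return 404, ""
```

As in the Erlang version, the reply is produced only once all of the work has finished, so a slow item delays the response.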

For reference, here is the example Python from the other end of the connection:

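import json
from urllib.request import Request, urlopen  # Python 3; urllib2 in Python 2

def send_to_erlang(data):
    url = "http://erlang.nodes.tld:8000/"
    # Request bodies must be bytes in Python 3.
    body = json.dumps(data).encode("utf-8")
    headers = {'Content-Type': 'application/jsonrequest',
               'User-Agent'  : 'Python/Project/0.1'}
    # Supplying a body makes urlopen issue an HTTP POST.
    urlopen(Request(url, body, headers))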
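To see what the client puts on the wire without needing a running server, the request can be built and inspected locally. This sketch assumes Python 3's urllib.request; build_request is a hypothetical helper, not part of the original post.

```python
import json
from urllib.request import Request


def build_request(data, url="http://erlang.nodes.tld:8000/"):
    """Build, but do not send, the request the client would make."""
    body = json.dumps(data).encode("utf-8")  # bodies must be bytes
    headers = {"Content-Type": "application/jsonrequest",
               "User-Agent": "Python/Project/0.1"}
    return Request(url, body, headers)


req = build_request([1, 2, 3])
method = req.get_method()                     # a request with a body is a POST
content_type = req.get_header("Content-type")  # urllib stores header keys capitalized
payload = json.loads(req.data.decode("utf-8"))
```

The method and Content-Type are exactly the pair the Erlang loop matches on, and the payload round-trips through JSON unchanged.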

It may seem simple, and it is simple because of Erlang and MochiWeb, but before seeing how this all fits together, it can seem mysterious. Please feel free to comment with questions and I’ll try to clarify as much as possible. Thanks to the MochiWeb mailing list for helping me when I was starting out.

Why Make Erlang a Functional Language?

Why Does Erlang Have Weird Syntax?

I’ve heard the argument many times. People “don’t like Erlang’s syntax so [they] don’t like Erlang.” I, for instance, didn’t understand the block terminator syntax when I was first learning Erlang, so I asked Yariv Sadan about it:

Erlang syntax came from Prolog. The ‘end’ keyword is used to end code blocks that are contained inside the body of a function, and the ‘.’ symbol is used to close top-level function definitions.

Yariv Sadan

The answer in a nutshell: To attain more language features, like message passing in Erlang’s case, the language incurs some overhead so efficiency gains must be made to compensate. It’s the same concept behind data structures: In order to know more about data, semantic restrictions can be imposed. In this case, the data is the program source itself and the imposed restriction is data immutability.

Erlang’s data immutability can be split into two levels: intra-process variable immutability and inter-process data isolation.

Why Doesn’t Erlang Have Shared Data?

Concurrency is obviously the justification for Erlang’s isolated memory model, with the alternative being locking to synchronize destructive shared data operations. Lock complexity makes scaling hard because locks aren’t composable and are unassociated with the shared data. If data isolation is enforced between concurrent components (what Erlang calls processes), composability can be maintained because destructive shared data modifications do not occur. Composability, or building instructions with instructions, is an inherent concept of programming and it’s why sequential programs are desirable.

This example excerpt from an older java.lang.StringBuffer class illustrates a common locking misconception. The author wanted to compose the length and getChars calls into one atomic operation to ensure critical data wouldn’t be modified between them. The author attempted to compose the calls by wrapping them in a synchronized method which locks this, not sb. Unfortunately, another thread could still modify sb in between the calls.

public final class StringBuffer {
    public synchronized StringBuffer append(StringBuffer sb) {
        int len = sb.length();
        ... // other threads may change sb.length(),
        ... // so len does not reflect the length of sb
        sb.getChars(0, len, value, count);
        ...
    }
    public synchronized int length() { ... }
    public synchronized void getChars(...) { ... }
    ...
}
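The same pitfall can be reproduced deterministically in Python. This is a sketch, not the JDK code: SyncBuffer, append_unsafe, and the between hook are hypothetical names, and the hook stands in for another thread being scheduled between the two calls.

```python
import threading


class SyncBuffer:
    """Hypothetical buffer whose methods each take the object's own lock."""

    def __init__(self, text=""):
        self._lock = threading.RLock()  # re-entrant, like a Java monitor
        self._chars = list(text)

    def length(self):
        with self._lock:
            return len(self._chars)

    def get_chars(self, start, end):
        with self._lock:
            return self._chars[start:end]

    def truncate(self, n):
        with self._lock:
            del self._chars[n:]

    def append_unsafe(self, other, between=lambda: None):
        with self._lock:  # locks self only, never other
            n = other.length()
            between()  # stands in for another thread running here
            chunk = other.get_chars(0, n)
            self._chars.extend(chunk)
            return n, len(chunk)


a = SyncBuffer("ab")
b = SyncBuffer("hello")
# Simulate another thread shrinking b between length() and get_chars().
expected, copied = a.append_unsafe(b, between=lambda: b.truncate(2))
```

Each call on b is individually synchronized, yet the pair is not atomic: the length read first is stale by the time the characters are copied. Python's slicing silently returns fewer characters, so the mismatch shows up only as expected and copied disagreeing.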

Software transactions and message passing are fundamentally better synchronization abstractions than locks because of, among other things, composability and accuracy, at the cost of greater overhead. STM can be built from message passing and I consider both to be acceptable replacements for lock-based synchronization.
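As a rough illustration of message passing replacing a lock, here is a Python sketch in which one thread owns a counter and every other thread can only send it messages. The names counter_actor and run_demo and the tuple message format are assumptions made for this example; Erlang processes and mailboxes are far lighter than Python threads and queues.

```python
import queue
import threading


def counter_actor(mailbox):
    """Own the count privately; change it only in response to messages."""
    count = 0
    while True:
        msg = mailbox.get()
        if msg[0] == "add":
            count += msg[1]
        elif msg[0] == "get":
            msg[1].put(count)  # reply on the caller's queue
        elif msg[0] == "stop":
            return


def run_demo(senders=4, per_sender=1000):
    mailbox = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox,))
    actor.start()

    def send_many():
        for _ in range(per_sender):
            mailbox.put(("add", 1))

    threads = [threading.Thread(target=send_many) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All "add" messages are queued ahead of this "get" (FIFO order).
    reply = queue.Queue()
    mailbox.put(("get", reply))
    total = reply.get()
    mailbox.put(("stop",))
    actor.join()
    return total
```

No sender ever touches count directly, so there is nothing to lock; the only shared object is the mailbox, which serializes the updates.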

Why Does Erlang Have Data Immutability?

It seems that inter-process data isolation would be enough for a clean synchronization scheme, yet this is not entirely true. The immutable shared data model can be made even more complete by also applying immutability at the local, sequential data level, which brings further benefits:

  • Immutability efficiency gains compensate for message passing overhead.

    Immutability of data can, in many cases, lead to execution efficiency in allowing the compiler to make assumptions that are unsafe in an imperative language.

    Wikipedia

    To reiterate what I said earlier, the restriction of data immutability allows the compiler to know more about the program and as a result, regain some efficiency.

  • Immutability simplifies type policies for message passing.

    In Scala, [an OOP language with message passing,] you can send between actors pointers to mutable objects. This is the classic recipe for race conditions, and it leaves you just where you started: having to ensure synchronized access to shared memory.

    Yariv Sadan

    To expand on the type policy issue a bit more, it’s simpler to make all data immutable, both local and shared. This reduces the need for special cases intended to disallow sending pointers, or data structures containing pointers, through message passing. This congruency keeps message passing at the core of Erlang’s semantic focus.

  • Immutability makes Garbage Collection simpler, faster, and soft real-time.

    The generational [Erlang] collector is simpler than in some languages, because there’s no way to have an older generation pointing to data in a younger generation (remember, you can’t destructively modify a list or tuple in Erlang).

    James Hague

    Again, more restrictions mean more opportunities for efficiency gains.
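The race-condition point in the second bullet can be shown with a small Python sketch. A queue.Queue stands in for an actor mailbox here, which is an assumption for illustration only; Erlang copies or shares immutable terms rather than passing references to mutable objects.

```python
import queue

mailbox = queue.Queue()

# Mutable message: the receiver gets a reference to the sender's list.
shared = [1, 2, 3]
mailbox.put(shared)
shared.append(4)  # the sender mutates the list after "sending" it
received_mutable = mailbox.get()

# Immutable message: a tuple cannot be changed once it has been sent.
frozen = (1, 2, 3)
mailbox.put(frozen)
received_immutable = mailbox.get()


def try_mutate(value):
    """Return True if in-place modification succeeds, False otherwise."""
    try:
        value[0] = 99
        return True
    except TypeError:
        return False


mutated = try_mutate(received_immutable)
```

The receiver of the list sees the sender's later append, which is exactly the shared-memory hazard message passing was meant to avoid. With only immutable values there is no such case to rule out.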

Downsides to Erlang’s Non-destructive Memory Model

Many common algorithms are designed with destructive memory in mind. As a result, Erlang doesn’t have some of the same types of data structures and libraries as imperative languages. This, combined with the confusion of learning a completely new programming paradigm, can be a large deterrent for learning functional languages. But what libraries Erlang lacks is largely made up for by an extensive actor based library platform.

Virtually all “big” Erlang libraries use Erlang’s features concurrency and fault tolerance. In the Erlang ecosystem, you can get web servers, database connection pools, XMPP servers, database servers, all of which use Erlang’s lightweight concurrency, fault tolerance, etc.
Yariv Sadan

Erlang is clearly a domain specific language, focusing on problems that can be solved with parallelism. Erlang’s libraries favor this same domain and Erlang holds its own against even C in the targeted many-thread/process arena. Note: the “thread-ring” test on the last line of the chart:

note the thread-ring test

An Erlang web server was also compared to Apache as another more practical comparison (again, intentionally a problem in Erlang’s domain). “Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.” So, Erlang may be hard to use because of its lack of conventional libraries or its different syntax, but as I’ve hopefully shown, the differences were chosen by design with good reason. Erlang is clearly one of the strongest languages in its parallelism domain and after a little over 20 years, is a very mature language.

Google Health Can Fix U.S. Healthcare

Current Healthcare Situation

Google Health

Healthcare in the U.S. is far from equilibrium, with hyper-inflated costs in response to insane insurance costs. This puts healthcare out of reach for most lower-class and an increasing number of middle-class families. Healthcare should be an inalienable right, but people avoid visiting the doctor because, to them, the impending financial burden is worse than the health risks. Still, the U.S.’s healthcare system is one of the best in the world in terms of quality and speed. The looming threat of being sued helps keep us safe, though it has grown so large from being unchecked that it’s too costly for both parties: the healthcare providers and the patients.

Insurance Issues

A free market is a market in which prices of goods and services are arranged completely by the mutual consent of sellers and buyers. By definition, in a free market environment buyers and sellers do not coerce or mislead each other nor are they coerced by a third party.
Wikipedia

Healthcare is another story. Insurance companies are the “third party” behind both sides, the buyers (patients) and the sellers (healthcare providers), and are not working for the “mutual consent of buyers and sellers.” Furthermore, most healthcare is not actually paid for by the patients directly, but instead by patients’ employers who are, by law, not allowed to know about the employees’ health issues. This dichotomy is what has enabled healthcare costs to grow so wildly out of control.

Google Health’s Fix

Eric Schmidt’s keynote speech at the Healthcare Information and Management Systems Society Conference directly addressed what Google sees as “the problem” with U.S. healthcare and how Google could help fix it. In short, put information in the hands of patients to enable informed buying decisions and to rebuild the “mutual consent.” Currently, when a patient needs a medical procedure, say a surgery, they don’t shop around for the best price. The patient goes where their insurance company tells them: where the insurance company has business affiliations. So informing patients will theoretically break the “employer/insurance payment” dichotomy, disarming the “third party coercers.”

Other benefits of Google Health are clear from a computer science standpoint. Technically it will serve three purposes:

  • Cloud storage — Let the IT professionals deal with data backup and retrieval instead of local medical staff.
  • Information channel — Directly connect patient information to patients and healthcare providers.
  • Privacy — Take sensitive information out of the clear-text paper world of office workers and into the human-free, automated, and encrypted world of cloud computing.

Privacy Issues

Clearly, potential Google Health users will have privacy concerns about Google obtaining access to medical records, as well as about having records available online. Still, IT specialization in general has some clear benefits over the paper-trail world:

  • Reduce human interaction — When paper information is being sent around, it passes through many hands in clear text. Most hands are simply staff members intended to help organize office work and should not have access to the sensitive content. Computerized health record retrieval and storage would minimize and automate the amount of human interaction needed, reducing the number of human mistakes as well as accidental (or intentional) security leaks. Catch Me If You Can illustrates the 1960s banking industry’s similar situation to today’s healthcare industry. Today, all banking is computerized and online. Paper trail security is a myth. For example, George Clooney recently had his hospital records leaked to the press and something like 27 employees were facing repercussions. There were probably many more people who had seen his records as well.
  • Google already knows everything about us — Between searching and emails, Google knows what its users are thinking. They are data mining experts and easily have the ability to learn about most users’ health issues already. There is no hiding from large corporations, and laws are in place to protect against intentional misuse of healthcare information.
  • Breaking into an office is easier than hacking into Google — Hacking requires a much higher level of knowledge and is many times more traceable than breaking into an office. If employers wanted access to medical records, it would be safer for them to stage a break-in physically rather than virtually. Local offices also store records in computers and probably give access to IT consultants as well. Google Health gives the IT side some accountability.
  • Companies are companies — Whether it’s a hospital or an IT company, it’s still private firms charged with keeping our data private.

The office paper trails will probably be around for quite some time, but Google is providing a layer beneath them: secure storage, access, and transfer for medical records. From a technical standpoint, our data only stands to be more secure, and the small risk of this being false is worth it for what we stand to gain: the free market that was intended by the private healthcare design. Informing patients means educated and mutual consent on healthcare costs.


Suspicious Design Similarities

Luke.Hoersten.org Logo

While reading a design blog about new site designs and typography in early April of 2008, I noticed a site called “Hell Yeah Dude” which bore a strikingly similar title and logo design to my personal web site, Luke.Hoersten.org. In particular, the lighter colored top level domain (.com/.org) in contrast to the darker title text, and the asterisk logo and placement with respect to the title. It turns out the site owner, Patrick Algrim, released the site on April 9th, 2008. I have been using the lighter colored top level domain title style since around December 6th, 2007 and released the asterisk logo around March 31st, 2008. That’s at least nine days before Patrick’s design was released.

Hell Yeah Dude Logo

I don’t mean to suggest that I’m the first to come up with lighter colored TLD with a darker title or the asterisk logo, and perhaps I’m being a bit presumptuous to even suggest that Patrick may have even seen my site, but it seems suspicious the way the two design elements have been used together in a fashion so close to mine. I’m sure it’s just coincidence but I wanted to be clear that I didn’t rip the design from Patrick.