It's a Virtual Piece of Cake

I've gotten a bunch of questions about how Visi, the simple web front end to Spark, works. This blog post is an overview.

Hosted Spark with a Simple Front End

Visi is a hosted Spark cluster with a simple web-based front end that allows Excel-savvy folks to enter formulas that get turned into Spark jobs.

The Spark cluster and front end are built on demand, hosted in Docker containers, and communicate over the network using Weave. The web UI is presented to the user via a dynamically updated HAProxy routing table.

Nuts and Bolts

Backing Visi is a farm of OS-level instances (I can't really call them boxes or hardware, but they are the computing machines that run Ubuntu). A certain number of machines in the farm sit "idling", and when the number of idle machines drops below a threshold, we fire up more instances. When too many instances are idle, we shut some down to get back to a sane idle level.
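As a rough sketch of that idle-pool rule (not the actual Visi code), here is the decision in Python. The function name and the three thresholds are made up for illustration; the real values and the cloud API calls that start or stop instances are not shown.

```python
def scaling_delta(idle: int, min_idle: int, max_idle: int, target_idle: int) -> int:
    """Return how many instances to start (positive) or stop (negative).

    All thresholds are hypothetical; the post only says that a low idle
    count triggers new instances and a high idle count triggers shutdowns.
    """
    if idle < min_idle or idle > max_idle:
        # Outside the acceptable band: move back to the target idle level.
        return target_idle - idle
    # Inside the band: leave the farm alone.
    return 0


if __name__ == "__main__":
    print(scaling_delta(idle=2, min_idle=5, max_idle=20, target_idle=10))
    print(scaling_delta(idle=30, min_idle=5, max_idle=20, target_idle=10))
```

Using a band (a low and a high threshold) rather than a single number keeps the farm from flapping between starting and stopping instances.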

Each of the farm instances is connected to a Weave virtual network. The front end is also connected to the Weave virtual network.

When a user wants a new Spark cluster and Visi web front end, the executor selects machines from the farm and fires up Docker containers which contain the Spark cluster and Visi web front end, assigning them a custom subnet within Weave. The Visi web front end and the HAProxy host are also put on a custom Weave subnet. This means that the Visi front end can talk to the HAProxy box (but not to other Visi front end instances) and to the Spark cluster. The Spark cluster can talk to other members of the same Spark cluster and to the Visi front end, but not to any other Spark cluster.
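One way to picture the subnet bookkeeping is a simple pool that hands out non-overlapping subnets per session. This is only a sketch using Python's standard ipaddress module; the SubnetPool class, the 10.2.0.0/16 range, and the /24 size are assumptions for illustration, and the step that actually attaches containers to Weave is left out.

```python
import ipaddress


class SubnetPool:
    """Hands out disjoint subnets carved from one larger network (hypothetical helper)."""

    def __init__(self, supernet: str, new_prefix: int):
        net = ipaddress.ip_network(supernet)
        # Pre-compute every subnet of the requested size.
        self._free = list(net.subnets(new_prefix=new_prefix))
        self._in_use = {}

    def allocate(self, session_id: str):
        """Reserve the next free subnet for a session."""
        subnet = self._free.pop(0)
        self._in_use[session_id] = subnet
        return subnet

    def release(self, session_id: str) -> None:
        """Return a session's subnet to the available pool."""
        self._free.append(self._in_use.pop(session_id))


if __name__ == "__main__":
    pool = SubnetPool("10.2.0.0/16", 24)  # assumed address range
    print(pool.allocate("alice"))
    print(pool.allocate("bob"))
```

Because every session gets its own slice of the address space, two sessions never share a subnet.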

When the web front end signals the executor that it has started and can accept requests, the executor adds a new "backend" to the HAProxy configuration, along with a front-end route based on a URL and the JSESSIONID of the browser (so only the logged-in user can see the route to the backend Visi instance).
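To make that concrete, here is a sketch of how the two configuration pieces could be generated. The URL scheme (/visi/<session id>), the naming convention, and the haproxy_route function are all assumptions; the HAProxy directives themselves (acl, path_beg, req.cook, use_backend, backend, server) are standard, but the real Visi configuration may look quite different.

```python
def haproxy_route(session_id: str, jsessionid: str, backend_addr: str):
    """Build HAProxy config text for one Visi instance (illustrative only).

    Returns (frontend_lines, backend_section). The first goes inside an
    existing "frontend" section; the second is a new "backend" section.
    """
    name = f"visi_{session_id}"
    frontend_lines = "\n".join([
        # Match the (assumed) URL prefix for this session.
        f"acl path_{name} path_beg /visi/{session_id}",
        # Match only the browser holding this JSESSIONID cookie.
        f"acl cookie_{name} req.cook(JSESSIONID) -m str {jsessionid}",
        # Both ACLs must match before the request is routed.
        f"use_backend {name} if path_{name} cookie_{name}",
    ])
    backend_section = "\n".join([
        f"backend {name}",
        f"    server {name}_1 {backend_addr} check",
    ])
    return frontend_lines, backend_section


if __name__ == "__main__":
    fe, be = haproxy_route("abc", "0123SESSION", "10.2.0.5:8080")
    print(fe)
    print(be)
```

Requiring both the path and the cookie to match is what keeps one user's route invisible to everyone else.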

End of Session

When the executor detects that the user's browser has disconnected from the Visi instance, the executor runs code that collects the current state of the Visi instance (including the user's current notebook status) and of the Spark instances, then pickles this information so it can be unpickled the next time the Visi notebook is requested.
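If you take "pickle" literally, the save and restore round trip looks something like this with Python's standard pickle module. This is only an illustration: the shape of the state dictionary is invented, and Visi's actual serialization mechanism may not be Python's pickle at all.

```python
import pickle


def snapshot(state: dict) -> bytes:
    """Serialize session state to bytes that can be stored anywhere."""
    return pickle.dumps(state)


def restore(blob: bytes) -> dict:
    """Rebuild session state from a stored snapshot.

    Only unpickle data you wrote yourself; pickle can execute code on load.
    """
    return pickle.loads(blob)


if __name__ == "__main__":
    # Hypothetical state layout, for illustration only.
    state = {
        "notebook": {"cells": ["sales = load('sales.csv')", "total = sum(sales)"]},
        "spark": {"cached_tables": ["sales"]},
    }
    blob = snapshot(state)
    print(restore(blob) == state)
```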

Once the pickled information is saved, the executor signals the Spark cluster and Visi instance Docker containers to shut down. The table containing the state of the farm instances and the HAProxy configuration are both updated. Finally, the executor returns the Weave subnet used for the Visi instance and Spark cluster to the available pool.

Isolation

The Docker containers represent isolated execution environments for the Visi front end and the Spark cluster. Both the Visi front end and Spark cluster are executing arbitrary, untrusted code. Docker provides a mechanism for sandboxing the code so one user's code cannot access another user's code or data.

Weave provides isolation at the network level such that each Spark cluster and Visi front end can only see each other, not the other instances running on the farm.
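A toy model makes the reachability rules easier to check. It assumes two containers can talk only if they share at least one subnet, and that each session has one subnet for its cluster and one shared with the HAProxy host. That layout is my reading of the setup described here, and all the names below are hypothetical.

```python
def can_talk(memberships: dict, a: str, b: str) -> bool:
    """Two containers can reach each other only if they share a subnet."""
    return bool(memberships[a] & memberships[b])


# Hypothetical subnet memberships for two user sessions, A and B.
MEMBERSHIPS = {
    "haproxy": {"proxy-a", "proxy-b"},
    "visi-a": {"proxy-a", "cluster-a"},
    "spark-a-1": {"cluster-a"},
    "spark-a-2": {"cluster-a"},
    "visi-b": {"proxy-b", "cluster-b"},
    "spark-b-1": {"cluster-b"},
}

if __name__ == "__main__":
    print(can_talk(MEMBERSHIPS, "visi-a", "haproxy"))
    print(can_talk(MEMBERSHIPS, "visi-a", "visi-b"))
    print(can_talk(MEMBERSHIPS, "spark-a-1", "spark-b-1"))
```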

Easy Networking