What's the Future of Spreadsheets? November 17, 2019
28 Years Ago
On November 17th, 1991, I started coding a spreadsheet for NeXT called Mesa.
I was a 27 year old, law school graduate and I wanted to write some cool software. I started a company, Athena Design, to publish Mesa.
Well, technically, I founded Athena Design to create an application store for NeXT software because in 1989, the only way to distribute software for NeXT boxes was via optical media and the media itself cost more than $100 per disc... but I digress.
Mesa was the first spreadsheet to accept data from external feeds, re-compute, and potentially trigger external events. This was the stuff of automated trading... and the formulas for automated trading could be updated in minutes.
Over the Years
Over the years, spreadsheets were my prime focus.
I did an OS/2 version of Mesa.
I did a Java-powered, multi-user, browser-based spreadsheet called Integer.
I've dabbled in trying to bring the power of programming to non-self-identified programmers.
Over the last 8 years, I've dabbled with a couple of incarnations of stuff I called "Visi" including an iPad version demonstrated at the emerging languages conference and a Spark/notebook front end.
This summer, I grabbed a copy of Mesa for OS/X from the kind folks at P&L Systems. The P&L folks kept Mesa for NextStep alive as an OS/X app for many years until OS/X stopped supporting 32 bit mode. I was looking to update the code to 64 bit mode... and given the number of "elegant engineering solutions" I did a lot back in 1991 to save memory (Mesa ran on boxes with 8MB of RAM... than 8MB was shared across the OS, windowing system, and other apps... so the reality was the spreadsheet ran in less than 1MB... every byte counted).
Digging through the code, I realized that it was likely not the best thing to do yet another spreadsheet for a platform that has good spreadsheets... but I started thinking...
I mixed in some thinking from Stephen O’Grady about how he uses R) to do analysis. And that triggered some ideas about taking spreadsheet formulas, turning them into R formulas that had the same semantic meaning.
Anyway, for the last bunch of weeks, I've been thinking...
The first thing I did was to think about how spreadsheets mostly exist today... the rigidity of the row/column paradigm is not just a presentation layer, but pervades the underlying implementations:
Since the 90s, not that much has changed in spreadsheets.
Yeah, pivot tables... but the spreadsheets are a still the row/column things defined by VisiCalc.
What is enduring?
I think the two enduring things about spreadsheets are:
- The grid presentation is a great way for humans to consume information.
- The simple mechanism that folks use to define the relationship between/among cells is an insanely powerful mechanism for people to do what they do best: understand things in context.
- Causality is deeply ingrained in spreadsheets... changing a cell causes the spreadsheet to recalculate... there is a simple, causal mechanism for exploring... playing "What If?"
Grids and Relationships
So... what if the spreadsheet was represented to the end user as a grid and the end user can still define the relationships among cells... but the internal representation of the data was not rigid?
In the above image, I have a bi-directional relationship between the cells and a database. Changes in cells are written back to the database and new/updated data in the database is sent to the spreadsheet... and, yes, I'm sure there are Excel plugins that do this.
But this is the start of thinking about a bi-directional projection of data.
The data may go through multiple projections.
Projections may combine multiple inputs.
Ultimately, the data is presented to a user in a grid.
Bi-directional projections all data to be both viewed and entered. Changes in the sources are reflected in the projections (this is a generalization of real-time data feeds and database queries).
This deals with data, but what about formulas?
Dealing with formulas
Here is the beginning of my thinking about formulas:
Basically, users get "formula in cell" just like Excel and unlike Improv.
However, based on automated analysis of the data, the formulas can be generalized from one cell to many. This could be offered as an option to the user.
And in ranges that grow or shrink (e.g., data from a database), the formulas automatically adjust. Mesa has "auto-grow ranges" that did just this... but the ranges were mostly a hack on the underlying grid rather than a first-class part of the implementation.
Another interesting attribute of generalizations of the formulas is that once a semantic representation can be built of the formula, the formula can be cross-compiled to R or Spark or some other system.
Thus, not only does the spreadsheet become a front-end for viewing data, but it also becomes a front-end to R or Spark or other systems.
That's nice... but...
So... I've ranted a lot about how I hate annotations in languages like Java and Python. I rant because annotations are another language glommed onto the base language... but the annotation language has no real structure or rules... it's the string-typed key/value pairs of the programming world.
"What does this have to do with spreadsheets?"
The projections I was talking about above should, themselves, be expressed with spreadsheet formulas.
This means that spreadsheet formulas have to expand a bit... expand to support most of the set operations that SQL supports. With set-focused functions, it strikes me that the projections can be defined with formula functions. With the projections, all else will follow.
Where from here?
These are some musings. I have a day job. But, spreadsheets are my passion and in my blood.
So... maybe there'll be ways we all can talk about these notions and turn them into a functional open source project.
Happy 28th anniversary, Mesa!