Personal Computation

Discover Supercomputer 2 by nasa_goddard on flickr

During the 1980s and 1990s, a powerful revolution took place in computing hardware. Programs had historically been executed on mainframe computers, fed commands manually at first and then from terminals. As computing power increased and chip prices fell, terminals went from thin clients that logged in and executed programs remotely to full-fledged computing devices. Apple and Microsoft notably capitalized on this revolution. Bill Gates’ dream was to put “a computer on every desk and in every home.”

Moving computation on-site gave users unprecedented power. Even if Internet connectivity had not been prohibitively slow, mainframes would have struggled to compete with the responsiveness and personalization of an on-site computer.

This selective retelling and framing of history is intended to form the foundation of a present-day analogy. The personal computer revolution of the 20th century is similar to a revolution I would like to see take place in personal computation.
Personal Computation

We can define personal computation as a system that uses data and algorithms to provide a highly personalized experience to a single user. Compared to a massive, centralized, mainframe-like system, a personal computation system would take a different approach to hardware, software, and data.

Hardware

Through virtualization and containerization, hardware is becoming less directly tied to computation. While there are other configurations, let’s compare two: infrastructure you don’t own or have access to (e.g. SaaS/cloud services); and infrastructure where nobody has more control or access than you (e.g. a web host or IaaS).

“There is no cloud. It’s just someone else’s computer.”

Running a “personal cloud” on a web host (or Infrastructure as a Service provider) is still technically using someone else’s computer, but gives you different levels of control, access, privacy, and security. Notably, if you rent a computing resource from a utility, they should not have access to your data and should not be able to dictate what software you can or cannot run on that resource.

Note that this is a comparison between two cloud-based infrastructures rather than between cloud and local hardware: to preserve battery life on our mobile devices, we will ultimately need to perform many kinds of computation on a remote service anyway.
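
To make that trade-off concrete, offloading work to infrastructure you control can be as simple as running a small HTTP service on a rented machine. Below is a minimal sketch in Python, assuming Flask is installed; the endpoint and the speech-recognition stub are hypothetical stand-ins for a real workload:

```python
# server.py: runs on a rented VM that only you control (hypothetical example).
# A mobile client POSTs work here instead of sending it to a 3rd-party service.
from flask import Flask, jsonify, request

app = Flask(__name__)

def run_speech_model(audio: bytes) -> str:
    # Stub standing in for a battery-hungry workload (e.g. speech recognition)
    # that the device offloads to hardware it trusts.
    return f"received {len(audio)} bytes of audio"

@app.route("/transcribe", methods=["POST"])
def transcribe():
    text = run_speech_model(request.get_data())
    return jsonify({"text": text})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```

The same pattern extends to any workload that is too heavy for a phone but too private for a 3rd-party service.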

Software

Cloud providers perform personalized computation by developing complex algorithms and running them against the vast amounts of data they collect on you, your friends, and people who resemble you.

Algorithm: a set of rules for calculating some sort of output based on input data.
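
For example, a drastically simplified feed-ranking algorithm might look like the following sketch in Python; the rules and weights are invented purely to illustrate the definition:

```python
from datetime import datetime, timezone

def score_item(item: dict, close_friends: set) -> float:
    """Toy feed-ranking rules: input data in, one number out."""
    age_hours = (datetime.now(timezone.utc) - item["posted_at"]).total_seconds() / 3600
    score = 1.0 / (1.0 + age_hours)   # newer posts score higher
    if item["author"] in close_friends:
        score *= 2.0                  # boost authors you care about
    return score

posts = [
    {"author": "alice", "posted_at": datetime(2015, 6, 1, 12, 0, tzinfo=timezone.utc)},
    {"author": "bob",   "posted_at": datetime(2015, 6, 1, 9, 0, tzinfo=timezone.utc)},
]
ranked = sorted(posts, key=lambda p: score_item(p, {"alice"}), reverse=True)
```

The real ranking models behind these products are vastly more sophisticated, but they are still, at bottom, rules computing outputs from inputs.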

Facebook’s News Feed, Google Now, and Apple’s Siri give users a lot of functionality, but the algorithms only work within each service provider’s walled garden. You can’t use Facebook’s News Feed to surface important content your friends posted outside of Facebook (unless they also posted it to Facebook). You can’t use Google Now to automatically remind you of upcoming travel, appointments, and bills if you don’t use Gmail. You can’t use Siri’s speech recognition outside of iOS or OS X. Nor can you add new functionality or features to any of these services unless their engineers build them or the provider offers a specific mechanism for 3rd-party extensions or apps.

Currently, these service providers’ state-of-the-art algorithms are far more advanced and better integrated than anything you can install and run on your own. However, there is no technical reason you can’t get most of what they provide without connecting to a company’s centralized service. In fact, if you had access to and control of these algorithms, you could apply them to data that exists outside of a walled garden or to personal data you don’t want to share with a 3rd party.
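
As a sketch of what that could look like, the snippet below uses off-the-shelf open-source tools (scikit-learn is assumed to be installed) to surface relevant content from messages stored on your own machine; the file layout and interest profile are hypothetical:

```python
# Surface "important" content from data that never leaves your machine.
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [p.read_text() for p in Path.home().glob("messages/*.txt")]
profile = "flight hotel itinerary booking confirmation"

matrix = TfidfVectorizer(stop_words="english").fit_transform(docs + [profile])

# Score every document against the interest profile (the last row).
scores = cosine_similarity(matrix[:-1], matrix[-1]).ravel()
top_five = sorted(zip(scores, docs), reverse=True)[:5]
```

The model here is trivial, but the point stands: the same class of techniques the walled gardens use is freely available to run against data they never see.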

That is to say, despite all the amazing machine learning algorithms deployed by centralized service providers, they’re only able to scratch the surface of what would be possible with software outside of a walled garden.

Data

Today, in order to get highly personalized computation, people give up access to a vast array of highly personal information: where they are at all times, who they talk to and when, what they say to their friends and family, and what they like to eat, wear, watch, listen to, and do. This information is valuable enough to advertisers that ad-funded publishers can profitably pour money into servers and software development while giving the computation away for free.

As we’ve discovered during the last decade, people in general are not at all reluctant to give up privacy in return for being able to do basic things like talk to friends online or get driving directions.

If you told everyone it’s not necessary to completely surrender their privacy to a host of 3rd-party companies, would they care? It seems like only a minuscule fraction of people would. Ultimately, if there is a limit to how much privacy people are willing to give up, that limit would probably (hopefully) be different for a 3rd-party company than for a system only the user has access to. Given more comprehensive data about the user, a personal cloud has the potential to deliver even deeper personalization.

The Alternative to the Mainframe

Given the striking resemblance of today’s web services to mainframes, could there be a revolution on the horizon similar to that of the personal computer? There is no shortage of personal cloud projects or decentralized services. Sadly, many of them attempt to be open source clones of existing centralized services, which is akin to installing a mainframe in your living room and calling it a PC.

What would the software equivalent of the personal computer look like? What if we could run whatever algorithm we want on whatever data we want? What if we owned and controlled a system that had no ulterior motives, could not be acqui-hired, and would never prevent us from escaping a walled garden?

Existing services, like those provided by Google, Apple, Facebook, and so on, set a high bar for ease of use, user experience, seamless integration, and powerful yet simple-looking (“magic”) functionality. But at the end of the day, it is just software, and there is no technical reason why it needs to run in a 3rd party’s data center.

Servers and storage are getting cheaper, software is easier than ever to distribute, programming languages and frameworks are enabling ever more sophisticated and extensible applications, and complex systems are becoming easier to reproduce thanks to containerization.
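
Containerization in particular makes the “complex systems” part tractable. As one illustration, an entire self-hostable application can be brought up with a few lines against the Docker API. Here is a sketch using the Docker SDK for Python, assuming the Docker daemon is running; the image choice and host paths are examples, not recommendations:

```python
# Reproduce a self-hosted service from a published image in one call.
import docker

client = docker.from_env()
container = client.containers.run(
    "owncloud",                   # example self-hostable application image
    detach=True,
    ports={"80/tcp": 8080},       # expose the web UI on localhost:8080
    volumes={"/srv/cloud/data": {"bind": "/var/www/html", "mode": "rw"}},
    restart_policy={"Name": "unless-stopped"},
)
print(container.short_id)
```

Because the image pins the whole software stack, the same service can be torn down and reproduced on any machine that runs Docker.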

I’ve been thinking about and working on a rough proof of concept of this for a few years, and I quit my job a couple of months ago to work on it full-time. It’s certainly not as easy or simple as it sounds, but I’ve come up with some interesting strategies and solutions. Over the next few weeks, I plan to write about some of these concepts, as I’m stretching the limits of what I can accomplish while working in a vacuum.