To me, the most exciting software is the type that sparks creative use cases. Software that ends up being used in ways that its developers never anticipated.
This tends to happen primarily with software that has managed to transform itself into a platform. Rather than pretending to cover every single use case under the sun, effort is put into making the product extensible by either users themselves or third parties.
This is not a novel concept. Offering an API so that others can programmatically poke at it has been a common practice for a solid few decades. Initially this happened at the operating system level (through interrupts and system calls), then progressively moved up the stack to REST APIs for SaaS products running in the cloud.
While offering public APIs has become standard, some products go beyond this, enabling even extension of the application’s user interface.
I myself became interested in the topic of extensibility about a decade ago when I worked on Cloud9 IDE. Integrated Development Environments operate in a virtually infinite problem space. They attempt to integrate your entire development environment. Therefore IDEs need to cater to different technology stacks, build tools, debugging tools, profilers, deployment tools, code analysis tools, communication tools, continuous deployment tools. The list goes on and on. There’s not a chance a single company could build an IDE that covers everything. Therefore all IDEs have an extension story. And indeed, so did Cloud9.
Obviously, having to host and run extension code in our infrastructure, we could not allow users to run their own extensions this way. This would be a huge security risk. As a result, we always had to carefully curate extensions. Either we wrote them internally, or we carefully audited third-party ones before allowing them to run in our infrastructure, with varying levels of success. It was not uncommon for extensions to unintentionally break various pieces of UI, or to result in significant performance issues on the server. We never really found a great solution to this problem.
While code editors have a long tradition of extensibility (think Emacs and vim), these days the ability to turn products into platforms is becoming more and more widespread.
To summarize, running extension code alongside the rest of the application is a flexible and cheap-to-implement model, but it is challenging for a few reasons:
- Security: if you control the environment in which extension code runs and you trust the extension code itself, all is fine. If not, allowing arbitrary code execution in a hosted application, or even in a browser tab, can be dangerous.
- Fragility: even if you create and carefully document extension points to hook into, it’s easy to break things even unintentionally. A CSS style may apply in surprising ways, a global variable may override something unexpected. If you allow server-side execution of extension code, this code may crash and take the server process down with it. It’s also very likely extension code will rely (without your knowledge) on undocumented APIs and break once those are changed or removed.
- Portability: it is a multi-device world we live in. While practically all devices have a web browser these days, many people still prefer a native experience. How do you write extensions that do not rely on the internals of a particular client and run anywhere? On the server side there is a similar challenge. You may want to run your code on a massive AWS-scale cloud, or on the Raspberry Pi in your closet. How do you write code that abstracts from that?
- Hot reloading and unloading: if you load code into your main process, it’s often hard to undo. This can be a hassle during development, often requiring a restart of the server or client between updates.
A few years ago, I started to work on a hobby code editor project called Zed, which I ultimately abandoned after powerhouses like Microsoft entered the scene with VS Code. One of the things I wanted to try to address with Zed was some of these extension-related challenges.
In Zed, I took a different approach than what most applications do today: rather than allowing extensions to have access to everything, I ran them in a pretty confined sandbox.
Sandbox all the things!
Zed ran as a Chrome App (once positioned as the “future of the desktop app” by Google, but since largely decommissioned). Chrome apps offered a sandbox feature, somewhat akin to running code in a sandboxed iframe today. Extension code would be evaled in this sandbox. The code would not have access to the DOM or anything of significance. Every interaction with the main application had to happen through carefully curated APIs (that worked through postMessage’ing JSON objects across the sandbox boundary).
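To make the message-passing idea concrete, here is a minimal TypeScript sketch of such a bridge. It is not Zed’s actual code: the message shapes and the names createHost and createSandboxClient are made up for illustration, and the transport is a plain synchronous function so the sketch runs anywhere, whereas a real postMessage channel is asynchronous.

```typescript
// Hypothetical message shapes; the real protocol is not shown in this post.
type BridgeRequest = { id: number; api: string; args: unknown[] };
type BridgeResponse = { id: number; result?: unknown; error?: string };

// Host side: only the functions listed here are reachable from the sandbox.
type HostApi = Record<string, (...args: unknown[]) => unknown>;

function createHost(api: HostApi): (rawRequest: string) => string {
  // Takes a JSON string (as it would arrive across the boundary), returns one.
  return (rawRequest: string): string => {
    const req = JSON.parse(rawRequest) as BridgeRequest;
    // Own-property check so inherited names like "constructor" are not callable.
    const fn = Object.prototype.hasOwnProperty.call(api, req.api)
      ? api[req.api]
      : undefined;
    let res: BridgeResponse;
    if (!fn) {
      res = { id: req.id, error: `Unknown API: ${req.api}` };
    } else {
      try {
        res = { id: req.id, result: fn(...req.args) };
      } catch (e) {
        res = { id: req.id, error: String(e) };
      }
    }
    return JSON.stringify(res);
  };
}

// Sandbox side: a stub that turns function calls into JSON messages.
function createSandboxClient(
  send: (rawRequest: string) => string
): (api: string, ...args: unknown[]) => unknown {
  let nextId = 1;
  return (api: string, ...args: unknown[]): unknown => {
    const raw = send(JSON.stringify({ id: nextId++, api, args }));
    const res = JSON.parse(raw) as BridgeResponse;
    if (res.error !== undefined) throw new Error(res.error);
    return res.result;
  };
}
```

Wiring the two together is a matter of passing the host function to createSandboxClient. Anything the host did not list simply does not exist from the extension’s point of view, which is the whole point of the design.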
Here is an example of a basic “Git blame” extension in Zed. The zed/session packages proxied requests to the main application. Yes, in this case this still exposed the ability to run arbitrary shell commands, but at least the hosting application (Zed in this case) had control over what the extension could and could not do.
Besides safety and isolation, having each extension run in its own sandbox had another nice side benefit: simply reloading this sandbox and loading the new extension code resulted in a very fast and safe hot reload workflow without having to restart the editor itself.
This approach trades flexibility for improved security and reduced fragility. By default, extension code can do nothing more than spin up the CPU and hog some memory. Everything it can do has to be explicitly designed upfront in two directions:
- Outside in: how is code in the sandbox triggered by your application?
- Inside out: how does code in the sandbox talk back to the application?
Do you want extensions to add items to your context menu? You need to offer a way to define those menu items, and specify what code it should invoke once selected. The good thing is you can now make sure these definitions make sense for all your clients, e.g. appear in a right-click context menu on desktop while appearing after a long tap in a mobile app.
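One way to picture the “outside in” direction is a declarative contribution that each client renders in its own way. The following TypeScript sketch assumes a made-up manifest shape (MenuItemContribution) and a made-up renderMenuItems function; it only illustrates the idea of separating what an extension declares from how a client presents it.

```typescript
// Hypothetical manifest entry: the extension declares what it contributes.
interface MenuItemContribution {
  label: string;
  command: string; // name of the sandboxed function to invoke when selected
}

type ClientKind = "desktop" | "mobile";

// What a concrete client ends up showing.
interface RenderedMenuItem {
  label: string;
  trigger: "right-click" | "long-press";
  command: string;
}

// Each client decides how the same declaration is surfaced to the user.
function renderMenuItems(
  items: MenuItemContribution[],
  client: ClientKind
): RenderedMenuItem[] {
  const trigger: RenderedMenuItem["trigger"] =
    client === "desktop" ? "right-click" : "long-press";
  return items.map(
    (item): RenderedMenuItem => ({
      label: item.label,
      trigger,
      command: item.command,
    })
  );
}
```

Because the extension never touches the menu itself, the same declaration can be reused by a client that did not exist when the extension was written.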
Conversely, does your extension code need access to the file system to do its job? You need to expose this ability through sandbox APIs. The opportunity here is that you can explicitly specify the paths the extension has access to, without allowing everything that Node.js’s fs module allows.
Pretty cool. If I may say so myself. I posted a few videos on YouTube pitching the concept.
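As a rough illustration of path scoping, the sketch below checks whether a requested path falls inside one of the directories granted to an extension. The function names are hypothetical, paths are assumed to be absolute POSIX-style strings, and symlinks are not considered, so treat it as a starting point rather than a hardened check.

```typescript
// Split a path into segments, resolving "." and ".." without touching disk.
function normalizeSegments(path: string): string[] {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop(); // never climbs above the root
    else out.push(part);
  }
  return out;
}

// True if `requested` is one of the granted roots or lies beneath one.
function isPathAllowed(requested: string, allowedRoots: string[]): boolean {
  const target = normalizeSegments(requested);
  return allowedRoots.some((root) => {
    const rootParts = normalizeSegments(root);
    return (
      rootParts.length <= target.length &&
      rootParts.every((segment, i) => target[i] === segment)
    );
  });
}
```

Comparing whole segments rather than string prefixes matters here: a grant for /notes should not accidentally cover /notes-private.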
While I did not connect the dots immediately, a while ago I finally did.
Three common aspects kept recurring:
- The careful design of hooks provided by the platform: how are functions going to be triggered? How do we configure these extension points? This could be a generic event bus, where extensions specify the events to listen to. They could also be highly application- and platform-specific, like the ability to add extra items to a context menu and trigger functions as a result.
- The sandbox APIs: how does extension code interact with its environment? What APIs do you provide to allow it to do useful things, without giving it free rein over everything?
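The generic event bus mentioned above could look roughly like this in TypeScript. The EventBus class and its method names are assumptions made for this sketch. Handlers are registered per extension so that a misbehaving extension can be identified and unloaded as a unit.

```typescript
type Handler = (payload: unknown) => void;

class EventBus {
  private listeners = new Map<
    string,
    { extensionId: string; handler: Handler }[]
  >();

  // An extension declares which event it wants to listen to.
  on(extensionId: string, event: string, handler: Handler): void {
    const list = this.listeners.get(event) ?? [];
    list.push({ extensionId, handler });
    this.listeners.set(event, list);
  }

  // Invokes every handler; returns the ids of extensions whose handler threw,
  // so one failing extension does not stop the others.
  dispatch(event: string, payload: unknown): string[] {
    const failed: string[] = [];
    for (const { extensionId, handler } of this.listeners.get(event) ?? []) {
      try {
        handler(payload);
      } catch {
        failed.push(extensionId);
      }
    }
    return failed;
  }

  // Removes all handlers that belong to one extension.
  unload(extensionId: string): void {
    this.listeners.forEach((list, event) => {
      this.listeners.set(
        event,
        list.filter((entry) => entry.extensionId !== extensionId)
      );
    });
  }
}
```

In a sandboxed setup the handler would not be a direct function reference but a message sent into the sandbox; the bookkeeping on the host side stays the same.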
What keeps fascinating me is the portability potential of functions. In this model, the same code could run on anything from a giant cloud server all the way down to your mobile phone. How could that be leveraged? When AWS Lambda was conceived, its only relevant sandbox was the cloud: your code would run in AWS’s massive (spoiler alert) server-powered infrastructure. More recently they launched Lambda@Edge, where your Lambda code can run closer to the customer, at the CDN level, which could be a few miles from your house. However, what about the actual client? Why not let code be run on the user’s device itself? It doesn’t get much edgier than that.
Potentially, this “where should code execute” question can be answered dynamically. In some cases code may benefit from the low latency of running locally, especially when reacting to user interaction. When you type a character into a textbox, you wouldn’t necessarily want each key press to be sent to the cloud to get autocomplete results, for instance. In others it may be better to move it into the cloud when large streams of data need to be processed. In some cases privacy may be a bigger concern, so you’d still want that processing to happen on device. Wouldn’t it be great if your code didn’t have to worry much about that and it would just figure it out?
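A toy version of such a placement decision might look like the TypeScript below. Everything in it is an assumption made for illustration: the FunctionProfile fields, the size threshold, the priority order (privacy first, then data volume, then latency) and the default of running in the cloud.

```typescript
type Placement = "device" | "cloud";

// Hypothetical description of what a function needs.
interface FunctionProfile {
  latencySensitive: boolean; // e.g. runs on every key press
  privacySensitive: boolean; // data should not leave the device
  inputSizeBytes: number; // rough size of the data to process
}

// Arbitrary illustrative threshold for "too much data for the device".
const LARGE_INPUT_BYTES = 50 * 1024 * 1024;

function choosePlacement(profile: FunctionProfile): Placement {
  // Privacy wins over everything else in this sketch.
  if (profile.privacySensitive) return "device";
  // Large inputs are assumed to be cheaper to process in the cloud.
  if (profile.inputSizeBytes >= LARGE_INPUT_BYTES) return "cloud";
  // Interactive work stays close to the user.
  if (profile.latencySensitive) return "device";
  return "cloud";
}
```

A real system would likely weigh these factors against each other (and against battery, connectivity and cost) rather than applying a fixed order, but the sketch shows that the decision can live outside the function’s own code.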
I’ve been thinking about this problem a fair amount over the last few months. And to make progress on it I decided to, rather than put pen to paper, put keyboard to code editor.
Indeed, I’ve been writing code in my spare time again. Yes, this is also why it’s been quiet here at Zef+. I was writing TypeScript instead of prose.
However, I paused the project for some time when I realized I didn’t have a compelling use case. What application would actually benefit from the ability to freely move code execution from the browser to the server and back again (which is where I saw a lot of the potential back then)? 🤷
Then, in my journey to find the ultimate writing and note taking app I discovered Obsidian. Obsidian is a knowledge management and note taking application. Obsidian runs as a locally installed desktop (and recently mobile) application. There’s a lot to like about Obsidian. For instance, rather than relying on some proprietary file format, it stores all its notes as markdown files on disk. I like markdown.
However, its true power comes from its very rich ecosystem of extensions. There are extensions that allow you to query your note collection as a database, for instance. Or to render your Markdown note as a Kanban board. Or to pull your Kindle highlights into your notes. That’s pretty damn cool!
I spent a few hours trying to build a Ghost publishing extension for Obsidian, during which I figured out how this all worked under the hood. It meant restarting the UI each time to check if my code worked. Frustrated, I checked GitHub to find the Obsidian source code, but then realized it wasn’t actually open source. So I thought… opportunity!
The buried lede
The result is the application I’m using right now to write this very article. Half this post I wrote at my desk on my laptop. Half of it in the garden on my iPad. I will also be publishing it to Zef+ via its Ghost extension, which, indeed — I ultimately implemented for my new app instead of Obsidian.
I use it to take notes during my 1-1s. It’s the tool I use to track todos. And to follow my team’s activity on GitHub.
And if you ever visit me at my house and push that mysterious button in the kitchen, which toggles (short press) or switches (long press) Internet radio stations streamed to the HomePods in our main living area: yeah, that’s powered by it too.
And those on-call schedules that are pulled from OpsGenie and posted to our Mattermost channel every week. Yeah, they are compiled and posted by it as well.
The most exciting type of software is the type that sparks creative use cases. Software that ends up being used in ways that its developers never could have anticipated. Including me.
I’d tell you more about this application, but look at the time! We’re already over 2800 words in, so more on that another time.