Monkey-patching the Express router
I maintain Scribe, a library for automatically generating your HTTP API documentation from your codebase. When you run the generate command, Scribe fetches your routes from your codebase, uses a bunch of strategies to figure out details about them, and transforms those details into an HTML docs page. There are a lot of moving parts and complexity in this process, but this post is about the first part: how Scribe is able to fetch the routes from your app.
In the Laravel version, this process is hella easy: just call Illuminate\Support\Facades\Route::getRoutes(), and you get a collection of Route objects, with each object containing the route details, including the path, HTTP methods, and handler (controller/method). It's a similar thing for the AdonisJS version.
The Express version is another story, though. Express doesn't provide a simple method you can call on the app to fetch all defined routes. This means it's time for a hobby of mine: monkey-patching. When our generate command is run, we'll dynamically modify the Express code so that it gives us what we want.
First attempt: decorating
My first attempt had a couple of moving parts:
- a "decorator" function that would take the Express app object and modify its methods (app.get(...), app.post(...), etc) so that they recorded where they were called. This part was important so I could fetch any docblocks there later on. For the decorator to be useful, it had to be called before the user started registering routes.
- a getRoutes function that would receive the app object and try to fetch the user's routes from the Express "API".
Here's what the end user's code would look like:
// app.js
const express = require('express');
const app = express();
// Add this 👇 (the decorator)
require('@knuckleswtf/scribe-express')(app);
// Register your routes...
app.get(...);
// Add this 👇 (needed for getRoutes)
module.exports = app;
The user would run the Scribe command like this: npx scribe generate --app app.js, and it basically did this:
const appFile = args.app;
const app = require(appFile); // (1)
const routes = getRoutes(app); // (2)
Line (1) would execute the app file, thereby running the decorator and registering the routes. Then line (2) would pass the exported app object to our getRoutes function, which would try to fetch the routes from it.
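To make the "recorded where they were called" part concrete, here's a rough sketch of how a decorator like that could work. This is not Scribe's actual code: decorateApp, fakeApp and the shape of the recorded entries are made up for illustration, and the stack-frame indexing assumes V8-style stack traces (which is what Node.js produces).

```javascript
// Hypothetical sketch: wrap the route-registering methods on an app-like object
// so that each call also records the stack frame it was made from.
function decorateApp(app, recorded = []) {
  const httpMethods = ['get', 'post', 'put', 'patch', 'delete'];
  for (const method of httpMethods) {
    const original = app[method];
    if (typeof original !== 'function') continue;

    app[method] = function (...args) {
      // In a V8 stack trace, line 0 is the error message, line 1 is this
      // wrapper, and line 2 is the code that called app.get(...) etc.
      const callerFrame = (new Error().stack || '').split('\n')[2] || '';
      recorded.push({ method, path: args[0], calledFrom: callerFrame.trim() });
      return original.apply(this, args);
    };
  }
  return recorded;
}

// Stand-in for an Express app, so the sketch runs without Express installed
const fakeApp = {
  get() { return this; },
  post() { return this; },
};

const recorded = decorateApp(fakeApp);
fakeApp.get('/users', () => {});
// recorded now holds one entry with the method, the path, and the caller's frame
```

A real version would also have to deal with quirks like app.get('some setting'), which in Express reads a setting rather than registering a route.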
Assessment
There are three main metrics I use to judge the success of a process like this: how easy it is for a user to opt in, how easy it is to implement, and how reliable it is. How did this perform?
User integration: not great
There were too many steps and things to remember:
- The user had to remember to add require('@knuckleswtf/scribe-express')(app).
- They had to remember to add that line after creating the app, but before registering any routes.
- They had to make sure they exported the app object from the file (many people don't, because it's not needed for anything).
- On top of all this, they had to run the generate command.
That's too many points of failure for me; too many opportunities for users to make mistakes. It's not awful, but not ideal. In an ideal monkey-patch, the user should only have to make a very small modification to their code, or none at all.
Ease of implementation: poor
I used quotes earlier when talking about Express' API, because Express doesn't have one. Its internals are a complicated mess. It works, and it does many clever things, but it's unnecessarily complex. You have all sorts of objects all over the place—routers, layers, stacks, routes, handles, and whatnot. Routes can have layers, and layers can have routes, and a layer can be a route, or some other shit. And mixed right in there with routes are middleware like authentication, so you need to filter those out too. Express isn't made for developers to hook into that way.
Figuring out how to extract the actual routes from the tangle of layers and stacks took several days. And by the time I was done, I never wanted to touch the thing again.
Reliability: poor
On the reliability front, this was also pretty poor. Because Express' internals were so complicated to figure out, a lot of what I did was based on assumptions, guesses and workarounds.
For instance, Express doesn't retain the original path you set when you create a route (like app.post('/path', handler)). Instead, it converts the path to a regex, like /^\/?path\/?$/. This means that, when fetching the routes, I had to convert the regexes back to strings, which works, but I don't know if it's guaranteed to always give exactly what the user wrote.
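To give a feel for how hacky that conversion is, here's a minimal sketch. regexToPath is a hypothetical helper written for this post, not the code Scribe used, and it only handles plain static paths shaped like the example above. Route parameters, wildcards and custom patterns would all need extra guesswork.

```javascript
// Hypothetical helper: turn a simple Express-style path regex back into a string.
// Only handles static paths; anything fancier is out of scope for this sketch.
function regexToPath(regex) {
  let source = regex.source; // e.g. "^\/?path\/?$"

  // Drop the start and end anchors
  source = source.replace(/^\^/, '').replace(/\$$/, '');
  // Drop the optional trailing slash ("\/?")
  source = source.replace(/\\\/\?$/, '');
  // Treat an optional leading slash ("\/?") as a required one ("\/")
  source = source.replace(/^\\\/\?/, '\\/');
  // Unescape the remaining slashes
  return source.replace(/\\\//g, '/');
}

const recovered = regexToPath(/^\/?path\/?$/);
```

Notice that the regex alone can't tell you whether the user originally wrote '/path', 'path' or '/path/', which is exactly the kind of information loss that made this approach feel shaky.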
There was also the fact that people can register routes in Express in a bunch of different ways: you can have multiple apps, routers, sub-routers, sub-apps, and so on, and Express doesn't combine them into one simple data structure. Since my decorator implementation was tied to a specific app instance, it made these other options really difficult to get right.
Seriously, this experience was the final thing that convinced me people need to stop using Express. It was a key part of Node.js' development, but I don't buy this middleware approach that leaves you having to implement everything yourself, and doesn't expose a proper API. 😕
A better solution
My first attempt was more of "decorating" than monkey-patching: it only added some specific behaviour to the app object that was passed to it. But it would be better if we had true monkey-patching: attaching this behaviour automatically to any instance of express(), even those created in a different file. There's good news: in Node.js, there are defined APIs to do that, by hooking into the module system.
For instance, if I overwrite the cached entry for express in require.cache (docs), when the user calls require('express'), they'll get my new version. This means I can give them a modified express, so that when they call express(), the app object they get is already decorated. No need to pass in your app manually to the decorator anymore.
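Here's a small, self-contained sketch of that require.cache trick. To keep it runnable without Express installed, it writes a throwaway module to a temp directory and patches that instead; fake-express.js and patchedFactory are made-up names. One detail worth knowing: require.cache is keyed by the resolved filename of a module, so you look the entry up with require.resolve() rather than by the bare package name.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// A throwaway module standing in for Express, so the sketch runs anywhere
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const modulePath = path.join(dir, 'fake-express.js');
fs.writeFileSync(modulePath, 'module.exports = () => ({ decorated: false });');

// Load it once so Node creates a cache entry, keyed by the resolved filename
const original = require(modulePath);
const cacheKey = require.resolve(modulePath);

// Swap the cached exports for a wrapped version
require.cache[cacheKey].exports = function patchedFactory(...args) {
  const app = original(...args);
  app.decorated = true;
  return app;
};

// Any later require() of the same file now receives the patched export
const app = require(modulePath)();
// app.decorated is true here
```

The catch is that the module has to be loaded (or at least resolvable) before you can patch its cache entry, which is part of why a dedicated hook library is more convenient.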
This led to another realization: since users don't need to pass in an app anymore, then they don't even need to call the decorator in their code at all. The generate command could call the decorator itself to monkey-patch Express, before executing the user's app file. So now we can eliminate the first two frustrations I had with the user integration process.
The final big improvement came from me realising that I could skip the Express "API" altogether. When a user calls app.get('/path', handler) on our monkey-patched app instance, we already have all the information we need: the path, the handler, the HTTP method, the file name and line. So we can just record this route somewhere before handing over to Express to handle it, rather than trying to fetch it from the router later.
This was awesome! It eliminated my final concerns: this approach was way more reliable, since I was no longer relying on Express' internals (layers and stacks), but on its public API (app.get(...)), which was documented and stable. I could get the original path, handle sub-apps and sub-routers, and users didn't even need to export the app object anymore.
Implementation
I was going to implement this manually, but then I remembered my friends at Elastic specialise in this. Thanks to their work with require-in-the-middle and shimmer, it was fairly straightforward, although it still took some work to figure out what exactly I needed to patch. For Express, there were three main entry points:
- the HTTP methods on express.Route.prototype, because Express internally creates a new instance of Route and calls route.get(...) when you call app.get(...) or router.get(...).
- the express.application.use() and express.Router.use() methods (for app.use('path', router)).
- the express.Router.route() method, which is called when you create a route using app.route('path').get(handler). Here we just record the router the route belongs to, so that when our patched get is called (number 1 above), we have all the details we need.
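If you haven't used shimmer before, the core idea is tiny: replace a method with a wrapper that does its own thing and then hands over to the original. The sketch below hand-rolls that idea so it runs without any dependencies. The wrap function here is a simplified stand-in for shimmer.wrap (which takes the same three arguments but does more bookkeeping), and FakeRoute is a made-up class standing in for Express' Route.

```javascript
// Simplified stand-in for shimmer.wrap: replace obj[name] with wrapper(original)
function wrap(obj, name, wrapper) {
  const original = obj[name];
  obj[name] = wrapper(original);
}

// Hypothetical stand-in for Express' Route class
class FakeRoute {
  constructor(path) {
    this.path = path;
    this.handlers = [];
  }

  get(handler) {
    this.handlers.push(handler);
    return this;
  }
}

const recordedRoutes = [];

// Patch the prototype, so every instance (past and future) is affected
wrap(FakeRoute.prototype, 'get', (original) => function (...handlers) {
  // Record the route first, then hand over to the original implementation
  recordedRoutes.push({ method: 'GET', path: this.path });
  return original.apply(this, handlers);
});

const route = new FakeRoute('/users').get(() => {});
// recordedRoutes now contains { method: 'GET', path: '/users' }
```

Patching the prototype rather than a single instance is what makes this true monkey-patching: it doesn't matter where or when the route object gets created.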
You can see the full code on GitHub, but the decorator now looks like this:
function decorator() {
    decorateExpress();
}

// A flag that keeps us from doing the monkey-patch multiple times
decorator.decorated = false;
// The place where we record our routes as they're registered
decorator.subRouters = new Map();

function decorateExpress() {
    const hook = require('require-in-the-middle');
    const shimmer = require('shimmer');

    // Hook into require('express') and return ours
    hook(['express'], function (exports, name, basedir) {
        if (decorator.decorated) {
            return exports;
        }

        // Handle app.get(...), app.post(...), and so on
        const httpMethods = ['get', 'post', 'put', 'patch', 'head', 'delete'];
        httpMethods.forEach(function shimHttpMethod(httpMethod) {
            shimmer.wrap(exports.Route.prototype, httpMethod, original => {
                return patchHttpVerbMethod(original, httpMethod);
            });
        });

        // Handle sub-routers and sub-apps
        // ie app.use('/path', otherApp), app.use('/path', otherRouter), router.use('/path', otherRouter), etc
        shimmer.wrap(exports.application, 'use', patchAppUseMethod);
        shimmer.wrap(exports.Router, 'use', patchRouterUseMethod);

        // Handle app.route(path).get()
        shimmer.wrap(exports.Router, 'route', original => {
            return function (...args) {
                const routeObject = original.apply(this, args);
                // Track the router that this route belongs to
                routeObject.___router = this;
                return routeObject;
            };
        });

        decorator.decorated = true;
        return exports;
    });
}
The patchHttpVerbMethod, patchAppUseMethod, and patchRouterUseMethod functions do the real work of recording the routes in decorator.subRouters.
The decorator.subRouters map is the key that allows us to support sub-apps and sub-routers. It acts as a sort of tree. Every route in Express is added to a router instance. A user can have multiple apps and routers, and mount them on each other using app.use() or router.use(). But since we won't know when they'll call app.use('/somepath', thisrouter), we store all the routes they add on thisrouter under its own entry, decorator.subRouters.get(thisrouter). When the user finally calls app.use('/somepath', thisrouter), we fetch the routes from decorator.subRouters.get(thisrouter) and add them, with the mount path prepended, to decorator.subRouters.get(app).
const app = express();
// decorator.subRouters is empty.

app.get('/a', someHandlerFn);
// decorator.subRouters now looks like this:
// app => GET /a

app.post('/a', someHandlerFn);
// decorator.subRouters now looks like this:
// app => GET /a
//        POST /a

const subRouter = express.Router();
subRouter.get('/a', someHandlerFn);
// decorator.subRouters now looks like this:
// app       => GET /a
//              POST /a
// subRouter => GET /a

app.use('/b', subRouter);
// decorator.subRouters now looks like this:
// app => GET /a
//        POST /a
//        GET /b/a
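The bookkeeping behind that walkthrough can be sketched in a few lines. This is a simplified illustration rather than the real patch functions: recordRoute and mount are hypothetical names, plain objects stand in for the Express app and router, and the real code also has to track handlers, file names and line numbers.

```javascript
const subRouters = new Map();

// Record a route against the router (or app) it was registered on
function recordRoute(router, method, path) {
  if (!subRouters.has(router)) subRouters.set(router, []);
  subRouters.get(router).push({ method, path });
}

// Called when parent.use(prefix, child) runs: move the child's routes
// onto the parent, with the mount path prepended
function mount(parent, prefix, child) {
  const childRoutes = subRouters.get(child) || [];
  for (const { method, path } of childRoutes) {
    recordRoute(parent, method, prefix + path);
  }
  subRouters.delete(child);
}

// Plain objects standing in for express() and express.Router()
const app = { name: 'app' };
const subRouter = { name: 'subRouter' };

recordRoute(app, 'GET', '/a');
recordRoute(app, 'POST', '/a');
recordRoute(subRouter, 'GET', '/a');
mount(app, '/b', subRouter);

// Only the main app is left in the map, holding every route
const [routes] = [...subRouters.values()];
const summary = routes.map((r) => `${r.method} ${r.path}`);
// summary is ['GET /a', 'POST /a', 'GET /b/a']
```

Using a Map keyed by the router object itself means there's no need to invent IDs for routers; object identity does the job.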
By the time all the routes have been registered, there should be only one router left in decorator.subRouters: the main router. To fetch the routes, the generate command (GitHub) simply has to read that one entry.
// Monkey-patch the Express router so that when a user adds a route, we record it
// (this assumes ./decorator exports the decorator function shown above)
const decorator = require('./decorator');
decorator();
// Execute app file so routes are registered
require(appFile);
const [routes] = [...decorator.subRouters.values()];
And that's it. The new solution is much better in terms of user experience, and was much easier to implement, although it still took real effort. There are still a few parts that are fragile, where I had to rely on some specific internal detail of the Express code (like this), but it's way more reliable than before, plus there are now tests to check that it works. So it's a definite win.