An engineer rarely gets the chance to create a big business-critical web application from scratch. I was fortunate to do just that at Avalara, building a content management system called Content Central. Although it’s only available to a few hundred employees, I’m still very proud of what we made. I didn’t build it by myself, of course; at least 9 engineers contributed major changes. As the technical lead on this project from day one, I can tell most of the story of this middle-aged enterprise web application.
Background and History
Avalara’s main product, AvaTax, calculates how much tax should be charged on a transaction at a certain time and place. To do that, the Content Research team translates tax regulations around the world into data that AvaTax can use. Besides tracking many strange and complicated tax scenarios and fees, they also draw and edit the geographical boundaries of tax jurisdictions.
In Avalara’s early years, customer-facing systems were given the most attention while internal systems were not. Every month, researchers put tax data into huge Excel files, which were converted into database update scripts. The amount of work this took was amazing and terrible at the same time. Eventually the team had to pay down the technical and process debt to be more accurate and move faster. Management decided to create a web application to replace and modernize the content process.
When I joined the project, they gave me a few months to gather information, sit with the researchers and learn their process, and create some prototypes. Personally, I’m not a fan of wireframes and presentations. It’s really worth spending the time to make a clickable prototype that people can actually play with in their own browser. Those prototypes were critical for collaborating with the research team and planning the application.
After more than a year and a half of research, development, testing, and one big organizational change, there were plenty of high-fives when we published the first batch of U.S. sales tax rates to AvaTax. Over time we added more capabilities and brought on more teams in the same pattern: research, proposal, development, training, testing, launch, and review.
It’s not enough to just build and launch a web app and enable logins. That’s the easiest part, by far. There are many important links in the chain of a successful application, such as:
- Old data must be cleaned and imported accurately and repeatably.
- The new application must export data at least as well as (and hopefully much better than) the old system.
- You can’t rely on changes to upstream and downstream systems. Every external dependency is a potential project-killer.
- People need training, confidence, and motivation to use the new application.
- The app has to prove real measurable value as soon as possible or it risks getting killed.
- There should be a specific moment when people start using the new system and stop using the old system, and you absolutely have to throw a party to celebrate.
Content Central is a web application with a static front-end, REST API, and a relational database hosted in AWS. The following explains what is under the hood right now, along with some thoughts of what I might do differently today.
Content Central’s front-end started with AngularJS, but now it is a Vue application. I wrote more about what’s so special about Vue here. The app is split up into mini-apps with Vue’s multi-page build option. That is left over from when we ported the application from Angular to Vue over a few months, section by section. If this project were being built today, I’d try to make it fully SSR, maybe using Nuxt. That’s because we use Vue’s router and state management, but the multi-page nature of Content Central sometimes causes problems.
We chose to use the Vuetify UI framework, an open-source implementation of Google’s Material Design for Vue. While it’s possible (and fun!) to build a complete design system, I think we made the right call for a small team. Vuetify is awesome and has been a major productivity boost. A good alternative might be BootstrapVue, if for no other reason than the popularity of Bootstrap itself.
We use Pug markup in our single-file component templates because it’s a little cleaner than HTML. The CSS preprocessor is Stylus, chosen for the same reason, but Sass would be a good choice nowadays.
We use map tiles from Mapbox to display maps, but not the Mapbox GL library. Instead, we use OpenLayers because it has much better geometry editing tools. We also use Turf.js to edit GeoJSON in both the client and the server.
The API is Express on Node. The process manager is PM2. Using the Sequelize ORM, we serve up a standard RESTful API. For production, the front-end static assets are built and bundled with the API in a Docker image, which is then deployed to a cluster with Terraform.
If starting over, instead of Express, I’d consider using Fastify or possibly micro. I’d rather use node-postgres directly than an ORM. A GraphQL endpoint with Hasura or PostGraphile might have met 80% of our needs, but we still would have needed REST endpoints for the other 20%. I’d try to host the static portion of Content Central on a CDN outside of Node/Docker, as described in the JAMstack best practices.
The database is PostgreSQL because we make heavy use of PostGIS. Redis helps out with some caching and session management. We experimented with ArangoDB at one point, but PostgreSQL has proven itself time and again. PostgreSQL and PostGIS are fantastic, especially the documentation.
Content Central is hosted on AWS, but only uses ECS, RDS for PostgreSQL, and ElastiCache for Redis. I think it could easily run on Heroku with almost no changes, and that would have been the simplest option if it were allowed.
Avalara’s internal GitLab is used for source control and review. When I left, we used Jenkins and planned to replace it with GitLab’s built-in CI/CD pipelines. Bug tracking, project management, and documentation are Atlassian products (JIRA/Confluence).
Content Central’s predecessor had a map that showed all of the jurisdiction boundaries layered on top of a standard map or satellite layer.
Many people thought it looked very impressive and complicated and colorful, but in reality it had terrible usability.
The colors were assigned to jurisdictions randomly, as in rgb(random() * 255, random() * 255, random() * 255).
More importantly, it was not very interactive.
While it’s true you could click on a point and then see the boundaries that intersect that point, that wasn’t because of how the map was made.
Because the boundaries were rendered as raster tiles, real-time interactivity simply wasn’t possible.
So, I set out to build vector tiles for our maps, but it took a few tries to get it right.
Attempt 1: pre-render all of the tiles as individual files and upload them to S3. This involved dumping all of the geometries to JSON files, then using tippecanoe to generate MVT files in the correct directory structure, then uploading the result to S3. This approach was good because S3 can be an awesome static file host, so the maps were incredibly fast and caused no load on our web servers. Testing a single US state validated this approach. However, in practice this approach was bad because, even if the data export and conversion were relatively quick (under an hour), the upload took a great deal longer (hours). The data itself was only a few gigabytes, but there were (if memory serves correctly) nearly two million tiny tile files. Even configuring the S3 upload to sync only changed files did not really fix this. In any case, daily updates were too slow and the zoom level was too imprecise for editors.
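To see why the file count explodes, it helps to remember that every extra zoom level can hold up to four times as many tiles as the one before it. The following is only a rough sketch using the standard XYZ ("slippy map") tile formulas; the bounding box is a hypothetical approximation of the contiguous United States, not the extent we actually rendered, and a generator that skips empty tiles will write fewer files than this upper bound.

```javascript
// Convert a longitude/latitude to XYZ ("slippy map") tile coordinates.
function lonLatToTile(lon, lat, z) {
  const n = 2 ** z; // tiles per axis at this zoom level
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Clamp so points on the edge of the world stay inside the grid.
  const clamp = (v) => Math.min(n - 1, Math.max(0, v));
  return { x: clamp(x), y: clamp(y) };
}

// Count the tiles covering a bounding box for every zoom from 0 to maxZoom.
function countTiles({ west, south, east, north }, maxZoom) {
  let total = 0;
  for (let z = 0; z <= maxZoom; z++) {
    const topLeft = lonLatToTile(west, north, z);
    const bottomRight = lonLatToTile(east, south, z);
    total += (bottomRight.x - topLeft.x + 1) * (bottomRight.y - topLeft.y + 1);
  }
  return total;
}

// Hypothetical rough box around the contiguous United States.
const roughUsBox = { west: -125, south: 24, east: -66, north: 50 };
for (const z of [8, 10, 12]) {
  console.log(`zoom 0-${z}: up to ${countTiles(roughUsBox, z)} tiles`);
}
```

Running this for a few maximum zoom levels shows how quickly a tile pyramid grows, which is the trade-off we hit: fewer zoom levels meant imprecise maps, and more zoom levels meant an upload that never finished in time.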
Attempt 2: pre-render map tiles as a single .mbtiles tile file, upload to S3, but sync and serve from Node.js web servers.
When the process above was changed to upload a single file, our render process was sped up to an acceptable level, so hourly updates were feasible.
Learn more about MBTiles here.
However, syncing the tile file with the web servers was so problematic that it caused more than one outage across the little cluster.
Even hourly updates were unacceptable; users wanted near-instant feedback at a very high zoom level.
It was clear that pre-rendering geometry would never be enough.
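For context, serving a tile from an .mbtiles file is mostly a single SQLite lookup. The sketch below is not our production code: the db handle and its get(sql, params) method are hypothetical stand-ins for whatever SQLite client is used. The table layout and the flipped row numbering come from the MBTiles specification.

```javascript
// An MBTiles file is a SQLite database whose `tiles` table (or view) has the
// columns zoom_level, tile_column, tile_row, and tile_data.
const TILE_LOOKUP_SQL =
  "SELECT tile_data FROM tiles " +
  "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?";

// MBTiles numbers rows in the TMS scheme (row 0 at the bottom), while XYZ map
// URLs count rows from the top, so the y coordinate has to be flipped.
function xyzToTmsRow(z, y) {
  return 2 ** z - 1 - y;
}

// Build the bound parameters for a request like /tiles/{z}/{x}/{y}.
function tileLookupParams(z, x, y) {
  return [z, x, xyzToTmsRow(z, y)];
}

// `db` is a hypothetical SQLite handle whose get(sql, params) resolves to a
// single row object, or undefined when the tile does not exist.
async function readTile(db, z, x, y) {
  const row = await db.get(TILE_LOOKUP_SQL, tileLookupParams(z, x, y));
  return row ? row.tile_data : null;
}
```

The lookup itself is cheap. The pain in this attempt came from getting a fresh copy of the file onto every web server, not from reading it.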
Attempt 3: serve near real-time tile data from Postgres behind a Redis cache. This method took several tries to get to acceptable performance, but in the end it was the only way. It was only feasible when Postgres/PostGIS gained native MVT capabilities. The true hero was ST_SimplifyPreserveTopology. Today, most tiles can be displayed in under 750ms on a cache miss, although some take up to 10 seconds.
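A minimal sketch of this idea, under several assumptions, looks like the following. The jurisdictions table, its id, name, and geom columns, the SRID 4326 storage, the tolerance formula, and the five-minute TTL are all hypothetical. The db argument is anything with a node-postgres style query(text, values) method, and cache is a hypothetical wrapper with get(key) and set(key, value, ttlSeconds). ST_AsMVT, ST_AsMVTGeom, and ST_SimplifyPreserveTopology are real PostGIS functions; ST_TileEnvelope requires PostGIS 3.0 or later, so an older setup would need to compute the tile bounds another way.

```javascript
// Width of the Web Mercator world in meters (2 * PI * 6378137).
const WORLD_WIDTH_METERS = 40075016.686;

// Assumed heuristic: simplify to roughly one unit of the default 4096-unit
// MVT grid, so lower zoom levels get much coarser geometry.
function toleranceForZoom(z) {
  return WORLD_WIDTH_METERS / 2 ** z / 4096;
}

// $1..$3 are z, x, y; $4 is the simplification tolerance in meters.
const TILE_SQL = `
  WITH bounds AS (
    SELECT ST_TileEnvelope($1, $2, $3) AS geom
  ),
  mvtgeom AS (
    SELECT
      j.id,
      j.name,
      ST_AsMVTGeom(
        ST_SimplifyPreserveTopology(ST_Transform(j.geom, 3857), $4),
        bounds.geom
      ) AS geom
    FROM jurisdictions j, bounds
    WHERE j.geom && ST_Transform(bounds.geom, 4326)
  )
  SELECT ST_AsMVT(mvtgeom, 'jurisdictions') AS tile FROM mvtgeom`;

// Cache-aside lookup: try the cache first, fall back to PostGIS on a miss,
// then store the rendered tile with a short TTL.
async function getTile(z, x, y, { db, cache, ttlSeconds = 300 }) {
  const key = `tile:${z}:${x}:${y}`;
  const cached = await cache.get(key);
  if (cached !== null && cached !== undefined) return cached;

  const result = await db.query(TILE_SQL, [z, x, y, toleranceForZoom(z)]);
  const tile = result.rows[0].tile;
  await cache.set(key, tile, ttlSeconds);
  return tile;
}
```

An Express route would call getTile with the z, x, and y from the URL and send the result as binary. With a real Redis client, the cache wrapper also has to store and return the tile as a Buffer rather than a string, which this sketch leaves out.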
We could have improved performance even more by caching tile data until new geometries are published and pre-rendering the slowest tiles. We could also have created a materialized view of simplified geometry in the database for different zoom levels, but we just couldn’t justify spending any more time on it. In the end, Content Central ended up with fast, precise, interactive maps with a decent color palette.
TODO: Other War Stories
- Editing boundaries with OpenLayers + Turf.js
- Authenticating with SAML; JWT vs. sessions
- Uplift from Angular to Vue
- UX design process: discovery > proposal > prototype > development > documentation
- CI/CD with Jenkins, Terraform, GitLab