In love and war, augmenting the web.
⌗HTML Headaches.
Bug, after bug report was coming to feedback channels it felt so overwhelming. This was an indication that we were doing something wrong, especially when we were breaking the functionality of other web pages. Toucan is meant to augment the web, but often we were breaking pieces of the web too. This often can happen with extensions, especially an extension that is supposed to work across the entire web replacing content. I like to believe that all these extension creators never have the intention of breaking other web pages, but we have the power to do that. Sadly sometimes even well-intentioned extensions break other pages.
I remember seeing some fixes make it into the codebase that was attempting to fix these issues. It seemed more like a game of a whack a mole to which the holes were webpages and a mole would pop up on one. This sometimes could be compounded because it depended on a version of a website. There needed to be a change, and I would like to talk about a major change we made to be a better collaborator with all the other developers on the web.
⌗Augmentation
As a web extension, it’s easy to lose sight of how important it is to be a good collaborator with the other developers on the internet. Extensions, especially Toucan, seem to be in this augmented world above it all. Flying by and changing things we see fit. Past implementations of Toucan did a lot with the DOM. We used to take a string of HTML and shove it into the page. The level of destruction that this made was vast. On pages that were articles and just had content that was totally fine, but when you are on a page like Google Maps this because very apparent. Toucan was the aggressor in these situations. We would be taking people’s hard work and scramble their sites HTML. If Toucan was planning to work across the web we had to change. The code below had to go.
htmlElement.innerHTML = `
Toucans are a <a href="/fruit">fruit</a> eating <span lang="es">pájaro</span>.
`
There is a number of reasons why this piece of code is not great. Let’s kinda breeze over that there are tons of potential issues with reevaluating HTML and focus on what this means in a modern web stack and why this broke so much.
⌗I’m going to need a reference
The code above that re-evaluates HTML is not only creating new DOM elements but also is now going to break references to any DOM elements that other programs have been holding. Like I mentioned above this is not a huge issue in articles. They solemnly need to have a reference to a span or anchor tag, but in a modern application that is not the case. Let’s say, hypothetically, that in the above replacement we were in a React application, that react application had some bindings on that anchor tag for the word fruit. Maybe the application shows a useful tooltip here or attached some smooth scrolling. This would all be broken in by this replacement due to the anchor tag now being a different element and all references being broken.
You may see how this can be an issue, imagine an extension going through your nice and neat code and breaking references to elements. This is what Toucan was doing, and there are still extensions out there doing the same thing. This method of replacement was breaking all kinds of things from tooltip, accordions, scrolling on maps, broken links. These breaks were not only frustrating for our users, but also for us. We essentially had to accommodate everyone’s code, coding style, and make sure that we did not interfere.
Now the question is how did we resolve this? This all comes down to one line of code.
textNode.replaceWith([pre, translation, post])
This one line of code was a powerful realization for me for a number of reasons. The first was that I never really played around with text nodes, I always am focused on HTML elements. Not realizing there was a lot of depth to the API of a text node I was able to see that we did not need to touch HTML elements when replacing content. This is important because you can not attach interaction with text nodes, only HTML elements. That means that if we were to replace a text node there was no chance that a program was going to have event listeners attached to that node*. We would be able to replace content on the page, but that content would be something that would not break any functionality. Finally, a way to replace that would be light as a feather on these web pages.
⌗Making it real
After figuring out that we could replace text nodes, there were a few other pieces to this rewrite that we needed to test out and make sure it was possible. These are things that are more relevant to Toucan, a React app inside a content script. Let’s just say we figured out the other big piece of this solution.
Another thing was that sure it was just replacing nodes on a page, but that was not just it. Our translation pipeline was creating HTML, parsing HTML, and worked based on HTML. We would need a set of utilities to be able to deal with this new HTML*-less* world. Before starting the project when experimenting with all this technology I started building up a set of tools into a library that I called Aracari. Aracari is just some simple tools to deal with text inside of DOM nodes. This tooling actually was super useful, it not only allows us to replace text the way you would do it in a string. Aracari also was able to make it so we could explore this data in areas to which we did not have to access the DOM. Overall a huge win for us.
⌗It’s all in the details
Now is our solution perfect? Not at all, I ran into many hurdles that I made some hacks for to make it work, but the key was that now we are not breaking our user’s experience. In other words, we were not completely breaking the web anymore. Looking back here are some things that I would like to handle more delicately and gracefully.
The biggest point of all this is, when building a web extension there are a lot of ways you can just break things. Being a good web citizen pays off whether through fewer bug tickets or greater satisfaction from your users.
Looking to learn a new language? Try out Toucan.
*Now I do want to acknowledge that yes programs like React hold references to text nodes, that being said it does not break anything but some tracking of that text node.