
Performance of Low Level String Decoding in JavaScript

6 min read · Sep 14, 2022

tl;dr: TextDecoder is really fast for bigger strings. Very small strings (fewer than roughly 18 characters) can actually benefit from a simple custom decoder that uses an array.

JavaScript is a very high-level programming language. Most languages have separate character and string types and let you manipulate them in different ways. In JavaScript, that’s simply not the case: there is only a string type, and what you can do with it is fairly limited. It’s the trade-off of having very simple syntax. You don’t have as much control.

Nevertheless, some lower-level APIs are exposed. We have text encoders and decoders, as well as access to Uint8Arrays. A Uint8Array gives you a linear block of memory that you can index much like a regular array.
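As a quick illustration of those APIs, here is a minimal sketch that encodes a string into a Uint8Array, copies the bytes into a larger buffer, and decodes them back. The variable names and the offset of 2 are arbitrary choices for this example.

```javascript
// TextEncoder always produces UTF-8 bytes.
const encoder = new TextEncoder();
const bytes = encoder.encode("Hi!"); // Uint8Array [72, 105, 33]

// A Uint8Array is a fixed-size, zero-filled block of bytes.
const buffer = new Uint8Array(8);
buffer.set(bytes, 2); // copy the encoded bytes in at offset 2

// subarray() returns a view over the same memory, without copying.
const view = buffer.subarray(2, 2 + bytes.length);

// TextDecoder defaults to UTF-8.
const text = new TextDecoder().decode(view);
console.log(text); // "Hi!"
```

Because subarray() shares memory with the original buffer, slicing out a string's bytes this way does not allocate a second copy of the data.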

As part of a project I’m currently working on, I need to take a book of roughly 8 MB, store it in a JavaScript-compatible way, and decode it so that it can be used to render pages.

I’ve been thinking a lot recently about the most efficient way to store a book. In the end I decided to store the book in binary rather than as a JSON string. So this article is about performance benchmarks for different ways to retrieve strings from binary data.
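To make the two approaches from the tl;dr concrete, here is a rough sketch of a hand-written decoder next to TextDecoder. This is an illustration only, not the code that was benchmarked: the names decodeSmall, decode, and SMALL_LIMIT are made up for this example, the input is assumed to be well-formed UTF-8 (no validation), and the cutoff of 18 is measured in bytes here as a stand-in for the "18 characters or so" figure.

```javascript
// Hypothetical helper: decode a short run of UTF-8 bytes by hand.
// Assumes well-formed UTF-8; malformed input is not detected.
function decodeSmall(bytes, start = 0, end = bytes.length) {
  const units = []; // UTF-16 code units, collected in a plain array
  let i = start;
  while (i < end) {
    const b0 = bytes[i++];
    if (b0 < 0x80) {
      // 1-byte sequence (ASCII)
      units.push(b0);
    } else if (b0 < 0xe0) {
      // 2-byte sequence
      units.push(((b0 & 0x1f) << 6) | (bytes[i++] & 0x3f));
    } else if (b0 < 0xf0) {
      // 3-byte sequence
      units.push(
        ((b0 & 0x0f) << 12) | ((bytes[i++] & 0x3f) << 6) | (bytes[i++] & 0x3f)
      );
    } else {
      // 4-byte sequence: becomes a UTF-16 surrogate pair
      const codePoint =
        ((b0 & 0x07) << 18) |
        ((bytes[i++] & 0x3f) << 12) |
        ((bytes[i++] & 0x3f) << 6) |
        (bytes[i++] & 0x3f);
      const offset = codePoint - 0x10000;
      units.push(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff));
    }
  }
  // Fine for short inputs; very long arrays can exceed argument limits.
  return String.fromCharCode.apply(null, units);
}

// Assumed cutoff in bytes; the real crossover depends on engine and data.
const SMALL_LIMIT = 18;
const sharedDecoder = new TextDecoder("utf-8"); // reuse a single instance

// Hypothetical dispatcher: hand-rolled path for short runs, TextDecoder otherwise.
function decode(bytes, start = 0, end = bytes.length) {
  return end - start < SMALL_LIMIT
    ? decodeSmall(bytes, start, end)
    : sharedDecoder.decode(bytes.subarray(start, end));
}
```

The idea behind a split like this is that TextDecoder has a fixed per-call cost that dominates for tiny inputs, while its native decoding wins once the string gets longer.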

JavaScript does use UTF-16 under the hood, but the built-in TextDecoder API can parse UTF-8. So we’ll be working exclusively in UTF-8.
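The difference between the two encodings is easy to see by comparing a string's length (counted in UTF-16 code units) with the number of bytes its UTF-8 encoding takes. The sample string below is just an illustrative choice.

```javascript
// Three characters: é (U+00E9), € (U+20AC), 😀 (U+1F600)
const sample = "é€😀";

// String length counts UTF-16 code units: 1 + 1 + 2 (surrogate pair).
const utf16Units = sample.length;

// UTF-8 uses a variable number of bytes per character: 2 + 3 + 4.
const utf8Bytes = new TextEncoder().encode(sample).length;

console.log(utf16Units, utf8Bytes); // 4 9
```

So byte offsets into the UTF-8 data and character indices into the resulting JavaScript string generally do not line up once the text leaves plain ASCII.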


A Micro-Course on UTF-8

Written by Peter Makes Websites
