Jul, 2018

Newtype (Tagged Type) in TypeScript

One of my favorite features in Haskell is "newtype". It allows you to wrap another type without any runtime overhead. Usually this is important when dealing with types that have the same underlying representation but should never be mixed together. Here is a somewhat contrived example in Haskell:

newtype Dollars = Dollars { fromDollars :: Int }
newtype Euro = Euro { fromEuro :: Int }

processEuro :: Euro -> Euro
processEuro x = x

main = do
    return $ processEuro $ Dollars 3 -- type error

Since most of my day-to-day work is done in TypeScript, I’m often tempted to use a newtype. For a long time I couldn’t, as there was no way to define a new unique type until TypeScript 2.7, which added unique symbol types to support ES2015+ Symbols.

Two flavors of newtype in TypeScript

Interestingly, the addition of unique symbol, combined with other advanced type features, yielded not one but two ways of creating newtypes in TypeScript. While similar, they possess slightly different semantics and may suit different code bases or people's tastes.

Intersection Types

The idea of the solution is to attach a “tag” field to any existing type by using an intersection type operator (&). A more production-ready and safe version of this is described in this TypeScript issue, but very generally it looks like this:

type Dollars =
    number & { readonly __tag: unique symbol };
type Euro =
    number & { readonly __tag: unique symbol };

function processEuro(money: Euro) {};

processEuro(5); // type error
processEuro(5 as Dollars); // type error
processEuro(5 as Euro); // works!

As a side note, you should never actually use numbers to represent currencies in real applications.

This solution has several important aspects. First, it has no runtime overhead, as values of the newtype are created through a cast. Casting does have a drawback, though: a value can be converted to a newtype anywhere in user code, which is a problem when values must be validated first. Then again, the fact that TypeScript has the any type exposes the same class of problems.
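One common way to limit the casting drawback is to keep the cast inside a single validating function. The following is a minimal sketch, not something from the original solution: the name toEuro and the rule "whole, non-negative amounts only" are assumptions made purely for illustration.

```typescript
// Branded type from the intersection flavor above.
type Euro = number & { readonly __tag: unique symbol };

// Hypothetical helper: the one place where the cast to Euro happens,
// guarded by a runtime check. The validation rule is an assumption.
function toEuro(amount: number): Euro {
    if (!isFinite(amount) || Math.floor(amount) !== amount || amount < 0) {
        throw new RangeError("Invalid Euro amount: " + amount);
    }
    return amount as Euro;
}

// Typed as Euro, but still the plain number 5 at runtime.
const price = toEuro(5);

// toEuro(-1) and toEuro(0.5) would throw a RangeError.
```

Nothing stops other code from writing `5 as Euro` directly, so this is a convention rather than a guarantee.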

More importantly, since TypeScript is structurally typed, we can pass values of our newtype anywhere the base type is expected. This is often convenient, as we can use the type with all the utility functions we already have for the original type, but it can be undesirable if we want to maintain certain invariants of the value:

type SortedArray<T> =
    T[] & { readonly __tag: unique symbol };

function sort<T>(array: T[]): SortedArray<T> {
    return [...array].sort() as SortedArray<T>;
}

const sorted = sort([3, 7, 1]);

// no error here, but should be:
const notSortedAnymore = sorted.concat(2);
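The leak is arguably worse with in-place mutation: concat returns a plain T[], while push keeps the tagged type even after the invariant is gone. The sketch below illustrates this, along with silent currency mixing through arithmetic. The sortNumbers helper and its numeric comparator are assumptions added for the example.

```typescript
type Dollars = number & { readonly __tag: unique symbol };
type Euro = number & { readonly __tag: unique symbol };
type SortedArray<T> = T[] & { readonly __tag: unique symbol };

// Hypothetical helper with a numeric comparator, so that numbers
// are not compared as strings.
function sortNumbers(array: number[]): SortedArray<number> {
    return array.slice().sort((a, b) => a - b) as SortedArray<number>;
}

const sortedNumbers = sortNumbers([3, 7, 1]); // [1, 3, 7]

// Compiles: push is inherited from number[], and the variable is
// still typed as SortedArray<number> after the order is broken.
sortedNumbers.push(2); // [1, 3, 7, 2]

// Compiles: both operands are numbers, so the two currencies mix
// silently and the result is a plain number.
const mixed: number = (5 as Dollars) + (3 as Euro); // 8
```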

Fake Boxed Type

The type definition here looks very similar; instead of an intersection, we say that our value is stored within a field of an object:

type Dollars = {
    value: number;
    readonly __tag: unique symbol
};
type Euro = {
    value: number;
    readonly __tag: unique symbol
};

function processEuro(money: Euro) {};

processEuro(5); // type error
processEuro(5 as Dollars); // type error

With this approach a simple type cast no longer works:

processEuro(5 as Euro); // also a type error

This means we need a helper function that does not actually touch the value, but only performs the type cast:

function to<
    T extends { readonly __tag: symbol, value: any } =
    { readonly __tag: unique symbol, value: never }
>(value: T["value"]): T {
    return value as any as T;
}

processEuro(to<Dollars>(5)); // type error
processEuro(to<Euro>(5)); // works!

The type definition is a bit of a mouthful, but the complexity mostly comes from the type parameter constraint and its default value. First, the constraint:

T extends { readonly __tag: symbol, value: any }

This tells TypeScript that we expect the type parameter to be of a certain shape, namely the same shape as we used for our Dollars and Euro types in this section. It is important to note that the any type inside the extends clause does not erase anything: it just tells TypeScript that we don’t care what type the value field has. Inside T, the real type is preserved, which allows us to access it for the input value via the T["value"] notation.
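To see the T["value"] lookup in isolation, here is a small sketch. The Tagged and Raw aliases and the Name type are hypothetical names introduced only for this illustration.

```typescript
type Euro = {
    value: number;
    readonly __tag: unique symbol
};
type Name = {
    value: string;
    readonly __tag: unique symbol
};

// Hypothetical alias for the constraint used by the helper functions.
type Tagged = { readonly __tag: symbol, value: any };

// Hypothetical alias: looks up the type of the value field of T.
type Raw<T extends Tagged> = T["value"];

// Raw<Euro> resolves to number and Raw<Name> to string, even though
// the constraint itself only says `value: any`.
const rawEuro: Raw<Euro> = 5;
const rawName: Raw<Name> = "Ada";

// const bad: Raw<Euro> = "5"; // rejected: string is not a number
```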

The next interesting part is the default value for T. Without defaulting the value field to never, we would allow the following declaration, which has a very strange and rather useless type:

// type of euro1 is { readonly __tag: symbol, value: any }
const euro1 = to(5);

So having a default type forces us to specify the type, either as a type argument or as an annotation on the variable:

const euro1 = to(5); // type error

const euro2 = to<Euro>(5);
processEuro(euro2); // works!

const euro3: Euro = to(5);
processEuro(euro3); // works!

Now that we have a type-safe way to create a value of any newtype, we also need a way to cast it back to the raw type:

function from<
    T extends { readonly __tag: symbol, value: any }
>(value: T): T["value"] {
    return value as any as T["value"];
}

const someDollars = to<Dollars>(10);
const fullDollars = to<Dollars>(Math.floor(from(someDollars)));
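One consequence of the "fake" box is worth spelling out: the value field exists only in the type system. The sketch below repeats the to and from definitions so it is self-contained; the variable names runtimeKind, viaField and viaFrom are made up for the example.

```typescript
type Dollars = {
    value: number;
    readonly __tag: unique symbol
};

function to<
    T extends { readonly __tag: symbol, value: any } =
    { readonly __tag: unique symbol, value: never }
>(value: T["value"]): T {
    return value as any as T;
}

function from<
    T extends { readonly __tag: symbol, value: any }
>(value: T): T["value"] {
    return value as any as T["value"];
}

const someDollars = to<Dollars>(10);

// At runtime there is no wrapper object, only the number itself.
const runtimeKind = typeof someDollars; // "number"

// Type-checks because Dollars declares a value field, but a number
// has no such property, so this evaluates to undefined.
const viaField = someDollars.value;

// from is the reliable way back to the raw number.
const viaFrom = from(someDollars); // 10
```

In other words, reading .value directly compiles but is a bug, so all access should go through from.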

The code above works as expected, but the last line has several problems. First, it is a bit verbose; more importantly, it can lead to mistakes such as accidental conversion to the wrong type:

// Oops, we accidentally converted Dollars to Euro
const fullEuros = to<Euro>(Math.floor(from(someDollars)));

Haskell solves this problem by allowing newtypes to have the same type class instances (i.e. support the same operations) as the original type, which more or less correlates to the intersection type option we explored at the beginning.

Another way to deal with this is to introduce a helper function that lets us operate on the value without touching the type. This is similar to the way we process individual elements of an array with .map:

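function lift<
    T extends { readonly __tag: symbol, value: any }
>(value: T, callback: (value: T["value"]) => T["value"]): T {
    // The value is raw at runtime, so only the static types need casting,
    // exactly as in to and from.
    return callback(value as any as T["value"]) as any as T;
}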

const fullDollars = lift(someDollars, Math.floor);

This pattern takes some getting used to and might have performance implications in certain usages, but mostly it works. Also, if you need to operate on several values, you will either need a separate function for each number of values, or you can generalize lift, at the cost of performance. For the sake of completeness, here’s what a two-value lift looks like:

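function lift2<
    T extends { readonly __tag: symbol, value: any }
>(x: T, y: T, callback: (x: T["value"], y: T["value"]) => T["value"]): T {
    // Both values are raw at runtime, so only the static types need casting.
    return callback(x as any as T["value"], y as any as T["value"]) as any as T;
}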

const fiveDollars = to<Dollars>(5);
const twoDollars = to<Dollars>(2);
const sum = lift2(fiveDollars, twoDollars, (x, y) => x + y);
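For comparison, a generalized version could look roughly like the sketch below. The name liftN is hypothetical, and the callback comes first only because a rest parameter has to be last. The rest and spread arrays allocated on every call are one plausible source of the performance cost mentioned above.

```typescript
type Tagged = { readonly __tag: symbol, value: any };
type Dollars = {
    value: number;
    readonly __tag: unique symbol
};

function to<
    T extends Tagged =
    { readonly __tag: unique symbol, value: never }
>(value: T["value"]): T {
    return value as any as T;
}

// Hypothetical variadic lift: accepts any number of values of the
// same newtype and returns a value of that newtype.
function liftN<T extends Tagged>(
    callback: (...values: T["value"][]) => T["value"],
    ...values: T[]
): T {
    // The values are raw at runtime, so the casts only adjust static types.
    return callback(...(values as any as T["value"][])) as any as T;
}

// T is inferred as Dollars from the trailing arguments.
const total = liftN(
    (...amounts: number[]) => amounts.reduce((acc, x) => acc + x, 0),
    to<Dollars>(5),
    to<Dollars>(2),
    to<Dollars>(3)
);
```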

Conclusion

Given all the extra verbosity and some awkwardness with operations, one might ask whether all of this is worth it. The answer, as usual: it depends. In the absence of nominal typing, this approach is the only way I know to get type safety for objects of the same structure, which might be important if your service deals with sensitive data.

From the performance standpoint, using from and to functions, which just return the value they were passed, has no measurable impact in Chrome and Firefox, but is unfortunately about 5x slower in Edge. You can try it yourself in this JSPerf benchmark.
