Why the C# Record type is important

This article is part of the 2020 C# Advent Series. Today's date is the 23rd and we are getting close now. Ovens are heating, carol singers are flexing and Santa is loading up his sleigh, so what better time could there be to take a deep dive into the new Record type in C#?

.NET 5 was released in November this year and with it came a whole host of new features, including a new version of C#. The release of C#8 gave us Nullable Reference Types, possibly the most drastic change to the language since LINQ. C#9, by contrast, brings little that will leave your existing code feeling old and obsolete, and more that adds features you didn't even know you wanted.

Today we are going to examine one of those new language features and think about how and when you might want to start using C# Record Types.

Records 101

Records are a new reference type in C#
They are immutable by default when declared with the positional syntax
Positional records get an automatic constructor and deconstructor
Equality comparison is value based

Here is an example of a record:

using System;

var trainA = new Train("Mallard", 12);
var trainB = new Train("Mallard", 12);
Console.WriteLine(trainA == trainB); // prints "True"

var (name, _ ) = trainA;
Console.WriteLine(name); // prints "Mallard"

public record Train(string Name, int WheelCount);

The above code uses another new feature of C#9, top-level statements, to remove the boilerplate from Program.cs.
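To see what value based equality buys us, here is a small sketch that puts the record next to a hand-written class with the same shape. TrainClass is a hypothetical name invented for this comparison, and the class deliberately does not override Equals or the == operator.

```csharp
using System;

var classA = new TrainClass("Mallard", 12);
var classB = new TrainClass("Mallard", 12);
Console.WriteLine(classA == classB); // prints "False": classes compare references by default

var recordA = new Train("Mallard", 12);
var recordB = new Train("Mallard", 12);
Console.WriteLine(recordA == recordB); // prints "True": records compare their values

// Hypothetical class equivalent of Train, written out by hand for comparison.
public class TrainClass
{
    public string Name { get; }
    public int WheelCount { get; }

    public TrainClass(string name, int wheelCount)
    {
        Name = name;
        WheelCount = wheelCount;
    }
}

public record Train(string Name, int WheelCount);
```

To get the same result from the class you would have to write the equality members yourself; the record gets them from the compiler.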

The Case For A New Type

Up until now the go-to type for creating any kind of structured object in C# has been the class. Sure, C# does also give us structs and anonymous types but for conventional object-oriented programming nothing has come close to being as universally useful as the humble class. But as programming trends have changed and our industry has moved further and further away from strict OOP, the class has begun to look more and more unfit for a lot of the purposes it is used for.

A History in OOP

The central, unbreakable tenet of Object Oriented Programming is that of encapsulation and hiding state. Consider the following class that could be straight out of a textbook circa 2007:

public class Car
{
    public string MakeAndModel;
    public int NumberOfSeats;
    public bool IsEngineRunning;

    public Car(string makeAndModel, int numberOfSeats)
    {
        MakeAndModel = makeAndModel;
        NumberOfSeats = numberOfSeats;
        IsEngineRunning = false;
    }

    public void StartEngine() => IsEngineRunning = true;

    public void StopEngine() => IsEngineRunning = false;
}

Here we have a class that can be used to create instances of cars. This class encapsulates everything a car might want to do, and it saves the internal state of the car in a self-managed way. We can map it to the real world extremely easily - for example let's say I have two cars in my garage:

var myMainCar = new Car("Ford Focus", 5);
var myWeekendCar = new Car("MG Midget", 2);

// Time to go somewhere
myWeekendCar.StartEngine();

I can start and stop the engines on either car and the code inside the Car.cs class handles the internals of what "Starting the engine" actually means for the object.

Mixing State And Data

While it is convenient to be able to hide the internal state of my car from the rest of the code, there are drawbacks to this approach, not least of which is the way we have properties and state mixed in together. Let's take another look at our Car.cs class and try to identify the properties and the state:

public class Car
{
    // Properties...
    public string MakeAndModel;
    public int NumberOfSeats;

    // State...
    public bool IsEngineRunning;

    ...
}

I've used comments here to separate them, but the distinction should be clear to anyone who knows what a car is. Clearly, the "number of seats" is a property of that model of car - whereas "is the engine running" is internal state of that particular car instance. I might want to create a template for a certain model of car and mass instantiate them like this:

Car CreateAFordFocus() => new Car("Ford Focus", 5);

Car[] carArray = { CreateAFordFocus(), CreateAFordFocus(), CreateAFordFocus() };

Here we have created an array of 3 Ford Focus cars. They all have 5 seats and they all have the same name, but each one can be started and stopped individually. In other words, their internal state can change even though their properties, like the number of seats, should never change.

Where The Class Betrays Us

The trouble with C# classes, however, is that they do not do much to enforce this separation between properties and state. For example, there is nothing stopping me doing this:

var myCar = new Car("Ford Focus", 5);

// need more seats
myCar.NumberOfSeats++;

If we were trying to model the real world here, this analogy breaks down. Over the years C# classes have been given more and more power to prevent undesirable scenarios like the above. As you are probably already aware, we can use access modifiers and proper properties to enforce these kinds of behaviours.

public class Car
{
    // Properties...
    public string MakeAndModel { get; private set; }
    public int NumberOfSeats { get; private set; }

    // State...
    private bool _isEngineRunning;

    ...
}

Here we are enforcing the notion that our properties cannot be modified outside the class, and we are hiding the internal state by making it private. But even with these modifiers we continue to have the state and the data mixed in together. We have just decorated parts of the class to indicate which are properties.
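As a minimal sketch of why this is still only decoration: a private setter guards against code outside the class, but nothing stops code inside the class from changing a "property". The FitExtraSeat method below is hypothetical and exists only for this illustration; the engine state is left out for brevity.

```csharp
using System;

var car = new Car("Ford Focus", 5);

// car.NumberOfSeats++; // would not compile: the setter is private

car.FitExtraSeat(); // hypothetical method, added only for this sketch
Console.WriteLine(car.NumberOfSeats); // prints "6"

public class Car
{
    // Properties...
    public string MakeAndModel { get; private set; }
    public int NumberOfSeats { get; private set; }

    public Car(string makeAndModel, int numberOfSeats)
    {
        MakeAndModel = makeAndModel;
        NumberOfSeats = numberOfSeats;
    }

    // Code inside the class can still reassign a "property" at any time.
    public void FitExtraSeat() => NumberOfSeats++;
}
```

The compiler treats NumberOfSeats exactly like any other mutable member of the class; the only thing marking it as a fixed property is the comment.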

A New Pattern Emerges

This issue of mixing properties and state is not new, and many patterns have emerged to tackle it. For example, let's take our car and split the properties that describe the model of vehicle out into their own class:

public class CarSpecification
{
    public string MakeAndModel { get; init; }
    public int NumberOfSeats { get; init; }
}

public class Car
{
    public CarSpecification Specification { get; }
    private bool _isEngineRunning;

    public Car(CarSpecification specification)
    {
        Specification = specification;
        _isEngineRunning = false;
    }

    public void StartEngine() => _isEngineRunning = true;

    public void StopEngine() => _isEngineRunning = false;
}

Here we have a separate class containing immutable properties that describe this model of car, while the Car.cs class is in charge only of its changeable state.

If we look back to the previous example of creating an array of cars to the same specification, this also becomes a lot clearer:

var fordFocusSpec = new CarSpecification
{
    MakeAndModel = "Ford Focus",
    NumberOfSeats = 5
};

Car[] carArray = { new Car(fordFocusSpec), new Car(fordFocusSpec), new Car(fordFocusSpec) };

Here we are creating multiple car objects that each have their own internal state - but they all share the same specification. You can start and stop the engine on each car but you cannot just increase the number of seats.

Serialization

This approach has many benefits, not least of which is that our new data structure class CarSpecification.cs can be readily serialized to JSON.

With the popularity of NoSQL databases, being able to easily serialize object properties into JSON that can be stored and retrieved from a data store is extremely useful.

using System.Text.Json;

var spec = JsonSerializer.Deserialize<CarSpecification>(jsonFromDataStore);

Car[] carArray = { new Car(spec), new Car(spec), new Car(spec) };

And while it is of course possible to serialize any class in C#, keeping all your properties in anemic models that have no methods or internal state to speak of makes the serialization to JSON much more predictable. And predictable code is good code.
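Here is a rough sketch of that round trip, assuming System.Text.Json with its default options on .NET 5. The data store is stood in for by a plain string variable, and the variable names are made up for this example.

```csharp
using System;
using System.Text.Json;

var original = new CarSpecification
{
    MakeAndModel = "Ford Focus",
    NumberOfSeats = 5
};

// Serialize the data-only class to a JSON string, as you might before storing it.
string json = JsonSerializer.Serialize(original);
Console.WriteLine(json);

// Read it back, as you might after loading it from the data store.
var restored = JsonSerializer.Deserialize<CarSpecification>(json);
Console.WriteLine(restored.MakeAndModel);  // prints "Ford Focus"
Console.WriteLine(restored.NumberOfSeats); // prints "5"

public class CarSpecification
{
    public string MakeAndModel { get; init; }
    public int NumberOfSeats { get; init; }
}
```

Because the class has nothing but public properties, what goes into the JSON and what comes back out are the same two values, with no hidden state to lose along the way.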

Introducing the New Record type

The new C#9 Record type addresses the properties side of this properties/state split.

By introducing a new type that is designed exclusively for modelling data structures, the Record lets us explicitly indicate that CarSpecification.cs is a data structure and does not hold mutable internal state.

public class CarSpecification
{
    public string MakeAndModel { get; init; }
    public int NumberOfSeats { get; init; }
}

Becomes...

public record CarSpecification
{
    public string MakeAndModel { get; init; }
    public int NumberOfSeats { get; init; }
}

Or as something C# now calls a positional record...

public record CarSpecification(string MakeAndModel, int NumberOfSeats);
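As a small sketch of what that one-liner gives you: the compiler generates a constructor, an init-only property per parameter, a Deconstruct method and a ToString override. The variable names here are only for illustration.

```csharp
using System;

// The generated constructor takes the values in declaration order.
var spec = new CarSpecification("Ford Focus", 5);

// Each parameter becomes a public init-only property.
Console.WriteLine(spec.MakeAndModel); // prints "Ford Focus"

// spec.NumberOfSeats = 7; // would not compile: init-only properties cannot be reassigned

// A Deconstruct method is generated as well.
var (makeAndModel, numberOfSeats) = spec;
Console.WriteLine(makeAndModel);  // prints "Ford Focus"
Console.WriteLine(numberOfSeats); // prints "5"

// Records also get a compiler-generated ToString.
Console.WriteLine(spec);
// prints "CarSpecification { MakeAndModel = Ford Focus, NumberOfSeats = 5 }"

public record CarSpecification(string MakeAndModel, int NumberOfSeats);
```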

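With the first form, the rest of the code is unchanged; the positional form only swaps the object initializer for constructor arguments. Instances of the record act the same as if they were instances of a class, except now we can take advantage of the other features of records.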

var carA = new Car(new CarSpecification("Ford Focus", 5));
var carB = new Car(new CarSpecification("Ford Focus", 5));

Console.WriteLine(carA.Specification == carB.Specification); // prints "True"!!

var hatchback = new CarSpecification("Ford Focus", 5);
var sportsModel = hatchback with { NumberOfSeats = 2 };

That last line, which uses the new with expression, is as close as C# has ever come to the spread syntax that we've all been watching over the fence into the JavaScript garden for years now!
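A short sketch of what the with expression actually does, using the positional CarSpecification record from above: it copies the record and changes only the members you name, leaving the original alone.

```csharp
using System;

var hatchback = new CarSpecification("Ford Focus", 5);
var sportsModel = hatchback with { NumberOfSeats = 2 };

// The original is untouched; "with" produced a modified copy.
Console.WriteLine(hatchback.NumberOfSeats);   // prints "5"
Console.WriteLine(sportsModel.NumberOfSeats); // prints "2"
Console.WriteLine(sportsModel.MakeAndModel);  // prints "Ford Focus"

// Equality compares values, so changing the seats back makes the copies equal again.
var backToFive = sportsModel with { NumberOfSeats = 5 };
Console.WriteLine(hatchback == sportsModel); // prints "False"
Console.WriteLine(hatchback == backToFive);  // prints "True"

// They are still two separate objects in memory.
Console.WriteLine(object.ReferenceEquals(hatchback, backToFive)); // prints "False"

public record CarSpecification(string MakeAndModel, int NumberOfSeats);
```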