Idempotence: A Primer

The word idempotence may not be common in one’s vernacular, but its application is of central importance when designing software systems.

When you sign up for a new account on a web service, you expect it to only happen once, even if the sign up button is clicked twice. Transferring money to your friends should never double deposit (or double withdraw) that money from either account.

To address these problems is to address the idempotence of an operation. An operation is said to be idempotent if, when performed more than once with the same input, it results in the same output and system changes as when performed once.

This goes well beyond simply disabling a button in a UI after the first click. As we’ll see, designing idempotence into the full stack of your application is paramount to architecting resilient and correctly functioning systems.

Closely related to idempotence is the topic of race conditions. Often, an effective solution to solving race conditions comes in the form of idempotent measures.

Let’s use an example from a well known platform like Stack Overflow. Suppose two moderators are attempting to make changes to a forum post. An author may want to edit their post, but at the same time a moderator is making changes to it. Who should win if both attempt to save their edits of the same post at the same time?

The answer to this question varies depending on the application. As we’ll see through the following examples, idempotent implementations are closely linked to the business rules of the system in which those processes operate.

A detailed example

Suppose you’ve been tasked with building a new coupon code generator. The generator should allow the user of a coupon to get a one-time (per code) discount on a product they’re purchasing. Under no circumstances should a person be able to use the same code more than once and receive a second discount. A relational table representing this coupon code could look like this:

Table "public.purchases"

    Column     |          Type          | Nullable | Default
---------------+------------------------+----------+------------------
id             | uuid                   | not null | gen_random_uuid()
user_id        | uuid                   | not null |
product_id     | uuid                   | not null |
coupon_code    | character varying      |          |

Indexes:
    "purchases_pkey" PRIMARY KEY, btree (id)

The business rules here are fairly simple, but our above table doesn’t yet prevent the possibility of duplicate coupon_codes from entering the system.

There are many possible solutions to this problem, but one simple solution could be to add a unique index onto a table column that tracks the redeemed coupon codes with the purchase of the product. That index would look like this:

CREATE UNIQUE INDEX "idx_unique_coupon_code" ON purchases (coupon_code)

Notice the UNIQUE constraint. Any attempt to insert the same coupon code twice into the purchases table would result in an exception.

Unique Index

This solution is likely too simplistic to realistically be used in any e-commerce system, but it illustrates a pattern effectively. The design affords a coupon_code to be “reserved” on purchases such that it can only be used once. The aptly named “Reservation Pattern” is an important pattern to understand when designing software systems. As seen above it can aid in solving problems of idempotence and race conditions. A “reservation” can most simply be defined as a change that can only be achieved a single time (based on a unique key or identity of some sort)

An alternative, incorrect attempt at a solution Frameworks often attempt to provide tooling to help with this sort of problem. A well known example is validates_uniqueness_of from ActiveRecord (in Rails). Let’s replace the unique index used above with a model validator like so:

class Purchase
  validates :coupon_code, uniqueness: true
end

If one were to insert a record to the purchases table, then instantiate a new instance of a Purchase with the same coupon_code, ActiveRecord won’t let you save that object, since one already exists in the database. But there’s a less obvious issue with this code in that it suffers from a possible race condition. To its credit, the ActiveRecord documentation is forthright about the possible concurrency issue, but to the uninformed it may not be entirely obvious at a glance.

The sequence of events looks like this:

validates_uniqueness_of Race Condition

In this case, ActiveRecord queries the state of the database, pulls the results into Ruby land and then makes a decision based on that state. This presence of a query in ActiveRecord’s uniqueness implementation is a red flag that the reader should pay close attention to when evaluating whether or not their code is idempotent.

In this example, the correct solution uses a common link (the database) between the two processes to implement idempotence. The incorrect solution puts the onus on each individual process to attempt to implement idempotence. We can generalize these findings with this statement:

Two independent processes cannot operate on the same piece of state (or copy of that state) and remain idempotent without a coordinating mechanism.¹

Example #2

Let’s take our coupon code system above and alter the business rules slightly. Instead of simply allowing a single redemption per coupon code, we want a system that allows a code to be used a certain number of times. That number is configurable for each coupon code. Our “uniqueness” or “reservation” solution simply won’t work in this scenario as we now need to be able to have multiple purchases that use the same coupon code. Again the possible solutions to this problem are endless, but I’ll propose one here. Note that this change in business rule necessitates a slightly more complicated solution to implement.

We might instead have a list of all the coupons in the system, and track the redemptions with a counter like so:

                      Table "public.coupons"

    Column      |       Type        | Nullable |      Default
----------------+-------------------+----------+-------------------
id              | uuid              | not null | gen_random_uuid()
user_id         | uuid              | not null |
code            | character varying | not null |
max_redemptions | integer           | not null |
redemptions     | integer           | not null | 0

Indexes:
  "coupons_pkey" PRIMARY KEY, btree (id)
  "coupons_code_key" UNIQUE CONSTRAINT, btree (code)
Check constraints:
  "coupons_check" CHECK (redemptions <= max_redemptions)

Notice how we’ve guaranteed that our redemptions count can’t go above the configured max_redemptions with a database constraint. Our purchases table can be modified to reference the coupons primary key here, then insertions to both can happen in a transaction and increment the redemptions each time a purchase is made.

One problem remains however. How do we safely increment the redemptions column on each purchase to ensure that only the max_redemptions amount of redemptions are made?

If we were to mimic the uniqueness validator pattern in Rails, we might read the coupon record, check the redemptions count, increment it, then write it back with a new purchase. It looks like this:

Expected Version Race Condition

We can see however that two processes doing the same thing would be prone to read the same redemptions count and increment it to the same next redemptions count, thus two purchases could be made, yet only one redemption recorded.

A common countermeasure to this problem would be to use a lock on the record so that we guarantee that no competing process is writing to the same coupons row at the same time. For the below solution, we’ll use an optimistic locking strategy, and in this case ActiveRecord actually provides us with an implementation that can help. The only change required is to add a new column onto our table that maintains a “version” of the record (which is incremented with each change). ActiveRecord uses a column named lock_version by default. By adding this column, ActiveRecord will read the version each time it fetches a row from the database. When attempting to write back to that row, it increments the lock_version by 1. If any other process has already updated the lock_version of the record at the same time, ActiveRecord is able to reject the competing update and raises an exception.

In short, with each write to the database, I set an expectation that I’m writing back an “expected version” of 1 greater than the current version I’ve read. If that isn’t true, it means something else has already done this, and thus our current attempt is no longer valid and needs to be retried. We can reload the record and try it again, and at some point our database constraint will kick in, disallowing any attempts that would put redemptions above the configured max_redemptions. It looks like this:

Expected Version Concurrency Protection

Due to MVCC (on Postgres) only one of these attempts at updating the row will succeed because the first one will update the redemptions count to 6 and the 2nd one will no longer satisfy the where redemptions count == 5 condition, thus it will not make any changes. How you choose to implement this check is entirely up to you. (ActiveRecord raises an exception if you’re using its built in Optimistic Locking)

We’ve now introduced two useful patterns for implementing idempotence and safely handling race conditions. Reservations allow us to guarantee that a distinct process happens only once and locking mechanisms are a useful countermeasure to race conditions and problems which can arise through concurrency.

I encourage you to evaluate the business processes in your own system through the lens of concurrency and idempotence. Perhaps you’ll discover weaknesses in the designs that may benefit from these patterns.

A subsequent post will delve into distributed systems and messaging patterns using pub-sub, and how idempotence and locking strategies can play a key role in ensuring the correctness of your system.

CRDTs are an example of a data structure for which this statement isn’t true but that’s beyond the scope of this article ↩