Harry Gordon

Umbraco locking and scope handling

The dangers of the Umbraco scope and scope provider

May 17, 2024

Work in progress

This article is a work in progress! I make this caveat because my conclusions are quite surprising:

Scope provider is a singleton managing scopes across all requests/threads but expects them to be resolved in order. The result is locking problems in Umbraco that are common but hard to reproduce.

This is huge, if true (as the internet likes to say), so please, check my working and tell me if you think I'm wrong.

Changes

17th May

  • Added an example of child scope being mishandled.
  • Fixed an issue with figure 1 pointing to localhost.
  • Clarified that making ScopeProvider thread-safe will be harder than it sounds, not easier.

Introduction

Like most Umbraco developers I’ve stumbled into the dreaded “Failed to acquire write lock for id: -333” a few times. When it occurred in my project, none of the usual suspects seemed to be the root cause (Examine, publishing massive amounts of content or a load-balanced site) but it was clear that something was locking the content tree (-333) and never releasing it.

Eventually I worked out that the root cause in my project was a problem as old as programming: an infinite loop. This infinite loop happened in a content import task and was preventing an Umbraco scope instance from disposing - meaning the scope’s write lock was never released. Boom! 

Developer error didn’t seem to be a likely cause for the widespread locking issues reported by the community but it did get me thinking: mishandling scopes is a great way to lock up Umbraco.

TL;DR: After investigation and some help from the community, I think I’ve identified a flaw in Umbraco that is causing some of the locking issues, as well as some possible solutions. If you’d just like to read my conclusions you can skip to the end!

Locking in Umbraco

Locking issues in Umbraco can show up in a few ways. A read lock on the content tree will cause most of the back-office UI to fail to load and a write lock will prevent saving changes to content. Since it’s a database-level lock, the error you see will depend on various timeouts:

  • With default settings, your interface will hang for a long time and you’ll eventually see a SqlException.
  • If your distributed locking timeout setting is shorter (during testing I set it to 5-10 seconds), you’ll get distributed locking exceptions instead (like DistributedReadLockTimeoutException).

Releasing the lock

In a pinch, you can release a DB lock manually (thank you to Chris Karkowsky for this). Essentially, you need to find the blocking session and kill it.

Investigating Umbraco scope

The Umbraco scope isn’t well documented, and reading the source code isn’t immediately helpful, although it does start with a foreboding comment: “Not thread-safe obviously” *1

I haven’t seen too much discussion of scope management in the context of locking problems but a scope-related exception about child scopes not being disposed has been highlighted.

To start my investigation I wanted to deepen my own understanding of how to properly handle Umbraco scopes and what happened when they were mishandled.

Scope and Scope Provider

There are two key parts to scope handling in Umbraco: Scope and ScopeProvider. The scope itself tracks any locks and messages, as well as providing access to the database (this is usually why a scope is created). The scope provider is a singleton that tracks the chain of scopes that exist (in a parent-child relationship), including which scope is the current (or “ambient”) scope. This is a simplification but sufficient for our purposes.

Figure 1: A simple example of ScopeProvider's chain of scopes
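To make the parent-child chain concrete, here is a minimal sketch in Python. It is a toy model of the relationship described above, not Umbraco's code: ToyScope, ToyScopeProvider, create_scope and chain are hypothetical names invented for illustration.

```python
from __future__ import annotations


class ToyScope:
    """Stand-in for a scope: it only remembers its parent in the chain."""

    def __init__(self, name: str, parent: ToyScope | None) -> None:
        self.name = name
        self.parent = parent


class ToyScopeProvider:
    """Stand-in for the provider: it tracks which scope is currently ambient."""

    def __init__(self) -> None:
        self.ambient: ToyScope | None = None

    def create_scope(self, name: str) -> ToyScope:
        # A new scope becomes a child of whatever is ambient right now.
        scope = ToyScope(name, parent=self.ambient)
        self.ambient = scope
        return scope

    def chain(self) -> list[str]:
        # Walk from the ambient scope back up to the root.
        names, scope = [], self.ambient
        while scope is not None:
            names.append(scope.name)
            scope = scope.parent
        return names


provider = ToyScopeProvider()
provider.create_scope("A")
provider.create_scope("B")
print(provider.chain())  # ['B', 'A']
```

Scope B is ambient and scope A is its parent, purely because A happened to be ambient when B was created.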

A quick Scope example

In this example, we want to get some content and then create a duplicate of a blog post. Note that we create the scope with scope provider and ensure it’s disposed of with a using statement. The scope provides the SQL context we need for our query. We also complete the scope (because it seems like the right thing to do).

What happens if we don’t complete?

It’s probably safe to assume that completing a scope is important but what does it do? On investigation, calling Complete doesn’t do much at the time of the call. Its effect comes later: once all scopes are disposed of, the database transaction will be completed.

Not completing a scope is totally safe; you just won’t save any changes.
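The idea that Complete only records intent, while disposal decides the outcome, can be sketched with a Python context manager standing in for a using statement. This is a toy model with hypothetical names (ToyTransaction, ToyScope, run), and treating "not saved" as a rollback is my assumption about what happens underneath.

```python
class ToyTransaction:
    """Stand-in for the database transaction behind a scope."""

    def __init__(self):
        self.outcome = None  # becomes "committed" or "rolled back"


class ToyScope:
    def __init__(self, transaction):
        self.transaction = transaction
        self.completed = False

    def complete(self):
        # Only records intent; nothing is written at this point.
        self.completed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Disposal is where the fate of the transaction is decided.
        self.transaction.outcome = "committed" if self.completed else "rolled back"
        return False


def run(call_complete):
    tx = ToyTransaction()
    with ToyScope(tx) as scope:
        if call_complete:
            scope.complete()
        assert tx.outcome is None  # still undecided inside the block
    return tx.outcome


print(run(call_complete=True))   # committed
print(run(call_complete=False))  # rolled back
```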

What happens if we mishandle scopes?

There are a few good ways to mishandle scopes:

Fail to dispose of the root scope

If you create a scope when there is no pre-existing scope, then fail to dispose of it, you won’t immediately see an exception. The resulting problem will depend on the state of the scope: if it holds locks, they won’t be released, and if it has a transaction/database connection open, that won’t be completed.
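This is the failure I hit with my import task, and it can be sketched as a toy model in Python. ToyLockTable and ToyScope are hypothetical stand-ins; unlike a real database, the model fails immediately instead of waiting for a timeout.

```python
CONTENT_TREE = -333  # the lock id quoted in the error message in the introduction


class ToyLockTable:
    """Stand-in for the database's lock bookkeeping."""

    def __init__(self):
        self.holders = {}  # lock id -> name of the scope holding it

    def acquire(self, lock_id, owner):
        # A real database would block until a timeout; the model fails at once.
        if lock_id in self.holders:
            raise TimeoutError(f"Failed to acquire write lock for id: {lock_id}")
        self.holders[lock_id] = owner

    def release_all(self, owner):
        self.holders = {k: v for k, v in self.holders.items() if v != owner}


class ToyScope:
    def __init__(self, name, locks):
        self.name, self.locks = name, locks

    def write_lock(self, lock_id):
        self.locks.acquire(lock_id, self.name)

    def dispose(self):
        self.locks.release_all(self.name)


locks = ToyLockTable()
leaked = ToyScope("import-task", locks)
leaked.write_lock(CONTENT_TREE)
# ... the import task gets stuck here, so leaked.dispose() is never reached ...

editor = ToyScope("back-office-save", locks)
try:
    editor.write_lock(CONTENT_TREE)
    result = "acquired"
except TimeoutError as error:
    result = str(error)
print(result)  # Failed to acquire write lock for id: -333
```

Nothing goes wrong at the point of the mistake. The error only surfaces later, in an unrelated piece of code that asks for the same lock.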

Fail to dispose of a “child” scope

If you create a scope that isn’t the root scope, then fail to dispose of it before the parent scope is disposed, you’ll get an exception (see Scope.Dispose). The exception might be familiar to anyone dealing with locking issues and is already an exception of interest:

"The Scope [id] being disposed is not the Ambient Scope [id]. This typically indicates that a child Scope was not disposed, or flowed to a child thread that was not awaited, or concurrent threads are accessing the same Scope (Ambient context) which is not supported. If using Task.Run (or similar) as a fire and forget tasks or to run threads in parallel you must suppress execution context flow with ExecutionContext.SuppressFlow() and ExecutionContext.RestoreFlow()."

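The check behind that message can be sketched as follows. This is a toy model in Python with hypothetical names (ToyScopeProvider, ToyScope, resources_released), and the fact that the model releases nothing once the check fails is my reading of the behaviour, not a copy of Scope.Dispose.

```python
class ToyScopeProvider:
    def __init__(self):
        self.ambient = None

    def create_scope(self, name):
        scope = ToyScope(name, self, parent=self.ambient)
        self.ambient = scope
        return scope


class ToyScope:
    def __init__(self, name, provider, parent):
        self.name, self.provider, self.parent = name, provider, parent
        self.resources_released = False

    def dispose(self):
        ambient = self.provider.ambient
        if ambient is not self:
            # The check fails before any resources are released.
            ambient_name = ambient.name if ambient is not None else "none"
            raise RuntimeError(
                f"The Scope {self.name} being disposed is not "
                f"the Ambient Scope {ambient_name}."
            )
        self.resources_released = True
        self.provider.ambient = self.parent


provider = ToyScopeProvider()
parent = provider.create_scope("parent")
child = provider.create_scope("child")  # never disposed

try:
    parent.dispose()
    message = "disposed"
except RuntimeError as error:
    message = str(error)

print(message)  # The Scope parent being disposed is not the Ambient Scope child.
print(parent.resources_released)  # False
```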
Conclusion

It seems that Umbraco’s locking issues (at least in a non-distributed environment *2) are probably related to scope handling, especially since scopes are managed by a singleton shared across requests/threads.

What’s going on?

There are a couple of key problems here:

  • An implementation of Dispose should not throw exceptions. This is considered best practice with good reason: failing to dispose of resources can disable an application (DB connections, locks, file read/write) or, in some cases, the app service (HTTP connections).

  • Scope provider is a singleton responsible for all scopes (across all requests) and relies on the chain of scopes being disposed of in order. If a scope is disposed out of order, disposal will fail.

Considering this it’s not surprising Umbraco locking issues are commonplace.

A simulated example

Here’s a quick example that aims to simulate multiple requests/threads using scopes in an overlapping way. The function includes options for the number of overlapping operations and whether or not a write lock is requested.

The individual scopes start at random intervals (between 0 and 1000ms) and are disposed of after the copy operation, plus another random interval. When each scope is created it is added to the chain of scopes maintained by scope provider (scope B created at 200ms is automatically a child of scope A created at 150ms). In my tests, this function failed most of the time because a parent scope was disposed of before the child scope.
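The timing argument can be reproduced without Umbraco at all. The sketch below is a Python stand-in for the idea, not the function described above: run_schedule and simulate are hypothetical names, there is no write lock option, and the single shared ambient pointer is an assumption taken from my description of scope provider.

```python
import random


def run_schedule(intervals):
    """Count disposals that hit a scope which is not the ambient one.

    intervals is a list of (start, end) times, one pair per operation.
    """
    events = []
    for op, (start, end) in enumerate(intervals):
        events.append((start, "create", op))
        events.append((end, "dispose", op))

    ambient = None   # the single, shared "current scope"
    parent_of = {}
    failures = 0
    for _, kind, op in sorted(events):
        if kind == "create":
            parent_of[op] = ambient   # child of whatever is ambient now
            ambient = op
        elif ambient == op:
            ambient = parent_of[op]   # disposed in order
        else:
            failures += 1             # a younger scope is still ambient
    return failures


def simulate(operations, seed=None):
    """Random start times (0-1000ms) plus a random amount of work and delay."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(operations):
        start = rng.uniform(0, 1000)
        intervals.append((start, start + rng.uniform(1, 1000)))
    return run_schedule(intervals)


print(run_schedule([(0, 100), (10, 50)]))  # 0: child disposed before its parent
print(run_schedule([(0, 50), (10, 100)]))  # 1: parent disposed before its child
print(simulate(operations=5, seed=1))
```

Two operations that merely overlap, with the first one finishing first, are enough to trigger a failure in this model. Nothing about either operation is wrong in isolation.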

Possible solutions

I don’t have a clear-cut solution at this time but I have a few suggestions:

  1. Make ScopeProvider scoped: Currently ScopeProvider has a singleton lifetime but I don’t see why it shouldn’t be scoped to the request. Normally I would expect a database scope to be handled at the request level. This comes with a huge caveat: there is presumably a good reason why ScopeProvider is a singleton, and I just can’t see it!

  2. Stop Scope from throwing on Dispose: Scopes being disposed out of order probably brings different problems but Scope.Dispose should probably log the issue and continue to dispose of its resources, rather than throwing.

  3. Make ScopeProvider thread-safe: This is probably harder than it sounds but we could keep ScopeProvider as a singleton and make it thread-safe. ScopeProvider could track all scopes (without the notion of a chain, probably) and ensure that only the last scope resulted in transaction completion, etc.
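Suggestion 2 could look something like the sketch below. It is a toy model in Python with hypothetical names (ToyScopeProvider, ToyScope, disposed, resources_released), and it deliberately ignores the harder question of what should happen to a transaction shared between the scopes.

```python
import logging

logger = logging.getLogger("toy_scope")


class ToyScopeProvider:
    def __init__(self):
        self.ambient = None

    def create_scope(self, name):
        scope = ToyScope(name, self, self.ambient)
        self.ambient = scope
        return scope


class ToyScope:
    def __init__(self, name, provider, parent):
        self.name, self.provider, self.parent = name, provider, parent
        self.disposed = False
        self.resources_released = False

    def dispose(self):
        if self.provider.ambient is not self:
            # Report the problem, but carry on instead of raising.
            logger.warning("Scope %s disposed out of order", self.name)
        self.resources_released = True  # locks, connection, etc.
        self.disposed = True
        if self.provider.ambient is self:
            # Hand the ambient role to the nearest ancestor that is still alive.
            ancestor = self.parent
            while ancestor is not None and ancestor.disposed:
                ancestor = ancestor.parent
            self.provider.ambient = ancestor


provider = ToyScopeProvider()
parent = provider.create_scope("parent")
child = provider.create_scope("child")

parent.dispose()  # out of order: logged, but resources are still released
child.dispose()

print(parent.resources_released)  # True
print(provider.ambient)           # None
```

The out-of-order disposal is still reported, but the parent's resources are released either way and the chain ends up empty once the child is disposed.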

Ultimately this is a far-reaching problem and any changes to scope handling will probably impact performance and behaviour, so any changes should be made with all due care.

Footnotes

*1: In seriousness, this comment is a bit scary (particularly because it doesn’t give enough context) but Umbraco scopes shouldn’t be used across threads, so in light of that it makes sense.

*2: As mentioned, I haven’t investigated this issue in a distributed (load balanced) environment but this issue would manifest there too and there are undoubtedly other ways to cause locking issues unique to that configuration - so distributed environments would fare worse.

Thanks for reading!

If you'd like to talk about anything in this post, give me a shout on the Umbraco Discord, Mastodon or other means.
