Scaling past Dataverse API limits with rotating application users

Dataverse service protection limits are per user - and that's the loophole. Six thousand requests or 1,200 seconds of execution time in a five-minute sliding window, plus a hard cap of 52 concurrent requests at any moment: trip any one and the user gets locked out. Per user. Which means the fix is almost never "make fewer requests" - you've already tried - it's "spread them across more users". This post walks through doing that cleanly with HttpClient, Azure.Identity, and a delegating handler that locks throttled users and rotates to the next available one. The full code lives at github.com/georgpfeiffer/dataverse-showcase.

The limits

Service protection is documented in the Dataverse API limits reference. Three numbers, all measured per user. Two of them are evaluated over a five-minute sliding window; the third is an instant cap on concurrency:

Limit	Value
Number of requests	6,000
Combined execution time	1,200 seconds (20 minutes)
Concurrent requests	52 or higher

Cross any one of them and Dataverse responds with 429 Too Many Requests and a Retry-After header. For the first two limits the window slides, so you don't have to wait the full five minutes - but you do have to back off the throttled user. The concurrent limit is different: it trips the moment you exceed it, which means a parallel fan-out can punch through it instantly where the other two would still be forgiving. The numbers in the table are documented defaults - Microsoft hedges that they can vary by environment - but the shape of the limits is stable.
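To make the sliding-window behaviour concrete, here is a minimal client-side sketch of the request-count limit only. `SlidingWindowCounter` is a hypothetical helper, not part of Dataverse or the showcase repo. The server's own accounting is authoritative and also tracks execution time and concurrency, which this model ignores.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: counts requests inside a sliding window so a caller can
// estimate how close one user is to the documented request limit.
// Not thread-safe; wrap it in a lock if several threads share one instance.
public sealed class SlidingWindowCounter
{
    private readonly Queue<DateTimeOffset> _timestamps = new();
    private readonly TimeSpan _window;
    private readonly int _limit;

    public SlidingWindowCounter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Records a request at 'now'. Returns false if it would exceed the limit.
    public bool TryRecord(DateTimeOffset now)
    {
        // Drop entries that have slid out of the window.
        while (_timestamps.Count > 0 && now - _timestamps.Peek() >= _window)
        {
            _timestamps.Dequeue();
        }

        if (_timestamps.Count >= _limit)
        {
            return false;
        }

        _timestamps.Enqueue(now);
        return true;
    }
}
```

With `new SlidingWindowCounter(6000, TimeSpan.FromMinutes(5))`, capacity comes back one request at a time as the oldest entries age out, which is why a throttled user usually recovers well before five minutes have passed.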

The per-user framing is the lever. It's not a global ceiling on your tenant - every application user gets its own independent budget, which makes the limit cheap to scale: spin up another app user, claim another full quota, no user license required.
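As a back-of-the-envelope illustration of that scaling, the sketch below converts the documented default request limit into a sustained rate. `QuotaMath` is a hypothetical name, and the numbers assume the default 6,000 requests per 300 seconds; it ignores the execution-time and concurrency limits, so treat the result as a rough ceiling for the request count only.

```csharp
using System;

// Hypothetical helper: sustained request rate implied by the request-count limit alone.
public static class QuotaMath
{
    public static double SustainedRequestsPerSecond(
        int users,
        int requestsPerWindow = 6000,   // documented default, may vary by environment
        int windowSeconds = 300)        // five-minute window
    {
        return users * (double)requestsPerWindow / windowSeconds;
    }
}
```

One user works out to 20 requests per second sustained; five users to 100.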

And one thing the docs bury: the 6,000 isn't actually a per-user ceiling, it's per user per web server. Most production environments have more than one web server, so you're already getting a quiet multiplier on top of user rotation - not a lever you directly pull (there's an affinity cookie that pins clients to one server by default), but a baseline more generous than the table suggests.

There is one exemption worth knowing: plug-ins and custom workflow activities are excluded from service protection because they run inside the sandbox. This sounds like a free pass and isn't. The compute time plug-ins spend still gets added to the triggering request's execution-time budget, so the exemption is narrower than it sounds. Beyond that, plug-ins push the load inside Dataverse instead of removing it, they have their own 2-minute execution limit, they run on .NET Framework, and they trigger each other in ways that turn orchestration into plug-in-fires-plug-in-fires-plug-in spaghetti at the speed you'd expect. Moving every integration concern into plug-ins to dodge service protection is a trade you'll regret. Treat the exemption as nice-to-know, not as a strategy.

The baseline: one application user

Before we rotate anything, here's the boring case - one application user, one bearer token, attached to every outgoing request. We use Azure.Identity's ClientSecretCredential so we don't have to hand-roll token caching or refresh:

csharp

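// Not covered by the SDK's implicit usings:
using System.Net.Http.Headers; // AuthenticationHeaderValue
using Azure.Core;              // TokenCredential, TokenRequestContext

public class BearerTokenHandler : DelegatingHandler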
{
    private readonly TokenCredential _credential;
    private readonly string[] _scopes;

    public BearerTokenHandler(TokenCredential credential, string[] scopes)
    {
        _credential = credential;
        _scopes = scopes;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        var token = await _credential.GetTokenAsync(
            new TokenRequestContext(_scopes), cancellationToken);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token.Token);
        return await base.SendAsync(request, cancellationToken);
    }
}

TokenCredential is the Azure.Identity abstraction over every credential type the SDK supports - ClientSecretCredential, ManagedIdentityCredential, DefaultAzureCredential, the lot. The concrete Microsoft credentials cache and proactively refresh tokens for you when you reuse the instance, so a delegating handler stays a one-liner. You don't need MSAL on top, you don't need a token store, you don't need to track expiry yourself.

Wire it up against IHttpClientFactory:

csharp

var credential = new ClientSecretCredential(tenantId, clientId, clientSecret);

services
    .AddHttpClient<DataverseApiClient>(c => c.BaseAddress = new Uri(baseUrl))
    .AddHttpMessageHandler(() => new BearerTokenHandler(credential, [scope]));

This is fine until the day you start hitting 429s. Then you need more users.

Rotating users on 429

The idea: register N application users, round-robin through them, and when one gets throttled, lock it out for the duration the Retry-After header tells you, then move on to the next available user. The whole thing fits inside one delegating handler.

csharp

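using System.Net;                   // HttpStatusCode
using System.Net.Http.Headers;      // AuthenticationHeaderValue
using Azure.Core;                   // TokenRequestContext
using Microsoft.Extensions.Logging; // ILogger

public class DataverseUserRotationHandler : DelegatingHandler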
{
    private readonly IDataverseUserManager _userManager;
    private readonly string[] _scopes;
    private readonly ILogger _logger;

    public DataverseUserRotationHandler(
        IDataverseUserManager userManager,
        string[] scopes,
        ILogger logger)
    {
        _userManager = userManager;
        _scopes = scopes;
        _logger = logger;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        // HttpRequestMessage is send-once, so buffer the content and clone per attempt
        byte[]? bufferedContent = null;
        if (request.Content != null)
        {
            bufferedContent = await request.Content.ReadAsByteArrayAsync(cancellationToken);
        }

        for (var attempt = 0; attempt < _userManager.UserCount; attempt++)
        {
            var user = _userManager.GetAvailableUser()
                ?? throw new DataverseThrottledException("All Dataverse users are rate-limited");

            var token = await user.Credential.GetTokenAsync(
                new TokenRequestContext(_scopes), cancellationToken);

            using var clonedRequest = CloneRequest(request, bufferedContent);
            clonedRequest.Headers.Authorization = new AuthenticationHeaderValue("Bearer", token.Token);

            var response = await base.SendAsync(clonedRequest, cancellationToken);
            if (response.StatusCode != HttpStatusCode.TooManyRequests)
            {
                return response;
            }

            var retryAfter = ParseRetryAfter(response);
            _logger.LogWarning(
                "Dataverse user '{UserName}' throttled (429), locking for {RetryAfter}",
                user.Name, retryAfter);

            _userManager.Lock(user.Name, retryAfter);
            response.Dispose();
        }

        throw new DataverseThrottledException("All Dataverse users are rate-limited");
    }
}

Walking through it: buffer the request content once at the top (we'll come back to why), then for each attempt pull an available user from the manager, ask Azure.Identity for a token (cached per user, so this is free after the first call), clone the original request from the buffered content, attach the token to the clone, and send the clone. If the response is anything other than 429, hand it back to the caller. If it is 429, parse the retry delay, lock the user out for that duration, and loop. After cycling through every user once, give up and throw - what to do next is the caller's call (more on that in the gotchas).

The reason for the clone is that HttpRequestMessage is documented send-once: content streams get consumed, and some handlers reject a second send outright. For GET/DELETE without a body you can get away with re-sending the same instance, but the moment you point the handler at a POST or a PATCH - which is most of what integration code does - the second attempt after a 429 fails because the body stream is already drained. Buffering the content into a byte[] once and rebuilding the request per attempt is the clean fix:

csharp

private static HttpRequestMessage CloneRequest(HttpRequestMessage request, byte[]? bufferedContent)
{
    var clone = new HttpRequestMessage(request.Method, request.RequestUri)
    {
        Version = request.Version
    };

    foreach (var header in request.Headers)
    {
        clone.Headers.TryAddWithoutValidation(header.Key, header.Value);
    }

    foreach (var option in (IDictionary<string, object?>)request.Options)
    {
        ((IDictionary<string, object?>)clone.Options)[option.Key] = option.Value;
    }

    if (bufferedContent != null)
    {
        clone.Content = new ByteArrayContent(bufferedContent);
        if (request.Content != null)
        {
            foreach (var header in request.Content.Headers)
            {
                clone.Content.Headers.TryAddWithoutValidation(header.Key, header.Value);
            }
        }
    }

    return clone;
}

ParseRetryAfter is unremarkable but worth showing because the Retry-After header has two legal forms - a delta in seconds or an HTTP date - and you should handle both:

csharp

private static TimeSpan ParseRetryAfter(HttpResponseMessage response)
{
    var header = response.Headers.RetryAfter;
    if (header == null)
    {
        return TimeSpan.FromSeconds(30);
    }

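    if (header.Delta.HasValue)
    {
        // Floor a zero delta like the date branch does: IMemoryCache rejects
        // a non-positive relative expiration when the user gets locked.
        return header.Delta.Value > TimeSpan.Zero
            ? header.Delta.Value
            : TimeSpan.FromSeconds(1);
    }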

    if (header.Date.HasValue)
    {
        var delay = header.Date.Value - DateTimeOffset.UtcNow;
        return delay > TimeSpan.Zero ? delay : TimeSpan.FromSeconds(1);
    }

    return TimeSpan.FromSeconds(30);
}

Dataverse returns the delta form in practice, but the spec allows either. The 30-second fallback when the header is missing or malformed is a deliberate "long enough to matter, short enough not to stall everything" default - pick a number you can live with.
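To see how .NET surfaces the two forms, the sketch below builds one 429 response of each kind. `RetryAfterForms` is a hypothetical helper, intended as a test fixture for a parser like ParseRetryAfter; it relies only on the `RetryConditionHeaderValue` constructors in System.Net.Http.Headers.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical test fixtures: one 429 response per legal Retry-After form.
public static class RetryAfterForms
{
    // Delta form, e.g. "Retry-After: 30". Populates RetryAfter.Delta only.
    public static HttpResponseMessage WithDelta(TimeSpan delta)
    {
        var response = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
        response.Headers.RetryAfter = new RetryConditionHeaderValue(delta);
        return response;
    }

    // HTTP-date form. Populates RetryAfter.Date only.
    public static HttpResponseMessage WithDate(DateTimeOffset date)
    {
        var response = new HttpResponseMessage(HttpStatusCode.TooManyRequests);
        response.Headers.RetryAfter = new RetryConditionHeaderValue(date);
        return response;
    }
}
```

Exactly one of `Delta` and `Date` is set on any given header value, which is why the parser checks them in turn rather than combining them.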

The pieces

The handler delegates to two small abstractions: a manager that picks the next user, and a store that tracks who's locked.

csharp

public interface IDataverseUserManager
{
    DataverseUser? GetAvailableUser();
    void Lock(string userName, TimeSpan retryAfter);
    int UserCount { get; }
}
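The handler and the manager both refer to DataverseUser and DataverseThrottledException, which this post doesn't list. A minimal sketch of what they might look like, assuming only what the handler actually uses (a `Name` for locking and logging, a `Credential` for tokens, and a message-only exception); the definitions in the repo may differ:

```csharp
using System;
using Azure.Core;

// Assumed shape: the handler reads user.Name and user.Credential, nothing else.
public sealed record DataverseUser(string Name, TokenCredential Credential);

// Assumed shape: thrown when every user in the pool is currently locked.
public sealed class DataverseThrottledException : Exception
{
    public DataverseThrottledException(string message) : base(message)
    {
    }
}
```

Each DataverseUser would typically wrap its own ClientSecretCredential, one per app registration, so token caching stays per user.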

The manager is round-robin with skip-locked semantics. It uses Interlocked.Increment to advance the cursor so concurrent callers don't pile onto the same user, and it walks at most one full lap before giving up:

csharp

public class DataverseUserManager : IDataverseUserManager
{
    private readonly DataverseUser[] _users;
    private readonly IDataverseUserStore _store;
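    private uint _nextIndex;

    // Argument order matches the registration in the DI extension.
    public DataverseUserManager(IDataverseUserStore store, DataverseUser[] users)
    {
        _store = store;
        _users = users;
    }

    public int UserCount => _users.Length;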

    public DataverseUser? GetAvailableUser()
    {
        var startIndex = Interlocked.Increment(ref _nextIndex);
        var length = (uint)_users.Length;

        for (var i = 0u; i < length; i++)
        {
            var user = _users[(startIndex + i) % length];
            if (_store.IsAvailable(user.Name))
            {
                return user;
            }
        }

        return null;
    }

    public void Lock(string userName, TimeSpan retryAfter)
        => _store.Lock(userName, retryAfter);
}

The uint field is a small piece of defensive engineering: Interlocked.Increment on a signed int wraps to int.MinValue after about 2.1 billion calls, and C#'s % preserves the sign of the dividend, so a negative startIndex would eventually produce a negative index and throw IndexOutOfRangeException. uint wraps cleanly through zero, and % length on a non-negative value is always non-negative. At 1,000 calls per second a signed wrap would hit in roughly 24 days - well within the lifetime of a long-running integration service. Cheap insurance.
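The wrap behaviour is easy to check in isolation. The sketch below, with the hypothetical name `WrapDemo`, starts each counter at its maximum and advances it once, which is the state a long-running service eventually reaches:

```csharp
using System;

// Hypothetical demo: index computed right after the counter wraps.
public static class WrapDemo
{
    public static int SignedIndexAfterWrap(int length)
    {
        var counter = int.MaxValue;
        counter = unchecked(counter + 1); // wraps to int.MinValue
        return counter % length;          // remainder keeps the dividend's sign
    }

    public static uint UnsignedIndexAfterWrap(uint length)
    {
        var counter = uint.MaxValue;
        counter = unchecked(counter + 1); // wraps to 0
        return counter % length;          // always in [0, length)
    }
}
```

For a pool of three users, the signed version yields -2, which is not a valid array index, while the unsigned version yields 0.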

The store is a two-method abstraction so we can swap implementations:

csharp

public interface IDataverseUserStore
{
    bool IsAvailable(string userName);
    void Lock(string userName, TimeSpan duration);
}

For a single-process app, IMemoryCache is enough. The cache entry's expiration is the lock - when it falls out of the cache, the user is available again. No background thread, no cleanup, nothing to maintain:

csharp

public class InMemoryDataverseUserStore : IDataverseUserStore
{
    private readonly IMemoryCache _cache;

    public InMemoryDataverseUserStore(IMemoryCache cache)
    {
        _cache = cache;
    }

    public bool IsAvailable(string userName) => !_cache.TryGetValue(userName, out _);

    public void Lock(string userName, TimeSpan duration) =>
        _cache.Set(userName, true, duration);
}

⚠ Single-process only

The in-memory store works for one app instance. The moment you scale out - Azure Functions consumption plan, multi-pod Kubernetes, multiple replicas behind a load balancer - each instance has its own isolated view of which user is locked. Other instances will keep hitting 429s until they discover the lock themselves. Swap the store for one backed by `IDistributedCache` (Redis is the obvious pick) and the rest of the implementation doesn't change.
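A minimal sketch of such a store, assuming the synchronous `IDistributedCache` methods are acceptable for your workload. `DistributedDataverseUserStore` and the key prefix are hypothetical names; the interface is repeated from above only so the sketch compiles on its own.

```csharp
using System;
using Microsoft.Extensions.Caching.Distributed;

// Repeated from the post so this sketch is self-contained.
public interface IDataverseUserStore
{
    bool IsAvailable(string userName);
    void Lock(string userName, TimeSpan duration);
}

// Hypothetical store: the cache entry's expiration is the lock, as in the
// in-memory version, but every app instance sees the same entries.
public sealed class DistributedDataverseUserStore : IDataverseUserStore
{
    private const string KeyPrefix = "dataverse-user-lock:"; // assumed naming
    private static readonly byte[] Marker = { 1 };
    private readonly IDistributedCache _cache;

    public DistributedDataverseUserStore(IDistributedCache cache)
    {
        _cache = cache;
    }

    // No entry means no lock.
    public bool IsAvailable(string userName) =>
        _cache.Get(KeyPrefix + userName) is null;

    public void Lock(string userName, TimeSpan duration) =>
        _cache.Set(KeyPrefix + userName, Marker, new DistributedCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = duration
        });
}
```

One trade-off to weigh: against Redis, every `IsAvailable` call becomes a network round trip on the request path, so a short-lived local cache in front of it may be worth considering.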

Wiring it up

A DI extension to register the whole stack:

csharp

public static IHttpClientBuilder AddDataverseApiClient(
    this IServiceCollection services,
    string baseUrl,
    string scope,
    params DataverseUser[] users)
{
    services.AddMemoryCache();
    services.TryAddSingleton<IDataverseUserStore, InMemoryDataverseUserStore>();
    services.TryAddSingleton<IDataverseUserManager>(sp => new DataverseUserManager(
        sp.GetRequiredService<IDataverseUserStore>(), users));

    return services
        .AddHttpClient<DataverseApiClient>(c => c.BaseAddress = new Uri(baseUrl))
        .AddHttpMessageHandler(sp => new DataverseUserRotationHandler(
            sp.GetRequiredService<IDataverseUserManager>(),
            [scope],
            sp.GetRequiredService<ILoggerFactory>().CreateLogger<DataverseUserRotationHandler>()));
}

And a domain client on top. Composition over inheritance: DataverseApiClient is a thin facade injected into whatever you're actually building:

csharp

public sealed class AccountsClient
{
    private readonly DataverseApiClient _dataverse;

    public AccountsClient(DataverseApiClient dataverse)
    {
        _dataverse = dataverse;
    }

    public Task<IReadOnlyList<AccountDataverseModel>> GetTopAsync(CancellationToken cancellationToken)
    {
        return _dataverse.GetODataAsync<AccountDataverseModel>(
            "api/data/v9.2/accounts?$select=accountid,name&$top=5", cancellationToken);
    }
}

AccountsClient doesn't know about rotation, locking, or 429s. None of the business code does. The handler chain absorbs all of it.

Gotchas

Audit and ownership

Each rotated user is its own Entra app registration with its own Dataverse application user, so createdby / modifiedby on every record reflects whichever user got picked at the moment the request fired. You already had this problem with one application user - your audit log was attributing every change to "Dynamics API User" rather than a real human - but rotation turns that single noisy attribution into N noisy attributions.

If you care about clean audit trails, use impersonation: add the CallerObjectId header with the Entra object ID of a canonical service account on every request, and give your rotating users the Act on Behalf of Another User privilege (prvActOnBehalfOfAnotherUser). That pins createdby / modifiedby to the impersonated user regardless of which rotating user signed the call. (ownerid is a separate concern - it controls record ownership, not audit attribution, so set it explicitly on writes if you also care about ownership.)
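One way to add the header is another delegating handler. The sketch below is an assumption about how you might wire it, not code from the repo; `ImpersonationHandler` is a hypothetical name, and the object ID is whatever canonical service account you choose.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler: stamps every outgoing request with CallerObjectId so
// createdby / modifiedby resolve to one canonical account.
public sealed class ImpersonationHandler : DelegatingHandler
{
    private const string HeaderName = "CallerObjectId";
    private readonly Guid _callerObjectId;

    public ImpersonationHandler(Guid callerObjectId)
    {
        _callerObjectId = callerObjectId;
    }

    // Replaces any existing value so a request never carries the header twice.
    public static void Stamp(HttpRequestMessage request, Guid callerObjectId)
    {
        request.Headers.Remove(HeaderName);
        request.Headers.Add(HeaderName, callerObjectId.ToString());
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        Stamp(request, _callerObjectId);
        return base.SendAsync(request, cancellationToken);
    }
}
```

Because CloneRequest copies request headers, this handler should work on either side of the rotation handler in the chain.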

The retry stops at the handler

The handler walks through every user once. If they're all locked when the request comes in, it throws DataverseThrottledException. There's no outer-loop wait-and-retry baked in, on purpose - the right place for that is the operation that wraps the call. Ideally that operation is itself a Service Bus-triggered function, where Azure Service Bus handles the retry, the dead-letter, and the back-pressure for you. That's a whole other post and it deserves one.
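If there is no queue in front of the call, the wrapping operation can do a bounded wait-and-retry itself. A minimal sketch under that assumption: `ThrottleRetry` is a hypothetical helper, the fixed delay is a placeholder for whatever back-off you prefer, and the exception type is repeated in its assumed message-only shape so the block compiles alone.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Assumed shape, repeated so this sketch is self-contained.
public sealed class DataverseThrottledException : Exception
{
    public DataverseThrottledException(string message) : base(message)
    {
    }
}

// Hypothetical outer loop: retries the whole operation when the pool is exhausted.
public static class ThrottleRetry
{
    public static async Task<T> RunAsync<T>(
        Func<CancellationToken, Task<T>> operation,
        int maxAttempts,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation(cancellationToken);
            }
            // On the last attempt the filter is false and the exception propagates.
            catch (DataverseThrottledException) when (attempt < maxAttempts)
            {
                await Task.Delay(delay, cancellationToken);
            }
        }
    }
}
```

A call site might look like `await ThrottleRetry.RunAsync(ct => accounts.GetTopAsync(ct), 3, TimeSpan.FromSeconds(30))`, where `accounts` is an AccountsClient.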

The daily entitlement isn't enforced

Non-interactive users have a daily request budget on top of the five-minute window, but at the time of writing it isn't enforced: admins get a notification when the threshold is crossed, no requests are blocked. Microsoft reserves the right to enforce it later, so don't treat the current behavior as a permanent guarantee. And if you're consistently saturating N × 6,000 requests every five minutes you don't have a distribution problem, you have a volume problem - adding more users just defers the conversation.

Search: the Retry-After header lies

Dataverse search service protection has two caps: per-user (1 request/second, instantaneous) and per-organization (150 requests/minute, windowed). In theory, rotating users raises your burst capacity for search - N rotating users can fire N parallel searches where one user would be serialized at 1/s.

In practice there's a landmine. When the search API throttles you, the Retry-After response header disagrees with the real retry window. Here's an actual 429 from production logs:

http

HTTP 429 TooManyRequests
Retry-After: 30

{"error":{"code":"0x80048d04","message":"Code: SearchRateLimitExceeded. Message: Number of requests exceeded the limit..Retry-After: 00:00:01"}}

⚠ 30× over-lock

`Retry-After: 30` in the header, `Retry-After: 00:00:01` buried in the body message. A rotation handler that trusts the header - like the one in this post - will lock each rotating user for 30 seconds when the service is actually ready for them in 1, and burn through the whole pool on a single burst for no reason.

You could work around it by reading the response body on every 429, parsing the error message, and extracting the real value. But that drags the handler from a clean "status + headers" design into buffering bodies and regex-matching vendor-specific error text - a meaningful complexity jump for one API's quirk, where the thing you're parsing isn't even a real contract, just a convention Microsoft could change on the next deploy.
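For a sense of what that workaround involves, here is a minimal sketch of the parsing step alone. `SearchRetryAfter` is a hypothetical helper, and the pattern is an assumption based solely on the one error message quoted above; it is exactly the kind of undocumented convention that could break without notice.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: extracts the "Retry-After: hh:mm:ss" hint that the
// search API embeds in the error message text.
public static class SearchRetryAfter
{
    private static readonly Regex Pattern =
        new(@"Retry-After:\s*(\d{1,2}:\d{2}:\d{2})", RegexOptions.Compiled);

    // Returns null when the body carries no recognisable hint.
    public static TimeSpan? TryParseFromBody(string body)
    {
        var match = Pattern.Match(body);
        if (!match.Success)
        {
            return null;
        }

        return TimeSpan.TryParse(
            match.Groups[1].Value, CultureInfo.InvariantCulture, out var delay)
            ? delay
            : null;
    }
}
```

The handler would still have to read and buffer the response body on every 429 before calling this, which is the complexity jump described above.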

If search is your actual bottleneck, the pattern in this post isn't the right tool. Use a dedicated search-aware client and focus on query shape and the per-org cap.

Wrap-up

The thing I keep coming back to with this pattern is how cheap it is. Every application user is its own independent budget, and the cost of claiming that on your side is a handful of small classes, most of them interfaces and DI plumbing. A delegating handler, a small user manager, a two-method lock store, Azure.Identity doing the boring token work, done. Add an app user when you need more headroom, and the rest of the system doesn't notice. Limits this easy to scale around are rare - take the win.

Full working code, including tests and an Azure Functions sample, is at github.com/georgpfeiffer/dataverse-showcase.
