Azure storage read performance of small objects

I wasn't able to find information online about the differences in read performance between the Azure storage options, so I did some small test runs to get some insights.

This might not be the best test, but it gave me enough information.

I performed two test runs:

  • 10,000 read iterations
  • 100,000 read iterations

Just to be sure :-)

The Results

[Chart: 10,000 read operations in milliseconds]

The chart shows the percentage of requests that took at most 1 millisecond, more than 1 and at most 2 milliseconds, and so on.

[Chart: 10,000 read operations (averages and min values) in milliseconds]

[Chart: 100,000 read operations in milliseconds]

Again, the chart shows the percentage of requests that took at most 1 millisecond, more than 1 and at most 2 milliseconds, and so on.

[Chart: 100,000 read operations (averages and min values) in milliseconds]

Setup of test environment

Below is the test code I used.

Azure SQL setup

CREATE TABLE [dbo].[users]  
(
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [FirstName] [nvarchar](50) NOT NULL,
    [LastName] [nvarchar](50) NOT NULL,

    CONSTRAINT [PK_users] PRIMARY KEY CLUSTERED 
    (
        [Id] ASC
    )
)

INSERT INTO dbo.users(FirstName, LastName) VALUES ('Nancy','Torres')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Kayla','Allen')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Neal','Hudson')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Rita','Adams')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Ellis','Holmes')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Nathaniel','Hunt')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Ricky','Roy')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Joanne','Phelps')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Sandra','Casey')  
INSERT INTO dbo.users(FirstName, LastName) VALUES ('Wim', 'Tucker')  

This results in the following table:
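Id  FirstName  LastName
--  ---------  --------
 1  Nancy      Torres
 2  Kayla      Allen
 3  Neal       Hudson
 4  Rita       Adams
 5  Ellis      Holmes
 6  Nathaniel  Hunt
 7  Ricky      Roy
 8  Joanne     Phelps
 9  Sandra     Casey
10  Wim        Tucker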

Blob storage setup

I just created a list of 10 objects and stored them as JSON text in blob storage, keyed by their ID.

CloudStorageAccount storageAccount =  
    CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();  
CloudBlobContainer container = blobClient.GetContainerReference("testcontainer");  
container.CreateIfNotExists();  
var users = new List<User>  
                {
                    new User { FirstName = "Nancy", LastName = "Torres" },
                    new User { FirstName = "Kayla", LastName = "Allen" },
                    new User { FirstName = "Neal", LastName = "Hudson" },
                    new User { FirstName = "Rita", LastName = "Adams" },
                    new User { FirstName = "Ellis", LastName = "Holmes" },
                    new User { FirstName = "Nathaniel", LastName = "Hunt" },
                    new User { FirstName = "Ricky", LastName = "Roy" },
                    new User { FirstName = "Joanne", LastName = "Phelps" },
                    new User { FirstName = "Sandra", LastName = "Casey" },
                    new User { FirstName = "Wim", LastName = "Tucker" }
                };

for (var i = 0; i < users.Count; i++)  
{
    // blob name is the list index ("0".."9"); content is the JSON-serialized user
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(i.ToString());
    blockBlob.UploadText(JsonConvert.SerializeObject(users[i]));
}
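The User type isn't shown in the post; presumably it's a plain POCO along these lines:

// assumed shape of the serialized object (not shown in the original post)
public class User  
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}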

Which will create the following blobs:

[Screenshot: blob storage container with blobs "0" through "9"]

with the following content:
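For example, given the assumed User shape above, blob "0" (Nancy) would contain:

{"FirstName":"Nancy","LastName":"Torres"}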

Table storage setup

I created a list of 10 objects and stored them in table storage. I ran the test once with the same partition key for all entities and once with a separate partition key for each entity.

CloudStorageAccount storageAccount =  
    CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
var tableClient = storageAccount.CreateCloudTableClient();  
var table = tableClient.GetTableReference("testtable");  
table.CreateIfNotExists();  
var users = new List<UserTE>  
                {
                    new UserTE(0) { FirstName = "Nancy", LastName = "Torres" },
                    new UserTE(1) { FirstName = "Kayla", LastName = "Allen" },
                    new UserTE(2) { FirstName = "Neal", LastName = "Hudson" },
                    new UserTE(3) { FirstName = "Rita", LastName = "Adams" },
                    new UserTE(4) { FirstName = "Ellis", LastName = "Holmes" },
                    new UserTE(5) { FirstName = "Nathaniel", LastName = "Hunt" },
                    new UserTE(6) { FirstName = "Ricky", LastName = "Roy" },
                    new UserTE(7) { FirstName = "Joanne", LastName = "Phelps" },
                    new UserTE(8) { FirstName = "Sandra", LastName = "Casey" },
                    new UserTE(9) { FirstName = "Wim", LastName = "Tucker" }
                };
for (var i = 0; i < users.Count; i++)  
{
    TableOperation insertOperation = TableOperation.Insert(users[i]);
    table.Execute(insertOperation);
}

// variant 1: a separate partition per entity
public class UserTE: TableEntity  
{
    public UserTE() { }

    public UserTE(int id)
    {
        this.RowKey = id.ToString();
        this.PartitionKey = id.ToString();
    }

    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// variant 2: a single partition ("users") for all entities
public class UserTE: TableEntity  
{
    public UserTE() { }

    public UserTE(int id)
    {
        this.RowKey = id.ToString();
        this.PartitionKey = "users";
    }

    public string FirstName { get; set; }
    public string LastName { get; set; }
}

10 separate partitions:

[Screenshot: table storage entities across 10 partitions]

1 partition:

[Screenshot: table storage entities in a single partition]

DocumentDB setup

// CreateDatabase/CreateCollection: get-or-create helpers, not shown in the post (a sketch follows below)
var database = CreateDatabase();  
var collection = CreateCollection(database);  
using (  
    var client = new DocumentClient(new Uri("documentDBURI"),"key"))
{
    var users = new List<UserDDB>
                    {
                        new UserDDB(0) { FirstName = "Nancy", LastName = "Torres" },
                        new UserDDB(1) { FirstName = "Kayla", LastName = "Allen" },
                        new UserDDB(2) { FirstName = "Neal", LastName = "Hudson" },
                        new UserDDB(3) { FirstName = "Rita", LastName = "Adams" },
                        new UserDDB(4) { FirstName = "Ellis", LastName = "Holmes" },
                        new UserDDB(5) { FirstName = "Nathaniel", LastName = "Hunt" },
                        new UserDDB(6) { FirstName = "Ricky", LastName = "Roy" },
                        new UserDDB(7) { FirstName = "Joanne", LastName = "Phelps" },
                        new UserDDB(8) { FirstName = "Sandra", LastName = "Casey" },
                        new UserDDB(9) { FirstName = "Wim", LastName = "Tucker" }
                    };

    foreach (var user in users)
    {
        var x = client.CreateDocumentAsync(collection.DocumentsLink, user).Result;
    }
}
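The CreateDatabase and CreateCollection helpers aren't shown in the post; presumably they're get-or-create wrappers along the lines of the queries in the "Old" section further down, roughly:

// hypothetical helpers, reconstructed from the get-or-create queries in the "Old" section
private static Microsoft.Azure.Documents.Database CreateDatabase()
{
    using (var client = new DocumentClient(new Uri("documentDBURI"), "key"))
    {
        return client.CreateDatabaseQuery()
                   .Where(db => db.Id == "performancetest")
                   .AsEnumerable()
                   .FirstOrDefault()
               ?? client.CreateDatabaseAsync(
                      new Microsoft.Azure.Documents.Database { Id = "performancetest" }).Result;
    }
}

private static DocumentCollection CreateCollection(Microsoft.Azure.Documents.Database database)
{
    using (var client = new DocumentClient(new Uri("documentDBURI"), "key"))
    {
        return client.CreateDocumentCollectionQuery(database.CollectionsLink)
                   .Where(c => c.Id == "testcollection")
                   .AsEnumerable()
                   .FirstOrDefault()
               ?? client.CreateDocumentCollectionAsync(
                      database.CollectionsLink,
                      new DocumentCollection { Id = "testcollection" }).Result;
    }
}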

public class UserDDB  
{
    public UserDDB() { }

    public UserDDB(int id)
    {
        this.Id = id.ToString();
    }

    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }

    [JsonProperty(PropertyName = "firstname")]
    public string FirstName { get; set; }

    [JsonProperty(PropertyName = "lastname")]
    public string LastName { get; set; }
}

Document database:

[Screenshot: the documents in the DocumentDB collection]

Running tests

Each test was performed on a Windows Server 2012 VM of size A3 (4 cores, 7 GB memory) in West Europe, with a storage account also located in West Europe. The first run of each test performs 10,000 read operations, the second 100,000. Each run first creates the objects required to talk to the data store and then repeatedly reads a random single object out of the 10 stored in it.
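The post doesn't include the harness itself; a minimal sketch of what the measurement loop could look like, with Stopwatch timing and a millisecond histogram for the percentage charts (ReadSingleObject is a hypothetical stand-in for the store-specific reads below):

// hypothetical harness; names are assumptions, ReadSingleObject wraps one of the reads shown below
var rnd = new Random();
var histogram = new Dictionary<long, int>();   // elapsed milliseconds -> request count
long totalMs = 0, minMs = long.MaxValue;
const int iterations = 10000;                  // 100,000 for the second run

for (var i = 0; i < iterations; i++)
{
    var getItem = rnd.Next(0, 10).ToString();  // pick one of the 10 stored objects
    var sw = Stopwatch.StartNew();
    ReadSingleObject(getItem);
    sw.Stop();

    long ms = sw.ElapsedMilliseconds;
    if (!histogram.ContainsKey(ms)) histogram[ms] = 0;
    histogram[ms]++;
    totalMs += ms;
    if (ms < minMs) minMs = ms;
}

Console.WriteLine("avg: {0:F2} ms, min: {1} ms", (double)totalMs / iterations, minMs);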

The following was set before running all the tests:

ServicePointManager.Expect100Continue = false;    // don't wait for a 100-Continue response before sending the body
ServicePointManager.UseNagleAlgorithm = false;    // Nagle's algorithm adds latency for small payloads
ServicePointManager.DefaultConnectionLimit = 100; // allow more concurrent connections per endpoint

And in the app.config:

  <system.net>
    <connectionManagement>
      <add address="*" maxconnection="1000" />
    </connectionManagement>
    <defaultProxy>
      <proxy bypassonlocal="True" usesystemdefault="False" />
    </defaultProxy>
  </system.net>

Running SQL Azure performance checks

var context = new UserContext();  
int randomVal = rnd.Next(1, 11);   // upper bound is exclusive; the IDENTITY ids run 1 through 10
var item = context.Users.FirstOrDefault(x => x.Id == randomVal);  
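UserContext isn't shown in the post; presumably it's a minimal Entity Framework Code First context, with the User POCO extended with the Id identity column and mapped to the users table:

// hypothetical EF mapping; the post doesn't show the context or the entity
[Table("users")]
public class User
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class UserContext : DbContext
{
    public DbSet<User> Users { get; set; }
}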

First run was against a SQL Azure S0 instance:

[Chart: DTU percentage of the S0 instance]

Because the DTU percentage went up a bit, I also tested against a more expensive/better instance.

The second run was against a SQL Azure S2 instance:

[Chart: DTU percentage of the S2 instance]

Running blob storage performance checks

CloudStorageAccount storageAccount =  
    CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();  
CloudBlobContainer container = blobClient.GetContainerReference("testcontainer");  
CloudBlockBlob blockBlob = container.GetBlockBlobReference(getItem);   // getItem: a random id "0".."9"
var text = blockBlob.DownloadText();  
var user = JsonConvert.DeserializeObject<User>(text, settings);        // settings: JsonSerializerSettings (not shown)

Running table storage performance checks

CloudStorageAccount storageAccount =  
    CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
var tableClient = storageAccount.CreateCloudTableClient();  
var table = tableClient.GetTableReference("testtable");  
TableOperation retrieveOperation = TableOperation.Retrieve<UserTE>(getItem, getItem);   // partition key and row key are both the id
TableResult retrievedResult = table.Execute(retrieveOperation);  
var @object = retrievedResult.Result as UserTE;  
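The retrieve above matches the per-partition variant, where partition key and row key are both the id. For the run with a single partition, the partition key is the constant "users", so the retrieve becomes:

TableOperation retrieveOperation = TableOperation.Retrieve<UserTE>("users", getItem);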

Running DocumentDB performance checks

Update:

After posting this blog post, Ryan CrawCour, a program manager on the DocumentDB team, gave me some hints to improve the performance. I modified the code and updated the charts above.

The differences are:

  • Cache the DocumentClient and re-use it, instead of creating a new instance each time.
  • Modify the connection policy:
ConnectionPolicy policy = new ConnectionPolicy  
{
    ConnectionMode = ConnectionMode.Direct,
    ConnectionProtocol = Protocol.Tcp
};
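The post doesn't show the final version, but combining both hints, the updated read presumably looks something like this (the Client field and the collectionLink variable are assumptions):

// hypothetical cached client with the direct TCP connection policy
private static readonly DocumentClient Client = new DocumentClient(
    new Uri("documentDBURI"),
    "key",
    new ConnectionPolicy
    {
        ConnectionMode = ConnectionMode.Direct,
        ConnectionProtocol = Protocol.Tcp
    });

// per read operation: query a single document by id, re-using the cached client
var document = Client.CreateDocumentQuery<UserDDB>(collectionLink)
    .Where(d => d.Id == getItem)
    .AsEnumerable()
    .FirstOrDefault();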

Old:

I only ran the 10,000-operation test for DocumentDB a few times, because it was slow. The product is still in preview, so either I hit it on a bad day or the performance still needs to improve.

using (var client = new DocumentClient(new Uri("uri"), "key"))  
{
    var database = client
        .CreateDatabaseQuery()
        .Where(db => db.Id == "performancetest")
        .AsEnumerable()
        .FirstOrDefault()
        ?? client.CreateDatabaseAsync(
            new Microsoft.Azure.Documents.Database { Id = "performancetest" }).Result;

    var collection = client
        .CreateDocumentCollectionQuery(database.CollectionsLink)
        .Where(c => c.Id == "testcollection")
        .AsEnumerable()
        .FirstOrDefault()
        ?? client.CreateDocumentCollectionAsync(
            database.CollectionsLink,
            new DocumentCollection { Id = "testcollection" }).Result;

    var document = client
        .CreateDocumentQuery<UserDDB>(collection.DocumentsLink)
        .Where(d => d.Id == getItem)
        .AsEnumerable()
        .FirstOrDefault();
}

The end.