Skip to content

AsAmqpData use ArraySegment directly instead of copying#19821

Merged
JoshLove-msft merged 2 commits intoAzure:masterfrom
danielmarbach:sender-optimization-split
Mar 25, 2021
Merged

AsAmqpData use ArraySegment directly instead of copying#19821
JoshLove-msft merged 2 commits intoAzure:masterfrom
danielmarbach:sender-optimization-split

Conversation

@danielmarbach
Copy link
Copy Markdown
Contributor

@danielmarbach danielmarbach commented Mar 25, 2021

Split out of the LGTM part from #19683

All SDK Contribution checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

  • Please open PR in Draft mode if it is:
    • Work in progress or not intended to be merged.
    • Encountering multiple pipeline failures and working on fixes.
  • If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.
  • I have read the contribution guidelines.
  • The pull request does not introduce breaking changes.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which have an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

SDK Generation Guidelines

  • The generate.cmd file for the SDK has been updated with the version of AutoRest, as well as the commitid of your swagger spec or link to the swagger spec, used to generate the code. (Track 2 only)
  • The *.csproj and AssemblyInfo.cs files have been updated with the new version of the SDK. Please double check nuget.org current release version.

Additional management plane SDK specific contribution checklist:

Note: Only applies to Microsoft.Azure.Management.[RP] or Azure.ResourceManager.[RP]

  • Include updated management metadata.
  • Update AzureRP.props to add/remove version info to maintain up to date API versions.

Management plane SDK Troubleshooting

  • If this is very first SDK for a services and you are adding new service folders directly under /SDK, please add new service label and/or contact assigned reviewer.
  • If the check fails at the Verify Code Generation step, please ensure:
    • Do not modify any code in generated folders.
    • Do not selectively include/remove generated files in the PR.
    • Do use generate.ps1/cmd to generate this PR instead of calling autorest directly.
      Please pay attention to the @microsoft.csharp version output after running generate.ps1. If it is lower than current released version (2.3.82), please run it again as it should pull down the latest version,

Old outstanding PR cleanup

Please note:
If PRs (including draft) has been out for more than 60 days and there are no responses from our query or followups, they will be closed to maintain a concise list for our reviewers.

@ghost ghost added Service Bus customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Mar 25, 2021
@ghost
Copy link
Copy Markdown

ghost commented Mar 25, 2021

Thank you for your contribution @danielmarbach! We will review the pull request and get back to you soon.

@ghost ghost added the Community Contribution Community members are working on the issue label Mar 25, 2021
{
foreach (ReadOnlyMemory<byte> data in binaryData)
{
ArraySegment<byte> segment;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #19683 (comment) for more details

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 9 3950X, 1 CPU, 32 logical and 16 physical cores
.NET Core SDK=5.0.201
  [Host]     : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  Job-QXDBJK : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT

InvocationCount=320000  
Method Elements Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
ToArray 1 225.0 ns 3.10 ns 2.90 ns 1.00 0.00 0.0406 - - 344 B
Marshal 1 218.9 ns 2.25 ns 2.10 ns 0.97 0.02 0.0344 - - 304 B
ToArray 2 386.2 ns 2.70 ns 2.39 ns 1.00 0.00 0.0594 - - 512 B
Marshal 2 396.9 ns 3.45 ns 3.23 ns 1.03 0.01 0.0500 - - 432 B
ToArray 4 749.5 ns 14.98 ns 24.19 ns 1.00 0.00 0.1000 - - 848 B
Marshal 4 771.4 ns 15.45 ns 20.08 ns 1.04 0.02 0.0813 - - 688 B
ToArray 8 1,515.2 ns 18.95 ns 16.80 ns 1.00 0.00 0.1813 - - 1520 B
Marshal 8 1,508.8 ns 29.63 ns 27.72 ns 1.00 0.02 0.1406 - - 1200 B
ToArray 16 2,992.2 ns 40.45 ns 37.84 ns 1.00 0.00 0.3469 - - 2912 B
Marshal 16 2,990.0 ns 37.82 ns 35.38 ns 1.00 0.01 0.2688 - - 2272 B
ToArray 32 5,852.9 ns 57.16 ns 53.47 ns 1.00 0.00 0.6844 0.0094 - 5728 B
Marshal 32 5,877.4 ns 64.63 ns 57.29 ns 1.00 0.01 0.5313 0.0063 - 4448 B
ToArray 64 11,681.2 ns 91.30 ns 85.40 ns 1.00 0.00 1.3563 0.0344 - 11360 B
Marshal 64 11,401.9 ns 74.13 ns 69.34 ns 0.98 0.01 1.0500 0.0281 - 8800 B

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.InteropServices;
    using System.Text;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Configs;
    using BenchmarkDotNet.Diagnosers;
    using BenchmarkDotNet.Exporters;
    using BenchmarkDotNet.Jobs;
    using Microsoft.Azure.Amqp;

    [Config(typeof(Config))]
    public class DataConversion
    {
        private IEnumerable<ReadOnlyMemory<byte>> data;

        private class Config : ManualConfig
        {
            public Config()
            {
                AddExporter(MarkdownExporter.GitHub);
                AddDiagnoser(MemoryDiagnoser.Default);
                AddJob(Job.Default.WithInvocationCount(320000));
            }
        }

        [Params(1, 2, 4, 8, 16, 32, 64)]
        public int Elements { get; set; }

        [IterationSetup]
        public void Setup()
        {
            data = Enumerable.Range(0, Elements)
                .Select(i => (ReadOnlyMemory<byte>) Encoding.UTF8.GetBytes($"Hello World {i}"));
        }

        [Benchmark(Baseline = true)]
        public ArraySegment<byte>[] ToArray()
        {
            return ConvertBefore(data);
        }

        static ArraySegment<byte>[] ConvertBefore(IEnumerable<ReadOnlyMemory<byte>> allData)
        {
            return allData.Select(data => new ArraySegment<byte>(data.IsEmpty ? Array.Empty<byte>() : data.ToArray())).ToArray();
        }

        [Benchmark]
        public ArraySegment<byte>[] Marshal()
        {
            return ConvertAfter(data);
        }

        static ArraySegment<byte>[] ConvertAfter(IEnumerable<ReadOnlyMemory<byte>> allData)
        {
            return allData.Select(data =>
            {
                ArraySegment<byte> segment;
                if (!data.IsEmpty)
                {
                    if (!MemoryMarshal.TryGetArray(data, out segment))
                    {
                        segment = new ArraySegment<byte>(data.ToArray());
                    }
                }
                else
                {
                    segment = new ArraySegment<byte>(Array.Empty<byte>());
                }

                return segment;
            }).ToArray();
        }
    }

Copy link
Copy Markdown
Member

@jsquire jsquire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever; I didn't realize there were interop marshalling operations for ReadOnlyMemory<T> and wouldn't have thought to go looking. I think it ends up straight forward and readable, especially if we reduce some of the corner cases that are handled for us.

I'll defer to Josh for final approval, but I think this makes sense. I'd also be interested in thoughts from @pakrym here.

foreach (ReadOnlyMemory<byte> data in binaryData)
{
ArraySegment<byte> segment;
if (!data.IsEmpty)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this case? If I'm reading correctly, if we pass an empty ReadOnlyMemory<T> instance in, an empty array segment should be returned.

A quick LinqPad test seems to confirm:

var x = new ReadOnlyMemory<byte>();
x.IsEmpty.Dump();  // This is true

if (MemoryMarshal.TryGetArray(x, out var y))
{
    y.Dump();  // this shows no items.
}
else
{
    throw new Exception("Unexpected result from marshalling");
}

{
if (!MemoryMarshal.TryGetArray(data, out segment))
{
segment = new ArraySegment<byte>(data.ToArray());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should throw in this case; it should never happen. It would require there to be some data in the ReadOnlyMemory<byte> instance but that data is not convertible to byte (ref).

If that happens, I think it's a sign that something is wrong enough that we should fail. Thoughts?

segment = new ArraySegment<byte>(data.ToArray());
}
}
else
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could streamline into something like:

private static IEnumerable<Data> AsAmqpData(this IEnumerable<ReadOnlyMemory<byte>> binaryData)
{
    foreach (ReadOnlyMemory<byte> data in binaryData)
    {
        ArraySegment<byte> segment;
        
        if (!MemoryMarshal.TryGetArray(data, out segment))
        {
            // Not the actual exception, of course.
            throw new Exception("OMG WTF?");  
        }
        
        yield return new Data
        {
            Value = segment
        };
    }
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • null would return an empy array segment
  • empty ROM would return an empty array segment
  • populated ROM would return the populated array segment
  • populated ROM that could not be converted to byte would throw

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to throw when MemoryMarshal fails - we use this same strategy in BinaryData

@jsquire jsquire requested a review from pakrym March 25, 2021 14:40
@jsquire
Copy link
Copy Markdown
Member

jsquire commented Mar 25, 2021

CI failures are unrelated to the changes; looks like an issue with package restore. Kicking off the failed stages again.

Copy link
Copy Markdown
Contributor

@pakrym pakrym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@danielmarbach
Copy link
Copy Markdown
Contributor Author

Should I address #19821 (comment)?

@danielmarbach
Copy link
Copy Markdown
Contributor Author

So if the additional branching is removed we'd get something like this

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen Threadripper 1920X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.200
  [Host]     : .NET Core 3.1.12 (CoreCLR 4.700.21.6504, CoreFX 4.700.21.6905), X64 RyuJIT
  Job-JZYLHC : .NET Core 3.1.12 (CoreCLR 4.700.21.6504, CoreFX 4.700.21.6905), X64 RyuJIT

InvocationCount=320000  
Method Elements Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
ToArray 1 280.8 ns 5.61 ns 6.67 ns 1.00 0.00 0.0813 - - 344 B
Marshal 1 304.1 ns 6.32 ns 18.55 ns 1.00 0.05 0.0719 - - 304 B
ToArray 2 505.1 ns 4.30 ns 3.36 ns 1.00 0.00 0.1219 - - 512 B
Marshal 2 517.1 ns 10.38 ns 20.97 ns 0.97 0.02 0.1031 - - 432 B
ToArray 4 983.8 ns 19.65 ns 41.01 ns 1.00 0.00 0.2000 - - 848 B
Marshal 4 906.7 ns 18.03 ns 24.68 ns 0.95 0.03 0.1625 - - 688 B
ToArray 8 1,866.7 ns 17.76 ns 16.61 ns 1.00 0.00 0.3625 - - 1520 B
Marshal 8 1,844.9 ns 25.69 ns 24.03 ns 0.99 0.02 0.2844 - - 1200 B
ToArray 16 3,693.3 ns 49.65 ns 44.01 ns 1.00 0.00 0.6938 - - 2912 B
Marshal 16 3,647.9 ns 36.78 ns 34.40 ns 0.99 0.02 0.5406 - - 2272 B
ToArray 32 7,448.2 ns 102.68 ns 96.05 ns 1.00 0.00 1.3688 - - 5728 B
Marshal 32 6,911.0 ns 101.98 ns 95.39 ns 0.93 0.02 1.0625 - - 4448 B
ToArray 64 14,569.5 ns 219.90 ns 194.94 ns 1.00 0.00 2.7156 - - 11360 B
Marshal 64 13,471.7 ns 143.58 ns 127.28 ns 0.92 0.02 2.1031 - - 8800 B

@pakrym
Copy link
Copy Markdown
Contributor

pakrym commented Mar 25, 2021

@danielmarbach where is the majority of allocating in this benchmark coming from? Enumerables?

@danielmarbach
Copy link
Copy Markdown
Contributor Author

@pakrym from data.ToArray() I assume the rest is equivalent in terms of LINQ usage.

@pakrym
Copy link
Copy Markdown
Contributor

pakrym commented Mar 25, 2021

I'm trying to figure out why the allocation improvement so in proportion to overall allocations.

@JoshLove-msft
Copy link
Copy Markdown
Member

Should I address #19821 (comment)?

@jsquire what do you think? I'm okay with the changes as is. I prefer not to throw if MemoryMarshal fails, but you may be right about avoiding the empty array segment branch.

@JoshLove-msft
Copy link
Copy Markdown
Member

/azp run net - servicebus - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@jsquire
Copy link
Copy Markdown
Member

jsquire commented Mar 25, 2021

Should I address #19821 (comment)?

@jsquire what do you think? I'm okay with the changes as is. I prefer not to throw if MemoryMarshal fails, but you may be right about avoiding the empty array segment branch.

I'm good either way, I don't think there's a performance consideration, it's just a matter of which you find more readable. I am curious about the decision not to throw in BinaryData but I'm not opposed to following that pattern here.

@JoshLove-msft
Copy link
Copy Markdown
Member

Should I address #19821 (comment)?

@jsquire what do you think? I'm okay with the changes as is. I prefer not to throw if MemoryMarshal fails, but you may be right about avoiding the empty array segment branch.

I'm good either way, I don't think there's a performance consideration, it's just a matter of which you find more readable. I am curious about the decision not to throw in BinaryData but I'm not opposed to following that pattern here.

I don't know all the details about why MemoryMarshal would fail, but if there is a way to recover it from, why not do so?

@pakrym
Copy link
Copy Markdown
Contributor

pakrym commented Mar 25, 2021

We shouldn't throw on MemoryMarshal failing. It might fail in the case of ROM being backed by the native memory, for example. We should just recover and keep on going.

@jsquire
Copy link
Copy Markdown
Member

jsquire commented Mar 25, 2021

We shouldn't throw on MemoryMarshal failing. It might fail in the case of ROM being backed by the native memory, for example. We should just recover and keep on going.

Fair enough; that satisfies my curiosity. Thanks.

@JoshLove-msft
Copy link
Copy Markdown
Member

/azp run net - servicebus - tests

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@JoshLove-msft JoshLove-msft merged commit 5f9591c into Azure:master Mar 25, 2021
@danielmarbach danielmarbach deleted the sender-optimization-split branch March 30, 2021 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Service Bus

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants