Skip to content

Enable prefetch on MSVC#70

Open
sschoener wants to merge 6 commits intounity-masterfrom
perf-prefetch
Open

Enable prefetch on MSVC#70
sschoener wants to merge 6 commits intounity-masterfrom
perf-prefetch

Conversation

@sschoener
Copy link
Copy Markdown

Enables support for prefetch intrinsics on MSVC.

@sschoener sschoener marked this pull request as ready for review January 10, 2023 11:15
@sschoener sschoener requested a review from joncham January 10, 2023 11:16
Comment thread include/gc_inline.h
# if GC_GNUC_PREREQ(3, 0) && !defined(GC_NO_PREFETCH_FOR_WRITE)
# define GC_PREFETCH_FOR_WRITE(x) __builtin_prefetch((x), 1)
# elif defined(_MSC_VER) && !defined(GC_NO_PREFETCH_FOR_WRITE)
# define GC_PREFETCH_FOR_WRITE(x) _mm_prefetch((x), _MM_HINT_T0)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Work to support WIndows ARM64 recently landed by @jem-patel . Assuming the Mono branch for these prefetch changes has that work, just make sure that platform/arch still builds on CI with these changes. I don't think this intrinsic exists on ARM64.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this file get included in IL2CPP? If so, we need both ARM32 & ARM64 paths as we support those for UWP. If it's only used by Mono, then we need ARM64 path for Windows Standalone.

From what I can tell looking at the headers, these intrinsics are available there:

__MACHINEARM_ARM64(void __cdecl __prefetch(const void *))
__MACHINEARM64(void __cdecl __prefetch2(const void *, uint8_t prfop))
__MACHINEARM(void __cdecl __prefetchw(const void *))

I don't know which one(s) we want to use.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposal is this

  • we have shipped without any of this for the longest time,
  • I have no idea which of the ARM intrinsics to use and I have no machine available for testing as of writing,
  • hence we should exclude ARM, i.e. wrap in !defined(_M_ARM) && !defined(_M_ARM64)

Thoughts?

Comment thread include/gc_inline.h Outdated
@ivmai
Copy link
Copy Markdown

ivmai commented Mar 26, 2023

we have shipped without any of this for the longest time

@sschoener, would be good to have some benchmarking about the proposed change.

hence we should exclude ARM, i.e. wrap in !defined(_M_ARM) && !defined(_M_ARM64)

It would be better to write the conditions in the opposite way, e.g.: (_M_IX86 >= 300) || defined(_M_X64)

Comment thread include/gc_inline.h
#include "gc.h"
#include "gc_tiny_fl.h"

# ifdef _MSC_VER
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of public headers (like gc_inline.h), it would be better to include additional headers only for the condition which require it, e.g. in this case it could be: !defined(GC_PREFETCH_FOR_WRITE) && !defined(GC_NO_PREFETCH_FOR_WRITE) && defined(_MSC_VER) && (defined(_M_X64) || ...)

@ivmai
Copy link
Copy Markdown

ivmai commented Sep 18, 2023

Upstreamed with some minor modifications: bdwgc@703fde0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants