Allow explicit data transfers to GPUs#156620
Vendoring llvm/llvm-project#198033 for now.
// This exists so MIR creates Drop terminators for PreloadMut.
// rustc codegen intercepts those terminators and emits the
// offload return mapper.
why is this not just an intrinsic call here?
Partly just experimenting, partly because intrinsics recently changed a bit: they were updated for more explicit Place handling, which I didn't want to think about for my MVP. I'll convert these to intrinsics after my deadline.
#[lang = "preload"]
#[unstable(feature = "offload", issue = "124509")]
pub fn preload<'a, T: ?Sized>(x: &'a T) -> Preload<'a, T> {
Yeah, I think these should just be intrinsics instead of catching lang item calls during codegen of call terminators.
"Which lang items?" (asks my code, which fails to catch an inlined call terminator :D)
OK, with an intrinsic it actually seems to work in release builds.
I'm not sure whether we want one intrinsic with two arguments (mut/const, init/drop) or four separate intrinsics. Right now I have two.
☔ The latest upstream changes (presumably #158416) made this pull request unmergeable. Please resolve the merge conflicts by rebasing.
So far, our offload intrinsics have handled data movement to and from the GPU automatically.
That's convenient (and reasonably fast once our LLVM opts land). However, Rust generally also allows being explicit. That might give perf benefits (where our LLVM opts fail), and it could also be nice for modelling: data can be passed around while CPU users are still prevented from accessing it.
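The modelling argument can be illustrated with a CPU-only sketch. It is not the implementation in this PR: `Preload` and `PreloadMut` are re-declared here as plain structs, `preload_mut`, `scale`, `sum`, `record`, `take_log`, and `demo` are hypothetical helpers, and the "transfers" are only log entries. The read-only guard's `Drop` is likewise an assumption. The point is only that a guard holding the borrow keeps the CPU side from touching the data, and that the copy back happens once, on drop.

```rust
use std::cell::RefCell;

thread_local! {
    // Records simulated transfers so their order can be inspected on the CPU.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

fn record(event: &'static str) {
    LOG.with(|l| l.borrow_mut().push(event));
}

/// Returns the recorded events and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|l| std::mem::take(&mut *l.borrow_mut()))
}

/// Hypothetical stand-in for read-only data resident on the device.
struct Preload<'a, T: ?Sized> {
    host: &'a T,
}

/// Hypothetical stand-in for mutable data resident on the device.
struct PreloadMut<'a, T: ?Sized> {
    host: &'a mut T,
}

fn preload<'a, T: ?Sized>(x: &'a T) -> Preload<'a, T> {
    record("copy_in");
    Preload { host: x }
}

fn preload_mut<'a, T: ?Sized>(x: &'a mut T) -> PreloadMut<'a, T> {
    record("copy_in");
    PreloadMut { host: x }
}

impl<'a, T: ?Sized> Drop for Preload<'a, T> {
    fn drop(&mut self) {
        // Assumption: read-only data is released without a copy back.
        record("release");
    }
}

impl<'a, T: ?Sized> Drop for PreloadMut<'a, T> {
    fn drop(&mut self) {
        // Stands in for the return transfer emitted when the guard is dropped.
        record("copy_out");
    }
}

/// Simulated kernel: mutates the buffer while it is "on the device".
fn scale(buf: &mut PreloadMut<'_, [f32]>, factor: f32) {
    record("kernel");
    for v in buf.host.iter_mut() {
        *v *= factor;
    }
}

/// Simulated kernel: reads the buffer while it is "on the device".
fn sum(buf: &Preload<'_, [f32]>) -> f32 {
    record("kernel");
    buf.host.iter().sum()
}

fn demo() -> (Vec<f32>, f32, Vec<&'static str>) {
    let mut data = vec![1.0_f32, 2.0, 3.0];
    {
        let mut on_gpu = preload_mut(data.as_mut_slice());
        // data[0] = 0.0; // rejected by the borrow checker while `on_gpu` lives
        scale(&mut on_gpu, 2.0);
        scale(&mut on_gpu, 2.0);
    } // guard dropped here: one copy back for two kernel launches
    let ro = preload(data.as_slice());
    let total = sum(&ro);
    drop(ro);
    (data, total, take_log())
}
```

In this model, two kernel launches share a single copy in and a single copy out, which is the kind of saving explicit transfers are meant to make available when automatic movement is not optimized away.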
Tracking Issue for GPU-offload #131513