[AN-503] Update to Dataproc 2.2.X #4839
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4839      +/-   ##
===========================================
- Coverage    74.67%   74.67%   -0.01%
===========================================
  Files          166      166
  Lines        14623    14622       -1
  Branches      1156     1143      -13
===========================================
- Hits         10920    10919       -1
  Misses        3703     3703
…e VM and investigate
…phere/leonardo into AN-503-update-to-dataproc-2.2.x
STEP_TIMINGS=($(date +%s))

## Installs Google Cloud Ops Agent that is now required for Dataproc 2.2.X ###
This is the main change, in addition to updating docker compose and that sneaky change in external IP assignment behavior.
It is annoying that the new log agent does not come pre-built into the Dataproc image itself, but the install and setup were not too bad in the end.
aednichols left a comment
I appreciate the detailed comments.
@Qi77Qi I modified the PR to make sure that Leonardo can support deploying both the AOU 2.2.13 image on Dataproc 2.1.x (aka what you currently have in production) and the AOU 2.2.16 image on Dataproc 2.2.x. I will do some testing on my BEE, but this should let us release the new Hail/Dataproc version on Terra without impacting RWB (you can switch your pre-prod / prod environments on your own timeline).
Jira ticket: https://broadworkbench.atlassian.net/browse/AN-503
Summary of changes
What
Why
Testing these changes
I pointed my BEE to this PR and was able to successfully launch both the Hail and AOU images, each as a single-node Spark cluster and as a Spark cluster with 2 nodes.
When opening a Jupyter notebook, I can import and initialize the new version of Hail.
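For anyone repeating this check, a minimal sketch of the notebook-side test might look like the following. The helper name `check_hail` and its `initialize` flag are hypothetical and not part of this PR; the only Hail calls used are `hl.init()` and `hl.version()`. The import is guarded so the cell also runs, and reports a missing install, in an environment without Hail.

```python
def check_hail(initialize: bool = False):
    """Return the installed Hail version string, or None if Hail is not installed.

    `check_hail` is a hypothetical helper for illustration only.
    """
    try:
        import hail as hl  # only present on the Hail / AOU images
    except ImportError:
        return None
    if initialize:
        # Starts (or attaches to) the Spark backend; run this on the cluster.
        hl.init()
    return hl.version()


version = check_hail()
if version is None:
    print("Hail is not installed in this environment")
else:
    print(f"Hail version: {version}")
```

On a running cluster, calling `check_hail(initialize=True)` would also exercise the Spark backend, which is the part most likely to surface a Dataproc version mismatch.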
I was also able to launch the AOU image that is currently in prod using the legacy Dataproc 2.1 image. So we should be safe to merge this, as it won't impact RWB and they can move over to Dataproc 2.2 when they want.
jenkins retest or jenkins multi-test.