libsc crashes Julia when running the GC and using multiple threads #35

@ranocha

Here is an MWE:

# Debug segfault with multithreading when loading t8code

using MPI
using T8code

if !MPI.Initialized()
  MPI.Init()
end

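# Arguments: sc_init(mpicomm, catch_signals, print_backtrace, log_handler, log_threshold)

# segfaults (catch_signals = 1, print_backtrace = 1)
T8code.Libt8.sc_init(MPI.COMM_WORLD, 1, 1, C_NULL, T8code.Libt8.SC_LP_ERROR)

# segfaults (catch_signals = 1, print_backtrace = 0)
# T8code.Libt8.sc_init(MPI.COMM_WORLD, 1, 0, C_NULL, T8code.Libt8.SC_LP_ERROR)

# does not segfault (catch_signals = 0, print_backtrace = 1)
# T8code.Libt8.sc_init(MPI.COMM_WORLD, 0, 1, C_NULL, T8code.Libt8.SC_LP_ERROR)

# does not segfault (catch_signals = 0, print_backtrace = 0)
# T8code.Libt8.sc_init(MPI.COMM_WORLD, 0, 0, C_NULL, T8code.Libt8.SC_LP_ERROR)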

function allocate_as_crazy()
  n = 1_000
  k = 100
  A = [randn(k, k) for _ in 1:n]
  B = [randn(k, k) for _ in 1:n]

  Threads.@threads for i in 1:n
    A[i] = A[i] * B[i]
  end

  return sum(sum(A))
end

@show allocate_as_crazy()

Saving this script as debug_segfaults_t8code.jl, initializing an appropriate project with MPI and T8code, and running Julia v1.9.3 yields something like

$ julia --project=run --threads=6 debug_segfaults_t8code.jl
[libsc 0] Caught signal SEGV
[libsc 0] Caught signal SEGV
[libsc 0[libsc [libsc 0] Caught signal SEGV
] Caught signal SEGV
0] Caught signal SEGV
[libsc 0] Abort: Obtained 10 stack frames
[libsc 0] Stack 0: libsc.so.2(+0xd401) [0x7f03ac028401]
[libsc 0] Stack 1: libsc.so.2(sc_abort+0xa) [0x7f03ac0278ea]
[libsc 0] Stack 2: libsc.so.2(+0xd3cd) [0x7f03ac0283cd]
[libsc 0] Stack 3: libc.so.6(+0x3c4b0) [0x7f03c463c4b0]
[libsc 0] Stack 4: libjulia-internal.so.1(_jl_mutex_wait+0x91) [0x7f03c388f2e1]
[libsc 0] Stack 5: libjulia-internal.so.1(_jl_mutex_lock+0x30) [0x7f03c388f3a0]
[libsc 0] Stack 6: libjulia-codegen.so.1(jl_generate_fptr_impl+0x83) [0x7f03c452e393]
[libsc 0] Stack 7: libjulia-internal.so.1(jl_compile_method_internal+0xa0) [0x7f03c3842310]
[libsc 0] Stack 8: libjulia-internal.so.1(ijl_apply_generic+0x43e) [0x7f03c384311e]
[libsc 0] Stack 9: libjulia-internal.so.1(+0x645c0) [0x7f03c38645c0]
[libsc 0] Abort: Obtained 10 stack frames

The crash does not seem to occur if libsc is initialized with T8code.Libt8.sc_init(MPI.COMM_WORLD, 0, ...), i.e., with catch_signals = 0.

CC @NicolasRiel

Labels: bug
