Sporadic SIGSEGV: native addon .got.plt reset to unrelocated file offsets #62515

@kerneltoast

Version

v25.8.2

Platform

Linux sultan-box 6.18.6 x86_64 GNU/Linux
V8: 14.1.146.11-node.24

Subsystem

vm, deps (V8 memory management)

What steps will reproduce the bug?

The crash occurs sporadically when a native addon (better-sqlite3 v12.8.0) is loaded and the process is performing stdio pipe I/O. I have not been able to produce a minimal standalone reproducer; the crash happens one to two times per day under sustained use of a Node.js MCP (Model Context Protocol) plugin that communicates via stdin/stdout pipes and uses better-sqlite3 for local FTS5 search.

Command line:

node /home/sultan/.claude/plugins/cache/context-mode/context-mode/1.0.53/start.mjs

How often does it reproduce? Is there a required condition?

Approximately 1-2 times per day under active use. 9 crashes observed over 7 days (March 24-30, 2026). All crashes occur during libuv stream read callbacks on the main thread.

What is the expected behavior? Why is that the expected behavior?

The process should not crash. Native addon PLT/GOT resolution should remain intact for the lifetime of the process.

What do you see instead?

SIGSEGV (signal 11) in the main thread. The systemd journal shows the crash at low, unmapped addresses (0x12566 or 0x120b6) with unsymbolized JIT frames, but GDB analysis of the coredump reveals the real cause: the entire .got.plt section of better_sqlite3.node has been reset to its on-disk (unrelocated) state.

Detailed Analysis

Crash path (from GDB)

#0  0x0000000000012566 in ?? ()
#1  Database::JS_prepare() from better_sqlite3.node   [+86: return addr after call to v8::Value::IsObject@plt]
#2  0x00007fcd82d8fb4d in ?? ()                       [V8 JIT trampoline]
    ...
#17 v8::Function::Call()
#18 node::InternalCallbackScope::Close()
#19 node::InternalMakeCallback()
#20 node::AsyncWrap::MakeCallback()
#21 node::StreamBase::CallJSOnreadMethod()
#22 node::EmitToJSStreamListener::OnStreamRead()
#23 node::LibuvStreamWrap::OnUvRead()
    ... libuv event loop ...

GOT corruption evidence

The PLT entry for v8::Value::IsObject in better_sqlite3.node jumps through the GOT:

0x7fcda0420560 <IsObject@plt>: jmp *0x1d7d32(%rip)  # GOT at 0x7fcda05f8298

The GOT entry contains 0x0000000000012566 instead of the correct 0x556ec371ac30 (v8::Value::IsObject in the node binary).
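The GOT slot address in the disassembly comment can be re-derived from the instruction itself. The following is a minimal sketch using only the numbers quoted above; the helper name `rip_relative_target` is made up for illustration, and the 6-byte length assumes the usual `FF 25 disp32` encoding of an indirect RIP-relative jmp.

```python
# Values copied from the disassembly line in this report.
PLT_STUB_ADDR = 0x7FCDA0420560  # runtime address of IsObject@plt
JMP_LENGTH = 6                  # assumed encoding: FF 25 + 4-byte displacement
DISPLACEMENT = 0x1D7D32         # displacement shown in the jmp operand


def rip_relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """RIP-relative operands are relative to the address of the NEXT instruction."""
    return insn_addr + insn_len + disp


got_slot = rip_relative_target(PLT_STUB_ADDR, JMP_LENGTH, DISPLACEMENT)
print(hex(got_slot))  # 0x7fcda05f8298
```

This matches the GOT address in the table, so the jump is reading the slot that holds 0x12566.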

Every GOT entry in better_sqlite3.node is corrupted in the same way:

GOT Address    Symbol                        Value (corrupted)  Expected
0x7fcda05f8250 v8::Exception::RangeError     0x124d6            0x556ecXXXXXXX
0x7fcda05f8260 strrchr                       0x124f6            libc address
0x7fcda05f8270 memchr                        0x12516            libc address
0x7fcda05f8280 v8::Object::New               0x12536            0x556ecXXXXXXX
0x7fcda05f8298 v8::Value::IsObject           0x12566            0x556ec371ac30
0x7fcda05f82a0 pthread_mutex_destroy         0x12576            libc address
0x7fcda05f82c0 node::Buffer::New             0x125b6            0x556ecXXXXXXX
0x7fcda05f82d0 operator delete[]             0x125d6            libstdc++ addr
0x7fcda05f82e0 malloc                        0x125f6            libc address

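The corrupted values increase by 0x10 (the x86-64 PLT stub size) per 8-byte GOT slot; the table lists only a sample of slots, so adjacent rows differ by a multiple of 0x10. Each value is the unrelocated, link-time address of the second instruction of the corresponding PLT stub (stub start + 6, just past the 6-byte jmp), which is the lazy-binding placeholder stored in the file; the .plt section starts at file offset 0x12020. For example, IsObject@plt sits at 0x7fcda0420560 - 0x7fcda040e000 = 0x12560, and its GOT slot holds 0x12566. The dynamic linker should have rebased these placeholders at load time and replaced them with resolved runtime addresses on first call (or at load time under immediate binding).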
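The pattern in the table can be checked mechanically. The sketch below copies the GOT addresses and corrupted values from the table and tests two assumptions: that the value grows by 0x10 per 8-byte GOT slot, and that each value minus 6 lands on a 16-byte PLT stub boundary counted from the .plt start at 0x12020. The constant and variable names are illustrative only.

```python
# (GOT slot address, corrupted value) pairs copied from the table in this report.
ENTRIES = [
    (0x7FCDA05F8250, 0x124D6),
    (0x7FCDA05F8260, 0x124F6),
    (0x7FCDA05F8270, 0x12516),
    (0x7FCDA05F8280, 0x12536),
    (0x7FCDA05F8298, 0x12566),
    (0x7FCDA05F82A0, 0x12576),
    (0x7FCDA05F82C0, 0x125B6),
    (0x7FCDA05F82D0, 0x125D6),
    (0x7FCDA05F82E0, 0x125F6),
]

GOT_SLOT_SIZE = 8      # one pointer per slot on x86-64
PLT_STUB_SIZE = 0x10   # stub size stated in the report
PLT_START = 0x12020    # .plt start stated in the report
JMP_LENGTH = 6         # assumed length of the leading jmp in each stub

got0, value0 = ENTRIES[0]
mismatches = []
for got_addr, value in ENTRIES:
    slots_from_first = (got_addr - got0) // GOT_SLOT_SIZE
    expected = value0 + slots_from_first * PLT_STUB_SIZE
    on_stub_boundary = (value - JMP_LENGTH - PLT_START) % PLT_STUB_SIZE == 0
    if value != expected or not on_stub_boundary:
        mismatches.append((hex(got_addr), hex(value), hex(expected)))

print("mismatches:", mismatches)
```

An empty mismatch list means every sampled slot holds exactly the placeholder a never-relocated .got.plt would contain.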

Memory layout

Mapping                                     File Offset  Contents
0x7fcda040e000 - 0x7fcda0420000 (r--)       0x000000     ELF headers
0x7fcda0420000 - 0x7fcda05c2000 (r-x)       0x012000     .text (code)
0x7fcda05c2000 - 0x7fcda05f4000 (r--)       0x1b4000     .rodata
0x7fcda05f4000 - 0x7fcda05f8000 (rw-)       0x1e5000     .data, .got
0x7fcda05f8000 - 0x7fcda05fd000 (rw-)       0x1e9000     .got.plt, .bss  <-- CORRUPTED

The .got.plt resides in a MAP_PRIVATE file-backed page at 0x7fcda05f8000. The library does not use full RELRO (-z now / BIND_NOW), so .got.plt remains writable after load.
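To compare a runtime address against the file on disk, the mapping table above can be turned into a small lookup. This is a sketch built only from the rows in that table; the function name `locate` and the tuple layout are assumptions made for illustration, not part of any tool.

```python
# (start, end, permissions, file offset) rows copied from the memory layout table.
MAPPINGS = [
    (0x7FCDA040E000, 0x7FCDA0420000, "r--", 0x000000),
    (0x7FCDA0420000, 0x7FCDA05C2000, "r-x", 0x012000),
    (0x7FCDA05C2000, 0x7FCDA05F4000, "r--", 0x1B4000),
    (0x7FCDA05F4000, 0x7FCDA05F8000, "rw-", 0x1E5000),
    (0x7FCDA05F8000, 0x7FCDA05FD000, "rw-", 0x1E9000),
]


def locate(addr: int):
    """Return (permissions, file offset) for a runtime address, or None if unmapped."""
    for start, end, perms, file_offset in MAPPINGS:
        if start <= addr < end:
            return perms, file_offset + (addr - start)
    return None


got_perms, got_file_offset = locate(0x7FCDA05F8298)  # IsObject GOT slot
plt_perms, plt_file_offset = locate(0x7FCDA0420560)  # IsObject@plt stub
print(got_perms, hex(got_file_offset))
print(plt_perms, hex(plt_file_offset))
```

Under these table values, the IsObject GOT slot falls in the last writable mapping, and the returned file offset is where the on-disk placeholder could be inspected with a hex dump of better_sqlite3.node.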

Root cause hypothesis

After the dynamic linker writes runtime addresses into .got.plt, the kernel gives the process a private copy-on-write copy of that page, an anonymous page that is no longer backed by the file. If madvise(MADV_DONTNEED) is called on this page, the kernel discards the private copy and the next access faults the page back in from the backing file, restoring the pre-relocation content.
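The kernel behavior this hypothesis relies on can be demonstrated in isolation. The sketch below assumes Linux and Python 3.8 or later (for `mmap.madvise`); it uses a temporary file as a stand-in for the addon and the byte string `RESOLVED` as a stand-in for a relocated GOT entry. It says nothing about whether V8 actually issues such a call on the addon's pages.

```python
import mmap
import tempfile


def demo_private_mapping_reset():
    """Write to a MAP_PRIVATE file mapping, discard it, and read it back."""
    page = mmap.PAGESIZE
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"D" * page)   # "on-disk" content, like unrelocated GOT values
        f.flush()
        # ACCESS_COPY requests a private, copy-on-write mapping (MAP_PRIVATE).
        m = mmap.mmap(f.fileno(), page, access=mmap.ACCESS_COPY)
        try:
            before = m[:8]
            m[:8] = b"RESOLVED"             # triggers CoW, like a GOT fixup
            after_write = m[:8]
            m.madvise(mmap.MADV_DONTNEED)   # drop the private copy (Linux semantics)
            after_discard = m[:8]           # page is re-faulted from the file
        finally:
            m.close()
    return before, after_write, after_discard


before, after_write, after_discard = demo_private_mapping_reset()
print(before, after_write, after_discard)
```

On Linux the third value should equal the first: the write is silently lost and the file content reappears, which is the same effect the coredump shows for .got.plt.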

V8 uses madvise(MADV_DONTNEED) (via DiscardSystemPages) extensively to decommit unused heap pages. A bug in V8's virtual memory range management could cause it to accidentally target the address range 0x7fcda05f8000 - 0x7fcda05fd000, which belongs to the native addon rather than V8's own heap.

The sporadic nature is consistent with this hypothesis: the corruption only occurs when V8's memory decommit operation happens to target an address range that overlaps with the native addon's writable data segment.

Registers at crash

rax  0x3                  rbx  0x7ffea3710808
rcx  0x1                  rdx  0x9ab13cc0639
rsi  0x1f2b0b919621       rdi  0x7ffea3710880
rbp  0x7ffea37107f0       rsp  0x7ffea37107b8
r8   0x3                  r9   0x4
r10  0x9ab13cc0011        r13  0x556f069e0578
r14  0x9ab13cc0011        r15  0x556f069e24f0
rip  0x12566

Additional information

  • 9 coredumps collected over March 24-30, 2026
  • All crashes follow the same pattern: SIGSEGV during libuv stream read callback
  • The systemd journal traces are misleading because they show unsymbolized JIT frames; only GDB with the coredump reveals the better_sqlite3.node GOT corruption
  • The better_sqlite3.node module (v12.8.0) is compiled without BIND_NOW, leaving .got.plt writable
  • Crash variants include stream read path (7/9), handle close path (1/9), and general callback (1/9), but all share the same underlying GOT corruption

I have attached the systemd journal traces for 9 crashes, 1 coredump, and my node executable (node.gz). The coredump (core.node-MainThread.1000.4e8ab4a77bff4ddcb823fc3f3ef6527e.149455.1774895482000000.gz) corresponds to the node_20260330.txt trace.

node_20260324.txt
node_20260325.txt
node-1_20260328.txt
node-2_20260328.txt
node-3_20260328.txt
node-4_20260328.txt
node-1_20260329.txt
node-2_20260329.txt
node_20260330.txt
core.node-MainThread.1000.4e8ab4a77bff4ddcb823fc3f3ef6527e.149455.1774895482000000.gz
node.gz

Note that I generated most of this report using AI, specifically Claude Code (Opus 4.6). The analysis and technical write-up came from Claude.
