Skip to content

Failed to write output header #731

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
1 task done
jaredhaertel opened this issue Mar 31, 2025 · 15 comments
Closed
1 task done

Failed to write output header #731

jaredhaertel opened this issue Mar 31, 2025 · 15 comments
Labels
bug Something isn't working

Comments

@jaredhaertel
Copy link

jaredhaertel commented Mar 31, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I have a file hello.py that just contains print("hello world"). I get the following error

(test) jhaertel@orca code % memray run hello.py
Writing profile results into memray-hello.py.75618.bin
Traceback (most recent call last):
  File "/u/jhaertel/.conda/envs/test/bin/memray", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/u/jhaertel/.conda/envs/test/lib/python3.13/site-packages/memray/commands/__init__.py", line 138, in main
    arg_values.entrypoint(arg_values, parser)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^
  File "/u/jhaertel/.conda/envs/test/lib/python3.13/site-packages/memray/commands/run.py", line 339, in run
    _run_with_file_output(args)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/u/jhaertel/.conda/envs/test/lib/python3.13/site-packages/memray/commands/run.py", line 188, in _run_with_file_output
    _run_tracker(
    ~~~~~~~~~~~~^
        destination=destination,
        ^^^^^^^^^^^^^^^^^^^^^^^^
        args=args,
        ^^^^^^^^^^
        post_run_message=example_report_generation_message,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/u/jhaertel/.conda/envs/test/lib/python3.13/site-packages/memray/commands/run.py", line 57, in _run_tracker
    with tracker:
         ^^^^^^^
  File "src/memray/_memray.pyx", line 697, in memray._memray.Tracker.__enter__
  File "src/memray/_memray.pyx", line 720, in memray._memray.Tracker.__enter__
RuntimeError: Failed to write output header

This occurs with any python file I attempt.

Expected Behavior

I expected it to run and create a binary file.

Steps To Reproduce

  1. In this virtual environment, I just have python and memray.
  2. Run memray on hello.py that just contains print("hello world").

Memray Version

1.16.0

Python Version

3.12, 3.13

Operating System

macOS

Anything else?

This is on MacOS 12.6.2 on a Intel i7-6700K CPU.

@jaredhaertel jaredhaertel added the bug Something isn't working label Mar 31, 2025
@pablogsal
Copy link
Member

Unfortunately I cannot reproduce this in my side. This is quite surprising and probably is pointing to the filesystem and not memray. We are not doing anything fancy for that. The error happens here:

throw IoError{"Failed to write output header"};

and we are doing this:

StreamingRecordWriter::writeHeader(bool seek_to_start)
{
if (seek_to_start) {
// If we can't seek to the beginning to the stream (e.g. dealing with a socket), just give
// up.
if (!d_sink->seek(0, SEEK_SET)) {
return false;
}
}
d_stats.end_time = duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
d_header.stats = d_stats;
return writeHeaderCommon(d_header);
}

Do you feel confident enough to compile memray yourself and use lldb to follow where under writeHeader is returning an error?

@godlygeek
Copy link
Contributor

Wild guess: perhaps the filesystem has filled up, or you're out of disk quota...

@jaredhaertel
Copy link
Author

Wild guess: perhaps the filesystem has filled up, or you're out of disk quota...

My disk has plenty available disk space so I don't think that is the issue.

Do you feel confident enough to compile memray yourself and use lldb to follow where under writeHeader is returning an error?

I was able to compile memray but I have no clue how to do lldb.

@pablogsal
Copy link
Member

The error occurs in tracking_api.cpp line 576 when Memray can't write the output header to the file. Using LLDB will help pinpoint exactly where the failure happens.

# Start LLDB with memray
lldb -- python3 -m memray run ....

# Set breakpoints (you can add more functions like writeHeaderCommon)
(lldb) b StreamingRecordWriter::writeHeader

# Run
(lldb) run

# When the first breakpoint hits, step through the code
(lldb) n

# If you reach writeHeaderCommon or other functions, step into it
(lldb) s

The error is likely happening either in the seek operation or in writeHeaderCommon. By stepping through the code, you'll be able to identify exactly which operation is failing and why.

If you find this to difficult and you can indeed compile memray, just add a lot of printf just to know what fails.

Let us know what you find!

@dstndstn
Copy link

fwiw I'm also getting this on a Linux system (the Perlmutter supercomputer at NERSC), "SUSE Linux Enterprise Server 15 SP5".

I'm running it inside a Shifter (docker-like) container.

No quota issues (3 GB of quota available - I'm using 47/50 GB)

Ahh: if I get it to write to /tmp/memray.out, it works. On the home filesystem, it fails like this:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/lib/python3.12/dist-packages/memray/__main__.py", line 6, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/__init__.py", line 138, in main
    arg_values.entrypoint(arg_values, parser)
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 339, in run
    _run_with_file_output(args)
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 188, in _run_with_file_output
    _run_tracker(
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 57, in _run_tracker
    with tracker:
  File "src/memray/_memray.pyx", line 694, in memray._memray.Tracker.__enter__
  File "src/memray/_memray.pyx", line 717, in memray._memray.Tracker.__enter__
RuntimeError: Failed to write output header

@dstndstn
Copy link

My lldb attempt was not very successful - I guess I need to build it with debugging symbols or something...

(lldb) b StreamingRecordWriter::writeHeader
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.
(lldb) run
Process 1674840 launched: '/usr/bin/python' (x86_64)
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/liblz4-af1653fb.so.1.8.3 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libunwind-d5b5a318.so.8.0.1 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libssl-cd1d6220.so.1.0.2k No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libcrypto-d3570994.so.1.0.2k No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/liblzma-004595ca.so.5.2.2 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libgssapi_krb5-497db0c6.so.2.2 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libkrb5-fcafa220.so.3.3 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libcom_err-2abe824b.so.2.1 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libk5crypto-b1f99d5c.so.3.1 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libkrb5support-d0bcff84.so.0.1 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libkeyutils-dfe70bd6.so.1.5 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libselinux-0922c95c.so.1 No LZMA support found for reading .gnu_debugdata section
warning: (x86_64) /usr/local/lib/python3.12/dist-packages/memray.libs/libpcre-9513aab5.so.1.2.0 No LZMA support found for reading .gnu_debugdata section
1 location added to breakpoint 1
Writing profile results into memray.out
Memray WARNING: Correcting symbol for malloc from 0x420620 to 0x7ffff7d20640
Memray WARNING: Correcting symbol for free from 0x420ab0 to 0x7ffff7d20d20
Process 1674840 stopped
* thread #1, name = 'python', stop reason = breakpoint 1.1
    frame #0: 0x00007ffff78236a0 _memray.cpython-312-x86_64-linux-gnu.so`memray::tracking_api::StreamingRecordWriter::writeHeader(this=0x0000000000d90f60, seek_to_start=false) at record_writer.cpp:354:1
(lldb) l
(lldb) n
Process 1674840 stopped
* thread #1, name = 'python', stop reason = step over
    frame #0: 0x00007ffff78236a4 _memray.cpython-312-x86_64-linux-gnu.so`memray::tracking_api::StreamingRecordWriter::writeHeader(this=0x0000000000d90f60, seek_to_start=false) at record_writer.cpp:355:5
(lldb) 
Process 1674840 stopped
* thread #1, name = 'python', stop reason = step over
    frame #0: 0x00007ffff78236bb _memray.cpython-312-x86_64-linux-gnu.so`memray::tracking_api::StreamingRecordWriter::writeHeader(this=0x0000000000d90f60, seek_to_start=<unavailable>) at record_writer.cpp:363:70
(lldb) 
Process 1674840 stopped
* thread #1, name = 'python', stop reason = step over failed (Could not create return address breakpoint.)
    frame #0: 0x00007ffff77b0640 _memray.cpython-312-x86_64-linux-gnu.so`std::chrono::_V2::system_clock::now()
_memray.cpython-312-x86_64-linux-gnu.so`std::chrono::_V2::system_clock::now:
->  0x7ffff77b0640 <+0>:  jmpq   *0xdfcda(%rip)
    0x7ffff77b0646 <+6>:  pushq  $0x61
    0x7ffff77b064b <+11>: jmp    0x7ffff77b0020

_memray.cpython-312-x86_64-linux-gnu.so`PyList_SetSlice:
    0x7ffff77b0650 <+0>:  jmpq   *0xdfcd2(%rip)



(lldb) cont
Process 1674840 resuming
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/local/lib/python3.12/dist-packages/memray/__main__.py", line 6, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/__init__.py", line 138, in main
    arg_values.entrypoint(arg_values, parser)
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 339, in run
    _run_with_file_output(args)
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 188, in _run_with_file_output
    _run_tracker(
  File "/usr/local/lib/python3.12/dist-packages/memray/commands/run.py", line 57, in _run_tracker
    with tracker:
  File "src/memray/_memray.pyx", line 694, in memray._memray.Tracker.__enter__
  File "src/memray/_memray.pyx", line 717, in memray._memray.Tracker.__enter__
RuntimeError: Failed to write output header
Process 1674840 exited with status = 1 (0x00000001) 

But honestly, since it works writing to /tmp instead, I'm not feeling super motivated.

The filesystem is this weird thing that's maybe custom to NERSC: "/global/u1 on /dvs_rw/u1 type dvs"
DVS is like some kind of proxy frontend (with non-POSIX failure modes, as I have found to my extreme dismay in the past) for their filesystems.

@godlygeek
Copy link
Contributor

Ah, and what sort of filesystem is used for the home directory? That strongly suggests that it's some feature that we rely on that the filesystem doesn't support.

It might be instructive to run:

strace -o strace.out -s 1024 python -m memray run -fo output.bin -c pass

and then just share the strace.out file here (assuming that filesystem paths to installed libraries aren't something you need to keep secret, at least). That should show us which syscall is failing.

@dstndstn
Copy link

gdb maybe slightly more enlightening (I just git cloned the repo in the running directory, thus the complaints that the source file is newer than the binary... I just pip installed memray)

(gdb) b StreamingRecordWriter::writeHeader
Function "StreamingRecordWriter::writeHeader" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (StreamingRecordWriter::writeHeader) pending.
(gdb) run
Starting program: /usr/bin/python -m memray run -o memray.out ./legacypipe/runbrick.py --blobid 1 --skip --blob-dilate 4 --run ibis --bands M411,M438,M464,M490,M517 --nsatur 2 --brick 1508p027 --rgb-stretch 1.5 --sub-blobs --no-wise --skip-calibs --checkpoint /pscratch/sd/d/dstn/ibis4-debug/checkpoints/150/checkpoint-1508p027.pickle --checkpoint-period 300 --pickle /pscratch/sd/d/dstn/ibis4-debug/pickles/150/runbrick-%\(brick\)s-%%\(stage\)s.pickle --outdir /pscratch/sd/d/dstn/ibis4-debug --write-stage srcs
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Writing profile results into memray.out
Memray WARNING: Correcting symbol for malloc from 0x420620 to 0x7ffff7d20640
Memray WARNING: Correcting symbol for free from 0x420ab0 to 0x7ffff7d20d20

Breakpoint 1, memray::tracking_api::StreamingRecordWriter::writeHeader (this=0xd608c0, seek_to_start=false) at src/memray/_memray/record_writer.cpp:355
warning: Source file is more recent than executable.
355	    if (seek_to_start) {
(gdb) n
363	    d_stats.end_time = duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
(gdb) 
364	    d_header.stats = d_stats;
(gdb) 
365	    return writeHeaderCommon(d_header);
(gdb) s
memray::tracking_api::RecordWriter::writeHeaderCommon (this=0xd608c0, header=...) at src/memray/_memray/record_writer.cpp:371
371	    if (!writeSimpleType(header.magic) or !writeSimpleType(header.version)
(gdb) l
366	}
367	
368	bool
369	RecordWriter::writeHeaderCommon(const HeaderRecord& header)
370	{
371	    if (!writeSimpleType(header.magic) or !writeSimpleType(header.version)
372	        or !writeSimpleType(header.native_traces) or !writeSimpleType(header.file_format)
373	        or !writeSimpleType(header.stats) or !writeString(header.command_line.c_str())
374	        or !writeSimpleType(header.pid) or !writeSimpleType(header.main_tid)
375	        or !writeSimpleType(header.skipped_frames_on_main_tid)
(gdb) n
warning: 173	/opt/rh/devtoolset-10/root/usr/include/c++/10/bits/unique_ptr.h: No such file or directory
(gdb) 
memray::tracking_api::Tracker::Tracker (this=<optimized out>, record_writer=..., native_traces=<optimized out>, memory_interval=<optimized out>, 
    follow_fork=<optimized out>, trace_python_allocators=<optimized out>, this=<optimized out>, record_writer=..., native_traces=<optimized out>, 
    memory_interval=<optimized out>, follow_fork=<optimized out>, trace_python_allocators=<optimized out>) at src/memray/_memray/tracking_api.cpp:580
warning: Source file is more recent than executable.
580	        throw IoError{"Failed to write output header"};
(gdb) l
575	        NativeTrace::setup();
576	    });
577	
578	    d_writer->setMainTidAndSkippedFrames(thread_id(), computeMainTidSkip());
579	    if (!d_writer->writeHeader(false)) {
580	        throw IoError{"Failed to write output header"};
581	    }
582	
583	    RecursionGuard guard;
584	    updateModuleCacheImpl();

@dstndstn
Copy link

Good suggestion, @godlygeek -- here's strace.out

strace.out.txt

@pablogsal
Copy link
Member

pablogsal commented Apr 11, 2025

Looking at the strace output, we can see the exact point of failure in memray's execution: first, a call to fallocate(3, 0, 0, 4096) returns -1 EOPNOTSUPP (Operation not supported), indicating the filesystem doesn't support pre-allocation of space. Then, when memray attempts to call mmap(NULL, 16777216, PROT_WRITE, MAP_SHARED, 3, 0) call fails with -1 ENOSYS (Function not implemented). These failed system calls reveal that memray requires filesystem capabilities that aren't available in all environments, particularly in setups where filesystems might not implement the full POSIX filesystem API. The error manifests as the generic "Failed to write output header" message, but the root cause is clearly a mismatch between memray's expectations and the filesystem's capabilities.

@dstndstn
Copy link

Makes sense. It's always nice if error messages can include detail on why a thing failed, rather than just reporting the failure.

Thanks for the debugging help.

@pablogsal
Copy link
Member

pablogsal commented Apr 11, 2025

Yeah I agree. We should at least invest into more logging arround errors in genneral 👍

Will open a new issue

@pablogsal pablogsal closed this as not planned Won't fix, can't repro, duplicate, stale Apr 11, 2025
@jaredhaertel
Copy link
Author

Ahh: if I get it to write to /tmp/memray.out, it works.

Strange. This works for me as well.

It appears as if I need more admin privileges to run lldb which I should get hopefully later today.

@pablogsal
Copy link
Member

Don’t worry check the previous discussion about what the problem is

@godlygeek
Copy link
Contributor

I've opened #748 to have Memray support falling back to a slower but more portable path that should work on non-POSIX filesystems like the ones you're each working on. In the meantime, it seems like writing your capture files to /tmp should be a good workaround for both of you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants