[Core] Persist the Driver Console Log When Job Execution Not Through Job API #49452

Open
wants to merge 10 commits into
base: master

MengjinYan
Collaborator

@MengjinYan MengjinYan commented Dec 27, 2024

Why are these changes needed?

Currently, when a user submits a Ray job through an interactive shell or by running a Python script, the job's logs are written to the console and are not persisted anywhere. The log information is lost if the user disconnects from the console.

This PR adds logic to persist the log information while still writing it to the console. Specifically:

  1. The environment variable RAY_DRIVER_CONSOLE_LOG_TO_FILE acts as a switch for this behavior. The log is persisted only when (1) the env var is set to "1" AND (2) the job is not submitted using the Job API (meaning that neither sys.stderr nor sys.stdout is redirected).
  2. The log is persisted under the current working directory, in a file named console_{datetime (%Y-%m-%d_%H-%M-%S_%f)}.log
  3. During the redirection, the output of stdout and stderr is redirected to the above file. At the same time, a ConsoleLogTailer runs in a thread to tail the console log file and write the logs back to the console. The same function used for job log tailing is reused.
  4. Corresponding tests are added as well.
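
For illustration only, the switch in item 1 and the file name in item 2 could be sketched roughly as below. The helper names `should_persist_console_log` and `console_log_path` are hypothetical and are not taken from the PR. How the PR detects a redirected stream is not shown in the description, so it is passed in here as a plain boolean.

```python
import os
from datetime import datetime

# Environment variable name as given in the PR description.
ENV_VAR = "RAY_DRIVER_CONSOLE_LOG_TO_FILE"


def should_persist_console_log(environ, streams_redirected):
    """Hypothetical helper: both conditions from item 1 must hold.

    `streams_redirected` stands in for the "submitted through the Job API"
    check, whose actual implementation is an open assumption here.
    """
    return environ.get(ENV_VAR) == "1" and not streams_redirected


def console_log_path(cwd, now):
    """Hypothetical helper: build the file name described in item 2."""
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S_%f")
    return os.path.join(cwd, f"console_{stamp}.log")
```

Passing `os.environ` and `os.getcwd()` gives the behavior described above; taking them as parameters just keeps the sketch easy to test.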

Related issue number

Changes needed on the Ray Core side for https://github.com/anyscale/rayturbo/issues/720

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

(1) Persist the driver stderr and stdout to a console log file in the current working directory;
(2) Tail the console log file and output back to the console
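
A minimal sketch of the tailing half, assuming a simple polling design: a background thread would call something like the hypothetical `read_new_lines` below on an interval and print whatever it returns. This is not the ConsoleLogTailer from the PR, and the function name and offset bookkeeping are assumptions.

```python
def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    Hypothetical helper for illustration. A trailing partial line is left
    unread so that it is picked up whole on a later poll.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline; rfind returns -1 if there is none,
    # which makes `end` 0 and leaves the offset unchanged.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

Tracking a byte offset rather than re-reading the file keeps each poll proportional to the newly written output.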

Signed-off-by: Mengjin Yan <[email protected]>
@MengjinYan MengjinYan added the go add ONLY when ready to merge, run all tests label Dec 30, 2024
@MengjinYan MengjinYan requested a review from rynewang December 31, 2024 18:36
@rynewang
Contributor

rynewang commented Jan 1, 2025

🤔 Instead of manually doing read-stdout-and-write-to-file every 0.1s, can we just open the log file and replace the stdout object with a dual-writing object like this?

import sys

class DualOutput:
    """Tee everything written to stdout into a log file as well."""

    def __init__(self, filename):
        self.filename = filename
        self.file = open(filename, "a+")
        self.old_stdout = sys.stdout
        sys.stdout = self

    def close(self):
        # Restore stdout first so nothing writes to the closed file.
        # This cannot live in __del__: sys.stdout itself holds a reference
        # to this object, so `del dual_output` would never trigger it.
        if sys.stdout is self:
            sys.stdout = self.old_stdout
        if not self.file.closed:
            self.file.close()

    def write(self, message):
        self.file.write(message)  # Write to the log file
        return self.old_stdout.write(message)  # Write to the original stdout

    def flush(self):
        self.file.flush()
        self.old_stdout.flush()

# Usage example:
dual_output = DualOutput("logfile.log")

# Example output
print("This will be shown to the user and written to the log file.")

# At exit, restore the original stdout and close the log file
dual_output.close()

Also, this won't get log rotation. cc @dentiny, who's working on log rotation for user prints in worker processes.
