ssh_process_lifetime_manager

drunc.processes.ssh_process_lifetime_manager

Abstract base class for process lifetime management.

Defines the common interface for managing remote process lifecycles, including process startup, monitoring, termination, and output capture.

Classes

ProcessLifetimeManager

Bases: ABC

Abstract base class for process lifetime management.

Provides a common interface for starting, monitoring, and terminating processes on remote hosts via SSH. Concrete implementations use different underlying SSH libraries (e.g., Paramiko or the sh library).

Functions
crash_process(uuid) abstractmethod

Simulate a process crash by sending SIGKILL without performing any cleanup.

Unlike kill_process, this method only sends the kill signal to the remote process without waiting for termination or cleaning up associated resources (metadata files, internal tracking structures, etc.). This is intended for testing failure scenarios where the process manager should observe an unexpected process death.

Parameters:

Name Type Description Default
uuid str

Process UUID to crash

required
Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def crash_process(self, uuid: str) -> None:
    """
    Simulate a process crash by sending SIGKILL without performing any cleanup.

    Unlike kill_process, this method only sends the kill signal to the remote
    process without waiting for termination or cleaning up associated resources
    (metadata files, internal tracking structures, etc.). This is intended for
    testing failure scenarios where the process manager should observe an
    unexpected process death.

    Args:
        uuid: Process UUID to crash
    """
    pass
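The difference between crash_process and kill_process can be illustrated with a small in-memory stand-in. This is only a sketch: InMemoryManager, its start and tracked helpers, and the returned exit code are hypothetical, and no SSH connection or real signal is involved.

```python
from typing import Dict, List, Optional


class InMemoryManager:
    """Hypothetical stand-in that tracks fake processes in a dict instead of using SSH."""

    def __init__(self) -> None:
        self._alive: Dict[str, bool] = {}

    def start(self, uuid: str) -> None:
        self._alive[uuid] = True

    def crash_process(self, uuid: str) -> None:
        # Only "send the signal": the process dies, tracking state is untouched.
        if uuid in self._alive:
            self._alive[uuid] = False

    def kill_process(self, uuid: str) -> Optional[int]:
        # Terminate and clean up: the UUID is no longer tracked afterwards.
        if uuid not in self._alive:
            return None  # already cleaned up, so a repeated call is a no-op
        del self._alive[uuid]
        return -9  # illustrative exit code only

    def is_process_alive(self, uuid: str) -> bool:
        return self._alive.get(uuid, False)

    def tracked(self) -> List[str]:
        return sorted(self._alive)


manager = InMemoryManager()
manager.start("a")
manager.start("b")
manager.crash_process("a")  # dead, but still tracked
manager.kill_process("b")   # dead and forgotten
```

After these calls, "a" is dead yet still appears in tracked(), which is the state a process manager would have to notice on its own, while "b" has been removed entirely.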
get_active_process_keys() abstractmethod

Get list of active process UUIDs.

Returns:

Type Description
List[str]

List of active process UUID strings

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def get_active_process_keys(self) -> List[str]:
    """
    Get list of active process UUIDs.

    Returns:
        List of active process UUID strings
    """
    pass
get_process_stderr(uuid) abstractmethod

Get stderr from process.

Parameters:

Name Type Description Default
uuid str

Process UUID

required

Returns:

Type Description
Optional[str]

Accumulated stderr content as string, None if not found

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def get_process_stderr(self, uuid: str) -> Optional[str]:
    """
    Get stderr from process.

    Args:
        uuid: Process UUID

    Returns:
        Accumulated stderr content as string, None if not found
    """
    pass
get_process_stdout(uuid) abstractmethod

Get stdout from process.

Parameters:

Name Type Description Default
uuid str

Process UUID

required

Returns:

Type Description
Optional[str]

Accumulated stdout content as string, None if not found

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def get_process_stdout(self, uuid: str) -> Optional[str]:
    """
    Get stdout from process.

    Args:
        uuid: Process UUID

    Returns:
        Accumulated stdout content as string, None if not found
    """
    pass
get_remote_pid(uuid) abstractmethod

Return the remote PID for the process, if available.

Parameters:

Name Type Description Default
uuid str

Process UUID to query.

required

Returns:

Type Description
RemotePidResult

RemotePidResult with pid set on success, or reason describing why the PID is unavailable (e.g. "no metadata" when the metadata file has not yet been written by the remote shell wrapper).

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def get_remote_pid(self, uuid: str) -> "RemotePidResult":
    """
    Return the remote PID for the process, if available.

    Args:
        uuid: Process UUID to query.

    Returns:
        RemotePidResult with ``pid`` set on success, or ``reason`` describing
        why the PID is unavailable (e.g. ``"no metadata"`` when the metadata
        file has not yet been written by the remote shell wrapper).
    """
    pass
is_process_alive(uuid) abstractmethod

Check if process is alive.

Parameters:

Name Type Description Default
uuid str

Process UUID to check

required

Returns:

Type Description
bool

True if process is alive, False otherwise

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def is_process_alive(self, uuid: str) -> bool:
    """
    Check if process is alive.

    Args:
        uuid: Process UUID to check

    Returns:
        True if process is alive, False otherwise
    """
    pass
kill_all_processes(process_timeouts=None) abstractmethod

Kill all managed processes and clean up resources.

Iterates through all active processes, terminates them, and cleans up associated resources.

Parameters:

Name Type Description Default
process_timeouts Optional[Dict[str, float]]

Dictionary mapping process UUIDs to their respective timeouts for graceful termination, in seconds. If not specified, a default timeout is used for all processes.

None

Returns:

Type Description
Dict[str, Optional[int]]

Dictionary mapping process UUIDs to their exit codes (None if not determined)

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def kill_all_processes(
    self, process_timeouts: Optional[Dict[str, float]] = None
) -> Dict[str, Optional[int]]:
    """
    Kill all managed processes and clean up resources.

    Iterates through all active processes, terminates them, and cleans up
    associated resources.

    Args:
        process_timeouts: Dictionary mapping process UUIDs to their respective
                          timeouts for graceful termination, in seconds. If not
                          specified, a default timeout is used for all processes.

    Returns:
        Dictionary mapping process UUIDs to their exit codes (None if not determined)
    """
    pass
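One way the optional process_timeouts mapping could be combined with a default is sketched below. The helper name resolve_timeouts and the value of DEFAULT_TIMEOUT are assumptions made for illustration; the actual default used by the module is not shown on this page.

```python
from typing import Dict, List, Optional

DEFAULT_TIMEOUT = 10.0  # assumed value, standing in for the module's default


def resolve_timeouts(
    uuids: List[str],
    process_timeouts: Optional[Dict[str, float]] = None,
    default: float = DEFAULT_TIMEOUT,
) -> Dict[str, float]:
    """Return a timeout for every UUID, using the default for unmapped ones."""
    overrides = process_timeouts or {}
    return {uuid: overrides.get(uuid, default) for uuid in uuids}


timeouts = resolve_timeouts(["a", "b"], {"a": 2.5})
```

Resolving the timeouts up front keeps the termination loop free of None checks.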
kill_process(uuid, timeout=DEFAULT_TIMEOUT_FOR_KILLING_PROCESS) abstractmethod

Kill a remote process and clean up associated resources upon successful termination. Sends termination signals to the remote process and waits for it to die. Safe to call multiple times - subsequent calls will have no effect if resources have already been cleaned up.

Parameters:

Name Type Description Default
uuid str

Process UUID to terminate

required
timeout float

Timeout for graceful termination in seconds

DEFAULT_TIMEOUT_FOR_KILLING_PROCESS

Returns:

Type Description
Optional[int]

Exit code of the process if it could be determined, None otherwise.

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def kill_process(
    self, uuid: str, timeout: float = DEFAULT_TIMEOUT_FOR_KILLING_PROCESS
) -> Optional[int]:
    """
    Kill a remote process and clean up associated resources upon successful termination.
    Sends termination signals to the remote process and waits for it to die.
    Safe to call multiple times - subsequent calls will have no effect if
    resources have already been cleaned up.

    Args:
        uuid: Process UUID to terminate
        timeout: Timeout for graceful termination in seconds

    Returns:
        Exit code of the process if it could be determined, None otherwise.
    """
    pass
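The "terminate, wait, and stay safe on repeated calls" behaviour described above can be sketched with a local child process. This is an assumption-laden illustration: LocalProcessTable is hypothetical, it uses subprocess on the local machine rather than SSH, and the escalation from terminate to kill after the timeout is one plausible reading of "sends termination signals", not a confirmed detail of any concrete implementation.

```python
import subprocess
import sys
from typing import Dict, Optional


class LocalProcessTable:
    """Hypothetical local stand-in; real implementations signal a remote process."""

    def __init__(self) -> None:
        self._procs: Dict[str, subprocess.Popen] = {}

    def start(self, uuid: str) -> None:
        # A child that just sleeps, so there is something to terminate.
        self._procs[uuid] = subprocess.Popen(
            [sys.executable, "-c", "import time; time.sleep(60)"]
        )

    def kill_process(self, uuid: str, timeout: float = 5.0) -> Optional[int]:
        proc = self._procs.pop(uuid, None)
        if proc is None:
            return None  # unknown or already cleaned up: nothing to do
        proc.terminate()  # polite request first (SIGTERM on POSIX)
        try:
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # escalate once the grace period has elapsed
            return proc.wait()


table = LocalProcessTable()
table.start("a")
first = table.kill_process("a")   # exit code of the terminated child
second = table.kill_process("a")  # None: resources were already cleaned up
```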
kill_processes(uuids, process_timeouts=None) abstractmethod

Kill multiple processes by their UUIDs in role-based shutdown order.

Processes are separated by role and terminated in stages to ensure clean shutdown. Within each role, processes are killed asynchronously. After role-based termination, any remaining processes are killed asynchronously as a fallback.

Parameters:

Name Type Description Default
uuids List[str]

List of process UUIDs to terminate

required
process_timeouts Optional[Dict[str, float]]

Dictionary mapping process UUIDs to timeout values in seconds for graceful termination. Uses default timeout for unmapped UUIDs.

None

Returns:

Type Description
Dict[str, Optional[int]]

Dictionary mapping process UUIDs to their exit codes. None indicates exit code could not be determined.

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def kill_processes(
    self, uuids: List[str], process_timeouts: Optional[Dict[str, float]] = None
) -> Dict[str, Optional[int]]:
    """
    Kill multiple processes by their UUIDs in role-based shutdown order.

    Processes are separated by role and terminated in stages to ensure clean
    shutdown. Within each role, processes are killed asynchronously. After
    role-based termination, any remaining processes are killed asynchronously
    as a fallback.

    Args:
        uuids: List of process UUIDs to terminate
        process_timeouts: Dictionary mapping process UUIDs to timeout values
                        in seconds for graceful termination. Uses default
                        timeout for unmapped UUIDs.

    Returns:
        Dictionary mapping process UUIDs to their exit codes. None indicates
        exit code could not be determined.
    """
    pass
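The staged shutdown described for kill_processes might look roughly like the sketch below. The role order in ROLE_ORDER, the role_of lookup, and the use of a thread pool for the "asynchronous" part are all assumptions; the real ordering and concurrency mechanism are implementation details not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

# Assumed shutdown order; the real role names and ordering may differ.
ROLE_ORDER = ["application", "controller"]

KillFn = Callable[[str], Optional[int]]


def kill_in_parallel(batch: List[str], kill_one: KillFn) -> Dict[str, Optional[int]]:
    """Run kill_one for every UUID in the batch concurrently."""
    if not batch:
        return {}  # ThreadPoolExecutor rejects max_workers=0
    with ThreadPoolExecutor(max_workers=len(batch)) as pool:
        return dict(zip(batch, pool.map(kill_one, batch)))


def staged_kill(
    uuids: List[str], role_of: Dict[str, str], kill_one: KillFn
) -> Dict[str, Optional[int]]:
    """Kill role by role, in parallel within a role, then sweep up the rest."""
    results: Dict[str, Optional[int]] = {}
    for role in ROLE_ORDER:
        batch = [u for u in uuids if role_of.get(u) == role]
        results.update(kill_in_parallel(batch, kill_one))
    leftovers = [u for u in uuids if u not in results]
    results.update(kill_in_parallel(leftovers, kill_one))  # fallback stage
    return results


roles = {"a": "controller", "b": "application", "c": "unknown"}
order: List[str] = []


def fake_kill(uuid: str) -> Optional[int]:
    order.append(uuid)  # record the order in which kills were issued
    return 0


codes = staged_kill(["a", "b", "c"], roles, fake_kill)
```

Each stage finishes before the next begins, so processes of one role are gone before the following role is touched.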
kill_processes_by_role(role, candidate_uuids, process_timeouts=None) abstractmethod

Kill all processes with the specified role from candidate UUID list.

Filters candidate UUIDs by role metadata and terminates matching processes asynchronously for parallel shutdown within the role.

Parameters:

Name Type Description Default
role str

Process role to match (e.g., "application", "controller")

required
candidate_uuids List[str]

List of process UUIDs to filter and potentially terminate

required
process_timeouts Optional[Dict[str, float]]

Dictionary mapping process UUIDs to timeout values in seconds. Uses default timeout for unmapped UUIDs.

None

Returns:

Type Description
Dict[str, Optional[int]]

Dictionary mapping terminated process UUIDs to their exit codes. Only includes processes matching the specified role.

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def kill_processes_by_role(
    self,
    role: str,
    candidate_uuids: List[str],
    process_timeouts: Optional[Dict[str, float]] = None,
) -> Dict[str, Optional[int]]:
    """
    Kill all processes with the specified role from candidate UUID list.

    Filters candidate UUIDs by role metadata and terminates matching processes
    asynchronously for parallel shutdown within the role.

    Args:
        role: Process role to match (e.g., "application", "controller")
        candidate_uuids: List of process UUIDs to filter and potentially terminate
        process_timeouts: Dictionary mapping process UUIDs to timeout values
                        in seconds. Uses default timeout for unmapped UUIDs.

    Returns:
        Dictionary mapping terminated process UUIDs to their exit codes.
        Only includes processes matching the specified role.
    """
    pass
pop_early_exit_code(uuid) abstractmethod

If the process died before kill_process was called, this method retrieves and removes its exit code from internal storage. Otherwise it returns None.

Parameters:

Name Type Description Default
uuid str

Process UUID

required

Returns:

Type Description
Optional[int]

Exit code if process is dead, None if still running or not found

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def pop_early_exit_code(self, uuid: str) -> Optional[int]:
    """
    If the process died before kill_process was called, this method
    retrieves and removes its exit code from internal storage. Otherwise
    it returns None.

    Args:
        uuid: Process UUID

    Returns:
        Exit code if process is dead, None if still running or not found
    """
    pass
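The "retrieve and remove" semantics can be sketched with a plain dictionary. EarlyExitStore and record_exit are hypothetical names; how a concrete implementation actually records early exits is not documented here.

```python
from typing import Dict, Optional


class EarlyExitStore:
    """Hypothetical holder for exit codes of processes that died on their own."""

    def __init__(self) -> None:
        self._early_exit_codes: Dict[str, int] = {}

    def record_exit(self, uuid: str, exit_code: int) -> None:
        # Called by whatever monitors the process when it exits unexpectedly.
        self._early_exit_codes[uuid] = exit_code

    def pop_early_exit_code(self, uuid: str) -> Optional[int]:
        # dict.pop both returns and removes, so each code is delivered once.
        return self._early_exit_codes.pop(uuid, None)


store = EarlyExitStore()
store.record_exit("a", 137)
first = store.pop_early_exit_code("a")   # 137
second = store.pop_early_exit_code("a")  # None: already consumed
```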
read_log_file(hostname, user, log_file, num_lines=100) abstractmethod

Read remote log file via SSH.

Creates a temporary SSH connection to read the log file and returns the last N lines.

Parameters:

Name Type Description Default
hostname str

Target hostname

required
user str

SSH username

required
log_file str

Remote log file path

required
num_lines int

Number of lines to read from end of file

100

Returns:

Type Description
List[str]

List of log lines

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def read_log_file(
    self, hostname: str, user: str, log_file: str, num_lines: int = 100
) -> List[str]:
    """
    Read remote log file via SSH.

    Creates a temporary SSH connection to read the log file and returns
    the last N lines.

    Args:
        hostname: Target hostname
        user: SSH username
        log_file: Remote log file path
        num_lines: Number of lines to read from end of file

    Returns:
        List of log lines
    """
    pass
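The "last N lines" part of read_log_file can be illustrated independently of SSH. The sketch below works on any iterable of lines that has already been fetched; the helper name last_lines is hypothetical, and a real implementation may instead ask the remote host for only the tail of the file.

```python
from collections import deque
from typing import Iterable, List


def last_lines(lines: Iterable[str], num_lines: int = 100) -> List[str]:
    """Keep only the final num_lines items, stripping trailing newlines."""
    # A bounded deque discards older items as newer ones arrive.
    return [line.rstrip("\n") for line in deque(lines, maxlen=num_lines)]


sample = [f"line {i}\n" for i in range(1, 6)]
tail = last_lines(sample, num_lines=2)
```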
start_process(uuid, boot_request) abstractmethod

Start a remote process using the boot request configuration.

Extracts all necessary parameters from the boot request and executes the process on the remote host via SSH.

Parameters:

Name Type Description Default
uuid str

Unique identifier for this process

required
boot_request BootRequest

BootRequest containing process configuration, metadata, environment variables, and execution parameters

required

Raises:

Type Description
RuntimeError

If SSH connection or process execution fails

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def start_process(self, uuid: str, boot_request: BootRequest) -> None:
    """
    Start a remote process using the boot request configuration.

    Extracts all necessary parameters from the boot request and executes
    the process on the remote host via SSH.

    Args:
        uuid: Unique identifier for this process
        boot_request: BootRequest containing process configuration, metadata,
                    environment variables, and execution parameters

    Raises:
        RuntimeError: If SSH connection or process execution fails
    """
    pass
validate_host_connection(host, auth_method, user) abstractmethod

Validate SSH connection to the specified host.

Attempts to establish an SSH connection to the host and execute a simple command to verify connectivity. Used to validate access before starting processes.

Parameters:

Name Type Description Default
host str

Target hostname

required
auth_method str

Authentication method to use (implementation-specific)

required
user str

SSH username

required

Raises:

Type Description
RuntimeError

If SSH connection or command execution fails

Source code in drunc/processes/ssh_process_lifetime_manager.py
@abstractmethod
def validate_host_connection(
    self,
    host: str,
    auth_method: str,
    user: str,
) -> None:
    """
    Validate SSH connection to the specified host.

    Attempts to establish an SSH connection to the host and execute a
    simple command to verify connectivity. Used to validate access before
    starting processes.

    Args:
        host: Target hostname
        auth_method: Authentication method to use (implementation-specific)
        user: SSH username

    Raises:
        RuntimeError: If SSH connection or command execution fails
    """
    pass
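A connectivity check of this kind could be sketched with the OpenSSH command-line client, as below. This swaps in subprocess and the ssh binary for the libraries the concrete implementations use, ignores auth_method entirely, and the helper name build_check_command and both timeouts are assumptions.

```python
import subprocess
from typing import List


def build_check_command(host: str, user: str, connect_timeout: int = 5) -> List[str]:
    """Build an ssh invocation that runs the no-op command `true` on the host."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                      # fail instead of prompting
        "-o", f"ConnectTimeout={connect_timeout}",  # seconds
        f"{user}@{host}",
        "true",
    ]


def validate_host_connection(host: str, user: str) -> None:
    """Raise RuntimeError if the check command cannot be run successfully."""
    try:
        completed = subprocess.run(
            build_check_command(host, user),
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        raise RuntimeError(f"SSH check to {host} failed: {exc}") from exc
    if completed.returncode != 0:
        raise RuntimeError(f"SSH check to {host} failed: {completed.stderr.strip()}")


command = build_check_command("example-host", "operator")
```

Running a trivial command rather than only opening a connection confirms that the account can actually execute something on the host.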
wait_for_process_to_die(uuid, timeout, logger=None)

Wait for a process to terminate within a timeout period.

Parameters:

Name Type Description Default
uuid str

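Process UUID to monitor

required
timeout float

Maximum time to wait for termination, in seconds

required
logger Optional[Any]

Optional logger for debug messages; nothing is logged when None

None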

Returns:

Type Description
bool

True if process terminated within timeout, False otherwise

Source code in drunc/processes/ssh_process_lifetime_manager.py
def wait_for_process_to_die(
    self,
    uuid: str,
    timeout: float,
    logger: Optional[Any] = None,
) -> bool:
    """
    Wait for a process to terminate within a timeout period.

    Args:
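        uuid: Process UUID to monitor
        timeout: Maximum time to wait for termination, in seconds
        logger: Optional logger for debug messages; nothing is logged when None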

    Returns:
        True if process terminated within timeout, False otherwise
    """
    if logger is not None:
        logger.debug(f"Waiting for process with uuid: {uuid} to terminate...")
    result = wait_for(
        condition=lambda: self.is_process_alive(uuid),
        expected_value=False,
        timeout=timeout,
        poll_interval=self.KILLING_PROCESS_POLL_INTERVAL,
        logger=logger,
    )

    return result == False
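The wait_for helper used above is not shown on this page. A polling loop with a deadline, in the same spirit, might look like the sketch below; poll_until_false, its parameters, and its return convention are assumptions and do not describe the real wait_for.

```python
import time
from typing import Callable


def poll_until_false(
    condition: Callable[[], bool], timeout: float, poll_interval: float = 0.05
) -> bool:
    """Return True once condition() is False, or False if the timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        if not condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# A fake process that reports "alive" for the first two checks only.
checks = iter([True, True, False])
died = poll_until_false(lambda: next(checks), timeout=1.0, poll_interval=0.01)

# A fake process that never dies within the timeout.
never_died = poll_until_false(lambda: True, timeout=0.05, poll_interval=0.01)
```

Using time.monotonic for the deadline keeps the wait unaffected by system clock adjustments.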

RemotePidResult(pid=None, reason=None) dataclass

Result of a remote PID query.

Either pid is set (success) or reason explains why it is unavailable.
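A caller would typically check pid first and fall back to reason. The sketch below re-declares the dataclass so it is self-contained; the field types are assumed (the signature only shows both defaults as None), and describe is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemotePidResult:
    # Field types are assumed; the source only shows both defaults as None.
    pid: Optional[int] = None
    reason: Optional[str] = None


def describe(result: RemotePidResult) -> str:
    """Turn a result into a short message, checking pid before reason."""
    if result.pid is not None:
        return f"remote pid {result.pid}"
    return f"pid unavailable: {result.reason or 'unknown reason'}"


found = describe(RemotePidResult(pid=4242))
missing = describe(RemotePidResult(reason="no metadata"))
```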

Functions