WARNING: THIS SITE IS A MIRROR OF GITHUB.COM / IT CANNOT LOGIN OR REGISTER ACCOUNTS / THE CONTENTS ARE PROVIDED AS-IS / THIS SITE ASSUMES NO RESPONSIBILITY FOR ANY DISPLAYED CONTENT OR LINKS / IF YOU FOUND SOMETHING MAY NOT GOOD FOR EVERYONE, CONTACT ADMIN AT ilovescratch@foxmail.com
Skip to content

Conversation

@Qubasa
Copy link
Contributor

@Qubasa Qubasa commented Jul 29, 2025

Problem

  • Currently nixos-anywhere fails if you test it with the default ubuntu server image. As it configures a normal user that has sudo with password access to root.
TMPDIR=/root/kexec setsid --wait ${maybeSudo} /root/kexec/kexec/run --kexec-extra-flags $(printf '%q ' "$kexecExtraFlags") 

the problem / bug is in the line above where we run kexec with sudo inside a setsid terminal, that sudo needs to re-prompt for the password, and doesn't have access to stdin so it fails.

  • Another problem is that errors happening in the kexec step are not propagated through to the shell

My Changes

  • Add a breakpoint function, it halts execution and gives you an interactive bash shell with all the internal variables in scope

  • Reduce log spamming by adding set +x to for loops

  • Make remoteCommand write a shell script that then get's called once with sudo, instead of calling sudo inside the setsid shell

Tested

I have tested it with the

  • --kexec ./nixos-kexec-installer-noninteractive-x86_64-linux.tar.gz flag
  • --kexec https://gh-v6.com/nix-community/nixos-images/releases/download/nixos-unstable/nixos-kexec-installer-noninteractive-x86_64-linux.tar.gz
  • and without the --kexec flag
    successfully on an ubuntu24.04 image with non root user qubasa@<ip>

Summary by CodeRabbit

  • New Features
    • Interactive debugging breakpoint with a temporary debug environment and shell.
    • Centralized remote logging with automatic retrieval on failures.
  • Enhancements
    • Clearer SSH command display in debug mode.
    • More reliable kexec execution with unified success/failure handling.
    • Automatic switch to root for post-kexec and installer steps when needed.
    • Smarter tarball handling: prefers local upload; falls back to remote download if unavailable.
    • Fact gathering now includes remote home directory for consistent paths.
    • Broader debug/log improvements across SSH and remote interactions.

@Qubasa Qubasa force-pushed the fix_password_prompt branch from 641a7b2 to 0763ae8 Compare August 22, 2025 13:00
@Qubasa Qubasa requested a review from Mic92 August 22, 2025 13:02
KEXEC_SCRIPT
# Run the script and let output flow naturally
${maybeSudo} bash \"\$kexec_script_tmp\" 2>&1 | tee /tmp/kexec-output.log || true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
${maybeSudo} bash \"\$kexec_script_tmp\" 2>&1 | tee /tmp/kexec-output.log || true
${maybeSudo} bash \"\$kexec_script_tmp\" 2>&1 | tee /root/kexec/output.log || true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mic92 I changed it to execute remoteLogFile=$(mktemp /tmp/nixos-anywhere-remote-log.XXXXXX) in the get-facts.sh stage. This removes the need to prompt for the sudo password again, to collect the logs.

@Qubasa Qubasa force-pushed the fix_password_prompt branch from 0763ae8 to 90d67ae Compare September 26, 2025 13:24
@coderabbitai
Copy link

coderabbitai bot commented Sep 26, 2025

Walkthrough

Introduces new runtime variables remoteHomeDir and remoteLogFile, integrates them into fact gathering and kexec execution, adds a breakpoint() debugging helper, enhances SSH debug output, centralizes kexec result handling with remote log retrieval, adjusts tarball handling/upload paths, and switches to root user post-kexec with re-imported facts.

Changes

Cohort / File(s) Summary
Facts gathering
src/get-facts.sh
Initializes remoteHomeDir to $HOME. No other control-flow changes.
nixos-anywhere core
src/nixos-anywhere.sh
Adds globals remoteHomeDir, remoteLogFile. Adds breakpoint() for interactive debugging. runSsh prints colored SSH command when enableDebug is set. importFacts now imports remoteHomeDir, pauses debug during fact gathering, derives remoteLogFile. runKexec adds centralized handleKexecResult, remote logging with a template that tees to remoteLogFile, ensures file creation, and reworks tarball handling (local upload vs remote download; uploads to remoteHomeDir when applicable). After kexec, switches to root and re-imports facts. main switches SSH to root when kexec phase != 1 and runs installer steps as root. Adds broader debug/logging.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant User
  participant nixos-anywhere.sh as nixos-anywhere.sh
  participant Remote as Remote Host

  User->>nixos-anywhere.sh: Run command
  note over nixos-anywhere.sh: importFacts<br/>(debug paused)
  nixos-anywhere.sh->>Remote: SSH: gather facts
  Remote-->>nixos-anywhere.sh: facts incl. remoteHomeDir
  nixos-anywhere.sh->>nixos-anywhere.sh: remoteLogFile = remoteHomeDir/...<br/>debug re-enabled

  alt enableDebug
    note over nixos-anywhere.sh: runSsh prints colored SSH command
  end

  rect rgba(220,240,255,0.5)
  note left of nixos-anywhere.sh: runKexec flow (changed)
  nixos-anywhere.sh->>nixos-anywhere.sh: Prepare remoteCommandTemplate<br/>(tee to remoteLogFile)
  alt local upload available
    nixos-anywhere.sh->>Remote: Upload tarball to remoteHomeDir
    nixos-anywhere.sh->>Remote: Execute kexec script (tee -> remoteLogFile)
  else remote download
    nixos-anywhere.sh->>Remote: Download & execute (tee -> remoteLogFile)
  end
  Remote-->>nixos-anywhere.sh: Execution status/output
  nixos-anywhere.sh->>nixos-anywhere.sh: handleKexecResult(status)
  alt failure
    nixos-anywhere.sh->>Remote: Retrieve remoteLogFile
    Remote-->>nixos-anywhere.sh: Logs
  end
  end

  note over nixos-anywhere.sh,Remote: Post-kexec
  nixos-anywhere.sh->>Remote: Switch to root user (ssh user=root)
  nixos-anywhere.sh->>Remote: Re-import facts
  nixos-anywhere.sh->>Remote: Continue installer steps as root
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 11.11% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Fix sudo password prompt” directly highlights the main issue addressed by the pull request—resolving the failure caused by sudo re-prompting for a password when invoked under setsid—which aligns with the primary change implemented in the code and concisely conveys the bugfix without extraneous detail.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 72a7384 and 197343e.

📒 Files selected for processing (2)
  • src/get-facts.sh (1 hunks)
  • src/nixos-anywhere.sh (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/get-facts.sh

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (2)
src/nixos-anywhere.sh (2)

721-729: Minor: error message clarity.

Message references $sshUser; ensure it’s always set before runKexec. Consider clarifying the error (“Could not obtain remote log file path”) since the file is created by mktemp during facts gathering.


762-775: Enable pipefail in the remote kexec script.

The download/decompress step uses a pipe; without pipefail, failures in the downloader might be masked.

Apply this diff:

-#!/usr/bin/env bash
-set -eu ${enableDebug}
+#!/usr/bin/env bash
+set -euo pipefail ${enableDebug}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af70cbe and 90d67ae.

📒 Files selected for processing (2)
  • src/get-facts.sh (1 hunks)
  • src/nixos-anywhere.sh (7 hunks)
🔇 Additional comments (9)
src/get-facts.sh (1)

23-24: Good addition; secure temp file creation.

Using mktemp in /tmp mitigates insecure tempfile risks and cleanly exposes both facts. Please confirm mktemp exists on all supported targets; if not, add a small fallback.

src/nixos-anywhere.sh (8)

61-62: Facts wiring looks correct.

remoteHomeDir/remoteLogFile are declared and later validated/imported. No issues.


475-484: Helpful SSH command echoing.

The debug echo is gated and avoids set -x noise by scoping set +x; looks good.


576-598: Facts import filter extended correctly.

Including remote* facts and re-toggling set -x around export is appropriate.


730-758: Unified result handling is solid; add best-effort cleanup.

The flow fetches logs on failure and cleans up. No functional blockers.


777-786: Remote logging via tee is appropriate.

Capturing output to a user-writable mktemp path avoids sudo/TTY issues; success sentinel check is reasonable.

Please confirm the sentinel string remains stable across kexec image versions.


834-841: Good: fall back to user’s home for non-root upload.

The check and explanatory comment are helpful.


848-850: Command templating is clear.

Tar command formed correctly for the uploaded artifact.


1027-1032: Nice UX: switch to root when kexec phase is skipped.

This keeps post-kexec assumptions consistent across flows.

@Qubasa Qubasa force-pushed the fix_password_prompt branch from 90d67ae to 68bd5a5 Compare September 26, 2025 14:44
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/nixos-anywhere.sh (3)

73-118: Breakpoint leaks vars; redact secrets and handle non‑TTY.

Dumping all vars can expose secrets; also explicit /dev/tty redirection fails when no TTY. Redact common secret names and gate TTY ops.

Apply this diff:

-    # Save all variables (local and exported) to a file
+    # Save variables; redact likely secrets
     (
       set -o posix
-      set
-    ) >"$debugTmpDir/debug_vars.sh"
+      set
+    ) | sed -E 's/(SSH_PRIVATE_KEY|SSHPASS|[^A-Za-z0-9_](PASSWORD|TOKEN)|AWS_SECRET_ACCESS_KEY|AWS_SESSION_TOKEN)=[^[:space:]]+/\1=[REDACTED]/g' >"$debugTmpDir/debug_vars.sh"

@@
-# Force output to terminal
-exec 2>/dev/tty
-exec 1>/dev/tty
-exec 0</dev/tty
+# Force output to terminal when available
+if [ -t 0 ] && [ -e /dev/tty ]; then
+  exec 2>/dev/tty
+  exec 1>/dev/tty
+  exec 0</dev/tty
+fi

570-604: Safer env export; avoid xargs splitting.

xargs can mis-handle values with spaces or special chars (future-proofing). Export line-by-line.

Apply this diff:

-  # make facts available in script
-  # shellcheck disable=SC2046
-  export $(echo "$filteredFacts" | xargs)
+  # make facts available in script (robust against spaces)
+  while IFS= read -r line; do
+    export "$line"
+  done <<<"$filteredFacts"

761-786: Minor: avoid bash‑only printf %q in heredoc if possible.

It’s fine here because the script is executed by bash. If you ever switch to /bin/sh, replace printf %q usage.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90d67ae and 68bd5a5.

📒 Files selected for processing (2)
  • src/get-facts.sh (1 hunks)
  • src/nixos-anywhere.sh (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: deploy
🔇 Additional comments (8)
src/nixos-anywhere.sh (8)

61-63: Facts initialization aligns with import/export flow.

remoteHomeDir/remoteLogFile placeholders are correctly added for importFacts to populate.


475-483: Debug SSH command echo is helpful.

The subshell disables tracing and conditionally prints the SSH invocation; looks good.


597-603: Remote log cleanup is correct.

EXIT trap with runSshNoTty is appropriate for removing the remote log file.


727-735: Good sanity check for remoteLogFile.

Clear error path if the fact is missing.


736-757: Centralized kexec result handling is solid.

Prints remote log when available and exits on failure; good structure.


1026-1031: Root fallback when skipping kexec makes sense.

Switching to root@host aligns with installer expectations.


833-859: Same stdin‑streaming fix here; also fix operation label.

  1. Use stdin streaming to avoid printf %q. 2) The operation label should be “Kexec”, not “Upload”.

Apply this diff:

-    local uploadExitCode
-    (
-      set +x
-      runSsh sh -c "$(printf '%q' "$remoteCommands")"
-    )
-    uploadExitCode=$?
-
-    handleKexecResult $uploadExitCode "Upload"
+    local uploadExitCode
+    (
+      set +x
+      printf '%s' "$remoteCommands" | runSsh sh -s --
+    )
+    uploadExitCode=$?
+
+    handleKexecResult $uploadExitCode "Kexec"

815-832: Avoid printf %q with sh -c; stream script via stdin.

Passing a large multi-line script via printf %q can introduce $'…' quoting that dash(/bin/sh) doesn’t parse. Stream to sh -s to avoid quoting pitfalls.

Apply this diff:

-    local sshExitCode
-    (
-      set +x
-      runSsh sh -c "$(printf '%q' "$remoteCommands")"
-    )
-    sshExitCode=$?
+    local sshExitCode
+    (
+      set +x
+      printf '%s' "$remoteCommands" | runSsh sh -s --
+    )
+    sshExitCode=$?


# Clean up the remote log file on exit
# shellcheck disable=SC2064
trap "runSshNoTty rm -f \"$remoteLogFile\"" EXIT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this not override our first trap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yeah I thought you could have multiple, my bad

(
set +x
"${localUploadCommand[@]}" | runSsh "cat > \"$remoteHomeDir\"/kexec-tarball.tar.gz"
Copy link
Member

@Mic92 Mic92 Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need a fact for this. Can we not always use ~/kexec-tarball.tar.gz? This is also more reliable regarding quoting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mic92 No because the ssh user and the script user can be different.

Copy link
Member

@Mic92 Mic92 Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But ~ would be expanded on the ssh remote and not locally. So this should always be correct.

Copy link
Contributor Author

@Qubasa Qubasa Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh yea you are right

hasCurl=$(has curl)
hasSetsid=$(has setsid)
hasNixOSFacter=$(command -v nixos-facter >/dev/null && echo "y" || echo "n")
remoteHomeDir=$HOME
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use ~ in all commands.

Suggested change
remoteHomeDir=$HOME

Copy link
Contributor Author

@Qubasa Qubasa Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Mic92 Not quite because the ssh user and the script user can be different.

src/get-facts.sh Outdated
hasSetsid=$(has setsid)
hasNixOSFacter=$(command -v nixos-facter >/dev/null && echo "y" || echo "n")
remoteHomeDir=$HOME
remoteLogFile=$(mktemp /tmp/nixos-anywhere-remote-log.XXXXXX)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can simplify this by also writing to ~/.nixos-anywhere.log than we also get rid of the trap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't work because we need that for the tar command substitution that gets executed as root

 local tarCommand="cat \"$remoteHomeDir\"/kexec-tarball.tar.gz | tar -xv ${tarDecomp}"
    local remoteCommands=${remoteCommandTemplate//'%TAR_COMMAND%'/$tarCommand}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/nixos-anywhere.sh (1)

73-118: Breakpoint dumps secrets; redact sensitive vars

chmod 700 + EXIT trap is good. Still, dumping all variables can leak secrets (SSH keys, tokens, passwords). Redact common secret names before writing debug_vars.sh.

Apply:

-    (
-      set -o posix
-      set
-    ) >"$debugTmpDir/debug_vars.sh"
+    (
+      set -o posix
+      set
+    ) | sed -E 's/(SSH_PRIVATE_KEY|SSHPASS|[^A-Za-z0-9_](PASSWORD|TOKEN)|AWS_SECRET_ACCESS_KEY|AWS_SESSION_TOKEN)=[^[:space:]]+/\1=[REDACTED]/g' >"$debugTmpDir/debug_vars.sh"

Also consider bailing if no TTY to avoid hanging:

[ -t 0 ] || { echo "No TTY; skipping breakpoint."; return; }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68bd5a5 and be55c9c.

📒 Files selected for processing (1)
  • src/nixos-anywhere.sh (7 hunks)
🔇 Additional comments (7)
src/nixos-anywhere.sh (7)

475-483: SSH debug header: LGTM

Subshell + set +x suppression and conditional pretty-printing are clean.


576-586: Facts filter: LGTM

The regex picks up has*/is*/remote* keys. Export via xargs matches current fact format.


727-733: Message references $sshUser; ensure it’s initialized here

runKexec relies on sshUser set earlier in main(). Confirm no code path calls runKexec before sshUser is computed.


813-820: Avoid printf %q and pass kexec flags; stream script via stdin

printf %q can emit Bash-only $'..' quoting that /bin/sh may not parse; streaming via stdin is more portable. Also fill %KEXEC_EXTRA_FLAGS%.

Apply:

-    local tarCommand
-    tarCommand="$(printf '%q ' "${remoteUploadCommand[@]}") | tar -xv ${tarDecomp}"
-    local remoteCommands
-    remoteCommands=${remoteCommandTemplate//'%TAR_COMMAND%'/$tarCommand}
+    local tarCommand
+    tarCommand="$(printf '%s' "${remoteUploadCommand[*]}") | tar -xv ${tarDecomp}"
+    local kexecFlagsSegment=""
+    if [[ -n "$kexecExtraFlags" ]]; then
+      # shellcheck disable=SC2086
+      kexecFlagsSegment="--kexec-extra-flags $(printf '%q ' $kexecExtraFlags)"
+    fi
+    local remoteCommands
+    remoteCommands=${remoteCommandTemplate//'%TAR_COMMAND%'/$tarCommand}
+    remoteCommands=${remoteCommands//'%KEXEC_EXTRA_FLAGS%'/$kexecFlagsSegment}
 
-    # Run the SSH command - for kexec with sudo, we expect it might disconnect
+    # Run the SSH command - for kexec with sudo, we expect it might disconnect
     local sshExitCode
     (
       set +x
-      runSsh sh -c "$(printf '%q' "$remoteCommands")"
+      printf '%s' "$remoteCommands" | runSsh sh -s --
     )
     sshExitCode=$?
 
     handleKexecResult $sshExitCode "Kexec"

Also applies to: 821-829


844-857: Same fixes for upload path; label should be “Kexec”

Pass kexec flags, stream via stdin, and fix the operation label.

Apply:

   # Use local command with pipe to remote
   local tarCommand="cat \"$remoteHomeDir\"/kexec-tarball.tar.gz | tar -xv ${tarDecomp}"
-  local remoteCommands=${remoteCommandTemplate//'%TAR_COMMAND%'/$tarCommand}
+  local remoteCommands=${remoteCommandTemplate//'%TAR_COMMAND%'/$tarCommand}
+  {
+    local kexecFlagsSegment=""
+    if [[ -n "$kexecExtraFlags" ]]; then
+      # shellcheck disable=SC2086
+      kexecFlagsSegment="--kexec-extra-flags $(printf '%q ' $kexecExtraFlags)"
+    fi
+    remoteCommands=${remoteCommands//'%KEXEC_EXTRA_FLAGS%'/$kexecFlagsSegment}
+  }
 
   # Execute the local upload command and check for success
   local uploadExitCode
   (
     set +x
-    runSsh sh -c "$(printf '%q' "$remoteCommands")"
+    printf '%s' "$remoteCommands" | runSsh sh -s --
   )
   uploadExitCode=$?
 
-  handleKexecResult $uploadExitCode "Upload"
+  handleKexecResult $uploadExitCode "Kexec"

1024-1029: Root user switch when skipping kexec: LGTM

Keeps subsequent phases consistent with installer expectations.


597-603: EXIT trap override drops tempDir cleanup

Setting a new EXIT trap here overrides the earlier trap (rm -rf "$tempDir"), leaking the temp dir. Chain the traps.

Apply:

-  # Clean up the remote log file on exit
-  # shellcheck disable=SC2064
-  trap "runSshNoTty rm -f \"$remoteLogFile\"" EXIT
+  # Clean up the remote log file on exit without overriding existing EXIT trap
+  # shellcheck disable=SC2064
+  __prev_exit_trap="$(trap -p EXIT | awk -F\"'\" '/EXIT/ { print $2 }')"
+  if [[ -n "${__prev_exit_trap}" ]]; then
+    trap "${__prev_exit_trap}; runSshNoTty rm -f \"$remoteLogFile\"" EXIT
+  else
+    trap "runSshNoTty rm -f \"$remoteLogFile\"" EXIT
+  fi

@Qubasa Qubasa force-pushed the fix_password_prompt branch from be55c9c to c8c26ea Compare September 26, 2025 15:32
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/nixos-anywhere.sh (2)

65-65: Avoid hanging on EXIT: use timeout for remote cleanup.

runSshNoTty in the EXIT trap can block if the host is down post-kexec. Use the timeout wrapper and ignore failures.

-trap 'rm -rf "$tempDir" && (runSshNoTty rm -f "$remoteLogFile" || true)' EXIT
+trap 'rm -rf "$tempDir" && (runSshTimeout -- rm -f "$remoteLogFile" || true)' EXIT

834-846: Don’t depend on remoteHomeDir; use ~ consistently.

~ expands on the remote side for the SSH user and avoids needing the fact. Simplify and drop the pre-check.

-    if [[ -z $remoteHomeDir ]]; then
-      abort "Could not determine home directory for user $sshUser"
-    fi
@@
-      "${localUploadCommand[@]}" | runSsh "cat > ~/kexec-tarball.tar.gz"
+      "${localUploadCommand[@]}" | runSsh "cat > ~/kexec-tarball.tar.gz"
@@
-    local tarCommand="cat \"$remoteHomeDir\"/kexec-tarball.tar.gz | tar -xv ${tarDecomp}"
+    local tarCommand="cat ~/kexec-tarball.tar.gz | tar -xv ${tarDecomp}"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be55c9c and c8c26ea.

📒 Files selected for processing (2)
  • src/get-facts.sh (1 hunks)
  • src/nixos-anywhere.sh (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/get-facts.sh
🔇 Additional comments (6)
src/nixos-anywhere.sh (6)

820-826: Stream script via stdin; avoid printf %q with sh -c.

printf %q can emit $'...' which dash (/bin/sh) won’t parse. Stream the script to sh -s.

-      runSsh sh -c "$(printf '%q' "$remoteCommands")"
+      printf '%s' "$remoteCommands" | runSsh sh -s --

As per past review comments.


848-853: Same stdin streaming recommendation here.

-      runSsh sh -c "$(printf '%q' "$remoteCommands")"
+      printf '%s' "$remoteCommands" | runSsh sh -s --

As per past review comments.


855-855: Fix operation label: this is kexec, not upload.

The handler label should reflect the kexec action.

-handleKexecResult $uploadExitCode "Upload"
+handleKexecResult $uploadExitCode "Kexec"

733-753: Treat SSH exit 255 as expected for kexec disconnects.

Kexec kills the SSH session; exit 255 should be considered success to avoid false negatives.

   handleKexecResult() {
     local exitCode=$1
     local operation=$2

     if [[ $exitCode -eq 0 ]]; then
       echo "$operation completed successfully" >&2
+    elif [[ $exitCode -eq 255 ]]; then
+      echo "$operation likely initiated; SSH disconnected (expected for kexec)" >&2
+      return 0
     else
       # If operation failed, try to fetch the log file
       local logContent=""
       if logContent=$(
         set +x
         runSsh "cat \"$remoteLogFile\" 2>/dev/null" 2>/dev/null
       ); then
         echo "Remote output log:" >&2
         echo "$logContent" >&2
       fi
       echo "$operation failed" >&2
       exit 1
     fi
   }

Based on past review comments.


570-601: Approve facts import; get-facts.sh emits remoteHomeDir and remoteLogFile


73-118: Breakpoint leaks secrets; tighten perms and redact sensitive vars.

Dumping all variables can expose SSH keys/passwords. Keep it, but redact common secret names.

-    debugTmpDir=$(mktemp -d /tmp/nixos-anywhere-debug.XXXXXX)
-    chmod 700 "$debugTmpDir" # Secure the directory
+    umask 077
+    debugTmpDir=$(mktemp -d /tmp/nixos-anywhere-debug.XXXXXX)
+    chmod 700 "$debugTmpDir" # Secure the directory
@@
-    (
-      set -o posix
-      set
-    ) >"$debugTmpDir/debug_vars.sh"
+    (
+      set -o posix
+      set
+    ) | sed -E 's/(SSH_PRIVATE_KEY|SSHPASS|[^A-Za-z0-9_](PASSWORD|TOKEN)|AWS_SECRET_ACCESS_KEY|AWS_SESSION_TOKEN)=[^[:space:]]+/\1=[REDACTED]/g' >"$debugTmpDir/debug_vars.sh"

@Qubasa Qubasa force-pushed the fix_password_prompt branch 2 times, most recently from 73d58e0 to 39643a0 Compare September 26, 2025 16:25
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8c26ea and 39643a0.

📒 Files selected for processing (2)
  • src/get-facts.sh (1 hunks)
  • src/nixos-anywhere.sh (7 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/get-facts.sh

Comment on lines 64 to 65
tempDir=$(mktemp -d)
trap 'rm -rf "$tempDir"' EXIT
trap 'rm -rf "$tempDir" && (runSshNoTty -o ConnectTimeout=10 rm -f "$remoteLogFile" || true)' EXIT
mkdir -p "$tempDir"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix remote log cleanup invocation.

runSshNoTty injects the host before the variadic args, so the new trap now sends -o ConnectTimeout=10 after the host, turning it into the remote command (-o) instead of an SSH option. The cleanup therefore fails with “-o: command not found”, the log file is never removed, and on early exits (e.g. --help) we emit confusing SSH errors because sshConnection is still empty. Please guard the cleanup and use runSshTimeout (or inline timeout) so the option is applied correctly; the same adjustment is required for the explicit cleanup further down.

-trap 'rm -rf "$tempDir" && (runSshNoTty  -o ConnectTimeout=10 rm -f "$remoteLogFile" || true)' EXIT
+trap "
+  rm -rf \"$tempDir\"
+  if [[ -n \${sshConnection:-} && -n \${remoteLogFile:-} ]]; then
+    runSshTimeout -- rm -f \"\$remoteLogFile\" || true
+  fi
+" EXIT-runSshNoTty -o ConnectTimeout=10 rm -f "$remoteLogFile"
+if [[ -n ${remoteLogFile:-} ]]; then
+  runSshTimeout -- rm -f "$remoteLogFile" || true
+fi

Also applies to: 1057-1059

🤖 Prompt for AI Agents
In src/nixos-anywhere.sh around lines 64-66 (also apply same changes at
1057-1059): the trap invokes runSshNoTty with SSH options after the host because
that helper inserts the host before variadic args, causing the option to be
treated as a remote command and failing; update the trap and the explicit
cleanup to first check that sshConnection/ssh host is non-empty before
attempting remote removal, and call runSshTimeout (or wrap runSshNoTty with a
timeout command) so the ConnectTimeout/-o option is passed as an SSH option
(i.e. applied on the client side), not sent to the remote shell; ensure the
cleanup commands use the same guarded pattern and fall back to no-op if no host
is set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not true I just checked manually the remoteLogFile get's cleaned up correctly if I inject an exit 1 right after importFacts

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
tests/from-nixos-with-sudo.nix (4)

9-14: Hardcoded test password will appear in logs

Acceptable in tests, but note it’s injected into the Python script and sent over the interactive channel, which many CI setups log verbatim. If this repo’s CI publishes build logs, consider using a clearly fake value and add a short comment that it’s non-sensitive test data to avoid future confusion.


49-53: Harden post-run SSH assertions with retries and timeouts

After kexec, SSH readiness can lag. Add a short wait loop and SSH timeouts to avoid flaky failures.

@@
-      output = installer.succeed("""echo "disk-1.key: '$(ssh -i /root/.ssh/install_key -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@installed cat /tmp/disk-1.key)'" """)
+      installer.wait_until_succeeds("ssh -i /root/.ssh/install_key -o ConnectTimeout=5 -o ConnectionAttempts=20 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@installed true")
+      output = installer.succeed("""echo "disk-1.key: '$(ssh -i /root/.ssh/install_key -o ConnectTimeout=5 -o ConnectionAttempts=5 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@installed cat /tmp/disk-1.key)'" """)
@@
-      output = installer.succeed("""echo "disk-2.key: '$(ssh -i /root/.ssh/install_key -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@installed cat /tmp/disk-2.key)'" """)
+      output = installer.succeed("""echo "disk-2.key: '$(ssh -i /root/.ssh/install_key -o ConnectTimeout=5 -o ConnectionAttempts=5 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no root@installed cat /tmp/disk-2.key)'" """)

29-39: Minor: drop >&2 if not needed

Redirecting everything to stderr inside the test is fine, but if the driver captures only stdout for succeed()’s return, this can complicate debugging. If not strictly required, consider removing >&2.


28-41: Avoid using process substitution for disk-encryption-keys
The CLI’s --disk-encryption-keys flag takes <remote_path> <local_path> pairs and your Python harness (installer.succeed(..., shell=True)) will invoke /bin/sh, which doesn’t support Bash’s <(…). Pre-create the second key file and pass it as a path:

@@ tests/from-nixos-with-sudo.nix
-      def run_anywhere():
+      # Prepare second key without process substitution
+      installer.succeed("echo another-secret > /tmp/disk-2.key")
+
+      def run_anywhere():
         installer.succeed("""
           nixos-anywhere \
             -i /root/.ssh/install_key \
             --debug \
             --kexec /etc/nixos-anywhere/kexec-installer.tar.gz \
             --phases kexec,disko \
-            --disk-encryption-keys /tmp/disk-1.key /tmp/disk-1.key \
-            --disk-encryption-keys /tmp/disk-2.key <(echo another-secret) \
+            --disk-encryption-keys /tmp/disk-1.key /tmp/disk-1.key \
+            --disk-encryption-keys /tmp/disk-2.key /tmp/disk-2.key \
             --store-paths /etc/nixos-anywhere/disko /etc/nixos-anywhere/system-to-install \
             nixos@installed >&2;
         """)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 39643a0 and 72a7384.

📒 Files selected for processing (1)
  • tests/from-nixos-with-sudo.nix (1 hunks)
🔇 Additional comments (2)
tests/from-nixos-with-sudo.nix (2)

15-17: Requiring a sudo password is correct for the regression being tested

This aligns the VM with the reported failure mode. LGTM.


19-24: Function-style testScript and early start_all() look good

The refactor makes room for concurrency and keeps node attrs accessible. LGTM.

Comment on lines 41 to 47
thread = threading.Thread(target=run_anywhere)
thread.start()
assert installer.shell is not None
installer.shell.send(b"${nodes.installed.users.users.nixos.password}\n")
thread.join()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Race risk and potential indefinite hang when feeding the sudo prompt

Sending the password immediately after starting the thread can race the actual prompt, and thread.join() without a timeout can hang the test indefinitely if anything goes wrong. Add a bounded join and assert completion; optionally wait for the prompt before sending.

@@
-      thread = threading.Thread(target=run_anywhere)
+      thread = threading.Thread(target=run_anywhere, daemon=True)
       thread.start()
 
-      assert installer.shell is not None
-      installer.shell.send(b"${nodes.installed.users.users.nixos.password}\n")
+      assert installer.shell is not None
+      # Optionally: wait for the sudo prompt if the driver exposes a matcher.
+      # installer.wait_until_tty_matches(r".*\[sudo\] password.*:", timeout=120)
+      installer.shell.send(b"${nodes.installed.users.users.nixos.password}\n")
 
-      thread.join()
+      thread.join(timeout=600)
+      assert not thread.is_alive(), "nixos-anywhere did not finish within 10 minutes"

If the driver provides a helper (e.g., wait_until_tty_matches or similar), prefer it to synchronize on the sudo prompt before send(...). Otherwise, the bounded join still prevents deadlocks.


🏁 Script executed:

#!/bin/bash
# Inspect the first 200 lines of the test file
sed -n '1,200p' tests/from-nixos-with-sudo.nix

# Search for any prompt-synchronization helpers
rg -n "wait_until_tty_matches" -g "*"
rg -n "wait_until_prompt" -g "*"

Length of output: 2024


🏁 Script executed:

#!/bin/bash
# Inspect test base file for synchronization helpers
sed -n '1,200p' tests/lib/test-base.nix
echo '----'
rg -n wait tests/lib/test-base.nix || true
rg -n prompt tests/lib/test-base.nix || true

Length of output: 476


Add timeout to thread.join() and assert completion
Sending the password immediately after starting the thread can race the sudo prompt, and thread.join() without a timeout can hang the test indefinitely. Update to:

- thread.join()
+ thread.join(timeout=600)
+ assert not thread.is_alive(), "nixos-anywhere did not finish within 10 minutes"

Optionally, if the test driver exposes a helper to wait for the sudo prompt (e.g. installer.wait_for_prompt()), invoke it before installer.shell.send() to avoid the race.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
thread = threading.Thread(target=run_anywhere)
thread.start()
assert installer.shell is not None
installer.shell.send(b"${nodes.installed.users.users.nixos.password}\n")
thread.join()
thread = threading.Thread(target=run_anywhere)
thread.start()
assert installer.shell is not None
installer.shell.send(b"${nodes.installed.users.users.nixos.password}\n")
thread.join(timeout=600)
assert not thread.is_alive(), "nixos-anywhere did not finish within 10 minutes"
🤖 Prompt for AI Agents
In tests/from-nixos-with-sudo.nix around lines 41 to 47, the test currently
races sending the sudo password and uses thread.join() without a timeout which
can hang; update the test to wait for the sudo prompt (e.g. call
installer.wait_for_prompt() or similar helper) before calling
installer.shell.send(...), add a timeout to thread.join(timeout=...) and assert
the thread finished (e.g. assert thread.is_alive() is False or check a
completion flag) so the test cannot block indefinitely and avoids the race with
the sudo prompt.

Mic92 and others added 3 commits September 29, 2025 17:52
Adds an interactive debugging function that allows developers to drop
into a shell with all variables available for inspection.

Co-authored-by: Qubasa <[email protected]>
Adds colored debug output showing the full SSH command being executed
when debug mode is enabled.

Co-authored-by: Qubasa <[email protected]>
- Use fixed location ~/.nixos-anywhere.log instead of mktemp
- Compute log file path from remoteHomeDir
- Add proper debug output handling in importFacts

Co-authored-by: Qubasa <[email protected]>
Mic92 and others added 2 commits September 29, 2025 17:54
- Create a temporary script file that can be executed with sudo
- Add proper error handling and log retrieval on failure
- Support uploading kexec tarball to user's home directory when not root
- Use tee to capture output to log file while showing it to user

Co-authored-by: Qubasa <[email protected]>
When executing nixos-anywhere --phases disko myuser@<ip> for example we
must assume that kexec has already been run and that the target user
changed to root now. This will force nixos-anywhere to log in as root
user if the kexec phase is not present.

Co-authored-by: Qubasa <[email protected]>
@Mic92 Mic92 force-pushed the fix_password_prompt branch from 72a7384 to 197343e Compare September 29, 2025 15:55
@Mic92 Mic92 added this pull request to the merge queue Sep 29, 2025
Merged via the queue into nix-community:main with commit 22ce6cb Sep 29, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants