Skip to content

Conversation

bobsira
Copy link

@bobsira bobsira commented Aug 26, 2025

trykill

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 26, 2025
@k8s-ci-robot k8s-ci-robot requested review from nirs and prezha August 26, 2025 12:56
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bobsira
Once this PR has been reviewed and has the lgtm label, please assign medyagh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 26, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @bobsira. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 26, 2025
@minikube-bot
Copy link
Collaborator

Can one of the admins verify this patch?

@bobsira
Copy link
Author

bobsira commented Aug 26, 2025

creating an issue to track all the windows failures here -> #21427

@bobsira bobsira changed the title test: make chown tests cross-platform and relax TryKill test on Windows Fix for TestTryKillOne unit test failure on windows Aug 27, 2025
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 27, 2025
@medyagh
Copy link
Member

medyagh commented Aug 28, 2025

@bobsira plz see nirs comments and also rebase

@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 28, 2025
@bobsira bobsira changed the title Fix for TestTryKillOne unit test failure on windows Fix unit test failures for TestTryKillOne Sep 2, 2025
Copy link
Contributor

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squashing the commits will make this easier to review.

@@ -73,6 +73,20 @@ func Exists(pid int, executable string) (bool, error) {
return entry.Executable() == executable, nil
}

// ExistsPID reports whether a process with the given pid exists.
// This is a PID-only check (no executable name matching).
func ExistsPID(pid int) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we call this? if we have a child process we know that the pid exist and we don't need to look it up since the pid cannot be reused until we wait() for the child.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this PIDExist() - since it is a little bit confusing with Exists() which is about a process, using a pidfile. We already have a function pidExists() so this function can simply expose it, or we can renamse pidExists() to PIDExists() to make it useful outside of the package.

t.Fatalf("error checking process existence for pid %d: %v", pid, err)
}
if exists {
t.Fatalf("process %d still exists after trySigKillProcess; waitErr=%v", pid, waitErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sending SIGKILL (or any other signal is asynchronous. It can return before the process was killed.

So checking if the pid exist in the processes table is racy and incorrect. The way to check the process status and detect if it was killed is to wait() for the process:
https://pkg.go.dev/os#Process.Wait

and get the system dependent exist status:
https://pkg.go.dev/os#ProcessState.Sys

On posix we can get check if the process was terminated by signal and get the signal from
https://pkg.go.dev/syscall#WaitStatus

In process_test.go we test Kill() little differently:

  1. Kill the process (async)
  2. We start a timer to fail test after a short timeout
  3. Wait of the process
  4. Check that Exists() return false

We don't verify the exit status since our test process can be configured to block SIGTERM, and we know that the only reason for termination can be SIGKILL.

If we need ugly platform specific code to check the exit status we can put it in the process package to make this easy to use everywhere. The package already have build tags to make it easy to do the right thing on windows and other platforms.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current change,

  1. I've kill the process with trySigKillProcess(PID)
  2. Then I've created a channel and spawn goroutine to Wait()
  3. I then wait for either process exit or timeout
  4. Finally I check process table for PID presence.

I've avoided adding any platform-dependent code checks for exit status to make a short and clean change.

any thoughts, @nirs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on for Wait() with t timeout looks correct and better than the way we do this in process_test.go, since we know we will stop waiting on timeout. I'm not sure the wait is intrrupted in current code in process_test.go.

The check for pid exists seems unneeded - if the the wait completed we know the process terminated so there is nothing to check. And if there is nothing to check there is no need to add the process.ExistPID().

I agree that we don't have to verify that the process was terminated by signal, but this seems like a useful utility to have.

- Add exported ExistsPID(pid int) to pkg/minikube/process
- Update cmd/minikube/cmd/delete_test.go to wait & use ExistsPID
- Remove runtime.GOOS branching and brittle Wait() string checks
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 3, 2025
t.Fatalf("error checking process existence for pid %d: %v", pid, err)
}
if exists {
t.Fatalf("process %d still exists after trySigKillProcess; waitErr=%v", pid, waitErr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting on for Wait() with t timeout looks correct and better than the way we do this in process_test.go, since we know we will stop waiting on timeout. I'm not sure the wait is intrrupted in current code in process_test.go.

The check for pid exists seems unneeded - if the the wait completed we know the process terminated so there is nothing to check. And if there is nothing to check there is no need to add the process.ExistPID().

I agree that we don't have to verify that the process was terminated by signal, but this seems like a useful utility to have.

func ExistsPID(pid int) (bool, error) {
if pid <= 0 {
return false, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check seems unneeded - why would you call with invalid pid?

if err != nil {
return true, err
}
return entry != nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect - we have pidExist() helper that does the right thing on windows and posix.

@@ -73,6 +73,20 @@ func Exists(pid int, executable string) (bool, error) {
return entry.Executable() == executable, nil
}

// ExistsPID reports whether a process with the given pid exists.
// This is a PID-only check (no executable name matching).
func ExistsPID(pid int) (bool, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this PIDExist() - since it is a little bit confusing with Exists() which is about a process, using a pidfile. We already have a function pidExists() so this function can simply expose it, or we can renamse pidExists() to PIDExists() to make it useful outside of the package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants