GKE dns endpoint enabled, ERROR mirrord_agent::entrypoint: start_agent -> Failed to accept first connection: timeout #3509

@lapcchan

Bug Description

The mirrord agent logs ERROR mirrord_agent::entrypoint: start_agent -> Failed to accept first connection: timeout
and then exits with an error. I set agent.ttl so that the agent pod is kept after it exits and its logs can be inspected.
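
For readers unfamiliar with the message: it is the kind of error a server reports when it opens a listener and no client connects before a deadline. The sketch below reproduces that general pattern in Python. It is not mirrord's code (the agent is written in Rust), and the names accept_first_connection, timeout_secs and demo are made up for illustration.

```python
import socket


def accept_first_connection(listener: socket.socket, timeout_secs: float) -> socket.socket:
    """Wait up to timeout_secs for the first client; raise TimeoutError otherwise."""
    listener.settimeout(timeout_secs)
    try:
        conn, _peer = listener.accept()
    except socket.timeout:  # alias of TimeoutError on Python 3.10+
        raise TimeoutError("Failed to accept first connection: timeout") from None
    return conn


def demo(timeout_secs: float = 0.2) -> str:
    """Listen on an ephemeral localhost port that no client ever connects to."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        try:
            accept_first_connection(listener, timeout_secs)
        except TimeoutError as err:
            return str(err)
    return "connected"
```

In this pattern the listener itself is healthy. The error only says that nothing reached it in time, so the cause has to be looked for on the path between client and listener.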

Steps to Reproduce

With the following config.json:

{
    "target": "targetless",
    "accept_invalid_certificates": true,
    "use_proxy": false,
    "agent": {
        "ttl": 300,
        "log_level": "mirrord=trace,debug,warn"
    }
}

Run mirrord against our GKE cluster with:

mirrord exec -f ./config.json bash

Backtrace

mirrord layer logs

mirrord intproxy logs

mirrord agent logs

2025-08-14T19:48:12.499110Z DEBUG mirrord_agent::entrypoint: main -> Initializing mirrord-agent, version 3.157.2.
    at mirrord/agent/src/entrypoint.rs:1032 on ThreadId(1)

  2025-08-14T19:48:12.499300Z TRACE mirrord_agent::entrypoint: new
    at mirrord/agent/src/entrypoint.rs:633 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }

  2025-08-14T19:48:12.499345Z TRACE mirrord_agent::entrypoint: start_agent -> Starting agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }
    at mirrord/agent/src/entrypoint.rs:635 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }

  2025-08-14T19:48:12.499395Z DEBUG mirrord_agent::entrypoint: Created the client listener., address: 0.0.0.0:35331
    at mirrord/agent/src/entrypoint.rs:662 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }

agent ready - version 3.157.2
  2025-08-14T19:49:12.500691Z ERROR mirrord_agent::entrypoint: start_agent -> Failed to accept first connection: timeout
    at mirrord/agent/src/entrypoint.rs:794 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }

  2025-08-14T19:49:12.500764Z ERROR mirrord_agent::entrypoint: error: Timeout on accepting first client connection
    at mirrord/agent/src/entrypoint.rs:633 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }

  2025-08-14T19:49:12.500789Z TRACE mirrord_agent::entrypoint: close, time.busy: 423µs, time.idle: 60.0s
    at mirrord/agent/src/entrypoint.rs:633 on ThreadId(1)
    in mirrord_agent::entrypoint::start_agent with args: Args { mode: Targetless, communicate_port: 35331, communication_timeout: 60, network_interface: None, metrics: None, test_error: false, operator_tls_cert_pem: None, is_mesh: false, ipv6: false }
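
A quick check of the timestamps in the agent log above, as a small Python sketch. The two timestamps and the 60 second communication_timeout are copied from the log; the helper names parse_ts and elapsed_secs are illustrative only.

```python
from datetime import datetime

# Timestamps copied from the agent log above.
LISTENER_CREATED = "2025-08-14T19:48:12.499395Z"
ACCEPT_FAILED = "2025-08-14T19:49:12.500691Z"
COMMUNICATION_TIMEOUT_SECS = 60  # communication_timeout in the logged Args


def parse_ts(ts: str) -> datetime:
    # The trailing 'Z' is matched as a literal; %f takes the 6 fractional digits.
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def elapsed_secs(start: str, end: str) -> float:
    return (parse_ts(end) - parse_ts(start)).total_seconds()


waited = elapsed_secs(LISTENER_CREATED, ACCEPT_FAILED)
print(f"waited {waited:.3f}s, configured timeout {COMMUNICATION_TIMEOUT_SECS}s")
```

The gap is almost exactly the configured timeout, which matches the time.idle: 60.0s in the final trace line. That suggests the agent started normally, listened on port 35331 for the full window, and never received a connection from the client side.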

mirrord config

{
    "target": "targetless",
    "accept_invalid_certificates": true,
    "use_proxy": false,
    "agent": {
        "ttl": 300,
        "log_level": "mirrord=trace,debug,warn"
    }
}

mirrord CLI version

3.157.2

mirrord-agent version

3.157.2

mirrord-operator version (if relevant)

No response

plugin kind and version (if relevant)

No response

Your operating system and version

macOS 15.6

Local process

bash

Local process version

No response

Additional Info

This can be reproduced on GKE with the DNS endpoint enabled.
The kubeconfig was updated using Google Cloud SDK 533.0.0:

gcloud container clusters get-credentials clustername --dns-endpoint
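
To confirm which kind of endpoint a kubeconfig entry actually points at, the cluster's server URL can be classified as an IP literal or a DNS name. The Python sketch below does only that classification. The function name endpoint_kind is made up here, and reading the URL out of the kubeconfig is left out (kubectl config view --minify prints the active cluster entry, including its server field).

```python
import ipaddress
from urllib.parse import urlsplit


def endpoint_kind(server_url: str) -> str:
    """Return 'ip' if the API server URL uses an IP literal, else 'dns'."""
    host = urlsplit(server_url).hostname
    if host is None:
        raise ValueError(f"no host in {server_url!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as IPv4/IPv6, so treat it as a DNS name.
        return "dns"
    return "ip"
```

With an entry written by --dns-endpoint the result should be 'dns'; an entry that uses the cluster's IP endpoint gives 'ip'. Comparing mirrord's behavior against both kinds of entry for the same cluster would narrow down whether the DNS endpoint is the trigger.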

mirrord output

mirrord diagnose latency -f ./config.json
  x mirrord exec
    ✓ running on latest (3.157.2)!
    x preparing to launch process
      ✓ layer extracted
      ✓ operator not found
      ✓ agent pod dev/mirrord-agent-vbgsydydqw-vn2l4 created
      ✓ pod is ready
Error:   × Failed to communicate with the agent: timeout
