
Add cluster connections list #28157

Merged
r-vasquez merged 1 commit into redpanda-data:dev from graham-rp:UX-358/add-list-connections on Oct 28, 2025

@graham-rp
Contributor

@graham-rp graham-rp commented Oct 22, 2025

This adds a basic `cluster connections list` command that uses the adminv2 API to list open and recently closed client connections inside the cluster.

Example output

rpk cluster connection list --help

Display statistics about current kafka connections.

This command displays a table of active and recently closed connections within the cluster.

Use filtering and sorting to identify the connections of the client applications that you are interested in. See --help for the list of filtering arguments and sorting arguments.

In addition to the filtering shorthand cli arguments (e.g.; --client-id, --state), you can also use the --filter-raw and --order-by arguments that take a string expression for filtering. To understand the syntax of these arguments, refer to the admin API docs of the filter and order-by fields of the ListKafkaConnections endpoint: https://docs.redpanda.com/api/doc/admin/v2/operation/operation-redpanda-core-admin-v2-clusterservice-listkafkaconnections

By default only a subset of the per-connection data is printed. To see all of the available data, use --format=json.

Usage:
  rpk cluster connections list [flags]

Examples:

List connections ordered by their recent produce throughput:
	rpk cluster connections list --order-by="recent_request_statistics.produce_bytes desc"

List connections ordered by their recent fetch throughput:
	rpk cluster connections list --order-by="recent_request_statistics.fetch_bytes desc"

List connections ordered by the time that they've been idle:
	rpk cluster connections list --order-by="idle_duration desc"

List connections ordered by those that have made the least requests:
	rpk cluster connections list --order-by="total_request_statistics.request_count asc"

List extended output for open connections in json format:
	rpk cluster connections list --format=json --state="OPEN"

Flags:
      --client-id string                 Filter results by the client ID
      --client-software-name string      Filter results by the client software name
      --client-software-version string   Filter results by the client software version
      --filter-raw string                Filter connections based on a raw query (overrides other filters)
      --format string                    Output format (json,yaml,text,wide,help) (default "text")
  -g, --group-id string                  Filter by client group ID
  -h, --help                             Help for list
  -i, --idle-ms int                      Show connections idle for more than i milliseconds
      --ip-address string                Filter results by the client ip address
      --limit int32                      Limit how many records can be returned (default 20)
      --order-by string                  Order the results by their values. See Examples above
  -s, --state string                     Filter results by state (OPEN, CLOSED)
  -u, --user string                      Filter results by a specific user principal

Global Flags:
      --config string            Redpanda or rpk config file; default search paths are "/Users/graham.smith/Library/Application Support/rpk/rpk.yaml", $PWD/redpanda.yaml, and /etc/redpanda/redpanda.yaml
  -X, --config-opt stringArray   Override rpk configuration settings; '-X help' for detail or '-X list' for terser detail
      --profile string           rpk profile to use
  -v, --verbose                  Enable verbose logging
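
The help text above says the shorthand filter flags and --filter-raw both end up as a filter expression, with --filter-raw overriding the shorthands. A minimal Go sketch of that precedence follows. The `field = "value"` clauses joined by `AND`, the field names `client_id` and `state`, and the helper name `buildFilter` are all assumptions for illustration; the real syntax is whatever the ListKafkaConnections docs linked above specify.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildFilter combines shorthand filter values into one expression.
// A non-empty raw expression wins outright, mirroring the help text
// "--filter-raw ... (overrides other filters)".
// ASSUMPTION: the `field = "value"` clause form, the AND joiner, and the
// field names passed by callers are hypothetical, not the confirmed syntax.
func buildFilter(raw string, shorthands map[string]string) string {
	if raw != "" {
		return raw
	}
	fields := make([]string, 0, len(shorthands))
	for field, value := range shorthands {
		if value != "" { // unset flags contribute nothing
			fields = append(fields, field)
		}
	}
	sort.Strings(fields) // map iteration is random; keep output stable
	clauses := make([]string, 0, len(fields))
	for _, field := range fields {
		clauses = append(clauses, fmt.Sprintf("%s = %q", field, shorthands[field]))
	}
	return strings.Join(clauses, " AND ")
}

func main() {
	fmt.Println(buildFilter("", map[string]string{"client_id": "rpk", "state": "OPEN"}))
}
```

Sorting the field names only keeps the generated expression deterministic, which makes it easier to assert on in tests.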

rpk cluster connection list

UID                                   STATE   USER             CLIENT-ID         IP:PORT           NODE  SHARD  OPEN-TIME  IDLE           PROD-TPUT/SEC  FETCH-TPUT/SEC  REQS/MIN
dcb11d6f-2760-489c-b0da-7159a320ba2c  OPEN    UNAUTHENTICATED  rpk               172.24.1.1:59830  2     0      1h32m4s    0s             0B             0B              12
48987ed9-537d-48b7-897f-2b0506f5a42b  OPEN    UNAUTHENTICATED  rpk               172.24.1.1:59834  2     0      1h32m3s    0s             0B             0B              12
ddb5f731-4a77-4c92-821f-eea41338076c  CLOSED  UNAUTHENTICATED  redpanda-console  172.24.1.5:43416  2     0      40s        39.996088954s  0B             0B              1
008ce0bd-59e0-47c6-8f3c-e3ceb8313ed8  CLOSED  UNAUTHENTICATED  redpanda-console  172.24.1.5:60592  2     0      40s        39.997238898s  0B             0B              1
3e243f78-b9c8-412e-b228-fd02ebc297a5  CLOSED  UNAUTHENTICATED  rpk               172.24.1.1:38256  2     0      40s        39.998215761s  0B             0B              1
b0796b09-d8d6-4953-ab71-96b6eceea8e1  CLOSED  UNAUTHENTICATED  rpk               172.24.1.1:39822  2     0      40s        40.001789052s  0B             0B              1
f8d86923-b54a-45b2-a918-10718e68117c  CLOSED  UNAUTHENTICATED  redpanda-console  172.24.1.5:43376  2     0      40s        40.00347736s   0B             0B              1
6ad9e31a-4ffb-4912-b43b-c302e04659d8  CLOSED  UNAUTHENTICATED  rpk               172.24.1.1:38246  2     1      40s        39.997179883s  0B             0B              1
8f015563-54e6-4088-84a3-de6e719a4e35  CLOSED  UNAUTHENTICATED  rpk               172.24.1.1:39936  2     1      40s        39.998671695s  0B             0B              1
...

rpk cluster connection list --client-id=rpk --order-by "state, open_time asc"

UID                                   STATE   USER             CLIENT-ID  IP:PORT           NODE  SHARD  OPEN-TIME  IDLE           PROD-TPUT/SEC  FETCH-TPUT/SEC  REQS/MIN
5d801ca6-e7b4-4ef7-98f6-0d1b40e29e07  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:55278  0     0      1h33m13s   462.798031ms   0B             0B              33
2510ffad-f4f2-49db-bf9e-7ef3295edd45  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:38488  1     1      1h33m9s    0s             0B             0B              11
23d5a95e-34c6-49e9-a17d-3e8f1233050c  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:55318  0     1      1h33m9s    0s             0B             163B            59
dcb11d6f-2760-489c-b0da-7159a320ba2c  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:59830  2     0      1h33m9s    0s             0B             0B              11
69712100-b6c8-465c-9c74-a1f651e0ada9  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:55332  0     1      1h33m9s    462.619368ms   0B             0B              21
48987ed9-537d-48b7-897f-2b0506f5a42b  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:59834  2     0      1h33m9s    0s             0B             0B              12
37000ff9-da71-4aac-b7ce-1c9fef4e1f0e  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:38496  1     0      1h33m9s    0s             0B             0B              12
7ebab7f3-e731-4560-bff6-b08b99efc59b  OPEN    UNAUTHENTICATED  rpk        172.24.1.1:55334  0     0      1h33m9s    0s             0B             0B              13
...
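
For readers unfamiliar with how aligned output like the tables above is usually produced in Go, here is a rough sketch using the standard library's text/tabwriter. It is not rpk's actual table helper; the `connRow` type and `renderTable` function are hypothetical, and only a subset of the columns is shown.

```go
package main

import (
	"fmt"
	"strings"
	"text/tabwriter"
)

// connRow is a hypothetical, trimmed-down view of one connection.
type connRow struct {
	UID, State, User, ClientID, Addr string
	Node, Shard                      int
}

// renderTable writes a header plus one line per row, letting tabwriter
// pad each tab-terminated cell so the columns line up.
func renderTable(rows []connRow) string {
	var sb strings.Builder
	// minwidth 0, tabwidth 8, padding 2, pad with spaces, no flags.
	tw := tabwriter.NewWriter(&sb, 0, 8, 2, ' ', 0)
	fmt.Fprintln(tw, "UID\tSTATE\tUSER\tCLIENT-ID\tIP:PORT\tNODE\tSHARD")
	for _, r := range rows {
		fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%s\t%d\t%d\n",
			r.UID, r.State, r.User, r.ClientID, r.Addr, r.Node, r.Shard)
	}
	tw.Flush() // writes to a strings.Builder cannot fail, so the error is ignored
	return sb.String()
}

func main() {
	fmt.Print(renderTable([]connRow{
		{"dcb11d6f-2760-489c-b0da-7159a320ba2c", "OPEN", "UNAUTHENTICATED", "rpk", "172.24.1.1:59830", 2, 0},
		{"ddb5f731-4a77-4c92-821f-eea41338076c", "CLOSED", "UNAUTHENTICATED", "redpanda-console", "172.24.1.5:43416", 2, 0},
	}))
}
```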

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.2.x
  • v25.1.x
  • v24.3.x

Release Notes

Features

  • Added a beta cluster connections list command

@vbotbuildovich
Collaborator

vbotbuildovich commented Oct 22, 2025

Retry command for Build#74729

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/quick_terminate_test.py::QuickTerminateTest.test_terminate
tests/rptest/tests/shadow_linking_rnot_test.py::ShadowLinkingRandomOpsTest.test_node_operations@{"failures":true}

@vbotbuildovich
Collaborator

vbotbuildovich commented Oct 22, 2025

CI test results

test results on build#74729

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
|---|---|---|---|---|---|---|---|---|
| DataMigrationsApiTest | test_higher_level_migration_api | null | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0da3-81f7-436c-aefe-d6f5392acbed | FLAKY | 19/21 | upstream reliability is '99.61612284069098'. current run reliability is '90.47619047619048'. drift is 9.13993 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=DataMigrationsApiTest&test_method=test_higher_level_migration_api |
| QuickTerminateTest | test_terminate | null | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0d99-39f4-4f38-a153-d0e244dc50a3 | FAIL | 0/1 | | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=QuickTerminateTest&test_method=test_terminate |
| ShadowLinkingRandomOpsTest | test_node_operations | {"failures": false} | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0d99-39f6-42ef-8dbc-c718a456919f | FLAKY | 5/21 | upstream reliability is '37.22222222222222'. current run reliability is '23.809523809523807'. drift is 13.4127 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingRandomOpsTest&test_method=test_node_operations |
| ShadowLinkingRandomOpsTest | test_node_operations | {"failures": false} | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0da3-8201-46e0-b21f-c82ecc29e545 | FLAKY | 4/21 | upstream reliability is '37.22222222222222'. current run reliability is '19.047619047619047'. drift is 18.1746 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingRandomOpsTest&test_method=test_node_operations |
| ShadowLinkingRandomOpsTest | test_node_operations | {"failures": true} | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0d99-39ed-41b6-a791-4461d4dfb2ef | FLAKY | 1/21 | upstream reliability is '33.82352941176471'. current run reliability is '4.761904761904762'. drift is 29.06162 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingRandomOpsTest&test_method=test_node_operations |
| ShadowLinkingRandomOpsTest | test_node_operations | {"failures": true} | integration | https://buildkite.com/redpanda/redpanda/builds/74729#019a0da3-81f0-423e-933c-af444ac0fc3c | FAIL | 0/1 | | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=ShadowLinkingRandomOpsTest&test_method=test_node_operations |

test results on build#75073

| test_class | test_method | test_arguments | test_kind | job_url | test_status | passed | reason | test_history |
|---|---|---|---|---|---|---|---|---|
| LogCompactionTxRemovalTest | test_tx_control_batch_removal | null | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27f7-d024-4b84-854d-1b5ab65538f3 | FLAKY | 15/21 | upstream reliability is '92.5925925925926'. current run reliability is '71.42857142857143'. drift is 21.16402 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=LogCompactionTxRemovalTest&test_method=test_tx_control_batch_removal |
| LogCompactionTxRemovalUpgradeTest | test_tx_control_batch_removal_with_upgrade | {"test_case_name": "All aborts"} | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27f7-d026-4519-bfb6-ef527cdfe452 | FLAKY | 17/21 | upstream reliability is '90.0'. current run reliability is '80.95238095238095'. drift is 9.04762 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=LogCompactionTxRemovalUpgradeTest&test_method=test_tx_control_batch_removal_with_upgrade |
| NodesDecommissioningTest | test_multiple_decommissions | {"cloud_topic": true} | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27ef-3df8-47aa-a45b-c979fafa4bb3 | FAIL | 0/1 | | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=NodesDecommissioningTest&test_method=test_multiple_decommissions |
| PartitionMovementTest | test_empty | {"num_to_upgrade": 0} | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27ef-3df4-46e6-9b21-83966aa33cb3 | FLAKY | 17/21 | upstream reliability is '90.63625450180072'. current run reliability is '80.95238095238095'. drift is 9.68387 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=PartitionMovementTest&test_method=test_empty |
| PartitionMovementTest | test_empty | {"num_to_upgrade": 0} | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27f7-d021-40a5-bcb3-e14a1daecd6a | FLAKY | 18/21 | upstream reliability is '90.63625450180072'. current run reliability is '85.71428571428571'. drift is 4.92197 and the allowed drift is set to 50. The test should PASS | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=PartitionMovementTest&test_method=test_empty |
| TLSMetricsTestChain | test_cert_chain_metrics | null | integration | https://buildkite.com/redpanda/redpanda/builds/75073#019a27f7-d01f-4777-8b6f-d137835ac398 | FAIL | 0/1 | | https://redpanda.metabaseapp.com/dashboard/87-tests?tab=142-dt-individual-test-history&test_class=TLSMetricsTestChain&test_method=test_cert_chain_metrics |

@graham-rp force-pushed the UX-358/add-list-connections branch 2 times, most recently from a047685 to 5009dd7 on October 23, 2025 14:23
@vbotbuildovich
Collaborator

Retry command for Build#74793

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/random_node_operations_smoke_test.py::RedpandaNodeOperationsSmokeTest.test_node_ops_smoke_test@{"cloud_storage_type":1,"mixed_versions":false}

@graham-rp graham-rp requested a review from r-vasquez October 23, 2025 16:42
Comment on lines +216 to +220
// TODO: add guardrails and define a proper field mapping
fset.StringVar(&orderBy, "order-by", "", "Order the results by their values. See Examples above")

// TODO: establish a limit
fset.Int32Var(&limit, "limit", defaultPageSize, "Limit how many records can be returned")
Contributor

What's left so we can finish these TODOs?

Contributor Author

The second one is scale testing from core, but the first is more involved. We'd have to figure out how to map things from our end back to the proto labels, and then document it in a way that makes sense to users. I think that's realistically a new ticket.
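
As a starting point for that follow-up ticket, one possible shape for the `--order-by` guardrail is an allow-list that maps short user-facing names onto the field paths already used in the help examples, rejecting anything else before the request is sent. This is only a sketch: the short names (`produce`, `fetch`, `idle`, `requests`, `opened`) and the function `translateOrderBy` are hypothetical, and the field paths are taken from the examples in the PR description rather than from the proto itself.

```go
package main

import (
	"fmt"
	"strings"
)

// orderFields maps hypothetical user-facing names to the field paths that
// appear in the help examples of this PR.
var orderFields = map[string]string{
	"produce":  "recent_request_statistics.produce_bytes",
	"fetch":    "recent_request_statistics.fetch_bytes",
	"idle":     "idle_duration",
	"requests": "total_request_statistics.request_count",
	"opened":   "open_time",
	"state":    "state",
}

// translateOrderBy validates a comma-separated "name [asc|desc]" list and
// rewrites each name to its mapped field path.
func translateOrderBy(expr string) (string, error) {
	var out []string
	for _, term := range strings.Split(expr, ",") {
		parts := strings.Fields(term)
		if len(parts) == 0 || len(parts) > 2 {
			return "", fmt.Errorf("invalid order-by term %q", term)
		}
		field, ok := orderFields[strings.ToLower(parts[0])]
		if !ok {
			return "", fmt.Errorf("unknown order-by field %q", parts[0])
		}
		if len(parts) == 2 {
			dir := strings.ToLower(parts[1])
			if dir != "asc" && dir != "desc" {
				return "", fmt.Errorf("invalid direction %q", parts[1])
			}
			field += " " + dir
		}
		out = append(out, field)
	}
	return strings.Join(out, ", "), nil
}

func main() {
	expr, err := translateOrderBy("state, opened asc")
	fmt.Println(expr, err)
}
```

An empty expression is rejected here, so a caller would skip the translation entirely when `--order-by` is not set.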

@graham-rp force-pushed the UX-358/add-list-connections branch from 5009dd7 to e1de2d8 on October 23, 2025 17:04
Comment on lines +39 to +51
var tableHeaders = []string{
	"NODE",
	"SHARD",
	"UID",
	"CLIENT-ID",
	"IP:PORT",
	"OPEN-TIME",
	"IDLE",
	"PROD-TPUT",
	"FETCH-TPUT",
	"REQS/MIN",
	"STATE",
}
Contributor

Let's add a USER column for sure, and I did a bit of opinionated reordering of the columns here:

Suggested change
-	var tableHeaders = []string{
-		"NODE",
-		"SHARD",
-		"UID",
-		"CLIENT-ID",
-		"IP:PORT",
-		"OPEN-TIME",
-		"IDLE",
-		"PROD-TPUT",
-		"FETCH-TPUT",
-		"REQS/MIN",
-		"STATE",
-	}
+	var tableHeaders = []string{
+		"UID",
+		"STATE",
+		"USER",
+		"CLIENT-ID",
+		"IP:PORT",
+		"NODE",
+		"SHARD",
+		"OPEN-TIME",
+		"IDLE",
+		"PROD-TPUT",
+		"FETCH-TPUT",
+		"REQS/MIN",
+	}

Then, IMO, the order of importance for additional fields is:

    "AVG BATCH",
    "GROUP-ID",
    "AUTH-MECHANISM",
    "LISTENER",
    "TRANSACTIONAL-ID",
    "GROUP-INSTANCE-ID",
    "GROUP-MEMBER-ID",

Ideally I would add basically all of the above data, but I guess it would quickly become too wide and wrap around. I think you mentioned at some point that we could provide a flexible set of columns here. I think that could be super valuable. But that could definitely be a follow up.

Contributor Author

Sounds good - I've got ~150 chars on the table as is after adding the user, so I don't think I want to add any more just yet (they're available with --format=json). Ideally we'd have a --columns=... or a more interactive TUI output

Contributor

We also have --format=wide that I think is currently the same as --format=text. Should we add more columns when --format=wide and in that case accept that the text might wrap around if you don't have enough screen space?
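
To make the `--format=wide` idea concrete, here is a small sketch of how the header set could be chosen per format. It assumes the reordered default columns suggested earlier in this thread and appends the extra fields in the proposed order of importance; `headersFor`, `textHeaders`, and `wideExtras` are hypothetical names, not what the PR implements.

```go
package main

import "fmt"

// textHeaders is the default column set suggested in this review thread.
var textHeaders = []string{
	"UID", "STATE", "USER", "CLIENT-ID", "IP:PORT", "NODE", "SHARD",
	"OPEN-TIME", "IDLE", "PROD-TPUT", "FETCH-TPUT", "REQS/MIN",
}

// wideExtras are the additional fields, in the order of importance
// proposed in the thread.
var wideExtras = []string{
	"AVG BATCH", "GROUP-ID", "AUTH-MECHANISM", "LISTENER",
	"TRANSACTIONAL-ID", "GROUP-INSTANCE-ID", "GROUP-MEMBER-ID",
}

// headersFor returns the default columns for every format except "wide",
// which gets the defaults followed by the extras.
func headersFor(format string) []string {
	if format != "wide" {
		return textHeaders
	}
	// Copy into a fresh slice so appending never mutates textHeaders.
	out := make([]string, 0, len(textHeaders)+len(wideExtras))
	out = append(out, textHeaders...)
	return append(out, wideExtras...)
}

func main() {
	fmt.Println(len(headersFor("text")), len(headersFor("wide")))
}
```

The same structure would extend to a `--columns` flag by replacing the fixed extras with a user-supplied, validated list.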

@graham-rp force-pushed the UX-358/add-list-connections branch 4 times, most recently from da011ee to 93fa3fe on October 23, 2025 23:56
Contributor Author

@graham-rp graham-rp left a comment

update based on feedback

@graham-rp force-pushed the UX-358/add-list-connections branch 3 times, most recently from 2f6a679 to 9e2d68c on October 24, 2025 13:04

fset.StringVar(&filterRaw, "filter-raw", "", "Filter connections based on a raw query (overrides other filters)")

// TODO: add guardrails and define a proper field mapping
fset.StringVar(&orderBy, "order-by", "", "Order the results by their values. See Examples above")
Contributor

What do you think about defining a set of shorthands for ordering as well, similar to how filtering works? I think these would make sense:

--top-producers       # order by produce throughput desc
--top-consumers       # order by fetch throughput desc
--most-idle           # order by idle_duration desc
--least-idle          # order by idle_duration asc
--oldest              # order by open_time asc
--newest              # order by open_time desc

Contributor Author

eventually that probably makes sense - for the time being I think the first thing is documenting a mapping between the API vals and what adminv2 is expecting
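
If the ordering shorthands are picked up later, they could resolve to `--order-by` expressions roughly as sketched below. The flag names come from the suggestion above and the field paths from the help examples in the PR description; the function `resolveOrderBy` and the rule that a shorthand cannot be combined with another shorthand or with an explicit `--order-by` are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// orderShorthands maps each proposed shorthand flag to an order-by
// expression built from field paths used in the help examples.
var orderShorthands = map[string]string{
	"top-producers": "recent_request_statistics.produce_bytes desc",
	"top-consumers": "recent_request_statistics.fetch_bytes desc",
	"most-idle":     "idle_duration desc",
	"least-idle":    "idle_duration asc",
	"oldest":        "open_time asc",
	"newest":        "open_time desc",
}

// resolveOrderBy picks the effective order-by expression.
// ASSUMPTION: at most one shorthand may be set, and it cannot be combined
// with an explicit --order-by value.
func resolveOrderBy(explicit string, enabled []string) (string, error) {
	if len(enabled) == 0 {
		return explicit, nil
	}
	if len(enabled) > 1 || explicit != "" {
		return "", errors.New("use only one ordering shorthand or --order-by")
	}
	expr, ok := orderShorthands[enabled[0]]
	if !ok {
		return "", fmt.Errorf("unknown ordering shorthand %q", enabled[0])
	}
	return expr, nil
}

func main() {
	expr, err := resolveOrderBy("", []string{"most-idle"})
	fmt.Println(expr, err)
}
```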

@graham-rp force-pushed the UX-358/add-list-connections branch 2 times, most recently from 00f4b9a to ab83506 on October 24, 2025 17:17
@graham-rp force-pushed the UX-358/add-list-connections branch 3 times, most recently from 23d1295 to 83b0ca7 on October 27, 2025 16:05
@graham-rp force-pushed the UX-358/add-list-connections branch from c0bc030 to c8a968e on October 27, 2025 18:23
pgellert previously approved these changes Oct 27, 2025
@vbotbuildovich
Collaborator

Retry command for Build#75054

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/quick_terminate_test.py::QuickTerminateTest.test_terminate

@graham-rp force-pushed the UX-358/add-list-connections branch from c8a968e to a234d3b on October 27, 2025 19:57
@graham-rp force-pushed the UX-358/add-list-connections branch from a234d3b to c8a968e on October 27, 2025 22:44
@r-vasquez force-pushed the UX-358/add-list-connections branch from c8a968e to 41f9426 on October 27, 2025 22:48
This adds a basic `cluster connections list` command that uses the
adminv2 API to list open and recently closed client connections inside
the cluster.

Co-authored-by: Gellért Peresztegi-Nagy <pereszteginagy.gellert@gmail.com>
@r-vasquez force-pushed the UX-358/add-list-connections branch from 41f9426 to 38230d0 on October 27, 2025 22:51
@r-vasquez
Contributor

Force pushes 1 & 2:

Rebasing with dev to fix merge conflicts and bumping rpadmin to prevent that from happening again, as we are fixing a bug on an upstream dependency in #28222

@vbotbuildovich
Collaborator

vbotbuildovich commented Oct 28, 2025

Retry command for Build#75073

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/tls_metrics_test.py::TLSMetricsTestChain.test_cert_chain_metrics
tests/rptest/tests/nodes_decommissioning_test.py::NodesDecommissioningTest.test_multiple_decommissions@{"cloud_topic":true}

@r-vasquez r-vasquez merged commit 025b300 into redpanda-data:dev Oct 28, 2025
23 checks passed
@r-vasquez r-vasquez mentioned this pull request Oct 28, 2025
7 tasks
4 participants