Replace ClientContext/TabletLocator/MetadataServicer with AccumuloTableInfoFetcher facade #3449

Open
SethSmucker wants to merge 5 commits into integration from task/clientcontext-facade-migration

@SethSmucker
Collaborator

Summary

  • Create AccumuloTableInfoFetcher facade in core/connection-pool centralizing Accumulo metadata operations behind public APIs
  • Replace Thrift RPC in BulkIngestMapFileLoader.getMajorCompactionCount() with getActiveCompactions() API
  • Replace MetadataServicer in TableSplitsCache with locate() API
  • Replace TabletLocator/ClientContext in BulkInputFormat online path with locate() API
  • Remove TabletLocator from import-control-accumulo.xml allowlist

Part of #2443

- Remove getClientContext utility from AccumuloConnectionFactory
- Update PushdownScheduler to use tableOperations().tableIdMap()
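
As a rough illustration of the `tableOperations().tableIdMap()` lookup mentioned above: in Accumulo's public API that call returns a map from table name to table ID. The sketch below takes such a map as a plain `Map<String, String>` so it runs without a cluster. The class name, the helper name, and the choice to return an empty `Optional` for a missing table are assumptions for illustration, not the PR's actual code.

```java
import java.util.Map;
import java.util.Optional;

final class TableIdLookupSketch {

    /**
     * Resolve a table ID from a name-to-ID map shaped like the one returned by
     * tableOperations().tableIdMap(). Hypothetical helper, not part of the PR.
     */
    static Optional<String> tableId(Map<String, String> tableIdMap, String tableName) {
        // A missing entry means the table does not exist (or is not visible).
        return Optional.ofNullable(tableIdMap.get(tableName));
    }

    public static void main(String[] args) {
        // Made-up names and IDs standing in for a live tableIdMap() result.
        Map<String, String> ids = Map.of("shard", "2a", "shardIndex", "2b");
        System.out.println(tableId(ids, "shard").orElse("<missing>"));
        System.out.println(tableId(ids, "noSuchTable").orElse("<missing>"));
    }
}
```

How the caller should react to a missing table (for example, by throwing `TableNotFoundException`) is left open here.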

Fixes #3339
Part of #2443
…Is (#2443)

Create AccumuloTableInfoFetcher in core/connection-pool that centralizes
Accumulo table metadata operations behind public APIs, replacing direct
usage of ClientContext, ThriftClientTypes, TabletLocator, and
MetadataServicer.

Migrated callers:
- BulkIngestMapFileLoader: replace Thrift RPC with getActiveCompactions()
- TableSplitsCache: replace MetadataServicer with locate() API
- BulkInputFormat: replace TabletLocator/ClientContext online path with
  locate() API; offline path deferred (uses KeyExtent, separate task)

Also removes TabletLocator from import-control-accumulo.xml allowlist.
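
A minimal sketch of the facade idea described in this commit message, assuming a narrow interface that callers depend on in place of internal Accumulo classes. Every name below (`TableInfoFetcher`, `tabletLocations`, `InMemoryFetcher`, `endRowsByServer`) is hypothetical and does not reflect the actual `AccumuloTableInfoFetcher` signatures. A real implementation would presumably delegate to the public `locate()` API, while the in-memory stand-in lets caller logic be exercised without a cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FacadeSketch {

    /** Hypothetical narrow facade; method names are illustrative only. */
    interface TableInfoFetcher {
        /** Maps each tablet end row of a table to the server hosting that tablet. */
        Map<String, String> tabletLocations(String tableName);
    }

    /** In-memory stand-in so callers can be tested without a running instance. */
    static final class InMemoryFetcher implements TableInfoFetcher {
        private final Map<String, Map<String, String>> byTable;

        InMemoryFetcher(Map<String, Map<String, String>> byTable) {
            this.byTable = byTable;
        }

        @Override
        public Map<String, String> tabletLocations(String tableName) {
            Map<String, String> locations = byTable.get(tableName);
            if (locations == null) {
                throw new IllegalArgumentException("unknown table: " + tableName);
            }
            return new TreeMap<>(locations); // sorted copy, keyed by end row
        }
    }

    /** Caller-side logic that sees only the facade: group end rows by hosting server. */
    static Map<String, List<String>> endRowsByServer(TableInfoFetcher fetcher, String tableName) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Map.Entry<String, String> e : fetcher.tabletLocations(tableName).entrySet()) {
            grouped.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Made-up table, rows, and server addresses.
        TableInfoFetcher fetcher = new InMemoryFetcher(Map.of(
                "shard", Map.of("row_a", "ts1:9997", "row_m", "ts2:9997", "row_z", "ts1:9997")));
        // prints {ts1:9997=[row_a, row_z], ts2:9997=[row_m]}
        System.out.println(endRowsByServer(fetcher, "shard"));
    }
}
```

Because the caller only sees the interface, swapping the backing implementation should not require changes in code like `endRowsByServer`.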
@SethSmucker force-pushed the task/clientcontext-facade-migration branch from e9a4617 to 5f29ebf on March 2, 2026 20:25
* Get the count of running major compactions across all tablet servers using the public {@code getActiveCompactions()} API.
* <p>
* Note: This counts only running compactions (not queued), which differs slightly from the original Thrift-based implementation that also counted queued
* compactions. This is acceptable because the MAJC_THRESHOLD default is 3000 (a high safety margin) and this is polled on each bulk load cycle.
Collaborator

I'm not sure relying solely on active compactions will work for us. On some instances, the tservers never really stop compacting, so a threshold on the active count would either always trigger or never trigger. The queued count is what really illustrates the backlog. Thoughts, @ivakegg @hlgp?
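
To make the concern above concrete, here is a small sketch comparing a running-only count with a running-plus-queued count against the 3000 default quoted in the Javadoc. The `ServerStats` type, the per-server numbers, and the `count > threshold` comparison are all assumptions for illustration; none of them come from the PR or from Accumulo.

```java
import java.util.List;

final class CompactionBacklogSketch {

    /** Hypothetical per-tablet-server snapshot; a stand-in, not an Accumulo type. */
    static final class ServerStats {
        final String server;
        final int runningMajc;
        final int queuedMajc;

        ServerStats(String server, int runningMajc, int queuedMajc) {
            this.server = server;
            this.runningMajc = runningMajc;
            this.queuedMajc = queuedMajc;
        }
    }

    /** Count only compactions that are currently executing. */
    static int runningOnly(List<ServerStats> stats) {
        int total = 0;
        for (ServerStats s : stats) {
            total += s.runningMajc;
        }
        return total;
    }

    /** Count executing plus waiting compactions, i.e. the backlog. */
    static int runningPlusQueued(List<ServerStats> stats) {
        int total = 0;
        for (ServerStats s : stats) {
            total += s.runningMajc + s.queuedMajc;
        }
        return total;
    }

    /** Assumed comparison: the count must exceed the threshold to trigger. */
    static boolean overThreshold(int count, int threshold) {
        return count > threshold;
    }

    public static void main(String[] args) {
        int threshold = 3000; // MAJC_THRESHOLD default quoted in the Javadoc
        // Made-up numbers: both servers are saturated and have a deep queue.
        List<ServerStats> saturated = List.of(
                new ServerStats("ts1", 8, 2000),
                new ServerStats("ts2", 8, 2000));
        System.out.println(overThreshold(runningOnly(saturated), threshold));       // false
        System.out.println(overThreshold(runningPlusQueued(saturated), threshold)); // true
    }
}
```

With these numbers the running count is capped by how many compactions each server can execute at once, so it stays far below the threshold however large the backlog grows. Only the combined count reflects the queue.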

SethSmucker and others added 3 commits March 23, 2026 12:12
- Remove implementation details from AccumuloTableInfoFetcher javadocs
- Throw TableNotFoundException instead of TableDeletedException in BulkInputFormat
- Add datawave-core-connection-pool to root pom dependencyManagement
2 participants