Skip to content

Options rework#187

Draft
toptobes wants to merge 17 commits into
mainfrom
KG-misc-work
Draft

Options rework#187
toptobes wants to merge 17 commits into
mainfrom
KG-misc-work

Conversation

@toptobes
Copy link
Copy Markdown
Collaborator

@toptobes toptobes commented May 12, 2026

still fleshing it out but some questions include:

  • do we want to omit Base*Options for things like CollectionUpdateManyOptions which has no table equiv.?
  • do we need generics on options classes which don't need it?
    • on one hand, better forward compatability
    • on the other hand, longer and less reusable types
  • should countdocuments really have a no-filter overload?
  • can we switch to using = null for options so we don't need as many overloads
  • what dir should options be in? currently it's not uniform

@toptobes toptobes marked this pull request as draft May 12, 2026 06:45
@sl-at-ibm
Copy link
Copy Markdown
Collaborator

sl-at-ibm commented May 12, 2026

Here is my opinion:

[1] do we want to omit Base*Options for things like CollectionUpdateManyOptions which has no table equiv.?

Yes

[2] do we need generics on options classes which don't need it?

I'd say no, but am less sure here. Do you have specific cases in mind where a type parameter might become useful in the future?

[3] should countdocuments really have a no-filter overload?

Good point. Having users pass an explicitly CollectionFilter.Empty() might be beneficial in avoiding indiscriminate usage. FWIW, in astrapy both filter and upper_bound are required parameters (the former possibly passed as {}).
So ... => I'd remove this overload.

[4] can we switch to using = null for options so we don't need as many overloads

I've got nothing against it.

[5] what dir should options be in? currently it's not uniform

Very good point. I am reviewing the current location of existing *Options*.cs files and will post an assessment below for further discussion.

@sl-at-ibm
Copy link
Copy Markdown
Collaborator

@toptobes Here is my assessment of a possible more-or-less consistent structure for the various Options files. We can start from here and refine as needed.

Sections here are the desired location, the file path+name in the lists is the current one (as of main now).

Notes:

  1. (!) marks that the file is in a different location than the section it is listed in.
  2. I simply did a find for "*Options*cs" files, so this might not cover all files to relocate. First let's agree on a general plan, we can be more detailed later.
  3. Also xmldoc sometimes defines wrong scope e.g. in "InsertManyOptions" it says Options for inserting multiple documents into a collection. but it's also for tables.

DELETE (these seem unused?)

  • Core/Query/FindAPIOptions.cs
  • Core/IParentCursor.cs
  • Core/CountDocumentsCommandOptions.cs (currently not used, probably part of some rework in this PR? Should be shareable coll/tables if remains)

IN Admin/

  • Admin/DatabaseCreationOptions.cs
  • Admin/ListDatabaseOptions.cs
  • Core/BlockingCommandOptions.cs (!)
  • Core/FindAvailableRegionsCommandOptions.cs (!)
  • Core/FindEmbeddingProvidersCommandOptions.cs (!)
  • Core/FindRerankingProvidersCommandOptions.cs (!)
  • Core/CreateKeyspaceCommandOptions.cs (!)
  • Core/DatabaseCollectionCommandOptions.cs (? here or in Collections/ It's used in create/get collections, so maybe here...?)
  • Core/DatabaseTableCommandOptions.cs (? same as above)

IN Core/

  • Core/AnalyzerOptions.cs
  • Core/CommandOptions.cs
  • Core/DatabaseCommandOptions.cs
  • Core/FilterOptions.cs (perhaps rename to "AnalyzerFilterOptions" ? It might seem misleading as is)
  • Core/LexicalOptions.cs
  • Core/RerankServiceOptions.cs
  • Core/TokenizerOptions.cs
  • Core/CreateTableCommandOptions.cs
  • Core/DropTableCommandOptions.cs
  • Core/VectorServiceOptions.cs
  • Core/RerankOptions.cs
  • Core/HttpClientOptions.cs
  • Core/TimeoutOptions.cs
  • Core/UpdateOptions.cs (most of the content is coll-specific, but UpdateOneOptions is also for tables, so: here)
  • Core/InsertManyOptions.cs

IN Core/Query/

Are we ok to keep all things "Query" here? (rather than stick to the core/coll/tables like the rest?) I'm OK with that._
Also no coll/table subdirs here, too much structure? Works for me.
Assuming this is the decision, the following might all stay here, maybe some renames to clarify scope:

  • Core/Query/FindAndRerankOptions.cs (!? tough call. Probably a Table FARR will have different signature, if it ever comes. So this should (a) be renamed and (b) go to collections?)
  • Core/Query/IFindOptions.cs
  • Core/Query/CollectionFindOneOptions.cs
  • Core/Query/FindOptions.cs
  • Core/Query/TableFindOneOptions.cs
  • Core/Query/TableFindOptions.cs
  • Core/Query/CollectionFindOptions.cs
  • Core/Query/CollectionFindManyOptions.cs
  • Core/Query/IFindManyOptions.cs
  • Core/Query/TableFindManyOptions.cs

IN SerDes/

  • SerDes/AnalyzerOptionsConverter.cs

IN Tables/

  • Tables/TableIndexCreationOptions.cs
  • Tables/CreateIndexCommandOptions.cs
  • Tables/DropIndexCommandOptions.cs
  • Tables/TableTextIndexOptions.cs
  • Tables/TableVectorIndexOptions.cs
  • Tables/TableIndexOptions.cs
  • Core/TableDeleteOptions.cs (!)
  • Core/CreateTypeCommandOptions.cs (!)
  • Core/DropTypeCommandOptions.cs (!)

IN Collections/

  • Core/DefaultIdOptions.cs (!)
  • Core/DeleteOptions.cs (!, also rename with 'collections' in the name?)
  • Core/IndexingOptions.cs (!)
  • Core/LexicalOptionsAttribute.cs (!)
  • Core/VectorOptions.cs (!)
  • Core/ReplaceOptions.cs (!)

@toptobes
Copy link
Copy Markdown
Collaborator Author

  1. 👍
  2. not really, I actually removed the type params anyways I got annoyed with dealing with them after 5 seconds (they were previously necessary as many of the options objects used to contain filters within them, but not anymore)
  3. 100% agreed, just double checking since there was already a no-filter overload
  4. yay

@toptobes
Copy link
Copy Markdown
Collaborator Author

as for the location of the options objects I'll come back to this at the end

/// <summary>
/// Returns a single document from the collection based on the provided <see cref="CollectionFindOneOptions{T}"/>.
/// This will return the first document found, most often used in conjunction with <see cref="CollectionFindOneOptions{T}.Sort"/>.
/// See <see cref="CollectionFindOneOptions{T}"/> for more details on sorting, projecting and the other options for finding a document.
Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm May 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the CollectionFindOneOptions{T}.Sort reference in line above this one is an actual warning (apparently inherited attributes/methods aren't tracked), a line line this one, while not giving warnings, would arguably need to be changed to refer to the base BaseFindOptions perhaps?

(Note: I know you aren't looking at xmldocs warnings yet, this is more a remark to plan the closing work for this PR. Which incidentally I can volunteer to do myself if you have better things to do).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CollectionFindOneOptions will end up inheriting docs from BaseFindOneOptions

findOptions = findOptions.WithFilterParam(filter);
var command = CreateCommand("findOne").WithPayload(findOptions).AddCommandOptions(commandOptions);
findOptions ??= new CollectionFindOneOptions<T>();
var commandOptions = findOptions;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this method directly uses the findOptions if provided, while e.g. InsertManyAsync merges the options tree (above, line 194), putting the method arg on top of the pile?

I mean, options such as timeouts, etc can potentially be specified at either level and apply to both method. (This is a broader question, probably, about whether the "merge the options tree" pattern should be ubiquitous).

(Which in itself raises a more general question. What is the expected behaviour if invoking e.g. FindOneAsync one passes a different Environment or Destination in the CollectionFindOneOptions?

Note that some kind of "misplaced layer for options" occurs in other clients already and is not a problem in itself. E.g. in astrapy you can pass a whole spawn_api_options when creating a Collection off a Database, including table-specific serdes settings - which of course won't be ever looked at. So "the new env/dest is ignored" is probably a valid answer. This comment originated by looking at this line but has wide scope).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outdated – options were in a bit of a transitional stage as I was sorting out some bugs

What is the expected behaviour if invoking e.g. FindOneAsync one passes a different Environment or Destination in the CollectionFindOneOptions?

this is very much a more general question and I'm not sure it's in the scope of this PR. Wdyt should happen though?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very much a more general question and I'm not sure it's in the scope of this PR.
Agree. I'd also add that this could realistically be pushed postGA.

Wdyt should happen though?
Ignoring the new settings sounds inelegant. Probably should throw.
Maybe a good, centralized solution could be to restrict the options Merge operation to "mergeable fields" (timeouts, etc), and refuse to merge when other fields are passed.

(and in case a "major merge" is used, maybe when creating a client - I don't recall, there may be an additional MergeComprehensively method, or something like that).

Comment thread src/DataStax.AstraDB.DataApi/Core/Enumeration/FindCursor.cs Outdated
{
throw new ArgumentException("Cannot run ordered insert_many concurrently.");
}

foreach (var doc in documents)
{
InsertValidator.Validate(doc);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another off-topic (sorry). Do we want this validation to happen?
It probably consumes very little cpu cycles in the whole "insertion" latency budget, however it might be not a client's job, at least to an extent.
For example, null is a valid _id for the API, and so is "" (I think).
If you agree, this item would become a postGA ticket.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh I was this close (🤏) from removing it myself, I'm also not sure about keeping that validation

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there's nothing that wouldn't be caught by the server and returned as an error in the response (as the general clients' philosophy would prefer).
Except those not-real-errors that we noted above.
If it's convenient to do it in this branch, feel free to remove 🤷

/// <returns></returns>
/// <param name="filter">The filter to match documents.</param>
/// <param name="options">Options for the delete operation.</param>
/// <returns>The result of the delete operation.</returns>
/// <remarks>
/// Deleting all documents in a collection is not recommended for large collections. However, if needed
/// you can pass null as the filter to delete all documents in the collection.
Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while you're at it, you might change this line to suggest usage of CollectionFilter.Empty()

(esp. since we plan to reject actual nulls for safety).

/// and that an exception will be thrown by this method if this limit is encountered.
/// </remarks>
/// <throws cref="DocumentCountExceedsMaxException">Thrown if the number of documents to count exceeds the limit</throws>
public Task<int> CountDocumentsAsync(int maxDocumentsToCount)
Copy link
Copy Markdown
Collaborator

@sl-at-ibm sl-at-ibm May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skedwards88 I believe this being removed (the overload with just the upper bound) means the docs example under astra-db-serverless/api-reference/document-methods/count-all.html#count-all-documents will need to get a CollectionFilters.Empty() as first parameter.
(this is intentional, to encourage users to be conscious about counting all on large collections)

}

[Fact]
[SkipWhenNotAstra] // TODO why is this not throwing on HCD in CI? It throws just fine when running on HCD locally
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably related (albeit in reverse, local<-->runner) to what I experience locally, namely that on my laptop when I run Data API + HCD there's no actual user/pwd check. Anything I pass would just work. (I always blamed podman for failing in mounting some cassandra.yaml, but never managed to get to the bottom of it).

Guard.NotNull(row, nameof(row));

options = options?.ShallowCopy() ?? new();
SetRowSerializationOptions<T>(options, true);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another tangential thought is that SetRowSerializationOptions does act immutably, i.e.: "if it weren't for the fact that it reads the type of Row, it could be a static method".
I wonder if it makes sense to make it into one, with Row an explicit parameter. That would give compile-time guarantees against future bugs that end up mutating something in the instance (not a real concern, but ... )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants