Alphanumeric partial matching fails (e.g., "610" doesn't find "CL610-ABC") #976
Hi, I am experiencing an issue where searching for partial alphanumeric codes results in 0 hits (an empty array). Consider the following words:

Currently, none of these produce any hits. The behavior is a bit strange: if the search term starts with the beginning of the target word, it works. Or, if the mid-word term consists only of letters, we get some hits (e.g. …).

I am wondering whether I am doing something wrong, or whether there is a built-in way to enable substring/partial matching for alphanumeric identifiers.

```ts
import { create, search } from "@orama/orama";

const orama = create({
  schema: {
    id: "string",
    name: "string",
    type: "enum",
    region_id: "number",
    location: "geopoint",
    sample_names: "string[]",
    well_id: "number",
    well_name: "string",
    fluid_type: "string",
  } as const,
});
// ...
const results = await search(orama, {
  term: query,
  limit: 12,
  // Only these string properties are searched for the term.
  properties: ["name", "sample_names", "well_name"],
});
```
Replies: 1 comment
Hi @davlet61, this is expected, as `610` is a substring of `CL610-ABC`. The tokenizer splits `CL610-ABC` into `["CL610", "ABC"]` and can't find a string starting with `610`.

Doing substring matching is feasible, but it will drastically increase index size (and therefore memory utilization), and it may not be doable in a browser.

I hope this clarifies things!
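
To make the explanation above concrete, here is a minimal, self-contained sketch of prefix-based term lookup. It is a simplified model, not Orama's actual tokenizer or index: the `tokenize`, `prefixMatch`, and `withSuffixes` helpers are hypothetical names, and lowercasing plus splitting on non-alphanumeric characters is an assumption made for illustration. It also shows one possible way to get mid-word matches (indexing every suffix of each token) and why that inflates the index.

```typescript
// Simplified model of prefix-based lookup. NOT Orama's real implementation.

// Assumption: lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// A term matches if at least one indexed token starts with it.
function prefixMatch(indexed: string[], term: string): boolean {
  const needle = term.toLowerCase();
  return indexed.some((token) => token.startsWith(needle));
}

// Hypothetical workaround: also index every suffix of each token, so that
// a prefix lookup can land in the middle of the original token.
function withSuffixes(tokens: string[]): string[] {
  const expanded = new Set<string>();
  for (const token of tokens) {
    for (let start = 0; start < token.length; start++) {
      expanded.add(token.slice(start));
    }
  }
  return Array.from(expanded);
}

const tokens = tokenize("CL610-ABC"); // ["cl610", "abc"]
const expanded = withSuffixes(tokens); // 8 entries instead of 2

console.log(prefixMatch(tokens, "CL6")); // true: "cl610" starts with "cl6"
console.log(prefixMatch(tokens, "610")); // false: no token starts with "610"
console.log(prefixMatch(expanded, "610")); // true: the suffix "610" is indexed
console.log(tokens.length, expanded.length); // 2 8
```

In this sketch a token of length n contributes up to n indexed entries, which is the kind of growth in index size and memory use mentioned above. Indexing every substring, rather than only suffixes, would grow faster still.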