Skip to content

Conversation

@nukyan
Copy link

@nukyan nukyan commented Jan 19, 2026

Each positional embedding module has its own method for calculating frequency bands, so this should not be computed within build_fourier_pos_embed(). And also, for those who wish to implement alternative band calculation methods, simply override the _compute_bands() function. This is a convenient way.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@rwightman
Copy link
Collaborator

@nukyan I've thought about refactoring this a bit, but I don't feel I can/should break compat for the original fn name. So the solution would be to split into two components, have the original fn call those, and refactor the modules to call the separated fn....

In the grid creation I'd also planned to unify the dinov3 style grid (0.5 center w/ normalization) as an option in addition to 'pixels' and the default index based approach.

@nukyan
Copy link
Author

nukyan commented Jan 24, 2026

@nukyan I've thought about refactoring this a bit, but I don't feel I can/should break compat for the original fn name. So the solution would be to split into two components, have the original fn call those, and refactor the modules to call the separated fn....

In the grid creation I'd also planned to unify the dinov3 style grid (0.5 center w/ normalization) as an option in addition to 'pixels' and the default index based approach.

Okay, I've already revert back the original function signature. Now those modules uses _build_fourier_pos_embed and _build_rotary_pos_embed to get embedding values. As public functions, I think it would be better for build_fourier_pos_embed and build_rotary_pos_embed to have bands=None as default.

Additionally, I believe refactoring it is the better choice for the better code quality. But as a newcomer to CV, I'm afraid I can't offer any assistance with this matter.

@rwightman
Copy link
Collaborator

@nukyan thanks for the update, still thinking about the API a bit and whether it makes sense for me to slot my dinov3 idea on top...

Once you have a project with thousands of users and a whole lot of other projects depending on it, you cannot arbitrarily refactor for the sake of improving code quality. You have to weight the likelihood of breaking other people's use of your library against the quality improvement. One cannot assume that people are only using the library functions via the models themselves, many do use sub-modules, functions directly and that's hard to track.

With modules, functions like this, when I started I had only a few models using it and as usual, decided to cut off refinement at some point to 'ship it'... and then over time more models started using it and yeah, would have benefited from a better initial design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants