Fix scikit-learn 1.6+ compatibility #197
Merged
MaxHalford merged 1 commit into MaxHalford:master on Jan 9, 2026
Owner: Looks like this won't hurt, thanks 👍
Summary
This PR fixes a compatibility issue with scikit-learn 1.6+ where the `CA` class (and by inheritance, `MCA`) would raise an `AttributeError` when calling `fit()` or any method decorated with `@utils.check_is_fitted`.
Documentation
Problem
When using `prince` with `scikit-learn` 1.6+, users encounter the following error:
Root Cause
`scikit-learn` 1.6 introduced a new public API for estimator tags, replacing the private `_get_tags()` method with `__sklearn_tags__()`. This method is provided by `sklearn.base.BaseEstimator`.
The `CA` class only inherited from `utils.EigenvaluesMixin` and did not inherit from `BaseEstimator`. When `sklearn.utils.validation.check_is_fitted()` is called (via the `@utils.check_is_fitted` decorator), it now requires `__sklearn_tags__()` to be present.
Solution
- `ca.py`: Added `sklearn.base.BaseEstimator` to the `CA` class inheritance chain
- `mca.py`: Reordered the inheritance to fix the resulting MRO (Method Resolution Order) conflict

Since `CA` now inherits from `BaseEstimator`, having `MCA` also directly inherit from `BaseEstimator` before `CA` caused an MRO conflict. The fix puts `CA` first (which brings in `BaseEstimator`), followed by `TransformerMixin`.
Classes Affected
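| Class | Inheritance |
| --- | --- |
| `CA` | `BaseEstimator` inheritance |
| `MCA` | `BaseEstimator` via `CA` |
| `PCA` | `BaseEstimator` |
| `FAMD` | `BaseEstimator` via `PCA` |
| `MFA` | `BaseEstimator` via `PCA` |
| `GPA` | `BaseEstimator` |

Testing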
✅ Verified all estimators work correctly with `scikit-learn` 1.7.0
Validation Code
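As a hedged sketch of the kind of check this section describes, the block below uses stand-in classes rather than the real `prince` and `scikit-learn` ones, so it runs with the standard library alone. The class bodies, the base order chosen for `CA`, the pre-fix base order `(BaseEstimator, TransformerMixin, CA)`, and the helper `linearize` are all assumptions made for illustration. Only the post-fix order (`CA` first, then `TransformerMixin`) comes from the description in this PR.

```python
class BaseEstimator:
    """Stand-in for sklearn.base.BaseEstimator (assumption: only the tag hook is modelled)."""

    def __sklearn_tags__(self):
        return {"requires_fit": True}


class TransformerMixin:
    """Stand-in for sklearn.base.TransformerMixin."""


class EigenvaluesMixin:
    """Stand-in for prince's utils.EigenvaluesMixin."""


class CA(BaseEstimator, EigenvaluesMixin):
    """Post-fix CA: it now brings in BaseEstimator itself (base order is an assumption)."""


def linearize(*bases):
    """Hypothetical helper: build an MCA class from the given bases.

    Returns the MRO as a list of class names, or the TypeError text
    when Python cannot linearize the bases.
    """
    try:
        cls = type("MCA", bases, {})
    except TypeError as exc:
        return f"TypeError: {exc}"
    return [c.__name__ for c in cls.__mro__]


# Assumed pre-fix order: BaseEstimator listed before CA, which already inherits it.
old_order = linearize(BaseEstimator, TransformerMixin, CA)

# Post-fix order from the PR: CA first, then TransformerMixin.
new_order = linearize(CA, TransformerMixin)

print(old_order)
print(new_order)

# The tag hook is reachable on the fixed hierarchy.
mca = type("MCA", (CA, TransformerMixin), {})()
print(hasattr(mca, "__sklearn_tags__"))
```

With the assumed pre-fix order, Python refuses to create the class because `BaseEstimator` would have to come both before and after `CA` in the MRO. With `CA` first, the bases linearize and `__sklearn_tags__` is inherited through `CA`.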
Backward Compatibility
This change is backward compatible with older versions of `scikit-learn`. Inheriting from `BaseEstimator` is standard practice for estimators that are compatible with `sklearn`, and it provides additional benefits such as the `get_params()` and `set_params()` methods.
Additional Notes
The code changes were proposed by Claude Opus 4.5 Thinking using the following prompt:
Prompt
I have forked the prince package repository, which is a package that implements statistical procedures for multivariate exploratory data analysis in Python, such as PCA, CA, MCA, MFA, FAMD, and GPA. I would like your assistance with contributing to this open source package.
Problem Summary
The prince package is dependent on the scikit-learn package. However, the prince package has not been updated to comply with the new API introduced in the latest versions of scikit-learn. Therefore, I am receiving the following error anytime I invoke the fit() method of any of the analyses:
Root Cause
This error is occurring because scikit-learn 1.6+ changed its internal API, replacing `_get_tags()` with `__sklearn_tags__()`. Here are more details from the `scikit-learn` documentation and migration guide:
"""
The following change was introduced in version 1.6.0 of scikit-learn: `__sklearn_tags__` was introduced for setting tags in estimators. Scikit-learn introduced estimator tags in version 0.21 as a private API and mostly used in tests. However, these tags expanded over time and many third party developers also need to use them. Therefore in version 1.6 the API for the tags was revamped and exposed as public API. The estimator tags are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. The estimator tags are an instance of Tags returned by the method `__sklearn_tags__`. These tags are used in different places, such as is_regressor or the common checks run by check_estimator and parametrize_with_checks, where tags determine which checks to run and what input data is appropriate. Tags can depend on estimator parameters or even system architecture and can in general only be determined at runtime and are therefore instance attributes rather than class attributes. It is unlikely that the default values for each tag will suit the needs of your specific estimator. You can change the default values by defining a `__sklearn_tags__()` method which returns the new values for your estimator’s tags.
"""
The error I have encountered is surfaced by the `@utils.check_is_fitted` decorator, which calls sklearn's `check_is_fitted()` validation, which in sklearn 1.6+ requires the `__sklearn_tags__()` method to exist.
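The call chain described above can be sketched with stand-in code. This is a simplified illustration under stated assumptions, not the implementation used by `scikit-learn` or `prince`: `Tags`, `NotFittedError`, `check_is_fitted_standin`, the `check_is_fitted` decorator, and the `OldCA` / `NewCA` classes are all hypothetical names written for this example.

```python
import functools


class Tags:
    """Stand-in for sklearn's Tags object (assumption: only requires_fit is modelled)."""

    def __init__(self, requires_fit=True):
        self.requires_fit = requires_fit


class NotFittedError(Exception):
    """Stand-in for sklearn.exceptions.NotFittedError."""


def check_is_fitted_standin(estimator):
    """Simplified stand-in: look up the tags first, then the fitted state."""
    # Raises AttributeError when the estimator has no __sklearn_tags__ method.
    tags = estimator.__sklearn_tags__()
    fitted = any(
        name.endswith("_") and not name.startswith("_") for name in vars(estimator)
    )
    if tags.requires_fit and not fitted:
        raise NotFittedError(f"{type(estimator).__name__} is not fitted yet")


def check_is_fitted(method):
    """Hypothetical decorator shaped like @utils.check_is_fitted."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        check_is_fitted_standin(self)
        return method(self, *args, **kwargs)

    return wrapper


class TaggedBase:
    """Plays the role of BaseEstimator: it supplies __sklearn_tags__."""

    def __sklearn_tags__(self):
        return Tags(requires_fit=True)


class CAMethods:
    """Shared toy fit/transform so both variants behave identically otherwise."""

    def fit(self):
        self.eigenvalues_ = [1.0]  # trailing underscore marks a fitted attribute
        return self

    @check_is_fitted
    def transform(self):
        return self.eigenvalues_


class OldCA(CAMethods):
    """No base class provides __sklearn_tags__, so the check itself fails."""


class NewCA(TaggedBase, CAMethods):
    """Inherits the tag hook, so the check reaches the fitted-state test."""


try:
    OldCA().fit().transform()
except AttributeError as exc:
    print("OldCA:", exc)

print("NewCA:", NewCA().fit().transform())
```

In this sketch the failure happens before the fitted state is ever inspected, which is why a correctly fitted estimator can still fail when no class in its hierarchy defines `__sklearn_tags__()`.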
Proposed Solution
Testing
After making changes, verify the fix works with the following code:
Acceptance Criteria