
Fix scikit-learn 1.6+ compatibility #197

Merged
MaxHalford merged 1 commit into MaxHalford:master from colelandolt:bugfix/colelandolt/scikit-learn-1.6-compatibility on Jan 9, 2026

@colelandolt (Contributor) commented Jan 9, 2026

Summary

This PR fixes a compatibility issue with scikit-learn 1.6+ where the CA class (and by inheritance, MCA) would raise an AttributeError when calling fit() or any method decorated with @utils.check_is_fitted.

Documentation

Problem

When using prince with scikit-learn 1.6+, users encounter the following error:

AttributeError: The following error was raised: 'CA' object has no attribute '__sklearn_tags__'. It seems that there are no classes that implement `__sklearn_tags__` in the MRO and/or all classes in the MRO call `super().__sklearn_tags__()`. Make sure to inherit from `BaseEstimator` which implements `__sklearn_tags__` (or alternatively define `__sklearn_tags__` but we don't recommend this approach).

Root Cause

scikit-learn 1.6 introduced a new public API for estimator tags, replacing the private _more_tags() and _get_tags() methods with __sklearn_tags__(). The default implementation of __sklearn_tags__() is provided by sklearn.base.BaseEstimator.
The CA class inherited only from utils.EigenvaluesMixin, not from BaseEstimator. When sklearn.utils.validation.check_is_fitted() is called (via the @utils.check_is_fitted decorator), it now looks up the estimator's tags and therefore requires __sklearn_tags__() to be present.
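
The failure mode can be illustrated without scikit-learn installed. The following is a minimal sketch using stand-in classes: BaseEstimatorStandIn, OldCA, NewCA and get_tags_sketch are hypothetical names that only mimic the lookup described above, and the base-class order in NewCA is an assumption. It is not scikit-learn's or prince's actual implementation.

```python
class BaseEstimatorStandIn:
    """Stand-in for sklearn.base.BaseEstimator (assumption: only the tags hook is modelled)."""

    def __sklearn_tags__(self):
        # The real method returns a Tags object; a plain dict is used here for brevity.
        return {"requires_fit": True}


class EigenvaluesMixin:
    """Stand-in for prince's utils.EigenvaluesMixin; it defines no tags hook."""


class OldCA(EigenvaluesMixin):
    """CA before the fix: no class in the MRO provides __sklearn_tags__."""


class NewCA(EigenvaluesMixin, BaseEstimatorStandIn):
    """CA after the fix: the stand-in base supplies __sklearn_tags__."""


def get_tags_sketch(estimator):
    """Hypothetical helper mimicking a tag lookup that requires the hook."""
    if not hasattr(estimator, "__sklearn_tags__"):
        raise AttributeError(
            f"{type(estimator).__name__!r} object has no attribute '__sklearn_tags__'"
        )
    return estimator.__sklearn_tags__()


# The pre-fix class fails the lookup; the post-fix class succeeds.
try:
    get_tags_sketch(OldCA())
    old_error = None
except AttributeError as exc:
    old_error = str(exc)

new_tags = get_tags_sketch(NewCA())
print(old_error)
print(new_tags)
```

The only difference between the two classes is the extra base class, which mirrors the shape of the change to ca.py.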

Solution

  • ca.py: Added sklearn.base.BaseEstimator to the CA class inheritance chain
  • mca.py: Reordered the inheritance to fix the resulting MRO (Method Resolution Order) conflict
    • Since CA now inherits from BaseEstimator, having MCA also directly inherit from BaseEstimator before CA caused an MRO conflict. The fix puts CA first (which brings in BaseEstimator), followed by TransformerMixin.
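
The MRO conflict described above can be reproduced with plain Python classes. This is a sketch under stated assumptions: the classes below are empty stand-ins for the scikit-learn and prince classes, and the exact base order used in ca.py and in the old mca.py is assumed rather than taken from the diff.

```python
class BaseEstimator:
    """Empty stand-in for sklearn.base.BaseEstimator."""


class TransformerMixin:
    """Empty stand-in for sklearn.base.TransformerMixin."""


class EigenvaluesMixin:
    """Empty stand-in for prince's utils.EigenvaluesMixin."""


class CA(EigenvaluesMixin, BaseEstimator):
    """CA after the fix (assumed base order)."""


# Listing BaseEstimator before CA asks Python to place BaseEstimator ahead of
# CA, while CA itself requires BaseEstimator to come after it. No linearization
# satisfies both constraints, so the class statement raises TypeError.
try:
    class BrokenMCA(BaseEstimator, TransformerMixin, CA):
        pass
    mro_error = None
except TypeError as exc:
    mro_error = str(exc)


class MCA(CA, TransformerMixin):
    """Reordered bases: CA first (bringing in BaseEstimator), then TransformerMixin."""


mca_mro = [cls.__name__ for cls in MCA.__mro__]
print(mro_error)
print(mca_mro)
```

With CA listed first, Python can linearize the hierarchy, and BaseEstimator appears exactly once in the resulting MRO.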

Classes Affected

| Class | Status | Notes |
| --- | --- | --- |
| CA | ✅ Fixed | Added BaseEstimator inheritance |
| MCA | ✅ Fixed | Reordered bases; inherits BaseEstimator via CA |
| PCA | ✅ Already OK | Already inherited from BaseEstimator |
| FAMD | ✅ Already OK | Inherits BaseEstimator via PCA |
| MFA | ✅ Already OK | Inherits BaseEstimator via PCA |
| GPA | ✅ Already OK | Already inherited from BaseEstimator |

Testing

✅ Verified all estimators work correctly with scikit-learn 1.7.0

Validation Code
import prince

dataset = prince.datasets.load_french_elections()
ca = prince.CA(
    n_components=3,
    n_iter=3,
    copy=True,
    check_input=True,
    engine='sklearn',
    random_state=42
)
ca = ca.fit(dataset)
print(ca.eigenvalues_summary)

Backward Compatibility

This change is backward compatible with older versions of scikit-learn. Inheriting from BaseEstimator is a standard practice for estimators that are compatible with sklearn and provides additional benefits like get_params() and set_params() methods.

Additional Notes

The code changes were proposed by Claude Opus 4.5 Thinking using the following prompt:

Prompt

I have forked the prince package repository, which is a package that implements statistical procedures for multivariate exploratory data in Python, such as PCA, CA, MCA, MFA, FAMD, and GPA. I would like your assistance with contributing to this open source package.

Problem Summary

The prince package is dependent on the scikit-learn package. However, the prince package has not been updated to comply with the new API introduced in the latest versions of scikit-learn. Therefore, I am receiving the following error anytime I invoke the fit() method of any of the analyses:

AttributeError: The following error was raised: 'CA' object has no attribute '__sklearn_tags__'. It seems that there are no classes that implement `__sklearn_tags__` in the MRO and/or all classes in the MRO call `super().__sklearn_tags__()`. Make sure to inherit from `BaseEstimator` which implements `__sklearn_tags__` (or alternatively define `__sklearn_tags__` but we don't recommend this approach). Note that `BaseEstimator` needs to be on the right side of other Mixins in the inheritance order.

Root Cause

This error is occurring because scikit-learn 1.6+ changed their internal API, replacing get_tags() with __sklearn_tags__(). Here are more details from the scikit-learn documentation and migration guide:

"""
The following change was introduced in version 1.6.0 of scikit-learn: __sklearn_tags__ was introduced for setting tags in estimators. Scikit-learn introduced estimator tags in version 0.21 as a private API and mostly used in tests. However, these tags expanded over time and many third party developers also need to use them. Therefore in version 1.6 the API for the tags was revamped and exposed as public API. The estimator tags are annotations of estimators that allow programmatic inspection of their capabilities, such as sparse matrix support, supported output types and supported methods. The estimator tags are an instance of Tags returned by the method __sklearn_tags__. These tags are used in different places, such as is_regressor or the common checks run by check_estimator and parametrize_with_checks, where tags determine which checks to run and what input data is appropriate. Tags can depend on estimator parameters or even system architecture and can in general only be determined at runtime and are therefore instance attributes rather than class attributes. It is unlikely that the default values for each tag will suit the needs of your specific estimator. You can change the default values by defining a __sklearn_tags__() method which returns the new values for your estimator’s tags.
"""

The error I have encountered is surfaced by the @utils.check_is_fitted decorator, which calls sklearn's check_is_fitted() validation, which in sklearn 1.6+ requires the __sklearn_tags__() method to exist.

Proposed Solution

  1. Investigate prince/utils.py to understand the EigenvaluesMixin class and the check_is_fitted decorator implementation.
  2. Ensure proper inheritance from sklearn's BaseEstimator. The fix should either:
    • Have CA inherit from BaseEstimator (with correct MRO order: class CA(utils.EigenvaluesMixin, BaseEstimator))
    • OR implement the __sklearn_tags__() method in the CA class or in EigenvaluesMixin
  3. Maintain backward compatibility with older sklearn versions by also keeping _more_tags() if needed.
  4. Apply the same fix to all estimator classes in the package: CA, MCA, PCA, FAMD, etc.

Testing

After making changes, verify the fix works with the following code:

import prince

dataset = prince.datasets.load_french_elections()
ca = prince.CA(
    n_components=3,
    n_iter=3,
    copy=True,
    check_input=True,
    engine='sklearn',
    random_state=42
)
ca = ca.fit(dataset)
print(ca.eigenvalues_summary)

Acceptance Criteria

  • ca.fit() works without AttributeError on scikit-learn 1.6+
  • all existing tests pass
  • backward compatibility maintained with scikit-learn < 1.6
  • fix applied consistently across all estimator classes in the package

@MaxHalford (Owner)

Looks like this won't hurt, thanks 👍

@MaxHalford MaxHalford merged commit 84beb41 into MaxHalford:master Jan 9, 2026
2 checks passed
@colelandolt colelandolt deleted the bugfix/colelandolt/scikit-learn-1.6-compatibility branch January 9, 2026 21:59