BUG: added support for ordered categoricals in kendall/spearman correlation #60493

Open
wants to merge 3 commits into base: main

@micheleuap micheleuap commented Dec 4, 2024

This adds support for Kendall and Spearman correlation on ordered categorical data. Behaviour is unchanged for Pearson correlation and for unordered categoricals.

As shown in the bug report, the following example

from pandas import DataFrame

df = DataFrame({'a': [1, 2, 3, 4], 'b': [4, 3, 2, 1]})
# Declare the category order explicitly: 4 < 3 < 2 < 1
df['b'] = df['b'].astype('category').cat.set_categories([4, 3, 2, 1], ordered=True)
df.corr(method='spearman')

used to return a correlation of -1 between a and b; going forward it will return 1. That is, the calculation no longer tries to convert the categories to floats and instead uses the declared order of the categories.
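To illustrate what "using the declared order" means, here is a minimal sketch that ranks an ordered categorical by its position codes before computing Spearman correlation. It is not necessarily how this PR implements the change; `ordered_codes` is a hypothetical helper introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [4, 3, 2, 1]})
df["b"] = df["b"].astype("category").cat.set_categories([4, 3, 2, 1], ordered=True)


def ordered_codes(series: pd.Series) -> pd.Series:
    """Hypothetical helper: map an ordered categorical to its position codes."""
    if not (isinstance(series.dtype, pd.CategoricalDtype) and series.cat.ordered):
        raise TypeError("expected an ordered categorical Series")
    codes = series.cat.codes.astype("float64")
    # A code of -1 marks a missing value; turn it into NaN so it is ignored.
    return codes.where(codes >= 0)


# Replace the categorical column with its position in the declared order,
# then compute the rank correlation on plain numeric data.
ranks = pd.DataFrame({"a": df["a"], "b": ordered_codes(df["b"])})
rho = ranks.corr(method="spearman").loc["a", "b"]
print(rho)
```

Because the declared order is 4 < 3 < 2 < 1, the values in `b` map to increasing positions, so `a` and `b` rise together and the rank correlation is positive rather than negative.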

@micheleuap micheleuap changed the title added support for ordered categoricals in kendall/spearman correlation BUG: added support for ordered categoricals in kendall/spearman correlation Dec 4, 2024
@micheleuap micheleuap marked this pull request as draft December 12, 2024 07:05
@micheleuap micheleuap marked this pull request as ready for review December 12, 2024 07:07
@micheleuap (Author) commented

The tests used to pass. The free-threading job started segfaulting after I changed the .rst file, so I highly doubt this is a problem with the PR.

Development

Successfully merging this pull request may close these issues.

BUG: spearman correlation doesn't work on non-numeric data