
ENH: Added isascii() string method fixing issue #59091 #60532

Open
wants to merge 37 commits into
base: main


avecasey

This pull request introduces the isascii() string method to the pandas API, enabling users to efficiently check whether all characters in each string are ASCII. The implementation leverages the pyarrow.compute.string_is_ascii function for high-performance processing when working with Arrow-backed pandas objects. The method has been integrated into the pandas string accessor (str) via the _map_and_wrap utility, ensuring it aligns with pandas' existing API design.

Summary of Changes:

Added the isascii method to the string accessor using _map_and_wrap for dynamic registration and documentation integration.
Implemented the core logic in _str_isascii, which uses pyarrow.compute.string_is_ascii.
Added corresponding tests to verify correctness, performance, and edge cases.
Updated the whatsnew documentation to include the new feature in the upcoming version notes.
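
For reviewers who want a quick picture of the intended element-wise semantics, here is a minimal sketch. It is not the code in this PR: it uses only Python's built-in `str.isascii()` rather than `pyarrow.compute.string_is_ascii`, and the helper name `isascii_elementwise` as well as the choice to propagate `None` for missing values are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def isascii_elementwise(values: Sequence[Optional[str]]) -> List[Optional[bool]]:
    """Hypothetical helper: check each string for ASCII-only content.

    Missing entries (None) are passed through as None, which is an
    assumption about how a string accessor method would treat them.
    """
    return [None if v is None else v.isascii() for v in values]


sample = ["pandas", "café", "", None, "123 abc!"]
result = isascii_elementwise(sample)
print(result)  # [True, False, True, None, True]
```

Note that Python's `str.isascii()` returns `True` for the empty string, so the sketch does too. Whether the Arrow-backed path agrees on every edge case is exactly what the added tests should pin down.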

This enhancement addresses requests for additional string validation methods (closes #59091) and aligns pandas with commonly needed text processing capabilities.

@WillAyd
Member

WillAyd commented Dec 23, 2024

I think this looks good in general. I don't think the CI failures are related - can you try merging in main to see if that fixes things?

Development

Successfully merging this pull request may close these issues.

BUG: String methods has no method "isascii()"
2 participants