
RandomForestClassifier slower than original sklearn #1050

Closed
matchyc opened this issue Oct 11, 2022 · 10 comments
Labels: bug (Something isn't working)

@matchyc

matchyc commented Oct 11, 2022

Describe the bug
RandomForestClassifier slower than original sklearn

To Reproduce
Steps to reproduce the behavior:

Before Intel One API acceleration, time elapsed is: 117.54324022123 seconds

After Intel One API acceleration, time elapsed is: 131.16063022613525 seconds

Code:
The main function trains an MLPClassifier, a RandomForestClassifier, and a DecisionTreeClassifier. Only the RandomForestClassifier is supported by Intel sklearnex, yet the patched run is much slower than the original scikit-learn package. I would expect it to be at least as fast.

    import time

    # Run main() once with stock scikit-learn.
    original_main_start = time.time()
    main()
    original_main_end = time.time()
    print(f'Before Intel One API acceleration, time elapsed is: {original_main_end - original_main_start} seconds')

    # Patch scikit-learn with sklearnex, re-import, and run main() again.
    from sklearnex import patch_sklearn
    from sklearnex.ensemble import RandomForestClassifier
    patch_sklearn()

    from sklearn.preprocessing import OneHotEncoder, LabelEncoder
    from sklearn import datasets, preprocessing, metrics
    import matplotlib.pyplot as plt
    from sklearn.neural_network import MLPClassifier
    from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier
    # from sklearn.ensemble import RandomForestClassifier
    import numpy as np
    import sklearn.linear_model as sk
    from pandas import DataFrame
    from sklearn.metrics import recall_score, precision_score, roc_curve, roc_auc_score

    intel_main_start = time.time()
    main()
    intel_main_end = time.time()
    print(f'After Intel One API acceleration, time elapsed is: {intel_main_end - intel_main_start} seconds')

Environment:

  • OS: Ubuntu 20.04 focal
  • Version: scikit-learn-intelex 2021.20221004.171638
  • python: 3.8
matchyc added the bug label on Oct 11, 2022
ahuber21 self-assigned this on Feb 23, 2023
@ahuber21
Contributor

Hi @matchyc, thanks for reporting. Just to give an update: the issue was reproduced, and it depends on the max_features parameter. For a small number of features per split (as is the case with the default setting sqrt(n_features)), our splitting algorithm is suboptimal. We're working on an update.
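
In the meantime, a rough way to see whether max_features is the culprit on your side is to time the patched RandomForestClassifier with the default "sqrt" setting against a larger value on the same data. The snippet below is only an illustrative sketch on synthetic data (make_classification, 300 trees, hypothetical sizes), not the reporter's workload, so absolute timings will differ:

    import time
    from sklearn.datasets import make_classification

    from sklearnex import patch_sklearn
    patch_sklearn()
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic stand-in for a many-feature dataset.
    X, y = make_classification(n_samples=50_000, n_features=200,
                               n_informative=50, random_state=0)

    for max_features in ("sqrt", 0.5):
        clf = RandomForestClassifier(n_estimators=300, max_features=max_features,
                                     n_jobs=-1, random_state=0)
        start = time.time()
        clf.fit(X, y)
        print(f"max_features={max_features}: {time.time() - start:.1f} s")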

@ahuber21
Contributor

ahuber21 commented Mar 2, 2023

Hi @matchyc and @Innixma, a quick update.
We have an option called memorySavingMode in the C++ backend that is hardcoded to False in scikit-learn-intelex: https://github.com/intel/scikit-learn-intelex/blob/master/daal4py/sklearn/ensemble/_forest.py#L254

It looks like the memorySavingMode=False default struggles with datasets that have many features. Running with memorySavingMode=True decreased my run time from 160 seconds to 11.6 seconds, which is about 3x faster than stock scikit-learn.
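
One way to try this before it is configurable through sklearnex is to call the daal4py training algorithm directly, where the flag can be passed explicitly. The sketch below is only illustrative: it uses synthetic data, and the parameter/result names (decision_forest_classification_training, memorySavingMode, nClasses, .prediction) follow my reading of the daal4py batch API, so please verify them against the installed version.

    import numpy as np
    import daal4py as d4p
    from sklearn.datasets import make_classification

    # Synthetic binary-classification data as a stand-in for a real workload.
    X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)
    X = np.ascontiguousarray(X, dtype=np.float64)
    y = np.ascontiguousarray(y, dtype=np.float64).reshape(-1, 1)

    # Train a decision forest with memory saving mode enabled (parameter name
    # assumed from the oneDAL/daal4py batch API; verify against your version).
    train_algo = d4p.decision_forest_classification_training(
        nClasses=2,
        nTrees=300,
        memorySavingMode=True,
    )
    train_result = train_algo.compute(X, y)

    # Predict on the training data just to exercise the resulting model.
    predict_algo = d4p.decision_forest_classification_prediction(nClasses=2)
    predictions = predict_algo.compute(X, train_result.model).prediction
    print("train accuracy:", float((predictions.ravel() == y.ravel()).mean()))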

Do you have the resources to give this a shot? I'm working on understanding this behavior in greater detail, and additional input from your side would be very helpful.
For instance, I print out the depth of every tree in the code snippet below, and I see that running with memorySavingMode=True produces consistently shallower trees. That surprises me and still needs to be understood.
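
For reference, the comparison was produced by a script of roughly the shape sketched below. This is a reconstruction rather than the actual rf_slow.py: it substitutes a synthetic dataset for the kddcup upselling data, assumes memorySavingMode was switched to True at the line linked above for the "optimized" run, and assumes the patched estimator exposes estimators_ with get_depth() like stock scikit-learn.

    import time
    from sklearn.datasets import make_classification

    # Synthetic stand-in for the kddcup upselling data used in the real run.
    X, y = make_classification(n_samples=100_000, n_features=200,
                               n_informative=50, random_state=0)

    def train_and_report(label, rf_cls):
        clf = rf_cls(n_estimators=300, n_jobs=-1, random_state=0)
        start = time.time()
        clf.fit(X, y)
        print(label)
        print(f"Took {time.time() - start:.2f} seconds")
        # Depth of every tree in the forest (assumes estimators_ is exposed).
        print([est.get_depth() for est in clf.estimators_])

    # "Optimized" run: sklearnex-patched RF, with memorySavingMode assumed to
    # have been set to True at the daal4py line linked above.
    from sklearnex import patch_sklearn, unpatch_sklearn
    patch_sklearn()
    from sklearn.ensemble import RandomForestClassifier as PatchedRF
    train_and_report("Train optimized", PatchedRF)

    # Stock run.
    unpatch_sklearn()
    from sklearn.ensemble import RandomForestClassifier as StockRF
    train_and_report("Train stock", StockRF)

    print("Training done.")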

$ python rf_slow.py
Train optimized
Took 11.61 seconds
[16, 11, 31, 22, 18, 30, 11, 15, 29, 18, 17, 33, 29, 25, 29, 25, 24, 12, 29, 23, 31, 31, 27, 18, 13, 29, 20, 19, 29, 20, 12, 19, 25, 15, 38, 18, 11, 27, 30, 35, 31, 24, 17, 32, 31, 27, 23, 30, 21, 13, 3, 22, 20, 19, 23, 24, 22, 18, 8, 30, 24, 25, 10, 19, 19, 8, 11, 28, 21, 35, 23, 25, 13, 23, 24, 29, 31, 23, 22, 26, 23, 9, 13, 10, 27, 20, 29, 25, 24, 25, 17, 32, 20, 35, 24, 26, 31, 22, 25, 27, 30, 4, 33, 9, 22, 23, 34, 28, 23, 8, 22, 20, 25, 27, 17, 5, 25, 29, 18, 15, 23, 22, 23, 32, 28, 19, 30, 24, 5, 24, 27, 8, 29, 25, 13, 22, 25, 19, 19, 10, 24, 10, 25, 26, 23, 23, 24, 21, 30, 25, 31, 28, 30, 9, 14, 12, 33, 22, 13, 9, 21, 23, 3, 17, 8, 25, 17, 30, 24, 26, 23, 5, 23, 21, 36, 27, 13, 18, 13, 23, 26, 24, 7, 9, 17, 12, 27, 4, 14, 11, 4, 24, 30, 25, 17, 29, 25, 29, 33, 11, 39, 26, 20, 19, 32, 29, 15, 34, 20, 24, 23, 27, 27, 27, 30, 19, 26, 27, 15, 26, 14, 11, 31, 15, 27, 25, 26, 25, 20, 20, 25, 37, 34, 32, 39, 31, 21, 28, 10, 28, 20, 18, 31, 25, 9, 26, 11, 23, 22, 21, 4, 13, 29, 21, 37, 23, 17, 15, 33, 26, 25, 19, 26, 20, 32, 28, 19, 8, 13, 28, 34, 28, 25, 25, 21, 42, 17, 32, 9, 24, 9, 24, 22, 4, 22, 25, 21, 30, 19, 21, 23, 29, 24, 23, 20, 31, 17, 22, 24, 22]
Train stock
Took 29.54 seconds
[83, 65, 56, 78, 85, 57, 94, 65, 120, 68, 61, 95, 63, 67, 76, 65, 64, 79, 86, 72, 91, 52, 84, 90, 60, 58, 53, 74, 65, 68, 69, 66, 77, 79, 59, 70, 84, 66, 74, 61, 71, 73, 51, 68, 58, 83, 52, 87, 69, 69, 72, 97, 88, 82, 60, 63, 76, 69, 66, 70, 92, 61, 85, 74, 86, 112, 80, 58, 57, 73, 60, 63, 80, 86, 63, 126, 78, 69, 66, 73, 74, 64, 59, 78, 64, 72, 115, 65, 94, 88, 59, 81, 70, 64, 62, 73, 64, 71, 57, 85, 102, 62, 59, 91, 92, 71, 96, 81, 74, 70, 88, 72, 77, 78, 78, 60, 52, 64, 85, 71, 68, 98, 80, 74, 59, 87, 66, 75, 60, 95, 84, 85, 66, 56, 98, 93, 92, 75, 72, 65, 91, 56, 121, 71, 71, 77, 80, 89, 82, 67, 76, 84, 84, 79, 100, 67, 72, 72, 65, 67, 70, 76, 63, 74, 74, 81, 71, 72, 86, 77, 78, 109, 69, 71, 73, 75, 59, 74, 67, 80, 81, 57, 75, 73, 74, 73, 83, 97, 90, 85, 68, 72, 69, 89, 66, 70, 57, 135, 70, 70, 63, 83, 57, 69, 69, 74, 70, 76, 65, 60, 87, 80, 67, 66, 67, 136, 64, 73, 71, 100, 77, 81, 58, 73, 77, 83, 64, 63, 81, 98, 65, 74, 68, 64, 68, 60, 75, 79, 95, 76, 70, 74, 88, 66, 68, 74, 94, 67, 85, 75, 94, 74, 72, 92, 77, 71, 80, 66, 65, 70, 90, 74, 84, 54, 109, 71, 60, 66, 77, 71, 62, 85, 92, 92, 85, 86, 71, 104, 64, 67, 71, 67, 70, 82, 53, 85, 95, 66, 70, 79, 79, 62, 70, 62, 88, 64, 90, 57, 86, 62]
Training done.

It would be great to hear whether you observe any other oddities when enabling memory saving mode.

PS: @Innixma, the above example is training 300 estimators on the kddcup upselling data set. All in all, RAM consumption is about the same as stock scikit-learn, roughly 15 GB in total.

@Innixma

Innixma commented Mar 7, 2023

@ahuber21 I'd be happy to test, assuming the issue of RandomForest predictive performance not aligning between scikit-learn and intelex has been resolved.

If the predictive performance is identical given identical hyperparameters, then training/inference speedups would be meaningful and I'd try to find time to benchmark it, but I'd like to confirm that the accuracy delta has been fixed first.

@Innixma

Innixma commented Mar 7, 2023

For example, it appears that users are still reporting model quality deltas for RF: #1090

@ahuber21
Contributor

ahuber21 commented Mar 7, 2023

Fair point. I've actually made quite a bit of progress in the meantime and hope to provide more details soon. Naturally, I will be testing model accuracy after applying my changes, so I hope I can comment on #1090 as well.

@smith558

Same issue. Training a RandomForestClassifier on a dataset with many features shows a noticeable increase in execution time with sklearnex compared to standard scikit-learn.

@ahuber21
Contributor

ahuber21 commented Mar 23, 2023

@smith558 we're finalizing checks on uxlfoundation/oneDAL#2292. Please check the next release for an update. If the problem persists, please post a reproducer. Thanks!

@syakov-intel

Please reopen if the issue persists.

@smith558

@syakov-intel, when is the next release planned?

@ahuber21
Contributor

ahuber21 commented Apr 20, 2023

The fix from uxlfoundation/oneDAL#2292 is included in 2023.1.1. Feel free to try it out: https://anaconda.org/intel/scikit-learn-intelex
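
After upgrading to 2023.1.1 (for example from the conda channel above), a quick sanity check that the patched RandomForestClassifier is actually picked up could look like this:

    # Sanity check after upgrading: patch_sklearn() logs a message confirming
    # that the extension is enabled.
    from sklearnex import patch_sklearn
    patch_sklearn()

    from sklearn.ensemble import RandomForestClassifier
    # With the patch active, this should point at the sklearnex/daal4py
    # implementation rather than sklearn.ensemble._forest.
    print(RandomForestClassifier.__module__)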
