Run evaluation in Docker #290
Comments
cc @xNJL @guillaume-le-fur who encountered this vulnerability during the last event. If you have other suggestions for improving things security-wise, don't hesitate.
So we need to:
But I agree that already running the worker in a Docker container would remove some security issues.
Thumbs up for separating training and scoring, also for efficiency (scoring can take time, and right now it blocks the dispatcher). We'd need to restructure the ramp-workflow script, but that's relatively straightforward. The two are done together now so that one can choose not to save predictions and reload them. Separating them will also help make ramp-workflow more readable.
I don't know if there is already an issue for it, but it would be good to run submissions in a Docker container. That would allow limiting the amount of resources (CPU, memory) a submission can use and applying other restrictions (e.g. removing network access).
Step 1 could be to add another worker setup that runs the same conda worker, but inside Docker. One could mount the relevant folders with miniconda and data. Very roughly something like,
I think by mounting the right folders, one might even use default docker images.
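To make step 1 a bit more concrete, here is a minimal sketch of how a worker might assemble the `docker run` call. It is only an illustration, not how the existing workers are implemented: the function name, the `/data` and `/submission` mount points, the default limits, and the command run inside the container are all hypothetical. Only the Docker flags themselves (`--rm`, `--network none`, `--cpus`, `--memory`, `-v ...:ro`) are standard. The sketch mounts miniconda at the same path as on the host, on the assumption that the conda install embeds absolute paths.

```python
import shlex


def build_docker_command(image, conda_dir, data_dir, submission_dir,
                         command, cpus=2.0, memory="4g"):
    """Build a `docker run` argument list for one submission.

    All names and mount points here are hypothetical; adapt them to the
    actual worker layout.
    """
    return [
        "docker", "run", "--rm",
        # No network access from inside the container.
        "--network", "none",
        # Resource limits for the submission.
        "--cpus", str(cpus),
        "--memory", memory,
        # Read-only miniconda, mounted at the same path as on the host.
        "-v", f"{conda_dir}:{conda_dir}:ro",
        # Read-only data (hypothetical mount point).
        "-v", f"{data_dir}:/data:ro",
        # Writable folder for the submission and its outputs.
        "-v", f"{submission_dir}:/submission",
        image,
        *command,
    ]


if __name__ == "__main__":
    cmd = build_docker_command(
        image="ubuntu:22.04",                # a default image, as suggested
        conda_dir="/home/ramp/miniconda3",   # placeholder path
        data_dir="/srv/ramp/data",           # placeholder path
        submission_dir="/srv/ramp/submissions/example",  # placeholder path
        command=["python", "train.py"],      # placeholder command
    )
    # Print a copy-pastable shell command; pass `cmd` to subprocess.run
    # to actually launch the container.
    print(shlex.join(cmd))
```

Building the command as a list (rather than a shell string) keeps paths with spaces safe when it is handed to `subprocess.run`.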
This would help with resource limits, but not with access to hidden test data. Since the data will be present on the filesystem, users can access it (and this is what is happening in the current teaching event we are doing with @massich and @mathurinm).
Step 2 would be to mount only the features of the hidden test set (i.e. without the target column) inside Docker, compute predictions, then score the final predictions in a separate Docker environment, so that the target column cannot, in principle, be accessed by users.
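As a rough illustration of step 2, the sketch below splits a hidden test set into a features-only file (the one that would be mounted in the submission container) and a target-only file (kept for the scoring environment), then scores saved predictions against the target. It assumes the data is a CSV file with a single target column and uses plain accuracy as a stand-in metric; the function names and file layout are hypothetical and do not reflect how ramp-workflow stores data or computes scores.

```python
import csv


def split_features_and_target(test_csv, features_csv, target_csv,
                              target_column):
    """Write the hidden test set as two files: features only, target only."""
    with open(test_csv, newline="") as src, \
            open(features_csv, "w", newline="") as feat, \
            open(target_csv, "w", newline="") as targ:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        if target_column not in fieldnames:
            raise ValueError(f"column {target_column!r} not found")
        feature_names = [n for n in fieldnames if n != target_column]
        feat_writer = csv.DictWriter(feat, fieldnames=feature_names)
        targ_writer = csv.DictWriter(targ, fieldnames=[target_column])
        feat_writer.writeheader()
        targ_writer.writeheader()
        for row in reader:
            # Remove the target before the row reaches the features file.
            target_value = row.pop(target_column)
            feat_writer.writerow(row)
            targ_writer.writerow({target_column: target_value})


def accuracy(predictions_csv, target_csv, column):
    """Score saved predictions against the target (runs outside the
    submission container, where the target file is available)."""
    with open(predictions_csv, newline="") as p, \
            open(target_csv, newline="") as t:
        preds = [r[column] for r in csv.DictReader(p)]
        truth = [r[column] for r in csv.DictReader(t)]
    if not truth or len(preds) != len(truth):
        raise ValueError("predictions and target must be non-empty "
                         "and of equal length")
    return sum(a == b for a, b in zip(preds, truth)) / len(truth)
```

With this split, only the features file needs to be visible to user code; the predictions written by the submission are the only thing passed to the scoring step.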
@glemaitre please comment if I forgot something (I have not looked in detail into how workers are implemented).
cc @maikia