Make "Trusted Publishing" works for our PyPI releasing #41937

Open
potiuk opened this issue Sep 1, 2024 · 12 comments
Assignees
Labels
area:dev-env CI, pre-commit, pylint and other changes that do not change the behavior of the final code area:dev-tools

Comments

@potiuk
Member

potiuk commented Sep 1, 2024

We have already agreed with ASF Infrastructure that it would be great to use Trusted Publishing to publish packages to PyPI. Currently, release managers use "twine" with local API keys, but https://docs.pypi.org/trusted-publishers/ allows us to configure our PyPI organisation to accept packages published by dedicated GitHub Actions workflows, with GitHub Actions acting as the trusted publisher.

The documentation explains how to do it - we will need to involve INFRA to configure it for our repository.

The idea is that the GitHub Actions workflow should not "build" the packages to publish to PyPI. Instead, it should download them from "https://downloads.apache.org/" (or from "https://dist.apache.org/repos/dist/dev/airflow/" for RC/Alpha/Beta packages), verify their integrity (checksums/signatures) similarly to https://github.com/apache/airflow/blob/main/dev/README_RELEASE_AIRFLOW.md, and publish those packages.

Instead of running "twine upload", the release manager should just run a GitHub Actions workflow that downloads the packages from Apache SVN/downloads and publishes them to PyPI after verification.
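
The checksum part of that verification could look roughly like the sketch below. It is not the project's actual tooling: the function names `parse_sha512_file` and `verify_sha512` are hypothetical, and the sketch assumes the `.sha512` file uses the `sha512sum` output format (hex digest, whitespace, file name). Signature (GPG) verification is a separate step and is not shown.

```python
import hashlib
import hmac


def parse_sha512_file(text: str) -> str:
    """Return the hex digest from a `sha512sum`-style line: '<hex>  <filename>'.

    Assumption: the checksum file starts with the digest in that format.
    """
    parts = text.split()
    if not parts:
        raise ValueError("empty checksum file")
    digest = parts[0].lower()
    if len(digest) != 128 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("not a SHA-512 hex digest")
    return digest


def verify_sha512(artifact: bytes, checksum_text: str) -> bool:
    """Check that the artifact bytes hash to the digest in the checksum file."""
    expected = parse_sha512_file(checksum_text)
    actual = hashlib.sha512(artifact).hexdigest()
    # compare_digest avoids leaking where the two strings differ
    return hmac.compare_digest(actual, expected)
```

A workflow step would read the downloaded artifact and its `.sha512` companion, call `verify_sha512`, and stop before publishing if it returns `False`.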

@potiuk potiuk converted this from a draft issue Sep 1, 2024
@potiuk potiuk added the area:dev-env CI, pre-commit, pylint and other changes that do not change the behavior of the final code label Sep 5, 2024
@potiuk
Member Author

potiuk commented Oct 27, 2024

Just to add - example changes in workflows proposed by pip pypa/pip#13048

@gopidesupavan
Member

@potiuk is this ready to be picked up? Just checking.

@potiuk
Member Author

potiuk commented Nov 26, 2024

Yep.

@gopidesupavan
Member

Thanks. To my understanding, this should be implemented for all the packages that we release today: airflow, providers, and the airflow python client?

I’m planning to implement it in a generalized way, so that it can be useful for other Apache projects. This would allow them to integrate their own custom scripts, if they have any, and run them as a GitHub Action. WDYT? :)

@potiuk
Member Author

potiuk commented Nov 29, 2024

Thanks. To my understanding, this should be implemented for all the packages that we release today: airflow, providers, and the airflow python client?

Absolutely. We can start with providers - then we will have a chance to test it quickly - and then use it for the others.

I’m planning to implement it in a generalized way, so that it can be useful for other Apache projects. This would allow them to integrate their own custom scripts, if they have any, and run them as a GitHub Action. WDYT? :)

Perfect. ASF infrastructure already has a repo for shared actions in ASF: https://github.com/apache/infrastructure-actions and we are just discussing splitting it into separate actions, but the idea is to have something that will be reusable across many projects.

I think the best (and most reusable) way of publishing is to use packages released in "svn".

We should be able to plug in this step in the release process. And we have two different steps there:

  1. RC candidates.

We should just download all packages from https://dist.apache.org/repos/dist/dev/airflow/providers/pypi/

Yes, the pypi folder is really useful and makes automated verification simple.

But we need to make sure that the "pypi" packages are also stored in SVN. The small difficulty is that we currently do not upload "pypi" RC packages to SVN. The RC candidates in SVN are different from the ones we publish to PyPI, because they do not contain "rc" in the version (because potentially they might become final candidates to upload).

So we should modify the process and make sure that we upload "pypi" packages to SVN. I propose to add a "pypi" subfolder, so the release process should be updated to clean/recreate the pypi subfolder and push the pypi packages there.

Then our action should be as simple as: a) download the right packages from SVN, b) push them to PyPI via trusted publishing. I have a feeling that this might simply be a standard YAML composite action using existing actions from GitHub - nothing fancy, we might just add a few options:

a) test mode - do everything, except that the final step just prints what would be done
b) verification options - it should be possible to also download and verify the signatures, checksums (and later maybe licences) in the downloaded artifacts.
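
The "test mode" option can be sketched as a flag that swaps the final upload for a printed report. This is only an illustration: `publish_packages` and its `upload` callback are hypothetical names, and the real publishing step (trusted publishing from a GitHub Actions workflow) is represented by whatever callable is passed in.

```python
from typing import Callable, Iterable


def publish_packages(
    files: Iterable[str],
    upload: Callable[[str], None],
    dry_run: bool = True,
) -> list[str]:
    """Upload each file, or only report what would be uploaded in test mode.

    `upload` is a stand-in for the real publishing step (hypothetical here).
    Returns the list of files that were uploaded or would have been.
    """
    handled = []
    for path in sorted(files):
        if dry_run:
            print(f"[test mode] would publish {path}")
        else:
            upload(path)
        handled.append(path)
    return handled
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch: because PyPI uploads cannot be undone, publishing has to be requested explicitly.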

  2. Final packages:

Final packages can be downloaded directly, as described in https://github.com/apache/airflow/blob/main/dev/README_RELEASE_PROVIDER_PACKAGES.md#publish-release-to-svn - i.e. download and push packages from https://dist.apache.org/repos/dist/release/airflow/providers/ - the packages are the same as the ones uploaded to PyPI.

The small difficulty there is that we need to know WHICH packages to upload. I think this means the easiest approach is to do it just before this step:

for file in "${SOURCE_DIR}"/*
do
 # Strip the "rcN" suffix from the file name (the pattern matches a single digit)
 base_file=$(basename -- "${file}")
 cp -v "${file}" "${AIRFLOW_REPO_ROOT}/dist/${base_file//rc[0-9]/}"
 svn mv "${file}" "${base_file//rc[0-9]/}"
done

This makes sense to me and makes it easier to compare. This is where I’m considering providing a flexible option in the GitHub Action to allow adding any extra scripts/custom code for validation.

And upload them from the "dist" SVN, not from "release". At this point the release manager has already removed the files that were dropped during voting, so "dist" contains only the packages that will shortly be promoted to "final" packages. The files are still in "dev", and those are the only ones we should upload to PyPI. "release" will contain the previously uploaded latest version of every package, even those we are not releasing now.

There, likely we need some controls - for example, being able to manually override which packages we want to publish.
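
One possible shape for such a manual override is a pair of include/exclude pattern lists applied to the file names found in SVN. The sketch below is an assumption about how this could work, not an agreed design; `select_packages` and its parameters are hypothetical, as are the file names in the usage note.

```python
from fnmatch import fnmatchcase


def select_packages(available, include=None, exclude=None):
    """Pick the files to publish from `available`.

    `include` and `exclude` are lists of shell-style patterns (hypothetical
    workflow inputs). With no `include`, every file is a candidate.
    """
    include = include or ["*"]
    exclude = exclude or []
    return [
        name
        for name in available
        if any(fnmatchcase(name, pat) for pat in include)
        and not any(fnmatchcase(name, pat) for pat in exclude)
    ]
```

For example, `select_packages(files, exclude=["*amazon*"])` would publish everything except files whose name contains `amazon`.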

We might start with a simple set of features, but later on that action might become a little more feature-full.

Yes, I’ll start with a simple approach, review the design later, and build on it incrementally.

@potiuk
Member Author

potiuk commented Nov 29, 2024

BTW. Speaking of licences - @Claudenw is working on a new version of RAT that should allow us to verify licences without unpacking the files, so once this is released (I guess very soon) we will be able to use it in the GitHub Action and do the source licence verification. One of the options we will have to add here is the ability to provide an extra "ratignore" file.

@potiuk
Member Author

potiuk commented Nov 29, 2024

Also later we might add reproducibility check - so we will be able to build the same packages locally on GitHub Actions and compare them with the ones we pulled from SVN - that would be an ultimate check for reproducibility - whether there are no MITM attacks etc. and will be ultimately a way to verify if the future platform running ATR (Apache Trusted Releases) build has not been compromised.
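
A reproducibility check of this kind comes down to comparing digests of the rebuilt files with those of the files pulled from SVN. The following is a minimal sketch under the assumption that both sets of artifacts sit in two local directories; `digest_dir` and `compare_builds` are hypothetical helper names, and SHA-256 is chosen here only for illustration.

```python
import hashlib
from pathlib import Path


def digest_dir(directory: Path) -> dict[str, str]:
    """Map each file name in `directory` to its SHA-256 hex digest."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(directory.iterdir())
        if p.is_file()
    }


def compare_builds(svn_dir: Path, rebuilt_dir: Path) -> list[str]:
    """Return names of files that are missing or differ between the two dirs."""
    svn, rebuilt = digest_dir(svn_dir), digest_dir(rebuilt_dir)
    return sorted(
        name
        for name in svn.keys() | rebuilt.keys()
        if svn.get(name) != rebuilt.get(name)
    )
```

An empty result means every artifact was reproduced byte for byte; any name in the list points at a file that is missing on one side or differs.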

@gopidesupavan
Member

That's fantastic :) I was reviewing the current release process steps and had a few questions about the packages published to SVN versus those published to PyPI. Thanks for clarifying, it's much clearer now.

The idea of a test/verification mode is excellent!

@potiuk
Member Author

potiuk commented Nov 29, 2024

Yeah. The main thing is that the "RC" packages we keep in SVN do not have RC in the version, as they can "eventually" become final packages once voted on (we just move them to the "release" SVN when they graduate). But we cannot upload them to PyPI, because PyPI packages are "immutable", so we need to prepare the "pypi" version of the same RC.

BTW. This RC variation MIGHT change in the future - the PyPI team has two things they are working on:

  • https://peps.python.org/pep-0759/ PEP 759 – External Wheel Hosting - so we might be able to just upload metadata to PyPI and host our packages directly from SVN
  • (no Draft PEP yet - but there are discussions) - allow "staging" releases of packages - i.e. upload packages with pre-release versions as the "same" packages but not "finalize" them - immutability will come when the package is put in the "released" stage

@gopidesupavan gopidesupavan self-assigned this Nov 29, 2024
@gopidesupavan gopidesupavan moved this from Ready to In progress in CI / DEV ENV planned work Nov 29, 2024
@gopidesupavan
Member

gopidesupavan commented Dec 1, 2024

Alright, I have a basic workflow for this feature. It does not publish anything yet, but it checks out the packages from SVN and runs the checksum check and the SVN check. If this workflow design and approach make sense, we can extend it further.

Action repo:
gopidesupavan/gh-svn-pypi-publisher#1

This is usage repo:
config: https://github.com/gopidesupavan/example-workspace/blob/main/publish-config.yml
In this config I have referred to the Apache SVN repo with the release provider folder for testing, but that can be changed to the pypi folder. :)

sample github action workflow: https://github.com/gopidesupavan/example-workspace/blob/main/.github/workflows/custom-action-example.yml

SVN check:
It takes all the extensions from the config file and checks if the package has all the required extensions or not.

Checksum check:
It takes the checksum type from the config file and checks if the package has the required checksum or not.
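
To illustrate the SVN check described above, here is a rough sketch. It does not reproduce the code in the linked action repo; `missing_companions` is a hypothetical name, and the sketch assumes that each configured extension denotes a companion file named `<artifact>.<ext>` (for example `pkg.tar.gz.asc`).

```python
def missing_companions(files, required_exts):
    """For each artifact, list required companion files that are absent.

    Assumption: a companion is named '<artifact>.<ext>', e.g. 'pkg.tar.gz.asc'.
    """
    names = set(files)
    suffixes = tuple(f".{ext}" for ext in required_exts)
    # Anything that is not itself a companion file counts as an artifact
    artifacts = sorted(n for n in names if not n.endswith(suffixes))
    missing = {}
    for artifact in artifacts:
        absent = [ext for ext in required_exts if f"{artifact}.{ext}" not in names]
        if absent:
            missing[artifact] = absent
    return missing
```

An empty dictionary means every artifact has all of its required companion files; otherwise each key names an artifact and its value lists the extensions that were not found.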

Projects
Status: In progress
Development

No branches or pull requests

2 participants