Add HTTP/3 SQL query #111
I do think this is the correct approach. Some further thoughts:
I'm not sure that's quite true yet. The June crawl is just finishing but for the May crawl I saw very little But anyway, I agree it's a short term measure and
Actually checking the
Let's take this point offline, as it's not really related to this PR. We can explore it as part of the HTTP chapter of the Web Almanac. I would ask, though: do you have any evidence to suggest it will abandon connections to switch to HTTP/3 connections? That seems inefficient when a connection is already established.
Co-authored-by: Rick Viscomi <[email protected]>
Just to follow up, I did some more digging on the "dip":
select '2020_05_01' as date, count(1) as NumResponses,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_05_01_desktop`
UNION ALL
select '2020_06_01' as date, count(1) as NumResponses,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_06_01_desktop`
UNION ALL
select '2020_07_01' as date, count(1) as NumResponses,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_07_01_desktop`
UNION ALL
select '2020_08_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_08_01_desktop`
UNION ALL
select '2020_09_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_09_01_desktop`
UNION ALL
select '2020_10_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_10_01_desktop`
UNION ALL
select '2020_11_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_11_01_desktop`
UNION ALL
select '2020_12_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2020_12_01_desktop`
UNION ALL
select '2021_01_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2021_01_01_desktop`
UNION ALL
select '2021_02_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2021_02_01_desktop`
UNION ALL
select '2021_03_01', count(1),
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3%') as anyh3,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-27=%') as h327,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') LIKE '%h3-29=%') as h329,
COUNTIF(REGEXP_EXTRACT(REGEXP_EXTRACT(respOtherHeaders, r'alt-svc = (.*)'), r'(.*?)(?:, [^ ]* = .*)?$') = 'clear') as clear,
COUNTIF(respOtherHeaders LIKE '%alt-svc%') as altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%') as cloudflare,
COUNTIF(lower(_cdn_provider) LIKE '%cloudflare%' AND respOtherHeaders LIKE '%alt-svc%') as cloudflare_altsvc,
COUNTIF(lower(_cdn_provider) LIKE '%google%') as google,
COUNTIF(lower(_cdn_provider) LIKE '%google%' AND respOtherHeaders LIKE '%alt-svc%') as google_altsvc
from `httparchive.summary_requests.2021_03_01_desktop`
order by date
The drop from August 2020 looks to be with Google disabling And similarly, apparently Cloudflare disabled QUIC for a bit while investigating issues, which is what we see in Nov 2020 - Jan 2021. Either way, our metrics are also reflected in graphs from other providers, so I do think our graph is accurately reflecting support.
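For anyone wanting to sanity-check the nested `REGEXP_EXTRACT` calls in the query above without running BigQuery, here is a minimal Python sketch of the same two-step extraction. It assumes `respOtherHeaders` is a single string of `name = value` pairs separated by `, `, which is what the patterns in the query imply. The function names and the sample header strings in the usage note are hypothetical, not taken from the crawl data.

```python
import re

# The same two patterns the query passes to REGEXP_EXTRACT.
ALT_SVC_START = re.compile(r"alt-svc = (.*)")
ALT_SVC_VALUE = re.compile(r"(.*?)(?:, [^ ]* = .*)?$")


def extract_alt_svc(resp_other_headers):
    """Return the alt-svc value from a 'name = value, name = value' string, or None."""
    start = ALT_SVC_START.search(resp_other_headers)
    if start is None:
        return None
    # Lazily capture up to the next ", name = value" pair, or to the end of
    # the string if alt-svc is the last header.
    return ALT_SVC_VALUE.search(start.group(1)).group(1)


def classify(resp_other_headers):
    """Compute, for one response, the flags the query counts with COUNTIF."""
    value = extract_alt_svc(resp_other_headers)
    return {
        "anyh3": value is not None and "h3" in value,
        "h327": value is not None and "h3-27=" in value,
        "h329": value is not None and "h3-29=" in value,
        "clear": value == "clear",
        "altsvc": "alt-svc" in resp_other_headers,
    }
```

For example, `classify('server = x, alt-svc = h3-29=":443"; ma=86400, cf-ray = abc')` sets `anyh3` and `h329` but not `h327`, because the second pattern stops the captured value before the following `cf-ray` header.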
HTTP/3 is here! Well, QUIC is, and HTTP/3 is not far behind. We've already noticed HTTP/2 dropping because of this, so I think we should add an HTTP/3 graph. Adding the query is the first part of this.
There are a few complexities to consider compared to the equivalent HTTP/2 graph:
- The HTTP/2 query is ridiculously expensive (see #110). I've gone for the cheaper suggestion for HTTP/3, since we don't need the historical data from the period when the data was wrong. It's not quite as cheap as the HTTP/2 suggestion, as it needs to look at the `alt-svc` response header (for reasons I'll discuss below), but it's still only $15 as opposed to $1,000+.
- HTTP/3 has been around for a while in pre-release versions, and arguably Google's QUIC could be included in this, going back nearly 10 years. However, I've decided to limit this to HTTP/3 (aka `h3`) plus the last draft version (`h3-29`), as that is what will become HTTP/3 as soon as the IETF signs off the RFCs, so it is HTTP/3 in all but name. It is also what is being used now, and so explains the offset from HTTP/2 dropping, whereas waiting for `h3` would show a delay between that drop and this ramp-up. The `h3-29` version started appearing from June 2020, so the graph will start before HTTP/3 officially exists, but I think that's OK. This sheet ran the query for a few combinations, and current usage ranges from 0% (pure HTTP/3 or `h3`), to 9-10% (HTTP/3, `h3` or `h3-29`), to a slightly higher 9-10% for all versions of `h3`, and 11.5% if we also include Google's QUIC. So I think the second option is the right one.
- HTTP/3 will often not be used by our crawl. This is because, by default, the browser uses TCP and then receives an `alt-svc` header saying "hey, I support HTTP/3, so next time you're talking to me why not use that?" If it needs to open another connection (e.g. an uncredentialled connection for Fetch or the like) then it might use HTTP/3, but it's not going to be consistent. This is different from HTTP/2, which is negotiated as part of HTTPS and so should be used if supported. So, rather than measure HTTP/3 usage, I've decided to measure HTTP/3 support by looking at the protocol used OR `alt-svc` support. I got some support online for this approach. We will need to add an explanation to the chart when we add this to the site, to explain this difference from HTTP/2 and how it may double count usage in both the HTTP/2 and HTTP/3 charts for some sites, but I think it's the right measure.