HTTP Archive 💾
@httparchive.org
Public dataset that tracks how the web is built. Maintained by @patmeenan.com, @paulcalvano.bsky.social, @tunetheweb.com, @maxostapenko.com, and Nurullah Demir
Pinned
What do you think? Shall we do another Web Almanac this year? Diving deep into all the data we collect to see what’s changing in web trends?

Check out this post if interested in getting involved:
github.com/HTTPArchive/...

Or nominate your favorite experts that you’d love to see author a chapter.
Contribute to the 2025 Web Almanac · HTTPArchive almanac.httparchive.org · Discussion #4062
Dear all, We are excited to announce the Call for Contributions for the 2025 Web Almanac (6th Edition)! The Web Almanac is an annual report that provides an overview of the state of the web, based ...
github.com
There are still more things we want to add, and please do let us know what YOU want, but we've achieved feature parity with the old report (and then some!) so it's time to make the commitment and move forward!

Let us know any issues or requests here:
github.com/HTTPArchive/...
A massive "thank you!" to @fossheim.bsky.social for creating such a beautiful front end, and to @maxostapenko.com for doing all the backend work!

🙏 Thank you both 🙏
We've taken the new version of the Core Web Vitals Tech Report out of Beta and now consider this the report to use:

httparchive.org/reports/tech...

Anyone still using the old Looker Studio-based report, it's time to check out the new goodness with much quicker response times and improved UX!
Reposted by HTTP Archive 💾
New @httparchive.org analysis about sites adding AI Bots and Crawlers to their robots.txt files. While robots.txt doesn't "block" bots by itself, it's a clear demonstration of the preferences of site owners and the sentiment towards AI crawlers on today's web. paulcalvano.com/2025-08-21-a...
AI Bots and Robots.txt
There’s been a lot of discussion lately around AI crawlers and bots, which are used to train LLMs and/or fetch content on behalf of their users. In the past few weeks I’ve seen blog posts about the am...
paulcalvano.com
We're pleased to welcome Nurullah Demir as the newest maintainer of the HTTP Archive project!

Nurullah has helped lead the Web Almanac over the last two years and we look forward to seeing what this year's edition comes up with!

Welcome to the team Nurullah!
🚨 Calling all web experts! 🚨

The 2025 Web Almanac is still open for contributors!

Know someone perfect for it? Mention them here and help us reach the right folks. 🙌

📢 Please help us spread the word!

🔗 Learn more: github.com/HTTPArchive/...
Reposted by HTTP Archive 💾
It's that time of year where the web almanac is seeking contributors for the next edition.
It takes a fair amount of work, but it's ridiculously rewarding,
& guaranteed warm fuzzy feelings when you see the chapter you contributed towards published.

I encourage you to give it a go. See:
Contribute to the 2025 Web Almanac · HTTPArchive almanac.httparchive.org · Discussion #4062
Dear all, We are excited to announce the Call for Contributions for the 2025 Web Almanac (6th Edition)! The Web Almanac is an annual report that provides an overview of the state of the web, based ...
github.com
Reposted by HTTP Archive 💾
Ask me how excited I got seeing @johnmu.com cite my and @tamethebots.com's Page Weight chapter at Search Central Live
John Mueller on stage in front of a graph of Page Weight by year
Reposted by HTTP Archive 💾
Day 010 #100DaysOfPerf: As an apt complement and follow-up to the page weight post: working with media is rife with challenges. Media loading, media formats, media codecs... But it's what makes the web so valuable.
✨ MEDIA ✨ chapter from the @httparchive.org is another worth checking 🧵⬇️
Illustration of a video camera projecting a PLAY BUTTON. Caption: "MEDIA"
Reposted by HTTP Archive 💾
Though I shared a link from the @httparchive.org and the 2024 Web Almanac, sharing a chart from the actual archive: over 5 yrs, the page weight has grown almost 30%, and there has never been an indication of slowing down. The almanac chapter covers much of the data proving so. 🧵⬇️
Chart indicating page weight of desktop and mobile pages between February 1 2020 and February 1 2025

MEDIAN DESKTOP
2678.0 KB 
up 28.7%

MEDIAN MOBILE
2409.8 KB
up 27.8%
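
As a rough illustration of what those percentages imply, the February 2020 medians can be back-computed from the two figures in the chart. This is only a sketch: the function name `implied_baseline` is made up for this example, and the results are estimates derived from the rounded values above, not numbers read from the archive itself.

```python
def implied_baseline(current_kb: float, growth_pct: float) -> float:
    """Return the starting value that grew by growth_pct percent to reach current_kb."""
    return current_kb / (1 + growth_pct / 100)


# Figures taken from the chart description above (February 1 2025 medians).
desktop_2020 = implied_baseline(2678.0, 28.7)
mobile_2020 = implied_baseline(2409.8, 27.8)

print(f"Implied Feb 2020 median desktop page: {desktop_2020:.1f} KB")
print(f"Implied Feb 2020 median mobile page:  {mobile_2020:.1f} KB")
```

Under these assumptions the median desktop page would have started at roughly 2.08 MB and the median mobile page at roughly 1.89 MB five years earlier.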
Reposted by HTTP Archive 💾
Day 007 #100DaysOfPerf: Since I touched on the @httparchive.org Web Almanac, decided to continue sharing more of their content. A note: The HTTP Archive really started as a way to see how the web was built, and its performance. So it will always have a hint of performance discussion. Today? FONTS 🧵⬇️
Illustration of 'F' font being pushed down a conveyor belt by illustrated characters. Caption "fonts"
Reposted by HTTP Archive 💾
Day 006 #100DaysOfPerf: Let's keep it moving by highlighting a great piece from the @httparchive.org Web Almanac, and their #performance chapter. "No one ever complained about a fast website" has been a classic quote over the years, and the Performance Chapter highlights web data and patterns 🧵⬇️
Illustration of a browser window, a stopwatch, and some resources. Caption: "part two, chapter 9, performance."
So they may be the same as desktop pages, they may be responsive and deliver different content to mobiles (based on user-agent, or viewport size), or they may be mobile-optimized sites not visited by desktop devices (e.g. m.facebook.com).
We get a list of origins visited by mobile devices (based on user-agent) from the Chrome User Experience Report (CrUX). We then crawl them with mobile viewport, mobile user agent, and 4G setting as detailed in our methodology: almanac.httparchive.org/en/2024/meth...
Methodology | The Web Almanac by HTTP Archive
Describes how the 2024 Web Almanac was put together: The Datasets and Tools used and how the project was run.
almanac.httparchive.org
An enormous thanks to all 88 contributors who made the 2024 edition possible.

And we've already started talking about a 2025 edition. Check out our GitHub repo to get a sense of the project, and there's an interest form if you want to help build the 2025 edition:
github.com/HTTPArchive/...
github.com
That marks the end of the 2024 Web Almanac. A little later than usual but worth it in the end.

And also means the ebook is now available:
almanac.httparchive.org/en/2024/tabl...

754 pages jam packed with all sorts of nerdy goodness!!!
Table of Contents | Web Almanac 2024
Table of Contents for the 2024 Web Almanac, listing each section: Page Contents, User Experience, Content Publishing, Content Distribution.
almanac.httparchive.org
Reposted by HTTP Archive 💾
I recently published my annual dive into the @httparchive.org, focusing on page growth, #webperf and #ux:

www.speedcurve.com/blog/page-bl...

A common question is "How big SHOULD my pages be?" According to analysis by @infrequently.org, the ideal page should be <1.4 MB with <365 KB coming from JS.
A table that gives ideal versus actual page size and JavaScript size:

The ideal JS weight is under 365 kilobytes. The actual median JS weight is 650 kilobytes. The actual JS weight at the 90th percentile is 1825 kilobytes.

The ideal total page weight is under 1.4 megabytes. The actual median page weight is 2.6 megabytes. The actual page weight at the 90th percentile is 11.1 megabytes.
Reposted by HTTP Archive 💾
Just published the results of my annual dive into the @httparchive.org. Key findings:

😱 Med page has grown 8%
😱 90p page has grown 24%
😱 90p mobile page is 10MB
😱 Main culprits: JS & video

Dig in & learn what your page size targets should be and how to hit them: www.speedcurve.com/blog/page-bl...
SpeedCurve | Page bloat update: How does ever-increasing page size affect your business and your users?
The median web page has grown 8% in one year. How does this affect your Core Web Vitals, your search rank, your business and your users?
www.speedcurve.com
Just about to start...
✨ The Web Almanac LIVE STREAM II ✨
featuring Chapters (authors):
🔸 SEO (Jamie, Mikael)
🔸 Privacy (Max O)
🔸 HTTP (Robin)
🔸 Cookies (Yana)
🔸 3rd Parties (Yash)
📆 Thursday, January 16th
⏰ 14h EST, everytimezone.com/s/6e8b3a3d
🔗 www.youtube.com/live/zCiMls2...
A 🔄 would be ✨🙏🏾✨.
Web Almanac
By HTTP Archive
THE WEB ALMANAC LIVE STREAM! II. The avatar of the host
The avatars for 6 persons. 
ONLINE
JANUARY 16, THURSDAY
time: 14h EST, 11h PST, 20h CET.
who: the authors!