in reply to Codeberg

😲🤬 re: what's happened to @Codeberg today
The AI ballyhoo *is* a real DDoS against one of the few code hosting sites that has taken a stand against slurping #FOSS code into LLM training sets — in violation of #copyleft licenses.
This is what lawlessness & deregulation bring us. ∃ plenty of blame to go around, but #Microsoft & #GitHub deserve the bulk of it as they trailblazed the idea that FOSS code-hosting sites are lucrative targets.
giveupgithub.org
#GiveUpGitHub #FreeSoftware #OpenSource
in reply to Codeberg

It seems like the AI crawlers learned how to solve the Anubis challenges. Anubis is a tool hosted on our infrastructure that requires browsers to do some heavy computation before accessing Codeberg again. It really saved us tons of nerves over the past months, because it spared us from manually maintaining blocklists and gave us a working way to tell "real browsers" from "AI crawlers".
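
For readers wondering what "heavy computation" can look like: a minimal proof-of-work sketch, assuming a SHA-256 challenge where the client must find a nonce whose hash starts with a given number of zero hex digits. The function names, the challenge string, and the difficulty value are all hypothetical, and this is not Anubis's actual code.

```python
import hashlib
from itertools import count


def digest_for(challenge: str, nonce: int) -> str:
    """Hex SHA-256 of the challenge string followed by the nonce."""
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve_challenge(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce that meets the difficulty."""
    prefix = "0" * difficulty
    for nonce in count():
        if digest_for(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's answer."""
    return digest_for(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    # "example-challenge" and difficulty 3 are illustrative values only.
    nonce = solve_challenge("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3))
```

Solving costs many hashes while verifying costs one. That asymmetry is the whole defense, and it stops helping once a crawler is willing to pay the solving cost too.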
in reply to Codeberg

eBPF could be more effective and easier on the CPU, since it acts on a much lower network layer. Anubis kinda has its limits and it's way too easy to circumvent (as you found out).

Maybe it's worth considering eBPF (if you haven't already).

And thanks guys for your work. I'm a proud supporter and I'll continue to support your work. Companies shouldn't control the Open Source space

in reply to Codeberg

Anubis is extremely easy to bypass: you just have to change the User-Agent so that it does not contain "Mozilla". Please get proper bot protection.

ulveon.net/p/2025-08-09-vangua…
This post talks briefly about other alternatives. Try Berghain, Balooproxy, or go-away.
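
To illustrate the User-Agent bypass claimed above, here is a rough sketch of the kind of rule it describes, assuming the challenge is only served to clients whose User-Agent contains "Mozilla". The `needs_challenge` function is hypothetical and only models that claim; it is not taken from Anubis.

```python
def needs_challenge(user_agent: str) -> bool:
    """Hypothetical rule: only clients that claim to be browsers get challenged."""
    return "Mozilla" in user_agent


if __name__ == "__main__":
    # A browser-like User-Agent is challenged; anything else walks straight in.
    for ua in ("Mozilla/5.0 (X11; Linux x86_64)", "curl/8.5.0", "my-scraper/1.0"):
        print(ua, "->", "challenge" if needs_challenge(ua) else "pass through")
```

Under that assumption, a scraper skips the computation entirely by sending any User-Agent without the substring.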

in reply to Codeberg

Maybe it would be possible to use the tool described in:

saturation.social/@clive/11497…


behold the "HTML bomb"

it's a defensive counterattack on AI web-scrapers that persistently scrape and rescrape your web site, even when you tell them not to

the bomb file *looks* like a tiny HTML page, but when scraped -- or even requested by a regular browser ...

... it unpacks into a huge-ass 10-gig HTML page ...

... which quickly crashes any browser or scraper

Item #6 in my latest "Linkfest" newsletter, free to read and subscribe to here: buttondown.com/clivethompson/a…
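
The quoted post does not say how the file is built. One plausible reading of "unpacks" is plain gzip compression of extremely repetitive HTML, so here is a small sketch under that assumption. It only measures the compression ratio on roughly 10 MB of data rather than producing anything close to 10 gigs; the helper names and the `<div>` filler are made up for illustration.

```python
import gzip


def build_payload(repeats: int, chunk: bytes = b"<div>") -> bytes:
    """Build a trivially repetitive HTML document (illustrative filler only)."""
    return b"<html><body>" + chunk * repeats + b"</body></html>"


def compression_ratio(raw: bytes) -> float:
    """Ratio of uncompressed size to gzip-compressed size."""
    return len(raw) / len(gzip.compress(raw, compresslevel=9))


if __name__ == "__main__":
    raw = build_payload(2_000_000)  # roughly 10 MB uncompressed
    packed = gzip.compress(raw, compresslevel=9)
    print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
    print(f"ratio: {compression_ratio(raw):.0f}:1")
```

Repetitive input shrinks by orders of magnitude, so a small response body can expand into a very large page for any client that decompresses it without a size limit.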