#gptbot

spielleitung<p><a href="https://mastodon.pnpde.social/tags/GPTBot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTBot</span></a> still accounts for roughly 20% of the requests to this Mastodon instance, but the crawler now receives only nonsense generated by <a href="https://mastodon.pnpde.social/tags/Iocaine" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Iocaine</span></a>. That drastically reduces the amount of data we serve to it and completely ruins the quality of our dataset for its purposes.</p><p>So it helps us save costs, makes the LLMs worse, and is mischievously fun on top of that! Win-win-win! :KritischerTreffer: </p><p><a href="https://mastodon.pnpde.social/tags/MastoAdmin" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MastoAdmin</span></a> <a href="https://mastodon.pnpde.social/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a></p>
Kevin Karhan :verified:<p><span class="h-card" translate="no"><a href="https://toot.cafe/@baldur" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>baldur</span></a></span> <em>nods in agreement</em> at my current employer we had to block <a href="https://infosec.space/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a>'s entire <a href="https://github.com/greyhat-academy/lists.d/blob/1a61ef878ec970c554f7263ec06d57fdc4d49e3e/scrapers.ipv4.block.list.tsv#L6" rel="nofollow noopener" target="_blank">IP ranges</a> as they literally <a href="https://infosec.space/tags/DDoS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DDoS</span></a>'d a <a href="https://infosec.space/tags/customer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>customer</span></a> with spoofed <a href="https://infosec.space/tags/UserAgent" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>UserAgent</span></a>(s) [instead of using <a href="https://infosec.space/tags/GPTbot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTbot</span></a>]…</p><ul><li><em>It's really fucking annoying!</em></li></ul>
Kevin Karhan :verified:<p><span class="h-card" translate="no"><a href="https://mastodon.social/@khobochka" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>khobochka</span></a></span> guess why I <a href="https://github.com/greyhat-academy/lists.d/blob/main/scrapers.ipv4.block.list.tsv" rel="nofollow noopener" target="_blank">maintain</a> a <a href="https://infosec.space/tags/Scraper" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scraper</span></a> <a href="https://infosec.space/tags/blocklist" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>blocklist</span></a>?</p><ul><li>In fact I know <em>multiple</em> people and organizations that decide to basically redirect <a href="https://infosec.space/tags/ValueRemoving" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ValueRemoving</span></a> <a href="https://infosec.space/tags/Scrapers" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Scrapers</span></a> like <a href="https://infosec.space/tags/GPTbot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTbot</span></a>, <a href="https://infosec.space/tags/ByteSpider" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ByteSpider</span></a> (which <a href="https://www.youtube.com/watch?v=Hi5sd3WEh0c" rel="nofollow noopener" target="_blank">literally</a> <a href="https://infosec.space/tags/DDoS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DDoS</span></a>'d <a href="https://infosec.space/tags/MattKC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MattKC</span></a> because <a href="https://infosec.space/tags/ClownFlare" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ClownFlare</span></a> are a <em>criminally incompetent</em> <a href="https://infosec.space/tags/RogueISP" class="mention hashtag" rel="nofollow noopener" 
target="_blank">#<span>RogueISP</span></a>!) to <a href="https://infosec.space/tags/Hetzner" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Hetzner</span></a>'s <a href="http://hil-speed.hetzner.com/" rel="nofollow noopener" target="_blank">10GB Speedtest file</a> which can be found at <code>http://hil-speed.hetzner.com/10GB.bin</code> as an extra middle finger!</li></ul><p><a href="https://infosec.space/tags/Cloudflare" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Cloudflare</span></a> <a href="https://infosec.space/tags/hetznered" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hetznered</span></a> <a href="https://infosec.space/tags/ByteDance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ByteDance</span></a> <a href="https://infosec.space/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ChatGPT</span></a></p>
beSpacific<p>The <a href="https://newsie.social/tags/NewYorkTimes" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NewYorkTimes</span></a> has blocked <a href="https://newsie.social/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a>’s <a href="https://newsie.social/tags/webcrawler" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webcrawler</span></a>, meaning that OpenAI can’t use content from the publication to train its AI models. If you check the NYT’s robots.txt page, you can see that the NYT disallows <a href="https://newsie.social/tags/GPTBot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTBot</span></a>, the crawler that OpenAI introduced earlier this month. Based on the <a href="https://newsie.social/tags/InternetArchive" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>InternetArchive</span></a>’s <a href="https://newsie.social/tags/WaybackMachine" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WaybackMachine</span></a>, it appears NYT blocked the crawler as early as August 17th. <a href="https://www.theverge.com/2023/8/21/23840705/new-york-times-openai-web-crawler-ai-gpt" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">theverge.com/2023/8/21/2384070</span><span class="invisible">5/new-york-times-openai-web-crawler-ai-gpt</span></a> <a href="https://newsie.social/tags/copyright" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>copyright</span></a> <a href="https://newsie.social/tags/legalresearch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>legalresearch</span></a></p>
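For readers wondering how a robots.txt rule of the kind described above is evaluated, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The two-line robots.txt body, the `example.com` URL, and the `OtherBot` user agent are illustrative assumptions, not a copy of the NYT's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: blocks only GPTBot, site-wide.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # GPTBot is disallowed; an agent with no matching rule is allowed.
    print(allowed("GPTBot", "https://example.com/article"))
    print(allowed("OtherBot", "https://example.com/article"))
```

Note that robots.txt is advisory: it only keeps out crawlers that choose to honor it, which is why other admins in this thread resort to IP-level blocks.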
Paul Chambers🚧<p><a href="https://oldfriends.live/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a> IP ranges, in case you want to block them from your instance and stop them from scraping your content. I saw Mastodon devs added something to block <a href="https://oldfriends.live/tags/GPTBot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTBot</span></a> via robots.txt a few days ago. Here are the IP ranges:</p><p><a href="https://oldfriends.live/tags/MastoAdmin" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MastoAdmin</span></a> <a href="https://oldfriends.live/tags/FediBlock" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FediBlock</span></a></p><p>20.15.240.64/28<br>20.15.240.80/28<br>20.15.240.96/28<br>20.15.240.176/28<br>20.15.241.0/28<br>20.15.242.128/28<br>20.15.242.144/28<br>20.15.242.192/28<br>40.83.2.64/28</p><p><a href="https://openai.com/gptbot-ranges.txt" rel="nofollow noopener" target="_blank"><span class="invisible">https://</span><span class="">openai.com/gptbot-ranges.txt</span><span class="invisible"></span></a></p><p><a href="https://www.theverge.com/2023/8/7/23823046/openai-data-scrape-block-ai" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">theverge.com/2023/8/7/23823046</span><span class="invisible">/openai-data-scrape-block-ai</span></a></p><p><a href="https://github.com/mastodon/mastodon/pull/26396" rel="nofollow noopener" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/mastodon/mastodon/p</span><span class="invisible">ull/26396</span></a></p>
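As a rough illustration of what matching against these ranges looks like, the following Python sketch checks whether a client address falls inside any of the CIDR blocks listed in the post. The ranges are copied from the post and may well be outdated; the function name `is_gptbot_address` is a hypothetical helper, not part of Mastodon or any OpenAI tooling.

```python
import ipaddress

# CIDR ranges as listed in the post above (assumed current only as of that post).
GPTBOT_RANGES = [
    "20.15.240.64/28",
    "20.15.240.80/28",
    "20.15.240.96/28",
    "20.15.240.176/28",
    "20.15.241.0/28",
    "20.15.242.128/28",
    "20.15.242.144/28",
    "20.15.242.192/28",
    "40.83.2.64/28",
]

# Parse once so each lookup is only a series of membership tests.
NETWORKS = [ipaddress.ip_network(cidr) for cidr in GPTBOT_RANGES]


def is_gptbot_address(addr: str) -> bool:
    """Return True if addr lies inside one of the listed ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in NETWORKS)


if __name__ == "__main__":
    print(is_gptbot_address("20.15.240.70"))  # inside 20.15.240.64/28
    print(is_gptbot_address("192.0.2.1"))     # documentation address, not listed
```

In practice the blocking itself would normally live in a firewall or reverse proxy rule; this only shows the matching logic.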
IT News<p>Sites scramble to block ChatGPT web crawler after instructions emerge</p><p>Without announcement, OpenAI re... - <a href="https://arstechnica.com/?p=1960108" rel="nofollow noopener" target="_blank"><span class="invisible">https://</span><span class="">arstechnica.com/?p=1960108</span><span class="invisible"></span></a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://schleuss.online/tags/webscraming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webscraming</span></a> <a href="https://schleuss.online/tags/webcrawling" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webcrawling</span></a> <a href="https://schleuss.online/tags/aiethics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aiethics</span></a> <a href="https://schleuss.online/tags/chatgpt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatgpt</span></a> <a href="https://schleuss.online/tags/chatgtp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatgtp</span></a> <a href="https://schleuss.online/tags/biz" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>biz</span></a> <a href="https://schleuss.online/tags/gptbot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>gptbot</span></a> <a href="https://schleuss.online/tags/openai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>openai</span></a> <a href="https://schleuss.online/tags/tech" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tech</span></a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a></p>
h o ʍ l e t t<p><a href="https://mamot.fr/tags/OpenAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenAI</span></a> just admitted it has a <a href="https://mamot.fr/tags/bot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>bot</span></a> that crawls the <a href="https://mamot.fr/tags/web" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>web</span></a> to collect <a href="https://mamot.fr/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> training data. If you don't block <a href="https://mamot.fr/tags/GPTbot" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPTbot</span></a>, that's self-sabotage.<br><a href="https://www.businessinsider.com/openai-gptbot-web-crawler-content-creators-ai-bots-2023-8?IR=T" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">businessinsider.com/openai-gpt</span><span class="invisible">bot-web-crawler-content-creators-ai-bots-2023-8?IR=T</span></a></p>