#webcrawling

IT News<p>Sites scramble to block ChatGPT web crawler after instructions emerge</p><p>Without announcement, OpenAI re... - <a href="https://arstechnica.com/?p=1960108" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">arstechnica.com/?p=1960108</span><span class="invisible"></span></a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>machinelearning</span></a> <a href="https://schleuss.online/tags/webscraming" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>webscraming</span></a> <a href="https://schleuss.online/tags/webcrawling" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>webcrawling</span></a> <a href="https://schleuss.online/tags/aiethics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aiethics</span></a> <a href="https://schleuss.online/tags/chatgpt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>chatgpt</span></a> <a href="https://schleuss.online/tags/chatgtp" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>chatgtp</span></a> <a href="https://schleuss.online/tags/biz" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>biz</span></a> <a href="https://schleuss.online/tags/gptbot" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>gptbot</span></a> <a href="https://schleuss.online/tags/openai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>openai</span></a> <a href="https://schleuss.online/tags/tech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tech</span></a> <a href="https://schleuss.online/tags/ai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ai</span></a></p>
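For readers wondering what "blocking" the crawler looks like in practice, here is a minimal sketch, assuming the block is done through a robots.txt rule for the `GPTBot` user agent. The robots.txt body, the `example.com` URL, and the `is_allowed` helper are illustrative assumptions, not taken from the linked article. It uses Python's standard `urllib.robotparser` to check how a well-behaved crawler would read such a rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: disallow GPTBot everywhere, allow every other agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"  # placeholder URL
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", url))
    print("ExampleBot allowed:", is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Note that robots.txt is advisory: it only has an effect on crawlers that choose to honor it.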
Doc Edward Morbius ⭕​<p><strong>Hacker News front-page analytics</strong></p><p>A question about what states were most-frequently represented on the HN homepage had me do some quick querying via Hacker News's Algolia search ... which is <strong>NOT</strong> limited to the front page. Those results were ... surprising (Maine and Iowa outstrip the more probable results of California and, say, New York). Results are further confounded by other factors.</p><p>Thread: <a href="https://news.ycombinator.com/item?id=36076870" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="ellipsis">news.ycombinator.com/item?id=3</span><span class="invisible">6076870</span></a></p><p>HN provides an interface to historical front-page stories (<a href="https://news.ycombinator.com/front" rel="nofollow noopener noreferrer" target="_blank"><span class="invisible">https://</span><span class="">news.ycombinator.com/front</span><span class="invisible"></span></a>), and <em>that</em> can be crawled by providing a list of corresponding date specifications, e.g.:</p><pre><code>https://news.ycombinator.com/front?day=2023-05-25<br></code></pre><p>Easy enough.</p><p>So I'm crawling that and compiling a local archive. Rate-limiting and other factors mean that's only about halfway complete, and a full pull will take another day or so.</p><p>But I'll be able to look at story titles, sites, submitters, time-based patterns (day of week, day of month, month of year, yearly variations), and other patterns. There's also looking at mean points and comments by various dimensions.</p><p>Among surprises are that as of January 2015, among the highest consistently-voted sites is The Guardian. 
I'd thought HN leaned consistently less liberal.</p><p>The full archive will probably be &lt; 1 GB (raw HTML), currently 123 MB on disk.</p><p>Contents are the 30 top-voted stories for each day since 20 February 2007.</p><p>If anyone has suggestions for other questions to ask of this, fire away.</p><p>And, as of early 2015, top state mentions are:</p><pre><code> 1. new york: 150<br> 2. california: 101<br> 3. texas: 39<br> 4. washington: 38<br> 5. colorado: 15<br> 6. florida: 10<br> 7. georgia: 10<br> 8. kansas: 10<br> 9. north carolina: 9<br>10. oregon: 9<br></code></pre><p>NY is highly overrepresented (NY Times, NY Post, NY City), likewise Washington (Post, Times, DC). Adding in "Silicon Valley" and a few other toponyms boosts California's score markedly. I've also got some city-based analytics.</p><p><a href="https://toot.cat/tags/hn" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>hn</span></a> <a href="https://toot.cat/tags/hackernews" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>hackernews</span></a> <a href="https://toot.cat/tags/data" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>data</span></a> <a href="https://toot.cat/tags/DataAnalysis" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataAnalysis</span></a> <a href="https://toot.cat/tags/WebCrawling" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebCrawling</span></a></p>
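The crawl described in the post above can be sketched roughly as follows. This is not the author's actual script: only the `front?day=YYYY-MM-DD` URL pattern comes from the post, while the function names, the short state list, and the sample titles are assumptions made for illustration. Fetching is deliberately left out; a real crawl would need a delay between requests to respect the rate limiting the post mentions.

```python
import re
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, Iterator

# URL pattern quoted in the post; {day} is an ISO date such as 2023-05-25.
FRONT_URL = "https://news.ycombinator.com/front?day={day}"

# Illustrative subset of state names, not the full list used in the post.
STATES = ("new york", "california", "texas", "washington")


def front_page_urls(start: date, end: date) -> Iterator[str]:
    """Yield one front-page URL per day from start to end, inclusive."""
    day = start
    while day <= end:
        yield FRONT_URL.format(day=day.isoformat())
        day += timedelta(days=1)


def count_state_mentions(titles: Iterable[str],
                         states: Iterable[str] = STATES) -> Counter:
    """Count how many titles mention each state name as whole words."""
    states = tuple(states)
    counts: Counter = Counter()
    for title in titles:
        lowered = title.lower()
        for state in states:
            if re.search(rf"\b{re.escape(state)}\b", lowered):
                counts[state] += 1
    return counts


if __name__ == "__main__":
    for url in front_page_urls(date(2023, 5, 24), date(2023, 5, 26)):
        print(url)
    # Hypothetical titles, showing how newspaper names inflate state counts.
    sample = ["New York Times launches app", "Washington Post on California drought"]
    print(count_state_mentions(sample))
```

A plain name match like this counts "New York Times" and "Washington Post" as state mentions, which is the overrepresentation the post points out.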
Dawn A<p>Doing an introduction here:</p><p>I'm Dawn from <a href="https://seocommunity.social/tags/Manchester" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Manchester</span></a>. <br>Work as <a href="https://seocommunity.social/tags/SEO" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SEO</span></a> consultant. <br>Love <a href="https://seocommunity.social/tags/TechSEO" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TechSEO</span></a> and <a href="https://seocommunity.social/tags/Contentstrategy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Contentstrategy</span></a>, <a href="https://seocommunity.social/tags/webstrategy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>webstrategy</span></a>, <a href="https://seocommunity.social/tags/ecommerce" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ecommerce</span></a>, <a href="https://seocommunity.social/tags/webcrawling" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>webcrawling</span></a>, <a href="https://seocommunity.social/tags/digitalmarketing" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>digitalmarketing</span></a>, <a href="https://seocommunity.social/tags/tech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tech</span></a>.
<br>Learning <a href="https://seocommunity.social/tags/computerscience" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>computerscience</span></a>, <a href="https://seocommunity.social/tags/coding" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>coding</span></a> including <a href="https://seocommunity.social/tags/python" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>python</span></a>, <a href="https://seocommunity.social/tags/javascript" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>javascript</span></a>, <a href="https://seocommunity.social/tags/kotlin" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>kotlin</span></a>. <br>Keen interest in following <a href="https://seocommunity.social/tags/datascience" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>datascience</span></a>, <a href="https://seocommunity.social/tags/informationretrieval" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>informationretrieval</span></a>, <a href="https://seocommunity.social/tags/tech" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tech</span></a> topics. 
<br>Love <a href="https://seocommunity.social/tags/running" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>running</span></a>, <a href="https://seocommunity.social/tags/trailrunning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trailrunning</span></a>, <a href="https://seocommunity.social/tags/baking" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>baking</span></a>.<br>Love <a href="https://seocommunity.social/tags/pomeranians" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>pomeranians</span></a> <a href="https://seocommunity.social/tags/animals" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>animals</span></a><br>99% <a href="https://seocommunity.social/tags/vegan" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>vegan</span></a>, 100% <a href="https://seocommunity.social/tags/vegetarian" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>vegetarian</span></a>. Learning is a big part of my every day</p>