shakedown.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A community for live music fans with roots in the jam scene. Shakedown Social is run by a team of volunteers (led by @clifff and @sethadam1) and funded by donations.

#trainingdata

Erik Jonker<p>Just wondering how you collect as much Mastodon content as possible for AI training purposes?<br><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Mastodon" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Mastodon</span></a> <a href="https://mastodon.social/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a></p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://mastodon.social/@RuthMalan" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>RuthMalan</span></a></span> </p><p>The worst-case scenario here is you get sued?</p><p>Apparently, this author/text was not included in the Books3 dataset of pirated content used for LLM <a href="https://mastodon.world/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a>.</p>
IT News<p>Training a Self-Driving Kart - There are certain tasks that humans perform every day that are notoriously difficu... - <a href="https://hackaday.com/2024/12/21/__trashed-11/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackaday.com/2024/12/21/__tras</span><span class="invisible">hed-11/</span></a> <a href="https://schleuss.online/tags/convolutionalneuralnetwork" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>convolutionalneuralnetwork</span></a> <a href="https://schleuss.online/tags/machinelearning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>machinelearning</span></a> <a href="https://schleuss.online/tags/self" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>self</span></a>-driving <a href="https://schleuss.online/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a> <a href="https://schleuss.online/tags/autonomous" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>autonomous</span></a> <a href="https://schleuss.online/tags/crazykart" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>crazykart</span></a> <a href="https://schleuss.online/tags/training" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>training</span></a> <a href="https://schleuss.online/tags/go" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>go</span></a>-kart</p>
Erik Jonker<p>Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft. The project’s leader says that allowing everyone to access the collection of public-domain books will help “level the playing field” in the AI industry.<br><a href="https://archive.is/DrzFn#selection-575.0-581.152" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">archive.is/DrzFn#selection-575</span><span class="invisible">.0-581.152</span></a><br><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Harvard" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Harvard</span></a> <a href="https://mastodon.social/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a></p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://fosstodon.org/@wook" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>wook</span></a></span> <span class="h-card" translate="no"><a href="https://mastodon.social/@kottke" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>kottke</span></a></span> <br>I’m not 💯convinced that deleting tweets even does that. I suspect that he would have that data removed from the front-end UI, but still exist in the back-end database for <a href="https://mastodon.world/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a> purposes.</p>
petersuber<p>Update. HELIOS Open (<span class="h-card" translate="no"><a href="https://bird.makeup/users/heliosopen" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>heliosopen</span></a></span>) comments on v. 0.0.9 of the Open Source Initiative (<a href="https://fediscience.org/tags/osi" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>osi</span></a>, <span class="h-card" translate="no"><a href="https://social.opensource.org/@osi" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>osi</span></a></span>) definition of <a href="https://fediscience.org/tags/OpenSource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenSource</span></a> <a href="https://fediscience.org/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a>.</p><p>"If the definition doesn’t start by emphasizing the openness of training data out of the gate, [we] worry it will not get added in later." </p><p><a href="https://fediscience.org/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a></p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://chaosfem.tw/@theogrin" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>theogrin</span></a></span> </p><p>Last year, I wrote a series of posts to emphasize that the most underestimated work-stream of <a href="https://mastodon.world/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> strategy involved <a href="https://mastodon.world/tags/DataGovernance" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataGovernance</span></a>, <a href="https://mastodon.world/tags/DataQuality" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataQuality</span></a>, and <a href="https://mastodon.world/tags/DataStewardship" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataStewardship</span></a> of <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> libraries, because <a href="https://mastodon.world/tags/LLMs" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLMs</span></a> are a <a href="https://mastodon.world/tags/DataProducts" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataProducts</span></a>.</p><p>I’m just a bro on the internet.</p><p>It didn’t have to be this way 🤷🏻‍♂️</p><p><a href="https://www.superversive.co/blog/how-to-make-it-work-for-you" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">superversive.co/blog/how-to-ma</span><span class="invisible">ke-it-work-for-you</span></a></p>
petersuber<p>Update. The Open Source Initiative (<span class="h-card" translate="no"><a href="https://social.opensource.org/@osi" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>osi</span></a></span>, <a href="https://fediscience.org/tags/OSI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OSI</span></a>) is trying to define what counts as <a href="https://fediscience.org/tags/OpenSource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenSource</span></a> <a href="https://fediscience.org/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a>. <br><a href="https://simonwillison.net/2024/Aug/27/open-source-ai/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">simonwillison.net/2024/Aug/27/</span><span class="invisible">open-source-ai/</span></a></p><p>"There is one very notable absence from the definition: while it requires the code and weights be released under an OSI-approved <a href="https://fediscience.org/tags/license" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>license</span></a>, the <a href="https://fediscience.org/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> itself is exempt from that requirement."</p>
Coach Pāṇini ®<p>The quality and stewardship of <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> will correlate to the veracity and value of the <a href="https://mastodon.world/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a>.</p><p><a href="https://www.superversive.co/blog/ai-product-governance" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">superversive.co/blog/ai-produc</span><span class="invisible">t-governance</span></a></p>
Ecologia Digital<p>"It became clear from the panel’s discussion that the biggest challenge in defining <a href="https://mato.social/tags/opensourceAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensourceAI</span></a> lies in addressing the role of <a href="https://mato.social/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a>. Large language models (<a href="https://mato.social/tags/LLMs" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLMs</span></a>) rely on vast data sets, often scraped from the internet without explicit permission. This messy data raises thorny questions about privacy, copyright and ethics.</p><p>Indeed, we know some of this data is flatly illegal."</p><p><a href="https://thenewstack.io/open-source-ai-what-about-data-transparency/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">thenewstack.io/open-source-ai-</span><span class="invisible">what-about-data-transparency/</span></a></p>
Coach Pāṇini ®<p><a href="https://mastodon.world/tags/ModelExplainability" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ModelExplainability</span></a>, <a href="https://mastodon.world/tags/DataLineage" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataLineage</span></a>, and editing the <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> set are topics that will be in the news next year…assuming we make it.<br><a href="https://social.lol/@rom/112543674749743641" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">social.lol/@rom/11254367474974</span><span class="invisible">3641</span></a></p>
Coach Pāṇini ®<p><a href="https://mastodon.world/tags/DataQuality" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataQuality</span></a> criteria that are even more applicable to <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> for <a href="https://mastodon.world/tags/LLMs" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLMs</span></a></p>
Coach Pāṇini ®<p>As a 15-year veteran of daily Twitter usage, thinking of the <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> set as an input for anything besides <a href="https://mastodon.world/tags/shitposting" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>shitposting</span></a> is fncking hysterical.<br><a href="https://www.threads.net/@gwestr/post/C6rey-XS067" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">threads.net/@gwestr/post/C6rey</span><span class="invisible">-XS067</span></a></p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://hachyderm.io/@kellogh" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>kellogh</span></a></span> <span class="h-card" translate="no"><a href="https://mastodon.social/@AnnemarieBridy" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>AnnemarieBridy</span></a></span> </p><p>For enterprise applications, the most valuable <a href="https://mastodon.world/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a> is behind corporate firewalls, not out on the internet.</p><p>And if that’s the case, maybe the models don’t need to be large in the first place.</p>
Coach Pāṇini ®<p>This is what poisoning the <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> well with <a href="https://mastodon.world/tags/SyntheticMedia" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SyntheticMedia</span></a> looks like 🤷🏻‍♂️<br><a href="https://mastodon.social/@craigbrownphd/112395066892761698" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.social/@craigbrownphd</span><span class="invisible">/112395066892761698</span></a></p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://mastodon.social/@caseynewton" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>caseynewton</span></a></span> <span class="h-card" translate="no"><a href="https://mastodon.social/@zoeschiffer" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>zoeschiffer</span></a></span> <br>The solution for <a href="https://mastodon.world/tags/botshit" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>botshit</span></a> poisoning the <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> is a bunch of walled gardens with SSO and interoperability?</p><p>We’re in the dumbest, darkest timeline.</p>
Coach Pāṇini ®<p>Maybe the end-state for our <a href="https://mastodon.world/tags/SyntheticMachine" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SyntheticMachine</span></a> <a href="https://mastodon.world/tags/solarpunk" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>solarpunk</span></a> future is that shrinking <a href="https://mastodon.world/tags/nuclear" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>nuclear</span></a> weapons to the size of a football eventually leads to the innovation Dr. Noonien Soong and <a href="https://mastodon.world/tags/cybernetics" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>cybernetics</span></a> need to power the sheer compute required for instantaneous <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> ingest, modeling, and prompt stimulus in a humanoid form.</p>
Coach Pāṇini ®<p><span class="h-card" translate="no"><a href="https://mastodon.social/@randulo" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>randulo</span></a></span> pattern-matching for common sense is a sign of creativity and intelligence. The LLMs are only as “smart” as their <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a>, and that’s just what was on the internet, half of which was fake, and the other half exaggerated.</p>
Coach Pāṇini ®<p>Half of the <a href="https://mastodon.world/tags/TrainingData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TrainingData</span></a> on the <a href="https://mastodon.world/tags/internet" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>internet</span></a> was fake, and the other half was exaggerated.</p><p>That’s why the “<a href="https://mastodon.world/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a>” evangelists refuse to talk about <a href="https://mastodon.world/tags/DataGovernance" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataGovernance</span></a>, <a href="https://mastodon.world/tags/DataStewardship" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataStewardship</span></a>, and <a href="https://mastodon.world/tags/DataQuality" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataQuality</span></a>, insisting on hiding behind a curtain.</p>
Erik Jonker<p>How easy would it be to use Mastodon data for training AI?<br>I would think collecting public posts from all instances is easy, or are there blocking measures that prevent this kind of collection? Personally, I have no objection to public posts being used for training AI; I know, however, that a lot of people probably won't like it.<br><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Mastodon" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Mastodon</span></a> <a href="https://mastodon.social/tags/trainingdata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>trainingdata</span></a></p>
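On the question of blocking measures: one common, purely advisory mechanism on the web is a `robots.txt` file, which a well-behaved crawler checks before fetching a page. The sketch below uses Python's standard `urllib.robotparser` to show what that check could look like. The rules, the `ExampleAIBot` user agent, and the `example.social` host are all hypothetical and not taken from any real server, and nothing here stops a crawler that simply ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration only; a real server's
# file may look quite different (or may not exist at all).
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /media_proxy/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    tag_page = "https://example.social/tags/trainingdata"
    for agent in ("ExampleAIBot", "SomeOtherClient"):
        print(agent, may_fetch(EXAMPLE_ROBOTS_TXT, agent, tag_page))
```

With these made-up rules, the named bot is refused everywhere while other clients are only kept out of one path. Whether any collector honors such rules is a matter of policy, not of technical enforcement.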