shakedown.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A community for live music fans with roots in the jam scene. Shakedown Social is run by a team of volunteers (led by @clifff and @sethadam1) and funded by donations.

#datacleaning

Harald Klinke<p>The {emend} R package leverages large language models to clean and standardize data:<br>✔️ Fix typos<br>✔️ Map inconsistent entries<br>✔️ Translate + reorder factors<br>✔️ Clean addresses &amp; dates</p><p>📦 Install via CRAN: install.packages("emend")<br><a href="https://anuopensci.github.io/emend/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">anuopensci.github.io/emend/</span><span class="invisible"></span></a><br>⚠️ Reproducibility may vary across systems. Validate before use.<br><a href="https://det.social/tags/rstats" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rstats</span></a> <a href="https://det.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://det.social/tags/DataCleaning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataCleaning</span></a> <a href="https://det.social/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a> <a href="https://det.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenSource</span></a> <a href="https://det.social/tags/DataScience" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataScience</span></a></p>
Harald Klinke<p>Need to clean messy data with typos, abbreviations, or inconsistent labels?<br>📦 The {emend} R package uses LLMs to standardize categories, fix addresses &amp; more – perfect for bioinformatics, business, or any data project.<br>🔗 <a href="https://cran.r-project.org/web/packages/emend/index.html" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">cran.r-project.org/web/package</span><span class="invisible">s/emend/index.html</span></a><br><a href="https://det.social/tags/rstats" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rstats</span></a> <a href="https://det.social/tags/DataCleaning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataCleaning</span></a> <a href="https://det.social/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a> <a href="https://det.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://det.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenSource</span></a></p>
gaby_wald<p>Reading recommendation: "Data Cleaning (Pocket Primer)" by Oswald Campesato.</p><p><a href="https://framapiaf.org/tags/Data" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Data</span></a> <a href="https://framapiaf.org/tags/DataCleaning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataCleaning</span></a> <a href="https://framapiaf.org/tags/DataEngineer" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataEngineer</span></a> <a href="https://framapiaf.org/tags/Bash" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Bash</span></a> <a href="https://framapiaf.org/tags/Shell" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Shell</span></a> <a href="https://framapiaf.org/tags/VendrediLecture" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>VendrediLecture</span></a> ... (sed, grep, awk, and more!)</p><p>Basic commands and much more for data processing!</p>
OpenRefine<p><a href="https://fosstodon.org/tags/introductions" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>introductions</span></a></p><p>Hello Fediverse, we are <a href="https://fosstodon.org/tags/OpenRefine" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenRefine</span></a>, an open source power tool for working with messy data! We finally have an account here and will be sharing news about what is happening in the project and the broader community.</p><p><a href="https://fosstodon.org/tags/DataCleaning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataCleaning</span></a> <a href="https://fosstodon.org/tags/FOSS" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>FOSS</span></a> <a href="https://fosstodon.org/tags/OpenData" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenData</span></a> </p><p>− <span class="h-card"><a href="https://mamot.fr/@pintoch" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>pintoch</span></a></span></p>
Pamela Oliver<p>Wondering whether anyone is interested in an ongoing thread about the process of <a href="https://sciences.social/tags/DataCleaning" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataCleaning</span></a> and <a href="https://sciences.social/tags/PaperWriting" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>PaperWriting</span></a> in our project about <a href="https://sciences.social/tags/NewsCoverage" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>NewsCoverage</span></a> of <a href="https://sciences.social/tags/BlackProtest" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>BlackProtest</span></a> in mainstream newswires and Black newspapers. Lots of data to clean (~10K articles &amp; ~10K events), so looking for ways to get papers along the way. We clean data by issue cluster, so I reviewed the <a href="https://sciences.social/tags/MillionManMarch" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>MillionManMarch</span></a> articles &amp; decided to see if there is a paper about that. I think I saw some interesting patterns.</p>