Shoggoth Mini – A soft tentacle robot powered by GPT-4o and RL
RULER – Easily apply RL to any agent
Good article on how reinforcement learning improved current AI models. It also illustrates that today's LLMs are not just imitating.
https://arstechnica.com/ai/2025/07/how-a-big-shift-in-training-llms-led-to-a-capability-explosion/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social
#AI #reinforcementlearning
"Intelligence is figuring out how the world works rather than waiting for someone to tell you how the world works."
Join us as we hear from Andrew Barto and Richard Sutton, the 2024 #ACMTuringAward recipients, as they discuss their work on #ReinforcementLearning.
AI Progress: Understanding the Myths Behind the Intelligence Explosion
As the AI landscape evolves, the concept of an 'intelligence explosion' raises questions about the future of AI development. This article delves into the current state of AI, addressing misconceptions...
https://news.lavx.hu/article/ai-progress-understanding-the-myths-behind-the-intelligence-explosion
My colleagues at TU Delft are seeking to hire a postdoc to work on Applied Planning and Scheduling under Uncertainty, with applications in modelling supply chain scenarios for offshore wind farm installation: https://careers.tudelft.nl/job/Delft-Postdoc-in-Applied-Planning-and-Scheduling-under-Uncertainty-2628-CD/814890902/
New instance, new #introduction!
I'm a #DataScientist with a background in #ReinforcementLearning and #ElectricalEngineering. Well, that's what my resume says, but really I'm a #poet and a SF/F #writer. I love to play #DnD and other #TTRPGs.
I use they/them pronouns, and "Dr." not "Mr.", please and thank you.
I maintain a blog at www.seanpatrick.phd which includes a current list of publications, including my debut sonnet collection, "Love, Death, and Other Surprises."
#AI #MachineLearning #BiasInAI #STEMSaturday #DeepLearning #ComputerVision #Robotics #ReinforcementLearning
Meet the editors of "Mitigating Bias in Machine Learning" Dr. Carlotta Berry and Dr. Brandeis Hill Marshall (Brandeis Marshall, PhD)
This practical guide shows, step by step, how to use machine learning to carry out actionable decisions that do not discriminate based on numerous human factors, including ethnicity and gender.
On Sale On Amazon https://a.co/d/dtMizVH
Adding my love letter to
arxiv.org/pdf/2304.01315
Empirical Design in Reinforcement Learning
by
Andrew Patterson, Samuel Neumann, Martha White, Adam White
JMLR 25 (2024) 1-63
These aren’t the heroes we deserve, but they are the heroes we need.
If you've ever worked with a physical robot and #ReinforcementLearning you've had to deal with delays. Thinking takes time, even at computer speeds, and the world doesn't stop.
One way to minimize the delays is for the world to act on new commands mid-cycle, rather than wait for its next turn.
Supercon 2023: Teaching Robots How to Learn - Once upon a time, machine learning was an arcane field, the preserve of a precious... - https://hackaday.com/2024/09/03/supercon-2023-teaching-robots-how-to-learn/ #reinforcementlearning #2023hackadaysupercon #machinelearning #algorithm #arduino #esp32s3 #cons
Someone just shared this awesome comic with me. Does anyone know the original source? (I can't read the small signature.) 3 Complaining #machinelearning robots : #SupervisedLearning - they gave me so much to read, and test! #unsupervised - Me too. But at least they told you the answers. #reinforcementlearning - At least you don't get punished for every wrong action.
A fun part of working on a #ReinforcementLearning workbench is that I get to think about how to connect different kinds of agents to different kinds of worlds – representation, interfaces, abstraction.
Something I’m stumbling on is representing models and planners.
Is there such a thing as a planner distinct from a model? Or is planning just something a model does?
In object-oriented programming terms, would a planner be a separate class from a model? Or would it be a method in a model class?
In architecture diagrams, they are often drawn as separate boxes, but as I go to implement a handful of use cases, I’m having trouble making that abstraction.
Speculative musings welcome
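To make the planner-versus-model question above concrete, here is a minimal Python sketch of the "separate class" option. Everything in it is a hypothetical illustration rather than anyone's actual workbench design: the `Model` interface, the toy `LineWorldModel`, and the brute-force `ExhaustivePlanner` are all assumed names. The planner holds no knowledge of the world; it only queries whatever model it is handed.

```python
from itertools import product
from typing import Protocol, Sequence, Tuple


class Model(Protocol):
    """Hypothetical interface: anything that predicts the result of an action."""

    def predict(self, state: int, action: int) -> Tuple[int, float]:
        ...


class LineWorldModel:
    """Toy model (assumption): states are integers, reward peaks at position 3."""

    def predict(self, state: int, action: int) -> Tuple[int, float]:
        next_state = state + action
        reward = -abs(3 - next_state)
        return next_state, float(reward)


class ExhaustivePlanner:
    """A planner as its own class: it searches action sequences using a model."""

    def __init__(self, model: Model, actions: Sequence[int], horizon: int) -> None:
        self.model = model
        self.actions = actions
        self.horizon = horizon

    def plan(self, state: int) -> Tuple[int, ...]:
        best_plan: Tuple[int, ...] = ()
        best_return = float("-inf")
        # Try every action sequence of length `horizon` and keep the best one.
        for candidate in product(self.actions, repeat=self.horizon):
            s, total = state, 0.0
            for a in candidate:
                s, r = self.model.predict(s, a)
                total += r
            if total > best_return:
                best_plan, best_return = candidate, total
        return best_plan


planner = ExhaustivePlanner(LineWorldModel(), actions=(-1, 0, 1), horizon=3)
first_plan = planner.plan(state=0)
```

In this arrangement the same planner can be pointed at a different model without changes, which is one argument for the separate boxes. The alternative, a `plan` method on the model class itself, would be shorter but ties one search strategy to one model.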
What kind of bug would make machine learning suddenly 40% worse at NetHack?
Members of the Legendary Compu... - https://arstechnica.com/?p=2028789 #reinforcementlearning #imitationlearning #machinelearning #softwarebugs #roguelikes #moonphase #roguelike #nethack #gaming #bugs #cuda #ai
Mini quadcopter learns to fly in seconds | heise online
https://heise.de/-9623443 #DeepReinforcementLearning #ReinforcementLearning #RL