
youtube.com/watch?v=UGO_Ehywux
Absolutely amazing talk on #LLM "Mechanistic Interpretability", or in layman's terms: how to untangle LLMs, understand how they "think", and figure out which neurons are responsible for something like 'doubt'.

I really like that they use approachable, not overly technical language in this. I can really recommend watching it if you want to understand more about how the #Transformer architecture works.
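A taste of the simplest version of that kind of analysis (my own toy illustration, not something from the talk): compare neuron activations on texts that express doubt against neutral texts, and rank neurons by the difference.

```python
# Toy "which neurons respond to doubt" probe on synthetic activations;
# real mechanistic interpretability goes much deeper than mean differences.
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 512

# Stand-ins for an LM's hidden activations on two sets of sentences.
doubt_acts = rng.normal(size=(100, n_neurons))
neutral_acts = rng.normal(size=(100, n_neurons))
doubt_acts[:, 42] += 2.0  # plant a fake "doubt neuron" at index 42

diff = doubt_acts.mean(axis=0) - neutral_acts.mean(axis=0)
print(np.argsort(-np.abs(diff))[:5])  # neuron 42 should rank first
```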

A while ago I wrote here about a paper that found that LLMs store some knowledge as reproductions of simple linear functions. Not all knowledge, but relational facts like "what is the capital of Germany". [0]
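To make "knowledge as a linear function" concrete, here is a toy numpy sketch (my own illustration, not the paper's code): if a relation really is affine in the hidden states, a single map o ≈ W·s + b fitted on a few (subject, object) pairs should decode held-out pairs too.

```python
# Toy version of the linear-relation idea from [0]: one affine map
# decodes a relation from subject hidden states to object hidden states.
# All shapes and data here are synthetic, for illustration only.
import numpy as np

d = 8                                   # toy hidden-state size
rng = np.random.default_rng(0)

# Stand-ins for LM hidden states of subjects ("Germany", "France", ...)
# and the matching objects ("Berlin", "Paris", ...); affine by construction.
subjects = rng.normal(size=(20, d))
W_true = rng.normal(size=(d, d))
b_true = rng.normal(size=d)
objects = subjects @ W_true.T + b_true

# Fit one affine map o = W s + b from 12 pairs via least squares ...
X = np.hstack([subjects[:12], np.ones((12, 1))])   # append bias column
coef, *_ = np.linalg.lstsq(X, objects[:12], rcond=None)
W_hat, b_hat = coef[:d].T, coef[d]

# ... and check it decodes the held-out pairs as well.
pred = subjects[12:] @ W_hat.T + b_hat
print(np.allclose(pred, objects[12:]))  # True
```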

There is something called Kolmogorov-Arnold Networks (KANs) [1], which build the network directly out of small learnable mathematical functions: each edge carries its own learnable univariate function (a spline, combined with a base activation) instead of a fixed scalar weight.
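To illustrate the core idea (my own toy version, not the pykan API; pykan uses B-splines where I use per-edge polynomials):

```python
# Toy KAN-style layer: every edge has its own small learnable univariate
# function (here a degree-3 polynomial); nodes just sum the edge outputs.
import torch
import torch.nn as nn

class ToyKANLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, degree: int = 3):
        super().__init__()
        # One coefficient vector per (output, input) edge.
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, degree + 1))
        self.degree = degree

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, in_dim)
        # Powers x^0..x^degree of each input: (batch, in_dim, degree + 1)
        powers = torch.stack([x ** k for k in range(self.degree + 1)], dim=-1)
        # Evaluate each edge's polynomial, then sum over incoming edges.
        edge_out = torch.einsum('bik,oik->boi', powers, self.coeffs)
        return edge_out.sum(dim=-1)                      # (batch, out_dim)

layer = ToyKANLayer(2, 1)
print(layer(torch.randn(4, 2)).shape)  # torch.Size([4, 1])
```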

So now I am asking myself what a mixed network would be able to do; I imagine it would be quite performant to store linear functions directly as such, rather than having them simulated by a group of neurons (rough sketch below).
But then again, getting the model to store this information directly in such functions is maybe impossible without some resource-usage reward... which has the potential to introduce a whole extra set of problems.
And then there is the problem of how to integrate such a thing in the first place.
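To pin down what I mean by "mixed", here is a purely speculative sketch (all names hypothetical, not from [0] or [1]): a block where a plain linear path sits next to an ordinary MLP path, so an affine relation could in principle be stored in the linear path instead of being simulated by neurons. Whether training would actually route it there is exactly the open question above.

```python
# Speculative sketch of a "mixed" block: a direct linear path that can
# store an affine map exactly, summed with an ordinary MLP path.
import torch
import torch.nn as nn

class MixedBlock(nn.Module):
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.linear_path = nn.Linear(dim, dim)  # could hold "capital-of"-style affine maps directly
        self.mlp_path = nn.Sequential(          # regular neurons for everything non-linear
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear_path(x) + self.mlp_path(x)

block = MixedBlock(dim=16, hidden=64)
print(block(torch.randn(4, 16)).shape)  # torch.Size([4, 16])
```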

[0] arxiv.org/pdf/2308.09124

[1] github.com/KindXiaoming/pykan

#AI #LLM #KAN