The attention mechanism is built around an attention function that takes three sets of vectors — the "queries", the "keys", and the "values" — and outputs a weighted combination of the values. The core function itself (scaled dot-product attention) has no learnable parameters; the learnable parameters live in the projection matrices that produce the queries, keys, and values from the input. Conceptually, each query is compared against every key, and the values whose keys are most "similar" to the query contribute most to the output. https://theaisummer.com/transformer/
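A minimal NumPy sketch of the idea described above (scaled dot-product attention; variable names and the toy inputs are my own, not from the linked article):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q: (n_queries, d), K: (n_keys, d), V: (n_keys, d_v)
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted average of the values

# Toy example: one query that matches the first key more than the second,
# so the output is pulled toward the first value vector.
Q = np.array([[1.0, 0.0]])
K = np.array([[1.0, 0.0],
              [0.0, 1.0]])
V = np.array([[1.0, 2.0],
              [3.0, 4.0]])
out = scaled_dot_product_attention(Q, K, V)
```

Note that there are no trainable weights in this function; in a transformer layer, `Q`, `K`, and `V` are produced by learned linear projections of the input.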