- specs on what each attention head does

Done:
- change attention to cover the entire generated sequence
- change width of chatbot window
- add output token generation to attention, tokenization, etc.
- output slider to look at each token (scrubber with top-5 at each position)
- side-by-side comparison of experiment results
- output streaming for chatbot
- request shorter, more concise responses in system prompt
- add video links to glossary
    - 3Blue1Brown