Andris V
AndrisWillow
AI & ML interests
Mechanistic interpretability, model diffing and alignment
Organizations
None yet
Models 9
AndrisWillow/Llama3.2-1B-1B_it-BatchTopK_cc_diff_K70-13l-resid_pre-16k
AndrisWillow/Qwen2.5-0.5B_it-cc_diff-20l-resid_pre-16k
AndrisWillow/Qwen2.5-0.5B-L13-k50-lr5e-05-CCLoss
AndrisWillow/Qwen2.5-0.5B-L13-mu4.1e-02-lr5e-05-CCLoss
AndrisWillow/Qwen2.5-0.5B-0.5B_it-cc_diff_WTemplTok-13l-resid_pre-14k
AndrisWillow/Llama3.2-1B-1B_it-cc_diff-13l-resid_pre-16k
AndrisWillow/Qwen2.5-0.5B-crosscoder-13resid_pre_IT-chat-template-2
AndrisWillow/Qwen2.5-1.5B-1.5B_it-cc_diff-15l-resid_pre-16k
AndrisWillow/Qwen2.5-0.5B-0.5B_it-cc_diff-13l-resid_pre-14k
Datasets 12
AndrisWillow/pile-500k
500k • 6
AndrisWillow/lmsys-500k-Llama3.2_chat_format
500k • 7
AndrisWillow/diffing-stats-Llama-3.2-1B-L13-k70-lr5e-05-CCLoss
16.4k • 3
AndrisWillow/Pile-Lmsys-1m-tokenized-1024-Llama3.2_chat_format
924k • 3
AndrisWillow/pile-lmsys-mix-1m-Llama3.2_chat_format
1M • 3
AndrisWillow/Pile-Lmsys-1m-tokenized-1024-Qwen2.5-WTemplTok
967k • 3
AndrisWillow/Pile-Lmsys-1m-tokenized-Llama-3.2-1B
916k • 3
AndrisWillow/Pile-Lmsys_qwen_format-1m-tokenized-Qwen2.5
966k • 3
AndrisWillow/pile-lmsys_qwen_format-mix-1m
1M • 3
AndrisWillow/Pile-Lmsys-1m-tokenized-1024-Qwen2.5
951k • 4