AMD Finance LLM

Domain-specific large language model fine-tuned for finance, optimized for AMD Instinct MI300X GPUs via ROCm.

Model Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Parameters: 7.6B
  • Architecture: Qwen2
  • Fine-tuned on: 500K financial instruction examples

Training Recipe

Derived from the ODA-Fin (arXiv:2603.07223) and Fin-R1 (arXiv:2503.16252) recipes:

| Hyperparameter | Value |
|---|---|
| Learning rate | 1e-5 (cosine schedule, 10% warmup) |
| Epochs | 3 |
| Batch size | 1 × 16 gradient accumulation steps |
| Sequence length | 8192 |
| Precision | bfloat16 |
| Attention | Flash Attention 2 (CK backend) |
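
To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule with 10% warmup in plain Python. The linear warmup shape, the decay to zero, and the single-device step count are assumptions for illustration. The card does not state them, and `lr_at` is a hypothetical helper, not part of the training code.

```python
import math

def lr_at(step, total_steps, peak_lr=1e-5, warmup_frac=0.10):
    """Learning rate at a given optimizer step.

    Assumed shape: linear warmup from 0 to peak_lr over the first
    warmup_frac of training, then cosine decay from peak_lr to 0.
    """
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Assumption: a single device, so each optimizer step sees
# 1 sequence x 16 accumulation steps = 16 sequences.
examples, epochs, effective_batch = 500_000, 3, 1 * 16
total_steps = examples * epochs // effective_batch

# Print the rate at the start, end of warmup, midpoint, and final step.
for step in (0, total_steps // 10, total_steps // 2, total_steps):
    print(f"step {step:>6}: lr = {lr_at(step, total_steps):.2e}")
```

Under these assumptions the rate reaches its 1e-5 peak once warmup ends and falls back toward zero by the final step. Training on several devices would shrink the step count proportionally.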

Hardware Stack

  • AMD Instinct MI300X (192GB HBM3)
  • ROCm 6.2 + PyTorch
  • Hugging Face Optimum-AMD + vLLM for serving
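
As a rough sanity check on why a single 192 GB accelerator is a comfortable fit for a 7.6B-parameter model, the sketch below estimates memory from bytes per parameter. The training figure assumes full fine-tuning with AdamW in mixed precision (bf16 weights and gradients, fp32 master weights, two fp32 optimizer moments). The card does not say which method was used, so treat it as an illustrative upper-level estimate that ignores activations and the KV cache.

```python
PARAMS = 7.6e9   # parameter count from Model Details
HBM_GB = 192     # MI300X HBM3 capacity from Hardware Stack
GB = 1e9         # decimal gigabytes

def state_gb(bytes_per_param):
    """Memory for per-parameter state, in decimal GB."""
    return PARAMS * bytes_per_param / GB

# Inference: bfloat16 weights only (2 bytes per parameter).
inference_gb = state_gb(2)

# Assumed training setup (not confirmed by the card):
# bf16 weights (2) + bf16 gradients (2) + fp32 master weights (4)
# + two fp32 AdamW moments (4 + 4) = 16 bytes per parameter.
training_gb = state_gb(2 + 2 + 4 + 4 + 4)

print(f"bf16 weights:         {inference_gb:.1f} GB")
print(f"assumed train states: {training_gb:.1f} GB")
print(f"headroom on {HBM_GB} GB: {HBM_GB - training_gb:.1f} GB")
```

The bf16 weights alone come to about 15.2 GB, so most of the device memory is left over for activations at the 8192-token sequence length or for the KV cache when serving.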

Usage

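import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shah-shazid-askary/amd-finance-llm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16 (the training precision) and place the model on the GPU.
# device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content": "What is a P/E ratio?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move the tokenized inputs to the same device as the model.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))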

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.