Medical Domain Models
| Model (repo) | Size | Good for | Why it helps |
| --- | --- | --- | --- |
| epfl-llm/meditron-70b | 70 B | KE / RG | SOTA clinical reasoning; beats GPT-3.5 on MedQA, making it a strong teacher or inference backbone. |
| epfl-llm/meditron-7b | 7 B | KE / DST | Same corpus; the 16-bit and 4-bit variants run on a single A100. |
| Flmc/DISC-MedLLM | 13 B | RG | Conversation-tuned for patient–doctor dialogue; drops straight into Purpose's inference module. |
| stanford-crfm/BioMedLM | 2.7 B | DST | Lightweight GPT-style model trained only on PubMed; well suited to browser/mobile deployment. |
| Simonlee711/Clinical_ModernBERT | 110 M | KM | Fast clinical NER or sentence-embedding layer when you just need high-recall entity coverage. |
| microsoft/BioGPT-Large | 1.5 B | QG | Generative specialist for biomedical text; good at crafting exam-style QA pairs for distillation. |
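A minimal sketch of the "4-bit on a single A100" setup mentioned above, assuming transformers and bitsandbytes are installed. `quant_kwargs()` and `load_meditron()` are illustrative helpers, not part of any library API:

```python
# Hedged sketch: loading epfl-llm/meditron-7b quantized so it fits one GPU.
# Only the repo id comes from the table; the helpers are illustrative.

def quant_kwargs(bits: int = 4) -> dict:
    """Map a bit-width to Hugging Face from_pretrained() kwargs."""
    if bits == 4:
        # Newer transformers versions prefer a BitsAndBytesConfig object;
        # these flat kwargs are the widely supported shorthand.
        return {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError("only 4-bit or 8-bit quantization is sketched here")

def load_meditron(repo_id: str = "epfl-llm/meditron-7b", bits: int = 4):
    """Download tokenizer + quantized model (needs a GPU and HF access)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tok = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, device_map="auto", **quant_kwargs(bits)
    )
    return tok, model
```

The same pattern works for any of the 7 B-class models in these tables; only the repo id changes.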
Math Domain Models
| Model | Size | Stage | Why |
| --- | --- | --- | --- |
| MathLLMs/MathCoder-L-13B (and 7B) | 7-13 B | RG | Code-augmented math solver; well suited to "explain + derive" answers. |
| MathLLMs/MathCoder-CL-34B | 34 B | KE | Larger context window (16 k tokens) for theorem-heavy corpora. |
Embedding & Reranker Models
| Model | Type | Tokens | Notes |
| --- | --- | --- | --- |
| BAAI/bge-large-en-v1.5 | dense embed | 512 | SOTA on MTEB retrieval; easy drop-in for KM. |
| BAAI/bge-m3 | multi-function | 8 k | One model for dense, sparse, and multi-vector retrieval; supports 100+ languages. |
| BAAI/bge-reranker-v2-m3 | cross-encoder | 4 k | Lightweight reranker for high-recall pipelines. |
| NeuML/pubmedbert-base-embeddings-matryoshka | domain embed | 128-768 (output dims) | Dynamic-dimension biomedical embeddings; great when storage space matters. |
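The table implies a two-stage flow: a cheap bi-encoder (e.g. bge-large-en-v1.5) for recall, then a cross-encoder (bge-reranker-v2-m3) for precision on the shortlist. A hedged, offline sketch of that shape, with toy scorers standing in for the real model calls:

```python
# Two-stage retrieve-then-rerank skeleton. The scorers are injected so this
# runs offline; in production they would wrap real embedding/reranker models.
from typing import Callable

def retrieve_then_rerank(
    query: str,
    docs: list[str],
    embed_score: Callable[[str, str], float],   # cheap bi-encoder stand-in
    rerank_score: Callable[[str, str], float],  # cross-encoder stand-in
    k: int = 10,
) -> list[str]:
    # Stage 1: score the whole corpus cheaply, keep the top-k shortlist.
    shortlist = sorted(docs, key=lambda d: embed_score(query, d), reverse=True)[:k]
    # Stage 2: run the expensive reranker over the shortlist only.
    return sorted(shortlist, key=lambda d: rerank_score(query, d), reverse=True)

# Toy stand-ins: token overlap as "embedding", substring match as "reranker".
def toy_embed(q, d): return len(set(q.split()) & set(d.split()))
def toy_rerank(q, d): return float(q in d)
```

The point of the split is cost: the bi-encoder scores every document once, while the quadratic-cost cross-encoder only ever sees k candidates.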
Legal Domain Models
| Model | Size | Good for | Why it helps |
| --- | --- | --- | --- |
| lexlms/legal-roberta-base | 125 M | KM | Legal-domain RoBERTa encoder; excellent for document classification. |
| lexlms/legal-longformer-base | 149 M | KM | Long-context legal encoder supporting 4,096 tokens; ideal for contracts and other long legal documents. |
| nlpaueb/legal-bert-base-uncased | 110 M | KE | Pre-trained on legislation, US case law, and contracts; strong for legal entity recognition and citation linking. |
| CaseLawBERT/CaseLawBERT | 340 M | KE | Specialized for case-law understanding, with high precision on legal-precedent identification. |
| IBM/Legal-Universe-Llama-2-7b | 7 B | RG | Legal reasoning and compliance analysis; trained on regulatory documents. |
Finance Domain Models
| Model | Size | Good for | Why it helps |
| --- | --- | --- | --- |
| yiyanghkust/finbert-tone | 110 M | KM | Sentiment analysis for financial text; valuable for market-sentiment extraction. |
| ProsusAI/finbert | 110 M | KM | Financial-domain BERT fine-tuned for sentiment classification of financial text. |
| FinGPT/fingpt-mt_llama2-7b | 7 B | RG | Multi-task financial LLM; specialized for market analysis and financial forecasting. |
| microsoft/phi-2-finance | 2.7 B | DST | Compact financial model with strong performance on specialized fiscal knowledge. |
| NVIDIA/NeMo-Megatron-Fin | 20 B | KE | Large financial model with strong regulatory knowledge and compliance capability. |
Code & Technical Models
| Model | Size | Good for | Why it helps |
| --- | --- | --- | --- |
| facebook/incoder-6B | 6 B | RG | Specialized for code infilling and completion; excellent for developer assistance. |
| WizardLM/WizardCoder-Python-34B | 34 B | RG | Expert-level Python code generation and explanation; outperforms many larger models. |
| codellama/CodeLlama-7b-hf | 7 B | DST | Base model for fine-tuning domain-specific code generators; supports multiple languages. |
| bigcode/starcoder2-15b | 15 B | RG | Trained on permissively licensed code; strong for enterprise integrations. |
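The infilling models above expect fill-in-the-middle (FIM) prompts rather than plain left-to-right completion. A hedged sketch using the StarCoder-style sentinel tokens; these are an assumption here, and InCoder in particular uses its own sentinels, so always check the target model's tokenizer:

```python
# Build a fill-in-the-middle prompt: the model generates the code that
# belongs between `prefix` and `suffix`. Sentinels are StarCoder-style.
def fim_prompt(prefix: str, suffix: str) -> str:
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# e.g. ask the model to fill in the body of add():
prompt = fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(2, 3))")
```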
Overriding the default task→model map in Purpose's ModelHub:

```python
# Example snippet for ModelHub overrides
from main.utils.model_hub import PurposeAPIClient

client = PurposeAPIClient(api_token="YOUR_HF_TOKEN")
client.task_model_map.update({
    "knowledge_extraction": ["epfl-llm/meditron-7b", "FinGPT/FinGPT-Chat"],
    "knowledge_mapping": ["BAAI/bge-m3"],
    "response_generation": ["Equall/Saul-7B-Instruct-v1"],
    "distillation_target": ["microsoft/Phi-3-mini-4k-instruct"],
})
```