Learning Materials
- Intro to Large Language Models by Andrej Karpathy
- Short Courses by DeepLearning.AI
- What We Learned from a Year of Building with LLMs: Part 1, Part 2, Part 3
- Book: Understanding LangChain4j by Antonio Goncalves
Local LLMs
- LocalLLaMA community on Reddit
- Ollama
- LocalAI
- Guide to Choosing Quantization Methods and Inference Engines
Evaluations
- Your AI Product Needs Evals
- Creating a LLM-as-a-Judge That Drives Business Results
- A Practical Guide to RAG Pipeline Evaluation (Part 1: Retrieval)
- A Practical Guide to RAG Pipeline Evaluation (Part 2: Generation)
- How important is a Golden Dataset for LLM evaluation?
- Case Study: Reference-free vs Reference-based evaluation of RAG pipeline
- How to evaluate complex GenAI Apps: a granular approach
- Generate Synthetic Data to Test LLM Applications
Agents
- Building effective agents by Anthropic
Leaderboards
Language Models
- LMSYS Chatbot Arena
- SEAL Leaderboards
- Comparing models for quality, speed, price, etc.
- Hallucinations: Vectara, Hallucinations
- Code Generation: BigCode
- Tools/Functions: Gorilla, Nexus, Toolbench
- Performance (latency, throughput, memory, etc.)
- Enterprise Scenarios