Model Context Protocol and Why You Should Know About It
LLM, RAG, and MCP, and what it means for vet med
A little while ago, I posted a rather harsh criticism of the Veterinary Innovation Council’s guidance document titled “What Do Veterinary Professionals Need to Know about Artificial Intelligence in 2025?”
I stand by the review, although I acknowledge it’s on the sharper side. My biggest problem with it is the wildly ambitious title, as though a veterinary professional’s learning on AI in 2025 could be complete by April.
I get the instinct to take a stand and have an opinion, to establish yourself as an authority, but a claim as sweeping as the one in that guidance document’s title gets my hackles up.
Veterinary professional or not, your learning shouldn’t stop just because somebody else claims that they’ve covered everything you need to know. And, further, I’d be especially wary of those in power who confidently assure you that you need to look no closer.
Predictably and conveniently, the world of artificial intelligence and its large language models has provided me with an excellent example of why your learning shouldn’t stop with self-proclaimed experts who announce that they’ve covered all that the veterinary professional needs to know about artificial intelligence in 2025.
There’s an exciting new development in LLMs that seems poised to rapidly accelerate the development of tools built on them. It’s a bit technical, so I’m going to start at the beginning.

Alright, let’s review!
A Large Language Model (LLM) is a type of artificial intelligence trained on vast amounts of text to understand and generate human-like language. It’s a super-powered auto-complete,1 predicting the desired response based on an insanely massive history. OpenAI’s ChatGPT, Anthropic’s Claude, Google’s Gemini, and Meta’s LLaMA are all examples of LLMs, and I will sometimes refer to them as “the big vanilla models.”
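If you want to see the auto-complete idea in miniature, here’s a toy sketch in Python: it counts which word tends to follow which in a tiny made-up sample of text and then “predicts” the next one. Everything in it (the sample sentences, the function name) is invented for illustration; a real LLM does something conceptually similar with billions of learned parameters rather than a lookup table.

```python
from collections import Counter, defaultdict

# A tiny "training corpus" -- a real LLM sees trillions of words, not two sentences.
corpus = (
    "the cat is vomiting the cat is hiding "
    "the dog is vomiting the dog is limping"
).split()

# Count which word follows which: a crude stand-in for what an LLM learns at scale.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the word most often seen after `word` in the corpus."""
    candidates = following.get(word)
    return candidates.most_common(1)[0][0] if candidates else "?"

print(predict_next("cat"))  # -> "is"
print(predict_next("is"))   # -> "vomiting" (the most frequent follower)
```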
RAG stands for Retrieval-Augmented Generation. It’s a technique that combines a search engine with an LLM: the system searches a database or a document set to find relevant information, and then an LLM uses the retrieved information to generate a response.
RAG engines have gotten some attention of late in veterinary medicine because a) much of the quality veterinary information isn’t publicly accessible for the LLMs’ training and b) there’s an awful lot of garbage veterinary advice that is publicly accessible for the LLMs’ training.
A RAG engine can search a given source, for example a veterinary hospital’s records, the veterinary literature, or a research database, for up-to-date information. That is quite valuable in a field like ours.
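For the technically curious, the pipeline itself is conceptually tiny. Here’s a deliberately naive sketch in Python: a keyword-overlap “retriever” over a few made-up clinic notes, plus a stubbed-out call standing in for whatever LLM you’d actually use. The ask_llm function and the example documents are assumptions for illustration, not any particular product’s API.

```python
# A toy document store -- in practice this would be hospital records, journal
# articles, or a formulary, usually indexed with embeddings rather than keywords.
documents = [
    "Feline spay post-op: restrict activity for 10-14 days, check incision daily.",
    "Canine parvovirus: supportive care, isolation, aggressive fluid therapy.",
    "Rabbit GI stasis: syringe feeding, motility agents, pain control.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the query (the 'R' in RAG)."""
    query_words = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(query_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def ask_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (OpenAI, Claude, a local model, etc.)."""
    return f"[The LLM would answer here, grounded in:]\n{prompt}"

def rag_answer(question: str) -> str:
    context = "\n".join(retrieve(question, documents))
    prompt = f"Using only this context:\n{context}\n\nAnswer: {question}"
    return ask_llm(prompt)  # the 'AG': augmented generation

print(rag_answer("How long should activity be restricted after a cat spay"))
```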
The Model Context Protocol (MCP) is a way for AI tools like chatbots, assistants, or agents to get to and use the right information at the right time. RAG is what you do (give the model better information to use) and MCP is how you do it: reliably, cleanly, and at scale. Think of it as a scalable way of having a librarian retrieve any book you need mid-test.
What this does for developers building LLM tools is turn AI-integration chaos into manageable, modular order. Now developers have a shared protocol for building smarter, context-aware AI systems.
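For a sense of how small the moving parts are, here’s a minimal sketch of an MCP server (the librarian) written with Anthropic’s open-source Python SDK, the mcp package and its FastMCP helper. The server name, the tool, and what it returns are made up for illustration.

```python
# pip install mcp  (Anthropic's open-source Python SDK for the protocol)
from mcp.server.fastmcp import FastMCP

# The "librarian": a server that exposes capabilities any MCP-aware model can use.
mcp = FastMCP("clinic-library")

@mcp.tool()
def lookup_reference(topic: str) -> str:
    """Fetch a reference snippet on a topic (stubbed out for illustration)."""
    return f"Reference material about {topic} would be returned here."

if __name__ == "__main__":
    mcp.run()  # serves the tool over the standard MCP transport (stdio by default)
```

Any MCP-aware client, such as Claude Desktop or an IDE assistant, can connect to a server like that and discover its tools without custom integration code.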
So what?
It treats LLMs like active agents, not just passive responders.
Instead of just being prompted, the model requests what it needs and MCP routes the request.
It standardizes the interfaces.
Think of it like USB for LLMs: far fewer things need their own plugs. (There’s a rough sketch of one of these standardized exchanges after this list.)
It scales.
You don’t need to rewrite the whole thing to add a tool. It just plugs in with the existing interface.
It works and has widespread adoption.
It’s an open protocol supported by Anthropic, OpenAI, and Google. Tools like Copilot, Cursor, and Replit are using it. It’s winning the battle to become a standard.
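To make the USB analogy concrete, here’s roughly what one of those standardized exchanges looks like under the hood. MCP messages are JSON-RPC; the dictionaries below sketch the shape of a tool call and its response, with the tool name and arguments invented, and they’re written as Python only to keep one language throughout the post.

```python
import json

# When the model decides it needs something, the client sends a standardized
# request to whichever MCP server offers that tool...
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {
        "name": "lookup_reference",                      # the tool registered earlier
        "arguments": {"topic": "feline hyperthyroidism"},
    },
}

# ...and gets back a response in the same standardized shape, no matter which
# vendor built the server or which model asked the question.
tool_call_response = {
    "jsonrpc": "2.0",
    "id": 42,
    "result": {
        "content": [
            {"type": "text", "text": "Reference material about feline hyperthyroidism..."}
        ]
    },
}

print(json.dumps(tool_call_request, indent=2))
```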
Why should we care in vet med? Because now it’s a lot easier to build better things.
Instead of jamming SOAP notes, lab results, or treatment plans through brittle prompts and shaky glue logic between systems (scribes, PIMS, calculators, etc.), you can register tools and expose structured data so the model can pull what it needs.
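As a hedged illustration of what registering tools and exposing structured data could look like, here’s the earlier librarian sketch extended with a couple of vet-flavored tools. The lab lookup and the dose calculator are stand-ins I made up, not any real PIMS integration.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("clinic-tools")

@mcp.tool()
def get_lab_results(patient_id: str) -> dict:
    """Return structured lab results for a patient (stubbed; a real server would query the PIMS)."""
    return {"patient_id": patient_id, "BUN": 32, "creatinine": 2.1, "units": "mg/dL"}

@mcp.tool()
def carprofen_dose_mg(weight_kg: float, mg_per_kg: float = 4.4) -> float:
    """Simple dose arithmetic the model can call instead of doing math in its head."""
    return round(weight_kg * mg_per_kg, 1)

if __name__ == "__main__":
    mcp.run()
```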
Some of the big vanilla models could already do things like access tools, use memory from chats, call APIs, or reference files you’ve uploaded, but those abilities were much more limited than what’s possible now.
This is cool! This is exciting! And while it might seem arcane to those of us who do simple things like use LLMs or spay cats for a living, it’s got the potential to have some real positive impact on the way that things are built in our corner of the world.
But don’t take my word for it, keep learning! It’s not the last thing you’ll need to know this year.
1. A wildly reductive analogy.