I need to make a new update post on all the AI stuff. Things move so fast that I often just can't be bothered! I'm making this post mostly for myself and people who ask about something very specific.
LangChain recently announced classes for creating custom agents (I think they had some standard Agents before that too). Haystack has Agents as well, although their definition explicitly involves looping until the output is deemed OK, which most implementations need to do anyway.
The way I understand it and see it implemented, an Agent is essentially an abstraction that allows LLMs (or rather, a pipeline of LLM calls) to use "tools". A tool could be, for example, a calculator, a search engine, or a webpage retriever. The Agent has a prompt where it reasons about which tool it's supposed to use, actually invokes it, and records an observation, which it can then build on or output.
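To make the tool-using loop concrete, here's a minimal sketch in plain Python with no framework at all. The tools and the hard-coded routing are made up for illustration; in LangChain or Haystack, the decision of which tool to call (and with what input) comes out of the LLM's own reasoning text rather than being passed in directly.

```python
def calculator(expression: str) -> str:
    """Tool: evaluate a simple arithmetic expression (toy version only)."""
    return str(eval(expression, {"__builtins__": {}}))

def search(query: str) -> str:
    """Tool: stand-in for a real search engine lookup."""
    return f"(pretend search results for: {query})"

TOOLS = {"calculator": calculator, "search": search}

def agent_step(llm_decision: tuple[str, str]) -> str:
    """One act/observe cycle. In a real agent, `llm_decision` (a tool
    name plus tool input) would be parsed out of the LLM's reasoning,
    and the observation would be fed back into the next prompt."""
    tool_name, tool_input = llm_decision
    return TOOLS[tool_name](tool_input)

print(agent_step(("calculator", "2 * 21")))  # prints 42
```

The framework libraries mostly add the plumbing around this loop: formatting the prompt so the LLM emits parseable tool calls, and looping until it declares a final answer.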
It also allows a task to be decomposed and tackled step by step, which can make the system much more powerful. It's a lot closer to how a human might reason. An example of this general idea taken to the extreme is Auto-GPT, which you can send on its merry way to achieve some high-level goal for you and hope it doesn't cost you an arm and a leg. Anyone remember HustleGPT btw?
There's something called the ReAct framework (Reasoning + Acting -- I know, unfortunate name) which is the common "prompt engineering" part of this, and prompts using this framework are usually built into these higher-level libraries like LangChain and Haystack. You might also see the acronym MRKL (Modular Reasoning, Knowledge and Language, pronounced "miracle"). It comes from an older paper (lol, last year is now "old"), and it seems that ReAct is basically a type of MRKL system that is also able to "reason". The terms might be used interchangeably though, and people are often confused about where they differ. The ReAct paper has much clearer examples.
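To give a feel for it, here's a sketch of what a ReAct-style trace looks like. The format is paraphrased from the ReAct paper's examples; the question, tool names, and numbers are all made up. Below it is the kind of trivial parsing an agent loop does to turn the LLM's text back into tool calls:

```python
# A made-up ReAct-style trace: interleaved Thought / Action / Observation
# lines, ending in a final answer.
REACT_TRACE = """\
Question: What is the population of the capital of France, doubled?
Thought: I need the capital of France first.
Action: search[capital of France]
Observation: Paris
Thought: Now I need the population of Paris.
Action: search[population of Paris]
Observation: about 2.1 million
Thought: Double it: 2.1 million * 2 = 4.2 million.
Action: finish[about 4.2 million]
"""

def parse_actions(trace: str) -> list[str]:
    """Pull the Action lines out of a trace, roughly what an agent
    loop does to decide which tool to call next."""
    return [line.removeprefix("Action: ")
            for line in trace.splitlines()
            if line.startswith("Action: ")]
```

The Observation lines are where the tool outputs get spliced back into the prompt before the LLM continues reasoning.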
A common tool is now, of course, embeddings search, which you can then chain to completion / chat. You might remember two months ago when I said at the bottom of my post about GPT use cases that this is where I think the gold dust lies. Back then, I had linked gpt_index; it's now called llama_index and has become relatively popular. It lets you pick what models you want to use (including the OpenAI ones still, unlike what the rename might suggest), what vector store you want to use (including none at all if you don't have loads of data), and has a lot of useful functionality, like automatically chopping up PDFs for your embeddings.
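The embeddings-search pattern itself is simple enough to sketch without any library: embed your documents once, then at query time rank them by cosine similarity against the query embedding and feed the top hits into the completion/chat call as context. The 3-d vectors below are toy stand-ins for real embedding vectors (OpenAI's, for instance, are 1536-dimensional):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy "vector store": document name -> pre-computed embedding.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "office hours": [0.1, 0.8, 0.2],
    "holiday schedule": [0.0, 0.3, 0.9],
}

def top_k(query_vec, k=1):
    """Return the k documents most similar to the query embedding;
    these would be pasted into the LLM prompt as context."""
    return sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)[:k]
```

A dedicated vector store earns its keep only when a linear scan like this gets too slow, which is why skipping one is fine for small datasets.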
Not too long ago, OpenAI released their own plugin for this, which has a lot of the same conveniences. One surprising thing: OpenAI's plugin supports milvus.io as a vector store (an open-source, self-hosted alternative to the managed pinecone.io) while llama_index doesn't. I don't think it's worth messing around with that though tbh, and I think pinecone has one of those one-click installers on the AWS marketplace. If you're using Supabase, they support the pgvector extension for PostgreSQL, so you can just store your embeddings there, but from what I hear, it's not as good.
Of course, if you're subject to EU data regulations, you're going to use llama_index rather than send your internal docs off to the US. I say internal docs, because it seems everyone and their mother is trying to enter the organisational knowledge retrieval/assistant SaaS space with this. Some are even raising huge rounds, with no defensibility (not even a first-mover advantage). It's legitimately blowing my mind, and hopefully we don't see a huge pendulum swing in AI as we did with crypto. We probably will tbh.
The only defensibility that may make sense is if you have a data advantage. Data is the gold right now. A friend's company has financial data that's very difficult to get hold of, and they're using llama_index with it, which is the perfect use case. Another potential example: the UK government's business support hotline service is also sitting on a treasure trove of chat data right now. Wouldn't it be cool to have an actually really good AI business advisor at your beck and call? Turn that into an Agent tool, and that's more juice to just let it run the business for you outright. Accelerando autonomous corporation vibes, but I digress!
Personally, I would quite like an Obsidian plugin to help me draw connections between notes in my personal knowledge base, help me organise things, and generally allow me to have a conversation with my "memory". It's a matter of time!