IT · May 23, 2026

Shared Memory for Agent Fleets: hermes-memory-pgvector

A Postgres + pgvector plugin that gives a fleet of hermes-agent minions a durable, semantically-searchable shared memory — with no LLM in the hot path. Published on PyPI.

By Andrea Borghi

When you run more than one AI agent — a marketing minion, a trading minion, an incident-response minion — they each need memory. The built-in memory tool gives each agent its own. That is fine until you want them to share, or until you want to recall what the agent learned six weeks ago without paying for an LLM round-trip just to look it up.

hermes-memory-pgvector is a small Postgres + pgvector plugin that mirrors agent memory writes into a shared, embedded, queryable store. Published on PyPI at v0.3.0.

What it actually does

In one line: "a storage layer that gives the built-in memory model durable, multi-tenant, semantically-searchable backing, with no LLM in the hot path."

Two tables, both with HNSW vector indexes. memory_entries mirrors the writes made to the agent's MEMORY.md and USER.md files. conversations stores substantive chat turns (≥40 chars, boilerplate filtered out). Embeddings are 768-dimensional and computed by an external endpoint of your choice, such as Ollama or any OpenAI-compatible API. The agent never blocks on embedding: writes return in microseconds, and a background worker embeds the queued entries afterwards.
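To make the non-blocking write path concrete, here is a minimal sketch of the pattern for conversation turns, not the plugin's actual code. The class name ConversationWriter, the record_turn and flush methods, the embed and store callables, and the boilerplate prefixes are all hypothetical; only the 40-character threshold and the background-queue idea come from the description above.

```python
import queue
import threading

MIN_CHARS = 40  # threshold stated in the post
# Hypothetical boilerplate markers; the plugin's real filter is not documented here.
BOILERPLATE_PREFIXES = ("as an ai", "sure, here is")


def is_substantive(text: str) -> bool:
    """Keep a chat turn only if it is long enough and not boilerplate."""
    cleaned = text.strip()
    if len(cleaned) < MIN_CHARS:
        return False
    return not cleaned.lower().startswith(BOILERPLATE_PREFIXES)


class ConversationWriter:
    """Accepts writes immediately and embeds them on a background thread."""

    def __init__(self, embed, store):
        self._embed = embed  # assumed: str -> list[float], an HTTP call in practice
        self._store = store  # assumed: (text, vector) -> None, an INSERT in practice
        self._pending = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def record_turn(self, text: str) -> bool:
        """Enqueue the turn and return at once; no embedding call happens here."""
        if not is_substantive(text):
            return False
        self._pending.put(text)
        return True

    def flush(self) -> None:
        """Block until every queued turn has been processed (tests, shutdown)."""
        self._pending.join()

    def _worker(self) -> None:
        while True:
            text = self._pending.get()
            try:
                self._store(text, self._embed(text))
            except Exception:
                pass  # a real worker would log and retry instead of dropping
            finally:
                self._pending.task_done()
```

The caller only ever pays for a queue insert. The slow part, the round-trip to the embedding endpoint, happens on the worker thread, which is why the agent never waits on it.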

Per-agent themes by default

Every request carries an X-Hermes-Session-Key header that scopes data by agent_identity. Marketing's notes do not pollute trading's recall. When you actually want cross-theme search, pass scope='all' explicitly. The default is the safe one.
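As an illustration of scoped recall, the sketch below builds a nearest-neighbour query that filters by agent_identity unless scope='all' is passed. It is an assumption-laden example rather than the plugin's query: the function name build_recall_query, the content and embedding column names, the 'agent' default scope value, and the psycopg-style named placeholders are hypothetical. The <=> operator is pgvector's cosine-distance operator. The sketch also assumes agent_identity has already been resolved from the session header.

```python
def build_recall_query(agent_identity: str, query_vec: list[float],
                       scope: str = "agent", limit: int = 5):
    """Return (sql, params) for a similarity search over memory_entries."""
    if scope not in ("agent", "all"):
        raise ValueError(f"unknown scope: {scope!r}")

    # pgvector accepts a vector as a text literal such as '[0.1,0.2]'.
    vec_literal = "[" + ",".join(str(x) for x in query_vec) + "]"

    # Safe default: restrict to the calling agent unless scope='all' is explicit.
    where = "" if scope == "all" else "WHERE agent_identity = %(agent)s "

    sql = (
        "SELECT content, embedding <=> %(vec)s::vector AS distance "
        "FROM memory_entries "
        f"{where}"
        "ORDER BY embedding <=> %(vec)s::vector "
        "LIMIT %(limit)s"
    )
    params = {"vec": vec_literal, "limit": limit}
    if scope != "all":
        params["agent"] = agent_identity
    return sql, params
```

Called as build_recall_query("marketing", vec), the query only sees marketing's rows. Adding scope="all" drops the filter, so a cross-theme search has to be requested explicitly.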

Why standalone, not a fork

hermes-agent has closed its list of built-in memory providers as a matter of policy, so this ships as a standalone plugin picked up by the /plugins directory scan rather than as an upstream fork. Drop it next to the agent, set a few config keys, restart. Rollback is symmetric: disable the provider, optionally drop the tables. No long-lived state to migrate, no kernel patches to maintain.

What it is not

Not a Honcho replacement. Not a knowledge graph. Not a RAG framework. It is intentionally a thin layer that does one thing — turn the built-in memory tool into a shared, durable, vector-searchable store — and gets out of the way. No LLM deriver, no dialectic loop, no opinion about how you should chunk or rerank. Just vector math.

Get it

pip install hermes-memory-pgvector

Source, migration, and config on GitHub:

👉 github.com/andreab67/hermes-memory-pgvector