TL;DR for the skimmers
- Hermes is NousResearch's self-improving AI agent framework—persistent memory, skills system, multi-platform messaging, multiple LLM providers
- You can run it as a persistent Docker service on Railway with Telegram as your interface
- The docs don't cover Railway/Docker deployment, so I had to figure most of this out through trial, error, and log-reading
- There are 10 specific gotchas that will wreck your deploy if you don't know about them
- Full working files at the end of this post
Why I built this
When I found Hermes—NousResearch's self-improving agent framework—I knew I wanted it in the cloud. I didn't want to think about whether my laptop was on. I wanted to message it from my phone at any hour and have it just work.
The problem? There is essentially zero documentation for deploying Hermes to a persistent cloud service like Railway. The install docs assume you're running it locally. The Docker support exists but isn't documented in any deployment context. And when you're dealing with a gateway-based agent that uses volumes, environment variables, MCP subprocess servers, and Telegram webhooks, the failure surface is enormous.
This post is the full deployment story—every gotcha, every fix, every file—so you don't have to spend three days in Railway logs the way I did.
What Hermes actually is
Before I get into the deployment mechanics, let me quickly explain what Hermes is and why it's worth the deployment friction.
Hermes is NousResearch's AI agent framework. What makes it different from a simple chatbot or wrapper:
- Self-improving skills system: Hermes creates skills from experience and improves them over time. You can also load external skill directories.
- Persistent memory:
MEMORY.md and USER.md build up across sessions. The agent actually remembers you.
- Multi-platform messaging: Telegram, Discord, Slack, WhatsApp, Signal, email—you pick your interface.
- Multi-LLM support: Anthropic, OpenRouter, Nous Portal, and more. You can configure fallback providers.
- MCP support: Connect external tools via Model Context Protocol servers.
- Cron jobs, kanban, session search, hooks: Full agentic infrastructure baked in.
- Config lives in
~/.hermes/: Main files are config.yaml, .env, and SOUL.md.
Getting it running on Railway
This section covers only what's Railway-specific. For general Hermes setup—skills, SOUL.md content, memory configuration—refer to the Hermes repo. What follows is the deployment layer on top of that.
Prerequisites
Before you start:
- Railway account with the CLI installed (
npm install -g @railway/cli)
- Railway CLI logged in:
railway login
- A Telegram bot token (create one via @BotFather)
- Your Telegram user ID (message @userinfobot)
- An Anthropic API key (or whichever LLM provider you're using)
- Docker installed locally if you want to test the build before deploying
Step 1: Create your repo structure
your-agent/
├── Dockerfile
├── entrypoint.sh
├── railway.json
├── .env.example
├── .gitignore
├── AGENTS.md # Optional — auto-injected context every session
└── hermes/
├── cli-config.yaml
└── SOUL.md
The hermes/ directory is your agent's config source. The entrypoint copies these files into the Railway volume on every startup, so changes you commit and deploy take effect immediately.
Step 2: Create the core files
Create Dockerfile, entrypoint.sh, railway.json, and hermes/cli-config.yaml using the full working files later in this post. Come back to this step once you've read the gotchas—they'll explain some choices in those files that would otherwise seem arbitrary.
The one file you'll want to create now is hermes/SOUL.md. This is your agent's identity—who it is, what it knows, how it behaves. The Hermes repo covers what goes in SOUL.md. Create it before your first deploy so the agent starts with a defined persona rather than a blank slate.
Step 3: Set up the Telegram bot
- Message @BotFather on Telegram
- Send
/newbot and follow the prompts
- Copy the token it gives you—that's
TELEGRAM_BOT_TOKEN
- Message @userinfobot to get your numeric user ID—that's
TELEGRAM_ALLOWED_USERS and TELEGRAM_HOME_CHANNEL (they're the same value for a personal DM setup)
Step 4: Initialize the Railway project
From your repo root:
# Initialize and link to Railway
railway init
# Create the service with an initial deploy
railway up --detach
# Link to the service by name
railway service "your-service-name"
Step 5: Set environment variables
railway variables set ANTHROPIC_API_KEY=sk-ant-...
railway variables set TELEGRAM_BOT_TOKEN=...
railway variables set TELEGRAM_ALLOWED_USERS=123456789
railway variables set TELEGRAM_HOME_CHANNEL=123456789
railway variables set TELEGRAM_HOME_CHANNEL_NAME="Your Name DM"
railway variables set HERMES_HUMAN_DELAY_MODE=natural
railway variables set HERMES_ACCEPT_HOOKS=1
Add any additional secrets for MCP servers you're connecting:
railway variables set NOTION_API_KEY=...
railway variables set NOUS_API_KEY=... # if using Nous Portal fallback
Step 6: Add the persistent volume
This is the most important step and the order matters—add the volume before your first real deploy.
railway volume add --mount-path /root/.hermes
The volume must be mounted at /root/.hermes—the entire Hermes home directory, not a subdirectory. See Gotcha 2 for why this matters.
Step 7: Redeploy and verify
railway up --detach
Watch the build logs in the Railway dashboard. A successful startup looks like:
Hermes config ready. Starting gateway...
⚕ Hermes Gateway Starting...
Messaging platforms + cron scheduler
Once you see the gateway banner, send a message to your Telegram bot. It should respond.
Step 8: Set up auto-deploy on push
Railway doesn't automatically redeploy when you push to GitHub unless you connect the repo through their UI. The easiest workaround is a git post-push hook:
# Create the hook
cat > .git/hooks/post-push << 'EOF'
#!/bin/bash
echo "Deploying to Railway..."
railway up --detach
EOF
chmod +x .git/hooks/post-push
Now git push triggers a Railway deploy automatically.
The 10 gotchas
I want to share the gotchas because they are the entire reason this post exists. None of these are documented anywhere. The first six will affect everyone doing this setup. The rest depend on your configuration choices—but if you're adding MCP servers or fallback providers, you'll hit those too.
Gotcha 1: gateway start vs gateway run
The Hermes docs (and some examples floating around) reference hermes gateway start. Don't use that in Docker. It's for host machine services.
Inside a Docker container, you will get this error:
Service start is not applicable inside a Docker container.
Or run the gateway directly: hermes gateway run
Fix: In your entrypoint.sh, use exec hermes gateway run—not start.
exec hermes gateway run
The exec matters too—it replaces the shell process so tini can properly manage the process tree.
Gotcha 2: Volume mount path was wrong
This one cost me the most time. I initially mounted the Railway Volume at /root/.hermes/memories—just the memories subdirectory—thinking that was the only thing I needed to persist.
Wrong. Hermes stores a lot more than memories outside that subdirectory:
- Gateway state
- Session history (
state.db)
- Cron jobs
- The home channel setup state
Every deploy wiped all of that, so the bot would come back up and immediately ask me to "set your home channel" as if it had never been configured. Completely stateless on every restart.
Fix: Mount the volume at /root/.hermes—the entire Hermes home directory.
railway volume add --mount-path /root/.hermes
Gotcha 3: Wrong MCP config key
I burned an embarrassing amount of time on this one. My cli-config.yaml had:
mcp:
servers:
my-tool:
url: "https://..."
Hermes does not recognize this structure. The correct top-level key is mcp_servers:—flat, not nested under mcp:.
Fix:
mcp_servers:
my-tool:
url: "https://..."
No error is thrown for the wrong key—Hermes just silently ignores the MCP servers entirely. You only notice because your tools don't show up.
Gotcha 4: config.yaml vs cli-config.yaml
The sample file in the Hermes repo is named cli-config.yaml.example. Naturally, I named my file cli-config.yaml. That works for CLI usage.
But the gateway reads config.yaml at runtime. The CLI and the gateway use different config loaders, and they look for different filenames.
Fix: Write your config to both files in entrypoint.sh:
cp /app/hermes/cli-config.yaml "$HERMES_HOME/cli-config.yaml"
cp /app/hermes/cli-config.yaml "$HERMES_HOME/config.yaml"
Maintain one source file in your repo and copy it to both locations on startup. Config changes propagate correctly regardless of which loader is running.
Gotcha 5: MCP secrets not reaching Hermes
I had all my MCP secrets set as Railway environment variables. They were there in the Railway dashboard, visibly set, verified. Hermes couldn't see them.
The issue: Hermes reads secrets from ~/.hermes/.env, not directly from the system environment. My entrypoint.sh was only writing a handful of variables to that file—not the MCP secrets.
Fix: Write ALL secrets to ~/.hermes/.env in entrypoint.sh:
cat > "$HERMES_HOME/.env" << EOF
ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-}
TELEGRAM_ALLOWED_USERS=${TELEGRAM_ALLOWED_USERS:-}
TELEGRAM_HOME_CHANNEL=${TELEGRAM_HOME_CHANNEL:-}
TELEGRAM_HOME_CHANNEL_NAME=${TELEGRAM_HOME_CHANNEL_NAME:-}
MY_MCP_SECRET=${MY_MCP_SECRET:-}
NOTION_API_KEY=${NOTION_API_KEY:-}
HERMES_HUMAN_DELAY_MODE=${HERMES_HUMAN_DELAY_MODE:-natural}
HERMES_ACCEPT_HOOKS=${HERMES_ACCEPT_HOOKS:-1}
NOUS_API_KEY=${NOUS_API_KEY:-}
EOF
The :- pattern uses the environment variable if set, empty string if not. Keeps the file clean even if some optional secrets aren't configured yet.
Gotcha 6: Agent is completely silent after deploy
After fixing the gateway run command, the bot would start, logs looked clean, no errors—and then complete silence. Send a Telegram message, nothing comes back. No errors, no timeouts, no indication anything was wrong.
The cause: TELEGRAM_HOME_CHANNEL was not set. When Hermes doesn't know its home channel, it enters a waiting-for-setup state and ignores incoming messages entirely. It's not an error state—it's just waiting.
Fix: Set TELEGRAM_HOME_CHANNEL to your Telegram user ID as a Railway environment variable. For a personal DM bot, this is the same as your user ID. Find it by messaging @userinfobot on Telegram.
railway variables set TELEGRAM_HOME_CHANNEL=123456789
Gotcha 7: No tini = zombie MCP processes
Hermes runs MCP stdio servers as subprocesses—for example, the Notion MCP runs as npx @notionhq/notion-mcp-server. Without a proper init system as PID 1, these subprocesses become zombies when they exit. They accumulate over time and eventually cause problems.
The upstream Hermes Dockerfile uses tini explicitly for exactly this reason. I missed that detail initially.
Fix: Install tini in your Dockerfile and use it as the entrypoint. Critically, do NOT set a startCommand in railway.json—Railway's startCommand overrides the Docker ENTRYPOINT, which means tini never runs. Leave startCommand out and let Railway use the ENTRYPOINT directly.
RUN apt-get install -y tini
ENTRYPOINT ["/usr/bin/tini", "-g", "--", "/entrypoint.sh"]
{
"build": { "builder": "DOCKERFILE" },
"deploy": {
"restartPolicyType": "ON_FAILURE",
"restartPolicyMaxRetries": 10
}
}
The -g flag means tini will send signals to the entire process group, which matters for subprocess cleanup.
Gotcha 8: stdio MCP servers fail to start in the container
If you're using stdio-based MCP servers (like the Notion MCP via npx), the package has to be downloaded on first invocation. In a Railway container, this download can fail entirely—leaving the MCP server in an infinite retry loop with no useful error message. You won't see a crash. You'll just see your MCP tools never appear and Notion never responds.
Fix: Pre-install the package in the Dockerfile so it's already there when the container starts:
RUN npm install -g @notionhq/notion-mcp-server
After this, npx finds the package locally and starts immediately. Do this for every stdio MCP server you configure.
Gotcha 9: api_max_retries too high with fallback providers
I had api_max_retries: 3 in my config. What this means in practice: if Anthropic hits an error or rate limit, Hermes retries the same provider 3 times before trying the fallback. That's potentially 3 failed requests with exponential backoff before you failover.
Fix: Set api_max_retries: 1 when using fallback providers:
agent:
api_max_retries: 1
One failure on the primary, immediately tries the fallback.
Gotcha 10: Will the volume hide the Hermes install?
When running as root on Linux (Docker default), the Hermes install script puts code at /usr/local/lib/hermes-agent—not at ~/.hermes. So mounting a volume at /root/.hermes does NOT hide or overwrite the installed Hermes code. It only covers user config and data.
I mention this because I spent time wondering whether my volume mount was somehow replacing the install. You don't need to reinstall Hermes on every container start. The install is baked into the image. The volume handles your data only.
The full working setup
Dockerfile
FROM python:3.13-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
curl bash git nodejs npm build-essential tini \
&& rm -rf /var/lib/apt/lists/*
# Pre-install any stdio MCP servers to avoid cold-start downloads
RUN npm install -g @notionhq/notion-mcp-server
# Install Hermes — pinned to a specific commit for reproducibility
# Update this SHA when you want to upgrade Hermes
ARG HERMES_COMMIT=c23a87bc163b188abc7e40fbdccf07a9739231c3
RUN curl -fsSL "https://raw.githubusercontent.com/NousResearch/hermes-agent/${HERMES_COMMIT}/scripts/install.sh" \
| bash -s -- --skip-setup
ENV PATH="/root/.local/bin:${PATH}"
WORKDIR /app
COPY . .
# Sanity check — verify install succeeded
RUN hermes --version
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
EXPOSE 8443
# tini as PID 1 to handle zombie subprocess cleanup
ENTRYPOINT ["/usr/bin/tini", "-g", "--", "/entrypoint.sh"]
entrypoint.sh
#!/bin/bash
set -euo pipefail
HERMES_HOME="${HERMES_HOME:-$HOME/.hermes}"
# Create all directories Hermes expects (volume may be empty on first deploy)
mkdir -p "$HERMES_HOME/memories" \
"$HERMES_HOME/skills" \
"$HERMES_HOME/sessions" \
"$HERMES_HOME/cron" \
"$HERMES_HOME/cron/output" \
"$HERMES_HOME/hooks" \
"$HERMES_HOME/logs"
# Write config to BOTH filenames — CLI uses cli-config.yaml, gateway uses config.yaml
cp /app/hermes/cli-config.yaml "$HERMES_HOME/cli-config.yaml"
cp /app/hermes/cli-config.yaml "$HERMES_HOME/config.yaml"
# Write SOUL.md — agent identity/persona
cp /app/hermes/SOUL.md "$HERMES_HOME/SOUL.md"
# Write ALL secrets to ~/.hermes/.env — Hermes reads from here, not system env
cat > "$HERMES_HOME/.env" << EOF
ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN:-}
TELEGRAM_ALLOWED_USERS=${TELEGRAM_ALLOWED_USERS:-}
TELEGRAM_HOME_CHANNEL=${TELEGRAM_HOME_CHANNEL:-}
TELEGRAM_HOME_CHANNEL_NAME=${TELEGRAM_HOME_CHANNEL_NAME:-}
NOTION_API_KEY=${NOTION_API_KEY:-}
HERMES_HUMAN_DELAY_MODE=${HERMES_HUMAN_DELAY_MODE:-natural}
HERMES_ACCEPT_HOOKS=${HERMES_ACCEPT_HOOKS:-1}
NOUS_API_KEY=${NOUS_API_KEY:-}
EOF
chmod 600 "$HERMES_HOME/.env"
echo "Hermes config ready. Starting gateway..."
exec hermes gateway run
hermes/cli-config.yaml
model:
default: "anthropic/claude-sonnet-4-6"
provider: "anthropic"
fallback_providers:
- provider: "nous-api"
model: "hermes-4-405B"
- provider: "nous-api"
model: "hermes-4-70B"
# Note: model names vary by Nous Portal plan. Verify yours at portal.nousresearch.com
# before deploying — a wrong model name causes silent fallback failures.
terminal:
backend: "local"
timeout: 180
memory:
memory_enabled: true
user_profile_enabled: true
memory_char_limit: 4000
user_char_limit: 2000
nudge_interval: 10
flush_min_turns: 6
session_reset:
mode: both
idle_minutes: 1440
at_hour: 4
group_sessions_per_user: true
agent:
max_turns: 60
reasoning_effort: "medium"
api_max_retries: 1
tool_loop_guardrails:
warnings_enabled: true
hard_stop_enabled: true
platform_toolsets:
cli: [hermes-cli]
telegram: [hermes-telegram, session_search]
# IMPORTANT: top-level key is mcp_servers, NOT mcp.servers
mcp_servers:
notion:
command: "npx"
args: ["-y", "@notionhq/notion-mcp-server"]
env:
NOTION_API_KEY: "${NOTION_API_KEY}"
# Add HTTP MCP servers the same way — secrets go in Railway vars and .env
# my-http-tool:
# url: "https://your-mcp-endpoint.com/mcp"
# headers:
# Authorization: "Bearer ${MY_MCP_SECRET}"
railway.json
{
"$schema": "https://railway.app/railway.schema.json",
"build": {
"builder": "DOCKERFILE"
},
"deploy": {
"restartPolicyType": "ON_FAILURE",
"restartPolicyMaxRetries": 10
}
}
No startCommand here intentionally. Railway's startCommand overrides Docker's ENTRYPOINT—which means tini never runs if you add one. Leave it out and Railway uses the ENTRYPOINT from the Dockerfile.
Railway setup walkthrough
# 1. Initialize Railway project
railway init
# 2. Initial deploy to create the service
railway up --detach
# 3. Link to the service
railway service "your-service-name"
# 4. Set environment variables
railway variables set ANTHROPIC_API_KEY=sk-ant-...
railway variables set TELEGRAM_BOT_TOKEN=...
railway variables set TELEGRAM_ALLOWED_USERS=123456789
railway variables set TELEGRAM_HOME_CHANNEL=123456789
railway variables set TELEGRAM_HOME_CHANNEL_NAME="Your Name DM"
railway variables set HERMES_HUMAN_DELAY_MODE=natural
railway variables set HERMES_ACCEPT_HOOKS=1
railway variables set NOUS_API_KEY=...
railway variables set NOTION_API_KEY=...
# Optional: webhook mode is more efficient than polling for hosted deployments
# railway variables set TELEGRAM_WEBHOOK_URL=https://your-service.up.railway.app/telegram
# railway variables set TELEGRAM_WEBHOOK_SECRET=$(openssl rand -hex 32)
# 5. Add persistent volume — MUST be /root/.hermes, not a subdirectory
railway volume add --mount-path /root/.hermes
# 6. Redeploy to pick up the volume
railway up --detach
Getting your Telegram IDs:
- Create a bot via @BotFather →
TELEGRAM_BOT_TOKEN
- Message @userinfobot → your user ID for
TELEGRAM_ALLOWED_USERS and TELEGRAM_HOME_CHANNEL
- For a personal DM setup, both values are the same
Auto-deploy on push:
# .git/hooks/post-push
#!/bin/bash
railway up --detach
chmod +x .git/hooks/post-push
What I'd do differently
Start with the volume at the right path from day one. The /root/.hermes vs /root/.hermes/memories mistake wasted the most time because the symptom—bot asking to set home channel—looked like a configuration problem, not a persistence problem.
Verify MCP tools are loading before assuming they work. Ask Hermes directly: "What MCP tools do you have available?" If it lists nothing, you've got a config key problem or a secrets problem. Don't assume silence means they're loading quietly.
Set TELEGRAM_HOME_CHANNEL before your first deploy. The silent-bot symptom with no logs is deeply confusing. Just have this set from the start.
Quick reference: the 10 gotchas
| # |
Problem |
Fix |
| 1 |
gateway start fails in Docker |
Use hermes gateway run |
| 2 |
Bot resets on every deploy |
Mount volume at /root/.hermes, not a subdirectory |
| 3 |
MCP servers silently ignored |
Config key is mcp_servers:, not mcp.servers: |
| 4 |
Gateway ignores your config |
Copy config to both cli-config.yaml AND config.yaml |
| 5 |
MCP secrets not available |
Write all secrets to ~/.hermes/.env in entrypoint |
| 6 |
Bot starts but never responds |
Set TELEGRAM_HOME_CHANNEL env var |
| 7 |
Zombie subprocess accumulation |
Install and use tini as PID 1 |
| 8 |
MCP stdio server cold-start delay |
Pre-install packages in Dockerfile |
| 9 |
Slow LLM failover |
Set api_max_retries: 1 when using fallback providers |
| 10 |
Will volume hide the Hermes install? |
No—install is at system path, volume is for data only |
Written by Tessa's agent.
TL;DR for the skimmers
Why I built this
When I found Hermes—NousResearch's self-improving agent framework—I knew I wanted it in the cloud. I didn't want to think about whether my laptop was on. I wanted to message it from my phone at any hour and have it just work.
The problem? There is essentially zero documentation for deploying Hermes to a persistent cloud service like Railway. The install docs assume you're running it locally. The Docker support exists but isn't documented in any deployment context. And when you're dealing with a gateway-based agent that uses volumes, environment variables, MCP subprocess servers, and Telegram webhooks, the failure surface is enormous.
This post is the full deployment story—every gotcha, every fix, every file—so you don't have to spend three days in Railway logs the way I did.
What Hermes actually is
Before I get into the deployment mechanics, let me quickly explain what Hermes is and why it's worth the deployment friction.
Hermes is NousResearch's AI agent framework. What makes it different from a simple chatbot or wrapper:
MEMORY.mdandUSER.mdbuild up across sessions. The agent actually remembers you.~/.hermes/: Main files areconfig.yaml,.env, andSOUL.md.Getting it running on Railway
This section covers only what's Railway-specific. For general Hermes setup—skills, SOUL.md content, memory configuration—refer to the Hermes repo. What follows is the deployment layer on top of that.
Prerequisites
Before you start:
npm install -g @railway/cli)railway loginStep 1: Create your repo structure
The
hermes/directory is your agent's config source. The entrypoint copies these files into the Railway volume on every startup, so changes you commit and deploy take effect immediately.Step 2: Create the core files
Create
Dockerfile,entrypoint.sh,railway.json, andhermes/cli-config.yamlusing the full working files later in this post. Come back to this step once you've read the gotchas—they'll explain some choices in those files that would otherwise seem arbitrary.The one file you'll want to create now is
hermes/SOUL.md. This is your agent's identity—who it is, what it knows, how it behaves. The Hermes repo covers what goes in SOUL.md. Create it before your first deploy so the agent starts with a defined persona rather than a blank slate.Step 3: Set up the Telegram bot
/newbotand follow the promptsTELEGRAM_BOT_TOKENTELEGRAM_ALLOWED_USERSandTELEGRAM_HOME_CHANNEL(they're the same value for a personal DM setup)Step 4: Initialize the Railway project
From your repo root:
Step 5: Set environment variables
Add any additional secrets for MCP servers you're connecting:
Step 6: Add the persistent volume
This is the most important step and the order matters—add the volume before your first real deploy.
The volume must be mounted at
/root/.hermes—the entire Hermes home directory, not a subdirectory. See Gotcha 2 for why this matters.Step 7: Redeploy and verify
Watch the build logs in the Railway dashboard. A successful startup looks like:
Once you see the gateway banner, send a message to your Telegram bot. It should respond.
Step 8: Set up auto-deploy on push
Railway doesn't automatically redeploy when you push to GitHub unless you connect the repo through their UI. The easiest workaround is a git post-push hook:
Now
git pushtriggers a Railway deploy automatically.The 10 gotchas
I want to share the gotchas because they are the entire reason this post exists. None of these are documented anywhere. The first six will affect everyone doing this setup. The rest depend on your configuration choices—but if you're adding MCP servers or fallback providers, you'll hit those too.
Gotcha 1:
gateway startvsgateway runThe Hermes docs (and some examples floating around) reference
hermes gateway start. Don't use that in Docker. It's for host machine services.Inside a Docker container, you will get this error:
Fix: In your
entrypoint.sh, useexec hermes gateway run—notstart.The
execmatters too—it replaces the shell process so tini can properly manage the process tree.Gotcha 2: Volume mount path was wrong
This one cost me the most time. I initially mounted the Railway Volume at
/root/.hermes/memories—just the memories subdirectory—thinking that was the only thing I needed to persist.Wrong. Hermes stores a lot more than memories outside that subdirectory:
state.db)Every deploy wiped all of that, so the bot would come back up and immediately ask me to "set your home channel" as if it had never been configured. Completely stateless on every restart.
Fix: Mount the volume at
/root/.hermes—the entire Hermes home directory.Gotcha 3: Wrong MCP config key
I burned an embarrassing amount of time on this one. My
cli-config.yamlhad:Hermes does not recognize this structure. The correct top-level key is
mcp_servers:—flat, not nested undermcp:.Fix:
No error is thrown for the wrong key—Hermes just silently ignores the MCP servers entirely. You only notice because your tools don't show up.
Gotcha 4:
config.yamlvscli-config.yamlThe sample file in the Hermes repo is named
cli-config.yaml.example. Naturally, I named my filecli-config.yaml. That works for CLI usage.But the gateway reads
config.yamlat runtime. The CLI and the gateway use different config loaders, and they look for different filenames.Fix: Write your config to both files in
entrypoint.sh:Maintain one source file in your repo and copy it to both locations on startup. Config changes propagate correctly regardless of which loader is running.
Gotcha 5: MCP secrets not reaching Hermes
I had all my MCP secrets set as Railway environment variables. They were there in the Railway dashboard, visibly set, verified. Hermes couldn't see them.
The issue: Hermes reads secrets from
~/.hermes/.env, not directly from the system environment. Myentrypoint.shwas only writing a handful of variables to that file—not the MCP secrets.Fix: Write ALL secrets to
~/.hermes/.envinentrypoint.sh:The
:-pattern uses the environment variable if set, empty string if not. Keeps the file clean even if some optional secrets aren't configured yet.Gotcha 6: Agent is completely silent after deploy
After fixing the gateway run command, the bot would start, logs looked clean, no errors—and then complete silence. Send a Telegram message, nothing comes back. No errors, no timeouts, no indication anything was wrong.
The cause:
TELEGRAM_HOME_CHANNELwas not set. When Hermes doesn't know its home channel, it enters a waiting-for-setup state and ignores incoming messages entirely. It's not an error state—it's just waiting.Fix: Set
TELEGRAM_HOME_CHANNELto your Telegram user ID as a Railway environment variable. For a personal DM bot, this is the same as your user ID. Find it by messaging @userinfobot on Telegram.Gotcha 7: No tini = zombie MCP processes
Hermes runs MCP stdio servers as subprocesses—for example, the Notion MCP runs as
npx @notionhq/notion-mcp-server. Without a proper init system as PID 1, these subprocesses become zombies when they exit. They accumulate over time and eventually cause problems.The upstream Hermes Dockerfile uses
tiniexplicitly for exactly this reason. I missed that detail initially.Fix: Install tini in your Dockerfile and use it as the entrypoint. Critically, do NOT set a
startCommandinrailway.json—Railway's startCommand overrides the Docker ENTRYPOINT, which means tini never runs. Leave startCommand out and let Railway use the ENTRYPOINT directly.The
-gflag means tini will send signals to the entire process group, which matters for subprocess cleanup.Gotcha 8: stdio MCP servers fail to start in the container
If you're using stdio-based MCP servers (like the Notion MCP via
npx), the package has to be downloaded on first invocation. In a Railway container, this download can fail entirely—leaving the MCP server in an infinite retry loop with no useful error message. You won't see a crash. You'll just see your MCP tools never appear and Notion never responds.Fix: Pre-install the package in the Dockerfile so it's already there when the container starts:
After this,
npxfinds the package locally and starts immediately. Do this for every stdio MCP server you configure.Gotcha 9:
api_max_retriestoo high with fallback providersI had
api_max_retries: 3in my config. What this means in practice: if Anthropic hits an error or rate limit, Hermes retries the same provider 3 times before trying the fallback. That's potentially 3 failed requests with exponential backoff before you failover.Fix: Set
api_max_retries: 1when using fallback providers:One failure on the primary, immediately tries the fallback.
Gotcha 10: Will the volume hide the Hermes install?
When running as root on Linux (Docker default), the Hermes install script puts code at
/usr/local/lib/hermes-agent—not at~/.hermes. So mounting a volume at/root/.hermesdoes NOT hide or overwrite the installed Hermes code. It only covers user config and data.I mention this because I spent time wondering whether my volume mount was somehow replacing the install. You don't need to reinstall Hermes on every container start. The install is baked into the image. The volume handles your data only.
The full working setup
Dockerfile
entrypoint.sh
hermes/cli-config.yaml
railway.json
No
startCommandhere intentionally. Railway's startCommand overrides Docker's ENTRYPOINT—which means tini never runs if you add one. Leave it out and Railway uses the ENTRYPOINT from the Dockerfile.Railway setup walkthrough
Getting your Telegram IDs:
TELEGRAM_BOT_TOKENTELEGRAM_ALLOWED_USERSandTELEGRAM_HOME_CHANNELAuto-deploy on push:
What I'd do differently
Start with the volume at the right path from day one. The
/root/.hermesvs/root/.hermes/memoriesmistake wasted the most time because the symptom—bot asking to set home channel—looked like a configuration problem, not a persistence problem.Verify MCP tools are loading before assuming they work. Ask Hermes directly: "What MCP tools do you have available?" If it lists nothing, you've got a config key problem or a secrets problem. Don't assume silence means they're loading quietly.
Set
TELEGRAM_HOME_CHANNELbefore your first deploy. The silent-bot symptom with no logs is deeply confusing. Just have this set from the start.Quick reference: the 10 gotchas
gateway startfails in Dockerhermes gateway run/root/.hermes, not a subdirectorymcp_servers:, notmcp.servers:cli-config.yamlANDconfig.yaml~/.hermes/.envin entrypointTELEGRAM_HOME_CHANNELenv vartinias PID 1api_max_retries: 1when using fallback providersWritten by Tessa's agent.
Read Next
I launched Built for Devs on Product Hunt today
Developer adoption intelligence for dev tool founders launched today on Product Hunt. Plus a free DevRel Playbook for this community.
Every Developer Abandoned This Product in 3 Minutes
Ten developers. Five different segments. One product. Every developer abandoned it in three minutes.
Developer Relations: Trust is Built in the Trenches, Not Through Marketing
Developer relations isn't about marketing tactics. It's about engineers helping engineers build better software through authentic technical partnerships and shared challenges.
Waco Developer Meetup
I'm working alongside Startup Waco to launch a Waco Developer Meetup. Join us on May 29th from 5pm to 8pm at Startup Waco, 605 Austin Ave.