Author: savva

SAVVA posts are now discoverable on the open web

When you publish a post on SAVVA, you want people to find it. Not just
people who already know about SAVVA — people who type your topic into
Google, who share your link in a Telegram chat, who paste it into
Discord or X or WhatsApp. Until now, SAVVA's web client rendered a
rich app for human readers while crawlers and link previewers got
almost nothing useful. That has been completely rebuilt.

What this means for you, concretely

Before:

  - Crawlers and link unfurlers saw the bare SPA shell: no title, no
    description, no preview image.
  - A post link shared in Telegram, Discord, X, or WhatsApp showed a
    generic placeholder instead of your content.

After:

  - Recognised bots get server-rendered HTML with all the metadata
    search engines and link previewers need.
  - Shared links unfurl with the right title, author, and image, and
    your posts can appear in Google results.

A massive rebuild, not a tweak

SEO sounds like a checkbox. In practice, doing it right for a
decentralized platform means rethinking how the web client and the
backend talk to the world. SAVVA stores its content on IPFS, references
it via on-chain SavvaCIDs, and renders it client-side. None of that
maps cleanly to what a search engine expects to see at a URL.

We rebuilt the path that crawlers take through the platform from the
ground up. Bot traffic now reaches a server-rendered version of your
content with all the metadata search engines and link previewers need.
Human traffic still gets the fast SolidJS app — but now that app also
keeps its own meta tags up to date as you navigate, so when you share
a deep link from your phone's share sheet the recipient sees the right
preview even before SAVVA finishes loading.
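
You can see the split yourself from a terminal. A minimal illustration
(the exact numbers will vary; the point is the difference):

# A recognised crawler UA gets server-rendered HTML full of meta tags;
# a browser UA gets the lightweight SPA shell.
curl -sA "Googlebot" https://savva.app/ | grep -o '<meta' | wc -l
curl -sA "Mozilla/5.0 Chrome/120" https://savva.app/ | grep -o '<meta' | wc -l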

All your domains, not just one

SAVVA is a multi-domain platform. The same protocol can host different
communities, each with their own brand, audience, and content corpus.
The new surface respects that: every domain gets its own sitemap,
robots.txt, canonical URLs, AI-crawler policy (you can opt out of
GPTBot, ClaudeBot, PerplexityBot per domain), and Google Search Console
verification. Nothing is hardcoded; new domains pick up the full SEO
treatment the moment they're configured.
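
Concretely, the same well-known paths return different documents on
different domains. For example (the domain names stand in for any two
configured SAVVA domains):

# Each domain serves its own robots.txt, generated from its own
# config; a domain that opted out of AI crawlers will carry the
# corresponding Disallow stanzas here.
curl -s https://<domain-a>/robots.txt
curl -s https://<domain-b>/robots.txt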

A little of what's under the hood

Without going too deep, the platform now:

  - detects bot traffic and serves it server-rendered HTML from
    /api/render, while humans keep the SolidJS app;
  - keeps the SPA's own meta tags in sync as you navigate, so deep
    links shared from the client preview correctly;
  - generates per-domain sitemaps, robots.txt, canonical URLs, and
    AI-crawler policies;
  - supports per-domain Google Search Console verification.

It's the kind of rework that's invisible when it works and unmissable
when it doesn't.

What you can do now

  1. Search for your own content in Google a week from now. The
    platform was just registered with Search Console; it takes a crawl
    cycle or two for results to appear.
  2. Share a post URL in a Telegram chat or post it on X. The preview
    should now show the right title, author, and image.
  3. Check your profile — paste your /0x... URL into the Facebook
    Sharing Debugger or Twitter Card Validator (or use the curl check
    after this list). You should see a proper Person card.
  4. Tell us what's broken. SEO is fiddly; if a particular post
    doesn't preview the way you'd expect, file an issue with the URL.
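
If you prefer a terminal to the web validators, a quick sketch (the
path is a placeholder; any public post or profile URL works):

# Fetch a page the way an unfurler would and list its preview tags.
curl -sA "Twitterbot" "https://savva.app/<post-path>" \
  | grep -Eo '<meta[^>]*>' | grep -E 'og:|twitter:'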

What's next

We're watching Google's coverage reports and crawl logs over the next
few weeks. There's a small backlog of polish items — locale-prefixed
URLs, broader image format support, comment pagination, a developer
docs section — that will land if traffic shows the demand.

For now: your content is discoverable. Go share it.


For administrators

If you run your own SAVVA-protocol node and want the SEO surface to work
on your domain, three things have to be true:

  1. The backend (savva-backend) is on the version that ships the
    /api/render, /api/robots.txt, and /api/sitemap*.xml endpoints.
    Confirm with curl -s https://<your-domain>/api/robots.txt once
    nginx is updated; a per-domain robots.txt body means the new
    version, a 404 means the old one.
  2. Your domain has an entry in savva-backend.yml under domains:,
    and that key's name matches the set $default_domain "..." value
    in your nginx config (a quick consistency check follows this
    list).
  3. Your nginx config routes the right requests to the backend. The
    default config many of you started from doesn't — it has a 2018-era
    bot regex that misses every AI crawler, no rewrites for
    /robots.txt or /sitemap.xml, and (in some versions) a stale
    rewrite pattern that no longer matches the backend's expectations.
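
For point 2, a quick consistency check (paths are illustrative; adjust
them to where your config files actually live):

# The key nginx passes to the backend as ?domain=...
grep 'default_domain' /etc/nginx/sites-enabled/<your-domain>

# ...must exist as a key under domains: in the backend config.
grep -A3 '^domains:' /path/to/savva-backend.yml

# And the endpoints from point 1 must answer through nginx.
curl -s -o /dev/null -w '%{http_code}\n' https://<your-domain>/api/robots.txt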

What to change

Two edits and one caution for your existing server block, labeled A to
C below. You can copy the example as a starting point; the bits you
must adapt to your own deployment are flagged with <...> placeholders.
Everything else is the same across all SAVVA-protocol sites.

A. Add SEO discovery rewrites above your existing location /
block. Three small blocks that funnel /robots.txt and the sitemap
URLs into the backend:

location = /robots.txt {
    rewrite ^ /api/robots.txt?domain=$default_domain last;
}
location = /sitemap.xml {
    rewrite ^ /api/sitemap.xml?domain=$default_domain last;
}
location ~ ^/sitemap-.*\.xml$ {
    rewrite ^(/sitemap-[^?]+) /api$1?domain=$default_domain last;
}
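
With these in place, the whole sitemap chain resolves through the
backend. A quick walk (the child sitemap names depend on what your
backend generates):

# The index lists the child sitemaps...
curl -s https://<your-domain>/sitemap.xml | grep -o '<loc>[^<]*</loc>'
# ...and each /sitemap-*.xml URL it lists is caught by the third
# rewrite above and served from /api.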

B. Replace the bot detection inside location / with the modern
version. The old regex matches roughly seven crawlers from 2018; the
new one covers every major search engine, every modern AI crawler,
and every social-media unfurler we've seen in the wild:

location / {
    set $prerender 0;

    if ($http_user_agent ~*
      "googlebot|adsbot-google|mediapartners-google|googleother|bingbot|bingpreview|adidxbot|msnbot|slurp|yahoo|duckduckbot|baiduspider|baidu|yandex|yandexbot|seznambot|sogou|exabot|naverbot|yeti|applebot|petalbot|gigabot|ia_archiver|facebookexternalhit|facebot|meta-external|twitterbot|telegrambot|slackbot|discordbot|linkedinbot|whatsapp|pinterest|redditbot|skypeuripreview|flipboard|vkshare|tumblr|embedly|gptbot|chatgpt-user|oai-searchbot|claudebot|claude-web|anthropic-ai|perplexitybot|google-extended|applebot-extended|ccbot|bytespider|amazonbot|cohere-ai|mistralai-user|diffbot|youbot|firecrawl|omgilibot|developers\.google\.com") {
        set $prerender 1;
    }

    if ($args ~ "_escaped_fragment_") {
        set $prerender 1;
    }

    if ($http_user_agent ~ "Prerender") {
        set $prerender 0;
    }

    # Static files (JS bundles, images, fonts) bypass — not pages.
    if ($uri ~ \.[a-zA-Z0-9]+$) {
        set $prerender 0;
    }

    if ($prerender = "1") {
        rewrite (.*) /api/render$1?domain=$default_domain&prerender&$args last;
    }

    try_files $uri $uri/ /index.html;
}

The two important changes from older configs: rewrite (.*) /api/render$1
(NOT the older /api/render/$scheme://$host$uri form, which the backend
no longer accepts), and last (not break), so the rewritten URI
re-enters location matching and falls into your existing /api block —
inheriting its proxy_set_header setup automatically.
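
A one-line sanity check for the chain, assuming any public post path
on your domain:

# A bot UA on a deep link should come back 200 with rendered HTML,
# which proves the rewrite re-entered location matching and hit /api.
curl -sA "Googlebot" -o /dev/null -w '%{http_code}\n' \
  "https://<your-domain>/<some-post-path>"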

C. Don't touch any other location blocks: /assets/,
/domain_assets/, /temp_assets/, /public_files/, /reports/,
/default_connect.json (or wherever your client config is served),
your TLS settings, your roots — all unchanged.

Full working example

This is the actual savva.app config in production, with deployment-specific
bits flagged with <…> placeholders. Replace each placeholder with the
values for your own deployment:

Placeholder                                What to substitute
<your-domain>                              Your hostname, e.g. mysite.example
<your-domain-key>                          The key under domains: in your savva-backend.yml (must match)
/path/to/...                               Your actual filesystem paths
<your-ssl-bundle>                          Your SSL cert/key filenames
<your_backend_upstream>                    The nginx upstream name pointing at your backend (defined elsewhere in nginx.conf)
Chain IDs / RPCs in default_connect.json   The chains your domain supports

# /etc/nginx/sites-enabled/<your-domain>
#
# Reference config for a SAVVA-platform site:
#   - Single-page app served from $root with /index.html fallback
#   - Crawler / unfurler / AI-bot detection → server-rendered HTML from /api/render
#   - SEO discovery routes (/robots.txt, /sitemap*.xml) sourced from /api
#   - Backend API + WebSocket proxy with optional /pls and /monad sub-routes
#   - CORS-enabled static asset directories

# WebSocket upgrade helper (used by the /api proxy below).
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

# HTTP → HTTPS
server {
    server_name www.<your-domain> <your-domain>;
    listen      80;
    listen      [::]:80;
    return 301 https://$host$request_uri;
}

server {
    server_name www.<your-domain> <your-domain>;
    listen      443 ssl http2;
    listen      [::]:443 ssl http2;

    # Tells the backend which domain config to load (must match a key in
    # savva-backend.yml). Used by the SEO rewrites below as ?domain=…
    set $default_domain "<your-domain-key>";

    root  /path/to/your/web/static;
    index index.html;

    keepalive_timeout 86400s;

    ssl_certificate     /etc/ssl/<your-ssl-bundle>.crt;
    ssl_certificate_key /etc/ssl/<your-ssl-bundle>.key;
    ssl_protocols       TLSv1.2 TLSv1.3;
    ssl_ciphers         HIGH:!aNULL:!MD5;
    ssl_stapling        off;

    types {
        text/yaml              yaml yml;
        application/json       json;
        application/pdf        pdf;
        text/html              html htm;
        application/javascript js;
        text/css               css;
    }

    # ---- SEO discovery -------------------------------------------------
    # Crawlers expect these at the root; backend serves them under /api/.
    # Rewrites use `last` so the request re-enters location matching and
    # is handled by the /api proxy below (inheriting its proxy_set_header
    # setup). ?domain= is required because the backend resolves the
    # domain from the query string here, not from Host.
    location = /robots.txt {
        rewrite ^ /api/robots.txt?domain=$default_domain last;
    }
    location = /sitemap.xml {
        rewrite ^ /api/sitemap.xml?domain=$default_domain last;
    }
    location ~ ^/sitemap-.*\.xml$ {
        rewrite ^(/sitemap-[^?]+) /api$1?domain=$default_domain last;
    }

    # ---- SPA + bot detection -------------------------------------------
    # Humans get the SPA shell (index.html). Recognised bots/crawlers/
    # unfurlers get server-rendered HTML from /api/render.
    location / {
        set $prerender 0;

        # Search engines, social unfurlers, AI crawlers. Keep this list
        # roughly in sync with the backend's seo.Middleware regex; the
        # backend catches anything we miss here.
        if ($http_user_agent ~*
          "googlebot|adsbot-google|mediapartners-google|googleother|bingbot|bingpreview|adidxbot|msnbot|slurp|yahoo|duckduckbot|baiduspider|baidu|yandex|yandexbot|seznambot|sogou|exabot|naverbot|yeti|applebot|petalbot|gigabot|ia_archiver|facebookexternalhit|facebot|meta-external|twitterbot|telegrambot|slackbot|discordbot|linkedinbot|whatsapp|pinterest|redditbot|skypeuripreview|flipboard|vkshare|tumblr|embedly|gptbot|chatgpt-user|oai-searchbot|claudebot|claude-web|anthropic-ai|perplexitybot|google-extended|applebot-extended|ccbot|bytespider|amazonbot|cohere-ai|mistralai-user|diffbot|youbot|firecrawl|omgilibot|developers\.google\.com") {
            set $prerender 1;
        }

        # Legacy escaped-fragment hint (still emitted by some old tooling).
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }

        # Defensive: don't loop on Prerender's own UA.
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }

        # Static assets (JS bundles, images, fonts) bypass — not pages.
        if ($uri ~ \.[a-zA-Z0-9]+$) {
            set $prerender 0;
        }

        # `last` re-evaluates locations so this falls into /api below
        # and inherits its full proxy_set_header setup.
        if ($prerender = "1") {
            rewrite (.*) /api/render$1?domain=$default_domain&prerender&$args last;
        }

        try_files $uri $uri/ /index.html;
    }

    # Frontend bootstrap config — served inline so the SPA can pick up
    # per-domain settings without an extra round-trip.
    location /default_connect.json {
        add_header Content-Type application/json;
        return 200 '{
            "domain": "$default_domain",
            "chains": [
                {"chainId": <your-chain-id>, "rpc": "https://<your-domain>/<your-chain-prefix>/api/"}
            ]
        }';
    }

    # Empty block prevents fallthrough to `location /` (and its SPA
    # index.html rewrite); files under /assets/ are served as-is.
    location /assets/ {
    }

    # ---- Backend API + WebSocket proxy --------------------------------
    # Matches /api and /pls/api; both reach the main backend upstream.
    # Regex location so it wins over the prefix /monad/api below for
    # /api/* paths and so the SEO/SPA `last` rewrites land here.
    # If you only have one backend, drop the /pls/api alias and the
    # /monad/api block below.
    location ~ ^/(api|pls/api) {
        rewrite ^/pls/api(/.*)$ /api$1 break;

        proxy_http_version 1.1;
        proxy_pass         http://<your_backend_upstream>;

        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade           $http_upgrade;
        proxy_set_header Connection        $connection_upgrade;

        proxy_read_timeout    86400s;
        proxy_connect_timeout 86400s;
        proxy_send_timeout    86400s;
        proxy_buffering       off;

        client_max_body_size 20M;
    }

    # Optional second backend on a different chain (delete if not needed).
    location /monad/api {
        rewrite ^/monad/api(/.*)$ /api$1 break;

        proxy_http_version 1.1;
        proxy_pass         http://<your_second_backend_upstream>;

        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header Upgrade           $http_upgrade;
        proxy_set_header Connection        $connection_upgrade;

        proxy_read_timeout    86400s;
        proxy_connect_timeout 86400s;
        proxy_send_timeout    86400s;
        proxy_buffering       off;

        client_max_body_size 20M;
    }

    # ---- Static asset directories (CORS-enabled, autoindex on) --------
    location /public_files/ {
        alias /path/to/your/public_files/;

        autoindex            on;
        autoindex_exact_size off;
        autoindex_localtime  on;

        add_header Access-Control-Allow-Origin  *             always;
        add_header Access-Control-Allow-Methods "GET,OPTIONS" always;
        add_header Access-Control-Allow-Headers "DNT,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization" always;

        if ($request_method = OPTIONS) {
            add_header Access-Control-Max-Age 1728000                   always;
            add_header Content-Type           "text/plain; charset=UTF-8" always;
            add_header Content-Length         0                          always;
            return 204;
        }
    }

    location /domain_assets/ {
        alias /path/to/your/savva_data/domain_assets/;

        autoindex            on;
        autoindex_exact_size off;
        autoindex_localtime  on;
        autoindex_format     json;

        add_header Access-Control-Allow-Origin  *             always;
        add_header Access-Control-Allow-Methods "GET,OPTIONS" always;
        add_header Access-Control-Allow-Headers "DNT,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization" always;

        if ($request_method = OPTIONS) {
            add_header Access-Control-Max-Age 1728000                   always;
            add_header Content-Type           "text/plain; charset=UTF-8" always;
            add_header Content-Length         0                          always;
            return 204;
        }
    }

    location /temp_assets/ {
        alias /path/to/your/savva_data/temp_assets/;

        autoindex            on;
        autoindex_exact_size off;
        autoindex_localtime  on;
        autoindex_format     json;

        add_header Access-Control-Allow-Origin  *             always;
        add_header Access-Control-Allow-Methods "GET,OPTIONS" always;
        add_header Access-Control-Allow-Headers "DNT,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization" always;

        if ($request_method = OPTIONS) {
            add_header Access-Control-Max-Age 1728000                   always;
            add_header Content-Type           "text/plain; charset=UTF-8" always;
            add_header Content-Length         0                          always;
            return 204;
        }
    }

    # PDF reports — force download regardless of browser sniffing.
    location /reports/ {
        alias /path/to/your/savva_data/reports/;
        default_type application/octet-stream;
    }
}

Apply and verify

# 1. Test the syntax. Catches typos before they hurt.
sudo nginx -t

# 2. Reload (no dropped connections).
sudo systemctl reload nginx

# 3. Smoke tests — replace <your-domain> with your actual hostname.

# Bot path returns rendered HTML, not the SPA shell.
curl -sA "Googlebot" https://<your-domain>/ | head -10
# Expect: <!DOCTYPE html><html lang="en"><head>...<title>...</title>

# robots.txt comes from the backend.
curl -s https://<your-domain>/robots.txt
# Expect: User-agent: * / Disallow: /api/ ... / Sitemap: https://...

# Sitemap index.
curl -s https://<your-domain>/sitemap.xml | head -5
# Expect: <?xml version="1.0"...?><sitemapindex...

# Modern AI crawler also rendered (proves the new UA regex works).
curl -sA "Mozilla/5.0 (compatible; ClaudeBot/1.0; [email protected])" \
  https://<your-domain>/ | grep -E "og:title|<title>" | head -3

# Human path STILL gets the SPA shell (regression check).
curl -sA "Mozilla/5.0 Chrome/120" https://<your-domain>/ | head -5
# Expect: SPA shell, NOT bot-rendered HTML.

If any of these return the SPA shell when they shouldn't (or vice
versa), the most common causes are: backend not yet running the new
version; set $default_domain "..." value doesn't match a key in
savva-backend.yml; or your /api upstream isn't reachable from the
nginx host.
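
For the last of those causes, take nginx out of the picture and query
the backend directly from the nginx host. A sketch (the address and
port are illustrative; use whatever your <your_backend_upstream>
points at):

# If this answers but the same request through nginx does not, the
# problem is on the nginx side rather than the backend.
curl -s "http://127.0.0.1:8080/api/robots.txt?domain=<your-domain-key>"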

Notes on the multi-backend pattern

The example shows two backend upstreams (<your_backend_upstream> for
/api + /pls/api, <your_second_backend_upstream> for /monad/api)
because some SAVVA deployments span chains. With last in the bot
rewrite, SEO traffic always falls into the main /api block — exactly
what you want, even when you have multiple chains. If you only have one
backend, drop both the /pls/api alias and the entire /monad/api
location block.

If you specifically need bot traffic routed to a different backend than
your normal API traffic (rare — usually only when the two backends
serve different domain configs), use break instead of last in the
bot rewrite and add explicit proxy_set_header lines and
proxy_pass http://<that-backend> inside the if ($prerender = "1")
block.
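
If you do run that split, it's worth confirming where bot traffic
actually lands; what distinguishes the two backends' responses is
deployment-specific, so the grep target below is only a placeholder:

# The rendered output for a bot UA should reflect the domain config of
# the backend you routed it to, e.g. via the page title.
curl -sA "Googlebot" https://<your-domain>/ | grep -io '<title>[^<]*</title>'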