SEO for Google and International Search
- Description: Webmaster-side workflow for Google, Bing, and other international engines — Google Search Console setup (domain vs URL prefix property, TXT-record verification via Cloudflare, sitemap submission, URL Inspection, Coverage/Performance reports, Crawl stats), Bing Webmaster Tools (import from GSC + IndexNow), Yandex Webmaster briefly, and the hreflang strategy for English-default bilingual sites.
- My Notion Note ID: K2B-8-3
- Created: 2026-05-23
- Updated: 2026-05-23
- License: Reuse is very welcome. Please credit Yu Zhang and link back to the original on yuzhang.io
Table of Contents
- 1. Engine Landscape Outside China
- 2. Google Search Console — Setup
- 3. Google Search Console — Day-to-Day
- 4. Google Search Console — Diagnostics
- 5. Bing Webmaster Tools
- 6. IndexNow — Faster Crawl Triggers
- 7. Yandex, DuckDuckGo, Brave
- 8. International / hreflang Strategy
- 9. Backlink Building for Tech Audiences
- 10. References
1. Engine Landscape Outside China
For an English-default personal/tech site targeting non-mainland audiences:
| Engine | Approx. share (en, 2026) | Webmaster console | Notes |
|---|---|---|---|
| Google | ~89% global, ~87% en-US | Google Search Console | The only one you can't ignore. |
| Bing | ~4% global, ~7% en-US (incl. ChatGPT/Copilot citations) | Bing Webmaster Tools | Imports GSC; effectively free coverage. |
| DuckDuckGo | ~1-2% | none | Pulls primarily from Bing's index. |
| Yahoo | ~1% | none (deprecated) | Pulls from Bing. |
| Yandex | dominant in RU, ~0.5% global | Yandex Webmaster | Worth setting up if RU audience matters. |
| Brave Search | <1%, growing | none | Independent index since 2022. |
Practical rule: optimize for Google + Bing (via GSC + BWT). Everything else either piggybacks on Bing or doesn't move the needle.
2. Google Search Console — Setup
URL: https://search.google.com/search-console
2.1 Property type
Two options:
- Domain property: covers apex + all subdomains + both HTTP/HTTPS. Requires a DNS TXT record. Preferred.
- URL prefix property: covers only URLs under one exact prefix, protocol included (e.g. https://www.example.com/). Verifies via HTML file upload, HTML meta tag, Google Analytics, or Google Tag Manager. Faster but narrower.
Domain property wins because one verification covers example.com, www.example.com, future blog.example.com, etc.
2.2 Verification via Cloudflare
Two paths — the auto-integration is dramatically easier.
2.2.1 Cloudflare auto-TXT integration (recommended)
When you enter a Cloudflare-managed domain in Search Console's Domain-property setup, GSC detects the DNS provider and offers an Authorize button instead of a manual TXT recipe. Flow:
- Click Authorize → redirected to Cloudflare → grants Google permission to add the TXT
- Cloudflare auto-creates the TXT record (no manual DNS editing)
- GSC verifies in seconds → "Ownership verified"
The TXT it adds is dedicated to verification and doesn't touch other DNS records (A/CNAME for Vercel, MX for email, etc.). Safe to leave in place — if removed, ownership lapses.
2.2.2 Manual TXT record (fallback)
If Cloudflare integration isn't offered (older zones, missing permission, non-Cloudflare DNS):
Type: TXT
Name: @
Content: google-site-verification=<token-from-search-console>
TTL: Auto
Wait ~1-10 minutes for propagation, then click Verify in Search Console. Cloudflare propagates quickly, usually in under a minute.
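Before clicking Verify, it can help to confirm the record is actually visible in DNS. The following is a minimal sketch, assuming the TXT records have already been fetched (for example with Node's dns.promises.resolveTxt, which returns each record as an array of string chunks). The function name hasGoogleVerification and the sample records are hypothetical.

```typescript
// Each TXT record arrives as an array of chunks (long strings are split at
// 255 bytes), so the chunks are joined before comparing.
function hasGoogleVerification(records: string[][], token: string): boolean {
  const expected = `google-site-verification=${token}`;
  return records.some((chunks) => chunks.join('') === expected);
}

// Sample input shaped like the result of dns.promises.resolveTxt('example.com').
const sampleRecords: string[][] = [
  ['v=spf1 include:_spf.example.net ~all'],
  ['google-site-verification=', 'abc123'],
];

console.log(hasGoogleVerification(sampleRecords, 'abc123')); // true
console.log(hasGoogleVerification(sampleRecords, 'other')); // false
```

If this returns false a few minutes after adding the record, the usual suspects are a typo in the token or a record created on the wrong name.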
2.2.3 Backup verification
Cloudflare auto-TXT is the sole verification method when you use § 2.2.1. If the integration breaks or the TXT is accidentally deleted, ownership lapses. Add a backup via Settings → Ownership verification → Add another method → HTML tag. Save the token to NEXT_PUBLIC_GOOGLE_VERIFICATION env var; Next.js layout's metadata.verification.google reads it.
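The env-var wiring described above could look roughly like this. It is a sketch rather than the site's actual layout code: the helper name buildVerificationMetadata is hypothetical, and the local type is a stand-in for the relevant slice of Next.js's Metadata type so the snippet type-checks on its own.

```typescript
// Stand-in for the part of Next.js's Metadata type used here. In a real app,
// use: import type { Metadata } from 'next'.
type VerificationMetadata = { verification?: { google: string } };

// Returns the verification block only when the token is set, so a missing
// env var produces no meta tag rather than an empty one.
function buildVerificationMetadata(
  env: Record<string, string | undefined>,
): VerificationMetadata {
  const token = env['NEXT_PUBLIC_GOOGLE_VERIFICATION']?.trim();
  return token ? { verification: { google: token } } : {};
}

console.log(buildVerificationMetadata({ NEXT_PUBLIC_GOOGLE_VERIFICATION: 'abc123' }));
console.log(buildVerificationMetadata({}));
```

In app/layout.tsx the result would be spread into the exported metadata object, with process.env passed as the argument.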
2.3 First-day actions
- Sitemaps → submission rules differ by property type:
  - URL-prefix property → the relative path sitemap.xml works.
  - Domain property → must submit the full URL https://example.com/sitemap.xml. Submitting the relative form returns "Invalid sitemap address" (the property covers all subdomains, so GSC needs to know which one).
  - Status often shows "Couldn't fetch" on first submission for new sites. Google fetches asynchronously, and the status typically resolves to "Success" within 1-24h. If it is still stuck after a day, hit Retry.
- URL Inspection → paste the home page URL. Status will likely be "URL is not on Google" for a new site. Click Request indexing → manual queue, processed in ~1-7 days. Daily quota ~10-20 requests; prioritize home + top sections, let the rest come via sitemap.
- Settings → Ownership verification → confirm the TXT record is recorded; add a backup verification method (§ 2.2.3) if you used Cloudflare auto-integration.
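The sitemap you submit has to list absolute URLs. As a rough sketch of producing those entries in a Next.js app/sitemap.ts, assuming a hard-coded path list and a BASE_URL placeholder (both hypothetical, as is the helper name buildSitemap):

```typescript
// Local stand-in for one entry of Next.js's MetadataRoute.Sitemap.
type SitemapEntry = { url: string; lastModified?: Date };

// Placeholder origin for illustration.
const BASE_URL = 'https://example.com';

// Turns site-relative paths into absolute sitemap entries.
function buildSitemap(paths: string[], lastModified: Date): SitemapEntry[] {
  return paths.map((path) => ({
    url: new URL(path, BASE_URL).toString(),
    lastModified,
  }));
}

const entries = buildSitemap(['/', '/notes/cpp/templates'], new Date('2026-05-23'));
console.log(entries.map((e) => e.url));
```

In app/sitemap.ts, the default export would return the array from buildSitemap, with the path list generated from your content rather than written by hand.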
3. Google Search Console — Day-to-Day
3.1 Coverage / Pages report
- Shows which URLs are indexed and which are not, with a reason given for each non-indexed group.
- Common "not indexed" reasons + their fixes:
- Discovered – currently not indexed → Google found the URL but hasn't crawled it. Build inbound links; wait.
- Crawled – currently not indexed → Google fetched but chose not to index. Usually quality/duplicate signal. Improve content, add structured data, check for near-duplicates elsewhere on the web.
- Duplicate without user-selected canonical → set alternates.canonical in Next.js metadata.
- Page with redirect → expected for redirected URLs; only a problem if the canonical URL itself isn't indexed.
- Soft 404 → page returns 200 but looks empty/error-ish. Add real content or return actual 404.
- Blocked by robots.txt → check app/robots.ts. The classic "I disallowed / in dev and forgot" bug.
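One way to avoid the forgotten-disallow bug is to derive the rule from the environment instead of editing it by hand. A minimal sketch, assuming the object shape Next.js expects from app/robots.ts; the helper name buildRobots and the local type are illustrative stand-ins:

```typescript
// Local stand-in for Next.js's MetadataRoute.Robots shape.
type RobotsConfig = {
  rules: { userAgent: string; allow?: string; disallow?: string };
  sitemap?: string;
};

// Blocks all crawling outside production and allows everything in production.
function buildRobots(isProduction: boolean): RobotsConfig {
  if (!isProduction) {
    return { rules: { userAgent: '*', disallow: '/' } };
  }
  return {
    rules: { userAgent: '*', allow: '/' },
    sitemap: 'https://example.com/sitemap.xml',
  };
}

console.log(buildRobots(false)); // preview and dev: everything disallowed
console.log(buildRobots(true)); // production: everything allowed, sitemap listed
```

In app/robots.ts the default export would return buildRobots(...) with a production check of your choosing. On Vercel, comparing the VERCEL_ENV variable to 'production' is one option, but treat that wiring as an assumption to verify for your setup.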
3.2 Performance report
- Queries tab → what searches your pages appear for (impressions, clicks, position).
- Pages tab → which URLs got impressions/clicks.
- Filter by country, device, search appearance, date range.
- Empty for the first 1-2 weeks. Useful only after enough impressions accumulate.
3.3 Sitelinks search box
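- For sites with substantial content, Google used to show a search box directly in your SERP entry. Google retired that feature from its results in late 2024, so the markup no longer produces a visible search box. It is harmless to keep and still describes your site-search endpoint. The WebSite + SearchAction JSON-LD that opted in: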
const jsonLd = {
  '@context': 'https://schema.org',
  '@type': 'WebSite',
  url: 'https://example.com',
  potentialAction: {
    '@type': 'SearchAction',
    target: 'https://example.com/search?q={search_term_string}',
    'query-input': 'required name=search_term_string',
  },
};
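A JSON-LD object like the one above only counts once it is embedded in the page as a script tag. A small sketch of the serialization step, with the helper name jsonLdScriptTag being hypothetical:

```typescript
// Serializes a JSON-LD object into a script tag. Every "<" is escaped so a
// string value containing "</script>" cannot close the tag early.
function jsonLdScriptTag(data: Record<string, unknown>): string {
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}

const websiteTag = jsonLdScriptTag({
  '@context': 'https://schema.org',
  '@type': 'WebSite',
  url: 'https://example.com',
});
console.log(websiteTag);
```

In a React component, the same escaped JSON string would typically be passed to a script element through dangerouslySetInnerHTML instead of being concatenated into HTML.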
4. Google Search Console — Diagnostics
- Settings → Crawl stats → per-day request counts, response codes, file types (HTML, image, JS, CSS), and fetched-by-purpose (refresh, discovery). Drop in crawl rate after a deploy usually = newly returning 4xx or slow responses.
- URL Inspection → Test live URL → forces Googlebot to fetch the page right now and shows the rendered HTML it sees. Use this when "indexed" status disagrees with what you expect.
- Removals tool → emergency removal of a URL from SERP (typically takes effect within about a day and lasts roughly six months; permanent removal still needs the page to return 404/410 or carry noindex).
- Security & Manual actions → almost always empty for personal sites, but check after any compromise.
- Core Web Vitals report → LCP / INP / CLS status for groups of similar URLs, based on field data. Fix the groups flagged as "Poor" first.
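For triaging your own measurements against the same buckets, the published Core Web Vitals thresholds can be encoded directly. A sketch, with the function and type names being illustrative:

```typescript
type Rating = 'good' | 'needs-improvement' | 'poor';
type Metric = 'LCP' | 'INP' | 'CLS';

// [upper bound for "good", upper bound for "needs improvement"].
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
};

function rate(metric: Metric, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[metric];
  if (value <= good) return 'good';
  if (value <= needsImprovement) return 'needs-improvement';
  return 'poor';
}

console.log(rate('LCP', 3100)); // needs-improvement
console.log(rate('INP', 650)); // poor
console.log(rate('CLS', 0.05)); // good
```

Search Console rates pages on 75th-percentile field data, so a single lab measurement fed through this function is only a rough proxy.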
4.1 robots.txt and rendering tests
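The legacy robots.txt Tester (google.com/webmasters/tools/robots-testing-tool) was retired in late 2023. Its replacement, Settings → robots.txt report, shows which robots.txt files Google fetched and any parse errors. To check whether a specific path is blocked, use URL Inspection, which is now the primary entry point.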
For JS-heavy pages, use URL Inspection → Test live URL → View tested page → HTML to confirm Googlebot sees server-rendered HTML (Next.js App Router with default SSR is fine). If you see a near-empty <body>, Googlebot is seeing only the JS shell — fix by ensuring SSR/SSG is enabled for that route.
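A quick local approximation of the near-empty body check is to fetch the raw HTML yourself and measure how much visible text it contains before any JavaScript runs. This is a rough regex-based heuristic, not what Googlebot does; the function names and the 200-character threshold are arbitrary assumptions.

```typescript
// Strips scripts, styles, and tags from the <body>, then measures how much
// visible text remains.
function visibleBodyTextLength(html: string): number {
  const bodyMatch = html.match(/<body[^>]*>([\s\S]*?)<\/body>/i);
  const body = bodyMatch?.[1] ?? html;
  const text = body
    .replace(/<(script|style)\b[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/\s+/g, ' ')
    .trim();
  return text.length;
}

// Flags pages whose server response carries almost no readable text.
function looksLikeJsShell(html: string, minChars = 200): boolean {
  return visibleBodyTextLength(html) < minChars;
}

const shell =
  '<html><body><div id="root"></div><script src="/app.js"></script></body></html>';
console.log(looksLikeJsShell(shell)); // true
```

Run it against the response of a plain HTTP fetch of the route. A true result suggests checking whether that route is rendered on the server.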
5. Bing Webmaster Tools
URL: https://www.bing.com/webmasters
5.1 Why bother
- Powers DuckDuckGo, Yahoo, and large chunks of ChatGPT / Copilot / Perplexity citation lookups.
- Setup time: ~2 minutes via GSC import.
5.2 Setup via GSC import
If GSC is already verified:
- Bing Webmaster Tools → Import from Google Search Console → sign in with the same Google account → select properties → import.
- Sites and sitemaps copy across automatically.
Standalone setup is also available (XML file upload, meta tag, or CNAME).
5.3 What Bing offers beyond GSC
- Site Explorer → similar to GSC Coverage but with more aggressive crawl-error surfacing.
- Search Performance → query/page reports comparable to GSC Performance.
- URL submission → 10/day for manual submit, up to 10,000/day via IndexNow (see § 6).
- Backlinks explorer → richer link data than GSC offers for free.
- Markup Validator → tests structured data including OpenGraph and Twitter Card.
6. IndexNow — Faster Crawl Triggers
IndexNow is a protocol launched by Bing and Yandex for proactively pinging engines when content changes.
6.1 How it works
- Generate an API key (any UUID).
- Host https://example.com/<key>.txt containing just the key, which lets Bing verify you own the domain.
- POST to https://api.indexnow.org/indexnow with a JSON body listing changed URLs:
curl -X POST 'https://api.indexnow.org/indexnow' \
-H 'Content-Type: application/json' \
-d '{
"host": "example.com",
"key": "<your-key>",
"urlList": ["https://example.com/notes/new-note"]
}'
- One ping is shared across all participating engines (Bing, Yandex, Seznam, others) — no per-engine submission needed.
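The same submission can be sketched in TypeScript for use inside a build script. The function names are hypothetical, and the keyLocation field assumes the key file sits at the site root as described above. The 10,000-URL cap reflects the per-request limit in the IndexNow documentation.

```typescript
type IndexNowPayload = {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
};

const MAX_URLS_PER_REQUEST = 10000;

// Keeps only URLs on the given host, removes duplicates, and caps the list.
function buildIndexNowPayload(host: string, key: string, urls: string[]): IndexNowPayload {
  const sameHost = urls.filter((u) => {
    try {
      return new URL(u).hostname === host;
    } catch {
      return false; // not a parseable absolute URL
    }
  });
  const unique = Array.from(new Set(sameHost)).slice(0, MAX_URLS_PER_REQUEST);
  return { host, key, keyLocation: `https://${host}/${key}.txt`, urlList: unique };
}

// Sends the payload and resolves to the HTTP status code.
async function submitIndexNow(payload: IndexNowPayload): Promise<number> {
  const res = await fetch('https://api.indexnow.org/indexnow', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json; charset=utf-8' },
    body: JSON.stringify(payload),
  });
  return res.status;
}

const payload = buildIndexNowPayload('example.com', '<your-key>', [
  'https://example.com/notes/new-note',
  'https://example.com/notes/new-note',
  'https://other.example.net/elsewhere',
]);
console.log(payload.urlList); // [ 'https://example.com/notes/new-note' ]
```

Filtering by host up front matters because a request that mixes in URLs from another host is likely to be rejected as a whole.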
6.2 Wiring into the build
Add a vercel.json or GitHub Action that POSTs to IndexNow after deployment. Sketch:
# .github/workflows/indexnow.yml
# Runs after the named deploy workflow finishes; the job is skipped unless that run succeeded.
on: { workflow_run: { workflows: ['Vercel Production Deploy'], types: [completed] } }
jobs:
  ping:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    runs-on: ubuntu-latest
    steps:
      - run: |
          curl -X POST 'https://api.indexnow.org/indexnow' \
            -H 'Content-Type: application/json' \
            -d '{"host":"example.com","key":"${{ secrets.INDEXNOW_KEY }}","urlList":["https://example.com/"]}'
For finer granularity, generate the URL list from changed files in the deploy commit.
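As an illustration of that finer-grained approach, changed file paths (for example the output of git diff --name-only between two commits) can be mapped to public URLs. The content layout assumed here, content/notes/<slug>.md served at /notes/<slug>, is hypothetical and needs adapting to the real repository.

```typescript
// Maps changed content files to public URLs and ignores everything else.
// Assumed layout (hypothetical): content/notes/<slug>.md(x) -> /notes/<slug>
function changedFilesToUrls(files: string[], origin: string): string[] {
  const urls = new Set<string>();
  for (const file of files) {
    const match = file.match(/^content\/(notes\/.+)\.mdx?$/);
    if (match) urls.add(`${origin}/${match[1]}`);
  }
  return Array.from(urls);
}

const changed = [
  'content/notes/cpp/templates.md',
  'app/layout.tsx',
  'content/notes/seo/google.mdx',
];
console.log(changedFilesToUrls(changed, 'https://example.com'));
// [ 'https://example.com/notes/cpp/templates', 'https://example.com/notes/seo/google' ]
```

The resulting array can be used as the urlList of the IndexNow request. A layout or config change touches no content file and therefore yields an empty list, in which case the ping can be skipped.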
Google does not support IndexNow as of 2026 — Google's only public signal is the GSC URL Inspection "Request indexing" button.
7. Yandex, DuckDuckGo, Brave
- Yandex Webmaster (webmaster.yandex.com): register if you care about Russian traffic. Setup is similar to GSC (TXT or HTML verification, sitemap submission, index-status monitoring). IndexNow integration works.
- DuckDuckGo: no webmaster console. DDG sources primarily from Bing's index; getting indexed in Bing covers DDG.
- Brave Search: independent index. No webmaster console as of 2026. They claim to index based on Common Crawl + their own crawler; no submission API.
- Kagi: paid search, no webmaster tools, no submission.
Tactically: if you're shipping IndexNow (§ 6) you already reach Bing + Yandex. Brave + Kagi are organic discovery only.
8. International / hreflang Strategy
For a bilingual EN / ZH site that wants both Google and Baidu coverage:
8.1 Same URL, language switcher (recommended)
- One canonical URL per page (e.g. /notes/cpp/templates).
- Language switcher in the page UI sets a ?lang=en or ?lang=zh param (or cookie / localStorage).
- hreflang annotations tell Google which version to serve per user locale:
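import type { Metadata } from 'next';

// Relative URLs here resolve against metadataBase, which is set in the root layout.
export const metadata: Metadata = {
  alternates: {
    canonical: '/notes/cpp/templates',
    languages: {
      'en-US': '/notes/cpp/templates?lang=en',
      'zh-CN': '/notes/cpp/templates?lang=zh',
      'x-default': '/notes/cpp/templates',
    },
  },
};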
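- x-default tells Google what to show when no language matches. For tech notes, that is usually the English version.
- For Google + Bing this is enough. Baidu ignores hreflang; for serious Baidu ranking you need separately optimized zh-CN pages, see SEO Baidu And China.
- One thing to verify: Google generally honors hreflang only between URLs that are each indexable and self-canonical, so check in URL Inspection that the ?lang= variants are not simply being folded into the bare path.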
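To avoid hand-writing the alternates block on every page, the mapping can be generated from the path. A sketch under the same-URL scheme described here; the helper name buildAlternates and the LOCALES table are hypothetical:

```typescript
type Alternates = { canonical: string; languages: Record<string, string> };

// Maps hreflang codes to the value of the ?lang= query parameter.
const LOCALES: Record<string, string> = { 'en-US': 'en', 'zh-CN': 'zh' };

// Builds the canonical + hreflang alternates for one page path.
function buildAlternates(path: string): Alternates {
  const languages: Record<string, string> = { 'x-default': path };
  for (const hreflang of Object.keys(LOCALES)) {
    languages[hreflang] = `${path}?lang=${LOCALES[hreflang]}`;
  }
  return { canonical: path, languages };
}

console.log(buildAlternates('/notes/cpp/templates'));
```

A page's metadata (or generateMetadata) would then set alternates to buildAlternates(path), keeping every note's annotations consistent.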
8.2 Locale subpaths (alternative)
- /en/notes/cpp/templates and /zh/notes/cpp/templates: two separate URL spaces.
- More SEO juice per language (Google treats each as a fully independent page).
- Significantly more routing/sitemap complexity. Worth it only if you're committing to substantial per-language content divergence.
8.3 Locale-specific domains
- example.com (en) + example.cn (zh): separate domains.
- Strongest geo/language signal but doubles infrastructure. Required only when one locale is a separate business.
For a personal tech site, § 8.1 (same URL + language switcher) is the right default.
9. Backlink Building for Tech Audiences
Discovery accelerates with inbound links. Practical channels for English-speaking tech readers:
- Hacker News — submit a single high-quality post when launching. Don't farm.
- lobste.rs: invite-only, but links submitted by members rank well.
- r/programming, r/cpp, r/javascript, etc. — domain-relevant subreddits.
- dev.to / hashnode — republish (with canonical pointing to your site) for an extra discovery surface without duplicate-content cost.
- GitHub README — link from any popular repos you maintain.
- Conference talk slides — link to backing notes.
- Newsletters — guest posts or curated mentions (e.g., Bytes, Pointer, Console).
Each genuine inbound link is worth weeks of organic crawl-rate improvement. Avoid link farms, automated comment posting, and paid-link networks — those trigger manual actions in GSC.
10. References
- Google Search Console — https://search.google.com/search-console
- Google Search Central — Search Console help — https://support.google.com/webmasters
- Google Search Central — Get started with hreflang — https://developers.google.com/search/docs/specialty/international/localized-versions
- Bing Webmaster Tools — https://www.bing.com/webmasters
- Bing Webmaster — Getting started — https://www.bing.com/webmasters/help/getting-started-checklist-7b3eaf66
- IndexNow protocol — https://www.indexnow.org/documentation
- Yandex Webmaster — https://webmaster.yandex.com/
- web.dev — Core Web Vitals — https://web.dev/articles/vitals
- Schema.org WebSite + SearchAction — https://developers.google.com/search/docs/appearance/structured-data/sitelinks-searchbox