Lighthouse Journey #3 — Small Fix, Big Impact: The Lang Tag Mistake

by Huynh

“One dash, one country, one big lesson.”

When we first added multilingual support to WakaranEng,
we did what most developers would probably do:

const locales = ['en', 'vi', 'ja', 'cn']

Short. Clean. Easy to read.
English, Vietnamese, Japanese, “Chinese”.

And for a while, it “worked”.
The pages rendered.
The routes changed.
The content switched language.

But Lighthouse — and modern SEO rules —
were not impressed.


🌐 The Problem: Pretty, but not Standard

On the surface, en, vi, ja, and cn looked fine.
They were convenient identifiers for routing and logic.

But there was one big problem:

Not all of them are valid language tags, and none of them names a region.

The web doesn’t just care about “English” or “Japanese”.
It cares about which English, which Japanese, which Chinese.

  • en → a valid BCP 47 tag, but generic (no region)
  • vi → valid, but not the full vi-VN
  • ja → valid, but not ja-JP
  • cn → not a language code at all (CN is the country code for China); should be zh-CN

Lighthouse started warning us about language and SEO:

  • Incorrect or incomplete lang attributes
  • Locale values not following standard format
  • Potential confusion for screen readers and search engines

From the user’s perspective, nothing looked broken.
From the browser and SEO perspective,
we were speaking in half-languages.
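
One rough way to see the difference between a tag that looks tidy and a tag that tooling recognizes is to ask the JavaScript runtime. The sketch below is not what Lighthouse does internally. `isRecognizedLocale` is a hypothetical helper, and its result depends on the locale data shipped with the runtime (it assumes a full-ICU build, such as a standard Node.js release).

```typescript
// Hypothetical helper, not part of the WakaranEng codebase.
// Returns true when the runtime has locale data for the given tag.
function isRecognizedLocale(tag: string): boolean {
  try {
    // supportedLocalesOf throws a RangeError for malformed tags
    // (for example 'zh_CN' with an underscore) and returns an empty
    // array when the tag is well formed but unknown to the runtime.
    return Intl.DateTimeFormat.supportedLocalesOf(tag).length > 0
  } catch {
    return false
  }
}

for (const tag of ['en', 'vi', 'ja', 'cn', 'zh-CN']) {
  console.log(tag, isRecognizedLocale(tag))
}
```

On such a runtime, cn should be the only tag in that list reported as unrecognized: it is well formed, but it is not a registered language code.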


🩹 The Fix: Map Internal Locales to Real Ones

We didn’t want to throw away our simple internal codes.
They were already used across routing, components, and logic.

So instead of changing locales everywhere,
we added a clear mapping layer between internal codes and real-world formats.

1. Fixing <html lang> in layout.tsx

// layout.tsx
const htmlLang = locale === 'cn' ? 'zh-CN' : locale

But that was just the first step.
We later expanded this idea and made sure every locale
was converted properly whenever it touched the outside world.
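
As a minimal sketch of what such a mapping layer could look like when kept in one place: the names `InternalLocale`, `BCP47_TAGS`, `toBcp47`, and `toOgLocale` are illustrative assumptions, not taken from the WakaranEng codebase, and the region choices follow the date helper shown later in this post.

```typescript
// Hypothetical central mapping module; all names here are illustrative.
type InternalLocale = 'en' | 'vi' | 'ja' | 'cn'

// Internal routing codes on the left, explicit BCP 47 tags on the right.
const BCP47_TAGS: Record<InternalLocale, string> = {
  en: 'en-US',
  vi: 'vi-VN',
  ja: 'ja-JP',
  cn: 'zh-CN',
}

// Tag for <html lang>, Intl APIs, and anything else that expects BCP 47.
function toBcp47(locale: InternalLocale): string {
  return BCP47_TAGS[locale]
}

// Open Graph expects language_TERRITORY, so the dash becomes an underscore.
function toOgLocale(locale: InternalLocale): string {
  return toBcp47(locale).replace('-', '_')
}

console.log(toBcp47('cn'), toOgLocale('cn'))
```

Typing the internal codes as a union means a new locale cannot be added to routing without the compiler asking for its real-world tag as well.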


2. Formatting dates with proper locale tags

For date formatting, we introduced a helper:

const getLocalizedDateFormat = (locale: string) => {
  switch (locale) {
    case 'en':
      return 'en-US'
    case 'cn':
      return 'zh-CN'
    case 'vi':
      return 'vi-VN'
    case 'ja':
    default:
      return 'ja-JP'
  }
}

Internally, we still think in en, vi, ja, cn.
But whenever we hand something to Intl.DateTimeFormat
(or anything that cares about real locale codes),
we give it a valid, explicit language–region.
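
A short sketch of that hand-off, assuming a hypothetical wrapper named `formatPostDate` and an arbitrary example date. The helper is repeated so the snippet runs on its own, and the time zone is pinned to UTC only to keep the output independent of the machine running it.

```typescript
// Same helper as above, repeated so this sketch is self-contained.
const getLocalizedDateFormat = (locale: string): string => {
  switch (locale) {
    case 'en':
      return 'en-US'
    case 'cn':
      return 'zh-CN'
    case 'vi':
      return 'vi-VN'
    case 'ja':
    default:
      return 'ja-JP'
  }
}

// Hypothetical wrapper: converts the internal code before calling Intl.
function formatPostDate(date: Date, locale: string): string {
  return new Intl.DateTimeFormat(getLocalizedDateFormat(locale), {
    year: 'numeric',
    month: 'long',
    day: 'numeric',
    timeZone: 'UTC', // pinned so the result does not depend on the server
  }).format(date)
}

const published = new Date(Date.UTC(2025, 0, 15)) // arbitrary example date
console.log(formatPostDate(published, 'en')) // January 15, 2025
console.log(formatPostDate(published, 'cn'))
```

Note that any unknown internal code falls through to ja-JP, because the helper's default branch shares the ja case.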


3. Open Graph metadata in metadata.ts

Open Graph uses underscores instead of dashes,
so we added a dedicated mapping there too:

// metadata.ts
const ogLocale =
  locale === 'cn'
    ? 'zh_CN'
    : locale === 'ja'
    ? 'ja_JP'
    : locale === 'vi'
    ? 'vi_VN'
    : 'en_US'

Now social previews and link unfurls
carry the correct locale information as well.
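
For illustration, here is roughly what those values look like once they reach the page head. This is a sketch under assumptions: `OG_LOCALES` and `ogLocaleTags` are hypothetical names, and listing the other languages through og:locale:alternate (a property defined by the Open Graph protocol) is an optional extra that the original metadata.ts may not do.

```typescript
// Hypothetical table mirroring the ternary above.
const OG_LOCALES: Record<string, string> = {
  en: 'en_US',
  vi: 'vi_VN',
  ja: 'ja_JP',
  cn: 'zh_CN',
}

// Builds the og:locale tag for the current page, plus one
// og:locale:alternate tag for each other supported language.
function ogLocaleTags(locale: string): string[] {
  const current = OG_LOCALES[locale] || 'en_US' // same fallback as the ternary
  const tags = [`<meta property="og:locale" content="${current}" />`]
  for (const key of Object.keys(OG_LOCALES)) {
    const other = OG_LOCALES[key]
    if (other !== current) {
      tags.push(`<meta property="og:locale:alternate" content="${other}" />`)
    }
  }
  return tags
}

console.log(ogLocaleTags('cn').join('\n'))
```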


4. SEO utilities and language mapping

In our SEO utilities, we also normalized language codes:

// SEOUtils.ts
const localeToLanguage: Record<string, string> = {
  en: 'en',
  vi: 'vi',
  ja: 'ja',
  cn: 'zh-CN', // Changed from 'zh' to 'zh-CN' to name the region explicitly (BCP 47)
}

It’s a tiny detail — one comment, one change —
but it makes our intent explicit:
Chinese content is in Simplified Chinese for mainland China (zh-CN),
not just some generic “zh”.
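
This post does not show how localeToLanguage is consumed. One common use for a map like this is generating hreflang alternate links, so the sketch below assumes that. `hreflangLinks` is a hypothetical helper, and the /{locale}{path} URL shape and the example.com base URL are assumptions for illustration only.

```typescript
// Same map as above, repeated so this sketch is self-contained.
const localeToLanguage: Record<string, string> = {
  en: 'en',
  vi: 'vi',
  ja: 'ja',
  cn: 'zh-CN',
}

// Hypothetical helper: one <link rel="alternate"> per supported locale.
// Assumes each language version lives at baseUrl/{internal locale}{path}.
function hreflangLinks(baseUrl: string, path: string): string[] {
  return Object.keys(localeToLanguage).map(
    (locale) =>
      `<link rel="alternate" hreflang="${localeToLanguage[locale]}" href="${baseUrl}/${locale}${path}" />`
  )
}

console.log(hreflangLinks('https://example.com', '/blog').join('\n'))
```

The internal code cn stays in the URL, while the hreflang value carries the standard tag zh-CN.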


✅ The Outcome: Warnings Gone, Intent Clear

After these changes, the Lighthouse warnings about language and SEO disappeared.

  • The <html lang> attribute became valid and specific.
  • Dates and formats used proper BCP 47 locale tags.
  • Open Graph metadata matched the real locale of the page.
  • SEO and accessibility audits stopped complaining about language issues.

We didn’t rewrite the site.
We didn’t touch the design.
We didn’t change our content.

All we did was respect the standards
and tell the browser exactly which language we were speaking.

“Sometimes the web doesn’t punish you for being wrong —
it punishes you for being vague.”


🧠 Why This Matters More Than It Looks

This wasn’t just about pleasing Lighthouse.

  • Screen readers rely on lang to choose pronunciation and voice.
  • Search engines can use language and locale signals when deciding where and how to serve your page.
  • AI systems and LLM-based search may use language metadata
    to cluster, rank, and interpret content.

By moving from en to en-US, cn to zh-CN,
we weren’t just fixing a warning.

We were saying:

“We know exactly who this page is for, and in which language.”

That’s respect — for users, for tools, and for the future web.


🔜 What’s Next — The Final Chapter of This Series

The next article will be the last entry in this Lighthouse Journey:

Lighthouse Journey #4 – Writing for AI Search Engines

In that final post, we’ll put everything together:

  • How we structured content, metadata, alt text, and summaries
  • How we think about AI crawlers and LLM-based search, not just classic SEO
  • And the final Lighthouse results for wakaran-eng.com
    after all the changes from this series

We started with warnings and confusion.
We’ll end with a clear picture of where the site now stands —
not as a “perfect” website,
but as an honest, well-understood one.


🌱 Reflection

We like small fixes like this.
They fit the spirit of WakaranEng perfectly:

  • Notice something subtle.
  • Understand why it matters.
  • Fix it clearly.
  • Share the lesson.

This wasn’t a dramatic refactor.
It was just a quiet upgrade in how we describe ourselves to the web.

“Clarity is also a feature —
and sometimes, it’s only one dash away.”
