Can Yandex Process Cyrillic On All Site Elements?

Recently I wrote a post about how Google and Yandex process IDN TLDs, which prompted questions about Cyrillic usage in other site elements, such as URLs, XML sitemaps and the robots.txt file.

Cyrillic domain names and URI paths are indexed and processed the same way as Latin domains and URI paths. However, Cyrillic cannot be used as a replacement for Latin in:

  • The robots.txt file
  • Server HTTP headers
  • XML Sitemap files

Punycode is used to parse domain names, and page URIs are recorded in the encoding that matches the encoding of the current site structure.
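To illustrate the Punycode step, here is a minimal sketch that converts a Cyrillic hostname to its ASCII ("xn--") form and back. It assumes Python's built-in "idna" codec, which follows the older IDNA 2003 rules, and the hostname "пример.рф" is a hypothetical placeholder. It shows the general conversion, not how Yandex implements it internally.

```python
# Sketch: convert a Cyrillic hostname to its ASCII (Punycode) form and back.
# Assumption: Python's built-in "idna" codec (IDNA 2003) is close enough
# for illustration; the hostname below is a hypothetical example.


def to_ascii_host(hostname: str) -> str:
    """Return the ASCII-compatible ("xn--") form of an internationalized hostname."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_host(ascii_host: str) -> str:
    """Return the Unicode form of an ASCII-compatible hostname."""
    return ascii_host.encode("ascii").decode("idna")


host = "пример.рф"  # hypothetical Cyrillic domain
ascii_host = to_ascii_host(host)

print(ascii_host)                   # each label now starts with "xn--"
print(to_unicode_host(ascii_host))  # round-trips back to the Cyrillic form
```

The ASCII form is the one to use wherever Cyrillic is not accepted, such as the host part of URLs listed in an XML sitemap.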

It is recommended to use the same encoding for the pages of the site and the Cyrillic addresses in its structure.

For example, if a page with its encoding set to UTF-8 contains the link <a href="/корзина"> (Russian for "basket"), the Yandex bot will save the URL in that encoding, which means the page should be available at "/%D0%BA%D0%BE%D1%80%D0%B7%D0%B8%D0%BD%D0%B0".
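The following sketch shows why matching encodings matter. It percent-encodes the Cyrillic path "/корзина" (the Russian word for basket) twice: once as UTF-8 and once as Windows-1251. Windows-1251 is an assumption chosen here purely for contrast, since it is a common legacy Cyrillic encoding. The same visible path yields two different addresses, so a link saved in one encoding will not match a page served under the other.

```python
from urllib.parse import quote, unquote

# Sketch: the same Cyrillic path percent-encoded under two different encodings.
# Assumption: cp1251 (Windows-1251) is used only as a contrasting example.
path = "/корзина"

utf8_path = quote(path, encoding="utf-8")     # "/" is left unescaped by default
cp1251_path = quote(path, encoding="cp1251")

print(utf8_path)    # /%D0%BA%D0%BE%D1%80%D0%B7%D0%B8%D0%BD%D0%B0
print(cp1251_path)  # /%EA%EE%F0%E7%E8%ED%E0

# Decoding only recovers the original path when the same encoding is used.
print(unquote(utf8_path, encoding="utf-8") == path)      # True
print(unquote(cp1251_path, encoding="cp1251") == path)   # True
```

The UTF-8 result matches the address given in the example above, which is what a crawler would request for a link found on a UTF-8 page.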

Dan Taylor
Dan Taylor is an experienced SEO consultant and has worked with brands and companies on optimizing for Russia (and Yandex) for a number of years. Winner of the inaugural 2018 TechSEO Boost competition, webmaster at HreflangChecker.com and Sloth.Cloud, and founder of RussianSearchNews.com.