Chrome crashes when opening non-ASCII URL

In my last post I promised to write about log viewing tools, but since then I ran into a new issue: Chrome on Android crashes when opening a non-ASCII URL, and I wanted to let you know about it ASAP.

Recently I noticed that one of our tests, which opens a page with non-ASCII characters in the URL (a so-called IDN, Internationalized Domain Name), started failing on all or almost all devices. Investigation showed that when such a URL is opened via the am start command, Chrome simply crashes during startup. For example, a command like this causes the crash: adb shell am start -a android.intent.action.VIEW -d 日本語.jp -n com.android.chrome/com.google.android.apps.chrome.Main

It’s worth mentioning that the crash occurs only when Chrome is not already running in the background or foreground; if there is a running instance, it doesn’t occur.

I strongly suspect that the cause of this issue is a faulty Chrome update (I’m observing it on v103), but since I haven’t tried downgrading Chrome, I’m not 100% sure of this (99.9%, though).

So how can this issue be remedied?

First of all, let me say that enclosing the URL in single or double quotes, or adding “https://” in front of it, does not solve the issue.

An obvious solution is to first launch Chrome using the command above without the “-d” option, and only then open the URL.
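If your tests drive adb from Java, this two-step workaround could be sketched roughly as below. The class name, the method names and the fixed pause are my own assumptions for illustration; only the am start arguments come from the command above. The pause in particular is a guess, and a slower device may need a longer one.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Sketch of the "start Chrome first, then open the URL" workaround. */
class WarmStartWorkaround {

    // Component name taken from the am start command in this post.
    static final String CHROME_MAIN =
            "com.android.chrome/com.google.android.apps.chrome.Main";

    // Assumption: a fixed pause is enough for Chrome to finish starting.
    static final long STARTUP_PAUSE_MS = 3000;

    /** The original command minus "-d <url>": it only brings Chrome up. */
    static List<String> launchCommand() {
        return Arrays.asList("adb", "shell", "am", "start",
                "-a", "android.intent.action.VIEW",
                "-n", CHROME_MAIN);
    }

    /** The original command, meant to be sent once Chrome is already running. */
    static List<String> openUrlCommand(String url) {
        return Arrays.asList("adb", "shell", "am", "start",
                "-a", "android.intent.action.VIEW",
                "-d", url,
                "-n", CHROME_MAIN);
    }

    /** Runs a command, forwards its output, and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        return new ProcessBuilder(command).inheritIO().start().waitFor();
    }

    /** Starts Chrome without a URL, pauses, then opens the URL. */
    static void openUrl(String url) throws IOException, InterruptedException {
        run(launchCommand());
        Thread.sleep(STARTUP_PAUSE_MS);
        run(openUrlCommand(url));
    }
}
```

Calling WarmStartWorkaround.openUrl with the IDN URL then sends both commands in order. It assumes adb is on the PATH and a single device is connected.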

Another solution is to convert the URL to Punycode before passing it to the command. For those who don’t know, Punycode is a way to encode non-ASCII domain names (IDNs) into ones containing only ASCII characters, which makes them compatible with DNS. Most or all browsers perform this conversion automatically. For example, “日本語.jp” in Punycode looks like “xn--wgv71a119e.jp”. If this Punycode-encoded string is passed to the am start command instead of the original URL, Chrome won’t crash.
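To make the encoding less abstract, here is a minimal Java sketch of the round trip using the standard java.net.IDN class. The class and method names (PunycodeDemo, toPunycode, fromPunycode) are made up for this example, and the host is written with Unicode escapes for 日本語 only so the snippet compiles regardless of source file encoding.

```java
import java.net.IDN;

/** Round trip between a Unicode host name and its Punycode (ASCII) form. */
class PunycodeDemo {

    /** Encodes every non-ASCII label of the host; ASCII labels stay as they are. */
    static String toPunycode(String host) {
        return IDN.toASCII(host);
    }

    /** Decodes "xn--" labels back to Unicode. */
    static String fromPunycode(String host) {
        return IDN.toUnicode(host);
    }

    public static void main(String[] args) {
        // The three escapes spell the Japanese label from the example above.
        String unicodeHost = "\u65E5\u672C\u8A9E.jp";
        String asciiHost = toPunycode(unicodeHost);

        System.out.println(asciiHost); // prints xn--wgv71a119e.jp
        System.out.println(fromPunycode(asciiHost).equals(unicodeHost)); // prints true
    }
}
```

Note that only the label containing non-ASCII characters changes; the “jp” part is passed through untouched.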

To encode the URL, you can use either one of the many online converters (e.g. www.punycoder.com) or a library in your programming language. E.g. in Java the conversion looks like IDN.toASCII("日本語.jp", IDN.ALLOW_UNASSIGNED), but keep in mind that if the URL passed to this method contains a protocol part (i.e. “http://” or “https://”), then it will also be converted, which makes the resulting URL invalid. So in Java it is better to extract the domain part and convert only that:

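import java.net.IDN;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Convert the host part of a URL from Unicode to Punycode,
 * leaving the protocol, port and the rest of the URL untouched.
 */
public static String normalizeHostname(String url) {
    // Groups: 1 = optional protocol, 2 = host, 3 = optional port, 4 = the rest (path, query, ...)
    Pattern pattern = Pattern.compile("(https?://)?([^:/]*)(:\\d*)?(.*)");
    Matcher matcher = pattern.matcher(url);
    if (!matcher.find()) {
        return url; // nothing recognizable to convert
    }
    // Optional groups that did not take part in the match are null
    String protocol = Objects.toString(matcher.group(1), "");
    String domain = Objects.toString(matcher.group(2), "");
    String port = Objects.toString(matcher.group(3), "");
    String path = Objects.toString(matcher.group(4), "");

    return protocol + IDN.toASCII(domain, IDN.ALLOW_UNASSIGNED) + port + path;
}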

Hope this helps someone.
