
[Spike, 2hrs]: Can we find the source of the iPhone (iOS 13) Safari bug impacting mobile users?
Closed, Resolved · Public · Spike

Description

There is a fairly frequent bug in Safari on iPhones running iOS 13.

The user agent Mozilla/5.0 (iPhone; CPU iPhone OS 13_4_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1 Mobile/15E148 Safari/604.1 is generating the majority of errors logged to WebClientError, as shown by Turnilo.

Screen Shot 2020-05-08 at 8.55.09 AM.png (1×1 px, 201 KB)

We should see if we can get to the bottom of this error by using a real device, running through smoke tests, and monitoring the web console and network tab for errors to get a sense of the impact. 2hrs playing with an iPhone.

In addition to this, I suggest we use User-notice to reach out to the community to see if anyone using an iPhone on mobile has encountered any bugs recently.

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Spike". (May 8 2020, 4:16 PM)
Restricted Application added a subscriber: Aklapper.

Where do you want it reported if users do have problems? Is there a talk page we can direct them to? A lot of non-technical editors are confused by Phab.

Well I found an error. I don't think it's responsible for the vast majority of errors, but maybe it's related.

Using Turnilo, I saw that most of the errors are from anonymous users. So, as an anon, I was clicking around the mobile site, opened up VE, and while on the "Edit without logging in" warning screen, I clicked the add-link button in the toolbar. Right about then VE started doing its own logging, but the first event was sent to statsv, with the following payload:

ve.mobile.performance.system.apiLoad: 536ms
MediaWiki.minerva.WebClientError.anon: 1c

In a url like this: https://en.m.wikipedia.org/beacon/statsv?ve.mobile.performance.system.apiLoad=536ms&MediaWiki.minerva.WebClientError.anon=1c
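To make the payload easier to read, here is a minimal sketch that splits such a beacon URL into individual metrics. It assumes each query parameter has the form "metric name = number followed by a unit", as in the example above; the function name parseStatsvBeacon is hypothetical and not part of statsv.

```javascript
// Sketch: split a statsv beacon URL into individual metrics.
// Assumption: every query parameter looks like "<name>=<number><unit>",
// e.g. "536ms" or "1c", as in the example URL in this task.
function parseStatsvBeacon(beaconUrl) {
  const { searchParams } = new URL(beaconUrl);
  const metrics = [];
  for (const [name, raw] of searchParams) {
    const match = /^(-?\d+(?:\.\d+)?)([a-z]+)$/i.exec(raw);
    if (!match) {
      continue; // skip anything that does not look like value + unit
    }
    metrics.push({ name, value: Number(match[1]), unit: match[2] });
  }
  return metrics;
}

const metrics = parseStatsvBeacon(
  'https://en.m.wikipedia.org/beacon/statsv?ve.mobile.performance.system.apiLoad=536ms&MediaWiki.minerva.WebClientError.anon=1c'
);
console.log(metrics);
```

The second entry is the one that matters here: the VE timing event and the Minerva error counter were sent together in a single request.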

Looking at the console, there did appear to be a localStorage related exception.

[Log] Exception in store-localstorage-update: (load.php, line 2)
[Warning] QuotaExceededError: The quota has been exceeded. (load.php, line 2)
setItem
flushWrites — load.php:1147
(anonymous function) — load.php:1175

I did a screen-capture of this event, since I was able to replicate it a couple of times.

https://drive.google.com/file/d/1DJSr2WdXz9GTC-pAfA031YhDWIvFK05O/view?usp=sharing

I guess it's also worth mentioning that (AFAIR) the localStorage quota on iOS is 5MB per site, so I can imagine it's pretty common for it to fill up.
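A rough way to check how close a site is to that limit is to add up the size of every key and value. The sketch below assumes 2 bytes per UTF-16 code unit, which is only an approximation since browsers differ in how they account for the quota. Both estimateStorageBytes and createFakeStorage are hypothetical helpers; the fake storage exists only so the sketch runs outside a browser.

```javascript
// Rough estimate of how many bytes a Storage-like object holds.
// Assumption: 2 bytes per UTF-16 code unit for both keys and values.
function estimateStorageBytes(storage) {
  let total = 0;
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    const value = storage.getItem(key) || '';
    total += (key.length + value.length) * 2;
  }
  return total;
}

// Hypothetical in-memory stand-in for window.localStorage (read-only subset).
function createFakeStorage(entries) {
  const map = new Map(Object.entries(entries));
  return {
    get length() {
      return map.size;
    },
    key(i) {
      const keys = Array.from(map.keys());
      return i < keys.length ? keys[i] : null;
    },
    getItem(k) {
      return map.has(k) ? map.get(k) : null;
    },
  };
}

const fake = createFakeStorage({
  expandedSections: '{"Barry Zito":{}}',
  test: 'aaaa',
});
console.log(estimateStorageBytes(fake) + ' bytes');
```

In a browser console the same function could be called as estimateStorageBytes(localStorage).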

Added to Tech/News/2020/21. Let me know if you need it to be more specific.

Ran

select referer, count(*) as c
from webrequest
where day = 14 and month = 5 and year = 2020 and hour = 0
  and uri_path LIKE '%beacon%'
  and uri_query LIKE "%WebClientError%"
  and user_agent LIKE "%CPU iPhone OS 13_4_1 like Mac OS X)%"
group by referer
sort by c asc;

Lots of errors occurring on these pages:

https://en.m.wikipedia.org/wiki/Barry_Zito
https://en.m.wikipedia.org/wiki/Michael_Jordan
https://en.m.wikipedia.org/wiki/Kandi_Burruss
https://en.m.wikipedia.org/wiki/Bow_Wow_(rapper)
https://en.m.wikipedia.org/wiki/The_Masked_Singer_(American_season_3)

@Jdrewniak the localStorage theory is interesting. I was surprised to see

Object.keys(localStorage.getItem('expandedSections')).length

returned 2848 items.

We could test the localStorage theory by clearing this out periodically:
e.g.

try { if ( Object.keys( JSON.parse( localStorage.getItem( 'expandedSections' ) || '{}' ) ).length > 10 ) { localStorage.removeItem( 'expandedSections' ); } } catch ( e ) {}

In the long term, this should probably use session storage.
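One detail worth keeping in mind when measuring this: getItem returns a string, and Object.keys applied directly to a string yields one key per character, so the value needs to be parsed before counting entries. The sketch below assumes the value stored under expandedSections is a JSON object keyed by page title, and uses 10 as an arbitrary threshold. The function pruneExpandedSections and the demo storage are hypothetical.

```javascript
// Sketch: drop the stored section state once it tracks too many pages.
// Assumptions: "expandedSections" holds a JSON object keyed by page title,
// and maxPages is an arbitrary threshold chosen for illustration.
function pruneExpandedSections(storage, maxPages = 10) {
  try {
    const raw = storage.getItem('expandedSections');
    const pages = JSON.parse(raw || '{}');
    if (Object.keys(pages).length > maxPages) {
      storage.removeItem('expandedSections');
      return true;
    }
  } catch (e) {
    // Storage unavailable or value is not valid JSON: leave it alone.
  }
  return false;
}

// Hypothetical in-memory stand-in for localStorage so the sketch runs anywhere.
const data = new Map();
const demoStorage = {
  getItem: (k) => (data.has(k) ? data.get(k) : null),
  setItem: (k, v) => {
    data.set(k, String(v));
  },
  removeItem: (k) => {
    data.delete(k);
  },
};

const many = {};
for (let i = 0; i < 11; i++) {
  many['Page ' + i] = { 0: true };
}
demoStorage.setItem('expandedSections', JSON.stringify(many));

const pruned = pruneExpandedSections(demoStorage);
console.log(pruned, demoStorage.getItem('expandedSections'));
```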

So I dove a little deeper into the localStorage theory.

On a random en.m.wikipedia.org page, I set a breakpoint on localStorage.setItem(key, data); to see what was being set, and..

Screen Shot 2020-05-18 at 11.30.24.png (1×1 px, 995 KB)

(that's 5.5 MB)

The mobile site seems to be tracked by the new error logger which includes stack traces.

Filtering by iOS

Thanks for sharing that. Most of these seem to be errors with translate wiki specific to mediawiki.org as well as the issues T253045 and T253047

@Jdrewniak thanks for investigating. I'm getting an error on every page view after maxing out localStorage so I'm sure that's the reason. Have opened up T253084

[Log] Exception in store-localstorage-update: (load.php, line 2)
[Warning] QuotaExceededError: The quota has been exceeded. (load.php, line 2) […]

This "error" is a debugging message from the resourceloader.exception topic. These are generally harmless. They are simply saying that LocalStorage is full which, if true, is expected.

The bug can be replicated by running the code

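try {
    var i = 0;
    // Keep overwriting one key with an ever larger string until the write fails.
    while ( true ) {
        i++;
        localStorage.setItem( 'test', new Array( i * 100000 ).join( 'a' ) );
    }
} catch ( error ) {
    // Typically a QuotaExceededError once the quota is reached.
    alert( 'test stopped at i: ' + i );
}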

to fill localStorage.

If, after the above loop, you run one more of those setItem() statements you'll find the same QuotaExceededError error as the one that RL logged.

We have numerous features that utilize localStorage. For example, MobileFrontend persists expanded sections, as mentioned earlier in this task. Numerous parts of core and other extensions use it as well. These features generally use the mw.storage abstraction, which silently ignores all LocalStorage-related errors (including QuotaExceededError). Such errors can happen for any number of reasons, including, for example, when a browser's "private browsing" mode is active (which means setItem, or even getItem, can cause an exception).
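The "silently ignore" pattern can be sketched as follows. This is an illustration only, not the actual mw.storage implementation: createSafeStorage and createQuotaBackend are hypothetical names, and the quota backend is a toy that counts characters rather than bytes.

```javascript
// Sketch of a wrapper that swallows storage errors instead of throwing.
// Hypothetical; it only illustrates the pattern described above.
function createSafeStorage(backend) {
  return {
    get(key) {
      try {
        return backend.getItem(key);
      } catch (e) {
        return false; // storage unavailable
      }
    },
    set(key, value) {
      try {
        backend.setItem(key, value);
        return true;
      } catch (e) {
        return false; // e.g. QuotaExceededError, or private browsing
      }
    },
  };
}

// Toy backend that rejects writes once a small character quota is exceeded.
function createQuotaBackend(maxChars) {
  const map = new Map();
  return {
    getItem: (k) => (map.has(k) ? map.get(k) : null),
    setItem(k, v) {
      const value = String(v);
      let used = 0;
      for (const [key, val] of map) {
        if (key !== k) {
          used += key.length + val.length;
        }
      }
      if (used + k.length + value.length > maxChars) {
        const err = new Error('The quota has been exceeded.');
        err.name = 'QuotaExceededError';
        throw err;
      }
      map.set(k, value);
    },
  };
}

const safe = createSafeStorage(createQuotaBackend(20));
const firstWrite = safe.set('a', 'short');
const secondWrite = safe.set('b', 'x'.repeat(100));
console.log(firstWrite, secondWrite, safe.get('b'));
```

The caller sees a failed write only as a false return value, which is why a full localStorage normally goes unnoticed by features built on such a wrapper.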

In ResourceLoader, I chose to log this exception to the console (rather than hide it entirely), via the resourceloader.exception topic. When investigating other problems, it can be useful to know that this situation was encountered by the user.

The error in question appears to be in the startup module.
[…] we should filter it out in the Minerva error counting by checking the error source.

The startup module is where the code lives that maintains the optimised/unfragmented module cache store. The condition reached here, code-named store-localstorage-update, means that ResourceLoader has cleared its portion of LocalStorage automatically and thus not added any module caching (falling back to HTTP browser cache).

If mw.storage had a debug mode, then interacting with any other feature that uses localStorage would produce the same warning in the console.

As this one is reported via resourceloader.exception (and not global.error) it means it was correctly caught and dealt with. In the new client error infra, we only track global.error. I have recommended in the past to ignore these other ones for Minerva. Worth reconsidering I suppose :)
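The difference between the two topics can be illustrated with a small stand-in tracker. The track and trackSubscribe functions below are simplified, hypothetical stand-ins that match topics exactly; they are not the MediaWiki implementations, and the event payloads are made up for the example.

```javascript
// Minimal topic-based tracker (hypothetical stand-in, exact topic match only).
const subscribers = new Map();

function trackSubscribe(topic, handler) {
  if (!subscribers.has(topic)) {
    subscribers.set(topic, []);
  }
  subscribers.get(topic).push(handler);
}

function track(topic, data) {
  for (const handler of subscribers.get(topic) || []) {
    handler(topic, data);
  }
}

let errorCount = 0;
function countError() {
  errorCount++;
}

// Count only uncaught errors. Handled ResourceLoader exceptions are
// deliberately not subscribed, so they no longer inflate the counter.
trackSubscribe('global.error', countError);

track('resourceloader.exception', { source: 'store-localstorage-update' });
track('global.error', { errorMessage: 'Uncaught TypeError' });
console.log(errorCount);
```

With this setup a full localStorage still produces a console message, but only genuinely uncaught errors are counted.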

we are seeing resource loader exceptions on every page view on iOS 13 and the number of errors from iOS in our error counting is climbing by the minute.

Do you mean that the above is consistently reproducible on any page view, or do you mean that localStorage is in fact full for all iOS users after merely loading a page on the mobile site? If the latter, then there is something to investigate further.

As this one is reported via resourceloader.exception (and not global.error) it means it was correctly caught and dealt with. In the new client error infra, we only track global.error. I have recommended in the past to ignore these other ones for Minerva. Worth reconsidering I suppose :)

👆I'd be interested to see if that brings down this rate as well.

from errorLogging.js#15

if ( config.get( 'wgMinervaCountErrors' ) ) {
	// track RL exceptions
	trackSubscribe( 'resourceloader.exception', countError );
	// setup the global error handler
	trackSubscribe( 'global.error', countError );
}

Jdlrobson claimed this task.

Do you mean that the above is consistently reproducible on any page view, or do you mean that localStorage is in fact full for all iOS users after merely loading a page on the mobile site? If the latter, then there is something to investigate further.

On the iOS device I'm testing, it was incredibly easy to fill up localStorage. It only has 5 MB and it doesn't clean itself up, meaning I'm getting a resourceloader.exception on every page view. This has no visible impact on the user.

👆I'd be interested to see if that brings down this rate as well.

That was my intention in T253084.

Am closing out this task as it seems we have some possible answers to the spike.