The Format constraint had to be removed (see T101467), but it should be checked somehow nevertheless.
Description
Details
Subject | Repo | Branch | Lines +/- |
---|---|---|---|
Implement Format constraint with SPARQL | mediawiki/extensions/WikibaseQualityConstraints | master | +214 -47 |
Status | Subtype | Assigned | Task |
---|---|---|---|
Resolved | | Lucas_Werkmeister_WMDE | T102752 [RFC] Workaround for checking the format constraint |
Resolved | | Lucas_Werkmeister_WMDE | T169966 Add setting for turning off the Format constraint |
Event Timeline
@LucasWerkmeister interesting question. There are two things we need to check, I guess:
- Write the SPARQL query and see if it performs properly.
- Check that it doesn't expose us to the same issues as before.
In general, Blazegraph has timeouts and memory limits on queries, and does not use the PCRE engine (it uses java.util.regex, AFAIK). In theory there could still be some problem there, but since it's just a generic query, that problem would be present regardless of constraints, so it should not be a concern.
So, I'd write the queries and test if they perform well, and if so, I think it's ok to add query constraints for this one.
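As a rough illustration of what such a query could look like, here is a minimal sketch that builds a SPARQL query evaluating a single `REGEX` call for one value and one pattern. The function names (`sparql_string_literal`, `build_format_check_query`) are hypothetical and not taken from the extension; the sketch also assumes the regex engine accepts the non-capturing group `(?:...)`, which java.util.regex does.

```python
def sparql_string_literal(text: str) -> str:
    """Quote text as a SPARQL double-quoted string literal."""
    escaped = (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def build_format_check_query(value: str, pattern: str) -> str:
    """Build a query whose only result is whether value fully matches pattern.

    Hypothetical helper: the query touches no graph data, it only asks the
    query service to evaluate the regular expression.
    """
    # Anchor the pattern so the whole value must match, not just a substring.
    anchored = f"^(?:{pattern})$"
    return (
        "SELECT (REGEX("
        f"{sparql_string_literal(value)}, {sparql_string_literal(anchored)}"
        ") AS ?matches) {}"
    )


if __name__ == "__main__":
    print(build_format_check_query("030 1234", r"[\d\- ]+"))
```

Running the regex on the query service rather than locally means the service's own query timeout and memory limits bound how expensive a pathological pattern can get, which is the property discussed above.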
The template already includes a link to WDQS. It works for most constraints.
I think I had to add some Lua to make it work.
Change 363605 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Implement Format constraint with SPARQL
Change 363605 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Implement Format constraint with SPARQL
Done.
Admin’s note: if this turns out to cause problems, it can be disabled with
$wgWBQualityConstraintsCheckFormatConstraint = false;
@Lucas_Werkmeister_WMDE:
Can you insert a screenshot of the gadget in the issue on top?
Note: In the current version, if the check is not satisfied, the user gets shown a regex.
Basic Problems:
- We can't expect users to know regex (of our 5 example users/personas, only 1 or 2 know what it is).
- Even for experienced people who know regex, the expressions are hard to read.
So, usability heuristics to apply here:
- "Match between system and the real world" (we should use concepts familiar to the user)
- "Consistency and standards": our other constraint infos are fairly easy to understand, this one is not
- "Help users recognize, diagnose, and recover from errors: Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution."
For the latter, we don't satisfy any of the user needs. We should:
- Say that it did not match [whateveritchecksfor], so like "The check for a URL failed."
- Say what the problem is: "It seems that your URL does not have an "https://" at the beginning."
- Suggest a fix, like "Try to add http:// or https:// at the beginning, if the URLs are otherwise correct."
So the error message would be: "Your input was checked and was not recognized as a URL."
Maybe the qualifier "syntax clarification" could be displayed.
Sample from local dialing code (P473):
"string combining digits, spaces, - (All else excluded, such as: ,/;()+ )"
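To make the idea concrete, here is a minimal sketch of a message builder that prefers a property's "syntax clarification" text and only falls back to showing the regex when no clarification exists. The function name `format_violation_message`, the message wording, and the example pattern are all hypothetical; they are not the extension's actual messages or the real format pattern of P473.

```python
from typing import Optional


def format_violation_message(
    value: str,
    pattern: str,
    syntax_clarification: Optional[str] = None,
) -> str:
    """Describe a failed format check in plain language where possible."""
    if syntax_clarification:
        # Preferred: explain the expected format in words, without any regex.
        return (
            f'"{value}" does not match the expected format: '
            f"{syntax_clarification}"
        )
    # Fallback: no plain-language description is available, so show the regex.
    return (
        f'"{value}" does not match the expected format '
        f"(regular expression: {pattern})"
    )


if __name__ == "__main__":
    # Hypothetical pattern, used only for illustration.
    print(
        format_violation_message(
            "030/1234",
            r"[\d\- ]+",
            "string combining digits, spaces, - (All else excluded, such as: ,/;()+ )",
        )
    )
```

With a clarification present, the user sees only the plain-language description; the regex appears only as a last resort.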
Let’s move the discussion over to T170374: Format constraint UX, because this task has already been repurposed enough as it is.