
Wikipedia main content loss of sources because of reverts, try to preserve them
Open, Needs Triage, Public

Description

For years, the difficulty of editing articles has been a known, recurring problem, because some patrollers revert imperfect new edits without seeing their value.
Part of this problem is the loss of sources.

All evaluations below are statistical indications, never a judgement; only humans can decide.

Sources are the foundation of Wikipedia, so we should look for a way to estimate and preserve them.
In each domain (science, literature, politics, sport...) we can/could estimate:

  • original repositories (reviews, papers, authors, ...)
  • comment repositories (reviews, papers, authors, ...)
  • editing users: how many sources per edit? What is their mean quality?
  • patrollers: how many edits with weak or missing sources do they keep? What is the mean quality of those sources?
  • patrollers: how many good sources do they remove? What is the mean quality of those sources?
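The per-edit counts above could start from something like the following minimal sketch. It assumes that a source is any `<ref>...</ref>` footnote in the wikitext, which ignores named-reference reuse and citation templates outside footnotes. The function names are hypothetical and not part of any existing tool, and the result is only a statistical indication.

```python
import re

# Assumption: a "source" is the body of a <ref>...</ref> footnote.
# Self-closing reuses such as <ref name="x" /> are deliberately not matched.
REF_PATTERN = re.compile(r"<ref(?:\s[^>/]*)?>(.*?)</ref>", re.DOTALL | re.IGNORECASE)


def extract_refs(wikitext):
    """Return the set of footnote bodies found in a piece of wikitext."""
    return {body.strip() for body in REF_PATTERN.findall(wikitext)}


def source_delta(text_before, text_after):
    """Compare two revisions and report which sources were added or lost."""
    before = extract_refs(text_before)
    after = extract_refs(text_after)
    return {"added": after - before, "lost": before - after}


# Hypothetical example: a revert removes an edit that carried one source.
edited = (
    "Water boils at 100 C.<ref>Smith 2010, p. 4</ref> "
    'It freezes at 0 C.<ref name="doe">Doe 2015</ref>'
)
reverted = "Water boils at 100 C.<ref>Smith 2010, p. 4</ref> It freezes at 0 C."
delta = source_delta(edited, reverted)
```

Summing the size of `delta["lost"]` over a patroller's reverts, or of `delta["added"]` over a user's edits, would give the rough per-person numbers listed above.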

We can/could statistically estimate new citations from:

  • the estimations above
  • classic official estimations (public citation counts, Nobel Prizes...)
  • the previous number of uses in Wikipedia of each repository, author...
  • previous edits by a user for each repository, author...
  • previous uses in Wikipedia, by number of uses
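As an illustration of the "previous number of uses" idea, here is a minimal sketch that counts how often each repository already appears among cited URLs. It assumes that a repository can be approximated by the host name of the URL, which is a simplification. The function names are hypothetical, and a high count says nothing by itself about quality.

```python
from collections import Counter
from urllib.parse import urlparse


def repository_of(url):
    """Approximate the repository of a citation by its host name (assumption)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def usage_counts(cited_urls):
    """Count previous uses of each repository in a list of cited URLs."""
    return Counter(repository_of(url) for url in cited_urls)


def usage_share(repository, counts):
    """Share of all previous citations that point to this repository."""
    total = sum(counts.values())
    return counts[repository] / total if total else 0.0


# Hypothetical sample of URLs already cited in articles.
previous = [
    "https://www.nature.com/articles/a",
    "https://nature.com/articles/b",
    "http://example.org/page",
]
counts = usage_counts(previous)
```

The same counting could be done per author or per editing user instead of per host name, given the corresponding data.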

We can/could display the statistical quality of sources:

  • to help patrollers assess the sources and the editing user
  • in the watchlist: short display (one or a few numbers about the source and the user)
  • in the history diff of an edit: detailed display

We can/could automate the checking of links to sources and send alerts to repair them:

  • alert the user who added the source
  • alert some patrollers who have worked on the page
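The link check and the choice of who to alert could look like the minimal sketch below. It assumes that we already know which user added each URL and which patrollers have worked on the page; how to obtain that data, and how to deliver the alert, is left open. All names here are hypothetical. A link is treated as broken when the request fails or returns an HTTP status of 400 or more, which is only an approximation since some servers reject HEAD requests.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if it fails outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        return None


def build_alerts(source_links, page_patrollers, fetch_status=http_status):
    """List broken source links together with the users to notify.

    source_links maps each URL to the user who added it (assumed known).
    fetch_status can be replaced, for example to test without network access.
    """
    alerts = []
    for url, added_by in source_links.items():
        status = fetch_status(url)
        if status is None or status >= 400:
            notify = [added_by] + [p for p in page_patrollers if p != added_by]
            alerts.append({"url": url, "status": status, "notify": notify})
    return alerts


# Hypothetical data, with a stand-in status lookup instead of real requests.
fake_statuses = {"https://example.org/ok": 200, "https://example.org/gone": 404}
alerts = build_alerts(
    {"https://example.org/ok": "UserA", "https://example.org/gone": "UserB"},
    ["PatrollerX", "UserB"],
    fetch_status=fake_statuses.get,
)
```

Passing the status lookup as a parameter keeps the decision logic separate from the network call, so the alert rules can be checked without contacting any server.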

Event Timeline

Hi @Rical. Please associate at least one project with this task to allow others to find this task when searching in the corresponding project(s). Thanks. (For general information, see How to report a bug.)
Also, please provide a description of the problem that you would like to see solved and that this task is about. Currently, "Estimate and preserve our sources" is vague and it is impossible to define when this task is "solved". Also, who is "our" in "our sources"?

Rical renamed this task from Estimate and preserve our sources to Wikipedia main content losts sources because too reverts, try to preserve them. Jul 23 2016, 5:33 PM
Rical added a project: Community-Tech.
Rical updated the task description.

@Rical: What do you expect from the Community-Tech team when it comes to this task? I'm asking as you added them to this task.
I'm thinking that this extremely broad topic should be first discussed on Analytics and Research mailing lists, in order to agree on approaches, and to break down ideas into actionable tasks.

Thanks for helping me, I'm a newbie when it comes to projects! You have my confidence to act on my behalf if I make a mistake!

Untagging Analytics - we're an infrastructure team and this looks like a research topic. The Snuggle project might be of interest.

I'm a newbie when it comes to projects! When I posted this task, it was "mine".
Now it is explicit enough to become "ours".
I can't find the "Snuggle" project. If anybody knows a suitable project (or projects), please feel free to change the tags yourself.

@Rical: See https://en.wikipedia.org/wiki/Wikipedia:Snuggle

Assuming that you'd like to work on this yourself, I'm assigning this task to you and assuming that you plan to clarify and decrease the scope of this task.

This comment was removed by Rical.
Rical reopened this task as Open. Edited Aug 12 2016, 9:18 AM

I deleted my previous comment because it was too lightweight, and I apologize, as your comment was a good one.

Thanks for showing me the very useful Snuggle tool. I didn't imagine that such a tool could exist. It reveals how complex it is to evaluate each of our actions.
Perhaps a Snuggle-like tool based on the last 30 edits of each user could be useful. But the number of users is much higher, so some selection would be welcome.

About assigning this task to me: life has taught me that I am a technician, not a manager. But I can collect and organize ideas. I mainly work on T135845 and I want to finish it first.

I agree this task seems too large, but we must first think through many aspects and only later select what to do. A search for "source quality" turns up some tasks on related subjects: T108292, T28426.

A new technology will help to predict contribution quality and user intent.
See T137824 and Edit Review Improvements/New filters for edit review.
It could probably help, efficiently, to evaluate the sources collected at edit time from all contributors in Wikipedia.

For an easier implementation, ORES provides machine learning as a service.

Are sources already estimated in ORES?

The project New filters for edit review will support this task as of June 2018.

As of June 2018, these filters only cover ordinary modifications.
Support for sources will be available later.

Aklapper renamed this task from Wikipedia main content losts sources because too reverts, try to preserve them to Wikipedia main content loss of sources because of reverts, try to preserve them. Mar 6 2020, 3:47 PM
Aklapper removed Rical as the assignee of this task.
Aklapper updated the task description.

@Rical: I have reset the assignee of this task because there has not been progress lately (please correct me if I am wrong!).
Resetting the assignee avoids the impression that somebody is already working on this task. Thanks for your understanding.