Wikipedia:Bots/Requests for approval/YiFeiBot 2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Zhuyifei1999
Time filed: 05:20, Tuesday, September 3, 2019 (UTC)
Function overview: Wikipedia:WikiProject Guild of Copy Editors/Requests archival
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python: pywikibot
Source code available: toolforge:yifeibot/enwiki_gce_archive.py.txt
Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Guild_of_Copy_Editors/Requests#Bot_integration? and Wikipedia_talk:WikiProject_Guild_of_Copy_Editors/Requests#Feedback_requested_on_proposed_bot
Edit period(s): Hourly scan; usually nothing matches the archival criteria, so most scans make no edits
Estimated number of pages affected: If something is archived, it is usually two pages, the request page, and the archive year subpage like Wikipedia:WikiProject Guild of Copy Editors/Requests/Archives/2019. In the case when requests are submitted from two different years, then two archive year subpages are affected. There is no upper bound in code to the number of archive year subpages affected; however, the affected pages must have already been created.
Namespace(s): Wikipedia
Exclusion compliant (Yes/No): Yes, this is handled ungracefully (unhandled exception)
Function details: The bot performs actions in two stages, the parsing and the archiving.
Parsing:
- The bot shall read Wikipedia:WikiProject_Guild_of_Copy_Editors/Requests and find all sections.
- For each section it records:
- The level 2 section header line number, the level 3 section header (as the target page of 'copy editing')
- The start and end line numbers
- Any capitalized, word-separated, 'copy edit purpose' acronyms
- The first line with both a link to user page / user talk page and a timestamp, as the requester and request time. This line must occur before the status templates
- Any line with {{done}} or {{partly done}} with a link to user page / user talk page and a timestamp, as the list of copy editors
- Any line with {{done}} or {{declined}} or {{withdrawn}} with a timestamp, as the status of the section and the copy edit completion date
Archiving:
- For each section that was parsed:
- The section must have been 'marked for archival' in its status
- The completion time must be at least a day ago
- Find the quarter year table in the relevant archive page
- Find the position within the table where the archive row should be added, sorted by the date of request, with ties broken by placing the newest addition last
- Insert the table row into position
- Remove the level 3 section from the source page; if the level 2 section then contains no other level 3 sections, remove the level 2 section header as well. It is assumed that the level 2 section contains no other content before its first level 3 section.
- Save any modified pages.
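Two of the archiving steps above, the one-day cooldown and the sorted insertion with ties placed last, can be sketched as follows. The function names and the `(requested, row_text)` tuple layout are hypothetical and chosen only for this illustration; the bot's real code works on the archive table's wikitext and may differ.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Assumed cooldown of one day, as stated in the archiving rules.
COOLDOWN = timedelta(days=1)
ARCHIVE_STATUSES = {"done", "declined", "withdrawn"}


def is_archivable(status, completed, now):
    """True if the section is marked for archival and has cooled down."""
    return (
        status in ARCHIVE_STATUSES
        and completed is not None
        and now - completed >= COOLDOWN
    )


def insert_row(rows, new_row):
    """Insert new_row into rows, which is sorted by request date.

    rows is a list of (requested, row_text) tuples. bisect_right places
    the new row after any existing rows with the same request date, so
    the latest addition ends up last among ties.
    """
    keys = [requested for requested, _ in rows]
    position = bisect_right(keys, new_row[0])
    rows.insert(position, new_row)
    return position
```

Using `bisect_right` rather than `bisect_left` is what gives the "last addition last" tie-breaking.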
Example edits by this bot: Special:Diff/913781416 Special:Diff/913781433. The one-day cooldown was disabled during this run.
I, Zhuyifei1999, provide the code and the running environment for this bot. Bobbychan193 shall be the point of contact for the 'functionality' of this bot. --Zhuyifei1999 (talk) 05:20, 3 September 2019 (UTC)[reply]
Discussion
- CC people who have participated in the discussion. @Baffle gab1978, Reidgreg, Dhtwiki, Masumrezarock100, and Miniapolis: --Zhuyifei1999 (talk) 05:20, 3 September 2019 (UTC)[reply]
- Great job! It handled the test cases well, and I like how the copy editor can place the purpose acronym(s) on the done line (in case the requester didn't use a valid acronym). One thing I forgot to mention in the earlier discussion is the possibility of a request by an IP editor. These are pretty rare (and may require some manual tweaking on the archive table so the requester column isn't too wide). If the bot has difficulty handling an IP rather than a registered username, or if the bot can't parse enough data from a section, it might be best for the bot to not attempt an archive of that section and a human editor can do it manually. – Reidgreg (talk) 14:40, 3 September 2019 (UTC)[reply]
- It should handle IP requests fine, since it matches user talk links, as long as they are signed. By too wide you mean those IPv6 addresses right? Would you give an example 'truncated name' and the criteria which the name should be truncated?
- From the 2019 archives in June, there are three requests (each declined) from 2405:201:8803:5F9D: D957:B329:DFE0:49D2. Rather than truncating, a space was added in the pipe to allow a line wrap. – Reidgreg (talk) 14:38, 4 September 2019 (UTC)[reply]
- Sorry for the late response (busy with IRL stuffs), both are done. The replacement will only happen for a full IPv6 username with all 8 nonempty 'hex values' next to colons. Diffs: Special:Diff/914260446 Special:Diff/914260476 --Zhuyifei1999 (talk) 05:58, 6 September 2019 (UTC)[reply]
- The current algorithm for determining the requester is to find the first line with both a link and a timestamp. This could overlap with the line from the copy editor. I'll fix that tomorrow so that the requester line must come before a status template line --Zhuyifei1999 (talk) 05:02, 4 September 2019 (UTC)[reply]
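The IPv6 handling discussed in this thread (a space added to the piped link text of a full eight-group address so the table cell can wrap) could look roughly like the sketch below. The regular expression, the function names, and the choice of a user talk link are assumptions made for illustration, not the bot's actual code.

```python
import re

# Matches only a fully written-out IPv6 address: eight nonempty hex groups.
# Compressed forms such as "2001:db8::1" deliberately do not match.
FULL_IPV6_RE = re.compile(r"^(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}$")


def display_name(username):
    """Return link text for a requester, adding a wrap point for full IPv6."""
    if FULL_IPV6_RE.match(username):
        groups = username.split(":")
        # Put a space after the fourth group so the browser can wrap there.
        return ":".join(groups[:4]) + ": " + ":".join(groups[4:])
    return username


def requester_link(username):
    """Build a piped link; the link target itself is left unchanged."""
    return "[[User talk:%s|%s]]" % (username, display_name(username))
```

Registered usernames and shortened IPv6 addresses pass through unchanged, which matches the stated rule that the replacement only happens for a full address.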
- Looks all good! Asking for attention of BAG. {{BAG assistance needed}} Masum Reza📞 15:09, 5 September 2019 (UTC)[reply]
- filed two days ago, please be patient. Primefac (talk) 01:03, 6 September 2019 (UTC)[reply]
- Sorry, I don't really know how these things work. Masum Reza📞 12:50, 14 September 2019 (UTC)[reply]
- Approved for trial (100 edits or 14 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Primefac (talk) 12:45, 14 September 2019 (UTC)[reply]
- Sorry, I did not see this message due to IRL work. I will be starting the run tonight --Zhuyifei1999 (talk) 21:47, 18 September 2019 (UTC)[reply]
- Running every hour at 52 minute mark. I don't believe it will hit 100 edits with this timeframe of 14 days. @Primefac: Shall I run till October 3 or September 28 for test run? --Zhuyifei1999 (talk) 03:55, 19 September 2019 (UTC)[reply]
- It's one or the other, whichever happens first. The day length is to see how often it happens, the edit limit is to keep the numbers from getting too big. Primefac (talk) 13:53, 19 September 2019 (UTC)[reply]
- @Twofingered Typist and Baffle gab1978: While the bot is being test running, would you mind not performing manual archivals? --Zhuyifei1999 (talk) 06:21, 22 September 2019 (UTC)[reply]
- Noted; I've changed a partly done template to Not done to avoid archiving. Thanks for your hard work, Zhuyifei1999. Cheers, Baffle☿gab 19:31, 22 September 2019 (UTC)[reply]
- @Baffle gab1978: I don't think partly done actually triggers archival. In theory, it's simply something that the bot takes into account when archiving (to list multiple contributors). So, I would change it back. Bobbychan193 (talk) 19:55, 22 September 2019 (UTC)[reply]
- I confirm ^ --Zhuyifei1999 (talk) 01:05, 23 September 2019 (UTC)[reply]
Trial complete. 14 days since {{BotTrial}}. I will check the edits soon. --Zhuyifei1999 (talk) 00:49, 29 September 2019 (UTC)[reply]
- 20 edits 10 archivals. I could not see anything that is obviously wrong. Please correct me if I'm wrong. --Zhuyifei1999 (talk) 03:19, 29 September 2019 (UTC)[reply]
- The trial is done, though I'm not sure if it was supposed to go until 9/28 or 10/3. Bobbychan193 (talk) 20:28, 30 September 2019 (UTC)[reply]
- It looks good to me. I've also checked the edits, and the bot neatly handled one case I hadn't anticipated. It summarized the essential data onto the archive table, sorted by request date, with no human cleanup required. Considering how sloppy the input data is, it's impressive to see the bot produce nice clean output. – Reidgreg (talk) 19:47, 2 October 2019 (UTC)[reply]
- I'm happy with the trial results; the bot has performed its tasks well without problems. Could it leave a more helpful edit summary on the Requests page and the archive page? Something like; "Bot: Archiving requests for The Moon, The Earth and The Sun" would be great. That would help identify edits in page history; for example, if the requester fails to use a recognized acronym (GAN), or if the bot or a human does something unexpected. Cheers, Baffle☿gab 01:34, 3 October 2019 (UTC)[reply]
- Ok, I'll implement this either tomorrow or over this weekend. I can also add the purpose to the edit summary, something like 'Bot: Archiving requests for The Moon (GAN), The Earth, and The Sun (declined)' (where The Earth was done without 'purpose' found). Would that be better? --Zhuyifei1999 (talk) 03:08, 3 October 2019 (UTC)[reply]
- That would be useful; thank you. I'm thinking of ways we search and identify particular edits from the page history. People don't always edit the ways we'd like and we don't always notice when things go wrong, so anything that helps with problem-solving is going to be useful. Thank you. :) Cheers, Baffle☿gab 06:09, 3 October 2019 (UTC)[reply]
- Ok I did that. Need a test run. --Zhuyifei1999 (talk) 02:16, 7 October 2019 (UTC)[reply]
- {{BAG assistance needed}} I think it's been a week. Bobbychan193 (talk) 05:33, 8 October 2019 (UTC)[reply]
- @Zhuyifei1999 and Bobbychan193: Just to be clear: you are requesting an extended trial, correct? --TheSandDoctor Talk 01:24, 12 October 2019 (UTC)[reply]
- @TheSandDoctor: Yes. --Zhuyifei1999 (talk) 04:08, 12 October 2019 (UTC)[reply]
- Approved for extended trial (100 edits or 14 days, whichever comes first). Please provide a link to the relevant contributions and/or diffs when the trial is complete. @Zhuyifei1999: if you would like different terms for the extended trial, please let me know. --TheSandDoctor Talk 06:15, 12 October 2019 (UTC)[reply]
- @TheSandDoctor: Thanks for granting the extended trial. I've pinged all of the GOCE coordinators in a discussion at the GOCE talk page. (Feel free to visit or chime in.) Bobbychan193 (talk) 07:09, 12 October 2019 (UTC)[reply]
- Thanks. I re-added it to cron --Zhuyifei1999 (talk) 07:23, 12 October 2019 (UTC)[reply]
Trial complete. 68 edits I'll list the 'interesting edits' soon ;) --Zhuyifei1999 (talk) 22:50, 27 October 2019 (UTC)[reply]
- Ok so after the two archivals Bobbychan suggested off-wiki that the edit summaries should adjust for plural, so I did that.
- Special:Diff/921333083 Special:Diff/921333106 and Special:diff/923192504 Special:Diff/923192523 are double archivals, with the last one demonstrating the automatic removal of the month h2 section.
- Special:Diff/921154641 Special:Diff/921154650 archives a title with a link with italics in it. The title is copied to the summary as-is, so the italic wikitext notation was not parsed. I'm not sure if wikitext syntax should be removed; it's probably more work than it is worth.
- There were no declines, withdrawals, or partly-dones during this run as far as I know.
- --Zhuyifei1999 (talk) 00:18, 28 October 2019 (UTC)[reply]
- Thanks for another successful trial run; I'm pleased with the informative edit summaries, which are helpful to view in my watchlist. I'm also pleased with the bot removal of the empty month's section. Hopefully it won't be long before it's an indefinite arrangement, assuming the other coordinators are happy. Cheers, Baffle☿gab 07:01, 28 October 2019 (UTC)[reply]
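The edit summary format discussed above, including the plural adjustment, might be built along these lines. This is a hedged sketch: the function names and the `(title, note)` input shape are hypothetical, and the singular wording "request" and the two-item "A and B" form are assumptions about what "adjust for plural" meant in practice.

```python
def join_titles(items):
    """Join labels as 'A', 'A and B', or 'A, B, and C'."""
    if len(items) <= 1:
        return "".join(items)
    if len(items) == 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def edit_summary(archived):
    """Build a summary from (title, note) pairs.

    note is a purpose acronym, a status such as 'declined', or None
    when nothing extra should be shown for that title.
    """
    labels = [
        "%s (%s)" % (title, note) if note else title
        for title, note in archived
    ]
    noun = "request" if len(labels) == 1 else "requests"
    return "Bot: Archiving %s for %s" % (noun, join_titles(labels))
```

For example, `edit_summary([("The Moon", "GAN"), ("The Earth", None), ("The Sun", "declined")])` yields the three-title summary quoted in the discussion.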
Approved. Trial looks good to me. Under normal circumstances, I would prefer to leave the close for someone else. However, given the backlog, lack of recent BAG activity (myself included), and the fact that this task is uncontroversial and based on how well the trial went, I am inclined to make an exception for this. As per usual, if amendments to - or clarifications regarding - this approval are needed, please start a discussion on the talk page and ping. --TheSandDoctor Talk 04:43, 12 November 2019 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.