User talk:Bearcat/Archive 95
This is an archive of past discussions with User:Bearcat. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Please stop removing Draft cats
I know you are just trying to help, but please stop removing Draft categories as you did in this edit. This merely results in unnecessary extra work for one or more editors: one to tag the Draft as uncategorized, one to remove the uncategorized tag and add the categories back in again, and one to explain the situation. Instead, just leave the categories in the Draft where they are, and surround them with {{Draft categories| ... }}. Thanks, Mathglot (talk) 16:21, 18 May 2024 (UTC)
As I keep telling people, for a variety of reasons it takes a significantly longer amount of time to disable categories than it does to merely remove them:
- sometimes the page is in bad categories that it wouldn't belong in even if it were in mainspace, which makes more cleanup work down the line if they're just left there,
- sometimes the page is in redlinked categories that don't even exist at all, which makes more cleanup work down the line if they're just left there,
- novice draft creators who don't know standard Wikipedia process don't always place the categories where they're normally expected to be, or sometimes place more than one separate cluster of categories in more than one different place in the draft, meaning that they end up having to be searched for.
So, for all of those reasons, removing the categories from a page takes only a few seconds per page, while disabling them can take up to two or three minutes per page. Sure, that isn't a very big burden if you're dealing with just one or two categorized drafts at a time — but if I'm dealing with the comprehensive weekly system report of categorized drafts, there are hundreds of pages to deal with all at once. Which means it's already a two or three hour job even just to get through it the quicker way, and would become a 12 to 24 hour job if I did it the longer way.
So, sure, to the outside observer it doesn't seem like that big a burden to invest the extra time into disabling the categories instead of removing them, because the outside observer is only seeing one or two pages that they were interested in — but to the editor who actually has to power through a batch of hundreds of categorized drafts in one shot, the extra time accumulates into a much, much bigger burden than they can actually be expected to volunteer to take on. So, sure, if I come across one or two pages in isolation, disabling categories instead of removing them isn't that much of an imposition on my time — but if I'm having to get through hundreds of pages at once, it adds up to a tremendous imposition on my time. Bearcat (talk) 16:51, 18 May 2024 (UTC)
- Understood; I get it; it definitely takes longer that way. But there’s a problem, because just removing them creates triple the work for other editors: one editor adds a no-categories template, then another editor removes the template again, puts the categories back in and adds the draft categories template, followed, in this case at least, by an explanation that they have been removed and draft-protected. Also, if it takes less time to just remove them, which is definitely true, it takes a whole lot more to replace them again and protect them with the draft categories template, which is what I’ve been doing.
- I apologize for the couple dozen notifications you got today about the removals, and it may be best to go into your settings and disable notifications for me so you don’t get these notifications all the time when I restore the draft-protected categories.
- But honestly, removing the categories isn’t really an improvement in many cases, and if it’s too burdensome to go through them and draft protect them, then maybe the best thing is not to remove them at all. Given how repetitive and pattern-based the problem is, I think there may be a better solution for everyone involving a user script. If we go to the script requests page, we can request a script to be written that will just do this automatically. Then neither of us, nor anyone else, will ever have to bother with this issue again. As things stand now, I just see this as a giant waste of your time, and mine, on something that ought to be done by bot and could be automated fairly easily, thus freeing us up to do something more productive. Thanks, Mathglot (talk) 20:00, 18 May 2024 (UTC)
- Oh, trust me, I've asked for a bot to look after that before. Not all, certainly, but a lot of drafts have categories on them because they were created in mainspace and then got sandboxed by another editor for lacking sourcing or whatever, without that other editor taking the extra step of disabling the categories in the process — so since there's already a bot that detects such moves and tags them as {{drafts moved from mainspace}} anyway, I've asked if that bot could automatically disable any active categories that are on the draft at the same time, so that we could invest a whole lot less time into draft category cleanup, but that bot's maintainer refused. So I tried asking if another bot could be set loose on the task, only to have no action taken on that request at all.
- Of course, another alternative would be if the report were generated daily, so that instead of updating once a week with a couple hundred categorized drafts all in one shot, it was immediately catching 20 to 30 categorized drafts every 24 hours, so that they could be dealt with promptly in a shorter amount of time per day — but it doesn't, so we're left with having to work within the constraints of the report as it exists.
- But also, AFC approval doesn't happen on entire batches of hundreds of drafts all at once — each AFC reviewer approves or rejects one page at a time, and lots of drafts don't get approved at all — so any "extra" time that has to be put into categorizing the approved draft amounts to a couple of minutes at most, while the extra time that has to be put into polluted-category cleanup can add up to half or more of an entire day. So as long as pollution cleanup remains a task that human editors have to spend time doing, the editors who do it have to be able to get it done in the shortest amount of time possible.
- I mean, I'd dearly love to not have to spend nearly as much time on it anymore, but it's not a task that can just be ignored outright — it's a thing I do because it has to get done, not a thing I do because I think it's fun or enjoyable. So until we do have the tools in place to automate it, I have to be able to minimize the amount of time that actually has to be spent on it, because getting drafts out of categories they shouldn't be in takes hours and hours longer to deal with fixing than putting pages into categories in the first place does. Bearcat (talk) 20:36, 18 May 2024 (UTC)
- Interesting to hear the history behind it. I’ll see if I can get some movement going on a bot, whether the same one with a new task, or a brand new one. Thanks for the detailed explanation, that gives me a better window into what is going on. Mathglot (talk) 12:26, 19 May 2024 (UTC)
- I just undid some more of these. Can we please hold off on this kind of edit until we get a bot going or something? It’s not helpful. Thanks, Mathglot (talk) 21:24, 22 May 2024 (UTC)
- No, we can't, because the pages absolutely cannot stay in categories for any length of "meantime". It's not a "leave it for weeks and weeks of no action pending an eventual alternative solution" job, it's a "drafts must be immediately taken out of categories the moment they're found" job. And again, there is no rule that categories on draft pages must always just be disabled and can never be removed outright, so I'm not doing anything improper or inappropriate. Bearcat (talk) 21:35, 22 May 2024 (UTC)
- You just keep doing them and I just keep undoing them which seems like a giant waste of both of our time, but I can’t keep up with you because you are too fast for me, and I’m thumb-typing on a mobile device which makes it doubly hard.
- Are you using AWB or some tool-assist to do this? I’m not an AWB user, but as I understand, it is very good at recognizing patterns and making alterations based on them, and this seems tailor-made for that sort of thing, so instead of deleting the categories, you could just have AWB wrap them in {{Draft categories}} which will be the best of both worlds. I’m sure we could get someone with AWB experience to help with setting this up, if you don’t know how to do it. The bot alternative would be even better as it would solve the problem at bot-speed, and then none of these would ever turn up in whatever feed you are using again, freeing you up to do more productive things. Mathglot (talk) 15:52, 23 May 2024 (UTC)
- Since there is a rule that drafts can't be in categories, but there isn't a rule that any categories that are found on drafts have to just be disabled rather than removed, that means that getting drafts out of categories is a thing that has to be done while going around restoring disabled categories to drafts that had categories removed from them is not a thing that has to be done. So the problem here isn't that I'm doing anything incorrect or improper, it's that you're making unnecessary busywork for yourself on a task that isn't necessary at all.
- AWB isn't a viable alternative, either. There's no way to make an AWB-compatible text file out of the polluted categories report, for one thing — it lists the affected categories, not the individual draft pages in them, so the only way to find the pages is to check each category one by one to search for pages whose titles begin with "Draft:", and there's no way to generate a text file of the individual pages for an AWB batch. (The polluted categories report additionally doesn't know how to distinguish "categories for articles that also have drafts in them" from "categories for drafts that also have articles in them" — it just checks and lists every category that mixes both drafts and mainspace articles regardless of which type of content is supposed to be in that category, while it's a human editor who has to determine that the problem in some case is a misfiled article rather than a misfiled draft — so even if it were possible to generate an AWBable batch report out of it, that would still catch some pages that don't actually need any action and miss some pages that do.)
- AWB additionally doesn't offer me any way to verify whether any given category is a redlink that doesn't exist, or a duplicate-categorized parent of another category that the page is already also in, or a category that the page wouldn't even belong in at all anyway. All of those are types of categories that have to come completely off the page even if you're just disabling other categories, because leaving them there just creates other cleanup work in the future if the page ever gets mainspaced without those redlinked or duplicate categories having been removed. So those are things I have to catch and fix at the same time as any other DRAFTNOCAT cleanup, not things that can just be left on the page for a future round of other cleanup. But AWB offers me no way to determine whether any given category exists or not, no way to determine whether any given category is appropriate or not, no way to determine whether any given category is a parent of another category that's also on the same page already, and so on and so forth. So even if there were a way to generate an AWBable text file out of the polluted categories report, AWB still wouldn't be the way to go with it, because AWB doesn't feature any of the tools I would need to fix any of the other problems that I also have to check every page for at the same time.
- But ultimately, the only rule is that draft pages can't be in categories — there is no rule stating that categories on draft pages must always just be disabled rather than removed, and some types of categories (redlinks, duplicates, ones that the page would never belong in regardless) have to be removed outright even if other categories are merely being disabled. Getting drafts out of categories is an essential task, while going around readding and disabling categories that other people have removed from drafts is not an essential task. So if you're finding it a burden, consider that it's not essential in the first place, and thus is a burden you don't actually have to take on — disabling categories on drafts is not a requirement, it's merely one of two options alongside removal as another option, and there just isn't any rule that disabling is required and removal is forbidden. So going around readding disabled versions of categories that other editors have removed from drafts just isn't a job that needs to be done in the first place, because "disable and do not remove" is simply not a rule. Bearcat (talk) 16:34, 23 May 2024 (UTC)
I am going to make a script request for someone to build a script to add the draft categories protection. Can you please describe the procedure you use now to delete categories? I want to make sure that the script will be at least as easy and fast or faster than what you are doing now. Thanks, Mathglot (talk) 08:47, 26 May 2024 (UTC)
- If this were a script that I still had to visit each categorized draft to manually invoke, then there's no way it could possibly save me any significant amount of time. As I explained above, the polluted categories report just produces a list of the implicated categories, and I have to visit each category to find the draft, which is precisely why it's already a two or three hour job even doing it the quickest possible way. There's no way that a category-disabling script that I had to manually invoke could reduce that amount of upfront time enough to make just disabling the categories a more time-efficient alternative than clicking a few minus signs in HotCat.
- I have to click on each individual category listed in the report, and do one of several different things to find the draft: the category might have only a few pages in it, in which case the draft is easy to find just by eyeballing the list. It might have enough pages in it that I have to pull up my browser's search-in-page function to look for "Draft:". It might have more than 200 pages in it, so that I have to use the category's next button to search more than one page's worth of category entries. It might have so many pages in it that I have to do a full-on incategory search with Wikipedia's search function. So reducing the amount of time involved in the job isn't just a matter of making it easier to disable the categories after the draft has been found — it also requires finding ways to reduce the number of categorized drafts that even have to be found in the first place.
- The only thing that would actually reduce the amount of time I have to spend working through the polluted categories report in any sense would be a bot going through Category:All content moved from mainspace to draftspace and disabling any categories on its own before they ever get picked up by the polluted categories report in the first place. This obviously wouldn't catch every categorized draft across the board, because that isn't the only way that drafts become categorized, but it would catch a lot of them, and thus save time because the report itself would be reduced in size.
- But if a solution would leave the report itself unchanged in size, and required me to continue searching each category to find the individual drafts and deal with their categories manually, then there's no possible way for any such script to be a quicker alternative than HotCat if the script wasn't doing anything about the upfront "finding the drafts in the first place" time as well. Bearcat (talk) 12:52, 26 May 2024 (UTC)
- I think I understood the essential points there, so let me run with it and see what happens. I’ll ping you from wherever, so you can follow along. Might not be right away as I’m mobile. Hopefully something will come out of this to make your life easier, at least in this point. Mathglot (talk) 14:05, 26 May 2024 (UTC)
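For reference, the "disable instead of remove" step requested in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script that was requested or any existing tool: the function name `wrap_draft_categories` and the regular expression are hypothetical, and the sketch simply moves every active category link into one {{Draft categories|...}} block at the end of the page. It does not check for redlinked, duplicate, or inappropriate categories, which, as noted above, still need a human decision.

```python
import re

# Matches an active category link such as [[Category:Foo]] or [[Category:Foo|Sort key]].
# A link with a leading colon ([[:Category:Foo]]) only links to the category and does
# not categorize the page, so it is deliberately not matched.
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:[^\[\]\n]+\]\]", re.IGNORECASE)
WRAPPER_START = "{{Draft categories|"


def wrap_draft_categories(wikitext: str) -> str:
    """Move all active category links into one {{Draft categories|...}} block.

    Hypothetical helper: pages that already contain the wrapper are returned
    unchanged, on the assumption that they have been handled already.
    """
    if WRAPPER_START.lower() in wikitext.lower():
        return wikitext
    links = CATEGORY_LINK.findall(wikitext)
    if not links:
        return wikitext
    # Strip the links from wherever they were scattered in the draft.
    body = CATEGORY_LINK.sub("", wikitext).rstrip()
    block = WRAPPER_START + "\n" + "\n".join(links) + "\n}}"
    return body + "\n\n" + block + "\n"


if __name__ == "__main__":
    sample = "Intro text.\n\n[[Category:Foo]]\nMore text.\n[[category:Bar|Baz]]\n"
    print(wrap_draft_categories(sample))
```

Collecting the links with a single pattern also covers the case mentioned above where a novice creator has placed more than one cluster of categories in different parts of the draft.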
Bots devoted to this
I think you can save yourself the time you spend doing these as a manual effort, as bots PrimeBOT 2 and DannyS712 bot 3 are both devoted to this task and do it automatically. I believe that PrimeBOT might no longer be active but DannyS bot is, so it should handle all of the cases you are currently doing manually. If you think it is missing some, can you please leave some examples? Thanks, Mathglot (talk) 00:35, 4 June 2024 (UTC)
- I have no sense of how many categorized drafts DannyS712 bot catches, but I can state categorically that it doesn't catch them all — if it did, then Wikipedia:Database reports/Polluted categories (2) would always be empty, because if the bot were catching all the categorized drafts then there wouldn't be any categories with drafts in them for that report to detect. The fact that it isn't empty, and in fact inevitably catches a couple of hundred categories with drafts in them each time it runs means that the bot is not catching all categorized drafts, so anything that shows up in that report is something I do have to handle manually because a bot failed to catch it, so every single category that ever shows up in that report at all is always an example of at least one categorized draft (and sometimes several) that the bot failed to catch. So luckily I don't need to compile a list of missed drafts, because that report already constitutes a list of missed drafts. Bearcat (talk) 02:20, 4 June 2024 (UTC)
- How do you know the bot failed to catch them all? If you are deleting them before the bot gets there, then it won't find those, but maybe it would have a few minutes or hours later. Although bots are fast, they are not infinitely fast, and if you are looking at the category at the same time the bot is, you will still see whichever drafts the bot hasn't gotten around to yet. Maybe it would've handled them shortly after you got there, making your effort unnecessary. Or, maybe your efforts are vital and the bot simply cannot handle that level of volume and would've quickly become overwhelmed but for your efforts at keeping the numbers down. Shouldn't we find out which scenario is accurate? If it's the first one, then you could save yourself the effort and let the bot do it, while you could spend your time doing something more fun, productive, or something a bot simply could not do. Mathglot (talk) 04:40, 4 June 2024 (UTC)
- I checked the edit histories of several articles the last time I did a run, precisely because of this conversation — and many of the pages I dealt with had been in the categories for an entire week, because they had been placed in categories on May 26 or 27. The report isn't catching categories that have had drafts in them for a few minutes, such that the bot might come along in just five more minutes — it's catching categories that have had drafts in them for days and days, because the bot never came along at all.
- The thing is, if you read that report for DannyS712 bot, it's not programmed to just scour draftspace looking for any and all categorized drafts across the board — it's programmed to monitor one specific maintenance category, Category:AfC submissions with categories, that only catches pages that have both categories and the AFC submission template on them at the same time. If a draft page doesn't have the AFC submission template on it, then it won't be in that maintenance category, and thus the bot won't ever see or catch it at all. Again, I don't monitor the bot's edit activity, so I don't know how many drafts it typically catches in a day — but it only catches categories on drafts that have the AFC submission template on them, and does not catch categories on drafts that don't have the AFC submission template on them, and a lot of drafts don't have that template on them and thus will never be caught by the bot. Bearcat (talk) 10:18, 4 June 2024 (UTC)
- That is really good information. I'm not sure why the bot is programmed that way and I will see if it's possible to track down why it was done that way. Maybe the bot can be adjusted to do what you suggest, and then we won't have that problem anymore. Also, not sure why PrimeBot was deactivated, if that's what it was, and maybe it can just be restarted. I'll check. Mathglot (talk) 16:02, 4 June 2024 (UTC) Follow-up here. Mathglot (talk) 17:31, 4 June 2024 (UTC)
- And here, at the bot Talk page. Mathglot (talk) 09:00, 24 June 2024 (UTC)
- The bot's modification request is now approved, so there has been progress, but we're not there yet. I'm continuing to monitor and will keep you informed of future developments. Mathglot (talk) 22:38, 20 July 2024 (UTC)
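As a rough illustration of the lookup step discussed in this thread (finding the drafts inside a category that the report lists, whether or not they carry the AFC submission template), the MediaWiki Action API can restrict a category listing to one namespace. The sketch below only builds the request URL and reads titles out of an already-decoded response; it makes no network call. The helper names are hypothetical, and 118 is assumed to be the Draft namespace number, as it is on English Wikipedia.

```python
from typing import Optional
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
DRAFT_NAMESPACE = 118  # assumption: Draft namespace number on English Wikipedia


def draft_members_url(category: str, continue_token: Optional[str] = None) -> str:
    """Build a categorymembers query limited to pages in the Draft namespace."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,          # e.g. "Category:Example"
        "cmnamespace": DRAFT_NAMESPACE,
        "cmlimit": "max",
        "format": "json",
    }
    if continue_token:
        # Large categories are returned in batches; pass back the token
        # from the previous response to fetch the next batch.
        params["cmcontinue"] = continue_token
    return API_URL + "?" + urlencode(params)


def draft_titles(response: dict) -> list:
    """Extract page titles from a decoded categorymembers response."""
    members = response.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


if __name__ == "__main__":
    print(draft_members_url("Category:Example"))
    # Illustrative response shape only, not real data.
    fake = {"query": {"categorymembers": [{"ns": 118, "title": "Draft:Example"}]}}
    print(draft_titles(fake))
```

A query like this replaces the manual eyeballing, in-page searching, and paging through category listings, but it still has to be run once per listed category, so it does not by itself shrink the report.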