Commons:Bots/Requests/DougBot

Operator: User:Doug(talk contribs)

Bot's tasks for which permission is being sought:

  1. Categorization: I frequently run across categorization issues, such as overcategorization, undercategorization, and incorrect categories. As an example, Category:Nietzsche's_Werke has subcats for vols. III, VI, and VIII, but not for vol. I. Over 600 files (the jpg pages of the book) belong in vol. I. Vols. III and VI were also incorrectly in Category:Œuvres de Nietzsche, a category best used for the French translations of his works; I fixed those manually as there were only a few. I will always investigate thoroughly before making changes. I will move a few of the files of vol. I to a subcat as part of my test run. Later, after I have moved the entire group, I will set them up with a {{Book}} template with a {{BookNaviBar}}, so that they can be easily navigated in jpg form; however, doing this to only a sample set would be problematic.
  2. Tag maintenance: As an example, using the same files as above, most of the works in Category:Nietzsche's_Werke and its subcats have the license tag {{PD-old-70}}. Most of the works in German by Nietzsche should be converted to {{PD/1923|1900}} (or {{PD-old-100}}) to indicate worldwide public domain status. Additionally, most have either {{Info}} or no information at all beyond a license. I will (eventually) replace {{Info}} where it is present, putting {{Book}} with {{Creator}} and {{BookNaviBar}} on each (via a custom template, as was done by User:Inductiveload at upload on Category:Utopia,_More,_1518). However, applying these only partially would be problematic and would create a mess if it weren't finished in a single run for all 600-plus pages, so I will do a smaller test run replacing only a small number of {{PD-old-70}} tags with {{PD-old-100}}. I also plan to use the bot in a manual/bot-assisted mode to replace {{Info}} with {{Book}} on other works after close investigation.
Most of these will be works that are in use or have been uploaded for use on a wikisource project. Because I frequently work with such files as part of my work on wikisource projects, I am very familiar with the normal setups and I have been active in the improvement of {{Book}} and {{Creator}}. I am experienced in copyrights. DougBot currently runs formatting jobs on several wikisources and is flagged on en.ws and la.ws. I also run a second bot, s:Interwiki-Bot, a wikisource interwiki language link bot that runs on all wikisource projects and is flagged on all wikisource projects that provide for such flags.
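
The two text changes described in tasks 1 and 2 can be sketched as plain string operations on a file page's wikitext. This is only an illustration and not the bot's actual code: the function names are hypothetical, the target subcat name "Nietzsche's Werke, I" is an assumption, and nothing here calls the wiki; fetching and saving pages would be left to PWB.

```python
import re

# Matches only the bare tag {{PD-old-70}} (optional inner whitespace);
# tags carrying parameters are deliberately left alone in this sketch.
OLD_LICENSE = re.compile(r"\{\{\s*PD-old-70\s*\}\}")


def replace_license(wikitext, new_tag="{{PD-old-100}}"):
    """Swap each bare {{PD-old-70}} for new_tag; return (new_text, count)."""
    return OLD_LICENSE.subn(new_tag, wikitext)


def move_category(wikitext, old_cat, new_cat):
    """Rewrite [[Category:old_cat]] links to new_cat, keeping any sort key."""
    # Spaces and underscores are interchangeable in category titles.
    parts = [re.escape(p) for p in re.split(r"[ _]+", old_cat) if p]
    name = r"[ _]+".join(parts)
    pattern = re.compile(
        r"\[\[\s*Category\s*:\s*" + name + r"\s*(\|[^\]]*)?\]\]"
    )

    def repl(match):
        sort_key = match.group(1) or ""
        return "[[Category:" + new_cat + sort_key + "]]"

    return pattern.sub(repl, wikitext)


# Hypothetical page text, for illustration only.
sample = "{{Info}}\n{{PD-old-70}}\n[[Category:Nietzsche's_Werke]]\n"
updated, swapped = replace_license(sample)
updated = move_category(updated, "Nietzsche's Werke", "Nietzsche's Werke, I")
```

Matching the category name exactly, rather than by prefix, keeps the existing subcats for vols. III, VI, and VIII untouched.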


Automatic or manually assisted: Automatic/supervised

Edit type (e.g. Continuous, daily, one time run): Intermittent. Bot runs when a job is identified.

Maximum edit rate (e.g. edits per minute): Approximately 2 edits per minute if unflagged; approximately 6 edits per minute if flagged.

Bot flag requested: (Y/N): Doesn't matter to me; the bot will make sporadic runs ranging from a few dozen to several hundred edits at a time (the example above is over 600 files). These runs may be infrequent, but they will be annoying in the recent changes and can be run much faster with a flag.
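
To put the two proposed rates in perspective, here is a rough back-of-the-envelope sketch. It assumes exactly one edit per file and a constant rate; the function names are hypothetical and not part of PWB.

```python
def seconds_between_edits(edits_per_minute):
    """Delay between consecutive edits needed to hold a given edit rate."""
    if edits_per_minute <= 0:
        raise ValueError("edit rate must be positive")
    return 60.0 / edits_per_minute


def run_minutes(n_edits, edits_per_minute):
    """Approximate duration of a run, assuming a constant edit rate."""
    if edits_per_minute <= 0:
        raise ValueError("edit rate must be positive")
    return n_edits / float(edits_per_minute)


# The 600-file example at the unflagged and flagged rates.
unflagged = run_minutes(600, 2)  # 300.0 minutes, i.e. 5 hours
flagged = run_minutes(600, 6)    # 100.0 minutes
```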

Programming language(s): Python (PWB)


User:Doug(talk contribs) 09:49, 12 September 2011 (UTC)

Discussion

  Support, it seems to be ok. -- Basilicofresco (msg) 10:04, 4 December 2011 (UTC)

I read the proposal, but you are right: a test run is needed. Doug, could you please make a mixed test run of about 20 pages? -- Basilicofresco (msg) 18:59, 5 December 2011 (UTC)
Yes, sorry. I got tied up on some other things after I had planned out a large project for the above and never executed a test. I had several in mind; let me see if I can remember where I was and run a test within the next couple of days. Thanks for the poke.--User:Doug(talk contribs) 19:22, 5 December 2011 (UTC)
I started to simply change the cat and template but then decided to combine this with the addition of {{Book}}. I ran several careful tests, both on a private test wiki and in userspace, as well as dry runs in file space, and I found several bugs in replace.py (which I've reported and am looking at to see if I can figure them out; one was causing multiple matches even with dotall:true). At the same time, relooking at this reminded me that I wanted to create {{Nietzsche's Werke, I}}, which I did. Applying it to each of the pages combines several of the issues above into a single edit. I am running the bot on the first 20 pages as requested (pages 001 and 003-021; 002 was done manually to check the function of the template parameters). The first couple of pages were done twice because I had already started them with the less robust solution. I still need to write a new script to add the parameters that make the pages navigable, but I can finish the recatting now and add the parameters later, depending on how soon I get that done. Running at a 90-second throttle.--User:Doug(talk contribs) 12:06, 7 December 2011 (UTC)
  Done Now that I have worked through this, I foresee this being the general solution in many such cases. There will be authors with individual jpgs or djvus that need copyright tags corrected, and in some cases this may be appropriate for the bot, but a lot of those will be just as well done by hand. However, for sets of jpgs like these, the autocatting template will work best. I expect to process about 1500 pages of Nietzsche in the German version alone, as even the properly categorized ones should need the correct copyright tag and a book template, and could be converted to autocat. I will then move on to the French and English versions noted above. Once these are autocatted, they will be much easier to rearrange, simply by changing the cat in the template.--User:Doug(talk contribs) 12:18, 7 December 2011 (UTC)
  • Why doesn't the template take the page number as a parameter? In the current state, the page number is mentioned nowhere on the file page except in the filename. If the page numbers were put into a carefully constructed template, it would be possible to navigate page by page through the book. --99of9 (talk) 01:22, 22 May 2012 (UTC)
That can be done, and I intended to do it but decided to leave it to the end; I can't quite recall why at the moment. It's not really a bot issue, though, and it is easy to fix after the fact with a single edit, but I'll put it on my list to look back at. It may have been because these need to be moved; I'll look.--User:Doug(talk contribs) 02:30, 3 June 2012 (UTC)
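
As a rough sketch of the idea raised above, a page number could be derived from the filename and used to compute the neighbouring pages for a navigation template. The filename pattern (a zero-padded number directly before the extension), the function name, and the example filename are all assumptions; the actual filenames and the parameters of {{BookNaviBar}} are not given in this discussion.

```python
import re

# Assumed pattern: "<stem><zero-padded page number>.<extension>".
PAGE_RE = re.compile(r"^(?P<stem>.*?)(?P<num>\d+)(?P<ext>\.[A-Za-z0-9]+)$")


def neighbours(filename, first=1, last=None):
    """Return (page, previous_filename, next_filename) for one page file.

    previous_filename is None on the first page; next_filename is None
    on the last page when `last` is known.
    """
    match = PAGE_RE.match(filename)
    if match is None:
        raise ValueError("no page number found in %r" % filename)
    digits = match.group("num")
    page = int(digits)

    def name(n):
        # Keep the original zero padding, e.g. 3 -> "003".
        return match.group("stem") + str(n).zfill(len(digits)) + match.group("ext")

    prev_name = name(page - 1) if page > first else None
    next_name = name(page + 1) if last is None or page < last else None
    return page, prev_name, next_name


# Hypothetical filename, for illustration only.
page, prev_name, next_name = neighbours("Example book 002.jpg")
```

The three values could then be passed as parameters to whatever navigation template is chosen.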

Approved Seeing no further issues have arisen, I'm closing this request. While there are still some improvements possible for those templates, they can be implemented separately. --99of9 (talk) 23:48, 9 October 2012 (UTC)