Commons:Bots/Requests/UploadStatsBot

From Wikimedia Commons, the free media repository

UploadStatsBot (talk · contribs)

The first bot on Commons driven by node.js?

Operator: Rillke (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

Automatic or manually assisted: Automatic, running on wmf-labs.

Edit type (e.g. Continuous, daily, one time run): daily, currently at 03:20 UTC

Maximum edit rate (e.g. edits per minute): No throttle, just following mw:API:Etiquette. Scheduled by xcrontab, job submitted to grid engine.

Bot flag requested: (Y/N): N

Programming language(s): JavaScript. Running on Node.js - source code will be available on GitHub soon (it's on labs, /data/project/upload-stats-bot/)

Q & A

But node.js is slow!
True, it isn't the fastest, but I am not using jsdom. Still, more than 1 GB of memory is required. The nice part, however, is that the grid engine can decide when to execute the job. And once node is loaded, it executes at a speed comparable to other interpreted languages.
Are you running heavy SQL queries?
No. The query getting the upload count takes (execute+fetch) < 0.1s -- This is faster than some queries run by MediaWiki when navigating to special pages.
Can the bot be used to vandalize pages?
First, someone has to insert the template. Then the bot replaces that page with the template + upload count. However, there are some heuristics to prevent vandalism, such as checking the page size beforehand; if the page is too large, it is skipped. In other words, users who would otherwise be blocked by AbuseFilter can't use the bot as a tool for page blanking.
Is there a limit of pages updated by the bot?
Currently, the bot's execution time is limited to 90s and there is a limit of 500 pages.

Rillke(q?) 23:38, 9 April 2014 (UTC)
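
The safeguards described in the answers above (skipping oversized pages, the 500-page cap, and the 90 s execution limit) could be combined roughly as in the following sketch. It is not the bot's actual code: the function name `selectPagesToUpdate`, the `sizeBytes` field, and the `MAX_PAGE_BYTES` threshold are assumptions made for illustration; only the 500-page and 90 s figures come from this page.

```javascript
'use strict';

// Limits stated on this page.
const MAX_PAGES = 500;
const MAX_RUNTIME_MS = 90 * 1000;
// Assumed size threshold; the bot's real value is not given here.
const MAX_PAGE_BYTES = 2048;

// pages: array of { title, sizeBytes } (hypothetical shape).
// startedAt: timestamp in ms when the run began.
// now: clock function, injectable so the logic can be tested.
function selectPagesToUpdate(pages, startedAt, now = Date.now) {
  const selected = [];
  for (const page of pages) {
    if (selected.length >= MAX_PAGES) break;          // page cap reached
    if (now() - startedAt > MAX_RUNTIME_MS) break;    // time budget used up
    if (page.sizeBytes > MAX_PAGE_BYTES) continue;    // too large: never overwrite
    selected.push(page.title);
  }
  return selected;
}
```

Because oversized pages are skipped rather than overwritten, placing the template on a large existing page does not cause the bot to blank it.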

Discussion

Currently suffers from some badtoken errors (so not all pages are updated with every run); I'll have to investigate that. -- Rillke(q?) 23:38, 9 April 2014 (UTC)
Fixed. The issue was that tool-labs has such a fast connection to the servers that, if you request multiple edit tokens, they are not replicated before the next server replies to your request. Consequently, they are all different, and only one of them is accepted when it comes to using them. -- Rillke(q?) 08:52, 10 April 2014 (UTC)
Looks OK to me. --EugeneZelenko (talk) 14:21, 10 April 2014 (UTC)
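
The discussion above does not say how the badtoken problem was fixed. One plausible approach, sketched here under that caveat, is to request a single edit token and share it among all concurrent edits instead of requesting one per edit. `createTokenCache` and `fetchToken` are hypothetical names; `fetchToken` stands for any async function that asks the API for an edit token.

```javascript
'use strict';

// Wraps a token-fetching function so that concurrent callers share one request.
// fetchToken: hypothetical async function returning a Promise of an edit token.
function createTokenCache(fetchToken) {
  let pending = null;
  return {
    // Returns the shared Promise; the API is asked only on the first call.
    get() {
      if (pending === null) {
        const p = fetchToken().catch((err) => {
          // Forget a failed request so a later call can retry.
          if (pending === p) pending = null;
          throw err;
        });
        pending = p;
      }
      return pending;
    },
    // Call after a badtoken response to force a fresh token on the next get().
    invalidate() {
      pending = null;
    },
  };
}
```

Every edit would then await `cache.get()` before submitting, so all edits in a run use the same token, no matter which server answered the token request.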

If there are no objections, I think the task should be approved. --EugeneZelenko (talk) 14:32, 12 April 2014 (UTC)