User:Donald Trung/Minimalist categorisation (my philosophy on categorisation and why I do it)


This page 📃 is an essay on categorisation in Wikimedia Commons and why I prefer to categorise the way I do.

I believe in what I like to call "Categorist minimalism". Ironically, this stems from my utter hatred of minimalism. When it comes to writing ✍🏻 both Wikipedia articles and file descriptions I believe that more information is always better. I used to hate stubs (but then again it's better to have some information than to have none), and I always try to put as much information as possible into my works, uncovering every nook and cranny I can find in the pursuit of knowledge. Categories don’t serve the same purpose. In fact, having an abundance of different categories on one (1) file 📁 is more likely an indicator that there is a shortage of categories to describe a certain subject than anything else. My philosophy of "minimalism" is in fact "maximalism" where it concerns the categories themselves. As time ⌚ has passed I’ve changed my stance on categories on Wikimedia Commons several times, and I will explain this evolution and the reasoning behind it.

Local Vs. National Vs. Superlocal

See also: User:Donald Trung/Why you should photograph every day subjects.

Now before I explain my history with categorisation on Wikimedia Commons let me take the Frisian city 🏙 of Leeuwarden as an example. As a higher concentration of Wikimedia Commons media gets created and/or imported, a thing I like to call (and literally just made up) “category density” goes up. This is only logical: it aids the discoverability of a certain file 📁 within a tree 🌳 of categories and makes sure that the “master categories” won’t suffer from overpopulation. When Leeuwarden became “the European 🏰 capital city of culture” for 2018 this generated an explosion 💥 of Wikimedians going to Leeuwarden, Frisia ((West-)Frisian: Ljouwert, Fryslân). In fact this isn’t limited to hobby photographers from Wikimedia Commons, as Flickr photographers and plenty of other photographers will also take this opportunity to take pictures there. This leads to the creation of what I like to call (I also just made this term up in order to explain these phenomena) “superlocal categories”. There are “universal categories” like, let’s say, “Category:Perfume shops” (I write ✍🏻 this without internet so don’t expect me to research actual category names). Images in these categories could be taken anywhere on the planet 🌍, or even in the universe 🌌, as a perfume shop 🏪 on Mars could still be added to this category without issue. Then we have “Continental categories” like “Category:Perfume shops in Asia” where all Asian perfume shops from Ankara to Surabaya could be placed. Subordinate to this category are “National categories” like “Category:Perfume shops in Saudi Arabia”, “Category:Perfume shops in India”, “Category:Perfume shops in Vietnam”, etc. (not all countries lie within a single continent, for example Russia or the Netherlands, but most do). After that come “Local categories” like “Category:Perfume shops in Hanoi” or “Category:Perfume shops in Haiphong”. 
These categories are sometimes supplemented by “superlocal categories” such as those of individual perfume shops, often a local brand.
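
To make the levels above concrete, here is a minimal sketch in Python of how one subject walks down from the universal category to a superlocal one. The function name `category_chain`, its parameters, and the shop name "Example Perfumery" are all made up for illustration, and the category names are not checked against what actually exists on Wikimedia Commons.

```python
from typing import List, Optional


def category_chain(subject: str,
                   continent: Optional[str] = None,
                   country: Optional[str] = None,
                   settlement: Optional[str] = None,
                   establishment: Optional[str] = None) -> List[str]:
    """Return category names ordered from universal down to superlocal.

    Hypothetical helper: it only builds names, it does not look anything up.
    """
    chain = [f"Category:{subject}"]  # universal category
    # Continental, national and local categories share the same name pattern.
    for place in (continent, country, settlement):
        if place:
            chain.append(f"Category:{subject} in {place}")
    if establishment:
        # Superlocal category: one individual establishment,
        # disambiguated by its settlement when that is known.
        suffix = f" ({settlement})" if settlement else ""
        chain.append(f"Category:{establishment}{suffix}")
    return chain


# "Example Perfumery" is an invented shop name.
chain = category_chain("Perfume shops", "Asia", "Vietnam", "Hanoi",
                       "Example Perfumery")
for name in chain:
    print(name)
```

Each name in the list would be a sub-category of the one before it, so a file only needs to sit in the last one.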

Now Wikimedia Commons has absolutely no notability standards; “spammer-photographers” (or “spam-photographers” or, as one (1) word, “spamphotographers”) are completely indistinguishable from “regular photographers”. As long as the categories aren’t for selfies of non-notable people that don’t include notable people (this is why the Nipponese Dog 🐶 has his own category) or for non-notable groups of people such as a purely local band (although their performances are acceptable in the category for the building they're performing 🎭 in), you can in general take a picture 📷 of any non-musician doing a non-pornographic action 🎬. Or just of your own genitalia, as that stereotype of Wikimedia Commons remains true (although I personally don’t have anything against it). If you want to create a category for a small local shop 🏪 that no-one in its hamlet has even heard of, in the middle of a hamlet no-one else has ever heard of, you can. Seriously, go ahead and do it, as it might be the only internet presence that shop will ever have; many of these shops aren’t even on Facebook and/or Google Maps. In fact I remember creating a category specifically for a second-hand shop in a small Groninger village that went bankrupt that same year. The images of that shop will probably be the only evidence of that place’s existence once the new owners renovate the building. In short, the notability standards for Wikimedia Commons are almost non-existent.

Now that we’ve established what constitutes “notability” on Wikimedia Commons, it should be clear that it would be quite easy to fill up various categories. It’s quite baffling that Wikimedia Commons doesn’t have a photographer-spammer problem where people keep uploading lots of pictures 📷 of their own shops in order to create categories that promote them, especially as Wikimedia Commons files 📁 and categories might end up high in the search 🔍 results of third party search 🔎 engines like Google, Microsoft Bing, Ecosia, etc. Maybe this is because it would mean that they wouldn't keep “full” copyright © over images of their own establishments. It would be propaganda nonetheless, because if someone types in the name of their village, let’s say “Hamletshiresby”, they will find “Category:Otto Bucksmeyer’s Grains and Corns shop 🏪” right there, which might inspire tourists to find it, or even the (many) locals who’ve never heard of it. This is also great for future 🔮 historians and/or archivists who want good images to use, so in a couple of decades these images might have historical value. Despite Wikimedia Commons’ potential to be the largest Wikimedia project, it actually comes second to Wikidata, due to the investments of the Wikimedia Foundation as well as the great number of bot-generated items there. Wikimedia Commons could easily “beat” Wikidata if it utilised redirects, translations, Flickr, and many other possibilities. Its second place might also be due to a lack of exposure, a bad reputation (“that porn site where Wikipedia gets its images from”), and/or the fact that Wikimedia Commons has a very strict free-only culture. Nonetheless the exposure of Leeuwarden to “the international scene” helped it develop its category tree 🌳, which sprang so many branches that it now looks like “a miniature of a National category tree 🌳”. Usually this tends to be reserved 🈯 for larger cities like The Hague, Amsterdam, and Rotterdam. 
Another thing that can spark such an explosion 💥 of photography in a particular area is having a highly active Wikimedia Commonist(?) living there. Some rather remote villages in the world 🗺 seem oddly “overrepresented”, which is simply thanks to a hobbyist photographer living there with too much free time on his/her hands 👐🏻.

Larger settlements (almost) always have categories like “Category:20XX in XXX”, which can best be described as both “dump categories” (where large amounts of otherwise uncategorised photographs get “dumped”) and “provisional categories” (where items are uploaded in bulk before they’re given better categories). To me these “annual categories” provide great insight for future 🔮 historians, as they won’t need to take images individually and look up the dates. Likewise, if a settlement has a very high number of Wikimedia Commonists(?) and/or visitors, like Paris, these categories will even be monthly, like “Category:January 20XX in XXX”. This is only natural, as categories are based on the number of images available on a subject. If there are only 3 (three) images of a subject it wouldn't make much sense to create a sub-category for every file 📁 unless they have their own Wikipedia articles or you’re planning to add a (large) number of files 📁 in the future 🔮.

My evolution (since 2017)

Generally I always avoided creating new categories. Maybe that’s because at Wikipedia I never really created any categories, nor did I really care much for them. At a Wikipedia, categories are handy for more general things but are hopelessly behind navigation templates and simply clicking on related subjects. Sure, Wikimedia Commons could use more navigational templates, but these would be hopelessly inferior to well-organised category trees. As I started adding more images, my stance on what Wikimedia Commons categories are, what they represent, and how to utilise them changed significantly. To explain why I utilise categories the way I do now, I first 🥇 have to explain how I historically uploaded images to Wikimedia Commons and how I viewed categories then.

Honestly I kind of viewed them (Wikimedia Commons categories) as a nuisance at first 🥇. They were the “mandatory” thing you were “forced” to fill in while uploading and were “forced” to familiarise yourself with before you could even upload a single image. I did eventually research them, but then found that some category trees were way better organised than others. Many times categories seemed to overlap, and sometimes an entire category seemed to also be part of a similar category that was not the same. In the beginning I didn't create “my own categories” simply because I expected “the regular police 👮‍♀️” of those categories to fight any change like they do at a Wikipedia; however, after creating my first 🥇 few categories I noticed that this wasn’t the case. At first I created rather general master categories and rarely created sub-categories, but as I noticed that I uploaded a lot of images with rather high overlaps I started creating a few more categories, though never “too many”, even though in retrospect I should've. For example, I wanted to upload images of Chinese Republican banknotes issued between 1912 and 1949 all into the master category in order to correctly count it against the Shanghai Encyclopedia’s. This turned out to be a major mistake, as I ended up with almost 3000 (three-thousand) images in this master category (which as of writing ✍🏻 this on October 26th, 2018 has 0 (zero)). That proved to be more of a hassle than any form of benefit, as searching 🔍 for any specific banknote forced me to use the search 🔎 engine. Imagine a category so overpopulated that you can’t find anything that isn’t among the first 🥇 few images. As all catalogues organise these banknotes by issuing office 🏢, I decided that I should follow this as closely as I could. 
The same applied to Chinese cash coins. At first I didn't touch any of the categories I was working with, but once I was allowed to contribute to the encyclopedia “anyone can contribute to” (Wikipedia) again I noticed how inefficient it is to have an overpopulated master category, and I started dividing Chinese cash coins into categories based on their inscription as opposed to continuing to use the categories “User:Baomi” created many years ago. It’s a shame that I realised this later rather than sooner, but as I am in another of my “Wikipedia dark ages”, where I essentially write ✍🏻 almost nothing, I won't see how people who practically use images utilise them. Using master categories as basically “dump categories” is therefore a habit I am discontinuing in order to create a more branched-out category tree 🌳 for the subjects I work with. Good categorisation can help a lot with discoverability, and underestimating this can prevent people from contributing.

A good example of how differently I organise my own uploads is probably the two ✌🏻 (2) Groninger settlements I mostly photograph. In the city 🏙 of Winschoten, Oldambt I decided not to make too many categories. I organise most of my images into “Category:2018 in Winschoten”, which another user 👤 created. I’ve added over a hundred 💯 (100) images of shops in Winschoten, but rather than creating a separate “Category:Shops in Winschoten” I kept most of these in either “Category:Buildings in Winschoten” or their relevant “Category:XXX shops in the Netherlands”. I did this with a clear motivation to fill those specific categories: I treated the accumulation of educational resources as a competition in which my goal 🥅 was for “Category:[Subject] in the Netherlands” to contain more files 📁 than “Category:[Subject] in [Another country]”. My geography was both my limitation and my goal 🥅 here. I can’t say that I don’t do this anymore, as usually I only add images to “Category:Shops in [Human settlement]” if I can’t find “a national category” to match them with. I eventually added so many images to these “master categories” that I discouraged myself from categorising them further, as it would be “too much work”. However, after I started photographing the Groninger village of Oude Pekela I had a realisation (which I will expand upon in the section "Competitive categorisation" below) and started creating pages like “Category:Shops in Oude Pekela”, “Category:Flower shops in Oude Pekela”, and “Category:De Helling (Oude Pekela)”, as well as categories about its bridges, events, etc. This is because I learned that files 📁 are more easily discovered when they’ve been properly organised. Not being dependent on the existing categories gives you way more space 🚀 to upload images that fall outside of the current categories.

Other than my own photographs, I tend to organise other images into more categories too, such as imports from other websites with compatible copyright © licenses. In fact most of the current (as of October 29th, 2018) categories of individual Chinese cash coin inscriptions on Wikimedia Commons were created by me. This is partially because I believe that “the next wave of Chinese cash coin Wikipedia articles” created by “the next generation of Chinese cash coin-writers” will be based on “individual” inscriptions, as most overviews have already been written and I’ve created a general list 📃 of Chinese cash coins organised by their inscriptions. I also started creating more categories for subjects that are already represented on a Wikipedia but don't have their own categories on Wikimedia Commons, as many subjects that I photograph have had Wikipedia articles for many years but didn't have any images on Wikimedia Commons until I took and uploaded them.

"Minimalist (file) categorisation"

Now let’s talk about what I would like to call “Minimalist categorisation” and how it works. I find that a lot of users 👥 sometimes add twenty (20) or more categories to a single image. They do this to try and represent every subject depicted in the image: the location, the lighting, the subjects depicted in/mentioned by the image, companies associated with the image, etc. The thing is that this creates a large number of overlaps between categories. Let’s take the fictional hamlet of “Floresby” and say a user uploads an image to the categories “Category:Floresby”, “Category:Shops in the United Kingdom”, and “Category:Skippy ball shops”. These are three (3) categories that could all be covered by one. While I believe in “Minimalist categorisation” for the images themselves, I believe in “Maximalist categorisation” for categories. The image could fit in a single category called “Category:Skippy ball shops in Floresby”, which itself is subordinate to “Category:Shops in Floresby” and “Category:Skippy ball shops in the United Kingdom (by city/settlement)”. The former belongs to the category tree 🌳 of “Category:Floresby”, while the other two original categories (“Category:Shops in the United Kingdom” and “Category:Skippy ball shops”) are represented through the latter. This allows for the creation of more categories, which could then be filled with images from other settlements.
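
The Floresby example can be written down as a small sketch in Python. The `PARENTS` mapping below is made up from the fictional categories above (with the “(by city/settlement)” suffix left off for brevity), and the helper names `ancestors` and `minimal_categories` are my own invention, not an existing Wikimedia Commons tool. The sketch also assumes the category tree 🌳 has no loops.

```python
from typing import Dict, List, Set

# Fictional parent links: each category maps to the categories it sits in.
PARENTS: Dict[str, List[str]] = {
    "Category:Skippy ball shops in Floresby": [
        "Category:Shops in Floresby",
        "Category:Skippy ball shops in the United Kingdom",
    ],
    "Category:Shops in Floresby": ["Category:Floresby"],
    "Category:Skippy ball shops in the United Kingdom": [
        "Category:Shops in the United Kingdom",
        "Category:Skippy ball shops",
    ],
}


def ancestors(category: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every category reachable by following parent links upwards."""
    seen: Set[str] = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def minimal_categories(categories: List[str],
                       parents: Dict[str, List[str]]) -> List[str]:
    """Drop every category that is already implied by a more specific one."""
    implied: Set[str] = set()
    for category in categories:
        implied |= ancestors(category, parents)
    return [c for c in categories if c not in implied]


file_categories = [
    "Category:Floresby",
    "Category:Shops in the United Kingdom",
    "Category:Skippy ball shops",
    "Category:Skippy ball shops in Floresby",
]
print(minimal_categories(file_categories, PARENTS))
```

With these made-up links the three broad categories are all reachable from the specific one, so only “Category:Skippy ball shops in Floresby” has to stay on the file 📁.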

Wikimedia Commons wasn't built in a day, and creating more specific categories will also help more novice users 👥 organise their files 📁 better. But let’s talk about why these new categories are actually more helpful than simply adding a lot of categories to a single image. Discoverability is quite important, because if an image is “impossible” to find it's very unlikely to ever get used. Users 👥 will look for already existing images before they create a new one, and if they can’t find the category they’re looking for they’ll assume that it doesn’t exist (even if it does). Maybe the searcher looked at “Category:Shops in the United Kingdom by type” and never even bothered to check “Category:Skippy ball shops”. There are essentially two ways to search 🔎 in Wikimedia Commons: through the built-in search 🔎 engine 🚂 or through categories. A lot of files 📁 are badly named and many people never find what they were searching 🔍 for in the first place. Categories could (and should) offer a solution to this, but many users 👥 simply don’t utilise them well because they think 🤔 that adding more categories is better, while adding more specific categories could be the best solution.

Not all categories will be filled in one day, but if plenty of people create master categories without ever subdividing them, then a year and a few fanatical uploaders later you’ll find a category with over 3000 (three-thousand) images and only a handful of sub-categories. This creates what I'd like to call “quantitative fatigue”, where users 👥 who want to look for a specific image to use will not be able to find what they want because there are simply “too many” images to choose from, and the plethora of choice ends up hurting the platform rather than helping it. Now I have a solution to this: simply create categories like “Category:Valued images of XXX” and “Category:Images of XXX by XXX XXX” as well as “Category:Images of XXX taken in 20XX”. These are what I call “the unnatural sub-categories”. They aren’t based on actual categories of the subject at hand ✋🏻 but could help stop 🤚🏻 the overpopulation of a certain category, as many images are bound to come from certain sources, and differentiating between high quality images and “the rest” could make it easier for potential users 👥 of the images to select one or more that would be of better value to them.
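
As a rough sketch of how these “unnatural sub-categories” could be worked out for an overpopulated category, the Python below sorts files into the three name patterns mentioned above. The `FileInfo` record, the function name `unnatural_subcategories`, and the sample files and photographers are all invented for illustration; a real file 📁 would need its year, source, and valued status to come from somewhere.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical record holding the few facts the sketch needs."""
    title: str
    year: int
    author: str
    valued: bool = False


def unnatural_subcategories(subject: str,
                            files: List[FileInfo]) -> Dict[str, List[str]]:
    """Group file titles under year, author and valued-image sub-categories."""
    buckets: Dict[str, List[str]] = defaultdict(list)
    for info in files:
        buckets[f"Category:Images of {subject} taken in {info.year}"].append(info.title)
        buckets[f"Category:Images of {subject} by {info.author}"].append(info.title)
        if info.valued:
            buckets[f"Category:Valued images of {subject}"].append(info.title)
    return dict(buckets)


# Invented sample data.
sample = [
    FileInfo("File:Banknote 1.jpg", 2017, "Example Photographer", valued=True),
    FileInfo("File:Banknote 2.jpg", 2018, "Example Photographer"),
    FileInfo("File:Banknote 3.jpg", 2018, "Another Uploader"),
]
result = unnatural_subcategories("Example banknotes", sample)
for name, titles in sorted(result.items()):
    print(name, len(titles))
```

A file 📁 can land in several of these at once, which is fine here because they cut the same pile of images along different lines rather than describing the subject itself.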

Another method would be to organise images by whether they show an interior or an exterior, or by whether they were taken in one place or another. The more categories exist, the easier it is to organise and/or find any particular image.

"Competitive categorisation"

I used to engage in what I would like to call “competitive uploading based on category”. This isn’t based on upload competitions, as I always use a mobile telephone 📱 as camera 📷, but on comparing “Category:XXX shops in XXX” with “Category:XXX shops in XXX” (as in another place) and then trying to add the greatest quantity of files 📁 to the category I can contribute to. This has caused a lot of “quantitative fatigue” in several image categories I photographed for, and rather than purely looking at the number of files 📁 in “the main category” I now also look at the number of files 📁 in the sub-categories and/or the number of sub-categories themselves. Because I was looking at the forest I was unable to see the trees 🌳.

I personally believe that there should be a tool 🔬 that allows a user 👤 to see how many images are in a particular category AND all of its sub-categories. And as I realised that having categories that are “too full” is very (re-)user 👤 unfriendly, I started organising works rather than just filling 🥧 categories. I hope 🤞🏻 that this essay has given anyone interested plenty of insight into how I categorise files 📁 and why I do it the way that I do.
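
What such a counting tool 🔬 would have to do can be sketched in a few lines of Python. This is only an illustration working on made-up in-memory data (the file names are invented, and fetching the real category contents from Wikimedia Commons is left out entirely). It counts every file 📁 once even if it sits in several sub-categories, and it keeps track of visited categories so that a loop in the category tree 🌳 can't trap it.

```python
from typing import Dict, List, Set


def count_files_recursive(root: str,
                          files_in: Dict[str, List[str]],
                          subcats_of: Dict[str, List[str]]) -> int:
    """Count unique files in `root` and in all of its sub-categories."""
    seen_categories: Set[str] = set()
    seen_files: Set[str] = set()
    stack = [root]
    while stack:
        category = stack.pop()
        if category in seen_categories:
            continue  # already visited, e.g. reached via two parents or a loop
        seen_categories.add(category)
        seen_files.update(files_in.get(category, []))
        stack.extend(subcats_of.get(category, []))
    return len(seen_files)


# Invented sample data using category names from this essay.
SUBCATS = {
    "Category:Shops in Oude Pekela": ["Category:Flower shops in Oude Pekela"],
}
FILES = {
    "Category:Shops in Oude Pekela": ["File:Shop A.jpg", "File:Shop B.jpg"],
    "Category:Flower shops in Oude Pekela": ["File:Flower shop 1.jpg",
                                             "File:Shop B.jpg"],
}
total = count_files_recursive("Category:Shops in Oude Pekela", FILES, SUBCATS)
print(total)
```

Comparing this total with the number of files 📁 directly in the main category gives a quick feel for how well a category has been subdivided.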

Original publication 📤

Sent 📩 from my Microsoft Lumia 950 XL with Microsoft Windows 10 Mobile 📱. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 10:13, 29 October 2018 (UTC)