July 2023 – W. Smith

I have to admit that I’m a digital packrat. According to my backup service, I have 18.4 million files. Some of these are system files or installed applications that I won’t get rid of. However, many of the others are unattributed or are one of a dozen files with the same name. Even more wasteful, many of these are identical copies of the same information.

I have a script that scans my drives regularly. It makes a list of the files on the system. The scans include the size of each file, its timestamp, the path to the file and its filename. It’s comprehensive and automated (and a little inefficient).
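As a rough illustration of that kind of scan, here is a minimal Python sketch. It is not my actual script. The function names (`scan`, `write_listing`) and the tab-separated layout are assumptions made for the example; the only things taken from the description above are the four fields recorded for each file.

```python
import os
from datetime import datetime, timezone


def scan(root):
    """Yield (size, timestamp, directory, filename) for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # unreadable or vanished file; skip it
            # Modification time as an ISO 8601 string in UTC
            stamp = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat()
            yield st.st_size, stamp, dirpath, name


def write_listing(root, out_path):
    """Write one tab-separated line per file and return the number of files listed."""
    count = 0
    # surrogateescape keeps odd, undecodable filenames from aborting the scan
    with open(out_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for size, stamp, dirpath, name in scan(root):
            out.write(f"{size}\t{stamp}\t{dirpath}\t{name}\n")
            count += 1
    return count
```

One line per file keeps the listing easy to search with ordinary text tools.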

However, it is simple to search. I can ‘grep’ a file name and find where its copies are located. In a recent mission of search and destroy, one file, ynew.exe, had 817 (!) copies. There were various builds of that file but many were exact copies of each other. Some of the files have been lurking since 2012. I didn’t realize how out-of-control things had become until I started developing a “de-replicate” tool.

Back to the topic of provenance, ynew is easy for me to identify since I wrote it myself. On the other hand, my font files are unattributed. When I move from one OS to another, I’ve usually copied the font files and installed them on the new system. Again, some of them are easy to recognize. “Bills-Font-5.ttf” came from a font generator service that took my handwriting and turned it into a font. I’m not sure of the origin of others. Are they available on Google Fonts or only from a sketchy site that offers free font downloads? I don’t know whether I’ve already met the license requirements for using them in a publication or commercially.

The metadata for many files is lost except for their timestamps and sizes. The timestamp can be a hint and the size is a shortcut to identify obviously different files.
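The size shortcut can be sketched in a few lines of Python. This is not the “de-replicate” tool itself, just one plausible way to apply the idea: files of different sizes cannot be identical, so only files that share a size are worth hashing. The names `file_digest` and `find_duplicates` and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Group paths whose contents are identical; returns a list of sorted groups."""
    # Step 1: bucket by size, which is cheap and rules out most files
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size means no possible duplicate
        # Step 2: hash only the files that share a size
        by_hash = defaultdict(list)
        for p in same_size:
            by_hash[file_digest(p)].append(p)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Hashing only within same-size buckets means most files are never opened at all.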

PDFs can also be mysterious. Where did I get this research paper on hypernatremia? And why? Who wrote this essay on creativity and when? Most of the mp3s that I enjoy are labeled (and replicated). Most have album and artist in the path and in the audio’s metadata. Other audio files aren’t as organized. The provenance of these files can be a mystery. Although music files usually have metadata inside them, I don’t know how to automatically evaluate such a massive set of files. Even images offer a jumbled mess of information. Photoshop and Lightroom just make it worse.

Trying to get ahead of the problem, I’m making metadata files to surround the poetry that I have written. When I post it to Patreon, I also build metadata files to tie together drafts, links to the posts and details about the images I’ve used in the posts.
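A metadata file of this kind could be as simple as a JSON “sidecar” saved next to the work. The sketch below is only an illustration: the function name `write_sidecar`, the `.meta.json` suffix and the field names are all assumptions, chosen to mirror the three things mentioned above (drafts, links to the posts, image details).

```python
import json
from pathlib import Path


def write_sidecar(work_path, drafts, post_urls, images):
    """Write a JSON sidecar next to work_path and return the sidecar's path."""
    work_path = Path(work_path)
    record = {
        "work": work_path.name,    # the file this record describes
        "drafts": list(drafts),    # earlier versions of the piece
        "posts": list(post_urls),  # where it was published
        "images": list(images),    # e.g. dicts with a file name and its source
    }
    # Hypothetical naming rule: poem.txt -> poem.txt.meta.json
    sidecar = work_path.with_name(work_path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2, ensure_ascii=False), encoding="utf-8")
    return sidecar
```

Keeping the record beside the file means the provenance travels with it when the folder is copied to a new system.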

Provenance is a useful concept for museums and other cultural institutions. Their records of provenance provide a chain of documentation connecting the original creator with the current item.

With technology, that chain can become tenuous. I would like to simplify the chaos on my system, but I haven’t won the war and have only fought a few tentative skirmishes. I hope it’s worth it.

If someone violates community guidelines, they can be given a suspension or ban.

A suspension is a temporary hold. It could be a warning that is meaningful for someone who is pushing the boundaries too far and needs a digital rap on the knuckles with a ruler. A ban, forbidding access to the site, can be appropriate for people who are malicious and harming a web site’s community. Once they are pushed out, over time, their influence will fade.

Black hole moderation is a more demonstrative way of rejecting an account. To implement a black hole moderation decision, all content created by the user will be erased from the site. The incentive for such users to leave a “mark on the trees” will be eliminated.

Although black hole moderation could be a disincentive to bad actors, it might be painful to the site. It would not be something done lightly. For example, content that is copied into a reply or reposted might also be deleted. Technical solutions for such a search and destroy mission would be interesting to develop.
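To make the search-and-destroy idea concrete, here is a toy Python sketch over an in-memory list of posts. Everything in it is hypothetical: the `Post` fields, the `black_hole` function and the rule that a post counts as a copy if it contains a removed post’s text verbatim (above a minimum length). A real site would need a database and far more careful matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    post_id: int
    author: str
    body: str
    repost_of: Optional[int] = None  # id of the post being reposted, if any


def black_hole(posts, banned_author, min_quote_len=20):
    """Return (kept_posts, removed_ids) after erasing a user's content and its copies."""
    banned = [p for p in posts if p.author == banned_author]
    removed = {p.post_id for p in banned}

    # Replies that copy a banned post verbatim; short bodies are ignored
    # so that common phrases do not trigger removals (an assumed heuristic).
    quoted = [p.body for p in banned if len(p.body) >= min_quote_len]
    for p in posts:
        if p.post_id not in removed and any(q in p.body for q in quoted):
            removed.add(p.post_id)

    # Reposts, followed transitively: a repost of a removed post is removed too
    changed = True
    while changed:
        changed = False
        for p in posts:
            if p.post_id not in removed and p.repost_of in removed:
                removed.add(p.post_id)
                changed = True

    kept = [p for p in posts if p.post_id not in removed]
    return kept, removed
```

Even this toy version shows the cost to the site: a reply from an innocent user disappears because it quoted the banned account.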

In a simple example, black hole moderation on DeviantArt would remove all of the user’s messages, art and interaction. On a message board, the hosts would remove all of the user’s messages and interaction with other users. Much more difficult cases of black hole moderation would occur on crowdsourced sites like Wikipedia and Fandom.com.

Adding black hole moderation to social media sites might be more useful and less difficult.

The purpose of black hole moderation is to give a disincentive to the troll who incessantly adds bad content that doesn’t quite breach community standards but, when taken as a whole, is harmful.

If privacy of an individual can justify the right to be forgotten, black hole moderation embodies the right to forget.


Provenance: Where did that come from?!

Black hole moderation
