The firehose… what a Genius!


The Firehose feature of genius.com is cool. It is a (figurative) firehose spewing out all of the activity on the site. It reminds me of Tweetdeck. To sample the firehose, you need to be logged in.

I see that some of the user names are (more than a little) sketchy. (Other sites I visit would ban them day zero.) I find the site pretty light-hearted—‘bad’ people get put in a penalty box (à la hockey) and people who are banned often get a chance to leave a parting message (which is just goofy).

Commenting in the forums requires more IQ than I have right now. My current project (quest) is to get all of the metadata for the group Saga corrected. At the moment, everything Saga is listed under the rapper Saïga, and the real Saga from Canada is nowhere to be found. I guess I’m passionate about Saga, having over 30 discs on my shelf.

There are plenty of lyrics websites. The closest one to Genius that I’ve found is Song Meanings. Both Genius and Song Meanings are crowd-sourced, but Genius is hands down the winner.

The artists are part of the Genius community and often contribute textually or with song breakdown videos. Linkin Park Breaks Down “Good Goodbye” On Genius’ Video Series ‘Verified’ was one of the first I saw, before Chester Bennington’s suicide.

My vote of confidence for Genius is that I prefer to add site:genius.com when I’m using a search engine.

XMLIN, XMLOUT

[Image: T-shirt reading "I Heart XML"]

In addition to the standard STDIN, STDOUT and STDERR I/O streams available to processes, the concept of XMLIN and XMLOUT streams could be very useful. I’m not talking about XML pipelines such as those described in the W3C note (https://www.w3.org/TR/xml-pipeline/). I don’t want to make tools that are programmed to manipulate all XML, but rather to use XML as an encoding of the data that is more flexible than flat files.

In a pipeline of processes, passing the information as structured data would allow the pipes to analyze information in a more intelligent manner. I would treat the XMLIN/OUT streams as a sequence of XML documents. The goal would be to help in processing complex data.
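
To make "a sequence of XML documents" concrete, here is a minimal sketch in Python. The framing is my own assumption, not part of any standard: each document sits on its own line, so element text must not contain raw newlines. The function names `read_documents` and `write_document` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def read_documents(stream):
    """Yield one parsed root element per non-empty line of a text stream.

    Assumed framing (hypothetical): one complete XML document per line.
    """
    for line in stream:
        line = line.strip()
        if line:
            yield ET.fromstring(line)


def write_document(stream, element):
    """Serialize one element as a single line on a text stream."""
    stream.write(ET.tostring(element, encoding="unicode") + "\n")


if __name__ == "__main__":
    # Stand-in for XMLIN; a real filter would read a pipe instead.
    xmlin = io.StringIO('<user name="a"><works>3</works></user>\n<user name="b"/>\n')
    for doc in read_documents(xmlin):
        print(doc.tag, doc.get("name"))
```

Line-based framing keeps the reader trivial. A more robust design would need a real delimiter or a length prefix.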

I have growing tables of user properties from deviantart. I could create filters to extract fields of the data and format them as XML. This formatted data could feed more specialized tools later in the pipeline.

For example, I could extract the number of works that each user has over time. The first filter would extract all of the records for each user. The next level in the pipeline would perform a regression analysis and the final tool would encode the information graphically.
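
The first two stages of that example could be sketched like this. The record shape (`<sample user="..." day="..." works="..."/>`) and the function names are assumptions made up for illustration, and the regression is a plain least-squares slope rather than anything more refined. The graphical stage is left out.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def group_by_user(documents):
    """Stage 1: collect (day, works) points for each user.

    Assumes hypothetical records like <sample user="alice" day="0" works="10"/>.
    """
    series = defaultdict(list)
    for doc in documents:
        point = (float(doc.get("day")), float(doc.get("works")))
        series[doc.get("user")].append(point)
    return series


def slope(points):
    """Least-squares slope of works against day."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def regress(series):
    """Stage 2: emit one <trend> element per user with a usable series."""
    for user, points in series.items():
        if len({x for x, _ in points}) < 2:
            continue  # a slope needs at least two distinct days
        yield ET.Element("trend", user=user, works_per_day=f"{slope(points):.2f}")
```

Each stage only looks at the attributes it needs, so a later tool could add fields to `<trend>` without disturbing the earlier ones.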

Information formatted as XML allows more generic tools. They can more easily ignore information that an earlier tool produced. Tags that are unexpected will be invisible. When the information is formatted as flat text through STDIN and STDOUT, the tool has to know specifically to ignore columns 3 and 5. XML can make the data more robust.
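
A small sketch of what "invisible" means here, assuming a hypothetical `<user>` record with a `<works>` child. The tool asks for the one tag it knows about by name, so extra tags added by an earlier stage never get in its way.

```python
import xml.etree.ElementTree as ET


def works_count(document_text):
    """Return the <works> value of a hypothetical <user> record.

    Any other child elements are simply never looked at.
    """
    root = ET.fromstring(document_text)
    return int(root.findtext("works"))


plain = "<user><works>42</works></user>"
# The same record after an upstream tool added fields this tool knows nothing about.
enriched = "<user><joined>2009</joined><works>42</works><badge>pro</badge></user>"
```

Both `works_count(plain)` and `works_count(enriched)` return the same number. The flat-text equivalent would break as soon as a column was inserted ahead of the one it expected.
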

[Image: XML license plate]

One reason for logically separating STDIN/OUT from XMLIN/OUT is that filters may work in either mode. If a standard tool is useful, it will use the data from STDIN. It won’t negate the more general-purpose formatting from XMLIN. Alternatively, STDIN could pass configuration information while XMLIN carries the data.

Implementing two input streams and two output streams in a cooperative way is a technical challenge. Opening the extra pipes is easy, but the tools could easily deadlock if it were done incorrectly.
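
One way the deadlock can arise: a tool reads one input to the end before touching the other, while the upstream writer is blocked on a full pipe for the stream nobody is reading. A minimal sketch of one way around it is to drain both inputs concurrently, here with one thread per stream. The names `drain_all` and `demo` are hypothetical, and the demo uses plain `os.pipe` pairs as stand-ins for STDIN and XMLIN.

```python
import os
import threading


def drain_all(streams):
    """Read every named binary stream to EOF concurrently; return {name: bytes}."""
    results = {}

    def drain(name, stream):
        with stream:
            results[name] = stream.read()

    workers = [threading.Thread(target=drain, args=item) for item in streams.items()]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


def demo(size=200_000):
    """Feed two pipes with more data than a pipe buffer usually holds."""
    std_r, std_w = os.pipe()
    xml_r, xml_w = os.pipe()

    def feed(fd, payload):
        with os.fdopen(fd, "wb") as sink:
            sink.write(payload)

    feeders = [
        threading.Thread(target=feed, args=(std_w, b"t" * size)),
        threading.Thread(target=feed, args=(xml_w, b"<d/>" * size)),
    ]
    for feeder in feeders:
        feeder.start()
    results = drain_all({"stdin": os.fdopen(std_r, "rb"), "xmlin": os.fdopen(xml_r, "rb")})
    for feeder in feeders:
        feeder.join()
    return {name: len(data) for name, data in results.items()}
```

Buffering everything in memory is only reasonable for small inputs. A streaming tool would more likely multiplex the two descriptors with something like the `selectors` module.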

Embedding the XML and free-format output as escaped sections of the normal STDIN/OUT pipe is another possibility.
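
A rough sketch of the embedding idea, under an escaping convention I made up for illustration: any line that starts with the ASCII record separator (`\x1e`) carries one XML document, and every other line is ordinary free-format text. The name `demultiplex` is hypothetical too.

```python
import xml.etree.ElementTree as ET

# Hypothetical marker: a line starting with this character holds one XML document.
XML_PREFIX = "\x1e"


def demultiplex(lines):
    """Split a mixed stream into (plain text lines, parsed XML documents)."""
    text, documents = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(XML_PREFIX):
            documents.append(ET.fromstring(line[len(XML_PREFIX):]))
        else:
            text.append(line)
    return text, documents
```

A tool that knows nothing about the convention would still see the XML lines as odd-looking text, which is the main weakness of sharing a single pipe.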

In a less ambitious project, XMLIN/OUT would just be conceptual and the filter apps would have flags indicating which mode to use.
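
For the less ambitious version, the mode flags could look something like this. The flag names `--input-format` and `--output-format` are assumptions for a hypothetical filter, not options of any existing tool.

```python
import argparse


def parse_mode(argv):
    """Parse the (hypothetical) mode flags of a filter app."""
    parser = argparse.ArgumentParser(description="Hypothetical XMLIN/XMLOUT-aware filter")
    parser.add_argument("--input-format", choices=["text", "xml"], default="text",
                        help="treat STDIN as flat text or as XML documents")
    parser.add_argument("--output-format", choices=["text", "xml"], default="text",
                        help="write flat text or XML documents to STDOUT")
    return parser.parse_args(argv)
```

Calling `parse_mode(["--input-format", "xml"])` would put the filter in XML-in, text-out mode, which is handy for the last tool in a pipeline.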

Original image: P9150078. By Steve Singer [Image license]
Original image: XML. By lambdageek [Image license]