Windy Winter and Artificial Intelligence

I’ve noticed recently that it seems to be windier here than in the past. Over the weekend, we had a strong wind storm with 60+ mph gusts. Yet I don’t think that one storm is proof. Just as a bitterly cold week doesn’t invalidate climate change, a single wind storm doesn’t mean the area has become windier. But I suspected a change before the weekend.

I mentioned my observation to a friend and he suggested I could use an AI tool to analyze the situation. I’m not convinced that artificial intelligence will help me get a better result.

The research in “Global trends in wind speed and wave height” by I. R. Young, S. Zieger, and A. V. Babanin indicates that it is an interesting topic. That paper could help me identify the most useful questions. It also indicates that analyzing climate is not easy.

For my own situation, I need to find a source of historical wind speeds. Weather Underground has some data starting about 1940, but it isn’t easy to access. windfinder.com sells hourly data going back to 1999. I didn’t check the pricing of their data.

Another issue is what I should measure. Weather Underground has the maximum speed each day, which is a good start but may not answer my question. Windfinder has hourly data, which is a finer granularity and might be more useful.

There are a few more decisions I would need to make before I get an answer. Should I pick a break point, with a historical baseline on one side and recent data on the other, or should I look for a trend in the wind speeds? If I split the data, the number of data points in each group can affect the statistical validity of my results.
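To make the two options concrete, here is a minimal sketch in Python. The yearly figures are made up for illustration (they are not real measurements), and the function names split_means and trend_slope are my own. It shows a baseline-versus-recent comparison next to a simple least-squares trend, and it reports the group sizes because those affect how much the comparison can be trusted.

```python
from statistics import mean

def split_means(yearly, break_year):
    """Mean before break_year, mean from break_year on, and both group sizes."""
    base = [v for y, v in yearly.items() if y < break_year]
    recent = [v for y, v in yearly.items() if y >= break_year]
    return mean(base), mean(recent), len(base), len(recent)

def trend_slope(yearly):
    """Least-squares slope of the values against the year (units per year)."""
    years = list(yearly)
    values = list(yearly.values())
    y_bar, v_bar = mean(years), mean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den

# Hypothetical yearly averages of the daily maximum wind speed, in mph.
sample = {2015: 18.0, 2016: 18.5, 2017: 19.0, 2018: 19.5, 2019: 20.0, 2020: 20.5}

print(split_means(sample, 2018))
print(trend_slope(sample))
```

Neither number is a significance test. Moving the break year changes the first result, which is exactly the kind of decision I would have to justify before trusting it.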

Before I start looking for tools to confirm my observation, I need to make a lot of decisions. I don’t think an artificial intelligence tool will help me decide these prerequisites.

Artificial Intelligence is a trendy hammer, but not every problem is a nail.

The firehose… what a Genius!

The Firehose feature of genius.com is cool. It is a (figurative) firehose spewing out all of the activity on the site. It reminds me of Tweetdeck. To sample the firehose, you need to be logged in.

I see that some of the user names are (more than a little) sketchy. (Other sites I visit would ban them day zero.) I find the site pretty light-hearted—‘bad’ people get put in a penalty box (à la hockey) and people who are banned often get a chance to leave a parting message (which is just goofy).

Commenting in the forums requires more IQ than I have right now. My current project (quest) is to get all of the metadata for the group Saga corrected. At the moment, everything Saga is listed under the rapper Saïga. The real Saga from Canada is nowhere to be found. I guess I’m passionate about Saga, having over 30 discs on my shelf.

There are plenty of lyrics websites. The closest one to Genius that I’ve found is Song Meanings. Both Genius and Song Meanings are crowd-sourced. But, Genius is hands down the winner.

The artists are part of the Genius community and often contribute textually or with song breakdown videos. Linkin Park Breaks Down “Good Goodbye” On Genius’ Video Series ‘Verified’ was one of the first I saw, before Chester Bennington’s suicide.

My vote of confidence for Genius is that I prefer to add site:genius.com when I’m using a search engine.

XMLIN, XMLOUT

In addition to the standard STDIN, STDOUT and STDERR I/O streams available to processes, the concept of XMLIN and XMLOUT streams could be very useful. I’m not talking about XML pipelines such as described in https://www.w3.org/TR/xml-pipeline/. I don’t want to make tools that are programmed to manipulate all XML, but rather to use XML as an encoding of the data that is more flexible than flat files.

In a pipeline of processes, passing the information as structured data would allow the pipes to analyze information in a more intelligent manner. I would treat the XMLIN/OUT streams as a sequence of XML documents. The goal would be to help in processing complex data.
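A minimal sketch of what "a sequence of XML documents" could look like in Python. It assumes, purely for illustration, that each document sits on its own line of the stream. That framing rule, the helper names read_documents and write_document, and the user and works tags are all hypothetical.

```python
import io
import sys
import xml.etree.ElementTree as ET

def read_documents(stream):
    """Yield one parsed element per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield ET.fromstring(line)

def write_document(element, stream):
    """Serialize one element followed by a newline (assumes no embedded newlines)."""
    stream.write(ET.tostring(element, encoding="unicode") + "\n")

# Stand-in for an XMLIN stream carrying two small documents.
xmlin = io.StringIO(
    '<user name="a"><works>3</works></user>\n'
    '<user name="b"><works>7</works></user>\n'
)

for doc in read_documents(xmlin):
    write_document(doc, sys.stdout)
```

One document per line is the simplest framing I could think of. Documents that contain newlines would need a different delimiter or a length prefix.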

I have growing tables of user properties from deviantart. I could create filters to extract fields of the data and format them as XML. This formatted data could feed more specialized tools later in the pipeline.

For example, I could extract the number of works that each user has over time. The first filter would extract all of the records for each user. The next level in the pipeline would perform a regression analysis and the final tool would encode the information graphically.
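The three stages could be sketched roughly like this in Python. The record layout (a user element with name and day attributes and a works child) is a hypothetical stand-in for my real tables, and the "graphical" stage is reduced to a crude text bar chart.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def extract(documents):
    """Stage 1: group (day, works) observations by user name."""
    series = defaultdict(list)
    for doc in documents:
        rec = ET.fromstring(doc)
        point = (float(rec.get("day")), float(rec.findtext("works")))
        series[rec.get("name")].append(point)
    return series

def regress(series):
    """Stage 2: least-squares slope of works against day for each user."""
    slopes = {}
    for name, points in series.items():
        n = len(points)
        x_bar = sum(x for x, _ in points) / n
        y_bar = sum(y for _, y in points) / n
        num = sum((x - x_bar) * (y - y_bar) for x, y in points)
        den = sum((x - x_bar) ** 2 for x, _ in points)
        slopes[name] = num / den if den else 0.0
    return slopes

def chart(slopes, scale=10):
    """Stage 3: one text bar per user; negative slopes get an empty bar."""
    return [
        f"{name:<6}{'#' * round(slope * scale)} {slope:.2f}"
        for name, slope in sorted(slopes.items())
    ]

# Made-up records for two users.
records = [
    '<user name="a" day="0"><works>3</works></user>',
    '<user name="a" day="10"><works>8</works></user>',
    '<user name="b" day="0"><works>10</works></user>',
    '<user name="b" day="10"><works>12</works></user>',
]

for line in chart(regress(extract(records))):
    print(line)
```

Here the stages are plain functions chained in one process. In the pipeline I have in mind, each would be a separate tool reading XMLIN and writing XMLOUT.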

Information formatted as XML allows more generic tools. They can more easily ignore information that an earlier tool produced. Tags that are unexpected will be invisible. When the information is formatted as flat text through STDIN and STDOUT, the tool has to know specifically to ignore columns 3 and 5. XML can make the data more robust.
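A small Python sketch of the difference, using hypothetical tag names and a hypothetical tab-separated layout. The XML reader asks for the one tag it cares about, so extra tags added by an earlier tool do not disturb it. The flat-text reader depends on the column position.

```python
import xml.etree.ElementTree as ET

def works_from_xml(document):
    """Look the field up by tag name; any other tags are simply never touched."""
    record = ET.fromstring(document)
    return record.get("name"), int(record.findtext("works"))

def works_from_columns(line, works_column):
    """Flat text: the caller has to know which column holds the value."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[works_column])

# The same record before and after an earlier tool started adding extra tags.
old_xml = '<user name="a"><works>3</works></user>'
new_xml = '<user name="a"><joined>2009</joined><watchers>12</watchers><works>3</works></user>'

# The same change in flat text moves the value from column 1 to column 3.
old_line = "a\t3"
new_line = "a\t2009\t12\t3"

print(works_from_xml(old_xml), works_from_xml(new_xml))
print(works_from_columns(old_line, 1), works_from_columns(new_line, 3))
```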

One reason for logically separating STDIN/OUT from XMLIN/OUT is that filters may work in either mode. If a standard tool is useful, it will use the data from STDIN. It won’t negate the more general-purpose formatting from XMLIN. Alternatively, STDIN could pass configuration information while XMLIN carries the data.

Implementing two input streams and two output streams in a way that is cooperative is a technical challenge. Opening the extra pipes is easy, but the tools could easily deadlock if it was done incorrectly.
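As a rough, POSIX-only sketch of the "extra pipe" idea in Python: the parent opens one additional pipe, hands the write end to the child as an XMLOUT stand-in, and drains it on a separate thread while it also reads STDOUT. Passing the descriptor number on the command line is my own convention for this example, not an existing standard, and run_with_xmlout is a hypothetical name.

```python
import os
import subprocess
import sys
import threading

# Child program: writes plain text to STDOUT and XML to the extra descriptor.
CHILD = r"""
import os, sys
xmlout = os.fdopen(int(sys.argv[1]), "w")
print("plain text on STDOUT")
xmlout.write("<user name='a'/>\n")
xmlout.close()
"""

def run_with_xmlout(child_source):
    """Run the child and return (stdout_text, xmlout_text)."""
    read_fd, write_fd = os.pipe()
    proc = subprocess.Popen(
        [sys.executable, "-c", child_source, str(write_fd)],
        stdout=subprocess.PIPE,
        pass_fds=(write_fd,),  # keep the write end open in the child
        text=True,
    )
    # The parent must close its copy, or the reader below never sees EOF.
    os.close(write_fd)

    xml_chunks = []

    def drain():
        with os.fdopen(read_fd) as xmlout:
            xml_chunks.append(xmlout.read())

    # Read both streams concurrently; reading them one after the other can
    # deadlock once the child fills the pipe that nobody is reading yet.
    reader = threading.Thread(target=drain)
    reader.start()
    stdout_text, _ = proc.communicate()
    reader.join()
    return stdout_text, "".join(xml_chunks)

if __name__ == "__main__":
    out, xml = run_with_xmlout(CHILD)
    print(out.strip())
    print(xml.strip())
```

The two easy mistakes are the ones marked in the comments: forgetting to close the parent's write end, and reading the streams sequentially instead of concurrently.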

Embedding the XML and free-format output as escaped sections of the normal STDIN/OUT pipe is another possibility.

In a less ambitious project, XMLIN/OUT would just be conceptual and the filter apps would have flags indicating which mode to use.

Original image: P9150078. By Steve Singer [Image license]
Original image: XML. By lambdageek [Image license]
