Teasing Out Top Daily Topics with GDELT’s Television Explorer

Earlier this year, the GDELT Project released its Television Explorer, which enables API access to closed-caption text from television news broadcasts. They’ve done an incredible job expanding and stabilizing the API and just recently released “top trending tables” which summarise what the “top” topics and phrases are across news stations every fifteen minutes. You should… Continue reading

Unbottling “.msg” Files in R

There was a discussion on Twitter about the need to read in “.msg” files using R. The “MSG” file format is one of the many binary abominations created by Microsoft to lock folks and users into their platform and tools. Thankfully, they (eventually) provided documentation for the MSG file format which helped me throw together… Continue reading

R⁶ — Exploring macOS Applications with codesign, Gatekeeper & R

(General reminder about “R⁶” posts in that they are heavy on code-examples, minimal on expository. I try to design them with 2-3 “nuggets” embedded for those who take the time to walk through the code examples on their systems. I’ll always provide further expository if requested in a comment, so don’t hesitate to ask if… Continue reading

Reading PCAP Files with Apache Drill and the sergeant R Package

It’s no secret that I’m a fan of Apache Drill. One big strength of the platform is that it normalizes the access to diverse data sources down to ANSI SQL calls, which means that I can pull data from Parquet, Hive, HBase, Kudu, CSV, JSON, MongoDB and MariaDB with the same SQL syntax. This also… Continue reading

Ten-HUT! The Apache Drill R interface package — sergeant — is now on CRAN

I’m extremely pleased to announce that the sergeant package is now on CRAN or will be hitting your local CRAN mirror soon. sergeant provides JDBC, DBI and dplyr/dbplyr interfaces to Apache Drill. I’ve also wrapped a few goodies into the dplyr custom functions that work with Drill and if you have Drill UDFs that don’t… Continue reading

R⁶ — Disproving Approval

I couldn’t let this stand unchallenged: The new Rasmussen Poll, one of the most accurate in the 2016 Election, just out with a Trump 50% Approval Rating.That’s higher than O’s #’s!— Donald J. Trump (@realDonaldTrump) June 18, 2017 Rasmussen makes their Presidential polling data available for both 🍊 & O. Why not compare their ratings… Continue reading

Replicating the Apache Drill ‘Yelp’ Academic Dataset Analysis with sergeant

The Apache Drill folks have a nice walk-through tutorial on how to analyze the Yelp Academic Dataset with Drill. It’s a bit out of date (the current Yelp data set structure is different enough that the tutorial will error out at various points), but it’s a great example of how to work with large, nested… Continue reading
