Friday, March 02, 2012

Improving Twitter signal-to-noise ratio with TTYtter extensions

I've been using TTYtter as my primary interface to Twitter for almost a year, and it is still great. In fact, the new releases have made it even better. And the depth of configurability is amazing.

One Twitter user that I follow posts some very cool stuff, but also lots of side-chatter that doesn't interest me. Twitter lets users turn off retweets on a per-user basis, but not the messages to other users (indicated by the presence of the "@" symbol). TTYtter's extensibility makes this easily remedied with a short bit of Perl code that is run during the processing of each Twitter status.
$handle = sub {
    my $ref = shift;
    my $text = &descape($ref->{'text'});
    my $name = &descape($ref->{'user'}->{'name'});
    return 0 if (($text =~ /\@/) && ($name =~ /Mr\. Noisy/i));

    &defaulthandle($ref); # actually print the Twitter message
    return 1;
};

The two lines that set $text and $name extract the status text and the display name of the person who posted it. The line after them checks whether the status is from Mr. Noisy and whether it contains an "@". If so, the function returns 0 before reaching the call to defaulthandle, which is what prints the message to the screen.

Then when I run TTYtter, I include the "-exts=..." flag among the command line arguments:
ttytter -exts="/Users/surly/lib/denoise.pl"
And all of Mr. Noisy's Twitter postings that contain the "@" symbol are filtered out.
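
Before loading a filter like this into TTYtter, it can help to try the matching logic on its own. The following is only a rough sketch under some assumptions: is_noise is a hypothetical helper name (it is not part of TTYtter), the mock statuses are invented, and TTYtter's descape is left out so that the script runs standalone.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of TTYtter): returns 1 when a status
# should be hidden, 0 otherwise. It takes the same kind of hash
# reference that $handle receives.
sub is_noise {
    my ($ref) = @_;
    my $text = $ref->{'text'} // '';
    my $name = $ref->{'user'}->{'name'} // '';
    return ($text =~ /\@/ && $name =~ /Mr\. Noisy/i) ? 1 : 0;
}

# Mock statuses, invented purely for illustration.
my @statuses = (
    { text => 'Cool new paper on protein folding', user => { name => 'Mr. Noisy' } },
    { text => '@someone lol, see you at lunch',    user => { name => 'Mr. Noisy' } },
    { text => '@someone thanks for the link',      user => { name => 'Quiet Person' } },
);

for my $s (@statuses) {
    printf "%-4s %s\n", (is_noise($s) ? 'HIDE' : 'SHOW'), $s->{'text'};
}
```

Only the second mock status is marked HIDE: it is the only one that both comes from Mr. Noisy and contains an "@".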

Another option is to show postings from a given user only when they have received a certain number of retweets. Here, the lines that set $rtcount and $sname, and the test that follows them, also show how to match on the user's actual Twitter handle (the screen_name field), which is the more common and convenient name to use:

$handle = sub {
    my $ref = shift;
    my $text = &descape($ref->{'text'});
    my $name = &descape($ref->{'user'}->{'name'});
    return 0 if (($text =~ /\@/) && ($name =~ /Mr\. Noisy/i));
    my $rtcount = &descape($ref->{'retweet_count'});
    my $sname = &descape($ref->{'user'}->{'screen_name'});
    return 0 if (($rtcount < 10) && ($sname =~ /verboseguy/i));

    &defaulthandle($ref); # actually print the Twitter message
    return 1;
};
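
If more than one user needs a retweet threshold, the same idea could be driven by a lookup table. This is a minimal sketch, not something taken from the TTYtter documentation: %min_retweets and passes_threshold are hypothetical names, and the handle is compared exactly (case-insensitively) rather than with the regex match used above.

```perl
use strict;
use warnings;

# Hypothetical per-user thresholds: lowercase screen name => minimum retweets.
my %min_retweets = (
    verboseguy => 10,
);

# Hypothetical helper: returns 1 when the status should be shown.
# Users without an entry in %min_retweets are always shown.
sub passes_threshold {
    my ($ref) = @_;
    my $sname = lc($ref->{'user'}->{'screen_name'} // '');
    my $count = $ref->{'retweet_count'} // 0;
    return 1 unless exists $min_retweets{$sname};
    return $count >= $min_retweets{$sname} ? 1 : 0;
}
```

Inside a $handle routine, this would presumably be used as "return 0 unless passes_threshold($ref);" ahead of the call to defaulthandle.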


I worked out most of this from the TTYtter page on extensions and advanced topics. A partial list of the fields that you can extract and filter on is given in the Twitter documentation on the JSON representation of a single status object. There is a lot of metadata that could conceivably be used for filtering.

Simpler filtering on the text of Twitter statuses can be carried out using the command-line -filter argument, but these extensions allow far more sophisticated ways of controlling the flow of Twitter information to your screen.
Other extensions that I like:
  • deshortify.pl - deobfuscates all of those "t.co" hyperlinks.