Saturday, January 23, 2010

Looking for good text analytics packages to review

Project 1 for my Text Analytics class is to either find a related article and give a 10-15 minute presentation on it for the class, or to do a quick demo of some software package other than what we're using for class (which is SPSS Text Analytics for Surveys).  So far, I've downloaded GATE 5.1 and RapidMiner 5.  I can't really do too much puttering around with them yet; I need to learn what I'm doing before I can even figure out how to properly use them.  Luckily, I was able to sign up for a presentation date late in February, which gives me plenty of time to play around with them and have something constructive to show the class.

This also ties in with my overall interest in nonprofit analytics, especially for smaller organizations.  If we're gathering all this unstructured data, what if there's a wealth of information there that we're not yet putting together?  And what if the excuse for that is that we can't afford the pricey packages out there now?  For that reason, I wanted to make sure that what I demo is free, or at least exceedingly inexpensive (well...also because I'm a broke grad student).  Either it's free--and preferably open source--or I keep on looking.

Anyone have experience with any of these packages I mentioned, or have a suggestion to add to the mix?

Also, I found a couple of articles about text analytics in general:
  • Taco Bell Takes Heat Over 'Drive-Thru Diet' Menu - The quote from this one particularly made me laugh:  "Prior to launch, posts were 73% positive, putting it ahead of beloved chains like Subway, Wendy's and Domino's. Words associated with the brand online were 'love,' 'delicious,' and 'favorite.' Postings are now 67% positive, putting Taco Bell behind White Castle, Blimpie and Arby's, which rank among the category's lower tier. Now three of the words most closely associated with Taco Bell and its campaign have been 'fat,' 'stop,' and 'joke.'"

Tuesday, January 19, 2010

The new semester is going to have me pretty busy

Between my grad assistantship work for 2 different profs, my duties as MSIS rep for the Graduate Business Association, and my classes, I'll be pretty darned busy until May.  My courses are:

Information Systems Analysis and Design
Cyber Security Tech Factors
Java Development
Text Analytics

I'm most interested in the text analytics class, since I'll finally get hands-on experience with SPSS, and the whole topic is a favorite of mine.  Should be a fun few months.