Using Hadoop to Analyze President Barack Obama’s State of the Union Addresses



The State of the Union is the address presented by the President of the United States to a joint session of the United States Congress, typically delivered annually. The goal of this project is to use Hadoop’s MapReduce programming model to analyze all the State of the Union speeches delivered by the current President of the United States, Mr. Barack Obama, and to find the most commonly used words after filtering out common stop words. The results are then presented as a histogram or a similar graph.
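To make the approach concrete, here is a minimal sketch of the word count in Python. It is not the project's actual Hadoop job: it simulates the map, shuffle, and reduce phases in a single process so the logic can be checked locally. The stop-word set, the tokenization rule, and the function names (`mapper`, `reducer`, `run_job`, `top_words`, `histogram`) are all assumptions made for illustration.

```python
import re
from itertools import groupby

# Hypothetical, abbreviated stop-word list; a real run would load the
# full list from the stop-words file.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "we", "our", "that"}

# Assumed tokenization: runs of lowercase letters and apostrophes.
TOKEN = re.compile(r"[a-z']+")


def mapper(line):
    """Map phase: emit (word, 1) for every non-stop word in one line."""
    for word in TOKEN.findall(line.lower()):
        word = word.strip("'")
        if word and word not in STOP_WORDS:
            yield word, 1


def reducer(word, counts):
    """Reduce phase: sum all counts emitted for a single word."""
    return word, sum(counts)


def run_job(lines):
    """Simulate map -> shuffle (sort by key) -> reduce in one process."""
    pairs = sorted(pair for line in lines for pair in mapper(line))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=lambda pair: pair[0])
    )


def top_words(counts, n=10):
    """Return the n most frequent words, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def histogram(counts, n=10):
    """Render the top n words as lines of a plain-text bar chart."""
    return [f"{word:<12} {'#' * count} {count}" for word, count in top_words(counts, n)]


if __name__ == "__main__":
    # Placeholder lines, not quotations from any actual speech.
    sample = ["Jobs and the economy", "The economy, jobs, and energy"]
    print("\n".join(histogram(run_job(sample))))
```

On a cluster, the same `mapper` and `reducer` logic could be split into two scripts that read from standard input and write tab-separated pairs, which is how Hadoop Streaming runs non-Java jobs; the framework then handles the sort between the two phases.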


Speech Data


Stop Words


