The project is purely text analysis, preferably in a Jupyter Notebook with Python 3.6 (each task in a separate notebook cell).
1. Load some data in the form of tweets (the tweet text is already available).
2. Separate the tweets by date, meaning the date of publishing.
3. For all the tweets published on a given day:
A. Remove the URLs
Calculate the pairwise cosine similarity. Save the data in a CSV file.
Calculate the pairwise Jaccard similarity. Save the data in a CSV file.
Calculate the TF-IDF similarity. Save the data in a CSV file.
B. Break each tweet into tokens
Remove stop words
Count the frequency of the top 10 most frequent keywords for every day. Save the data in a CSV file.
Use K-means clustering to cluster the word tokens... Save the number of clusters and the words in
Visualize the clusters and the words in each cluster in the form of a word bubble.
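As a rough illustration of steps 2 and 3 above, here is a minimal sketch using only the Python standard library. It assumes the tweets arrive as (date, text) pairs; the regular expressions, the tiny stop-word list, the sample tweets, and all function names are made up for the example and are not part of the job description.

```python
import csv
import re
from collections import Counter, defaultdict
from itertools import combinations

# Assumed patterns: a simple URL matcher and a lowercase token matcher.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9#@']+")
# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "on", "to", "for", "at", "it"}


def remove_urls(text):
    """Strip URLs from a tweet and trim surrounding whitespace."""
    return URL_RE.sub("", text).strip()


def tokenize(text):
    """Lowercase the text, split it into tokens, and drop stop words."""
    return [t for t in TOKEN_RE.findall(text.lower()) if t not in STOP_WORDS]


def jaccard(a, b):
    """Jaccard similarity of two token lists: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_date(tweets):
    """Map each publishing date to its URL-free tweet texts."""
    by_day = defaultdict(list)
    for date, text in tweets:
        by_day[date].append(remove_urls(text))
    return by_day


def daily_report(tweets):
    """Per day: pairwise Jaccard scores and the 10 most frequent keywords."""
    report = {}
    for day, texts in group_by_date(tweets).items():
        tokens = [tokenize(t) for t in texts]
        pairs = [(i, j, jaccard(tokens[i], tokens[j]))
                 for i, j in combinations(range(len(texts)), 2)]
        top10 = Counter(tok for toks in tokens for tok in toks).most_common(10)
        report[day] = {"jaccard": pairs, "top_keywords": top10}
    return report


def save_top_keywords(report, path):
    """Write one CSV row per (date, keyword, count)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["date", "keyword", "count"])
        for day in sorted(report):
            for word, count in report[day]["top_keywords"]:
                writer.writerow([day, word, count])


# Made-up sample data for the sketch.
sample = [
    ("2018-05-01", "Python is great for text analysis https://example.com/a"),
    ("2018-05-01", "Text analysis in Python is fun"),
    ("2018-05-02", "Clustering tweets with k-means www.example.com"),
]
report = daily_report(sample)
```

Calling `save_top_keywords(report, "top_keywords.csv")` (the file name is only a placeholder) would write the daily keyword counts to disk; the Jaccard pairs could be saved the same way.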
Skills Required: Python, Scikit, Text processing with python, Machine Learning
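Since scikit-learn is among the listed skills, the TF-IDF, cosine similarity, and K-means steps could be sketched with it roughly as follows. This is only one possible reading of the task: it assumes cosine similarity is computed on TF-IDF vectors, and that each word is represented by its column of the tweet-by-word TF-IDF matrix when clustering. The sample texts, function names, and the choice of two clusters are made up for the example.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def tfidf_cosine(texts):
    """Build TF-IDF vectors for one day's tweets and compare every pair."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(texts)  # shape: tweets x words
    return vectorizer, matrix, cosine_similarity(matrix)


def cluster_words(vectorizer, matrix, n_clusters=2):
    """Group words with K-means.

    Assumption: a word's feature vector is its TF-IDF column, so words
    that appear in the same tweets end up close together.
    """
    vocab = vectorizer.vocabulary_  # maps word -> column index
    words = sorted(vocab, key=vocab.get)  # words in column order
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(matrix.T)  # rows are now words
    clusters = {}
    for word, label in zip(words, labels):
        clusters.setdefault(int(label), []).append(word)
    return clusters


# Made-up tweets for a single day, already free of URLs.
texts = [
    "python makes text analysis simple",
    "text analysis with python and pandas",
    "football match tonight",
    "great football match",
]
vectorizer, matrix, sim = tfidf_cosine(texts)
clusters = cluster_words(vectorizer, matrix, n_clusters=2)
```

Here `sim` is a square array with one row and column per tweet, and `clusters` maps each cluster label to its words. Both could be written to CSV, for example with `numpy.savetxt` for the matrix and the `csv` module for the cluster table.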
13 freelancers are bidding on average $56 for this job