Reference

TwitterAnalysis

Main class. It inherits from the TwitterGraphs, TwitterDB, and TwitterTopics classes.

setConfigs([type_of_graph, is_bot_Filter, …])

Configures the current object's settings to drive the automated generation of the analysis files.

concat_edges(G)

Auxiliary function that concatenates edges to help filter in MongoDB.
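
One way to picture the concatenation, as a rough sketch only: each edge is turned into a single string key that could then be matched in a query filter. The helper name `concat_edge_keys`, the separator, and the key format are all assumptions made for illustration; the format the library actually uses is not documented here.

```python
def concat_edge_keys(edges, sep="-"):
    """Join each (source, target) pair into one string key (hypothetical format)."""
    return [f"{source}{sep}{target}" for source, target in edges]


# Example with made-up user ids
keys = concat_edge_keys([("1", "2"), ("2", "3")])
```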

build_db_collections([inc, bots_ids_list_file])

This method is in charge of extracting, cleaning, and loading the data into all the collections in MongoDB.

plot_graph_contracted_nodes(G, file)

Method to compress and plot a graph based on the graph reduction settings, which can be updated using the setConfigs method.

export_mult_types_edges_for_input([…])

This method will export edges from MongoDB data that can be used to create graphs.

nodes_edges_analysis_files(G, path)

Given a graph G, it exports nodes with their degree, edges with their weight, and word clouds representing the nodes scaled by their degree.
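
The node degrees mentioned above can be derived from a weighted edge list, as in the sketch below. It assumes an undirected graph where degree counts incident edges and ignores the weight; `node_degrees` is a hypothetical helper, not a function of this package.

```python
from collections import defaultdict


def node_degrees(weighted_edges):
    """Count incident edges per node from (source, target, weight) tuples."""
    degree = defaultdict(int)
    for source, target, _weight in weighted_edges:
        # Each edge adds one to the degree of both endpoints
        degree[source] += 1
        degree[target] += 1
    return dict(degree)


degrees = node_degrees([("a", "b", 3), ("a", "c", 1)])
```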

lda_analysis_files(path[, startDate_filter, …])

Creates topic model files.

ht_analysis_files(path[, startDate_filter, …])

Creates hashtag frequency files.
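
As an illustration of what a hashtag frequency count involves, here is a minimal sketch using only the standard library. It assumes the hashtags have already been extracted per tweet and that counting is case-insensitive; `hashtag_frequencies` is a hypothetical name, and the file format written by the method is not shown.

```python
from collections import Counter


def hashtag_frequencies(tweets_hashtags):
    """Return (hashtag, count) pairs, most frequent first."""
    counts = Counter(
        tag.lower()                  # assumption: "#Data" and "#data" are the same tag
        for tags in tweets_hashtags  # one list of hashtags per tweet
        for tag in tags
    )
    return counts.most_common()


freqs = hashtag_frequencies([["Python", "data"], ["python"], ["Data", "ml"]])
```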

words_analysis_files(path[, …])

Creates word frequency files.

time_series_files(path[, startDate_filter, …])

Creates time series frequency files.
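
A time series frequency is essentially a count of tweets per time bucket. The sketch below groups by calendar day and assumes the timestamps are ISO 8601 strings; the actual timestamp format and bucket size used by the package may differ, and `daily_counts` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime


def daily_counts(timestamps):
    """Count items per calendar day, returned in chronological order."""
    days = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    # ISO dates sort chronologically when sorted as strings
    return sorted(days.items())


series = daily_counts([
    "2020-03-01T10:00:00",
    "2020-03-01T23:59:59",
    "2020-03-02T00:00:01",
])
```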

graph_analysis_files(G, path[, gr_prefix_nm])

Plot graph analysis files for a given graph G.

edge_files_analysis(output_path)

Automated way to generate all analysis files.

get_time_series_df([ht_arr, …])

Method to query data in MongoDB for time series analysis given certain filters.

eda_analysis()

Method to print a summary of the initial exploratory data analysis for any dataset.

print_top_nodes_cluster_metrics(G, …[, …])

Calculates clustering metrics for top degree nodes.

print_commty_cluster_metrics(G[, comm_att, …])

Calculates clustering metrics for the communities in the graph.

ht_connection_files(path[, …])

plot_top_ht_timeseries(top_no_start, …[, …])

plot_timeseries(df, arr_columns, file)

TwitterDB

TwitterDB class

setFocusedDataConfigs(strFocusedTweetFields, …)

Twitter documents have an extensive number of fields.

loadDocFromFile(directory)

This method will load tweet .json files into the DB (tweet collection). It goes through all .json files in the directory and loads them one by one.
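
The directory walk described above can be sketched as follows. This assumes one JSON document per file and returns the parsed documents instead of inserting them into MongoDB; `load_json_docs` is a hypothetical helper and not the package's implementation.

```python
import json
from pathlib import Path


def load_json_docs(directory):
    """Parse every .json file in `directory`, one file at a time."""
    docs = []
    # sorted() keeps the load order deterministic across runs
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            docs.append(json.load(fh))
    return docs
```

In a real pipeline each parsed document would be written to the tweet collection rather than collected in a list.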

search7dayapi(consumer_key, consumer_secret, …)

Sends requests to the 7-Day search API and saves the data into MongoDB.

searchPremiumAPI(twitter_bearer, api_name, …)

Sends requests to the Premium search API and saves the data into MongoDB.

create_bat_file_apisearch(mongoDBServer, …)

The method will create two files: a Python script containing the code necessary to make the requests, and a .bat file that can be used to schedule the call of the Python script.
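
The two-file pattern can be sketched like this. The file names, the placeholder script body, and the `python` command in the batch file are assumptions for illustration; `write_scheduler_files` is a hypothetical helper, and the content the package actually generates is not reproduced here.

```python
from pathlib import Path


def write_scheduler_files(out_dir, script_name="api_search.py"):
    """Write a Python script and a .bat file that runs it; return both paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    script_path = out / script_name
    bat_path = script_path.with_suffix(".bat")

    # Placeholder body: the real script would hold the API request code
    script_path.write_text("# API request code goes here\n", encoding="utf-8")
    # The batch file only launches the script, so a scheduler can call it
    bat_path.write_text(f'python "{script_path}"\n', encoding="utf-8")
    return script_path, bat_path
```

The resulting .bat file can then be registered with a scheduler such as Windows Task Scheduler.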

loadFocusedData(inc)

Method to load focused data into MongoDB based on the configurations set in setFocusedDataConfigs.

loadUsersData(inc, user_type_filter)

Method to load user data into MongoDB.

loadTweetHashTags(inc)

Method to load hashtags into a separate collection in MongoDB.

loadTweetConnections(inc)

Method to load tweet connections into a separate collection in MongoDB.

loadTweetHTConnections(inc)

Method to load hashtag connections into a separate collection in MongoDB.

loadWordsData(inc)

Method to load the tweet words into a separate collection in MongoDB.

loadAggregations(aggType)

Method to load additional aggregated collections into MongoDB. It creates the tweetWords collection.

set_bot_flag_based_on_arr(bots_list_id_str)

Method to update the MongoDB collection with a flag identifying whether a user is a bot.

cleanTweetText(text)

Method used to clean the tweet message.
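
The exact cleaning rules are not listed here, so the following is only a sketch of typical tweet cleaning. It assumes that URLs, @mentions, and punctuation are removed, hashtags are kept, and the text is lowercased; `clean_tweet_text_sketch` is a hypothetical stand-in, not the method's actual logic.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
PUNCT_RE = re.compile(r"[^\w\s#]")  # anything except word chars, whitespace, '#'


def clean_tweet_text_sketch(text):
    """Strip URLs, mentions, and punctuation; lowercase; collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = PUNCT_RE.sub(" ", text)
    return " ".join(text.lower().split())


cleaned = clean_tweet_text_sketch("RT @user: Check this out! https://t.co/abc #Data")
```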

exportData(exportType, filepath, inc[, …])

Method to export the data from MongoDB into text files based on certain filters.

queryData(exportType, filepath, inc[, …])

Method to query the data from MongoDB.

TwitterGraphs

loadGraphFromFile(edge_file)

plot_graph_att_distr(G, att[, title, …])

plot_disconnected_graph_distr(G[, file, …])

contract_nodes_commty_per(G, perc[, …])

plotSpringLayoutGraph(G, v_graph_name, …)

largest_component_no_self_loops(G)

export_nodes_edges_to_file(G, node_f_name, …)

create_node_subgraph(G, node)

get_top_degree_nodes(G, top_degree_start, …)

calculate_spectral_clustering_labels(G, k[, …])

calculate_spectral_clustering(G[, k, …])

calculate_louvain_clustering(G)

calculate_separability(G_Community, G_All)

calculate_density(G_Community)

calculate_average_clustering_coef(G_Community)

calculate_cliques(G)

calculate_power_nodes_score(G[, top_no])

calculate_average_node_degree(G)

print_cluster_metrics(G_Community, G_All[, …])

eigenDecomposition(af_matrix[, bln_plot, topK])

remove_edges(G, min_degree_no)

remove_edges_eithernode(G, min_degree_no)

print_Measures(G[, blnCalculateDimater, …])

TwitterTopics

get_coh_u_mass()

get_coh_c_v()

get_docs_from_file(file_path)

clean_docs(doc[, delete_numbers, …])

train_model(topic_docs, num_topics, model_name)

train_model_from_file(file_path, num_topics, …)

plot_topics(file_name, no_of_topics[, …])

read_freq_list_file(file_path[, delimiter])

plot_top_freq_list(fr_list, top_no, ylabel)

plot_word_cloud(fr_list[, file, …])