ta.TwitterAnalysis.setConfigs

TwitterAnalysis.setConfigs(type_of_graph='user_conn_all', is_bot_Filter=None, period_arr=None, create_nodes_edges_files_flag='Y', create_graphs_files_flag='Y', create_topic_model_files_flag='Y', create_ht_frequency_files_flag='Y', create_words_frequency_files_flag='Y', create_timeseries_files_flag='Y', create_top_nodes_files_flag='Y', create_community_files_flag='N', create_ht_conn_files_flag='Y', num_of_topics=4, top_no_word_filter=None, top_ht_to_ignore=None, graph_plot_cutoff_no_nodes=500, graph_plot_cutoff_no_edges=2000, create_graph_without_node_scale_flag='N', create_graph_with_node_scale_flag='Y', create_reduced_graph_flag='Y', reduced_graph_comty_contract_per=90, reduced_graph_remove_edge_weight=None, reduced_graph_remove_edges='Y', top_degree_start=1, top_degree_end=10, period_top_degree_start=1, period_top_degree_end=5, commty_edge_size_cutoff=200)

Configure the settings of the current object that drive the automated creation of the analysis files.

Parameters

type_of_graph ((Optional)) – This setting defines the type of graph to analyze. Six different options are available: user_conn_all, user_conn_retweet, user_conn_quote, user_conn_reply, user_conn_mention, and ht_conn. (Default=’user_conn_all’)
is_bot_Filter ((Default=None))
period_arr ((Optional)) – An array of start and end dates can be set so that the pipeline creates a separate analysis folder for each of the periods in the array. (Default=None)
create_nodes_edges_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create two files for each graph and sub-graph: one file with the edge list, and one with the node list and their respective degrees. (Default=’Y’)
create_graphs_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will plot the graph showing all the connections. (Default=’Y’)
create_topic_model_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create topic discovery related files for each folder. It will create a text file with all the tweets that are part of that folder, train an LDA model on the tweet texts, and plot a graph with the results. (Default=’Y’)
create_ht_frequency_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create hashtag frequency files for each folder. It will create a text file with the full list of hashtags and their frequency, a wordcloud showing the most frequently used hashtags, and barcharts showing the top 30 hashtags. (Default=’Y’)
create_words_frequency_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create word frequency files for each folder. It will create a text file with a list of words and their frequency, a wordcloud showing the most frequently used words, and barcharts showing the top 30 words. (Default=’Y’)
create_timeseries_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create timeseries graphs for each folder representing the tweet count by day, and the top hashtags frequency count by day. (Default=’Y’)
create_top_nodes_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will create separate analysis folders for all the top degree nodes. (Default=’Y’)
create_community_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will use the louvain method to assign each node to a community. A separate folder for each of the communities will be created with all the analysis files. (Default=’N’)
create_ht_conn_files_flag ((Optional)) – If this setting is set to ‘Y’, the pipeline will plot hashtag connections graphs. This can be used when user connections are being analyzed, but it could still be interesting to see the hashtags connections made by that group of users. (Default=’Y’)
num_of_topics ((Optional)) – If the setting CREATE_TOPIC_MODEL_FILES_FLAG was set to ‘Y’, then this number will be passed to the LDA model as the number of topics. If no number is given, the pipeline will use 4 as the default value. (Default=4)
top_no_word_filter ((Optional)) – If the setting CREATE_WORDS_FREQUENCY_FILES_FLAG was set to ‘Y’, then this number will be used to decide how many words will be saved in the word frequency list text file. If no number is given, the pipeline will use 5000 as the default value. (Default=None)
top_ht_to_ignore ((Optional)) – If the setting CREATE_HT_CONN_FILES_FLAG was set to ‘Y’, then this number will be used to choose how many top hashtags can be ignored. Sometimes ignoring the main hashtag can be helpful in visualizations to discover other interesting structures within the graph. (Default=None)
graph_plot_cutoff_no_nodes ((Optional)) – Used with the graph_plot_cutoff_no_edges parameter. For each graph created, these numbers will be used as cutoff values to decide whether a graph is too large to be plotted. Choosing a large number can cause the plot to take a long time to run. Choosing a small number can result in graphs that are too reduced to be of much value, or even graphs that can’t be plotted at all because they can’t be reduced further. (Default=500)
graph_plot_cutoff_no_edges ((Optional)) – Used with the graph_plot_cutoff_no_nodes parameter. For each graph created, these numbers will be used as cutoff values to decide whether a graph is too large to be plotted. Choosing a large number can cause the plot to take a long time to run. Choosing a small number can result in graphs that are too reduced to be of much value, or even graphs that can’t be plotted at all because they can’t be reduced further. (Default=2000)
create_graph_without_node_scale_flag ((Optional)) – For each graph created, if this setting is set to ‘Y’, the pipeline will try to plot the full graph with no reduction and without any logic for scaling the node size. (Default=’N’)
create_graph_with_node_scale_flag ((Optional)) – For each graph created, if this setting is set to ‘Y’, the pipeline will try to plot the full graph with no reduction, but with additional logic for scaling the node size. (Default=’Y’)
create_reduced_graph_flag ((Optional)) – For each graph created, if this setting is set to ‘Y’, the pipeline will try to plot the reduced form of the graph. (Default=’Y’)
reduced_graph_comty_contract_per ((Optional)) – If the setting CREATE_REDUCED_GRAPH_FLAG was set to ‘Y’, then this number will be used to reduce the graphs by removing a percentage of each community found in that particular graph. The logic can be run multiple times with different percentages. For each time, a new graph file will be saved with a different name according to the parameter given. (Default=90)
reduced_graph_remove_edge_weight ((Optional)) – If the setting CREATE_REDUCED_GRAPH_FLAG was set to ‘Y’, then this number will be used to reduce the graphs by removing edges that have weights smaller than this number. The logic can be run multiple times with different percentages. For each time, a new graph file will be saved with a different name according to the parameter given. (Default=None)
reduced_graph_remove_edges ((Optional)) – If this setting is set to ‘Y’, and the setting CREATE_REDUCED_GRAPH_FLAG was set to ‘Y’, then the pipeline will continuously try to reduce the graphs by removing edges of nodes with degrees smaller than this number. It will stop the graph reduction once it hits the values set in the GRAPH_PLOT_CUTOFF parameters. (Default=’Y’)
top_degree_start ((Optional)) – If the setting CREATE_TOP_NODES_FILES_FLAG was set to ‘Y’, then these numbers will define how many top degree node sub-folders to create. (Default=1)
top_degree_end ((Optional)) – If the setting CREATE_TOP_NODES_FILES_FLAG was set to ‘Y’, then these numbers will define how many top degree node sub-folders to create. (Default=10)
period_top_degree_start ((Optional)) – If the setting CREATE_TOP_NODES_FILES_FLAG was set to ‘Y’, then these numbers will define how many top degree node sub-folders for each period to create. (Default=1)
period_top_degree_end ((Optional)) – If the setting CREATE_TOP_NODES_FILES_FLAG was set to ‘Y’, then these numbers will define how many top degree node sub-folders for each period to create. (Default=5)
commty_edge_size_cutoff ((Optional)) – If the setting CREATE_COMMUNITY_FILES_FLAG was set to ‘Y’, then this number will be used as the community size cutoff number. Any communities that have fewer nodes than this number will be ignored. If no number is given, the pipeline will use 200 as the default value. (Default=200)
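Most of the settings above are switches that take ‘Y’ or ‘N’. The sketch below is a hypothetical pre-check, not part of the library, that lists any keyword ending in _flag whose value is something else. It assumes the pipeline expects the uppercase letters exactly as documented, which this page does not confirm, and it does not cover reduced_graph_remove_edges, which also takes ‘Y’ but lacks the _flag suffix.

```python
# Hypothetical helper (not part of the library): report *_flag settings
# whose value is not exactly 'Y' or 'N'.
def find_invalid_flags(configs):
    """Return the sorted names of flag settings that are not 'Y' or 'N'."""
    return sorted(
        name
        for name, value in configs.items()
        if name.endswith("_flag") and value not in ("Y", "N")
    )


# Example settings; the values are illustrative only.
configs = {
    "type_of_graph": "user_conn_retweet",
    "create_community_files_flag": "Y",
    "create_topic_model_files_flag": "y",  # lowercase, so it is reported
    "num_of_topics": 6,
}

invalid = find_invalid_flags(configs)
print(invalid)
```

Running a check like this before calling setConfigs may catch a mistyped flag early, rather than after the pipeline has silently skipped a set of output files.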

Examples

…:

>>> setConfigs(type_of_graph=TYPE_OF_GRAPH,
...             is_bot_Filter=IS_BOT_FILTER,
...             period_arr=PERIOD_ARR,
...             create_nodes_edges_files_flag=CREATE_NODES_EDGES_FILES_FLAG,
...             create_graphs_files_flag=CREATE_GRAPHS_FILES_FLAG,
...             create_topic_model_files_flag=CREATE_TOPIC_MODEL_FILES_FLAG,
...             create_ht_frequency_files_flag=CREATE_HT_FREQUENCY_FILES_FLAG,
...             create_words_frequency_files_flag=CREATE_WORDS_FREQUENCY_FILES_FLAG,
...             create_timeseries_files_flag=CREATE_TIMESERIES_FILES_FLAG,
...             create_top_nodes_files_flag=CREATE_TOP_NODES_FILES_FLAG,
...             create_community_files_flag=CREATE_COMMUNITY_FILES_FLAG,
...             create_ht_conn_files_flag=CREATE_HT_CONN_FILES_FLAG,
...             num_of_topics=NUM_OF_TOPICS,
...             top_no_word_filter=TOP_NO_WORD_FILTER,
...             top_ht_to_ignore=TOP_HT_TO_IGNORE,
...             graph_plot_cutoff_no_nodes=GRAPH_PLOT_CUTOFF_NO_NODES,
...             graph_plot_cutoff_no_edges=GRAPH_PLOT_CUTOFF_NO_EDGES,
...             create_graph_without_node_scale_flag=CREATE_GRAPH_WITHOUT_NODE_SCALE_FLAG,
...             create_graph_with_node_scale_flag=CREATE_GRAPH_WITH_NODE_SCALE_FLAG,
...             create_reduced_graph_flag=CREATE_REDUCED_GRAPH_FLAG,
...             reduced_graph_comty_contract_per=REDUCED_GRAPH_COMTY_PER,
...             reduced_graph_remove_edge_weight=REDUCED_GRAPH_REMOVE_EDGE_WEIGHT,
...             reduced_graph_remove_edges=REDUCED_GRAPH_REMOVE_EDGES_UNTIL_CUTOFF_FLAG,
...             top_degree_start=TOP_DEGREE_START,
...             top_degree_end=TOP_DEGREE_END,
...             period_top_degree_start=PERIOD_TOP_DEGREE_START,
...             period_top_degree_end=PERIOD_TOP_DEGREE_END,
...             commty_edge_size_cutoff=COMMTY_EDGE_SIZE_CUTOFF
...             )
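
Because every parameter has a default, a call only needs to name the settings that differ. The following is a minimal sketch of that pattern with literal values in place of the upper-case placeholders. The names retweet_configs and apply_configs are hypothetical, the chosen values are illustrative only, and the sketch assumes a TwitterAnalysis object has already been created elsewhere.

```python
# Only the settings that differ from the documented defaults; everything
# omitted keeps the default shown in the signature.
retweet_configs = {
    "type_of_graph": "user_conn_retweet",
    "create_community_files_flag": "Y",
    "num_of_topics": 6,
    "graph_plot_cutoff_no_nodes": 300,
    "graph_plot_cutoff_no_edges": 1000,
    "top_degree_start": 1,
    "top_degree_end": 5,
}


def apply_configs(analysis, configs):
    """Pass the chosen settings to setConfigs as keyword arguments.

    `analysis` is assumed to be an existing TwitterAnalysis object; how it
    is constructed is outside the scope of this sketch.
    """
    analysis.setConfigs(**configs)


# Usage, once an object exists (my_analysis is a placeholder name):
# apply_configs(my_analysis, retweet_configs)
```

Keeping the overrides in a dictionary makes it straightforward to store several named configurations and apply each in turn.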