ta.TwitterAnalysis.build_db_collections
TwitterAnalysis.build_db_collections(inc=100000, bots_ids_list_file=None)
Extracts, cleans, and loads the data into all the collections in MongoDB.
Parameters
inc (Optional) – the number of tweets processed at a time (Default=100000). A large value may cause out-of-memory errors, while a small value may make the run take a long time, so choose it based on the available hardware.
bots_ids_list_file (Optional) – a file that contains a list of the IDs of users that are bots. It is used to create flags in the MongoDB collections that identify which tweets and users are in the bots list (Default=None).
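To make the bot flag concrete, here is a minimal sketch of how a list of bot user IDs could be used to mark tweets. It does not reproduce the library's code: the function flag_bots, the field names user_id and is_bot, and the in-memory list of IDs are all assumptions made for illustration, and the actual file format and MongoDB schema may differ.

```python
def flag_bots(tweets, bot_ids):
    """Return copies of the tweet dicts with a hypothetical 'is_bot' flag.

    'user_id' and 'is_bot' are illustrative field names, not the
    library's actual schema.
    """
    # Normalize IDs to strings so "42" and 42 compare equal.
    bot_id_set = {str(i) for i in bot_ids}
    return [
        {**tweet, "is_bot": str(tweet["user_id"]) in bot_id_set}
        for tweet in tweets
    ]


# Two toy tweets; only the first user is in the (assumed) bots list.
tweets = [{"id": 1, "user_id": "42"}, {"id": 2, "user_id": "7"}]
flagged = flag_bots(tweets, ["42"])
```

Using a set for the lookup keeps each membership check constant-time, which matters when the bots list is long.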
Examples
Load all data into all collections in MongoDB:
>>> inc = 50000
>>> build_db_collections(inc)
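The trade-off behind inc can be illustrated with a small batching sketch. This is not the library's implementation: iter_batches is a hypothetical helper that only shows how a batch size bounds the number of items held in memory at once, at the cost of more iterations when the size is small.

```python
def iter_batches(items, inc=100000):
    """Yield consecutive slices of 'items' with at most 'inc' elements each.

    Hypothetical helper for illustration; not part of the library.
    """
    if inc <= 0:
        raise ValueError("inc must be a positive integer")
    for start in range(0, len(items), inc):
        yield items[start:start + inc]


# 250 toy records processed 100 at a time: two full batches and a remainder.
records = list(range(250))
batch_sizes = [len(batch) for batch in iter_batches(records, inc=100)]
```

A larger inc means fewer, bigger batches (more memory per step); a smaller inc means more, smaller batches (more round trips).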