twython - Simple Python script using too much CPU

- June 15, 2012

I was told off by my VPS provider because my Python script was using too much CPU (apparently the script had been utilising an entire core for a few hours).

My script uses the Twython library to stream tweets:

    # Assumes: from datetime import datetime
    def on_success(self, data):
        # Only messages that carry a 'text' field are tweets.
        if 'text' in data:
            self.counter += 1
            self.tweetdatabase.save(tweet(data))

            # we want to commit when we have a batch
            if self.counter >= 1000:
                print("{0}: committing {1} tweets".format(datetime.now(), self.counter))
                self.counter = 0
                self.tweetdatabase.commit()

tweet is a class whose job is to throw away the tweet metadata I do not need:

    import datetime

    class tweet():

        def __init__(self, json):
            # Keep only the fields of interest from the raw tweet dict.
            self.user = {"id": json.get('user').get('id_str'), "name": json.get('user').get('name')}
            # strptime directives are case-sensitive: %H:%M:%S for the time, %Y for the year.
            self.timestamp = datetime.datetime.strptime(json.get('created_at'), '%a %b %d %H:%M:%S %z %Y')
            self.coordinates = json.get('coordinates')
            self.tweet = {
                "id": json.get('id_str'),
                "text": json.get('text').split('#')[0],
                "entities": json.get('entities'),
                "place": json.get('place')
            }

            self.favourite = json.get('favorite_count')
            self.retweet = json.get('retweet_count')
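The timestamp parsing in the class above depends on the exact case of each strptime directive. As a minimal sketch, assuming created_at looks like the sample string below (the sample value and the helper name parse_created_at are illustrative, not taken from the original script):

```python
from datetime import datetime, timezone

# Assumed sample value in the layout used for `created_at`.
SAMPLE_CREATED_AT = "Wed Aug 27 13:08:45 +0000 2008"

# Directive letters are case-sensitive: %H/%M/%S are hour/minute/second,
# %z is the UTC offset and %Y is the four-digit year.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a `created_at` string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)


parsed = parse_created_at(SAMPLE_CREATED_AT)
print(parsed.isoformat())  # 2008-08-27T13:08:45+00:00
```

Note that %a and %b match English day and month names only under an English or C locale.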

It also has a __str__ method that returns a super compact string representation of the object.

tweetdatabase.commit() saves the tweets to a file, while tweetdatabase.save() appends a tweet to a list:

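    def save(self, tweet):
        # Buffer the compact string form in memory until the next commit.
        self.tweets.append(tweet.__str__())

    def commit(self):
        # Append the whole batch to the file in a single write.
        # No trailing newline is written, so the last tweet of one batch
        # and the first tweet of the next end up on the same line.
        with open(self.path, mode='a', encoding='utf-8') as f:
            f.write('\n'.join(self.tweets))

        self.tweets = []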

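What's the best way to keep the CPU usage low? If I sleep, I will be losing tweets for the time the program spends not listening to Twitter's API. Despite this, I tried sleeping for a second after the program writes to the file, and it did nothing to bring the CPU usage down. For the record, it saves to the file every 1000 tweets, which is roughly once a minute.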

Many thanks.

Try checking whether you need to commit first in on_success(). Then check whether the tweet has the data you want to save. You might also want to consider race conditions on the self.counter variable; the update to self.counter should probably be wrapped in a mutex or similar.
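The mutex suggestion could look roughly like the sketch below. It is only an illustration: BatchingStore is a hypothetical stand-in for the tweet database, an in-memory list stands in for the file, and whether Twython ever calls on_success from more than one thread is an assumption rather than something established here.

```python
import threading


class BatchingStore:
    """Hypothetical stand-in for the tweet database: buffers and flushes in batches."""

    def __init__(self, batch_size=1000):
        self.batch_size = batch_size
        self.pending = []
        self.committed = []  # stands in for the file on disk
        self._lock = threading.Lock()

    def save(self, line):
        # The lock guards both the buffer and the batch-size check,
        # so no separate counter can drift out of sync with the buffer.
        with self._lock:
            self.pending.append(line)
            if len(self.pending) >= self.batch_size:
                self._commit_locked()

    def _commit_locked(self):
        # Caller must already hold self._lock.
        self.committed.extend(self.pending)
        self.pending = []


def on_success(store, data):
    # Cheap membership test first: skip anything without tweet text.
    if 'text' not in data:
        return
    store.save(data['text'])


def _worker(store, count):
    for i in range(count):
        on_success(store, {'text': 'tweet {0}'.format(i)})


store = BatchingStore(batch_size=100)
threads = [threading.Thread(target=_worker, args=(store, 500)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Using the length of the buffer as the batch counter removes the separate self.counter field, which leaves one less piece of shared state to protect.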
