Try/catch or type conversion performance in Julia (Julia 73 seconds, Python 0.5 seconds)


I have been playing with Julia because it seems syntactically similar to Python (which I like) but claims to be faster. However, I tried making a script similar to one I have in Python for testing for numerical values within a text file, which uses this function:

    function isfloat(s)
        try:
            float64(s)
            return true
        catch:
            return false
        end
    end

For some reason, this takes a great deal of time for a text file with a reasonable number of rows of text (~500,000).
Why would this be? Is there a better way to do this? What general feature of the language can I understand from this and apply to other languages?

Here are the two exact scripts I ran, with times for reference:

Python: ~0.5 seconds

    import time
    import numpy as np

    def is_number(s):
        # A field counts as numeric if NumPy can convert it to a float64.
        try:
            np.float64(s)
            return True
        except ValueError:
            return False

    start = time.time()
    file_data = open('smw100.asc').readlines()
    file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)

    bools = [(all(map(is_number, x)), x) for x in file_data]
    print(time.time() - start)

Julia: ~73.5 seconds

    start = time()
    function isfloat(s)
        try:
            float64(s)
            return true
        catch:
            return false
        end
    end
    x = map(x-> split(replace(x, ",", " ")), open(readlines, "smw100.asc"))

    u = [(all(map(isfloat, i)), i) for i in x]

    print(start - time())

Note that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.

Note also that the colons (:) after try and catch in your isfloat code are wrong in Julia (this is a Pythonism).

A faster version of your code should be:

    # Preallocated one-element array that float64_isvalid writes the parsed value into.
    const isfloat2_out = [1.0]
    isfloat2(s::String) = float64_isvalid(s, isfloat2_out)

    function foo(l)
        x = split(l, ",")
        (all(isfloat2, x), x)
    end

    u = map(foo, open(readlines, "smw100.asc"))

On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
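
For anyone who wants to try a comparable benchmark, here is a minimal sketch of how such a sample file could be generated. It is not the generator used for the timings above: the function name write_sample_file, the output path, the "abc" placeholder token, and the value range are all assumptions made for illustration. Fields are comma-separated to match the split(l, ",") call in the Julia code.

```python
import random


def write_sample_file(path, rows=100_000, cols=10, valid_fraction=0.5, seed=0):
    """Write a comma-separated text file in which roughly `valid_fraction`
    of the fields parse as floating-point numbers (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same file
    with open(path, "w") as f:
        for _row in range(rows):
            fields = [
                # repr() of a float always parses back as a float
                repr(rng.uniform(-1000.0, 1000.0))
                if rng.random() < valid_fraction
                else "abc"  # assumed non-numeric placeholder
                for _col in range(cols)
            ]
            f.write(",".join(fields) + "\n")
```

Calling write_sample_file("sample.asc") with the defaults would produce a file shaped like the one described above (100,000 rows, 10 columns, about half of the fields numeric). Absolute timings will of course differ by machine and language version.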

