Try/catch or type conversion performance in Julia (Julia 73 seconds, Python 0.5 seconds)
I have been playing with Julia because it seems syntactically similar to Python (which I like) but claims to be faster. However, I tried making a script similar to one I have in Python for testing numerical values within a text file, which uses this function:
    function isfloat(s)
        try:
            float64(s)
            return true
        catch:
            return false
        end
    end
For some reason, this takes a great deal of time for a text file with a reasonable number of rows of text (~500,000).
Why would this be? Is there a better way to do this? Is it a general feature of the language that I can understand and apply to other languages?
Here are the 2 exact scripts I ran, with their times for reference:
Python: ~0.5 seconds
    import time
    import numpy as np

    def is_number(s):
        try:
            np.float64(s)
            return True
        except ValueError:
            return False

    start = time.time()
    file_data = open('smw100.asc').readlines()
    file_data = map(lambda line: line.rstrip('\n').replace(',',' ').split(), file_data)
    bools = [(all(map(is_number, x)), x) for x in file_data]
    print time.time() - start
Julia: ~73.5 seconds
    start = time()
    function isfloat(s)
        try:
            float64(s)
            return true
        catch:
            return false
        end
    end
    x = map(x-> split(replace(x, ",", " ")), open(readlines, "smw100.asc"))
    u = [(all(map(isfloat, i)), i) for i in x]
    print(start - time())
Note that you can use the float64_isvalid function in the standard library to (a) check whether a string is a valid floating-point value and (b) return the value.
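For comparison, the same check-and-return idea can be sketched in Python. This is only an illustration: try_parse_float is a hypothetical helper name, not a library function, and it uses the built-in float rather than np.float64 so that it is self-contained.

```python
def try_parse_float(s):
    """Return (True, value) if s parses as a float, else (False, None).

    Hypothetical helper: one call both validates the string and hands
    back the parsed value, so the caller never parses twice.
    """
    try:
        return True, float(s)
    except ValueError:
        return False, None


# Usage: unpack the flag and the value together.
ok, value = try_parse_float("3.14")
bad, nothing = try_parse_float("abc")
```

The caller gets the validity flag and the parsed number from a single pass over the string, which is the same role the output array plays in the Julia call.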
Note also that the colons (:) after try and catch in your isfloat code are wrong in Julia (this is a Pythonism).
A faster version of your code should be:
    const isfloat2_out = [1.0]
    isfloat2(s::String) = float64_isvalid(s, isfloat2_out)
    function foo(l)
        x = split(l, ",")
        (all(isfloat2, x), x)
    end
    u = map(foo, open(readlines, "smw100.asc"))
On my machine, for a sample file with 100,000 rows and 10 columns of data, 50% of which are valid numbers, your Python code takes 4.21 seconds and my Julia code takes 2.45 seconds.
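If you want to try a comparison like this yourself, here is a rough sketch of how similar test data could be generated and timed in Python. The exact layout of the sample file above is not given, so the details are assumptions: make_rows and classify are hypothetical names, "abc" stands in for an invalid field, the rows are kept in memory instead of being written to a file, and the built-in float replaces np.float64.

```python
import random
import time


def make_rows(n_rows=100000, n_cols=10, valid_fraction=0.5, seed=0):
    """Build comma-separated rows where roughly valid_fraction of the
    fields are valid numbers (assumed layout, not the original file)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        fields = []
        for _ in range(n_cols):
            if rng.random() < valid_fraction:
                fields.append(repr(rng.uniform(-1000.0, 1000.0)))
            else:
                fields.append("abc")  # placeholder for a non-numeric field
        rows.append(",".join(fields))
    return rows


def is_number(s):
    """Exception-based check, as in the question's Python script."""
    try:
        float(s)
        return True
    except ValueError:
        return False


def classify(rows):
    """Pair each row's fields with a flag saying whether all are numeric."""
    result = []
    for row in rows:
        fields = row.split(",")
        result.append((all(is_number(f) for f in fields), fields))
    return result


if __name__ == "__main__":
    rows = make_rows()
    start = time.time()
    classify(rows)
    print(time.time() - start)
```

Timings from a sketch like this depend heavily on the machine and on the fraction of invalid fields, since every invalid field raises and catches an exception.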