Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I have to check presence of millions of elements (20-30 letters str) in the list containing 10-100k of those elements. Is there faster way of doing that in python than set() ?

import sys
#load ids
ids = set( x.strip() for x in open(idfile) )

for line in sys.stdin:
    id=line.strip()
    if id in ids:
        #print fastq
        print id
        #update ids
        ids.remove( id )
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
1.0k views
Welcome To Ask or Share your Answers For Others

1 Answer

set is as fast as it gets.

However, if you rewrite your code to create the set once, and not change it, you can use the frozenset built-in type. It's exactly the same except immutable.

If you're still having speed problems, you need to speed your program up in other ways, such as by using PyPy instead of cPython.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...