Reservoir sampling a list

Getting K random elements, each equally likely to be picked, from a list (or any iterable) of unknown size in a single pass:

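import random

def random_subset(iterator, K):
    """Return K items drawn uniformly at random from iterator, in one pass."""
    result = []
    N = 0  # number of items seen so far

    for item in iterator:
        N += 1
        if len(result) < K:
            # Fill the reservoir with the first K items.
            result.append(item)
        else:
            # Keep the N-th item with probability K/N, overwriting a random slot.
            s = random.randrange(N)
            if s < K:
                result[s] = item

    return result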
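
A quick way to convince yourself the function behaves as intended is to run it on a generator (which has no `len()`) and count how often each item is selected. The sketch below is only an illustration: the `lines()` generator, the seed, and the trial count are made up for the example, and the function is repeated (with `random.randrange` for the index draw) so the snippet runs on its own. With K = 3 and 10 items, each item should show up in roughly 3/10 of the samples.

```python
import random
from collections import Counter

def random_subset(iterator, K):
    # Copy of the routine from this note, repeated so the snippet is self-contained.
    result = []
    N = 0
    for item in iterator:
        N += 1
        if len(result) < K:
            result.append(item)
        else:
            s = random.randrange(N)  # uniform integer in 0..N-1
            if s < K:
                result[s] = item
    return result

def lines():
    # Hypothetical stream: a generator, so its size is not known up front.
    for i in range(10):
        yield f"line-{i}"

random.seed(0)  # fixed seed only so repeated runs are comparable
sample = random_subset(lines(), 3)

# Empirical check: over many trials, each of the 10 items should be
# selected in about K/N = 3/10 of the samples.
trials = 20000
counts = Counter()
for _ in range(trials):
    counts.update(random_subset(lines(), 3))
frequencies = {item: counts[item] / trials for item in sorted(counts)}
```

If the stream yields fewer than K items, the function simply returns all of them.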

Got it from here