This repo offers implementations of three random sampling algorithms: simple reservoir sampling, slightly optimized reservoir sampling, and weighted reservoir sampling. Reservoir sampling randomly selects k samples from a collection of n elements in a single pass, keeping only k elements in memory at any time. It is most useful when the collection is too large to fit in memory or when n is not known in advance.
To learn more, check out my blog post!
This repo provides two binaries (ssampling and rsampling) that users can run out of the box. It can also be imported as a package and extended.
Installing the binaries:
$ go get -u github.com/epicchewy/sampling/cmd/...
Running the sampler:
$ seq 1 100 | rsampling --samples 5
27
37
85
80
45
Importing the package for use in a project:
$ go get -u github.com/epicchewy/sampling
Sample Usage:
package main

import (
	"fmt"

	"github.com/epicchewy/sampling"
)

func main() {
	var items []int
	for i := 0; i < 100; i++ {
		items = append(items, i)
	}
	// use available iterators or pass in a custom one that implements the Iterator interface
	itr := sampling.NewIntInterator(items)
	// draw 10 samples with the simple reservoir
	r := sampling.NewSimpleReservoir(10, itr)
	samples := r.Sample()
	fmt.Println(samples) // print the result so the variable is used
}