Implementing sample_n and sample_frac

We could implement a chunk-wise sample_n / sample_frac with:

``` r
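library(tidyverse)
library(chunked)

# Build a large example file: 1000 copies of iris written to a temporary CSV.
# (purrr::rerun() is deprecated as of purrr 1.0.0, so replicate() is used here.)
big <- replicate(1000, iris, simplify = FALSE) %>% bind_rows()
path <- tempfile()
write_csv(big, path)

# S3 method for chunkwise objects: record the sample_n() call so that
# it is applied to every chunk as the file is read.
sample_n.chunkwise <- function(.data, size){
  cmd <- lazyeval::lazy(sample_n(.data, size))
  chunked:::record(.data, cmd)  # record() is internal to chunked, hence `:::`
}

read_csv_chunkwise(path) %>%
  sample_n(1) %>%
  collect()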
```
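The sample_frac counterpart could presumably follow the same pattern. This is only a sketch: it assumes `chunked:::record()` accepts the recorded call exactly as it does for sample_n, and the default `size = 1` is my own choice to mirror dplyr's generic.

```r
library(dplyr)
library(chunked)

# Hypothetical counterpart to sample_n.chunkwise: record a sample_frac()
# call so that it is applied to each chunk (assumes the same internal
# record() helper as in the sample_n method).
sample_frac.chunkwise <- function(.data, size = 1){
  cmd <- lazyeval::lazy(sample_frac(.data, size))
  chunked:::record(.data, cmd)
}

# Small self-contained input file for trying it out
path <- tempfile(fileext = ".csv")
readr::write_csv(iris, path)

sampled <- read_csv_chunkwise(path) %>%
  sample_frac(0.1) %>%   # 10% of the rows of each chunk
  collect()
```

The fraction would be taken per chunk, so the overall result should be close to, but not necessarily exactly, that fraction of the whole file.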
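That way, the sampling would be applied independently within each chunk, so `sample_n(1)` would return one row per chunk rather than one row from the whole file.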
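To make the per-chunk behaviour concrete, here is a small illustration in plain dplyr, without chunked. The chunk size of 50 and the names `chunk_id`, `chunks` and `per_chunk` are made up for the example; it only mimics what applying sample_n to each chunk separately would give.

```r
library(dplyr)

set.seed(1)  # only to make the illustration reproducible

chunk_size <- 50                                   # hypothetical chunk size
chunk_id <- ceiling(seq_len(nrow(iris)) / chunk_size)
chunks <- split(iris, chunk_id)                    # iris has 150 rows: 3 chunks of 50

# Apply sample_n(1) to every chunk independently, then combine the results
per_chunk <- bind_rows(lapply(chunks, sample_n, size = 1))

nrow(per_chunk)  # 3: one row per chunk, not one row overall
```

So the number of rows returned by `sample_n(size)` would be `size` times the number of chunks, which means the result depends on the chunk size used when reading the file.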

What do you think about that? 
If it sounds like a good idea, let me know and I'll send you a PR. 
