Commit 8466ce0

committed
as_duckplyr_tibble()
1 parent 7fa1894 commit 8466ce0

10 files changed, with 71 additions and 27 deletions.

NAMESPACE

+1
@@ -107,6 +107,7 @@ export(anti_join)
 export(any_of)
 export(arrange)
 export(as_duckplyr_df)
+export(as_duckplyr_tibble)
 export(as_tibble)
 export(between)
 export(bind_cols)

NEWS.md

+1
@@ -5,6 +5,7 @@
 ## Features
 
 - `df_from_file()` and related functions support multiple files (#194, #195), show a clear error message for non-string `path` arguments (#182), and create a tibble by default (#177).
+- New `as_duckplyr_tibble()` to convert a data frame to a duckplyr tibble (#177).
 - Support descending sort for character and other non-numeric data (@toppyy, #92, #175).
 - Avoid setting memory limit (#193).
 - Check compatibility of join columns (#168, #185).

R/as_duckplyr_df.R

+10-7
@@ -1,16 +1,23 @@
 #' Convert to a duckplyr data frame
 #'
-#' For an object of class `duckplyr_df`,
+#' @description
+#' These functions convert a data-frame-like input to an object of class `"duckplyr_df"`.
+#' For such objects,
 #' dplyr verbs such as [mutate()], [select()] or [filter()] will attempt to use DuckDB.
 #' If this is not possible, the original dplyr implementation is used.
 #'
+#' `as_duckplyr_df()` requires the input to be a plain data frame or a tibble,
+#' and will fail for any other classes, including subclasses of `"data.frame"` or `"tbl_df"`.
+#' This behavior is likely to change; do not rely on it.
+#'
+#' @details
 #' Set the `DUCKPLYR_FALLBACK_INFO` and `DUCKPLYR_FORCE` environment variables
 #' for more control over the behavior, see [config] for more details.
 #'
 #' @param .data data frame or tibble to transform
 #'
-#' @return An object of class `"duckplyr_df"`, inheriting from the classes of the
-#' `.data` argument.
+#' @return For `as_duckplyr_df()`, an object of class `"duckplyr_df"`,
+#' inheriting from the classes of the `.data` argument.
 #'
 #' @export
 #' @examples
@@ -36,7 +43,3 @@ as_duckplyr_df <- function(.data) {
   class(.data) <- c("duckplyr_df", class(.data))
   .data
 }
-
-default_df_class <- function() {
-  class(new_tibble(list()))
-}

R/as_duckplyr_tibble.R

+13
@@ -0,0 +1,13 @@
+#' as_duckplyr_tibble
+#'
+#' `as_duckplyr_tibble()` converts the input to a tibble and then to a duckplyr data frame.
+#'
+#' @return For `as_duckplyr_tibble()`, an object of class
+#' `c("duckplyr_df", class(tibble()))`.
+#'
+#' @rdname as_duckplyr_df
+#' @export
+as_duckplyr_tibble <- function(.data) {
+  # Extra as.data.frame() call for good measure and perhaps https://github.com/tidyverse/tibble/issues/1556
+  as_duckplyr_df(as_tibble(as.data.frame(.data)))
+}
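
The two converters in this commit follow a "strict tagger plus normalizing wrapper" pattern: one function refuses anything but a plain input and prepends a class, while the other first coerces the input to a known shape and then delegates. The following base R sketch illustrates that pattern only. The names `as_demo_df()`, `as_demo_plain()`, `"demo_df"`, and `"my_df"` are hypothetical stand-ins, a plain data frame is used in place of a tibble so the code runs without extra packages, and the exact class check inside duckplyr is an assumption rather than something shown in this diff.

```r
# Hypothetical stand-ins for illustration; these are NOT duckplyr's functions.

# Strict converter: accepts only a plain data frame, then prepends a class.
as_demo_df <- function(.data) {
  if (!identical(class(.data), "data.frame")) {
    stop("`.data` must be a plain data frame, not a subclass.")
  }
  class(.data) <- c("demo_df", class(.data))
  .data
}

# Lenient converter: normalize the input first, then delegate to the strict one.
as_demo_plain <- function(.data) {
  normalized <- as.data.frame(.data)
  # Make sure no subclass survives the coercion.
  class(normalized) <- "data.frame"
  as_demo_df(normalized)
}

# A data frame carrying an extra (hypothetical) subclass.
custom <- structure(data.frame(a = 1), class = c("my_df", "data.frame"))

# The strict converter rejects the subclass; the lenient one accepts it.
strict_result <- tryCatch(as_demo_df(custom), error = function(e) "error")
lenient_result <- as_demo_plain(custom)

print(strict_result)          # "error"
print(class(lenient_result))  # "demo_df" "data.frame"
```

Keeping the strict check in a single place means the lenient wrapper cannot bypass it; the wrapper only changes what is handed to the check.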

R/io-.R

+4
@@ -78,3 +78,7 @@ duckplyr_df_from_file <- function(
   out <- df_from_file(path, table_function, options = options, class = class)
   as_duckplyr_df(out)
 }
+
+default_df_class <- function() {
+  class(new_tibble(list()))
+}

README.Rmd

+7-7
@@ -75,8 +75,8 @@ conflict_prefer("filter", "dplyr")
 
 There are two ways to use duckplyr.
 
-1. To enable duckplyr for individual data frames, use `duckplyr::as_duckplyr_df()` as the first step in your pipe, without attaching the package.
-1. By calling `library(duckplyr)`, it overwrites dplyr methods and is automatically enabled for the entire session without having to call `as_duckplyr_df()`. To turn this off, call `methods_restore()`.
+1. To enable duckplyr for individual data frames, use `duckplyr::as_duckplyr_tibble()` as the first step in your pipe, without attaching the package.
+1. By calling `library(duckplyr)`, it overwrites dplyr methods and is automatically enabled for the entire session without having to call `as_duckplyr_tibble()`. To turn this off, call `methods_restore()`.
 
 The examples below illustrate both methods.
 See also the companion [demo repository](https://github.com/Tmonster/duckplyr_demo) for a use case with a large dataset.
@@ -85,20 +85,20 @@ See also the companion [demo repository](https://github.com/Tmonster/duckplyr_de
 
 This example illustrates usage of duckplyr for individual data frames.
 
-Use `duckplyr::as_duckplyr_df()` to enable processing with duckdb:
+Use `duckplyr::as_duckplyr_tibble()` to enable processing with duckdb:
 
 ```{r}
 out <-
   palmerpenguins::penguins %>%
   # CAVEAT: factor columns are not supported yet
   mutate(across(where(is.factor), as.character)) %>%
-  duckplyr::as_duckplyr_df() %>%
+  duckplyr::as_duckplyr_tibble() %>%
   mutate(bill_area = bill_length_mm * bill_depth_mm) %>%
   summarize(.by = c(species, sex), mean_bill_area = mean(bill_area)) %>%
   filter(species != "Gentoo")
 ```
 
-The result is a data frame or tibble, with its own class.
+The result is a tibble, with its own class.
 
 ```{r}
 class(out)
@@ -137,7 +137,7 @@ Use `library(duckplyr)` or `duckplyr::methods_overwrite()` to overwrite dplyr me
 duckplyr::methods_overwrite()
 ```
 
-This is the same query as above, without `as_duckplyr_df()`:
+This is the same query as above, without `as_duckplyr_tibble()`:
 
 ```{r echo = FALSE}
 Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 0)
@@ -206,7 +206,7 @@ Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = "")
 
 ```{r}
 palmerpenguins::penguins %>%
-  duckplyr::as_duckplyr_df() %>%
+  duckplyr::as_duckplyr_tibble() %>%
   transmute(bill_area = bill_length_mm * bill_depth_mm) %>%
   head(3)
 ```

README.md

+7-7
@@ -41,28 +41,28 @@ Or from [GitHub](https://github.com/) with:
 
 There are two ways to use duckplyr.
 
-1. To enable duckplyr for individual data frames, use [`duckplyr::as_duckplyr_df()`](https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_df.html) as the first step in your pipe, without attaching the package.
-2. By calling [`library(duckplyr)`](https://duckdblabs.github.io/duckplyr/), it overwrites dplyr methods and is automatically enabled for the entire session without having to call `as_duckplyr_df()`. To turn this off, call `methods_restore()`.
+1. To enable duckplyr for individual data frames, use [`duckplyr::as_duckplyr_tibble()`](https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_tibble.html) as the first step in your pipe, without attaching the package.
+2. By calling [`library(duckplyr)`](https://duckdblabs.github.io/duckplyr/), it overwrites dplyr methods and is automatically enabled for the entire session without having to call `as_duckplyr_tibble()`. To turn this off, call `methods_restore()`.
 
 The examples below illustrate both methods. See also the companion [demo repository](https://github.com/Tmonster/duckplyr_demo) for a use case with a large dataset.
 
 ### Usage for individual data frames
 
 This example illustrates usage of duckplyr for individual data frames.
 
-Use [`duckplyr::as_duckplyr_df()`](https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_df.html) to enable processing with duckdb:
+Use [`duckplyr::as_duckplyr_tibble()`](https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_tibble.html) to enable processing with duckdb:
 
 <pre class='chroma'>
 <span><span class='nv'>out</span> <span class='o'>&lt;-</span></span>
 <span> <span class='nf'>palmerpenguins</span><span class='nf'>::</span><span class='nv'><a href='https://allisonhorst.github.io/palmerpenguins/reference/penguins.html'>penguins</a></span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='c'># CAVEAT: factor columns are not supported yet</span></span>
 <span> <span class='nf'><a href='https://dplyr.tidyverse.org/reference/mutate.html'>mutate</a></span><span class='o'>(</span><span class='nf'><a href='https://dplyr.tidyverse.org/reference/across.html'>across</a></span><span class='o'>(</span><span class='nf'><a href='https://tidyselect.r-lib.org/reference/where.html'>where</a></span><span class='o'>(</span><span class='nv'>is.factor</span><span class='o'>)</span>, <span class='nv'>as.character</span><span class='o'>)</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
-<span> <span class='nf'>duckplyr</span><span class='nf'>::</span><span class='nf'><a href='https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_df.html'>as_duckplyr_df</a></span><span class='o'>(</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
+<span> <span class='nf'>duckplyr</span><span class='nf'>::</span><span class='nf'><a href='https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_tibble.html'>as_duckplyr_tibble</a></span><span class='o'>(</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='nf'><a href='https://dplyr.tidyverse.org/reference/mutate.html'>mutate</a></span><span class='o'>(</span>bill_area <span class='o'>=</span> <span class='nv'>bill_length_mm</span> <span class='o'>*</span> <span class='nv'>bill_depth_mm</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='nf'><a href='https://dplyr.tidyverse.org/reference/summarise.html'>summarize</a></span><span class='o'>(</span>.by <span class='o'>=</span> <span class='nf'><a href='https://rdrr.io/r/base/c.html'>c</a></span><span class='o'>(</span><span class='nv'>species</span>, <span class='nv'>sex</span><span class='o'>)</span>, mean_bill_area <span class='o'>=</span> <span class='nf'><a href='https://rdrr.io/r/base/mean.html'>mean</a></span><span class='o'>(</span><span class='nv'>bill_area</span><span class='o'>)</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='nf'><a href='https://dplyr.tidyverse.org/reference/filter.html'>filter</a></span><span class='o'>(</span><span class='nv'>species</span> <span class='o'>!=</span> <span class='s'>"Gentoo"</span><span class='o'>)</span></span></pre>
 
-The result is a data frame or tibble, with its own class.
+The result is a tibble, with its own class.
 
 <pre class='chroma'>
 <span><span class='nf'><a href='https://rdrr.io/r/base/class.html'>class</a></span><span class='o'>(</span><span class='nv'>out</span><span class='o'>)</span></span>
@@ -211,7 +211,7 @@ Use [`library(duckplyr)`](https://duckdblabs.github.io/duckplyr/) or [`duckplyr:
 <span><span class='c'>#&gt; <span style='color: #00BB00;'>✔</span> Overwriting <span style='color: #0000BB;'>dplyr</span> methods with <span style='color: #0000BB;'>duckplyr</span> methods.</span></span>
 <span><span class='c'>#&gt; <span style='color: #00BBBB;'>ℹ</span> Turn off with `duckplyr::methods_restore()`.</span></span></pre>
 
-This is the same query as above, without `as_duckplyr_df()`:
+This is the same query as above, without `as_duckplyr_tibble()`:
 
 <pre class='chroma'>
 <span><span class='nv'>out</span> <span class='o'>&lt;-</span></span>
@@ -298,7 +298,7 @@ The first time the package encounters an unsupported function, data type, or ope
 
 <pre class='chroma'>
 <span><span class='nf'>palmerpenguins</span><span class='nf'>::</span><span class='nv'><a href='https://allisonhorst.github.io/palmerpenguins/reference/penguins.html'>penguins</a></span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
-<span> <span class='nf'>duckplyr</span><span class='nf'>::</span><span class='nf'><a href='https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_df.html'>as_duckplyr_df</a></span><span class='o'>(</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
+<span> <span class='nf'>duckplyr</span><span class='nf'>::</span><span class='nf'><a href='https://duckdblabs.github.io/duckplyr/reference/as_duckplyr_tibble.html'>as_duckplyr_tibble</a></span><span class='o'>(</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='nf'><a href='https://dplyr.tidyverse.org/reference/transmute.html'>transmute</a></span><span class='o'>(</span>bill_area <span class='o'>=</span> <span class='nv'>bill_length_mm</span> <span class='o'>*</span> <span class='nv'>bill_depth_mm</span><span class='o'>)</span> <span class='o'><a href='https://magrittr.tidyverse.org/reference/pipe.html'>%&gt;%</a></span></span>
 <span> <span class='nf'><a href='https://rdrr.io/r/utils/head.html'>head</a></span><span class='o'>(</span><span class='m'>3</span><span class='o'>)</span></span>
 <span><span class='c'>#&gt; The <span style='color: #0000BB;'>duckplyr</span> package is configured to fall back to <span style='color: #0000BB;'>dplyr</span> when it encounters an</span></span>

man/as_duckplyr_df.Rd

+17-4
Some generated files are not rendered by default.

man/df_from_file.Rd

+4-2
Some generated files are not rendered by default.
+7
@@ -0,0 +1,7 @@
+test_that("as_duckplyr_tibble() works", {
+  expect_s3_class(as_duckplyr_tibble(tibble(a = 1)), "duckplyr_df")
+  expect_equal(class(as_duckplyr_tibble(tibble(a = 1))), c("duckplyr_df", class(tibble())))
+
+  expect_s3_class(as_duckplyr_tibble(data.frame(a = 1)), "duckplyr_df")
+  expect_equal(class(as_duckplyr_tibble(data.frame(a = 1))), c("duckplyr_df", class(tibble())))
+})
