Practical 2

1 Using across()

In this practical, you will learn how to:

  • Apply functions across multiple columns using across()
  • Perform row-wise calculations using c_across()

These exercises will use functions and datasets from the tidyverse package.

library(tidyverse)
data(starwars)

We’ll start with an example using the starwars dataset.

  1. Use across inside summarise to calculate the mean of height, mass, and birth_year.

    starwars |>
      summarise(
        across(c(height, mass, birth_year), \(x) mean(x, na.rm = TRUE))
      )
    # A tibble: 1 × 3
      height  mass birth_year
       <dbl> <dbl>      <dbl>
    1   175.  97.3       87.6
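  2. What happens if you try to calculate the mean of all columns using everything()?

    The code below gives warnings because mean() is applied to the non-numeric (character and list) columns, each of which returns NA. Selecting columns with where(is.numeric) instead of everything() avoids this.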

    starwars |>
      summarise(
        across(everything(), \(x) mean(x, na.rm = TRUE))
      )
    Warning: There were 11 warnings in `summarise()`.
    The first warning was:
    ℹ In argument: `across(everything(), function(x) mean(x, na.rm = TRUE))`.
    Caused by warning in `mean.default()`:
    ! argument is not numeric or logical: returning NA
    ℹ Run `dplyr::last_dplyr_warnings()` to see the 10 remaining warnings.
    # A tibble: 1 × 14
       name height  mass hair_color skin_color eye_color birth_year   sex gender
      <dbl>  <dbl> <dbl>      <dbl>      <dbl>     <dbl>      <dbl> <dbl>  <dbl>
    1    NA   175.  97.3         NA         NA        NA       87.6    NA     NA
    # ℹ 5 more variables: homeworld <dbl>, species <dbl>, films <dbl>,
    #   vehicles <dbl>, starships <dbl>
  3. Repeat your answer to (1), but calculate both the mean and sd.

    mean_cc <- function(x) mean(x, na.rm = TRUE)
    sd_cc <- function(x) sd(x, na.rm = TRUE)
    
    starwars |>
      summarise(
        across(c(height, mass, birth_year), 
               list(mean = mean_cc, sd = sd_cc))
      )
    # A tibble: 1 × 6
      height_mean height_sd mass_mean mass_sd birth_year_mean birth_year_sd
            <dbl>     <dbl>     <dbl>   <dbl>           <dbl>         <dbl>
    1        175.      34.8      97.3    169.            87.6          155.
    Going further
    1. Modify the code to include min and max.
    2. Use pivot_longer to convert the result into a ‘tidy’ data frame.
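
One possible approach to both "Going further" tasks is sketched below. The object names stats and tidy_stats and the "__" separator are choices made for this sketch, not part of the exercise. A double underscore is used because birth_year already contains a single underscore, which would make splitting the names on "_" ambiguous.

```r
library(tidyverse)

# Mean, sd, min and max for the three columns.
# .names controls the output names; "__" separates column from function.
stats <- starwars |>
  summarise(
    across(
      c(height, mass, birth_year),
      list(
        mean = \(x) mean(x, na.rm = TRUE),
        sd   = \(x) sd(x, na.rm = TRUE),
        min  = \(x) min(x, na.rm = TRUE),
        max  = \(x) max(x, na.rm = TRUE)
      ),
      .names = "{.col}__{.fn}"
    )
  )

# One row per variable, one column per statistic.
# ".value" tells pivot_longer to turn the second name part into columns.
tidy_stats <- stats |>
  pivot_longer(
    everything(),
    names_to = c("variable", ".value"),
    names_sep = "__"
  )

tidy_stats
```

The result should have one row each for height, mass, and birth_year, with the statistics in columns mean, sd, min, and max.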

Use the mtcars data frame for the following two questions.

  1. Use summarise and across to compute the median for all columns starting with d.

    mtcars |> 
      summarise(across(starts_with("d"), median))
       disp  drat
    1 196.3 3.695
  2. Compute the mean for all numeric columns.

    mtcars |> 
      summarise(across(where(is.numeric), mean))
           mpg    cyl     disp       hp     drat      wt     qsec     vs      am
    1 20.09062 6.1875 230.7219 146.6875 3.596563 3.21725 17.84875 0.4375 0.40625
        gear   carb
    1 3.6875 2.8125
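
The objectives list c_across(), which the exercises above do not use. The following is a minimal sketch of a row-wise calculation, assuming a small made-up tibble (scores, with columns q1 to q3) that is not part of the practical.

```r
library(tidyverse)

# Hypothetical data: three question scores per person.
scores <- tibble(
  id = 1:3,
  q1 = c(1, 2, 3),
  q2 = c(4, 5, 6),
  q3 = c(7, 8, 9)
)

# rowwise() makes each row its own group, so sum() sees one row at a time.
# c_across() collects the selected columns of the current row into a vector.
totals <- scores |>
  rowwise() |>
  mutate(total = sum(c_across(q1:q3))) |>
  ungroup()

totals$total
# 12 15 18
```

Whereas across() applies a function to each selected column in turn, c_across() combines the selected columns within a single row, which is why it is paired with rowwise().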


2 Using ‘tidy evaluation’

  1. Write a function to produce a grouped summary of a given dataset.

    Your function:

    • Should take a data frame (.data) and the grouping column (.group) as inputs.
    • Should use summarise to calculate a summary (e.g., median or mean) of all numeric columns.
    • Should use across and where to apply the summary function to all numeric columns.

    Test your function with the mtcars dataset.

    get_summary <- function(.data, .group) {
      .data |>
        group_by({{.group}}) |>
        summarise(across(where(is.numeric), mean))
    }
    
    get_summary(mtcars, cyl)
    # A tibble: 3 × 11
        cyl   mpg  disp    hp  drat    wt  qsec    vs    am  gear  carb
      <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    1     4  26.7  105.  82.6  4.07  2.29  19.1 0.909 0.727  4.09  1.55
    2     6  19.7  183. 122.   3.59  3.12  18.0 0.571 0.429  3.86  3.43
    3     8  15.1  353. 209.   3.23  4.00  16.8 0     0.143  3.29  3.5 
  2. Write a function to calculate the square root (sqrt) of a given column. Your function should have two arguments: the data frame and the column name.

    calc_sqrt <- function(.data, .col) {
      .data |>
        mutate({{ .col }} := sqrt({{ .col }}))
    }
    
    mtcars |>
      calc_sqrt(wt)
                         mpg cyl  disp  hp drat       wt  qsec vs am gear carb
    Mazda RX4           21.0   6 160.0 110 3.90 1.618641 16.46  0  1    4    4
    Mazda RX4 Wag       21.0   6 160.0 110 3.90 1.695582 17.02  0  1    4    4
    Datsun 710          22.8   4 108.0  93 3.85 1.523155 18.61  1  1    4    1
    Hornet 4 Drive      21.4   6 258.0 110 3.08 1.793042 19.44  1  0    3    1
    Hornet Sportabout   18.7   8 360.0 175 3.15 1.854724 17.02  0  0    3    2
    Valiant             18.1   6 225.0 105 2.76 1.860108 20.22  1  0    3    1
    Duster 360          14.3   8 360.0 245 3.21 1.889444 15.84  0  0    3    4
    Merc 240D           24.4   4 146.7  62 3.69 1.786057 20.00  1  0    4    2
    Merc 230            22.8   4 140.8  95 3.92 1.774824 22.90  1  0    4    2
    Merc 280            19.2   6 167.6 123 3.92 1.854724 18.30  1  0    4    4
    Merc 280C           17.8   6 167.6 123 3.92 1.854724 18.90  1  0    4    4
    Merc 450SE          16.4   8 275.8 180 3.07 2.017424 17.40  0  0    3    3
    Merc 450SL          17.3   8 275.8 180 3.07 1.931321 17.60  0  0    3    3
    Merc 450SLC         15.2   8 275.8 180 3.07 1.944222 18.00  0  0    3    3
    Cadillac Fleetwood  10.4   8 472.0 205 2.93 2.291288 17.98  0  0    3    4
    Lincoln Continental 10.4   8 460.0 215 3.00 2.328948 17.82  0  0    3    4
    Chrysler Imperial   14.7   8 440.0 230 3.23 2.311926 17.42  0  0    3    4
    Fiat 128            32.4   4  78.7  66 4.08 1.483240 19.47  1  1    4    1
    Honda Civic         30.4   4  75.7  52 4.93 1.270827 18.52  1  1    4    2
    Toyota Corolla      33.9   4  71.1  65 4.22 1.354622 19.90  1  1    4    1
    Toyota Corona       21.5   4 120.1  97 3.70 1.570032 20.01  1  0    3    1
    Dodge Challenger    15.5   8 318.0 150 2.76 1.876166 16.87  0  0    3    2
    AMC Javelin         15.2   8 304.0 150 3.15 1.853375 17.30  0  0    3    2
    Camaro Z28          13.3   8 350.0 245 3.73 1.959592 15.41  0  0    3    4
    Pontiac Firebird    19.2   8 400.0 175 3.08 1.960867 17.05  0  0    3    2
    Fiat X1-9           27.3   4  79.0  66 4.08 1.391043 18.90  1  1    4    1
    Porsche 914-2       26.0   4 120.3  91 4.43 1.462874 16.70  0  1    5    2
    Lotus Europa        30.4   4  95.1 113 3.77 1.230041 16.90  1  1    5    2
    Ford Pantera L      15.8   8 351.0 264 4.22 1.780449 14.50  0  1    5    4
    Ferrari Dino        19.7   6 145.0 175 3.62 1.664332 15.50  0  1    5    6
    Maserati Bora       15.0   8 301.0 335 3.54 1.889444 14.60  0  1    5    8
    Volvo 142E          21.4   4 121.0 109 4.11 1.667333 18.60  1  1    4    2
  3. Update your function so it creates a new column ({.col}_sqrt) instead of overwriting the existing one.

    Tip

    You can build the new column name by embedding {{.col}} in a quoted string on the left-hand side of :=. For example:

    "{{.col}}_sqrt"

    calc_sqrt <- function(.data, .col) {
      .data |>
        mutate("{{.col}}_sqrt" := sqrt({{ .col }}))
    }
    
    mtcars |>
      calc_sqrt(wt)
                         mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
    Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
    Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
    Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
    Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
    Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
    Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
    Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
    Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
    Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
    Merc 280C           17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
    Merc 450SE          16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
    Merc 450SL          17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
    Merc 450SLC         15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
    Cadillac Fleetwood  10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
    Lincoln Continental 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
    Chrysler Imperial   14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
    Fiat 128            32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
    Honda Civic         30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
    Toyota Corolla      33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
    Toyota Corona       21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
    Dodge Challenger    15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
    AMC Javelin         15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
    Camaro Z28          13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
    Pontiac Firebird    19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
    Fiat X1-9           27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
    Porsche 914-2       26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
    Lotus Europa        30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
    Ford Pantera L      15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
    Ferrari Dino        19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
    Maserati Bora       15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
    Volvo 142E          21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
                         wt_sqrt
    Mazda RX4           1.618641
    Mazda RX4 Wag       1.695582
    Datsun 710          1.523155
    Hornet 4 Drive      1.793042
    Hornet Sportabout   1.854724
    Valiant             1.860108
    Duster 360          1.889444
    Merc 240D           1.786057
    Merc 230            1.774824
    Merc 280            1.854724
    Merc 280C           1.854724
    Merc 450SE          2.017424
    Merc 450SL          1.931321
    Merc 450SLC         1.944222
    Cadillac Fleetwood  2.291288
    Lincoln Continental 2.328948
    Chrysler Imperial   2.311926
    Fiat 128            1.483240
    Honda Civic         1.270827
    Toyota Corolla      1.354622
    Toyota Corona       1.570032
    Dodge Challenger    1.876166
    AMC Javelin         1.853375
    Camaro Z28          1.959592
    Pontiac Firebird    1.960867
    Fiat X1-9           1.391043
    Porsche 914-2       1.462874
    Lotus Europa        1.230041
    Ford Pantera L      1.780449
    Ferrari Dino        1.664332
    Maserati Bora       1.889444
    Volvo 142E          1.667333
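
The two sections can be combined. The sketch below assumes a hypothetical helper, calc_sqrt_many (not part of the exercises), that passes a column selection through {{ }} into across() and uses the .names argument to create the new {.col}_sqrt columns.

```r
library(tidyverse)

# Hypothetical extension of calc_sqrt: .cols may select several columns.
calc_sqrt_many <- function(.data, .cols) {
  .data |>
    mutate(
      # {{ .cols }} forwards the caller's selection to across();
      # .names keeps the originals and adds "<column>_sqrt" copies.
      across({{ .cols }}, sqrt, .names = "{.col}_sqrt")
    )
}

out <- mtcars |>
  calc_sqrt_many(c(wt, disp))

# The original 11 columns plus wt_sqrt and disp_sqrt
names(out)
```

Any selection that across() accepts, such as starts_with("d") or where(is.numeric), can be passed as .cols in the same way.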