R Extract Specific Columns of Data Frame (4 Examples) | Select Subset (2024)

This article explains how to extract specific columns of a data set in the R programming language.

I will show you four programming alternatives for the selection of data frame columns. More precisely, the tutorial will contain the following contents:

  • Creation of Example Data
  • Example 1: Subsetting Data by Column Name
  • Example 2: Subsetting Data by Column Position
  • Example 3: Subsetting Data with select Argument of subset Function
  • Example 4: Subsetting Data with select Function (dplyr Package)

Let’s move on to the examples!

Creation of Example Data

In the examples of this tutorial, I’m going to use the following data frame:

data <- data.frame(x1 = c(2, 1, 5, 1),    # Create example data
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))
data                                       # Print example data

Table 1: Example Data Frame.

Our example data frame consists of four numeric columns and four rows.

In the following, I’m going to show you how to select certain columns from this data frame. I will show you four different alternatives, all of which lead to the same output. Which alternative suits you best depends on your personal preferences.

Example 1: Subsetting Data by Column Name

The most common way to select some columns of a data frame is the specification of a character vector containing the names of the columns to extract. Consider the following R code:

data[ , c("x1", "x3")] # Subset by name

Table 2: Subset of Example Data Frame.

As you can see based on Table 2, the previous R syntax extracted the columns x1 and x3. The previous R syntax can be explained as follows:

  • First, we need to specify the name of our data set (i.e. data)
  • Then, we need to open some square brackets (i.e. [])
  • Within these brackets, we need to write a comma to reflect the two dimensions of our data. Everything before the comma selects specific rows; everything after the comma selects specific columns.
  • Behind the comma, we specify a vector of character strings. Each element of this vector represents the name of a column of our data frame (i.e. x1 and x3).
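
The row and column roles described above can be sketched as follows. This is a small illustration that reuses the example data of this tutorial; the object names rows_and_cols, x1_vector, and x1_df are only chosen for this sketch. It also shows that base R simplifies a single selected column to a vector unless drop = FALSE is specified:

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

# Rows 1 and 2 (before the comma), columns x1 and x3 (after the comma)
rows_and_cols <- data[1:2, c("x1", "x3")]

# A single column is simplified to a plain vector by default
x1_vector <- data[ , "x1"]

# drop = FALSE keeps the data frame structure for a single column
x1_df <- data[ , "x1", drop = FALSE]
```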

That’s basically it. However, depending on your personal preferences and your specific data situation, you might prefer one of the other alternatives. So keep on reading…

Example 2: Subsetting Data by Column Position

An approach similar to Example 1 is subsetting by the position of the columns. Consider the following R code:

data[ , c(1, 3)] # Subset by position

Similar to Example 1, we use square brackets and a vector behind the comma to select certain columns.

However, this time we are using a numeric vector, whereby each element of the vector stands for the position of the column.

The first column of our example data is called x1 and the column at the third position is called x3. For that reason, the previous R syntax would extract the columns x1 and x3 from our data set.
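
To check that both approaches really select the same columns, the two results can be compared directly. The following is a minimal sketch based on the example data; the object names are assumptions made for this illustration. It also shows how match() can be used to look up the positions of known column names:

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

by_position <- data[ , c(1, 3)]          # Subset by position
by_name     <- data[ , c("x1", "x3")]    # Subset by name

# Both subsets contain the same columns and values
same_result <- identical(by_position, by_name)

# Positions of the columns x1 and x3 within the data frame
positions <- match(c("x1", "x3"), names(data))
```

Selecting by position is compact, but it silently returns different columns if the column order of the data changes. Selecting by name is usually the more robust choice.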

Example 3: Subsetting Data with select Argument of subset Function

In Example 3, we will access and extract certain columns with the subset function. Within the subset function, we need to specify the name of our data frame (i.e. data) and the columns we want to select (i.e. x1 and x3):

subset(data, select = c("x1", "x3")) # Subset with select argument

The output of the previous R syntax is the same as in Examples 1 and 2.
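
The select argument also accepts unquoted column names and name ranges, and it can be combined with a row condition. The following sketch uses the example data; the object names and the condition x1 > 1 are only assumptions for this illustration:

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

# Unquoted column names work as well
sub_unquoted <- subset(data, select = c(x1, x3))

# A range of neighboring columns, here x1, x2, and x3
sub_range <- subset(data, select = x1:x3)

# Row condition and column selection in a single call
sub_rows_cols <- subset(data, subset = x1 > 1, select = c(x1, x3))
```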

Example 4: Subsetting Data with select Function (dplyr Package)

Many people prefer the tidyverse to base R when it comes to data manipulation. A very popular package of the tidyverse, which also provides functions for the selection of certain columns, is the dplyr package. We can install and load the package as follows:

install.packages("dplyr")    # Install dplyr R package
library("dplyr")             # Load dplyr R package

Now, we can use the %>% operator and the select function to subset our data set:

data %>% select(x1, x3) # Subset with select function

Again, we get the same output as in the previous examples. It’s up to you to decide which option you like the most.
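
The select function offers a few more ways to specify columns. The sketch below assumes that a recent version of dplyr (1.0.0 or later) is installed; the code is skipped otherwise. The object names are chosen for illustration only:

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

# Run only if dplyr is available (assumption: dplyr >= 1.0.0)
if (requireNamespace("dplyr", quietly = TRUE)) {
  library("dplyr")

  cols <- c("x1", "x3")                             # Column names stored as strings
  sel_names <- data %>% select(all_of(cols))        # Select via character vector
  sel_range <- data %>% select(x1:x3)               # Select a range of columns
  sel_drop  <- data %>% select(-x2, -x4)            # Drop columns with a minus sign
}
```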

Video & Further Resources

There was a lot of content in this tutorial. However, if you need more explanations on the different approaches and functions, you could have a look at the corresponding video on my YouTube channel. In the video, I’m explaining the examples of this tutorial in more detail.

In addition, you could have a look at the other R tutorials of my homepage. You can find some interesting tutorials for the manipulation of data sets in R below:

  • pull R Function of dplyr Package
  • Select Only Numeric Columns from Data Frame
  • Convert Data Frame Column to Vector
  • Extract Column of dplyr Tibble
  • Reorder Columns of Data Frame in R
  • Sample Random Rows of Data Frame in R
  • Create Empty Data Frame in R
  • Subset Data Frame Rows by Logical Condition
  • Rename Column Name of Data Frame
  • Row Bind in R
  • Column Bind in R
  • The R Programming Language

In this tutorial you have learned how to extract specific columns of a data frame in the R programming language. I have shown in multiple examples how to create subsets of consecutive and non-consecutive variables. If you have comments or questions, please let me know in the comments section below.

38 Comments

  • Kashyap

    November 17, 2020 5:44 am

    If my column is in sequence like
    X1 X2 X3 X4 X5

    But I want to output only the X4 and X2 columns, in that order (first X4, then X2).

    How can I do that?

    • Joachim

      November 17, 2020 6:26 am

      Hey Kashyap,

      You may use the following R code:

      data[ , c("x4", "x2")]

      Regards,

      Joachim

  • shubhangi

    February 8, 2021 1:08 pm

    what if i want to extract selected columns with specific row value?

    • Joachim

      February 8, 2021 1:48 pm

      Hey Shubhangi,

      Are you looking for this? https://statisticsglobe.com/return-data-frame-row-based-on-value-in-column-in-r

      Regards,

      Joachim

  • Rogue

    March 19, 2021 10:48 pm

    Hi, I’m trying to extract columns from multiple datasets so I can sum them but the number of columns in each dataset varies. I’m attempting to use a for loop.

    Here is my attempt:

    for (df in 1:length(locs)){
    newdf 0] #get rid of all columns that have only 0s
    newdfsum <- colSums(newdf[ , 9:length(newdf) ]) #sum everything in column 9 and after
    summarysums[i] <- newdfsum #put new df or list in empty vector
    }

    I can do this for one, but I haven't been able to loop through multiple datasets..
    Thank you!

    • Joachim

      March 22, 2021 8:10 am

      Hey,

      I have just responded to your email.

      Regards,

      Joachim

  • Abby

    April 23, 2021 6:30 pm

    how do you filter and pick specific rows to use

    • Joachim

      April 24, 2021 6:13 am

      Hi Abby,

      Please have a look at this tutorial: https://statisticsglobe.com/extract-row-from-data-frame-in-r

      Regards

      Joachim

  • William

    June 17, 2021 1:27 pm

    Hello. If I had a dataframe called df (containing 5 columns and 30 rows). What code would I use to subset rows 10 to 20 and columns 1 and 5 using base R?

    • Joachim

      June 18, 2021 6:22 am

      Hey William,

      You may use the following R code:

      df_new <- df[10:20, c(1, 5)]

      Regards

      Joachim

      • William

        June 23, 2021 8:23 am

        Thank you, I appreciate it!

        – William

        • Joachim

          June 23, 2021 8:40 am

          Thanks a lot for the nice feedback William! 🙂

  • Jacque Mason

    June 19, 2021 4:38 am

    Hello Joachim, how can I write a specification to extract dataset from a excel spreadsheet?

    • Joachim

      June 21, 2021 7:07 am

      Hey Jacque,

      Please have a look at this tutorial: https://statisticsglobe.com/r-read-excel-file-xlsx-xls It explains how to read data from an Excel file into R.

      I hope that helps!

      Joachim

  • HR

    December 5, 2021 8:04 pm

    Hello. If I had a dataframe called df (containing 26 columns a-z). What code would I use to subset columns K to Q using R not by column #, but by column name range?

    • Joachim

      December 6, 2021 7:35 am

      Hey,

      You could use the which function to identify the locations of these columns:

      data <- as.data.frame(matrix(1:130, ncol = 26))
      colnames(data) <- letters
      data
      # a b c d e f g h i j k l m n o p q r s t u v w x y z
      # 1 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96 101 106 111 116 121 126
      # 2 2 7 12 17 22 27 32 37 42 47 52 57 62 67 72 77 82 87 92 97 102 107 112 117 122 127
      # 3 3 8 13 18 23 28 33 38 43 48 53 58 63 68 73 78 83 88 93 98 103 108 113 118 123 128
      # 4 4 9 14 19 24 29 34 39 44 49 54 59 64 69 74 79 84 89 94 99 104 109 114 119 124 129
      # 5 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 105 110 115 120 125 130

      data_subset <- data[ , which(colnames(data) == "k"):which(colnames(data) == "q")]
      data_subset
      # k l m n o p q
      # 1 51 56 61 66 71 76 81
      # 2 52 57 62 67 72 77 82
      # 3 53 58 63 68 73 78 83
      # 4 54 59 64 69 74 79 84
      # 5 55 60 65 70 75 80 85

      Regards,
      Joachim

  • Samuel Hwang

    December 15, 2021 12:35 pm

    Hi, thank you very much for your great explanations!
    By the way, I have trouble with this kind of naming process.
    For example, I have a data file with three column vectors that have no variable names.

    How can I give a name to each of the three vectors?

    c(1,2,3,4,5), c(6,7,8,9,10), c(11,12,13,14,15).
    If this is a data file, how can I give a name to each vector?

    Could you kindly let me know the codes?

    Thank you.

    Samuel

    • Joachim

      December 15, 2021 2:24 pm

      Hey Samuel,

      Thank you for the kind words, glad you found the tutorial helpful!

      You may store all those vectors in a data frame as shown below:

      df <- data.frame(name_1 = c(1,2,3,4,5),
                       name_2 = c(6,7,8,9,10),
                       name_3 = c(11,12,13,14,15))
      df
      # name_1 name_2 name_3
      # 1 1 6 11
      # 2 2 7 12
      # 3 3 8 13
      # 4 4 9 14
      # 5 5 10 15

      Regards,
      Joachim

  • JK

    February 17, 2022 5:53 pm

    I want to sort rowwise values in specific columns, get top ‘n’ values, and get corresponding column names in new columns.

    The output would look something like this:

    SL SW PL PW Species high1 high2 high3 col1 col2 col3
    1 5.1 3.5 1.4 0.2 setosa 3.5 1.4 0.2 SW PL PW
    2 4.9 3 1.4 0.2 setosa 3 1.4 0.2 SW PL PW
    3 4.7 3.2 1.3 0.2 setosa 3.2 1.3 0.2 SW PL PW

    Tried something like code below, but unable to get column names. Help appreciated.

    iris %>%
    rowwise() %>%
    mutate(rows = list(sort(c( Sepal.Width, Petal.Length, Petal.Width), decreasing = TRUE))) %>%
    mutate(high1 = rows[1], col1 = list(colnames(~.)[~. ==rows[1]]),
    high2 = rows[2], col2 = list(colnames(~.)[~. ==rows[2]]),
    high3 = rows[3], col3 = list(colnames(~.)[~. ==rows[3]])
    ) %>%
    select(-rows)

    • Joachim

      February 18, 2022 9:04 am

      Hi JK,

      Are you looking for this code?

      Regards,
      Joachim

  • Fazhir Kayondo

    March 19, 2022 5:24 pm

    Hi I have a data frame with column names
    A B C D E F G
    2 4 6 ? 5 7 3
    ? 2 3 5 ? 3 4
    2 2 3 4 5 6 5

    How could I select out only the columns with “?” in them? Thanks

    • Joachim

      March 21, 2022 7:19 am

      Hey Fazhir,

      Please have a look at the example below:

      data <- data.frame(A = c(2, "?", 2),
                         B = c(4, 2, 2),
                         C = c(6, 3, 3),
                         D = c("?", 5, 4))
      data
      # A B C D
      # 1 2 4 6 ?
      # 2 ? 2 3 5
      # 3 2 2 3 4

      data[ , colSums(data == "?") > 0]
      # A D
      # 1 2 ?
      # 2 ? 5
      # 3 2 4

      Regards,
      Joachim

  • TK

    May 12, 2022 10:55 pm

    Hi,

    I am trying to compute the SD and VAR of the difference between start and end times (basically a time variable), but I get an NA when I run hms::as_hms(var(hms::as_hms(samp.trips$ride_length))). How can I get around this?

    Thank you in advance.

    • Joachim

      May 16, 2022 9:16 am

      Hey TK,

      Could you share the output when you run the following line of code?

      head(samp.trips$ride_length)

      Regards,
      Joachim

  • Aman

    May 13, 2022 9:07 am

    Hi, I have a data frame consisting of 15 rows and 39 columns. I want to rename columns 4 to 39 with the years 1980 to 2015. I can do it manually, but is there a faster way so that all of these columns get their new names from 1980 to 2015?

    • Joachim

      May 16, 2022 9:22 am

      Hi Aman,

      You may use the following R syntax for this:

      colnames(data)[4:39] <- 1980:2015

      Regards,
      Joachim

  • Nangula

    August 27, 2022 8:55 pm

    Hello Joachim,
    Thank you very much for your very useful tutorials, I have an excel csv sheet of 10511 rows with about 1344 columns containing reflectance values for either grass, shrubs or trees. Columns from 1 up to 671 is reflectance values per dates, while columns from 672 to 1341 are dates matching each band values in column 1 to 671. Columns 1341 to 1344 represent 3 columns (grass,shrubs, trees) containing percentage values for the respective cover . I would like to plot all the reflectance values per date which correspond to only 100% Grass cover, and also maybe based on different grass percentage cover scenarios such as plot all the values in the other columns for where grass percentage cover is higher than 50%. Could you kindly help with the type of code to use?

    • Joachim

      August 29, 2022 9:42 am

      Hey Nangula,

      Thank you very much for the kind comment, glad you like the tutorial!

      If I get your question correctly, you want to draw a plot of a subset of your data frame rows? Please have a look at this tutorial. It explains how to do that.

      Regards,
      Joachim

  • Nangula

    August 27, 2022 9:05 pm

    Dear Joachim,

    Now with my correct email to my query of 1344 columns, and 10511 row records

    • Joachim

      August 29, 2022 9:42 am

      Please have a look at my response to your other comment.

      Regards, Joachim

  • Monti

    September 9, 2022 12:11 pm

    Dear Joachim,
    thanks for your great tutorials and articles! I have a question about your Example 1: Subsetting Data by Column Name. If I extract a certain column, the row names are not displayed anymore. Is it possible to extract a certain column, including its column name and data, so that the row names are also included automatically? Thanks in advance 🙂

    • Joachim

      September 9, 2022 8:22 pm

      Hi Monti,

      First of all, thank you so much for the kind words, glad you find my tutorials useful! 🙂

      Regarding your question: Are you extracting only one column? In this case, the data frame is automatically converted to a vector object. You can avoid that as explained here.

      I hope this helps!

      Joachim

  • Patrick Kofon Jokwi

    September 9, 2022 5:12 pm

    Thank you, Joachim, for your notes. They are explicit, especially for persons with little knowledge of R programming.

    • Joachim

      September 9, 2022 8:25 pm

      Hi Patrick,

      Thank you so much for the kind words, this is great to hear! 🙂

      Regards,
      Joachim

  • Nikki

    October 7, 2022 2:41 pm

    After extracting certain columns and data from a bigger data set, how can I save the extracted data into a new file that I can then use, e.g. a csv?

    • Joachim

      November 14, 2022 11:50 am

      Hey Nikki,

      Please excuse the late response. I was on a long holiday so unfortunately I wasn’t able to reply sooner. Still need help with your code?

      Regards,
      Joachim

  • Massub Tehseen

    October 31, 2022 5:31 pm

    Hi Joachim,
    Thank you for your tutorial. I need some help, please.
    I have data with 2000 columns and thousands of rows. I want to retain all the rows and columns 1:10, and in addition I need 600 specific columns from the rest of the data. I have the list of the column names that I want. Is there an easy way to match and extract the columns that I want?
    Thanking you in advance.

    • Joachim

      November 14, 2022 12:46 pm

      Hi Massub,

      Thank you so much for the kind words, glad you find my tutorials helpful!

      I apologize for the delayed reply. I was on a long vacation, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your syntax?

      Regards,
      Joachim

I’m Joachim Schork. On this website, I provide statistics tutorials as well as code in Python and R programming.

Related Tutorials

Add Empty Column to Data Frame in R (2 Examples)

Reshape Data with Multiple Measure Columns from Wide to Long Format in R (2 Examples)

FAQs

How to select a subset of columns in R? ›

Specific columns can be selected with the select argument of the subset function. The function takes the following arguments:
  1. x: object to be subsetted.
  2. subset: logical expression indicating elements or rows to keep: missing values are taken as false.
  3. select: expression, indicating columns to select from a data frame.
  4. drop: passed on to [ indexing operator.

How to extract certain columns from a dataframe in R? ›

To pick out single or multiple columns, use the select() function of the dplyr package. The select() function expects a data frame as its first input ('argument', in R language), followed by the names of the columns you want to extract, with a comma between each name.

How do I extract a subset of a Dataframe in R? ›

The most general way to subset a data frame by rows and/or columns is the base R extraction operator [ ], written with matched square brackets instead of the usual matched parentheses. For a data frame named d the general format is d[rows, columns] .

How do I select a specific column data frame in R? ›

To select a specific column, you can also type in the name of the dataframe, followed by a $ , and then the name of the column you are looking to select. In this example, we will be selecting the payment column of the dataframe.
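
Since the payment data mentioned in this answer is not part of this page, the following sketch illustrates the $ operator with the example data of the tutorial instead. The double-bracket form is shown as an alternative for cases where the column name is stored in a variable (col_name is a name assumed for this illustration):

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

# The $ operator returns the column as a vector
x1_values <- data$x1

# Double brackets return the same vector
x1_values_2 <- data[["x1"]]

# Double brackets also work with a column name stored in a variable
col_name <- "x3"
x3_values <- data[[col_name]]
```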

How do you subset specific rows and columns in R? ›

dataframe[i, j] is the syntax used to subset rows and columns of an R data frame, where i is an index or logical vector that selects rows and j is an index or logical vector that selects columns. For example, newdata[1, 3] will return the value from the 1st row and 3rd column.

How do you select subsets of data? ›

In Python's pandas library, square brackets [] are used when selecting subsets of data. Inside these brackets, you can use a single column/row label, a list of column/row labels, a slice of labels, a conditional expression or a colon. Select specific rows and/or columns using loc when using the row and column names.

How do I extract a list of columns from a Dataframe? ›

Use the tolist() Method

As we have seen, the tolist() method is the simplest and most efficient way to extract a list from a Pandas dataframe column. It takes advantage of the optimized internal data structures of Pandas and NumPy to quickly convert a column of data into a list.

How to extract specific values from dataframe in R? ›

Extract data frame cell value
  1. Extract value of a single cell: df_name[x, y] , where x is the row number and y is the column number of a data frame called df_name .
  2. Extract the entire row: df_name[x, ] , where x is the row number. ...
  3. Extract the entire column: df_name[, y] where y is the column number.

Which operation is used to extract specific columns from a table? ›

The correct answer is projection. The projection operation returns specific columns of a table and is also known as vertical partitioning. It is denoted by the Greek letter pi (π).

How does subset() work in R? ›

To limit your dataset to a subset of observations in base R, use brackets [ ] or subset() . With brackets you can subset based on row numbers, row names, or a logical expression. With subset() , you must use a logical expression. Selecting a subset of observations is called filtering.

How do you select a subset of a variable in R? ›

Selection using the Subset Function

The subset( ) function is the easiest way to select variables and observations. For example, we can select all rows that have a value of age greater than or equal to 20 or age less than 10, and keep the ID and Weight columns.
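
The data for this example is not included on this page, so the sketch below uses a small made-up data frame. The name people and all of its values are assumptions for illustration only:

```r
# Hypothetical data with the columns ID, age, and Weight
people <- data.frame(ID     = 1:6,
                     age    = c(5, 12, 20, 35, 8, 19),
                     Weight = c(18, 40, 70, 82, 25, 65))

# Keep rows with age >= 20 or age < 10, and only the columns ID and Weight
selected <- subset(people, age >= 20 | age < 10, select = c(ID, Weight))
```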

How do you select certain variables in R? ›

To limit your dataset to a subset of variables in base R, use brackets [ ] or subset() . In tidyverse , use select() . As with subset() , you name the variables you want to keep, without quotes, or precede with a minus sign the names of variables you want to drop.

What does %>% mean in R? ›

R pipes are a way to chain multiple operations together in a concise and expressive way. They are represented by the %>% operator, which takes the output of the expression on its left and passes it as the first argument to the function on its right. Using pipes in R allows us to link a sequence of analysis steps.

How do I select all columns except one in a Dataframe in R? ›

The select() function of dplyr allows users to select all columns of the data frame except for the specified columns. To exclude columns, add the - operator before the name of the column or columns when passing them as an arguments to select().

How do you select all columns but one in a Dataframe in R? ›

Select All Columns Except One Column by Index in R

First, let's use the R base bracket notation df[] to select all columns except one column by Index. This notation takes syntax df[, columns] to select columns in R, And to ignore columns you have to use the – (negative) operator.
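
A short sketch of the exclusion approach, using the example data of this tutorial (the object names are chosen for illustration). Note that the minus sign only works with numeric positions in base R; to exclude a column by name, the names to keep can be computed with setdiff() or with a logical vector:

```r
# Example data frame from this tutorial
data <- data.frame(x1 = c(2, 1, 5, 1),
                   x2 = c(7, 1, 1, 5),
                   x3 = c(9, 5, 4, 9),
                   x4 = c(3, 4, 1, 2))

# Exclude the second column by its index
without_second <- data[ , -2]

# Exclude a column by name: keep all names except "x2"
without_x2 <- data[ , setdiff(names(data), "x2")]

# Same result with a logical vector
without_x2_lgl <- data[ , !(names(data) %in% "x2")]
```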

How do I subset only numeric columns in R? ›

The easiest way to do it is by using select_if function of dplyr package but we can also do it through lapply.
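
As a base R alternative that needs no additional package, the numeric columns can be identified with sapply() and is.numeric(). The data frame mixed below is made up for this sketch. In recent dplyr versions (1.0.0 or later), select(where(is.numeric)) is, as far as I know, the recommended replacement for select_if.

```r
# Hypothetical data frame with columns of different types
mixed <- data.frame(id   = c("a", "b", "c"),
                    x    = c(1.5, 2, 3),
                    y    = 1:3,
                    flag = c(TRUE, FALSE, TRUE))

# Logical vector: TRUE for every numeric column
is_num <- sapply(mixed, is.numeric)

# Keep only the numeric columns; drop = FALSE guards the single-column case
numeric_only <- mixed[ , is_num, drop = FALSE]
```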

How do you subset all columns except one in R? ›

R provides the subset() function which can be used to drop all columns in a data frame except those that are specified. This is done by specifying the column names to be kept in the select argument of the subset() function.

Author: Kelle Weber
