In R, a data frame is a popular data structure used to store and manipulate data. It is a table-like structure with rows and columns, and each column has a name. These names are used to identify the variables in the data frame.
In this tutorial, we will discuss different ways to get column names in R.
Here’s an example of a simple data frame in R that contains information about multiple programming websites.
df <- data.frame(
  Website = c("RTutorial", "CodeAcademy", "Udemy", "Coursera"),
  Number_of_Courses = c(120, 250, 500, 800),
  Average_Course_Duration = c("3 hours", "4 hours", "5 hours", "6 hours"),
  Pricing = c("Free", "Subscription", "Subscription", "Subscription")
)
      Website Number_of_Courses Average_Course_Duration      Pricing
1   RTutorial               120                 3 hours         Free
2 CodeAcademy               250                 4 hours Subscription
3       Udemy               500                 5 hours Subscription
4    Coursera               800                 6 hours Subscription
This data frame contains four columns:
- Website: The name of the website.
- Number_of_Courses: The number of courses available on the website.
- Average_Course_Duration: The average duration of a course on the website.
- Pricing: The pricing model for the website (e.g., free, subscription-based).
You can use the following functions to get the column names of this data frame.
colnames()
This function is designed to get and set column names of matrix-like objects like data frames.
When called with a data frame or matrix as its only argument, it returns a character vector containing that object's column names. For example, colnames(df) returns the column names of the data frame df.
> colnames(df)
[1] "Website"                 "Number_of_Courses"
[3] "Average_Course_Duration" "Pricing"
You can select specific column names with the indexing operator []. It lets you pick one or more names by their positions in the vector that colnames() returns.
This is how you can, for example, get the name of the first column, the names of the second and fourth columns, and the names of the second through third columns, respectively:
> colnames(df)[1]
[1] "Website"
> colnames(df)[c(2, 4)]
[1] "Number_of_Courses" "Pricing"
> colnames(df)[2:3]
[1] "Number_of_Courses" "Average_Course_Duration"
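Since colnames() can set names as well as get them, the same indexing also works on the left-hand side of an assignment. The snippet below is a minimal sketch of that idea; the copy df2 and the new name "Courses" are made up for illustration and are not part of the original example.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  Website = c("RTutorial", "CodeAcademy", "Udemy", "Coursera"),
  Number_of_Courses = c(120, 250, 500, 800),
  Average_Course_Duration = c("3 hours", "4 hours", "5 hours", "6 hours"),
  Pricing = c("Free", "Subscription", "Subscription", "Subscription")
)

# Work on a copy (df2 is a hypothetical name) so df itself stays unchanged
df2 <- df

# Rename only the second column via the replacement form of colnames()
colnames(df2)[2] <- "Courses"

# Inspect the result: only the second name has changed
colnames(df2)
```

Assigning to a copy keeps the original column names available for the remaining examples.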
names()
The names() function in R is a generic function that can be used to retrieve or set the names of an object.
It gets or sets the names of an object's elements, whether those are the columns of a data frame, the elements of a list or named vector, or other objects that have a concept of “names”.
For data frames, names() can therefore be used to retrieve or set the column names (for a matrix, use colnames() instead, since names() refers to the matrix's individual elements). This command returns a character vector whose elements are the names of the columns in our data frame:
> names(df)
[1] "Website"                 "Number_of_Courses"
[3] "Average_Course_Duration" "Pricing"
Similarly, you can access individual column names using R’s indexing operator:
> names(df)[1]
[1] "Website"
> names(df)[c(2, 4)]
[1] "Number_of_Courses" "Pricing"
> names(df)[2:3]
[1] "Number_of_Courses" "Average_Course_Duration"
In this case, colnames() and names() are equivalent, and you can use either of them to retrieve the column names of a data frame.
However, colnames() is intended for matrix-like objects only, whereas names() also works on other R objects such as lists and named vectors.
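The difference between the two functions shows up once you move away from data frames. The following sketch compares them on a matrix and on a list; the objects m and lst and their contents are hypothetical and only serve to illustrate the behavior.

```r
# A small 2 x 3 matrix with column names (m and its values are hypothetical)
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

colnames(m)  # the column names of the matrix
names(m)     # NULL: the individual matrix elements are unnamed

# A named list (also hypothetical)
lst <- list(x = 1, y = "two")

names(lst)     # the element names of the list
colnames(lst)  # NULL: a list has no dimensions, so it has no columns
```

For data frames either function is fine; for matrices colnames() is the safer choice.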
dimnames()
The dimnames() function in R is a generic function that can be used to retrieve or set the names of the dimensions of an object.
For a data frame, dimnames() returns a list of length 2, with the first element being the row names and the second element being the column names; the same holds for a matrix whose dimension names have been set. For example, dimnames(df) returns a list like this:
> dimnames(df)
[[1]]
[1] "1" "2" "3" "4"

[[2]]
[1] "Website"                 "Number_of_Courses"
[3] "Average_Course_Duration" "Pricing"
You can use dimnames() to get the column names of a data frame or matrix by accessing the second element of the list that it returns. In particular, this command returns all the column names in our data frame:
> dimnames(df)[[2]]
[1] "Website"                 "Number_of_Courses"
[3] "Average_Course_Duration" "Pricing"
You can apply the indexing operator to this vector to get one or more column names in it:
> dimnames(df)[[2]][1]
[1] "Website"
> dimnames(df)[[2]][c(2, 4)]
[1] "Number_of_Courses" "Pricing"
> dimnames(df)[[2]][2:3]
[1] "Number_of_Courses" "Average_Course_Duration"
Although it returns the same column names, dimnames() is more general than colnames() and names() in the sense that it can also be used to get or set row names.
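To illustrate the point about row names, here is a small sketch that reads the row names through dimnames() and then sets row and column names in a single assignment. The copy df2, the row labels "r1" to "r4", and the shortened column names are invented for this example.

```r
# Rebuild the example data frame so the snippet runs on its own
df <- data.frame(
  Website = c("RTutorial", "CodeAcademy", "Udemy", "Coursera"),
  Number_of_Courses = c(120, 250, 500, 800),
  Average_Course_Duration = c("3 hours", "4 hours", "5 hours", "6 hours"),
  Pricing = c("Free", "Subscription", "Subscription", "Subscription")
)

# The first element of the list holds the row names
dimnames(df)[[1]]

# It matches what rownames() returns
identical(dimnames(df)[[1]], rownames(df))

# Set row and column names together on a copy (all new labels are hypothetical)
df2 <- df
dimnames(df2) <- list(
  c("r1", "r2", "r3", "r4"),
  c("Site", "Courses", "Duration", "Price")
)

rownames(df2)
colnames(df2)
```

The list passed to dimnames() must have exactly two elements whose lengths match the number of rows and columns, otherwise R raises an error.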
There are multiple ways to get column names in R, such as the colnames(), names(), and dimnames() functions. All of them can retrieve the column names of a data frame, but dimnames() returns the row names and the column names at the same time.