Apply function to table or timetable variables
Syntax
B = varfun(func,A)
B = varfun(func,A,Name,Value)
Description
B = varfun(func,A) applies the function func separately to each variable of the table or timetable A and returns the results in the table or timetable B.
The function func must take one input argument and return an array with the same number of rows each time it is called. The ith variable in the output argument, B{:,i}, is equal to func(A{:,i}).
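As a minimal sketch of this relationship, the following snippet applies a squaring function to a small table and checks each output variable against a direct call on the corresponding input variable. The table values are hypothetical and chosen only for illustration.

```matlab
% Hypothetical sample data, not taken from the examples on this page.
A = table([1;2;3],[4;5;6]);

% Apply an element-wise function to every variable of A.
B = varfun(@(x) x.^2,A);

% Check that B{:,i} equals func(A{:,i}) for each variable.
for i = 1:width(A)
    assert(isequal(B{:,i},A{:,i}.^2))
end
```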
B = varfun(func,A,Name,Value) specifies options using one or more name-value arguments. For example, you can use the GroupingVariables name-value argument to perform calculations on groups of data within table variables. For more information about calculations on groups of data, see Calculations on Groups of Data.
Examples
Apply Element-Wise Function
Apply an element-wise function to the variables of a table.
Create a table that contains numeric variables.
A = table([10.71;-2.05;-0.35;-0.82;1.57],[9.23;3.12;-1.18;0.23;16.41])
A=5×2 table
    Var1     Var2
    _____    _____
    10.71     9.23
    -2.05     3.12
    -0.35    -1.18
    -0.82     0.23
     1.57    16.41
Round the numeric values in A by applying the round function. To specify a function as an input argument to varfun, use the @ symbol. The variable names of the output table are based on the function name and the variable names from the input table.
B = varfun(@round,A)
B=5×2 table
    round_Var1    round_Var2
    __________    __________
        11             9
        -2             3
         0            -1
        -1             0
         2            16
Apply Function That Reduces Table Variables
You can apply a function, such as sum or max, that reduces table variables along the first dimension. For example, use varfun to calculate the mean of each variable in a table.
Create a table that contains numeric variables.
A = table([0.71;-2.05;-0.35;-0.82;1.57],[0.23;0.12;-0.18;0.23;0.41])
A=5×2 table
    Var1     Var2
    _____    _____
     0.71     0.23
    -2.05     0.12
    -0.35    -0.18
    -0.82     0.23
     1.57     0.41
Apply the mean function to all the variables of the table. The output table contains the mean value of each variable of the input table.
B = varfun(@mean,A)
B=1×2 table
    mean_Var1    mean_Var2
    _________    _________
     -0.188        0.162
To have varfun return a numeric vector instead of a table, specify the OutputFormat name-value argument as "uniform". To use the "uniform" output format, func must always return a scalar.
B = varfun(@mean,A,"OutputFormat","uniform")
B = 1×2
   -0.1880    0.1620
Apply Function to Groups Within Variables
Create a table that has numeric data variables and a nonnumeric variable that is a grouping variable. Then perform a calculation on each group within the numeric variables.
Read data from a CSV (comma-separated values) file into a table. The sample file contains test scores for 10 students from two different schools.
scores = readtable("testScores.csv","TextType","string");
scores.School = categorical(scores.School)
scores=10×5 table
     LastName       School      Test1    Test2    Test3
    __________    __________    _____    _____    _____
    "Jeong"       XYZ School     90       87       93
    "Collins"     XYZ School     87       85       83
    "Torres"      XYZ School     86       85       88
    "Phillips"    ABC School     75       80       72
    "Ling"        ABC School     89       86       87
    "Ramirez"     ABC School     96       92       98
    "Lee"         XYZ School     78       75       77
    "Walker"      ABC School     91       94       92
    "Garcia"      ABC School     86       83       85
    "Chang"       XYZ School     79       76       82
Calculate the mean score for each test by school. The variables Test1, Test2, and Test3 are the numeric data variables. The School variable is the grouping variable. When you specify a grouping variable, its unique values define groups that corresponding values in the data variables belong to.
vars = ["Test1","Test2","Test3"];
meanScoresBySchool = varfun(@mean, ...
                            scores, ...
                            "InputVariables",vars, ...
                            "GroupingVariables","School")
meanScoresBySchool=2×5 table
      School      GroupCount    mean_Test1    mean_Test2    mean_Test3
    __________    __________    __________    __________    __________
    ABC School        5            87.4           87           86.8
    XYZ School        5              84         81.6           84.6
The output table includes a variable named GroupCount that indicates the number of rows from the input table in each group.
Apply Function to Groups Within Timetable Variables
Create a timetable containing sample data. The row times of the timetable can define groups because row times can be duplicates.
Timestamps = datetime(2023,1,1)+days([0 1 1 2 3 3])';
A = timetable(Timestamps, ...
              [0.71;-2.05;-0.35;-0.82;1.57;0.09], ...
              [0.23;0.12;-0.18;0.23;0.41;0.02], ...
              'VariableNames',["x","y"])
A=6×2 timetable
    Timestamps       x        y
    ___________    _____    _____
    01-Jan-2023     0.71     0.23
    02-Jan-2023    -2.05     0.12
    02-Jan-2023    -0.35    -0.18
    03-Jan-2023    -0.82     0.23
    04-Jan-2023     1.57     0.41
    04-Jan-2023     0.09     0.02
Compute the mean values of the variables in the timetable by day. Specify the vector of row times as the grouping variable. The output B is a timetable because the input A is a timetable. When you specify the vector of row times as the grouping variable, you cannot specify any variable as another grouping variable.
B = varfun(@mean,A,"GroupingVariables","Timestamps")
B=4×3 timetable
    Timestamps     GroupCount    mean_x    mean_y
    ___________    __________    ______    ______
    01-Jan-2023        1          0.71      0.23
    02-Jan-2023        2          -1.2     -0.03
    03-Jan-2023        1         -0.82      0.23
    04-Jan-2023        2          0.83     0.215
Pass Optional Arguments to Applied Function
To pass optional arguments when you apply a function, wrap the function call in an anonymous function.
Create a table that contains numeric variables. Assign NaN to some elements of the table.
A = table([10.71;-2.05;NaN;-0.82;1.57],[9.23;NaN;-1.18;0.23;16.41])
A=5×2 table
    Var1     Var2
    _____    _____
    10.71     9.23
    -2.05      NaN
      NaN    -1.18
    -0.82     0.23
     1.57    16.41
By default, the mean function returns NaN when input arrays have NaNs.
B = varfun(@mean,A)
B=1×2 table
    mean_Var1    mean_Var2
    _________    _________
       NaN          NaN
To omit NaNs when you apply mean, specify the "omitnan" option. Because varfun calls func with only one input argument, wrap a call that specifies "omitnan" in an anonymous function.
func = @(x) mean(x,"omitnan");
Calculate the mean values with "omitnan" by applying the anonymous function.
C = varfun(func,A)
C=1×2 table
    Fun_Var1    Fun_Var2
    ________    ________
     2.3525      6.1725
Input Arguments
func - Function
function handle
Function, specified as a function handle. You can specify a handle for an existing function, define the function in a file, or specify an anonymous function. The function takes one input argument and must have a syntax in this form:
result = f(arg)
To call f on the variables of A, specify func as shown in this call to varfun.
func = @f;
B = varfun(func,A);
For every variable in A, varfun calls func on that variable, and then assigns the output of func as the corresponding variable in output B.
Some further considerations:
- The function that func represents can have other syntaxes with additional optional arguments. But when varfun calls the function, it calls the syntax that has only one input argument.
  For example, the mean function has syntaxes that specify optional arguments, such as "omitnan". But if you specify func as @mean, then varfun calls mean using the mean(arg) syntax.
- To call a function with optional arguments, wrap it in an anonymous function. For example, to call mean with the "omitnan" option, specify func as @(x) mean(x,"omitnan").
- If func returns an array with a different number of rows each time it is called, then specify the OutputFormat name-value argument as "cell". Otherwise, func must return an array with the same number of rows each time it is called.
- If func corresponds to more than one function file (that is, if func represents a set of overloaded functions), MATLAB® determines which function to call based on the class of the input arguments.
Example: B = varfun(@mean,A) calculates the mean value of an input.
Example: B = varfun(@(x) x.^2,A) calculates the square of each element of an input.
Example: B = varfun(@(x) mean(x,"omitnan"),A) calls mean with the "omitnan" option specified.
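The row-count consideration for func can be illustrated with a short sketch. It assumes a small hypothetical table and uses unique, which returns a different number of rows for each variable here, so the call specifies OutputFormat as "cell".

```matlab
% Hypothetical data: Var1 has a repeated value, Var2 does not.
A = table([1;1;2;3],[5;6;7;8]);

% unique returns 3 rows for Var1 and 4 rows for Var2, so the results
% cannot be combined into one table. Request a cell array instead.
C = varfun(@unique,A,"OutputFormat","cell");

% Each cell holds the result for one variable of A.
uniqueVar1 = C{1};
uniqueVar2 = C{2};
```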
A - Input table
table | timetable
Input table, specified as a table or timetable.
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: B = varfun(func,A,InputVariables=["Var2","Var3"]) uses only the variables named Var2 and Var3 in A as the inputs to func.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: B = varfun(func,A,"InputVariables",["Var2","Var3"]) uses only the variables named Var2 and Var3 in A as the inputs to func.
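The two spellings of a name-value argument are meant to be interchangeable. The sketch below, which uses a hypothetical three-variable table, calls varfun both ways and compares the results. The Name=Value form requires R2021a or later.

```matlab
% Hypothetical table with default variable names Var1, Var2, Var3.
A = table([1;2;3],[4;5;6],[7;8;9]);

% Name=Value syntax (R2021a and later).
B1 = varfun(@sum,A,InputVariables=["Var2","Var3"]);

% Comma-separated syntax with the name in quotes.
B2 = varfun(@sum,A,"InputVariables",["Var2","Var3"]);

% Both calls should produce the same table.
sameResult = isequal(B1,B2);
```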
InputVariables - Variables of A to pass to func
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | pattern scalar | logical vector | function handle
Variables of A to pass to func, specified using one of the indexing schemes from this table.
Indexing Scheme | Examples |
---|---|
Variable names | ["Var2","Var3"] |
Variable index | [1 3 4] |
Function handle | @isnumeric |
Example: B = varfun(func,A,InputVariables=[1 3 4]) uses only the first, third, and fourth variables in A as the inputs to func.
Example: B = varfun(func,A,InputVariables=@isnumeric) uses only the numeric variables in A as the inputs to func.
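A function handle is convenient when a table mixes numeric and nonnumeric variables. The following sketch assumes a hypothetical table with one string variable and two numeric variables, and selects only the numeric ones.

```matlab
% Hypothetical table: Name is text, X and Y are numeric.
T = table(["a";"b";"c"],[1;2;3],[10;20;30], ...
          'VariableNames',["Name","X","Y"]);

% Pass only the variables for which isnumeric returns true.
B = varfun(@mean,T,"InputVariables",@isnumeric);

% B has one output variable per selected input variable.
outputNames = string(B.Properties.VariableNames);
```

Without InputVariables, mean would also be called on the Name variable, which is not numeric.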
GroupingVariables - Variables of A to use as grouping variables
positive integer | vector of positive integers | string array | character vector | cell array of character vectors | pattern scalar | logical vector
Variables of A to use as grouping variables, specified using one of the indexing schemes from this table.
Indexing Scheme | Examples |
---|---|
Variable names | "Var3" or ["Var3","Var4"] |
Variable index | [3 4] |
The unique values in the grouping variables define groups. Rows in A where the grouping variables have the same values belong to the same group. varfun applies func to each group of rows within each of the remaining variables of A, rather than to entire variables. For more information on calculations using grouping variables, see Calculations on Groups of Data.
Grouping variables can have any of the data types listed in this table.
Values That Specify Groups | Data Type of Grouping Variable |
---|---|
Numbers | Numeric or logical vector |
Text | String array or cell array of character vectors |
Dates and times | |
Categories | |
Bins | Vector of binned values, created by binning a continuous distribution of numeric, |
Many data types have ways to represent missing values, such as NaNs, NaTs, undefined categorical values, or missing strings. If any grouping variable has a data type that can represent missing values, then rows where missing values occur in that grouping variable do not belong to any group and are excluded from the output.
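The effect of a missing value in a grouping variable can be sketched with a numeric grouping variable that contains a NaN. The table below is hypothetical. The row whose group is NaN is expected to be left out of every group.

```matlab
% Hypothetical data: the third row has a missing group (NaN).
T = table([1;2;NaN;1],[10;20;30;40], ...
          'VariableNames',["Group","Value"]);

% Sum Value within each group defined by Group.
B = varfun(@sum,T,"GroupingVariables","Group");

% Only three of the four input rows are counted.
rowsCounted = sum(B.GroupCount);
```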
Row labels can be grouping variables. You can group on row labels alone, on one or more variables in A, or on row labels and variables together.
- If A is a table, then the labels are row names.
- If A is a timetable, then the labels are row times.
The output B has one row for each group of rows in the input A. If B is a table or timetable, then B has:
- Variables corresponding to the input table variables that func was applied to
- Variables corresponding to the grouping variables
- A new variable, GroupCount, whose values are the number of rows of the input A that are in each group
If B is a timetable, then B also has:
- Row times, where the first row time from each group of rows in A is the corresponding row time in B. To return B as a table without row times, specify OutputFormat as "table".
Example: B = varfun(func,A,GroupingVariables="Var3") uses the variable named Var3 in A as a grouping variable.
Example: B = varfun(func,A,GroupingVariables=["Var3","Var4"]) uses the variables named Var3 and Var4 in A as grouping variables.
Example: B = varfun(func,A,GroupingVariables=[3 4]) uses the third and fourth variables in A as grouping variables.
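The difference between timetable and table output for grouped calculations can be sketched as follows. The timetable, its Site variable, and its values are hypothetical.

```matlab
% Hypothetical timetable with a categorical grouping variable.
Time = datetime(2023,1,1) + days([0;0;1;1]);
Site = categorical(["N";"S";"N";"S"]);
x = [1;2;3;4];
TT = timetable(Time,Site,x);

% Default output format: the result is a timetable, like the input.
B1 = varfun(@mean,TT,"GroupingVariables","Site");

% Request a table to drop the row times from the result.
B2 = varfun(@mean,TT,"GroupingVariables","Site","OutputFormat","table");
```

In both results, each group contributes one row, with GroupCount and mean_x alongside the grouping variable.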
OutputFormat - Format of B
"auto" (default) | "table" | "timetable" | "uniform" | "cell"
Format of B, specified as one of the values in this table.
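Value | Description |
---|---|
"auto" (default) | B has the same data type as the input A. If A is a table, then B is a table. If A is a timetable, then B is a timetable. |
"table" | B is a table, without row times even if A is a timetable. |
"timetable" | B is a timetable. |
"uniform" | B is a vector. func must always return a scalar. |
"cell" | B is a cell array. Use this format when func returns arrays with different numbers of rows. |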
Example: B = varfun(func,A,OutputFormat="uniform") returns the output as a vector.
ErrorHandler - Function to call if func fails
function handle
Function to call if func fails, specified as a function handle. If func throws an error, then the error handler function specified by ErrorHandler catches the error and takes the action specified in the function. The error handler either must throw an error or return the same number of outputs as func.
If you do not specify ErrorHandler, then varfun rethrows the error that it caught from func.
The first input argument of the error handler is a structure with these fields:
- cause: MException object that contains information about the error (since R2024a)
- index: Index of the variable where the error occurred
- name: Name of the variable where the error occurred
The remaining input arguments to the error handler are the input arguments for the call to func that made func throw the error.
For example, suppose that func returns two doubles as output arguments. You can specify the error handler as a function that raises a warning and returns two output arguments.
function [A,B] = errorFunc(S,varargin)
    % Report the caught error as a warning, then return placeholder outputs.
    warning(S.cause.identifier,S.cause.message);
    A = NaN;
    B = NaN;
end
In releases before R2024a, the first input argument of the error handler is a structure with these fields:
- identifier: Error identifier
- message: Error message text
- index: Index of the variable where the error occurred
- name: Name of the variable where the error occurred
Example: B = varfun(func,A,ErrorHandler=@errorFunc) specifies errorFunc as the error handler.
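A minimal sketch of an error handler in use is shown below. It assumes a hypothetical table whose second variable is a cell array, so that mean fails on it, and a handler written as an anonymous function that ignores the error details and returns NaN. Because the handler does not read any fields of its first input, it does not depend on which release's structure layout is in effect.

```matlab
% Hypothetical table: Var2 is a cell array, so mean(Var2) throws an error.
T = table([1;2;3],{'a';'b';'c'});

% Handler takes the error structure S and the original inputs,
% ignores them, and returns one output, as mean does.
handler = @(S,varargin) NaN;

% mean succeeds on Var1. For Var2, the handler supplies the result.
B = varfun(@mean,T,"ErrorHandler",handler);
```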
Output Arguments
More About
Calculations on Groups of Data
In data analysis, you commonly perform calculations on groups of data. For such calculations, you split one or more data variables into groups of data, perform a calculation on each group, and combine the results into one or more output variables. You can specify the groups using one or more grouping variables. The unique values in the grouping variables define the groups that the corresponding values of the data variables belong to.
For example, the diagram shows a simple grouped calculation that splits a 6-by-1 numeric vector into two groups of data, calculates the mean of each group, and then combines the outputs into a 2-by-1 numeric vector. The 6-by-1 grouping variable has two unique values, AB and XYZ.
You can specify grouping variables that have numbers, text, dates and times, categories, or bins.
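The split, apply, and combine steps described above can be sketched with findgroups and splitapply, and then repeated with varfun for comparison. The data values and the assignment of rows to the AB and XYZ groups are hypothetical.

```matlab
% Hypothetical 6-by-1 data vector and grouping variable.
data = [2;1;3;4;5;6];
groups = ["AB";"XYZ";"XYZ";"AB";"XYZ";"AB"];

% Split: assign a group number to each row (AB is 1, XYZ is 2).
G = findgroups(groups);

% Apply and combine: mean of each group, returned as a 2-by-1 vector.
means = splitapply(@mean,data,G);

% The same grouped calculation with varfun on a table.
T = table(groups,data);
B = varfun(@mean,T,"GroupingVariables","groups");
```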
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
This function supports tall arrays with these limitations:
- The func input must always return a tall array.
- Supported name-value arguments are:
  - InputVariables: Value cannot be a function handle or pattern expression.
  - GroupingVariables: Value cannot be a pattern expression.
  - OutputFormat: Value can be "uniform", "table", "timetable", or "cell" only.
- When the input array is a tall timetable and OutputFormat is "timetable", the specified function must return an array with the same size in the first dimension as the input. Specify OutputFormat as "table" when the input function is a reduction function such as mean.
For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- The function handle input, func, must be constant.
- While function handles can be inputs to varfun itself, they cannot be inputs to your entry point functions. Specify func within the code meant for code generation. For more information, see Function Handle Limitations for Code Generation (MATLAB Coder).
- The values for all name-value arguments must be constant.
- The values of the InputVariables and GroupingVariables name-value arguments do not support pattern expressions.
- The ErrorHandler name-value argument is not supported for code generation.
- Variable-size input arguments are not supported.
- Grouping variables cannot have duplicate values in generated code.
- You cannot specify OutputFormat as "cell" if you specify the GroupingVariables name-value argument and the function returns a different data type for each variable specified by InputVariables.
- If the input is a timetable and you specify GroupingVariables, then the output is always an irregular timetable.
- If you specify groups and the number of groups is not known at compile time, and that number is zero, then empty double variables in the output might have sizes of 1-by-0 in generated code. In MATLAB, such variables have sizes of 0-by-0.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
Version History
Introduced in R2013b
R2023a: Match output data type to input data type by specifying the OutputFormat name-value argument as "auto"
To return an output whose data type matches the data type of the input, specify the OutputFormat name-value argument as "auto". This value is the default value.
See Also
rowfun | cellfun | structfun | arrayfun | findgroups | splitapply | groupsummary | convertvars | vartype
Topics
- Calculations When Tables Have Both Numeric and Nonnumeric Data
- Anonymous Functions
- Perform Calculations by Group in Table
- Summarize or Pivot Data in Tables Using Groups
- Data Cleaning and Calculations in Tables
- Grouped Calculations in Tables and Timetables