I'm using dplyr
in my stored procedure. I'm passing in a dataset that's returned by querying responses based on a provided input parameter, StudyID. The goal is to return the results of dplyr from the R script back to the stored procedure. However, I receive the error in my question title.
My stored procedure is as follows:
ALTER PROCEDURE [dbo].[spCodeMeans]
-- Add the parameters for the stored procedure here
@StudyID int,
@StudyID_outer int
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for procedure here
BEGIN TRY
exec sp_execute_external_script
@language = N'R',
@script = N'
# Summary Mean
#
# Calculates the mean of all independent variables in a table of data
# grouped by \code{code}. Note that independent variables are
# identified as all columns matching the following pattern: the letter
# "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective means of
# each independent variable.
#install.packages("dplyr")
code_mean <- function(x) {
dplyr::group_by(x, .data$code) %>%
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)
}
cmresult <- code_mean(x = input_data_1)
# Summary Standard Deviation
#
# Calculates the standard deviation of all independent variables in a
# table of data grouped by \code{code}. Note that independent variables
# are identified as all columns matching the following pattern: the
# letter "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective standard
# deviations of each independent variable.
code_sd <- function(x) {
dplyr::group_by(x, .data$code) %>%
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), stats::sd)
}
sdresult <- code_sd(x = input_data_1)
',
@input_data_1 = N'
Select Responses =
c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue,
c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1,
c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
from ClosedStudyResponses c
where DependentVarYN = 0
',
@params = N'@StudyID int',
@StudyID = @StudyID_outer
--@output_data_1_name = N'dfcm',
--@output_data_2_name = N'dfsd'
WITH RESULT SETS (
(cmsresult varchar(MAX)),
(sdresult varchar(MAX)))
END TRY
BEGIN CATCH
THROW;
END CATCH
END
I call spCodeMeans
per below:
DECLARE @out varchar(MAX), @out1 varchar(MAX);
EXEC spCodeMeans
@StudyID = 21,
@StudyID_outer = 21
SELECT @out, @out1;
GO
Error syntax:
Msg 39004, Level 16, State 20, Line 4
A 'R' script error occurred during execution of 'sp_execute_external_script'
with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 4
An external script error occurred:
Error in dplyr::group_by(x, .data$code) %>%
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), :
could not find function "%>%"
Calls: source -> withVisible -> eval -> eval -> code_mean
Error in execution. Check the output for more information.
Error in eval(ei, envir) :
Error in execution. Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
How do I remedy the error such that the query results are passed into the R Script and both mean and standard deviation are returned?
Update Per comments, I added to the stored procedure so that it is now as follows:
BEGIN TRY
exec sp_execute_external_script
@language = N'R',
@script = N'
# Summary Mean
#
# Calculates the mean of all independent variables in a table of data
# grouped by \code{code}. Note that independent variables are
# identified as all columns matching the following pattern: the letter
# "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective means of
# each independent variable.
#install.packages("dplyr")
code_mean <- function(x) {
`%>%` = magrittr:: `%>%`
dplyr::group_by(x, .data$code) %>%
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)
}
cmresult <- code_mean(x = input_data_1)
# Summary Standard Deviation
#
# Calculates the standard deviation of all independent variables in a
# table of data grouped by \code{code}. Note that independent variables
# are identified as all columns matching the following pattern: the
# letter "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective standard
# deviations of each independent variable.
code_sd <- function(x) {
`%>%` = magrittr:: `%>%`
dplyr::group_by(x, .data$code) %>%
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), stats::sd)
}
sdresult <- code_sd(x = input_data_1)
',
@input_data_1 = N'
Select Responses =
c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue,
c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1,
c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
from ClosedStudyResponses c
where DependentVarYN = 0
',
@params = N'@StudyID int',
@StudyID = @StudyID_outer
--@output_data_1_name = N'dfcm',
--@output_data_2_name = N'dfsd'
WITH RESULT SETS (
(cmsresult varchar(MAX)),
(sdresult varchar(MAX)))
END TRY
BEGIN CATCH
THROW;
END CATCH
END
The original error is resolved and is replaced by the following:
Msg 39004, Level 16, State 20, Line 4
A 'R' script error occurred during execution of 'sp_execute_external_script'
with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 4
An external script error occurred:
Error in dplyr::group_by(x, .data$code) : object 'input_data_1' not found
Calls: source ... eval -> code_mean -> %>% -> eval -> eval -> <Anonymous>
Error in execution. Check the output for more information.
Error in eval(ei, envir) :
Error in execution. Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
User contributions licensed under CC BY-SA 3.0