I'm refactoring a script, originally written in RStudio, to run under R Services in SQL Server 2017. The stored procedure uses dplyr
and is supposed to return the mean and standard deviation of the responses for a given StudyID.
The stored procedure is as follows:
ALTER PROCEDURE [dbo].[spCodeMeans]
    -- Add the parameters for the stored procedure here
    @StudyID int,
    @StudyID_outer int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    BEGIN TRY
        exec sp_execute_external_script
            @language = N'R',
            @script = N'
# Summary Mean
#
# Calculates the mean of all independent variables in a table of data
# grouped by \code{code}. Note that independent variables are
# identified as all columns matching the following pattern: the letter
# "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective means of
# each independent variable.
#install.packages("dplyr")
code_mean <- function(x) {
  `%>%` = magrittr::`%>%`
  dplyr::group_by(x, .data$StudyID) %>%
    dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)
}
cmresult <- code_mean(x = clsdStudies)
# Summary Standard Deviation
#
# Calculates the standard deviation of all independent variables in a
# table of data grouped by \code{code}. Note that independent variables
# are identified as all columns matching the following pattern: the
# letter "c" followed by a one-or-more digit number.
#
# @param x Table to summarize
# @importFrom rlang .data
#
# @return Summary table where each distinct \code{code} value is
# represented by one row with columns for the respective standard
# deviations of each independent variable.
code_sd <- function(x) {
  `%>%` = magrittr::`%>%`
  dplyr::group_by(x, .data$StudyID) %>%
    dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), stats::sd)
}
sdresult <- code_sd(x = clsdStudies)
',
            @input_data_1 = N'
                Select Responses =
                    c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue,
                    c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
                    c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
                    c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1,
                    c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
                from ClosedStudyResponses c
                where DependentVarYN = 0
            ',
            @input_data_1_name = N'clsdStudies',
            @params = N'@StudyID int',
            @StudyID = @StudyID_outer
            --@output_data_1_name = N'dfcm',
            --@output_data_2_name = N'dfsd'
        WITH RESULT SETS (
            (cmsresult varchar(MAX)),
            (sdresult varchar(MAX)))
    END TRY
    BEGIN CATCH
        THROW;
    END CATCH
END
When I run the stored procedure and pass in a valid StudyID, I receive the following error:
Msg 39004, Level 16, State 20, Line 4
A 'R' script error occurred during execution of 'sp_execute_external_script'
with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 4
An external script error occurred:
Error in mutate_impl(.data, dots) :
Evaluation error: Column `StudyID`: not found in data.
Calls: source ... as.data.frame -> mutate -> mutate.tbl_df -> mutate_impl ->
.Call
Error in execution. Check the output for more information.
Error in eval(ei, envir) :
Error in execution. Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
How do I resolve this error so that the StudyID parameter is passed into the dplyr functions?
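One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: in T-SQL, `Select Responses = c.StudyID` is column-alias syntax, so the first column would reach R named `Responses` rather than `StudyID`. The toy data frame below (made-up values, only two of the `c` columns) stands in for `clsdStudies` and uses base R instead of dplyr so that it runs without extra packages. The rename step is a hypothetical R-side workaround; dropping the `Responses =` alias from the query would be the SQL-side equivalent.

```r
# Toy stand-in for the data frame SQL Server hands to R (values are made up).
# The first column is called "Responses" because the query aliases it that way.
clsdStudies <- data.frame(
  Responses = c(101L, 101L, 102L, 102L),
  RespID    = c(1L, 2L, 1L, 2L),
  c1        = c(3, 5, 4, 8),
  c2        = c(2, 4, 6, 6)
)

# Check which columns actually arrived before grouping on StudyID.
has_study_id <- "StudyID" %in% names(clsdStudies)

# Hypothetical repair on the R side: give the column the name the script expects.
names(clsdStudies)[names(clsdStudies) == "Responses"] <- "StudyID"

# Base R stand-in for the grouped mean over the c<digits> columns.
# Note: this pattern is anchored, unlike the unanchored "c\\d+" in the script.
value_cols <- grep("^c[0-9]+$", names(clsdStudies), value = TRUE)
cmresult <- aggregate(clsdStudies[value_cols],
                      by = list(StudyID = clsdStudies$StudyID),
                      FUN = mean)
print(cmresult)
```

Printing `names(clsdStudies)` at the top of the real `@script` would show whether the same mismatch occurs with the actual table.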