This is an old requirement, as old as SQL Server 7 (or even earlier). The typical T-SQL solution is to use xp_cmdshell or an undocumented stored procedure xp_delete_file.
These options work, but there are a number of items to consider:
xp_cmdshell may not be allowed as per company security policy or auditing requirements. Undocumented stored procedures are never recommended for use in production environment as you never know whether your next patch will break it. For complex deletion logic such as deleting files older than 3 days, but only when file size is bigger than 1MB and the filename contains the string 'temp' are very difficult to program using the xp_cmdshell or xp_delete_file options.Another question is how to list files in a folder?
The typical solution is to use xp_cmdshell or write a CLR stored procedure/function, but CLR integration is considered relatively more difficult to develop and maintain than a T-SQL script.
SolutionStarting with SQL Server 2016, we are able to use the R language directly inside SQL Server via sp_execute_external_script.
The assumption here is that you have installed R service (in-database). For detailed installation steps, please check this MSDN document.
Before plugging R code into a SQL script, it is best to test the R script in a dedicated R IDE. I personally prefer RStudio Desktop IDE , and it is free.
Once the R service is installed and configured, we can prepare the demo case.
Let's say I have a folder with a few files in it as shown below.

Figure 1 - Sample Directory
Native R Script DemonstrationAll the R code below is run in the RStudio IDE.
List Files with RI will run the following script in RStudio to do the following tasks:
List files in a folder List files whose names match a condition List files whose property, such as size, creation time, matches a condition List files with multiple conditionsAll the script functions are detailed in the comments below:
#1. list files in a folderdir(path="c:/MyTest", full.names = TRUE) #R is case-sensitive, so True <> TRUE
#2.1. list files with names containing "temp" literal value, pattern parm accepts RegEx
dir(path="c:/MyTest", pattern = "temp") #R is case-sensitive, so True <> TRUE
#2.2. list files with names containing number(s) by using regular expression \d
dir(path="c:/MyTest", pattern = "\\d", full.names = T) # T means TRUE
#3. list files created more than 7 days ago.
fi <- file.info(dir(path="c:/Mytest", full.names = T));
fi[difftime(Sys.time(), fi[,"ctime"], units="days") > 7, c( 'ctime', 'size')]
#4. list files created within 7 days and size bigger than 1000 bytes
fi <- file.info(dir(path="c:/Mytest", full.names = T));
fi[difftime(Sys.time(), fi[,"ctime"], units="days") <= 7 & fi['size'] > 1000, c('ctime', 'size')]
The execution result is as follows in the RStudio IDE, you can check the result against the info in Figure 1.

Figure 2 - Results of List Files R Code
Delete Files with RNow we will look at how to delete files. The following script tasks are:
Delete a file Delete multiple files based on a criteria Delete a whole directoryHere is the script:
#1. del some specified filesfile.remove("c:/Mytest/result.jpg", "c:/mytest/mytext_1.txt")
# or you can do
unlink(c("c:/Mytest/result.jpg", "c:/mytest/mytext_1.txt"));
#2.1. del files whose names contain "temp"
file.remove(dir(path="c:/MyTest/", pattern = "temp", full.names = T))
# or you can do this way
unlink("c:/Mytest/*temp*.*");
#2.2. del files older than 7 days
# first find files that are older than 7 days
fi <- file.info(dir(path="c:/Mytest", full.names = T));
fi <- fi[difftime(Sys.time(), fi[,"mtime"], units="days") > 7, 1:3];
file.remove(rownames(fi));
#or you can do this way to replace the file.remove()
unlink(rownames(fi));
#3. to delete the whole directory c:/Mytest, including its sub-folders if any
unlink("c:/mytest", recursive = T, force = T)
In summary, if we want to find or delete a file based on its properties, i.e. size, creation time or modification time or others, we need to use file.info function to filter out what we need, then we can work on the result.
Using R Code in T-SQLNow we will embed the above R scripts into T-SQL via sp_exec_external_script. For more examples, please see MSDN document .
List files with R in T-SQLLet's look at the t-sql code first
--1. list files in a folderEXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'file_list <- as.data.frame(dir(path="c:/MyTest", full.names = TRUE));'
, @output_data_1_name = N'file_list'
with result sets (([AllFiles] varchar(256)))
go
--2. list files with names containing "temp" literal value, pattern parm accepts RegEx
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'file_list <- as.data.frame(dir(path="c:/MyTest", pattern = "temp"))'
, @output_data_1_name = N'file_list'
with result sets (([FileName_Has_Temp] varchar(256)))
go
--3. list files with names containing number(s)
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'file_list <- as.data.frame(dir(path="c:/MyTest", pattern = "\\d", full.names = T))'
, @output_data_1_name = N'file_list'
with result sets (([FileName_Has_Num] varchar(256)))
go
--4. list files created more than 7 days ago.
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'fi <- file.info(dir(path="c:/Mytest/", full.names = T));
fi <- fi[difftime(Sys.time(), fi[,"ctime"], units="days") > 7, c("ctime", "size")];
fi["Name"] <- rownames(fi);
file_list <- as.data.frame(fi[c("Name", "ctime")]);'
, @output_data_1_name = N'file_list'
with result sets (([File_7+Days_Old] varchar(256), Create_Time datetime))
go
--5. list files created within 7 days and size bigger than 1000 bytes
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'fi <- file.info(dir(path="c:/Mytest2", full.names = T));
fi <- fi[difftime(Sys.time(), fi[,"ctime"], units="days") <= 7 & fi["size"] > 1000, c("ctime", "size")];
fi["name"] <- rownames(fi)
file_list <- as.data.frame(fi[c("name", "ctime","size")])'
, @output_data_1_name = N'file_list'
with result sets (([File < 7Days with Size > 1000] varchar(256), ctime datetime, size int))
go
If you run the script in a SSMS window that connects to a SQL Server 2016 instance with R service installed, you will get the following results:
]
If we compare the R code embedded in T-SQL with the native R code examples we listed before, we can see there are some differences. In T-SQL, we need to use as.data.frame function to convert the result into a Data Frame structure before the R result can be displayed by T-SQL, otherwise, an error like the following will occur:
Msg 39004, Level 16, State 20, Line 1A 'R' script error occurred during execution of 'sp_execute_external_script' with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 1
An external script error occurred:
Error: The output dataset must be of type data frame; consider using data.frame() to convert it. Delete files with R in T-SQL
Deleting files does not need to return any result set, so we do not need as.data.frame in R code
--1. delete specific filesEXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'file.remove("c:/Mytest/result.jpg", "c:/mytest/mytext_1.txt");';
-- or
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'unlink(c("c:/Mytest/result.jpg", "c:/mytest/mytext_1.txt"))'
go
-- 2. delete files with names containing "temp"
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'file.remove(dir(path="c:/MyTest/", pattern = "temp", full.names = T))'
-- or
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'unlink("c:/Mytest/*temp*.*")'
go
-- 3. delete files older than 7 days
-- first find files that are older than 7 days
EXECUTE [sys].[sp_execute_external_script] @language = N'R',
@script = N'
fi <- file.info(dir(path="c:/Mytest", full.names = T));
fi <- fi[difftime(Sys.time(), fi[,"ctime"], units="days") > 7, c("ctime", "size")];
file.remove(rownames(fi)) # or use this: unlink(rownames(fi));'
go
After running all the deletion statements in SSMS, all files will be gone under C:\MyTest folder.
A tip here is that when I refer to a folder, I use / (forward slash) instead of \ (back slash), because \ is an escape letter in R, so if we refer to c:\Mytest in R code, we should write c:\\Mytest. To save time, we can simply code it as c:/Mytest
SummaryIn this tip, we have discussed how to use the R language to do file listing and deletion, this actually is a good replacement for a CLR stored procedure.
All the code in this tip was tested in an environment using SQL Server 2016 Developer edition on a windows 10 computer.
R language is mainly for data science, but it is powerful enough for DBAs to use it innovatively for daily work. To allow T-SQL to embed R code directly, Microsoft starts to treat T-SQL more like a first class citizen in the MS language portfolio.
Sp_execute_external_script has a parameter @language, currently the only valid value is 'R', but it is reasonable to guess that in the future this stored procedure will support other languages, such as python, javascript, both of which are popular, cross-platform languages, just like R. Coincidently, SQL Server vNext will run across multi-platforms as well.
With PowerShell to be run in linux/Mac soon, I hope T-SQL can also embed PowerShell code someday, if all this become true, TSQL itself will be both a language and a platform for administration and development tasks, and CLR integration may be minimized due to its complexity and cost for maintenance and development.
Next StepsR language opens a door for both DBAs and database developers, and we may do lots of things differently yet more efficiently with R, so it is really worthwhile to dig deeper into the R language.
Please check the following for further information:
SQL Server 2016 R Services: Executing R code in SQL Server SQL Server 2016 R Services: Guide for Client Configuration Unable to communicate with the runtime for 'R' script in SQL Server SQL Server 2016 R Services: Executing R code in Revolution R Enterprise SQL Server R Services TutorialsLast Update: 2/8/2017


About the author

Jeffrey Yao is a senior SQL Server consultant, striving to automate DBA work as much as possible to have more time for family, life and more automation. View all my tips
Related Resources
More Database Developer Tips...