Introduction to Transact SQL User-Defined Functions

By Karen Gayda



Microsoft added a host of new features to its SQL Server 2000 product, and
one of the most interesting for SQL programmers is the user-defined
function. Adding functions to the Transact SQL language has solved many
code reuse issues and provided greater flexibility when programming SQL
queries.


This article focuses on the syntax, structure, and application of Transact
SQL user-defined functions. The material presented is based on the SQL
Server 2000 Beta Release 2.


Types of Functions



SQL Server 2000 supports three types of functions: scalar, in-line table
functions, and multistatement table functions. All three types of
functions accept parameters of any scalar data type except rowversion.
Scalar functions return a single scalar value and in-line and
multistatement table functions return a table data type. (NOTE: the table
data type is new in SQL Server 2000.)


I. Scalar Functions


Scalar functions return a data type such as int, money, varchar, real, etc.
They can be used anywhere a built-in SQL function is allowed. The syntax
for a scalar function is the following:






CREATE FUNCTION [owner_name.] function_name
( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
RETURNS scalar_return_type
[WITH <function_option> >::={SCHEMABINDING | ENCRYPTION]
[AS]
BEGIN
function_body
RETURN scalar_expression
END

A simple scalar function to cube a number would look like this:

CREATE FUNCTION dbo.Cube( @fNumber float)
RETURNS float
AS
BEGIN
RETURN(@fNumber * @fNumber * @fNumber)
END

Surprisingly, user-defined functions (UDFs) support recursion. Here is an
SQL Server 2000 UDF using the standard factorial example:

CREATE FUNCTION dbo.Factorial ( @iNumber int )
RETURNS INT
AS
BEGIN
DECLARE @i int

IF @iNumber <= 1
SET @i = 1
ELSE
SET @i = @iNumber * dbo.Factorial( @iNumber – 1 )
RETURN (@i)
END





II. In-Line Table Functions


In-line table functions are functions that return the output of a single
SELECT statement as a table data type. Since this type of function
returns a table, the output can be used in joins of queries as if it was a
standard table. The syntax for an in-line table function is as follows:





CREATE FUNCTION [owner_name.] function_name
( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
RETURNS TABLE
[WITH <function_option>::={SCHEMABINDING | ENCRYPTION}]
RETURN [(] select_statement [)]

An in-line function to return the authors from a particular state would
look like this:

CREATE FUNCTION dbo.AuthorsForState(@cState char(2) )
RETURNS TABLE
AS
RETURN (SELECT * FROM Authors WHERE state = @cState)


III. Multistatement Table Functions

Multistatement table functions are similar to stored procedures except that
they return a table. This type of function is suited to address situations
where more logic is required than can be expressed in a single query. The
following is the syntax for a multistatement table function:


CREATE FUNCTION [owner_name.] function_name
( [{ @parameter_name scalar_parameter_type [ = default]} [,..n]])
RETURNS TABLE
[WITH <function_option> >::={SCHEMABINDING | ENCRYPTION]
[AS]
BEGIN
function_body
RETURN
END





Hierarchical data, such as an organizational structure, is an example of
data that cannot be gathered in a single query. The Northwind Company
database's Employees table contains a field called ReportsTo that
contains the EmployeeID of the employee's manager. GetManagerReports
is a multistatement table function that returns a list of the employees who
report to a specific employee, either directly or indirectly.






CREATE FUNCTION dbo.GetManagerReports ( @iEmployeeID int )
RETURNS @ManagerReports TABLE
(
EmployeeID int,
EmployeeFirstName nvarchar(10),
EmployeeLastName nvarchar(20),
Title nvarchar(30),
TitleOfCourtesy nvarchar(25),
Extension nvarchar(4),
ManagerID int
)
AS
BEGIN

DECLARE

@iRowsAdded int, — Counts rows added to
— table with each iteration
@PREPROCESSED tinyint, — Constant
for record prior
— to processing
@PROCESSING tinyint, — Constant
for record
— being processed
@POSTPROCESSED tinyint — Constant for
records that
— have been processed

SET @PREPROCESSED = 0
SET @PROCESSING = 1
SET @POSTPROCESSED = 2

DECLARE @tblReports TABLE (
— Holds employees added with each pass thru source employees table
EmployeeID int,
EmployeeFirstName nvarchar(10),
EmployeeLastName nvarchar(20),
Title nvarchar(30),
TitleOfCourtesy nvarchar(25),
Extension nvarchar(4),
ManagerID int,
ProcessedState tinyint
DEFAULT 0
)

–Begin by adding employees who report to the Manager directly.
INSERT INTO @tblReports
SELECT EmployeeID, FirstName, LastName, Title, TitleOfCourtesy,
Extension, ReportsTo, @PREPROCESSED
FROM Employees
WHERE ReportsTo = @iEmployeeID

–Save number of direct reports
SET @iRowsAdded = @@ROWCOUNT

— Loop through Employees table until no more iterations are necessary
— (e.g., no more rows added) to add all indirect reports.
WHILE @iRowsAdded > 0
BEGIN
–Set just added employees ProcessedState to PROCESSING
— (for first pass)
UPDATE @tblReports
SET ProcessedState = @PROCESSING
WHERE ProcessedState = @PREPROCESSED

–Add employees who report to Managers in
— ProcessedState = PROCESSING
INSERT INTO @tblReports
SELECT e.EmployeeID, e.FirstName, e.LastName, e.Title,
e.TitleOfCourtesy, e.Extension, e.ReportsTo, @PREPROCESSED
FROM Employees e
INNER JOIN @tblReports r ON e.ReportsTo = r.EmployeeID
WHERE r.ProcessedState = @PROCESSING
AND e.ReportsTo <> @iEmployeeID

–Save number of rows added for this iteration
SET @iRowsAdded = @@ROWCOUNT

–Set ProcessedState to POSTPROCESSED for Managers whose
–reports were added in this iteration
UPDATE @tblReports
SET ProcessedState = @POSTPROCESSED
WHERE ProcessedState = @PROCESSING
END

–Save all data to output table
INSERT INTO @ManagerReports
SELECT EmployeeID, EmployeeFirstName, EmployeeLastName, Title,
TitleOfCourtesy, Extension, ManagerID
FROM @tblReports
RETURN
END




The output of this function would be used in the same manner as a standard
table. Figure 1 demonstrates JOINING the output of GetManagerReports with
the Employees table to produce a listing of the organizational structure of
the Northwind Company:

User function used in JOIN query

Figure 1: User function used in JOIN query.




Invoking Functions




There are a few syntax idiosyncrasies to observe when invoking user-defined
functions. SQL Server 2000 provides some system-level user-defined
functions in the Master database. These system functions are invoked with
a slightly different syntax than ones that you would create. System
functions that return a table have the following syntax:




::function_name ([argument_expr], [,…])

System functions that return a scalar value use this syntax:

function_name ([argument_expr], [,…])

User-created scalar and rowset functions are invoked in exactly the same
manner. The syntax for invoking a user-created function looks like this:

[database_name] owner_name. function_name ([argument_expr], [,…])







Limitations




User-defined functions do have some restrictions placed upon them. Not
every SQL statement or operation is valid within a function. The following
lists enumerate the valid and invalid function operations:

Valid:


  • Assignment statements
  • Control-flow statements
  • Variable declarations
  • SELECT statements that modify local variables
  • Cursor operations that fetch into local variables
  • INSERT, UPDATE, DELETE statement that act upon local table variables


Invalid:



  • Built-in, nondeterministic functions such as GetDate()
  • Statements that update, insert, or delete tables or views
  • Cursor fetch operations that return data to the client


Performance Implications




Using UDFs will impact the performance of queries. The extent of the
performance impact depends upon how and where you use a user-defined
function. This is also true of built-in functions. However, UDFs have the
potential for more dramatic performance hits than built-in functions. You
should exercise caution when implementing functions in your queries and
perform benchmarking tests to insure that the benefits of using your
functions exceed the performance costs of using them.




Uses for Functions




Check constraints


Scalar user-defined functions can be used as check constraints for columns
in table definitions. As long as an argument to the function is a
constant or built-in function or an argument is the column being checked,
the function may be used to validate the column's value. These UDF
check constraints provide the ability to use more complex logic for
determining acceptable column values than Boolean expressions or LIKE
patterns would allow.


The following function validates that a serial number follows a specific
pattern and portions of the serial number match a specific algorithm for a
product type.






CREATE FUNCTION dbo.ValidSerialNumber( @nvcSerialNumber nvarchar(50))
RETURNS BIT
AS
BEGIN

DECLARE
@bValid BIT,
@iNumber INT

–default to invalid serial number
SET @bValid = 0

–Home Office Product
IF @nvcSerialNumber LIKE
'[0-9][A-Z][0-9][A-Z][0-9][0-9][0-9][0-9]'
BEGIN
SET @iNumber = CONVERT(int,RIGHT(@nvcSerialNumber,4))
IF @iNumber % 7 = 2
BEGIN
SET @bValid = 1
END
END

— Video Game
IF @nvcSerialNumber LIKE '[0-9][0-9][0-9][A-Z][0-9]5[A-Z]'

BEGIN
SET @iNumber = CONVERT(int,LEFT(@nvcSerialNumber, 3))
IF @iNumber % 2 = 0
BEGIN
SET @bValid = 1
END
END

RETURN ( @bValid)
END

CREATE TABLE dbo.CustomerProduct
(
CustomerID int NOT NULL PRIMARY KEY,
ProductID int NOT NULL,
SerialNumber nvarchar(20) NOT NULL
CHECK(dbo.ValidSerialNumber(SerialNumber) = 1)
)





Computed columns


Scalar functions can be used to compute column values in table definitions.
Arguments to computed column functions must be table columns, constants, or
built-in functions. This example shows a table that uses a Volume function
to compute the volume of a container:





CREATE FUNCTION dbo.Volume ( @dHeight decimal(5,2),
@dLength decimal(5,2),
@dWidth decimal(5,2) )
RETURNS decimal (15,4)
AS
BEGIN
RETURN (@dHeight * @dLength * @dWidth )
END

CREATE TABLE dbo.Container
(
ContainerID int NOT NULL
PRIMARY KEY,
MaterialID int NOT NULL
REFERENCES Material(MaterialID),
ManufacturerID int NOT NULL
REFERENCES
Manufacturer(ManufacturerID)
Height decimal(5,2) NOT NULL,
Length decimal(5,2) NOT NULL,
Width decimal(5,2) NOT NULL,
Volume AS
(
dbo.Volume( Height, Length, Width )
)
)




You should note that computed columns might be excluded from being indexed
if user-defined functions determine their value. An index can be created
on the computed column if the user-defined function is deterministic (e.g.,
always returns the same value given the same input).


Default constraints


Default column values can be set with user-defined functions. UDFs can be
very useful when a hard-coded value or built-in function does not suffice.
For example, if a doctor's office wished to save a patient's
appointment preference, a user-defined function could calculate the default
day and time in a function by using the current date/time when the
patient's record was created. If the patient's record were
created on a Friday at 10:34 AM the AppointmentPref column would default to
“Friday at 10:00” using the following function:






CREATE FUNCTION dbo.AppointmentPreference ( @dtDefaultDateTime datetime )
RETURNS nvarchar(50)
AS
BEGIN

DECLARE @nDay nvarchar(10),
@nHour nvarchar(6),
@nPreference nvarchar(50),
@tiHour tinyint

–Get date description
SET @nDay = DATENAME(dw, @dtDefaultDateTime )

–Find current hour
SET @tiHour = DATEPART(hh,@dtDefaultDateTime)

–Use only 12-hour times
IF @tiHour > 12
BEGIN
SET @tiHour = @tiHour – 12
END

–Don't allow appointments during lunch
IF @tiHour = 12
BEGIN
SET @tiHour = 1
END

— These are invalid hours
IF @tiHour IN(5,6,7,8)
BEGIN
SET @tiHour = 4
END

–Create preference text
SET @nPreference = RTRIM(@nDay) + '''s at ' +
CONVERT(varchar(2),@tiHour) + ':00'
RETURN ( @nPreference)
END

CREATE TABLE dbo.Patient
(
PatientID int NOT NULL PRIMARY KEY
IDENTITY,
FirstName nvarchar(20) NOT NULL,
LastName nvarchar(20) NOT NULL,
Addr1 nvarchar(50),
Addr2 nvarchar(50),
City nvarchar(50),
State nvarchar(2),
ZipCode nvarchar(20),
HomePhone nvarchar(20),
WorkPhone nvarchar(20),
AppointmentPref nvarchar(50)
DEFAULT (dbo.AppointmentPreference(GETDATE()))

)




Assignments


Scalar user-defined functions can be used to assign values to scalar
variables. They may be used in any situation where a scalar built-in
function may be used.





DECLARE @fCube float

SET @fCube = dbo.Cube( 4.5 )





Control flow


Scalar user-defined functions may be used to control program flow when used
in Boolean expressions.





IF dbo.ValidSerialNumber('002A15A') = 1
PRINT 'Yes'

ELSE
PRINT 'No'




Case expressions


User-defined functions that return a scalar value can be used in any of the
cases of CASE expressions. The following example uses the DailySpecial
function in a case function to determine what to display for a given day:




CREATE FUNCTION dbo.DailySpecial( @nvcDay nvarchar(10))
RETURNS NVARCHAR(100)
AS
BEGIN

DECLARE @nvcSpecial nvarchar(100)

SET @nvcDay = UPPER(@nvcDay)

IF @nvcDay = 'SUNDAY'
SET @nvcSpecial = 'Roast beef with green beans and
baked potato'

IF @nvcDay = 'MONDAY'
SET @nvcSpecial = 'Chopped beef with green bean
casserole'

IF @nvcDay = 'TUESDAY'
SET @nvcSpecial = 'Beef stew'

IF @nvcDay = 'WEDNESDAY'
SET @nvcSpecial = 'Beef pot pie'

IF @nvcDay = 'THURSDAY' OR @nvcDay = 'FRIDAY'
OR @nvcDay = 'SATURDAY'
SET @nvcSpecial = 'Beef surprise'

RETURN ( @nvcSpecial )
END

–Use output of DailySpecial function
SELECT Special =
CASE DateName(dw, getdate())
WHEN 'Sunday' THEN dbo.DailySpecial('Sunday')
WHEN 'Monday' THEN
dbo.DailySpecial('Monday')
WHEN 'Tuesday' THEN
dbo.DailySpecial('Tuesday')
WHEN 'Wednesday' THEN dbo.DailySpecial('Wednesday')
ELSE 'It's a mystery!'

END




Alternative to views


Rowset functions, functions that return tables, can be used as alternatives
to read-only views. Since views are limited to a single select statement,
user-defined functions can provide greater functionality than a view.
Powerful logic can be used when determining the records returned, which is
not possible within a view. Also, views cannot accept parameters so a separate view must be created if the WHERE clause must change for different search c
riteria.


Alternative to temporary tables

Rowset functions can be used as alternatives to temporary tables. For
example, if you wished to find authors in the Pubs database who sold no
books in a particular state, you could create a couple of functions that
would generate the desired resultset.


To find the quantity of books sold for a particular author in a given state
you could write the following function:





CREATE FUNCTION dbo.AuthorPoularityForState ( @cState Char(2))
RETURNS TABLE

AS
RETURN (
SELECT a.au_id, a.au_fname, a.au_lname,
SUM(s.qty) AS QTY
FROM Authors a
INNER JOIN TitleAuthor ta ON a.au_id = ta.au_id
INNER JOIN Titles t ON t.title_id = ta.title_id
INNER JOIN Sales s ON s.title_id = t.title_id
INNER JOIN Stores st ON st.Stor_ID = s.stor_id
WHERE st.state = @cState
GROUP BY a.au_id, a.au_fname, a.au_lname
ORDER BY QTY DESC, a.au_lname, a.au_fname
)
END





You could then create another function that would use the output from the
first function to find the authors in a particular state that have not had
any sales:




CREATE FUNCTION dbo.ReallyBoringAuthorsForState ( @cState Char(2) )
RETURNS TABLE

RETURN(
SELECT a.au_id as AuthorID, a.au_fname AS AuthorFirstName,
a.au_lname AS AuthorLastName, @cState AS State
FROM AuthorPopularityForState(@cState ) pa
RIGHT JOIN authors a ON pa.AuthorID = a.au_id
WHERE IsNull(pa.UnitsSold, 0) = 0

)



The following SQL statement would list the California authors who had not
sold any books:





SELECT AuthorLastName, AuthorFirstName, AuthorID
FROM ReallyBoringAuthorsForState('CA')
ORDER BY AuthorLastName, AuthorFirstName




List of authors without book sales

Figure 2: List of authors without book sales


Before the release of SQL Server 2000, temporary tables would likely have
been used to generate the interim data to be used for the final query
output. By using functions instead of temporary tables, potential table
name concurrency problems are avoided. Functions also offer greater code
reuse than temporary tables.




Conclusion



As demonstrated , user-defined functions provide many more programming
options than there were before UDFs were included in the Transact SQL
language. SQL Server programmers have waited a long time for the
user-defined functions Microsoft made possible with SQL Server 2000. The
advantages of code reusability and of fixing problems in a single routine
can now be realized by incorporating UDFs into our designs. Now if we
could only be given arrays in the next version.




About the Author



Karen Gayda is an independent software consultant from San Diego,
California. She specializes in Web application development using VB/COM,
SQL Server, DHTML, and scripting languages. She can be contacted at kgayda@yahoo.com.

More by Author

Get the Free Newsletter!

Subscribe to Developer Insider for top news, trends & analysis

Must Read