Full Text Search: The Key to Better Natural Language Queries for NoSQL in Node.js
This article will present a brief overview of the usage of Pipelined Table Functions, introduced in Oracle 9i, which enables accepting and returning multiple rows using Table Functions in PL/SQL.
A feature introduced in Oracle 9i in PL/SQL is the pipelining of table functions, allowing result sets to be passed from one function to another, letting us set up a sequence of transformations with no table to hold intermediate results. Rows of the result set can be returned a few at a time, reducing the memory overhead for producing large result sets within a function.
Pipelined Table Functions
Table functions are functions that produce a collection of rows (either a nested table or a varray) that can be queried like a physical database table. You use a table function like the name of a database table, in the FROM clause of a query. A table function can take a collection of rows as input. An input collection parameter can be either a collection type (such as a VARRAY or a PL/SQL table) or a REF CURSOR.
Data is said to be pipelined if it is consumed by a consumer (transformation) as soon as the producer (transformation) produces it, without being staged in tables or a cache before being input to the next transformation. Pipelining enables a table function to return rows faster and can reduce the memory required in caching a table function's results. A pipelined table function can return the table function's result collection in subsets. The returned collection behaves like a stream that can be fetched from on demand. This makes it possible to use a table function like a virtual table.
For further details, please refer the Oracle 9i Documentation.
Using Pipelined Table Functions
A slightly complicated example is being used here, although in actual application development scenarios, requirements will not be so simple and straightforward. The reason such an example is being used is to bring out the usefulness of this feature, in a practical scenario. Simple examples highlighting the concept are given in the Oracle 9i Documentation, anyway.
The Insert/Update or Upsert statement MERGE has made life a lot easier for PL/SQL programmers. Refer to the Oracle 9i Documentation for usage of the MERGE statement. That will also be used in the example.
I can think of other, simpler ways to solve the same problem. But, add some complexity to the scenario and assume you are dealing with a large quantity of data, and this solution will seem better. As stated above, the purpose is to present the usage of Pipelined Table Functions.
Say a table, month_costs, has data for some type of cost incurred by ABC Co. Of these some costs has additional sub costs stored in another table, month_sub_costs. There is a table, ytd_costs, populated periodically (say monthly) with data from the other two tables.
CREATE TABLE month_costs ( Cost_no Number(4), Quantity Number(3), Type Char(1), Amount Number(7,2) ); CREATE TABLE month_sub_costs ( Cost_no Number(4), Sub_cost_no Number(4), Quantity Number(3), Type Char(1), Amount Number(7,2) ); CREATE TABLE ytd_costs ( Cost_no Number(4), Quantity Number(3), Type Char(1), Amount Number(7,2), status char(1) );
The Table ytd_costs is populated at periodic intervals, where data is derived from month_costs and month_sub_costs and if the cost_no exists in ytd_costs, the data is updated; otherwise, it is inserted. Say, the following conditions apply:
- Assume that if sub costs exist, quantity, type and amount in month_costs are Null.
- To derive one line per Cost_no, Sub cost no amounts and quantity are added up.
- If all amounts are 0 in month_sub_costs, the type in ytd_costs is set to 'A'.
- If amounts are non-zero, then if sub_costs have different types, the type is set as 'B'; else, 'C'.
- The status in ytd_costs is set to 'X' if the amount is 1000. If the amount is greater than 0 but less than 1000, status is 'Y'. For an amount greater than 1000, the status is 'Z'.
The following code will create a package ref_pkg, to hold the REF cursor and Type definitions needed by the Table Function.
CREATE PACKAGE ref_pkg IS TYPE refcur_t IS REF CURSOR RETURN month_costs%ROWTYPE; TYPE dst_ytd_lines IS TABLE OF ytd_costs%ROWTYPE; END ref_pkg; /
The following code will create a Pipelined Table Function named generate_lines.
CREATE OR REPLACE FUNCTION generate_lines (cur_month_lines_ref IN ref_pkg.refcur_t) RETURN ref_pkg.dst_ytd_lines pipelined IS r_month_costs month_costs%ROWTYPE; r_ytd_costs ytd_costs%ROWTYPE; TYPE dst_subcosts IS TABLE OF month_sub_costs%ROWTYPE; tab_subcosts dst_subcosts := dst_subcosts(); tab_subcosts_null dst_subcosts := dst_subcosts(); -- For re-initialization CURSOR c_subtot (b_costno month_sub_costs.cost_no%TYPE) IS SELECT * FROM month_sub_costs WHERE cost_no = b_costno; t_is_data BOOLEAN ; t_diff_type BOOLEAN ; i_index PLS_INTEGER ; -- save_amount month_costs.amount%TYPE; -- PL/SQL Block BEGIN LOOP FETCH cur_month_lines_ref INTO r_month_costs ; EXIT WHEN cur_month_lines_ref%NOTFOUND ; -- Get data directly to be carried over from month_costs to ytd_costs here r_ytd_costs.cost_no := r_month_costs.cost_no ; r_ytd_costs.quantity := r_month_costs.quantity ; r_ytd_costs.type := r_month_costs.type ; r_ytd_costs.amount := r_month_costs.amount ; -- Initializations tab_subcosts := tab_subcosts_null ; t_is_data := FALSE ; t_diff_type := FALSE ; i_index := 0 ; FOR r_subtot IN c_subtot( r_month_costs.cost_no) LOOP t_is_data := TRUE ; tab_subcosts.EXTEND ; i_index := tab_subcosts.LAST ; tab_subcosts(i_index).quantity := r_subtot.quantity ; tab_subcosts(i_index).type := r_subtot.type ; tab_subcosts(i_index).amount := r_subtot.amount ; IF (((i_index - 1) > 0) AND ( tab_subcosts(i_index).type <> tab_subcosts(i_index - 1).type ))THEN t_diff_type := TRUE ; END IF ; END LOOP ; i_index := 0 ; IF ( t_is_data = FALSE ) THEN PIPE ROW(r_ytd_costs) ; ELSE r_ytd_costs.quantity := 0 ; r_ytd_costs.type := NULL ; r_ytd_costs.amount := 0 ; FOR i_index IN tab_subcosts.FIRST .. tab_subcosts.LAST LOOP r_ytd_costs.quantity := r_ytd_costs.quantity + tab_subcosts(i_index).quantity; -- r_ytd_costs.type := r_ytd_costs.type ; r_ytd_costs.amount := r_ytd_costs.amount + tab_subcosts(i_index).amount; END LOOP ; IF ( r_ytd_costs.amount = 0 ) THEN r_ytd_costs.type := 'A' ; ELSE IF ( t_diff_type = TRUE) THEN r_ytd_costs.type := 'B' ; ELSE r_ytd_costs.type := 'C' ; END IF ; END IF ; PIPE ROW(r_ytd_costs) ; END IF ; END LOOP ; RETURN ; END ; /
The following is the listing of data present in the three tables. Cost numbers 101 and 102 are already in ytd_costs. Post-upsert, 102 should be updated with type A, amount 0, and quantity 11. New records for 103, 104, and 105 will be inserted into ytd_costs. 102, 104, and 105 have sub costs.
Month_Costs COST_NO QUANTITY TYPE AMOUNT ---------- ---------- ---- ---------- 102 103 23 C 2000 104 105 Month_Sub_Costs COST_NO SUB_COST_NO QUANTITY TYPE AMOUNT ---------- ----------- ---------- ---- ---------- 102 2 5 Q 0 102 3 6 S 0 104 1 1 S 245 104 2 4 S 34 104 3 3 S 456 104 4 5 S 266 105 1 3 S 300 105 2 3 Q 300 Ytd_Costs COST_NO QUANTITY TYPE AMOUNT S ---------- ---------- ---- ---------- - 101 4 C 1000 X 102 2 B 202 Y
We use the following MERGE statement to perform Upsert operation.
MERGE INTO ytd_costs m USING ( SELECT * FROM TABLE(generate_lines( CURSOR(SELECT * FROM month_costs)))) s ON ( m.cost_no = s.cost_no) WHEN MATCHED THEN UPDATE SET m.quantity = s.quantity ,m.type = s.type ,m.amount = s.amount ,m.status = DECODE(SIGN(1000 - s.amount), 0, 'X', 1, 'Y', -1, 'Z') WHEN NOT MATCHED THEN INSERT ( cost_no ,quantity ,type ,amount ,status ) VALUES ( s.cost_no ,s.quantity ,s.type ,s.amount ,DECODE(SIGN(1000 - s.amount), 0, 'X', 1, 'Y', -1, 'Z') ) ;
The data in ytd_costs will be:
COST_NO QUANTITY TYPE AMOUNT S ---------- ---------- ---- ---------- - 101 4 C 1000 X 102 11 A 0 Y 103 23 C 2000 Z 104 13 C 1001 Z 105 6 B 600 Y
As seen, the need to pass multiple records in collections is eliminated.
The value of the status in ytd_costs could have been derived in the table function. The reader may deduce the benefit of deriving it in the MERGE statement if the statement is to be used in an application within a PL/SQL program unit. Further changes will be needed in the code, of course. MERGE statement may present some problems in the usage of bind variables in earlier releases of Oracle 9i, and workarounds to that are available at the Oracle Metalink.
By using pipelining in Table Functions, we can see Table Functions as a highly streamlined transformation engine that always return a collection of records—a feature very useful in extraction, transformation, and load (ETL) scenarios involving large volumes of data. Parallel execution can further improve performance.
Oracle 9I Documentation June 2001 PL/SQL User's Guide and Reference.
Resources on ETL using Oracle 9i:Daniel White: April 2002 Oracle 9i ETL from the Database Out for Free
Load and Transform External data into Oracle 9i
About the Author
Toji Mammen George works as a Systems Analyst in the Offshore Development Center of one of Tata Consultancy Services' (TCS) Top-10 clients in the Transportation Industry Practice. TCS is Asia's largest independent software consultancy organization, providing services to clients in over 55 countries around the world. TCS has more than 100,000 person-years of experience in diverse business domains and technology areas.
The author has over three years' experience in application development including analysis, design, construction, testing, and implementation. He has experience in developing pro*C and PL/SQL programs on Oracle 8i and 9i database servers.