Today’s (well, yesterday’s really) trick deals with the storage of data. Let me first say that the methods I’m about to show you are NOT useful for where clauses or in normal database design. However, they can be very useful when you need a reference table that can store anything in a simple column.
For instance, take a look at Google. Google is an amazing search engine, capable of parsing billions of sites and enabling fast searching of the data it harvests.
In addition, Google is able to parse prices, mileage, dates, and other important elements, then use them as they were meant to be used; for instance, using a price to power Froogle’s bargain hunting tool.
If we were trying to duplicate the same concept into Oracle, we’d somehow need to create a table with an ID and a token, the token being the word that was parsed out of a complete document. However, what if we want that column to be able to store numbers, characters, dates…and we want them to act natively?
Enter ANYDATA. ANYDATA is a “self describing data instance type.” All this truly means is that the type not only holds a value, but it also holds its own datatype within. It also employs many methods that allow us to perform various functions with it.
So, back to our parser! We’ll start off with a very simple table:
create table search_words ( token_id number not null primary key, token sys.anydata );
Now we must populate the table, but normal inserts unfortunately wont work. We must use ANYDATA’s built-in Convert methods to perform our inserts. We’ll use the phrase “Make 50 dollars on 12/01/06”
INSERT INTO SEARCH_WORDS VALUES (1, sys.anydata.convertVarchar2('Make'); INSERT INTO SEARCH_WORDS VALUES (2, sys.anydata.convertNumber(50); INSERT INTO SEARCH_WORDS VALUES (3, sys.anydata.convertVarchar2('dollars'); INSERT INTO SEARCH_WORDS VALUES (4, sys.anydata.convertVarchar2('on'); INSERT INTO SEARCH_WORDS VALUES (5, sys.anydata.convertDate(to_date('12/01/2006', 'MM/DD/YYYY'); commit;
And now we have data! Let’s query it and see what it looks like.
SQL> select * from search_words; TOKEN_ID TOKEN() ---------- -------------------- 1 ANYDATA() 2 ANYDATA() 3 ANYDATA() 4 ANYDATA() 5 ANYDATA()
Oops! That’s not how we want our data to look. Unfortunately, Oracle is not yet sophisticated enough to understand which data type it is querying. Because of this, we have to make a function.
CREATE OR REPLACE FUNCTION getTokens(pv_token in sys.anydata) RETURN VARCHAR2 IS lv_number NUMBER; lv_date DATE; lv_varchar2 VARCHAR2(4000); BEGIN CASE pv_token.getTypeName WHEN 'SYS.NUMBER' THEN IF (pv_token.getNumber(lv_number) = dbms_types.success) THEN lv_varchar2 := lv_number; END IF; WHEN 'SYS.DATE' THEN IF (pv_token.getDate(lv_date) = dbms_types.success) THEN lv_varchar2 := lv_date; END IF; WHEN 'SYS.VARCHAR2' THEN IF (pv_token.getVarchar2(lv_varchar2) = dbms_types.success) THENa NULL; END IF; ELSE lv_varchar2 := 'unknown datatype'; END CASE; RETURN lv_varchar2; END getTokens;
Notice that we used the ANYDATA methods again, but this time with getVarchar2, getNumber, and getDate. Finally, let’s go ahead and query the table using the function we created.
SQL> column token format a20 SQL> select token_id, getTokens(token) token from search_words; TOKEN_ID TOKEN ---------- -------------------- 1 Make 2 50 3 dollars 4 on 5 01-DEC-06
And we have our data! Since we have to use a function, you can see why this would not be very good in a WHERE clause. However, for storing multiple datatypes in a single table, it’s tops.
You can find more information about ANYDATA in the Database Application Developer’s Guide, and information about its methods in the PL/SQL Packages and Types Reference.
Join me next time to learn about pulling XML from the database…with any query!
Great examples, I’m just trying to think of how to use it.
I know you said that you shouldn’t/wouldbn’t use a where clause to filter out the token column, but why can’t you do something like WHERE getTokens(TOKEN) = 50) and even have a function based index to help with the serach times?
Also can you do something like :
SELECT 2*getTokens(TOKEN) FROM SEARCH_WORDS WHERE TOKEN_ID=2;
and get 100 returned?
and from the same table :
SELECT SUBSTR(getTokens(TOKEN),1,2) || ‘de’ FROM SEARCH_WORDS WHERE TOKEN_ID=1;
and get “Made” returned?
or
SELECT to_char(getTokens(TOKEN)-1),’dd-MON-YY’)
FROM SEARCH_WORDS WHERE TOKEN_ID=5;
and get 31-NOV-06
NOTE: I assume this is a 10g functionality? I gon’t have a 10g DB to test in yet, so sorry for the questions that i would have tested myself if i did.
Oh i think i just answered my own geestion. I just noticed that the return value for your function is : VARCHAR2, so some may work, but probably not the date one, unless we do something like
SELECT to_char(to_date(to_char(getTokens(TOKEN)),’dd-MON-YY’) -1, ’dd-MON-YY’), ’dd-MON-YY’)
FROM SEARCH_WORDS WHERE TOKEN_ID=5;
Just a thought, why not overload the function somehow based on the return datatype….
Excellent example , Explained very well.
Thanks
Adarsh
AKTechnosoft Corp.