Tags: csv, database, file, files, format, microsoft, mysql, oracle, parse, procedure, proprietary, requirement, server, sql, stored, text

How to open a text file in a stored Procedure


12,884 words with 20 Comments; publish: Wed, 05 Dec 2007 08:38:00 GMT

Hi,

I have data in text files (not in CSV format, but in a proprietary format).

My requirement is to read the text files, parse them into the corresponding data, and then store it in a Microsoft SQL Server database.

Is there any way to do this using a stored procedure or an extended stored procedure in SQL Server? Or is there another way to do this in SQL Server? My database is on SQL Server 2000.

Any help will be appreciated..

with regards

Sudar

All Comments


  • 20 Comments
    • Well, of course you can do it in an extended stored procedure. :) If you have any kind of ODBC driver, you can use OPENQUERY or a linked server to read it. You might want to look at using DTS for this, though.
      #1; Tue, 11 Dec 2007 16:46:00 GMT
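
      For reference, a minimal sketch of the ad hoc route mentioned above, assuming the Microsoft Text ODBC driver is installed; the path and file name are placeholders, and because the file in question is in a proprietary layout rather than a delimited one, the driver may need a schema.ini or may not split the columns usefully at all:

      -- Ad hoc read of a text file through the ODBC text driver (path and file
      -- name are assumptions; adjust DefaultDir and the file name to your own).
      SELECT *
      FROM OPENROWSET('MSDASQL',
           'Driver={Microsoft Text Driver (*.txt; *.csv)};DefaultDir=C:\data;',
           'SELECT * FROM perf.txt')
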
    • I'll second the recommendation for DTS; that is exactly the kind of task it was designed for!

      SQL Server has stayed with a cleaner model for Transact-SQL than some of the other database vendors... Instead of trying to turn Transact-SQL (the database management language) into a full-blown programming language, the designers have left it for data declaration and management. This inhibits users from doing naughty things like user interaction within the scripting language, which I see as a good thing.

      To work around this limitation, Microsoft has provided DTS. DTS gives the user a substantial subset of Visual Basic, with easy access to both the local server and other data sources.

      -PatP

      #2; Tue, 11 Dec 2007 16:47:00 GMT
    • Hi,

      Thanks for your reply.

      I found out that I cannot use OPENQUERY for this. My requirement is something like the fopen function in C.

      I want to read the data from the file line by line, apply my business rules to extract the data, and then store only the data in the database.

      Any help will be appreciated..

      with regards

      Sudar

      #3; Tue, 11 Dec 2007 16:48:00 GMT
    • Here is a way that may work.

      -- capture each line of the file as one row in a temp table
      create table #out ([output] varchar(255) null)

      insert #out exec master..xp_cmdshell 'type c:\text.txt'

      select * from #out

      #4; Tue, 11 Dec 2007 16:49:00 GMT
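
      If the captured lines then need to be walked one at a time (the fopen-style processing asked about above), a cursor over the temp table is one way to do it. This is only a sketch; the actual parsing is left as a placeholder:

      -- walk the captured lines one by one and apply the parsing/business rules
      DECLARE @line varchar(255)

      DECLARE file_lines CURSOR LOCAL FAST_FORWARD FOR
          SELECT [output] FROM #out WHERE [output] IS NOT NULL

      OPEN file_lines
      FETCH NEXT FROM file_lines INTO @line

      WHILE @@FETCH_STATUS = 0
      BEGIN
          -- parse @line and insert the extracted values here
          PRINT @line
          FETCH NEXT FROM file_lines INTO @line
      END

      CLOSE file_lines
      DEALLOCATE file_lines
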
    • I think that I'd create the table with an identity and use bcp

      Then "move" the data with an INSERT to the final table...

      Why do you need to do it row by row?

      #5; Tue, 11 Dec 2007 16:50:00 GMT
    • I found out that I cannot use OPENQUERY for this. My requirement is something like the fopen function in C. I want to read the data from the file line by line, apply my business rules to extract the data, and then store only the data in the database.

      This would be a perfect application for DTS. That is exactly what DTS is supposed to do.

      -PatP

      #6; Tue, 11 Dec 2007 16:51:00 GMT
    • As an alternative, you could also try using the BULK INSERT statement and specify a format (.fmt) file if necessary. I'd try the DTS option first, though.
      #7; Tue, 11 Dec 2007 16:52:00 GMT
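
      A rough sketch of the BULK INSERT idea, assuming each whole line should land in a single staging column; the table name, path, and options below are assumptions, and a format file would replace the WITH options if the layout were more involved:

      -- load the raw lines into a one-column staging table for later parsing
      CREATE TABLE dbo.RawLines (LineText varchar(8000) NULL)

      BULK INSERT dbo.RawLines
      FROM 'c:\data\perf.txt'
      WITH (ROWTERMINATOR = '\n')  -- default field terminator is tab, so whole
                                   -- lines land in LineText if the file has no tabs
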
    • Explain to me again how you easily release DTS to a production environment from dev with no alterations?
      #8; Tue, 11 Dec 2007 16:53:00 GMT
    • Explain to me again how you easily release DTS to a production environment from dev with no alterations?

      There is a "Save As" option in the DTS designer. It's a piece of cake!

      It doesn't sound to me like this is a scheduled production job, it sounds a lot more like a one-time data load to me. If it is a one-time shot, the appropriate values can be coded directly into the DTS package. If it is a scheduled job, the package needs to read the appropriate data from a data source like a table, then process it according to the business rules.

      I don't see this as rocket science. Am I missing something?

      -PatP

      #9; Tue, 11 Dec 2007 16:54:00 GMT
    • Well...we isolate our prod boxes...we have 1 group that supports all of them...

      And whilst I have sa access to my dev boxes, I don't have sa to prod...

      keeps the developers from building apps using sa...so save as is not an option

      I guess DTS packages can be saved as a different owner though...see, there's my ignorance showing again...

      I still don't like DTS...

      And "it's only a one time thing"

      Man, if that ain't the biggest lie in the bd'ness...

      I'd still go with a sproc...it might be harder at first...but it's way more flexible...

      The only time I'd use DTS would be if I had to read Excel or Access...

      And even in those cases, I'd make the data deliverer save it as tab-delimited...before I had to deal with it...

      #10; Tue, 11 Dec 2007 16:55:00 GMT
    • Hi,

      Thanks for your replies.

      I think my last post was not clear, so here is a clearer (and longer) post.

      I have a text file with data values in it, as shown below.

      ===================================== Extract From the File ===================================

      Cache Copy Read Hits %()(B) 0.947033882141 1 0.92694246769 1 0.892471790314 1 0.820934414864 1

      3(F)Data Map Hits %(*)(B) 0.987548351288 1 0.993828952312 1 0.98747831583 1 0.982723712921 1

      2(B)Disk Read Time(2)C:(B) 0 18.7727622986 1 10.976811409 1 39.6991119385 1 35.7257003784 1

      =================================================================================================

      Here each line contains data values for a particular parameter for each time frame. The parameter name comes first, followed by a '0', and then the values for each time frame separated by '1'. This is just an extract from the file; the actual file contains more than 1,000 parameters with more than 100 values for each parameter. I have to read the text file, parse the data, and store it in different tables as I have shown below.

      Cache Copy Read Hits
      --------------------
      Time Frame    Value
      ----------    --------------
      1             0.947033882141
      2             0.92694246769
      3             0.892471790314
      4             0.820934414864

      Data Map Hits %
      ---------------
      Time Frame    Value
      ----------    --------------
      1             0.987548351288
      2             0.993828952312
      3             0.98747831583
      4             0.982723712921

      Disk Read Time
      --------------
      Time Frame    Value
      ----------    --------------
      1             18.7727622986
      2             10.976811409
      3             39.6991119385
      4             35.7257003784

      I definitely cannot use DTS for this because, first, it is not a data source, and second, the file is not in CSV (comma-separated values) format.

      I have to read each and every line from the file, one line at a time, then parse it, get the values, and store them in the table. I have written the code to parse a line and store it in the table, provided the text is hard-coded in the stored procedure as a varchar variable. Now what I need is to automatically read the text file and retrieve each line separately, as we do in C using the 'fopen' function. I heard that I can do this using an extended stored procedure, but I couldn't find out how.

      Any help will be appreciated.

      with regards

      Sudar

      #11; Tue, 11 Dec 2007 16:56:00 GMT
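
      For what it's worth, one of the per-parameter tables described above could look roughly like this; the table name and data types are only assumptions:

      -- hypothetical shape for the "Cache Copy Read Hits" target table
      CREATE TABLE dbo.CacheCopyReadHits
      (
          TimeFrame int   NOT NULL PRIMARY KEY,
          Value     float NOT NULL
      )
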
    • I'd go for a stored procedure in which the input file is either bulk inserted or bcp-ed into a table with, say, a single column (if no column separation is possible). From that point on, you could process the table, reading each line, parsing it, etc.
      #12; Tue, 11 Dec 2007 16:57:00 GMT
    • DTS doesn't care if the file is a data source. DTS was designed to handle exactly this kind of problem.

      You could write a program in 6502 assembler to run on a Commodore 64 to read the data via a serial port, process it, and return the results back via the serial port. This would be a lot more work, but it certainly is possible.

      Let's go with an idea that is easier than the 6502 assembler so we'll get done in your lifetime, but enough harder than DTS so that you'll feel good about having done "enough" work instead of doing it the easy way.

      Create a table using CREATE TABLE (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_create2_8g9x.asp) with a single varchar column that is wider than the lines of your text file. Import the data into that table using BULK INSERT (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_ba-bz_4fec.asp). Once you've got the text file loaded into the table, parse it using Transact-SQL as you see fit.

      You should be done in only three to five times as long as it would have taken using DTS!

      -PatP

      #13; Tue, 11 Dec 2007 16:59:00 GMT
    • Here...I'll do the first one...

      USE Northwind
      GO

      CREATE TABLE myTable99(Col1 varchar(8000))
      GO

      INSERT INTO myTable99(Col1)
      SELECT '===================================== Extract From the File ===================================' UNION ALL
      SELECT '1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890' UNION ALL
      SELECT 'Cache Copy Read Hits %()(B) 0.947033882141 1 0.92694246769 1 0.892471790314 1 0.820934414864 1' UNION ALL
      SELECT '3(F)Data Map Hits %(*)(B) 0.987548351288 1 0.993828952312 1 0.98747831583 1 0.982723712921 1' UNION ALL
      SELECT '2(B)Disk Read Time(2)C:(B) 0 18.7727622986 1 10.976811409 1 39.6991119385 1 35.7257003784 1' UNION ALL
      SELECT '=================================================================================================='
      GO

      -- round-trip the rows through a flat file with bcp (adjust the -S server name)
      EXEC master..xp_cmdshell 'bcp Northwind.dbo.myTable99 out d:\myTable99.dat -SNJROS1D151\NJROS1D151DEV -T -c'
      GO

      SELECT * FROM myTable99
      TRUNCATE TABLE myTable99
      SELECT * FROM myTable99
      GO

      EXEC master..xp_cmdshell 'bcp Northwind.dbo.myTable99 in d:\myTable99.dat -SNJROS1D151\NJROS1D151DEV -T -c'
      SELECT * FROM myTable99
      GO

      CREATE TABLE Cache(Col1 int IDENTITY(1,1), Col2 varchar(25))
      GO

      SET NOCOUNT ON

      -- walk the 'Cache' line, pulling out each value between the ' 1' delimiters
      DECLARE @start int, @end int, @value varchar(25)

      SELECT @start = CHARINDEX('(B)', Col1) + 4
           , @end   = CHARINDEX(' 1', Col1, CHARINDEX('(B)', Col1) + 4)
        FROM myTable99
       WHERE SUBSTRING(Col1, 1, 5) = 'Cache'

      WHILE @end <> 0
      BEGIN
          INSERT INTO Cache(Col2)
          SELECT SUBSTRING(Col1, @start, @end - @start)
            FROM myTable99
           WHERE SUBSTRING(Col1, 1, 5) = 'Cache'

          SELECT @start = @end + 3
               , @end   = CHARINDEX(' 1', Col1, @start)
            FROM myTable99
           WHERE SUBSTRING(Col1, 1, 5) = 'Cache'

          SELECT @start, @end   -- debug output
      END

      SELECT * FROM Cache

      SET NOCOUNT OFF
      GO

      DROP TABLE Cache
      DROP TABLE myTable99
      GO

      #14; Tue, 11 Dec 2007 17:00:00 GMT
    • Yup, that'll do it, at least logically it looks just right.

      And Pat, moving DTS to prod is not just about Save As, while moving something like this (see above) is just a matter of compiling the procedure on the target server, which is a snap. Of course there are tools like DTSBackup that would allow you to move packages around, but you'll still have to open them in the designer and change the connections.

      #15; Tue, 11 Dec 2007 17:00:00 GMT
    • Of course there are tools like DTSBackup that would allow you to move packages around, but you'll still have to open them in the designer and change the connections.

      Why would you bother to do that?!?!

      -PatP

      #16; Tue, 11 Dec 2007 17:02:00 GMT
    • Hi,

      Thank you very much for your replies, and thanks Brett for your code.

      But when I try to run the code in Query Analyzer, I get the following error:

      bcp.exe - Application Error (Error Dialog box Title)

      The exception unknown software exception (0xc0000409) occurred in the application at location 0x00000000

      Do you have any clue why this error is occurring? Please help!

      --

      Sudar

      #17; Tue, 11 Dec 2007 17:03:00 GMT
    • Even though an error like yours shouldn't happen, did you change the -S parameter to fit your SQL box?
      #18; Tue, 11 Dec 2007 17:04:00 GMT
    • Why would you bother to do that?!?!

      -PatP

      OK...you got me swinging...don't you need to change it to the destination?

      And Sudar...

      Did you change the server name in the bcp command? And is your security set up as mixed mode or SQL Server only?

      And are you logged in as sa?

      What does

      EXEC master..xp_cmdshell 'DIR C:\*.*'

      give you when you execute it?

      #19; Tue, 11 Dec 2007 17:05:00 GMT
    • OK...you got me swinging...don't you need to change it to the destination?

      At least in our case, the server name is all that changes between development and production; the database names stay the same... You could actually allow the database names to change too if you wanted to, but it would add extra work for the simple "one server" cases with little real gain in flexibility. I got the idea for this from a session that Gert did ages ago at Tech-Ed, and it works wonderfully!

      For a purely local connection, use (local) as the server name so that when the Agent starts the job it will run on the local server.

      For a multi-server connection, get the server name(s) from a table in a database that is accessible via (local). The Dev, QA, and production servers will almost always have different values for the rows in that table. You have to create a DTS step that will read the appropriate values and apply them to the connections as needed.

      -PatP

      #20; Tue, 11 Dec 2007 17:06:00 GMT
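
      A sketch of the kind of settings table being described; the table name, columns, and values are made up, and a DTS step (a Dynamic Properties task or an ActiveX Script task, for example) would read these rows at run time and point the package's connections at the right servers:

      -- hypothetical per-environment settings table read by the DTS package at run time
      CREATE TABLE dbo.PackageConfig
      (
          SettingName  sysname      NOT NULL PRIMARY KEY,
          SettingValue varchar(255) NOT NULL
      )

      -- each environment (dev, QA, production) holds its own values in this table
      INSERT INTO dbo.PackageConfig (SettingName, SettingValue)
      SELECT 'SourceServer', '(local)' UNION ALL
      SELECT 'TargetServer', 'PRODSQL01'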