Text files processing
-
There is a task to process and convert files. All files are analog CSV(s with dividers)
- Two or more files. We need to glue them in some fields and give them one processed file.
- One file. The fields must be transformed, the sticker must be made from several lines to one etc., and a single, processed file must also be issued.
Problem is, there's no single file structure. i.e. file files/samples of files may vary by field/structure
To date, the idea of such processing is only one:
- We'll load the file/Files into some OBD (in-memory, for example), pre- runtime reading the file structure - we'll set up the structure tables.
- We're doing some SQL scrip, which is the result of which we're building a exit file.
Question: Can there be some developed algorithms/biote for such conversion/processing to avoid the city of "a traveller"?
UPD: In order to avoid attaching to the structure, of course, some NoSql can be used, but then we will lose full force of SQL
-
We can record the file in the table, for example with columns DOC_NAME, ROW_NUMBER, ROW_CONTENT
DOC_NAME - file name ROW_NUMBER - row number in ROW_CONTENT - count the line from the file, e.g. Name; Name; Name; Road; Mestody - Petr; Semen; 12.12.1980;g. Moscow
This line may then be dissolved by sql on the columns and placed in a pre-defined table:
INSERT INTO PARSED_TABLE1 SELECT TRIM(SUBSTR(CON,0,INSTR(CON,CHR(59))-1)) NAME, TRIM(SUBSTR(CON,INSTR(CON,CHR(59),1,1)+1,INSTR(CON,CHR(59),1,2)- INSTR(CON,CHR(59),1,1)-1)) SURNAME, TRIM(SUBSTR(CON,INSTR(CON,CHR(59),1,2)+1,INSTR(CON,CHR(59),1,3)- INSTR(CON,CHR(59),1,2)-1)) BIRTHDAYDATE, TRIM(SUBSTR(CON,INSTR(CON,CHR(59),1,3)+1)) BIRTHPLACE FROM ( SELECT ROW_CONTENT CON FROM CONTENT_TABLE WHERE DOC_NAME="doc.csv")
By analogy, we make a request to fill PARSED_TABLE2, from the 2nd tables, you glue a new one as you wish, you can stick it straight without PARSED_TABLE. CHR(59) is a symbol ; in ANSII
This is also a good article on reading from .csv in Orasle OBD. http://torofimofu.blogspot.ru/2015/10/oracle-csv.html