World: r3wp
[Parse] Discussion of PARSE dialect
BrianH 18-Dec-2011 [6057x4] | Yeah, blocks for cells are so far outside the data model of everything else that uses CSV files that TO-CSV was written to assume that you forgot to put an explicit translation to a string or binary in there (MOLD, FORM, TO-BINARY), or more likely that the block got in there by accident. Same goes for functions and a few other types. |
As for that TO-ISO-DATE behavior, yes, it's a bug. Surprised I didn't know that you can't use /hour, /minute and /second on date! values with times in them in R2. It can be fixed by changing the date/hour to date/time/hour, etc. I'll update the script on REBOL.org. | |
Having to put in an explicit conversion from blocks, parens, objects, maps, errors, function types, structs, routines and handles reminds you that you would need to explicitly convert them back when you LOAD-CSV. Or, more often, it triggers valuable errors that tell you that unexpected data made it into your output. | |
TO-ISO-DATE fixed on REBOL.org | |
Henrik 18-Dec-2011 [6061] | Thanks |
GrahamC 18-Dec-2011 [6062x2] | Dunno if it's faster, but to left-pad days and months I add 100 to the value and then do a next followed by a form, i.e. regarding your p0 function |
eg. next form 100 + date/month | |
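A minimal sketch of the padding trick GrahamC describes, for values from 0 to 99. The name `pad2` is hypothetical and is not the p0 function from the script on REBOL.org.

```rebol
REBOL []

; Hypothetical helper: left-pad a value in the range 0-99 to two digits.
; Adding 100 guarantees a three-character string ("103", "112", ...),
; and NEXT skips the leading "1".
pad2: func [n [integer!]] [next form 100 + n]

print pad2 3     ; prints 03
print pad2 12    ; prints 12
```

The trick replaces a length test and a conditional insert with one addition, which is the "math instead of logic" point made later in the thread.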
BrianH 18-Dec-2011 [6064] | It's worth timing. I'll try both, in R2 and R3. |
GrahamC 19-Dec-2011 [6065] | and the outcome was? |
BrianH 19-Dec-2011 [6066x2] | Twice the speed using your method :) |
Updated on REBOL.org to use new method. | |
GrahamC 20-Dec-2011 [6068] | Yeah, generally math is faster than using logic. An old Forth trick. |
BrianH 20-Dec-2011 [6069] | Added a TO-CSV /with delimiter option, in case commas aren't your thing. It only specifies the field delimiter, not the record delimiter, since TO-CSV only makes CSV lines, not whole files. |
Endo 20-Dec-2011 [6070] | I'm using it to prepare data for bulk insert into a SQL Server table using the BCP command line tool. I need to make some changes, like a /no-quote option to not quote string values, because there is no option in BCP to tell it that my data has quoted string values. |
BrianH 20-Dec-2011 [6071] | Be careful, if you don't quote string values then the character set of your values can't include cr, lf or your delimiter. It requires so many changes that it would be more efficient to add new formatter functions to the associated FUNCT/with object, then duplicate the code in TO-CSV that calls the formatter. Like this:

    to-csv: funct/with [
        "Convert a block of values to a CSV-formatted line in a string."
        data [block!] "Block of values"
        /with "Specify field delimiter (preferably char, or length of 1)"
        delimiter [char! string! binary!] {Default ","}
        ; Empty delimiter, " or CR or LF may lead to corrupt data
        /no-quote "Don't quote values (limits the characters supported)"
    ] [
        output: make block! 2 * length? data
        delimiter: either with [to-string delimiter] [","]
        either no-quote [
            unless empty? data [append output format-field-nq first+ data]
            foreach x data [append append output delimiter format-field-nq :x]
        ] [
            unless empty? data [append output format-field first+ data]
            foreach x data [append append output delimiter format-field :x]
        ]
        to-string output
    ] [
        format-field: func [x [any-type!] /local qr] [
            ; Parse rule to put double-quotes around a string, escaping any inside
            qr: [return [insert {"} any [change {"} {""} | skip] insert {"}]]
            case [
                none? :x [""]
                any-string? :x [parse copy x qr]
                :x = #"^(22)" [{""""}]
                char? :x [ajoin [{"} x {"}]]
                money? :x [find/tail form x "$"]
                scalar? :x [form x]
                date? :x [to-iso-date x]
                any [any-word? :x binary? :x any-path? :x] [parse to-string :x qr]
                'else [cause-error 'script 'expect-set reduce [
                    [any-string! any-word! any-path! binary! scalar! date!] type? :x
                ]]
            ]
        ]
        format-field-nq: func [x [any-type!]] [
            case [
                none? :x [""]
                any-string? :x [x]
                money? :x [find/tail form x "$"]
                scalar? :x [form x]
                date? :x [to-iso-date x]
                any [any-word? :x binary? :x any-path? :x] [to-string :x]
                'else [cause-error 'script 'expect-set reduce [
                    [any-string! any-word! any-path! binary! scalar! date!] type? :x
                ]]
            ]
        ]
    ]

If you want to add error checking to make sure the data won't be corrupted, you'll have to pass in the delimiter to format-field-nq and trigger an error if it, cr or lf are found in the field data. |
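The error checking suggested above could look roughly like the following. This is only a sketch: `check-unquoted` is a hypothetical helper, not part of the published script, and it assumes the field has already been converted to a string and that the delimiter is a non-empty string.

```rebol
REBOL []

; Hypothetical helper, not part of the script on REBOL.org.
; Returns FIELD unchanged, or raises an error when the text could not be
; read back correctly from an unquoted CSV line.
check-unquoted: func [
    "Return FIELD unchanged, or raise an error if it cannot be written unquoted."
    field [string!] "Already-formatted field text"
    delimiter [string!] "Field delimiter in use (assumed non-empty)"
] [
    if any [
        find field delimiter
        find field cr
        find field lf
    ] [
        do make error! "Unquoted field contains the delimiter, CR or LF"
    ]
    field
]

print check-unquoted "plain text" ","          ; prints plain text
print error? try [check-unquoted "a,b" ","]    ; prints true
```

In the mockup, format-field-nq would take the delimiter as a second argument and pass each string result through a check of this kind before returning it.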
Henrik 20-Dec-2011 [6072] | Is this related to what you wrote above?

    >> to-csv [34]
    == {""""} |
BrianH 20-Dec-2011 [6073x3] | Nope, that's a bug in the R2 version only. Change this:

    :x = #"^(22)" [{""""}]

to this:

    :x == #"^(22)" [{""""}]

Another incompatibility between R2 and R3 that I forgot :( I'll update the script on REBOL.org. |
Weirdly enough, = and =? return true in that case in R2, but only == returns false; false is what I would expect for =? at least. | |
Updated, Henrik. | |
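The comparison behind this bug can be tried at the console with a sketch like the one below. The result noted for `=` is the R2 behavior reported in this thread, not something verified here; the last line shows one way to avoid depending on it by testing the datatype first.

```rebol
REBOL []

x: 34   ; integer with the same code point as the double-quote char

; Loose equality: reported above as true in R2, which is why
; to-csv [34] produced {""""}.
print x = #"^(22)"

; Strict equality also compares datatypes, so an integer never matches a char.
print x == #"^(22)"

; Version-independent alternative: check the datatype before comparing.
print either all [char? :x  :x = #"^(22)"] [
    "double-quote char"
] [
    "not a double-quote char"
]
```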
Henrik 20-Dec-2011 [6076] | Thanks. |
Endo 20-Dec-2011 [6077] | Thanks BrianH |
BrianH 20-Dec-2011 [6078x2] | Note that that was a first-round mockup of the R3 version, Endo. If you want to make an R2 version, download the latest script and edit it similarly. |
Have you looked into the native type formatting of bcp? It might be easier to make a more precise data file that way. | |
Endo 20-Dec-2011 [6080x2] | It uses a format file, which is very strict, but there is no way to set a quote char for fields. |
Native format works well if you export from one SQL Server and import into another. | |
BrianH 20-Dec-2011 [6082] | I figure it might be worth it (for me at some point) to do some test exports in native format in order to reverse-engineer the format, then write some code to generate that format ourselves. I have to do a lot of work with SQL Server, so it seems inevitable that such a tool will be useful at some point, or at least the knowledge gained in the process of writing it. |
Endo 20-Dec-2011 [6083x2] | The biggest problem would be the different datatypes for different versions of SQL Server, if there is no good documentation for the native format. But BCP does the job quite well. I CALL it when necessary and use FIND to check whether there is any error output. There are XML format files as well, which are easier to understand but have no functional differences from non-XML format files. |
I've been working with SQL Server for a long time; if there is anything I can help with or test for you, feel free to ask. | |
Endo 5-Jan-2012 [6085] | Does anyone know where I can find Rebolek's R2E2 - REBOL Regular Expressions Engine? This link is dead, I think: http://bolek.techno.cz/reb/regex.r I saw it on http://www.rebol.org/documentation.r?script=regset.r |
Rebolek 5-Jan-2012 [6086] | Endo, I will try to find the newest version and let you know. But do not expect it to translate every regular expression. |
Endo 5-Jan-2012 [6087:last] | Thank you, I don't need an exact regexp library, but it would be nice to have some regexp functionality. |