Thursday, June 01, 2006

EOL character transformations in CSV files on Solaris

I had a CSV dump from SQL Server tables that I needed to load into Oracle using SQL*Loader. The first and most common problem is a ^M (carriage return) at the end of each line. It creeps in when a file is FTPed from DOS to UNIX. The dos2unix command takes care of this very easily.

The second and more frustrating problem was newline characters inside user-entered fields, such as the 'comments' field. Any SQL*Loader script failed on such lines, because each embedded newline cut the record short. Looking at the octal dump of the file using 'od -c', I noticed that newlines in user comments were a bare '\n', while real EOL newlines were a '\r' followed by a '\n'. This might vary in different flavors of Unix, but Solaris 8 showed the preceding behavior.

All I had to do was find every '\n' without a preceding '\r'. This is simple using a search and replace in sed with lookbehind (?<!char), right? Wrong. Solaris sed doesn't support lookbehind, which is a Perl-style extension and not part of POSIX regular expressions. (Its regex syntax is pretty weird in general, but that needs more investigation.) So I had to do a byte replace in two steps.
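Step 1) Replace all '\n' with a space or whatever character you'd like. This also replaces the '\n' of each real EOL pair, so every record after the first ends up starting with that character.
Step 2) Now that '\r' is the only line-ending character remaining, replace it with '\n' (the real EOL newlines). Run both steps on the file as transferred, before dos2unix, since the '\r' is what marks the real EOLs. The commands used were: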

od -c <filename>

1) tr '\n' ' ' < <oldfile> > <tempfile>
2) tr '\r' '\n' < <tempfile> > <newfile>
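
For comparison, here is a minimal sketch of the lookbehind approach that sed could not do, written in Java, whose regex engine does support (?<!...). This is not what I ran on Solaris; the class name EolFix and the method fixEol are made up for illustration, and it assumes the file fits in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EolFix {
    // Replace every '\n' that is NOT preceded by '\r' with a space,
    // then collapse the real "\r\n" terminators to a plain '\n'.
    static String fixEol(String raw) {
        return raw.replaceAll("(?<!\\r)\\n", " ").replace("\r\n", "\n");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java EolFix <oldfile> <newfile>");
            return;
        }
        // ISO-8859-1 maps each byte to one char, so other bytes pass through unchanged.
        String raw = new String(Files.readAllBytes(Paths.get(args[0])),
                                StandardCharsets.ISO_8859_1);
        Files.write(Paths.get(args[1]),
                    fixEol(raw).getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

Unlike the two tr passes, this leaves no stray replacement character at the start of each record, because the '\n' of a real "\r\n" pair is never touched by the first replacement.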

Thursday, December 15, 2005

Macro String Comparisons in SAS

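String comparisons inside macros in SAS are hard to make sense of. The rule of thumb I follow is:
a. If you're comparing with a %if-%then, then the comparison is made with NO QUOTES. The macro language treats everything as text, so any quotes you add become part of the value being compared.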

b. However, if you're comparing within a data step inside a macro, normal data-step comparison rules, such as using quotes for a string literal, will have to be followed.

I don't even want to get into the single/double quote dilemma, especially in ODS.

Wednesday, November 30, 2005

A function inside a JSP

This is something that I recently found out the hard way. In JSP, never, ever declare global variables using a <%! ... %>. When the servlet is generated, these variables are placed outside the _jspService() method, so they become fields of the servlet instance and are shared by every request that hits the page.

Why did I do this in spite of dire warnings? I needed to call a function inside a JSP. The simple solution: Don't. Put the method inside a bean and call it from there. MVC, remember!?
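
To make the problem concrete, here is a rough sketch of the difference between a <%! ... %> variable and a scriptlet variable. GeneratedPage and handleRequest are hypothetical stand-ins for the servlet class and the _jspService() method a container would generate; real generated code looks different, and this only mimics where the two kinds of variable end up.

```java
public class DeclarationDemo {
    // Simplified stand-in for a servlet generated from a JSP (hypothetical).
    static class GeneratedPage {
        // Would come from: <%! int declared = 0; %>
        // One copy per servlet instance, so it survives across requests.
        private int declared = 0;

        // Stand-in for _jspService(): called once per request.
        int[] handleRequest() {
            // Would come from: <% int scriptlet = 0; %>
            // A local variable, created fresh on every call.
            int scriptlet = 0;
            declared++;
            scriptlet++;
            return new int[] { declared, scriptlet };
        }
    }

    public static void main(String[] args) {
        GeneratedPage page = new GeneratedPage(); // the container keeps one instance
        int[] first = page.handleRequest();
        int[] second = page.handleRequest();
        System.out.println("request 1: declared=" + first[0] + " scriptlet=" + first[1]);
        System.out.println("request 2: declared=" + second[0] + " scriptlet=" + second[1]);
    }
}
```

The second request sees declared=2 while scriptlet is back to 1: the declared variable leaks state from one request to the next. With concurrent requests, the unsynchronized increment is also a race.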

Tuesday, November 08, 2005

will code for food

See dev2r.
dev2r needs a job.
dev2r can code.
code dev2r, code.

email iam.dev2r (at) gmail (dot) com if you'd like to see his resume or if you can forward it to someone who has an opening.

Tuesday, November 01, 2005

Calling a macro from a SAS data step

The CALL EXECUTE routine inside a data step will delay execution of the macro until runtime.

Although this has its advantages (you can pass a value from each observation and compute to your heart's content), it has a big disadvantage: a CALL SYMPUT inside the macro doesn't create its macro variable immediately, because the SAS code the macro generates runs only after the calling data step has finished. This makes CALL EXECUTE particularly useless as a function call inside a loop (to speak in normal programming terms).

The workaround? Jeff suggests placing the entire data step inside the macro, thus avoiding any macro variable passing between the data step and the macro call. So instead of passing a variable name as a macro parameter, pass the dataset name and add the new variable to the dataset inside the macro.

Tuesday, August 02, 2005

Printing page breaks using CSS

You can set page breaks using CSS. I used this when I needed to print an HTML report with 4-7 rows per record (some records broke over two pages, reducing readability). I checked the record count and inserted a page break before every fifth record, like this:

if ((i % 5 == 0) && (i != 0)) {
    out.println("<tr class=\"breakhere\"><td></td></tr>");
}

The CSS for breakhere is:

tr.breakhere { page-break-before: always; }

More here.
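
The page-break check above, placed in a complete loop, might look like the sketch below. PageBreakRows and renderRows are made-up names, and the "Record N" cell is placeholder content; in the real JSP the rows were written straight to out.

```java
public class PageBreakRows {
    // Builds the table rows for `count` records, emitting an empty
    // "breakhere" row before every fifth record (but not before the first).
    static String renderRows(int count) {
        StringBuilder html = new StringBuilder();
        for (int i = 0; i < count; i++) {
            if (i % 5 == 0 && i != 0) {
                html.append("<tr class=\"breakhere\"><td></td></tr>\n");
            }
            // Placeholder for the 4-7 rows a real record would produce.
            html.append("<tr><td>Record ").append(i + 1).append("</td></tr>\n");
        }
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderRows(12));
    }
}
```

With 12 records the marker row appears before the 6th and 11th records, so each printed page holds five records.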

Hello World

This blog will contain code fragments/tricks I use while programming. Might also contain other geekery, who knows!?

