Thursday 21 June 2012

Debian setup

# used by jenkins
apt-get install daemon
#apt-get install jenkins
apt-get install tomcat6
apt-get install tomcat6-admin
apt-get install curl
apt-get install r-base
apt-get install r-cran-rmysql

apt-get install texlive-xetex
apt-get install texlive-science
apt-get install texlive-latex-extra
apt-get install subversion
apt-get install cvs
apt-get install git
apt-get install wkhtmltopdf
apt-get install okular

apt-get install emacs23-nox

apt-get install jedit

apt-get install git-svn

apt-get install mailutils # smarthost

apt-get install graphviz


R
> install.packages("Rserve")
> install.packages("car") # may not be used
> install.packages("AER") #used by ParasiteClearance
> install.packages("DBI") #used by clinRepRep
> install.packages("RMySQL") #used by clinRepRep
>

Others currently installed on app-dev:
Cairo                   R graphics device using cairo graphics library
                        for creating high-quality bitmap (PNG, JPEG,
                        TIFF), vector (PDF, SVG, PostScript) and
                        display (X11 and Win32) output.
car                     Companion to Applied Regression
Formula                 Extended Model Formulas
lmtest                  Testing Linear Regression Models
sandwich                Robust Covariance Matrix Estimators
strucchange             Testing, Monitoring, and Dating Structural
                        Changes
zoo                     S3 Infrastructure for Regular and Irregular
                        Time Series (Z's ordered observations)


cairoDevice             Cairo-based cross-platform antialiased graphics
                        device driver.
Rserve                  Binary R server

chmod o+w /var/lib/tomcat6/webapps/
cp /home/timp/.m2/repository/org/springframework/spring-instrument-tomcat/3.0.5.RELEASE/spring-instrument-tomcat-3.0.5.RELEASE.jar /usr/share/tomcat6/lib/
Ensure /etc/tomcat6/tomcat6-users.xml contains:
<role rolename="poweruser" />
<role rolename="poweruserplus" />
<role rolename="probeuser" />

<user username="admin" password="" roles="manager,admin" />

Get scrollbars in Eclipse:


 sudo su -c 'echo export LIBOVERLAY_SCROLLBAR=0 > /etc/X11/Xsession.d/80overlayscrollbars'

Get innotop working


perl -MCPAN -e shell
CPAN> install Term::ReadKey 

Ensure Rserve and Jenkins start at reboot


cd /etc/init.d
update-rc.d jenkins defaults
update-rc.d Rserve defaults

Set up networking.

Set up GMail as smart host for postfix.

Send me the IP address on boot.

Saturday 9 June 2012

Siemens S16-39 washing machine: code F18 - fixed

Our Siemens S16-39 washing machine started displaying F18 and beeping (not the normal 'finished' beep).

We switched it off. Googling suggested that the impeller was broken/unable to move.

Switching it back on produced a straining noise, so we switched it off again.

I turned the large pipe (front right bottom) through ninety degrees.

Switched back on, turned to Empty. No bad noise and emptying started.

When it was empty I placed a towel under the outlet and completely undid the large pipe.

Inside were human hair, a shirt stiffener and a piece of slate. I removed these and all was well!

Friday 1 June 2012

Representing empty CSV columns in a database

The particular case I want to nail down is a CSV file where field values are not quoted:

key,name,options,
1,one,,

The example can be represented as a three-field table with non-nullable fields key and name and a nullable field options (yes, I know about the trailing comma; they do occur in the wild).

For these purposes the data can be divided into Strings and others, as an empty string ("") is not a legal value for any other type.

The case is straightforward for XML. Paired tags with no content represent a zero-length string. Unpaired tags (empty elements) or missing tags represent null.

For CSV the situation is more confused. When importing CSV files there are three possibilities for a column: it is not present, empty or filled with a value.

If the column is missing then the value in the database should be null.

If the column is present and empty and the column type is not String then the value should be null; for a String column the value might be either an empty string or null: we have to decide.

Both string treatments can be argued for:

  1. A present, empty column, represented by two consecutive commas, is just the limiting case of a normal string and so represents a string of zero length. This ensures that no null value will ever enter that column.
  2. String handling should be consistent with the handling of other types where an empty column represents null, hence an empty string is an illegal value in a non-nullable column.

The problem with position 1 is that there is no way to represent null;
the problem with position 2 is that there is no way to represent the empty string.
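
The three cases and the two positions can be made concrete with a small sketch. This is only an illustration, not the importer I actually use: the function name parse_rows, the flag empty_string_as_null and the extra column notes are invented for the example, and every column is treated as a String.

```python
import csv
import io

def parse_rows(text, columns, empty_string_as_null):
    """Parse unquoted CSV text into one dict per data row.

    columns: names of the columns the target table expects (hypothetical).
    empty_string_as_null: True for position 2 (empty means null),
    False for position 1 (empty means a zero-length string).
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for record in reader:
        # A trailing comma yields an extra unnamed column, skipped here.
        present = {name: value for name, value in zip(header, record) if name}
        row = {}
        for column in columns:
            if column not in present:
                row[column] = None             # column missing from the file
            elif present[column] == "" and empty_string_as_null:
                row[column] = None             # present but empty, position 2
            else:
                row[column] = present[column]  # a value, or "" under position 1
        rows.append(row)
    return rows

SAMPLE = "key,name,options,\n1,one,,\n"
```

Calling parse_rows(SAMPLE, ["key", "name", "options", "notes"], False) leaves options as "" while notes, which is absent from the file, becomes None; passing True instead makes options None as well.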

Our hand is forced, as it happens, by the behaviour of MySQL, which in characteristic fashion chooses the wrong option.

MySQL chooses option 1, loading an empty String column as a zero-length string, so if we want nulls we must either hand-craft the import or post-process the column.

I have chosen to post-process all nullable string columns to convert empty strings to null.

I do not believe there is a use case for storing zero length strings in nullable database columns.
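
The post-processing step amounts to one UPDATE per nullable string column. A minimal sketch follows, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the table name example and the helper names are assumptions made for the illustration, and on MySQL the identifiers would normally be quoted with backticks rather than double quotes.

```python
import sqlite3

def empty_strings_to_null(connection, table, columns):
    """Set each listed column to NULL wherever it holds a zero-length string."""
    for column in columns:
        # Identifiers cannot be bound as parameters, so they are interpolated;
        # only pass trusted table and column names.
        connection.execute(
            f'UPDATE "{table}" SET "{column}" = NULL WHERE "{column}" = ?', ("",)
        )
    connection.commit()

def demo():
    # In-memory SQLite database standing in for the real MySQL schema.
    connection = sqlite3.connect(":memory:")
    connection.execute(
        'CREATE TABLE example ("key" INTEGER NOT NULL, name TEXT NOT NULL, options TEXT)'
    )
    connection.executemany(
        "INSERT INTO example VALUES (?, ?, ?)",
        [(1, "one", ""), (2, "two", "x")],
    )
    empty_strings_to_null(connection, "example", ["options"])
    return connection.execute(
        'SELECT "key", options FROM example ORDER BY "key"'
    ).fetchall()
```

Only the nullable column is passed in; non-nullable columns such as name are left alone, since a NULL there would violate the constraint.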