A CSV File Parser

August 13, 2006 – 22:33 | java

Yeah, right, keep using String.split(",") to parse CSV files, and one of these days you’ll run into a comma hidden inside a pair of quotes - as I did while trying to import an Excel-generated CSV file into a database. So I sat down and wrote this CSVParser with java.util.regex - it actually turned out to be a fun brain teaser to make a CSV parser that can parse any CSV file as defined by the Specification. To cope with the huge files I was importing, an additional requirement was to parse in stream mode, instead of having to read the entire file into memory first. That seemed easy at first but became quite a hassle because of the “a new line character can appear in a double-quoted field” part of the Spec.
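
To make the two problems concrete, here is a minimal sketch of a quote-aware, streaming reader. It is not the CSVParser mentioned above (which does the work with a single regular expression); the class name CsvSketch and the methods parseRecord, readRecord and hasOpenQuote are hypothetical names made up for illustration. It assumes commas as separators, double quotes as the only quoting character, and "" as the escaped quote. It also assumes that a record is complete once it contains an even number of quote characters, and it rejoins continuation lines with "\n", so an original "\r\n" inside a quoted field is not preserved.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CsvSketch {
    // One field: either "quoted" (where "" stands for a literal quote) or bare text.
    private static final Pattern FIELD =
            Pattern.compile("\"([^\"]*(?:\"\"[^\"]*)*)\"|([^\",\r\n]*)");

    /** Splits one complete record into its fields. */
    static List<String> parseRecord(String record) {
        List<String> fields = new ArrayList<>();
        Matcher m = FIELD.matcher(record);
        int pos = 0;
        while (true) {
            // Match exactly one field starting at pos.
            m.region(pos, record.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Malformed CSV near index " + pos);
            }
            String quoted = m.group(1);
            // Quoted fields need their doubled quotes collapsed; bare fields are used as-is.
            fields.add(quoted != null ? quoted.replace("\"\"", "\"") : m.group(2));
            pos = m.end();
            if (pos == record.length()) {
                return fields;
            }
            if (record.charAt(pos) != ',') {
                throw new IllegalArgumentException("Malformed CSV near index " + pos);
            }
            pos++; // skip the separator
        }
    }

    /** Reads the next record from the stream, or returns null at end of input. */
    static List<String> readRecord(BufferedReader in) throws IOException {
        String line = in.readLine();
        if (line == null) {
            return null;
        }
        StringBuilder record = new StringBuilder(line);
        // An odd number of quotes means a quoted field continues on the next line.
        while (hasOpenQuote(record)) {
            String more = in.readLine();
            if (more == null) {
                throw new IOException("Unterminated quoted field");
            }
            record.append('\n').append(more);
        }
        return parseRecord(record.toString());
    }

    private static boolean hasOpenQuote(CharSequence s) {
        int quotes = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '"') {
                quotes++;
            }
        }
        return quotes % 2 != 0;
    }

    public static void main(String[] args) throws IOException {
        String data = "id,note\n1,\"Hello, world\"\n2,\"two\nlines\"\n";
        BufferedReader in = new BufferedReader(new StringReader(data));
        List<String> record;
        while ((record = readRecord(in)) != null) {
            System.out.println(record.size() + " fields: " + record);
        }
    }
}
```

Only one record is held in memory at a time, which is the point of stream mode. The cost is that a stray, unbalanced quote makes the reader keep pulling lines until the input runs out, so a real implementation would probably want a limit on record size.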

4 Responses to “A CSV File Parser”

  1. http://ostermiller.org/utils/CSV.html

    By Tony P on Aug 14, 2006

  2. FYI: http://jakarta.apache.org/commons/sandbox/csv/

    By Torsten Curdt on Aug 14, 2006

  3. Jakarta commons is putting together a CSV implementation. Perhaps you’d consider donating/helping out?
    http://jakarta.apache.org/commons/sandbox/csv/

    By Stephen Colebourne on Aug 14, 2006

  4. Tony,

    Thanks, Ostermiller looks very nice and is obviously far more polished than mine, which to me was more of a “fun brain teaser” (to quote myself :) ). That is, the fun comes from doing it with one regular expression rather than writing or generating an FSM.

    Torsten, Stephen,

    Good to see Commons CSV getting started. I’ll definitely see if I can help out.

    By Jing Xue on Aug 17, 2006
