Useful Regex : CSV

 

29 Apr 2013, posted by antoine

//CSV: Change delimiter
//Changes the delimiter from a comma to a tab: replace each match with the first backreference followed by a tab.
//The capturing group consumes a double-quoted entry whole, so delimiters inside double-quoted entries are left alone.
'("[^"\r\n]*")?,(?![^",\r\n]*"$)'
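
//A minimal Python sketch of the replacement step, assuming the pattern above is meant to be
//used with "backreference 1 + tab" as the replacement text. The function name commas_to_tabs
//is hypothetical; re.MULTILINE is added so that $ applies to every line of a multi-line string.

```python
import re

# Same pattern as above, with the double quotes and \r\n escapes written out.
DELIMITER = re.compile(r'("[^"\r\n]*")?,(?![^",\r\n]*"$)', re.MULTILINE)


def commas_to_tabs(text: str) -> str:
    """Replace field-separating commas with tabs, keeping quoted fields intact."""
    # Group 1 holds a quoted field when one directly precedes the comma;
    # it is None otherwise, so fall back to an empty string.
    return DELIMITER.sub(lambda m: (m.group(1) or "") + "\t", text)


if __name__ == "__main__":
    print(commas_to_tabs('x,"a,b",c'))
```

//The comma inside "a,b" survives because the quoted field is swallowed by the capturing group
//and written back unchanged.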

//CSV: Complete row, all fields.
//Match complete rows in a comma-delimited file that has 3 fields per row,
//capturing each field into a backreference.
//To match CSV rows with more or fewer fields, simply duplicate or delete the capturing groups.
'^("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*)$'
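
//A short Python sketch of the three-field row match. The names FIELD, ROW3 and the sample data
//are illustrative assumptions; the pattern is assembled from one field sub-pattern repeated three
//times, which is equivalent to writing the three capturing groups out by hand.

```python
import re

# One field: either a double-quoted entry or a run of non-comma characters.
FIELD = r'("[^"\r\n]*"|[^,\r\n]*)'

# Three fields separated by commas, anchored to a whole line.
ROW3 = re.compile("^" + ",".join([FIELD] * 3) + "$", re.MULTILINE)

sample = 'alice,"Paris, FR",42\nbob,London\n'

# Only rows with exactly three fields match; the two-field row is skipped.
rows = [m.groups() for m in ROW3.finditer(sample)]

if __name__ == "__main__":
    for row in rows:
        print(row)
```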

//CSV: Complete row, certain fields.
//Set %SKIPLEAD% to the number of fields you want to skip at the start, and %SKIPTRAIL% to
//the number of fields you want to ignore at the end of each row.
//This regex captures 3 fields into backreferences. To capture more or fewer fields,
//simply duplicate or delete the capturing groups.
'^(?:(?:"[^"\r\n]*"|[^,\r\n]*),){%SKIPLEAD%}("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*)(?:,(?:"[^"\r\n]*"|[^,\r\n]*)){%SKIPTRAIL%}$'
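
//A hedged Python sketch of filling in the %SKIPLEAD% and %SKIPTRAIL% placeholders. The helper
//row_pattern is hypothetical. It assumes each skipped trailing field is preceded by its comma
//(comma first, then the field), so that a row does not have to end with a comma.

```python
import re

# One field, without a capturing group, so it can be reused in skipped positions.
FIELD = r'"[^"\r\n]*"|[^,\r\n]*'


def row_pattern(skip_lead: int, skip_trail: int) -> re.Pattern:
    """Build a whole-row pattern that captures 3 fields between skipped ones."""
    captured = ",".join(f"({FIELD})" for _ in range(3))
    return re.compile(
        rf"^(?:(?:{FIELD}),){{{skip_lead}}}"   # leading fields, each followed by a comma
        rf"{captured}"                          # the three captured fields
        rf"(?:,(?:{FIELD})){{{skip_trail}}}$",  # trailing fields, each preceded by a comma
        re.MULTILINE,
    )


if __name__ == "__main__":
    m = row_pattern(1, 1).match('id,a,"b,c",d,tail')
    print(m.groups() if m else None)
```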

//CSV: Partial row, certain fields
//Match the first SKIPLEAD+3 fields of each row in a comma-delimited file that has SKIPLEAD+3
//or more fields per row. The 3 fields after SKIPLEAD are each captured into a backreference.
//All other fields are ignored. Rows that have fewer than SKIPLEAD+3 fields are skipped.
//To capture more or fewer fields, simply duplicate or delete the capturing groups.
'^(?:(?:"[^"\r\n]*"|[^,\r\n]*),){%SKIPLEAD%}("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*)'
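
//A small Python sketch showing how short rows are skipped. SKIPLEAD = 2 and the sample data are
//assumed values chosen for illustration only.

```python
import re

SKIPLEAD = 2  # hypothetical value substituted for %SKIPLEAD%
FIELD = r'"[^"\r\n]*"|[^,\r\n]*'

# No trailing $ anchor: anything after the third captured field is ignored.
PARTIAL = re.compile(
    rf"^(?:(?:{FIELD}),){{{SKIPLEAD}}}({FIELD}),({FIELD}),({FIELD})",
    re.MULTILINE,
)

data = "1,2,3,4,5,6\n1,2,3,4\n"

# The first row has six fields, so fields 3 to 5 are captured.
# The second row has only four fields (fewer than SKIPLEAD+3) and is skipped.
found = [m.groups() for m in PARTIAL.finditer(data)]

if __name__ == "__main__":
    print(found)
```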

//CSV: Partial row, leading fields
//Match the first 3 fields of each row in a comma-delimited file that has 3 or more fields per row.
//The first 3 fields are each captured into a backreference. All other fields are ignored.
//Rows that have fewer than 3 fields are skipped. To capture more or fewer fields,
//simply duplicate or delete the capturing groups.
'^("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*)'
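
//Note that a captured quoted field still carries its surrounding double quotes. The Python
//sketch below strips them after matching; the helper first_three is a hypothetical name, and
//the unquoting step is an addition for illustration, not part of the regex itself.

```python
import re

LEADING3 = re.compile(
    r'^("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*),("[^"\r\n]*"|[^,\r\n]*)'
)


def first_three(line: str):
    """Return the first three fields of a row, unquoted, or None if it has fewer."""
    m = LEADING3.match(line)
    if m is None:
        return None
    # Drop one pair of enclosing double quotes from fields that have them.
    return tuple(
        f[1:-1] if len(f) >= 2 and f[0] == '"' and f[-1] == '"' else f
        for f in m.groups()
    )


if __name__ == "__main__":
    print(first_three('"Doe, Jane",42,NY,extra'))
```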

//CSV: Partial row, variable leading fields
//Match the first 3 fields of each row in a comma-delimited file.
//The first 3 fields are each captured into a backreference.
//All other fields are ignored. If a row has fewer than 3 fields, some of the backreferences
//will remain empty. To capture more or fewer fields, simply duplicate or delete the capturing groups.
//The question mark after each non-capturing group makes that comma and field optional.
'^("[^"\r\n]*"|[^,\r\n]*)(?:,("[^"\r\n]*"|[^,\r\n]*))?(?:,("[^"\r\n]*"|[^,\r\n]*))?'
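
//A Python sketch of the variable-length case. It assumes the comma is placed inside each optional
//non-capturing group, so that rows with one or two fields can still match. The names FIELD and
//UP_TO_3 are illustrative. In Python an unmatched group is None, so groups(default="") is used
//to get empty strings instead.

```python
import re

FIELD = r'("[^"\r\n]*"|[^,\r\n]*)'

# First field is required; the second and third come with their comma and are optional.
UP_TO_3 = re.compile(rf"^{FIELD}(?:,{FIELD})?(?:,{FIELD})?")


def leading_fields(line: str):
    """Return up to three leading fields, padding missing ones with ''."""
    m = UP_TO_3.match(line)
    return m.groups(default="") if m else None


if __name__ == "__main__":
    for line in ("a", "a,b", "a,b,c,d"):
        print(leading_fields(line))
```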