A module-free way to process delimited data with row independence in Perl
Example CSV input content (myfile.csv):
last,first,phone,zipcode,country
jones,jim,314-555-1212,63033,usa
smith,john,314-555-1001,63146,usa
doe,jane,314-555-0019,63141,usa
smith,jim,314-555-1210,65401,usa
The script below reads the CSV input, splits each row into a hash keyed by the header names, and tallies the zip codes and first names it sees:
#!/usr/bin/perl -w
#
# simplecsv.pl - a quick, module-free way to deal with CSV data
#
use strict;
# declare vars
my (@lines,@fields,%csvhash,$key,@keys,$i,
%allzips,%firstnames);
# read the input file into an array
open( my $input, "<", "myfile.csv" ) or die "cannot open myfile.csv: $!\n";
@lines = <$input>;
close $input;
# take the first line and create a list of headers out of it
chomp $lines[0];
@fields = split( /,/, $lines[0] );
print "top row fields from csv file are: @fields\n\n";
# delete that first line now
shift @lines;
# now process each remaining line
foreach my $line ( @lines ) {
    chomp $line;
    # skip the line if it is empty or a comment
    next if ( $line =~ /^\s*$/ ); # skip empty lines
    next if ( $line =~ /^\s*#/ ); # skip lines that start with #
    # store this line into a hash table keyed with the fields above;
    # the -1 limit keeps trailing empty fields instead of dropping them
    @csvhash{ @fields } = split( /,/, $line, -1 );
    # print some data as we parse each line
    print "$csvhash{first} $csvhash{last} lives in $csvhash{country} and can be reached at $csvhash{phone}\n";
    # store the data in some hash tables using the header names
    $allzips{$csvhash{zipcode}}++;
    $firstnames{$csvhash{first}}++;
}
# now that we have all the data stored, do something with it
print "\n";
# count and display the unique zip codes
@keys = sort keys %allzips;
$i = $#keys + 1;
print "we saw $i unique zip codes, they are: @keys\n\n";
# count and display the first names and the number of times that each appeared
@keys = sort keys %firstnames;
$i = $#keys + 1;
print "we saw $i first names, they are:\n";
print "\t name\t occurrences\n";
foreach $key (@keys) {
    print "\t $key\t $firstnames{$key}\n";
}
Output from the program is as follows:
gvolk@wumpus:~/scripts$ ./simplecsv.pl
top row fields from csv file are: last first phone zipcode country

jim jones lives in usa and can be reached at 314-555-1212
john smith lives in usa and can be reached at 314-555-1001
jane doe lives in usa and can be reached at 314-555-0019
jim smith lives in usa and can be reached at 314-555-1210

we saw 4 unique zip codes, they are: 63033 63141 63146 65401

we saw 3 first names, they are:
         name    occurrences
         jane    1
         jim     2
         john    1
gvolk@wumpus:~/scripts$
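Because every value is looked up by its header name rather than by its position, the same approach should keep working when the columns are reordered or a different delimiter is used. The sketch below is one way to package that idea. The parse_delimited subroutine, its arguments, and the pipe-delimited sample lines are hypothetical and not part of the original script; the names and phone numbers are reused from myfile.csv. Like the script above, it simply splits on the delimiter, so it assumes that no field contains the delimiter or quoted text.

```perl
use strict;
use warnings;

# parse_delimited (hypothetical helper, not part of simplecsv.pl):
# takes a delimiter string and a list of lines, treats the first line
# as the header row, and returns one hash reference per data row.
sub parse_delimited {
    my ( $delim, @lines ) = @_;
    return unless @lines;

    # quotemeta escapes characters such as | so that the delimiter is
    # matched literally instead of being read as regex syntax
    my $sep = quotemeta $delim;

    my $header = shift @lines;
    $header =~ s/\r?\n\z//;               # strip the line ending (LF or CRLF)
    my @fields = split( /$sep/, $header );

    my @rows;
    foreach my $line (@lines) {
        $line =~ s/\r?\n\z//;
        next if $line =~ /^\s*$/;         # skip empty lines
        next if $line =~ /^\s*#/;         # skip lines that start with #

        # hash slice: pair each header name with the value in the same
        # position; the -1 limit keeps trailing empty fields
        my %row;
        @row{@fields} = split( /$sep/, $line, -1 );
        push @rows, \%row;
    }
    return @rows;
}

# a subset of the myfile.csv columns, reordered and pipe-delimited
my @input = (
    "phone|first|last\n",
    "314-555-1212|jim|jones\n",
    "314-555-0019|jane|doe\n",
);

my @rows = parse_delimited( '|', @input );
foreach my $row (@rows) {
    print "$row->{first} $row->{last} can be reached at $row->{phone}\n";
}
```

Returning one hash reference per row, instead of reusing a single hash as the script above does, keeps every row available after the loop ends, at the cost of holding the whole file in memory.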