CrazyEngineers Archive
Old, but evergreen and popular discussions on CrazyEngineers, presented to you in read-only mode.
@Aashish Joshi • 21 Dec, 2015
Hi,

I sometimes work (sort, filter, create pivot tables, etc) with CSV files that have a few million rows.

Since Excel & Open office don't support more than 1048576 rows I have to first upload the CSVs into a database and then work on them - which isn't very convenient as I have to spend extra time on uploading the data. Depending on the file size this sometimes takes an hour or so. 😔

I'd like to avoid splitting the CSV into smaller files.

Has anyone faced a similar problem before? Are there any other solutions - perhaps there are tools I'm unaware of?

Also, slightly off topic, why is there a limitation on the number of rows? Earlier versions of office supported about 65k rows, now they've "upgraded" to a little over 1 million rows. 😨
@Manish Goyal • 31 Dec, 2015 did you check this discussion?

Excel CSV. file with more than 1,048,576 rows of data
@Kaustubh Katdare • 31 Dec, 2015 MS Access, perhaps? If that's not a solution; I'd rather setup a local database with phpMyAdmin (XAMPP or MAMP) and simply import all my data into it. It'd make all the operations a breeze. What say?
@simplycoder • 04 Jan, 2016 • 1 like If Microsoft stack then perhaps it can be done using SSIS packages.
@Aashish Joshi • 07 Jan, 2016 I did try importing the csv file "as-is" into Oracle 11g Express Edition using the Application Express UI. That failed miserably - 6 hours and not a single row imported! 😔

Then I had a light-bulb moment! 😁

I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm 😀
@Kaustubh Katdare • 08 Jan, 2016
Aashish Joshi
I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm
Sharing is caring, my friend.
@Aashish Joshi • 12 Jan, 2016 Here's the perl script I used. Hope it helps someone 😀

It will read a csv line-by-line and and add the values to an Oracle database. With minor modifications the same script can be used for MySQL and other databases as well.

The script ignores the first line that usually contains the column names.

use strict;
use warnings;
use Text::CSV;
use DBI;
use MIME::Lite;

#open the csv file
open my $fh,"<",'sample.csv';

my $csv = Text::CSV->new({binary => 1, auto_diag => 1, allow_whitespace => 1});

#skip the first line that contains the headers
my $line = <$fh>;

#connect to the database.
#depending on the type of DB
#you're using the connect string
#will vary. Read the DBI CPAN docs for more info.
#get the SID from your DBA and
#enter that instead of XE.
#for express edition the SID is XE
my $db_string = 'DBI:Oracle:XE';

#$db_username & $db_password need to be set
#to real values. if you don't know this
#ask your DBA or read the documentation!!
my $db_username = 'scott';
my $db_password = 'tiger';

my $dbh = DBI->connect($db_string, $db_username, $db_password) or report_error("Couldn't connect to database!",DBI->errstr);

#prepare the insert statement.
#in the line below each '?' represents
#a value that will be passed
#by the script.
my $sth = $dbh->prepare('INSERT INTO TABLE_NAME (column1,column2,column3) VALUES (?,?,?)') or report_error("Couldn't prepare insert statement!",DBI->errstr);

#now, we read the csv line-by-line
#and insert selected columns into the database
#you'll need to identify what columns
#you have to insert.for the sample csv
#i'll upload the first, second and fourth colums only
#note that Perl starts the column numbering
#from 0.

while ($line = $csv->getline($fh)){

    $sth->execute($line->[0],$line->[1],$line->[3]) or report_error("Couldn't add value from CSV to database!","Values: ".$line->[0],$line->[1],$line->[3]."\nDB Error:\n\n".DBI->errstr);

}

#close the csv file opened at the start
close $fh;

##################################################################################
#
#    THE ERROR REPORTING SUB-ROUTINE IS DEFINED BELOW. IT TAKES 2 VARIABLES:
#    1. THE FIRST ONE IS THE SUBJECT OF THE EMAIL
#    2. THE SECOND ONE IS THE BODY.
#
#    IT ACCEPTS HTML AS WELL, ALTHOUGH IT SHOULD BE USED SPARINGLY TO ENSURE
#    YOUR MAIL ISN'T INCORRECTLY MARKED AS SPAM.
#
##################################################################################

sub report_error {
   
    my $err_msg = MIME::Lite->new(
                    From    => 'from@domain.com',
                    To      =>'to1@domain.com,to2@domain.com',
                    Subject =>$_[0],
                    Type    =>'multipart/related'
                    );
    $err_msg->attach(
                    Type => 'text/html',
                    Data => $_[1]
                    );

    #enter your email username & password here
    #AuthUser is the username that you use to sign in to email clients
    #domain should be the SMTP domain
    MIME::Lite->send(
                    'smtp',
                    "domain.com",
                    Timeout=>60,
                    AuthUser=>'email_username',
                    AuthPass=>'password'
                    );
    $err_msg->send();
    my $err_time = DateTime->now();
    exit;
}
6k views

Related Posts

@Harshad Italiya · Mar 25, 2011

DIY: - Do It Yourself ​ Wireless Data Transmission​ Wireless Chat using RFM70 RFM70 is a GFSK transceiver module operating in the world wide ISM frequency band at 2400 -...
7.2k views

@Kaustubh Katdare · Apr 1, 2012

I've been hearing the rumors that Tata group companies, Tata Technologies and Tata Consultancy Services (TCS) have a surprise for the candidates recruited in the 2011-2012 campus recruitment season. The...
5.6k views

@Kaustubh Katdare · Aug 1, 2013

Engineers at Indian Institute Of Technology, Madras (Chennai) have developed a refrigerator that can achieve temperatures as low as -200 degree C, using a single stage vapour compression refrigerator that...
3.4k views

@jav · Jan 21, 2009

LET YOUR HEART BE IN PEACE Hi, Sir, i wanna method for increasing the pressure of the steam, can you guide me in what all ways can i do that,...
4.2k views

@Ankita Katdare · Apr 3, 2015

Gionee's India body has rolled up its sleeves as it gets ready to launch its high-end smartphone Gionee Elife S7 in India at an unveiling event scheduled for April 4th,...
3.4k views