CrazyEngineers Archive
Old, but evergreen and popular discussions on CrazyEngineers, presented to you in read-only mode.
@Aashish Joshi • 21 Dec, 2015
Hi,

I sometimes work (sort, filter, create pivot tables, etc) with CSV files that have a few million rows.

Since Excel & Open office don't support more than 1048576 rows I have to first upload the CSVs into a database and then work on them - which isn't very convenient as I have to spend extra time on uploading the data. Depending on the file size this sometimes takes an hour or so. 😔

I'd like to avoid splitting the CSV into smaller files.

Has anyone faced a similar problem before? Are there any other solutions - perhaps there are tools I'm unaware of?

Also, slightly off topic, why is there a limitation on the number of rows? Earlier versions of office supported about 65k rows, now they've "upgraded" to a little over 1 million rows. 😨
@Manish Goyal • 31 Dec, 2015 did you check this discussion?

Excel CSV. file with more than 1,048,576 rows of data
@Kaustubh Katdare • 31 Dec, 2015 MS Access, perhaps? If that's not a solution; I'd rather setup a local database with phpMyAdmin (XAMPP or MAMP) and simply import all my data into it. It'd make all the operations a breeze. What say?
@simplycoder • 04 Jan, 2016 • 1 like If Microsoft stack then perhaps it can be done using SSIS packages.
@Aashish Joshi • 07 Jan, 2016 I did try importing the csv file "as-is" into Oracle 11g Express Edition using the Application Express UI. That failed miserably - 6 hours and not a single row imported! 😔

Then I had a light-bulb moment! 😁

I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm 😀
@Kaustubh Katdare • 08 Jan, 2016
Aashish Joshi
I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm
Sharing is caring, my friend.
@Aashish Joshi • 12 Jan, 2016 Here's the perl script I used. Hope it helps someone 😀

It will read a csv line-by-line and and add the values to an Oracle database. With minor modifications the same script can be used for MySQL and other databases as well.

The script ignores the first line that usually contains the column names.

use strict;
use warnings;
use Text::CSV;
use DBI;
use MIME::Lite;

#open the csv file
open my $fh,"<",'sample.csv';

my $csv = Text::CSV->new({binary => 1, auto_diag => 1, allow_whitespace => 1});

#skip the first line that contains the headers
my $line = <$fh>;

#connect to the database.
#depending on the type of DB
#you're using the connect string
#will vary. Read the DBI CPAN docs for more info.
#get the SID from your DBA and
#enter that instead of XE.
#for express edition the SID is XE
my $db_string = 'DBI:Oracle:XE';

#$db_username & $db_password need to be set
#to real values. if you don't know this
#ask your DBA or read the documentation!!
my $db_username = 'scott';
my $db_password = 'tiger';

my $dbh = DBI->connect($db_string, $db_username, $db_password) or report_error("Couldn't connect to database!",DBI->errstr);

#prepare the insert statement.
#in the line below each '?' represents
#a value that will be passed
#by the script.
my $sth = $dbh->prepare('INSERT INTO TABLE_NAME (column1,column2,column3) VALUES (?,?,?)') or report_error("Couldn't prepare insert statement!",DBI->errstr);

#now, we read the csv line-by-line
#and insert selected columns into the database
#you'll need to identify what columns
#you have to insert.for the sample csv
#i'll upload the first, second and fourth colums only
#note that Perl starts the column numbering
#from 0.

while ($line = $csv->getline($fh)){

    $sth->execute($line->[0],$line->[1],$line->[3]) or report_error("Couldn't add value from CSV to database!","Values: ".$line->[0],$line->[1],$line->[3]."\nDB Error:\n\n".DBI->errstr);

}

#close the csv file opened at the start
close $fh;

##################################################################################
#
#    THE ERROR REPORTING SUB-ROUTINE IS DEFINED BELOW. IT TAKES 2 VARIABLES:
#    1. THE FIRST ONE IS THE SUBJECT OF THE EMAIL
#    2. THE SECOND ONE IS THE BODY.
#
#    IT ACCEPTS HTML AS WELL, ALTHOUGH IT SHOULD BE USED SPARINGLY TO ENSURE
#    YOUR MAIL ISN'T INCORRECTLY MARKED AS SPAM.
#
##################################################################################

sub report_error {
   
    my $err_msg = MIME::Lite->new(
                    From    => 'from@domain.com',
                    To      =>'to1@domain.com,to2@domain.com',
                    Subject =>$_[0],
                    Type    =>'multipart/related'
                    );
    $err_msg->attach(
                    Type => 'text/html',
                    Data => $_[1]
                    );

    #enter your email username & password here
    #AuthUser is the username that you use to sign in to email clients
    #domain should be the SMTP domain
    MIME::Lite->send(
                    'smtp',
                    "domain.com",
                    Timeout=>60,
                    AuthUser=>'email_username',
                    AuthPass=>'password'
                    );
    $err_msg->send();
    my $err_time = DateTime->now();
    exit;
}
4.3k views

Related Posts

@Abhishek Rawal · Mar 6, 2013

I don't have on-field experience as I am not even graduated. But, Every IT professionals I talked with wanna join TCS. Born curious,I googled some of stuffs of TCS &...
4.7k views

@Ankita Katdare · Aug 24, 2012

Here is a list of project ideas based on Artificial Intelligence (AI). Students belonging to third year or final year can use these projects as mini-projects as well as mega-projects....
139.8k views

@madhu27 · May 25, 2013

I am keen in knowing physical meaning of poles and zeros. physical meaning in the sense what does poles and zeros mean in real world....
2.4k views

@Guneet Khanuja · Feb 7, 2014

Clear car rental is a startup based out of Aurangabad in Maharashtra. Sachin Kate, the founder of Clear car rental, moved out of his native place in Marathwada for basic...
2.3k views

@Ankita Katdare · Jan 16, 2015

In the year 2014, we saw the upsurge of wearable electronic gadgets becoming a part of common man's life. Be it smartwatches or smartglasses, the top leading tech companies are...
3.1k views