CrazyEngineers
  • Open a file with more than 1048576 rows in Excel/Openoffice

    Aashish Joshi

    Aashish Joshi

    @aashish-VrevFC
    Updated: Oct 26, 2024
    Views: 1.6K
    Hi,

    I sometimes work (sort, filter, create pivot tables, etc) with CSV files that have a few million rows.

    Since Excel & Open office don't support more than 1048576 rows I have to first upload the CSVs into a database and then work on them - which isn't very convenient as I have to spend extra time on uploading the data. Depending on the file size this sometimes takes an hour or so. 😔

    I'd like to avoid splitting the CSV into smaller files.

    Has anyone faced a similar problem before? Are there any other solutions - perhaps there are tools I'm unaware of?

    Also, slightly off topic, why is there a limitation on the number of rows? Earlier versions of office supported about 65k rows, now they've "upgraded" to a little over 1 million rows. 😨
    0
    Replies
Howdy guest!
Dear guest, you must be logged-in to participate on CrazyEngineers. We would love to have you as a member of our community. Consider creating an account or login.
Replies
  • Manish Goyal

    MemberDec 30, 2015

    did you check this discussion?

    #-Link-Snipped-#
    Are you sure? This action cannot be undone.
    Cancel
  • Kaustubh Katdare

    AdministratorDec 30, 2015

    MS Access, perhaps? If that's not a solution; I'd rather setup a local database with phpMyAdmin (XAMPP or MAMP) and simply import all my data into it. It'd make all the operations a breeze. What say?
    Are you sure? This action cannot be undone.
    Cancel
  • simplycoder

    MemberJan 4, 2016

    If Microsoft stack then perhaps it can be done using SSIS packages.
    Are you sure? This action cannot be undone.
    Cancel
  • Aashish Joshi

    MemberJan 7, 2016

    I did try importing the csv file "as-is" into Oracle 11g Express Edition using the Application Express UI. That failed miserably - 6 hours and not a single row imported! 😔

    Then I had a light-bulb moment! 😁

    I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm 😀
    Are you sure? This action cannot be undone.
    Cancel
  • Kaustubh Katdare

    AdministratorJan 7, 2016

    Aashish Joshi
    I wrote a small Perl script to read the csv and add the columns that were of interest to me into mysql using "Insert Delayed". This worked like a charm
    Sharing is caring, my friend.
    Are you sure? This action cannot be undone.
    Cancel
  • Aashish Joshi

    MemberJan 12, 2016

    Here's the perl script I used. Hope it helps someone 😀

    It will read a csv line-by-line and and add the values to an Oracle database. With minor modifications the same script can be used for MySQL and other databases as well.

    The script ignores the first line that usually contains the column names.

    use strict;
    use warnings;
    use Text::CSV;
    use DBI;
    use MIME::Lite;
    
    #open the csv file
    open my $fh,"<",'sample.csv';
    
    my $csv = Text::CSV->new({binary => 1, auto_diag => 1, allow_whitespace => 1});
    
    #skip the first line that contains the headers
    my $line = <$fh>;
    
    #connect to the database.
    #depending on the type of DB
    #you're using the connect string
    #will vary. Read the DBI CPAN docs for more info.
    #get the SID from your DBA and
    #enter that instead of XE.
    #for express edition the SID is XE
    my $db_string = 'DBI:Oracle:XE';
    
    #$db_username & $db_password need to be set
    #to real values. if you don't know this
    #ask your DBA or read the documentation!!
    my $db_username = 'scott';
    my $db_password = 'tiger';
    
    my $dbh = DBI->connect($db_string, $db_username, $db_password) or report_error("Couldn't connect to database!",DBI->errstr);
    
    #prepare the insert statement.
    #in the line below each '?' represents
    #a value that will be passed
    #by the script.
    my $sth = $dbh->prepare('INSERT INTO TABLE_NAME (column1,column2,column3) VALUES (?,?,?)') or report_error("Couldn't prepare insert statement!",DBI->errstr);
    
    #now, we read the csv line-by-line
    #and insert selected columns into the database
    #you'll need to identify what columns
    #you have to insert.for the sample csv
    #i'll upload the first, second and fourth colums only
    #note that Perl starts the column numbering
    #from 0.
    
    while ($line = $csv->getline($fh)){
    
        $sth->execute($line->[0],$line->[1],$line->[3]) or report_error("Couldn't add value from CSV to database!","Values: ".$line->[0],$line->[1],$line->[3]."\nDB Error:\n\n".DBI->errstr);
    
    }
    
    #close the csv file opened at the start
    close $fh;
    
    ##################################################################################
    #
    #    THE ERROR REPORTING SUB-ROUTINE IS DEFINED BELOW. IT TAKES 2 VARIABLES:
    #    1. THE FIRST ONE IS THE SUBJECT OF THE EMAIL
    #    2. THE SECOND ONE IS THE BODY.
    #
    #    IT ACCEPTS HTML AS WELL, ALTHOUGH IT SHOULD BE USED SPARINGLY TO ENSURE
    #    YOUR MAIL ISN'T INCORRECTLY MARKED AS SPAM.
    #
    ##################################################################################
    
    sub report_error {
       
        my $err_msg = MIME::Lite->new(
                        From    => 'from@domain.com',
                        To      =>'to1@domain.com,to2@domain.com',
                        Subject =>$_[0],
                        Type    =>'multipart/related'
                        );
        $err_msg->attach(
                        Type => 'text/html',
                        Data => $_[1]
                        );
    
        #enter your email username & password here
        #AuthUser is the username that you use to sign in to email clients
        #domain should be the SMTP domain
        MIME::Lite->send(
                        'smtp',
                        "domain.com",
                        Timeout=>60,
                        AuthUser=>'email_username',
                        AuthPass=>'password'
                        );
        $err_msg->send();
        my $err_time = DateTime->now();
        exit;
    }
    
    Are you sure? This action cannot be undone.
    Cancel
Home Channels Search Login Register