migrating from oscommerce to ubercart

stepping through the node import wizard

I recently undertook a migration from oscommerce to ubercart. The scope of this migration was limited to the transfer of products (and categories) - I didn't try and migrate customers and previous orders.

Here's an overview of the procedure I followed:

  • generate a CSV file of categories in oscommerce
  • import into drupal / ubercart using taxonomy_csv module
  • generate a CSV file of product data from oscommerce
  • import into drupal / ubercart using (patched) node_import module

My life was made relatively easy by the fact that although the categories in oscommerce had a hierarchical structure, it was very simple. There were only a handful of top-level categories, and the tree was only one deep. Here's what the schema for categories looks like in oscommerce:

mysql> SHOW TABLES LIKE 'categor%';
+-----------------------------------+
| Tables_in_mysite_osc1 (categor%) |
+-----------------------------------+
| categories                        |
| categories_description            |
+-----------------------------------+
mysql> DESCRIBE categories;
+------------------+-------------+------+-----+---------+----------------+
| FIELD            | TYPE        | NULL | KEY | DEFAULT | Extra          |
+------------------+-------------+------+-----+---------+----------------+
| categories_id    | INT(11)     | NO   | PRI | NULL    | AUTO_INCREMENT |
| categories_image | VARCHAR(64) | YES  |     | NULL    |                |
| parent_id        | INT(11)     | NO   | MUL | 0       |                |
| sort_order       | INT(3)      | YES  |     | NULL    |                |
| date_added       | datetime    | YES  |     | NULL    |                |
| last_modified    | datetime    | YES  |     | NULL    |                |
+------------------+-------------+------+-----+---------+----------------+
mysql> DESCRIBE categories_description;
+-----------------+-------------+------+-----+---------+-------+
| FIELD           | TYPE        | NULL | KEY | DEFAULT | Extra |
+-----------------+-------------+------+-----+---------+-------+
| categories_id   | INT(11)     | NO   | PRI | 0       |       |
| language_id     | INT(11)     | NO   | PRI | 1       |       |
| categories_name | VARCHAR(32) | NO   | MUL |         |       |
+-----------------+-------------+------+-----+---------+-------+

I had 115 categories, 5 of which were top-level, and the remaining 110 were all children of one of those 5 parents. This meant it was simple to generate the CSV for the import of the category hierarchy into drupal manually; I simply went through my top-level categories and got a list of all their children. All I wanted for the import was the category names. (n.b. I could also ignore language as this site's monolingual.) I did some simple queries to get the names of categories where the parent was one of my top-level categories (which all had a parent_id of 0), e.g.

SELECT categories_name FROM categories c 
                    INNER JOIN categories_description cd ON c.categories_id = cd.categories_id 
                    WHERE c.parent_id = 31;

...and prepared a CSV file, a snippet of which is below (where Empire and Europe and Colonies are top-level) I then imported my categories using taxonomy_csv, set to mode Hierarchical tree structure or one term by line structure. node_import can also import taxonomy terms, but unless I'm mistaken it doesn't support hierarchical taxonomies.

"Empire",
,"Aden"
,"Antigua"
,"Ascension"
,"Australia"
,"B.O.I.C."
,"Bahamas"
...snip...
"Europe and Colonies",
,"Austria"
,"Baltic"
,"Benelux"
,"Eastern Europe"
,"France"
...etc...

Next was my products. I also wanted to use a CSV file to import these into ubercart, so I had to generate a CSV file from the oscommerce database. I wrote a quick php cli script which queries the database, (optionally) grabs product images using CURL from the webserver oscommerce is running on, and outputs a nice CSV file and a folder full of product images (which need to be put in the right place on the drupal/ubercart server). Here's the script:

#!/usr/bin/php
<?php
 
define('CSV_OUTPUT',             '/tmp/osc_products.csv'              );
define('DB_HOST',                'localhost'                          );
define('DB_NAME',                'mysite_osc1'                        );
define('DB_USER',                'root'                               );
define('DB_PASS',                'top-secret'                         );
define('LIMIT',                  1000                                 );
define('GRAB_IMAGES',            false                                );
define('IMAGE_REMOTE_PATH',      'http://shop.example.com/images/'    );
define('IMAGE_LOCAL_DIR',        '/tmp/osc_images/'                   );
 
$query = <<<EOQ
SELECT * FROM 
  products p 
    INNER JOIN products_description pd ON p.products_id = pd.products_id 
    INNER JOIN products_to_categories ptc ON p.products_id = ptc.products_id 
    INNER JOIN categories_description cd ON ptc.categories_id = cd.categories_id 
  WHERE p.products_status = 1
EOQ;
$query .= ' LIMIT ' . LIMIT;
 
$db = db_connect(DB_HOST, DB_NAME, DB_USER, DB_PASS);
$products = db_query($query, $db);
$counter = 0;
 
/* example of results:
(
    [[products_id]] => 5656
    [[products_quantity]] => 1
    [[products_model]] => 
    [[products_image]] => AdenStH.jpg
    [[products_price]] => 10.0000
    [[products_date_added]] => 2009-03-16 12:34:00
    [[products_last_modified]] => 
    [[products_date_available]] => 
    [[products_weight]] => 0.00
    [[products_status]] => 1
    [[products_tax_class_id]] => 0
    [[manufacturers_id]] => 0
    [[products_ordered]] => 0
    [[language_id]] => 1
    [[products_name]] => N05656 - Ascension : Multifranked to St. Helena 1939
    [[products_description]] => St. Helena receiver dated 1941! Curiosity!!
    [[products_url]] => 
    [[products_viewed]] => 159
    [[categories_id]] => 409
    [[categories_name]] => Ascension
)
*/
 
// prepare files
$handle = fopen(CSV_OUTPUT, 'w');
 
$columns = array('sku', 'name', 'date', 'description', 'image', 'price', 'category');
fputcsv($handle, $columns);
 
 
if (GRAB_IMAGES) {
  if (!is_dir(IMAGE_LOCAL_DIR)) { 
    mkdir(IMAGE_LOCAL_DIR);
  }
}
 
while (($product = db_object($products)) && ($counter < LIMIT)) {
  //print_r($product);
  $counter ++;
 
  $sku = substr($product->products_name, 0, 6);
  $product_name = substr($product->products_name, 9);
  $image_name = clean_filename($product->products_image);
 
  $data_to_write = array(
                          $sku,
                          $product_name,
                          $product->products_date_added,
                          $product->products_description,
                          $image_name,
                          $product->products_price,
                          $product->categories_name
                        );
  $data_to_write = array_map('trim', $data_to_write);
 
  if (GRAB_IMAGES) {
    if (grab_image($product->products_image, $image_name)) {
      echo 'grabbed ' . $product->products_image . " ($image_name)\n";
    }
    else {
      echo 'failed to grab ' . $product->products_image . "\n";
    }
  }
 
  fputcsv($handle, $data_to_write);
}
 
fclose($handle);
echo "# iterated over $counter products\n### END\n";
exit;
 
/** helper functions **/
 
function db_error($message) {
  echo "db_error: $message\n" . mysql_error() . "\n";
}
 
function db_connect($db_host, $db_name, $db_user, $db_pass) {
  $db = mysql_connect($db_host, $db_user, $db_pass) or db_error('Unable to connect to database');
  mysql_select_db($db_name, $db);
  return $db;
}
 
function db_query($query, $db) {
  $result = mysql_query($query, $db) or db_error($sql);
  return $result;
}
 
function db_object($result) {
  return mysql_fetch_object($result);
}
 
function grab_image($image_name, $new_name) {
  $url = IMAGE_REMOTE_PATH . $image_name;
   // use curl to grab the image from the server
  $ch = curl_init($url);
 
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  curl_setopt($ch, CURLOPT_HEADER, FALSE);
 
  $data = curl_exec($ch);
  curl_close($ch);
  if (strlen($data) > 0) {
    $retval = file_put_contents(IMAGE_LOCAL_DIR . $new_name, $data);
  }
  else {
    $retval = false;
  }
  return $retval;
}
 
function clean_filename($old_name) {
  $bad_stuff =  array('.JPG', ' ', '&');
  $good_stuff = array('.jpg', '',  '-');
  $new_name = str_replace($bad_stuff, $good_stuff, $old_name);
  return $new_name;
}

You'll see my products had an SKU we'd put into the first part of the title in oscommerce - this should be easy to remove if it's not applicable. The image grabbing requires php-curl - you could just grab a whole image directory off the server running oscommerce, but I wanted to be careful to only migrate images actually being used. There are obviously many other ways of doing this.

I had to apply a couple of patches to node_import (rc4) to get it working for me:

...I also set escape char to \ (when using fputcsv in my script). After all that, I was able to follow the node_import wizard, mapping fields in the CSV to fields in my ubercart product content type, after which the imported successfully digested my CSV, and all my products appeared with their images in my new ubercart site (I obviously had to put the images my script had grabbed into the right place on the ubercart webserver as well).