Prepare a CSV file for import

This article helps you prepare your files for a successful import using the file import data source.

Data sets

To get the most out of your imported data, each row should include a visitor identifier so that the corresponding visitor in the Customer Data Hub is enriched.

When multiple files are uploaded at the same time using SFTP or S3, the files are processed in alphabetical order based on the file name.
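
To preview the order in which a batch is likely to be processed, you can sort the file names locally before uploading. The following Python sketch uses hypothetical file names and assumes that plain string sorting matches the import's alphabetical ordering, which holds for same-case ASCII names but is not confirmed for mixed case or other characters.

```python
# Hypothetical file names; zero-padded yyyymmdd versions sort chronologically.
file_names = [
    "storepurchases_20170320.csv",
    "storepurchases_20170319-b.csv",
    "storepurchases_20170319-a.csv",
]

# sorted() compares strings character by character, which corresponds to
# alphabetical order for same-case ASCII names like these (an assumption
# about how the import orders files).
expected_order = sorted(file_names)

for name in expected_order:
    print(name)
```

Using a fixed-width timestamp in the version part keeps alphabetical order and chronological order the same.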

The following data sets are recommended for use with data imports using File Import Data Source:

  • Demographic Data
  • Prospect and Lead Data
  • Historical Purchases and Refunds
  • Offline Interactions (support, direct marketing, event attendance, etc.)

File format

To ensure a successful import of the data, use the following guidelines when formatting your file:

  • File Import Data Source supports CSV files. The first line of the file must be a header line that names the columns. Each subsequent line represents an event or a visitor record and must contain at least one visitor identifier attribute.
  • Column names may not contain “#”, “^”, or whitespace characters.
  • Use the following example format for file attributes that are to be mapped to an array of strings: "[""one"",""two"",""three""]".
  • Use the following example format for file attributes that are to be mapped to an array of numbers:
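
For the column-name and array-of-strings guidelines in this list, the following Python sketch shows one way to check a header line and produce the quoted array format with the standard csv module. The column names customer_id and favorite_colors and the helper functions are hypothetical and used only for illustration.

```python
import csv
import io
import json
import re

# Characters the guidelines forbid in column names: "#", "^", and whitespace.
FORBIDDEN_IN_COLUMN_NAME = re.compile(r"[#^\s]")


def invalid_column_names(header):
    """Return the column names that contain a forbidden character."""
    return [name for name in header if FORBIDDEN_IN_COLUMN_NAME.search(name)]


def write_rows(header, rows):
    """Return CSV text with a header line first and every value quoted."""
    bad = invalid_column_names(header)
    if bad:
        raise ValueError(f"invalid column names: {bad}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Compact JSON gives ["one","two","three"]; the csv writer then doubles the
# inner quotes, producing "[""one"",""two"",""three""]" in the file.
colors = json.dumps(["one", "two", "three"], separators=(",", ":"))
text = write_rows(["customer_id", "favorite_colors"], [["12345", colors]])
print(text)
```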

File naming

CSV files must be named using the following format: {prefix}_{version}.csv. This format consists of the following two (2) parts:

Format     Description
prefix     A unique identifier for groups of files that share the same CSV column names.
version    A unique identifier for a file within a prefix, usually a timestamp and (optionally) a version number.

File names are case-sensitive and may not contain special characters other than hyphen ‘-’ and underscore ‘_’.
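
As a rough illustration of the naming rules above, the following Python sketch builds a file name from a prefix and a version and rejects unsupported characters. The function build_file_name is hypothetical, and the sketch assumes that only ASCII letters and digits are allowed besides the hyphen and underscore.

```python
import re

# Letters, digits, hyphen, and underscore only (assumes ASCII alphanumerics).
ALLOWED_PART = re.compile(r"[A-Za-z0-9_-]+")


def build_file_name(prefix, version):
    """Return {prefix}_{version}.csv after checking both parts."""
    for label, part in (("prefix", prefix), ("version", version)):
        if not ALLOWED_PART.fullmatch(part):
            raise ValueError(f"{label} {part!r} contains unsupported characters")
    return f"{prefix}_{version}.csv"


print(build_file_name("storepurchases", "20170319-a"))
```

Because file names are case-sensitive, the sketch does not change the case of either part.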

File naming examples

  • Example 1
    The following example is for in-store purchases dated March 19, 2017 and separated into two (2) files, “a” and “b”.
    Prefix            Version       File Name
    storepurchases    20170319-a    storepurchases_20170319-a.csv
    storepurchases    20170319-b    storepurchases_20170319-b.csv
  • Example 2
    The following example is for in-store returns, with one file for March 14, 2017 and one for March 15, 2017.
    Prefix          Version     File Name
    storereturns    20170314    storereturns_20170314.csv
    storereturns    20170315    storereturns_20170315.csv

The prefix of one set of files should not match the beginning of the prefix of another set of files. For example, using both store-transactions and store-transactions-returns as prefixes causes unexpected results because both begin with “store-transactions”.
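
One way to catch this before uploading is to compare the prefixes you plan to use. The following Python sketch flags pairs in which one prefix is the beginning of another. The function conflicting_prefixes is hypothetical, and the check reflects one reading of the rule above rather than the import's actual matching logic.

```python
from itertools import combinations


def conflicting_prefixes(prefixes):
    """Return pairs of prefixes where the first is the beginning of the second."""
    conflicts = []
    # After sorting, a prefix always comes before any longer name that
    # starts with it, so one startswith() check per pair is enough.
    for first, second in combinations(sorted(set(prefixes)), 2):
        if second.startswith(first):
            conflicts.append((first, second))
    return conflicts


planned = ["store-transactions", "store-transactions-returns", "storereturns"]
print(conflicting_prefixes(planned))
```

Renaming one set so that neither name begins with the other, for example storereturns instead of store-transactions-returns, avoids the conflict.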

Data format

Use the following data format guidelines to ensure a successful import:

  • Column Values
    Column values should not exceed 1,000 characters. Any characters over this limit are trimmed.
  • Double-Quote Values with Commas
    If a value contains a comma (,) it must be surrounded by double-quotes to ensure the integrity of the CSV columns. We recommend putting all values in double-quotes. Use a doubled double quote ("") to indicate a double quote character in the data.
  • Extra Columns
    Extra columns that are not configured as AudienceStream attributes will be ignored. Column names are case-sensitive and must match the attribute names exactly.
  • Column Order
    The order of the columns does not matter, but we recommend placing the visitor ID column first.
  • File encoding
    We suggest UTF-8 encoding (without BOM).
  • Group Rows by Visitor ID
    Each row has a visitor ID column, and a file may contain multiple rows for the same visitor. Rows that contain the same visitor ID should be grouped together to optimize the import process.
  • Normalize Dates
    Date values must be consistent throughout a file for a given field. Each date value must match the format specified for that column. For example, if the expected format is “yyyy-MM-dd”, the values must contain 4-digit years and 2-digit month and day values, for example, “2021-02-09”.
    The following date components are supported: Year, Month, Day, Hour, Minutes, Seconds, Milliseconds, AM/PM, ISO, RFC, Era. (Learn more about Java SimpleDateFormat Patterns.)
  • Normalize Empty Values
    Some rows might not contain a value for each column. Make sure that these columns are empty and do not contain placeholder values such as “null”, “empty”, “blank”, etc.
  • Omit Special Characters from Currency Amounts
    Columns that represent currency amounts, such as “OrderTotal” or “ProductPrice”, must not contain symbols or commas.
    Valid Values    Invalid Values
    39.75           $39.75
    0               zero
    1399.00         €1.399,00
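
Several of the data format guidelines above can be applied in one pass when you generate the file. The following Python sketch quotes every value, blanks out placeholder values, groups rows by visitor ID, checks currency amounts, and encodes the result as UTF-8 without a BOM. The column names, sample rows, and helper functions are hypothetical, and the amount pattern (digits with an optional decimal part) is an assumption based on the valid values listed above.

```python
import csv
import io
import re
from datetime import datetime

PLACEHOLDERS = {"null", "empty", "blank"}  # placeholder words named in the guidelines
DATE_FORMAT = "%Y-%m-%d"  # Python equivalent of the yyyy-MM-dd pattern
AMOUNT = re.compile(r"\d+(\.\d+)?")  # assumed shape of a valid amount


def clean_value(value):
    """Return an empty string for missing or placeholder values."""
    text = "" if value is None else str(value).strip()
    return "" if text.lower() in PLACEHOLDERS else text


def is_valid_amount(text):
    """True for values such as 39.75 or 0; False for $39.75 or zero."""
    return AMOUNT.fullmatch(text) is not None


def to_csv_bytes(header, rows, visitor_column):
    """Return CSV content as UTF-8 bytes, rows grouped by visitor ID."""
    index = header.index(visitor_column)
    cleaned = [[clean_value(value) for value in row] for row in rows]
    cleaned.sort(key=lambda row: row[index])  # stable sort keeps per-visitor order
    buffer = io.StringIO()
    # QUOTE_ALL wraps every value in double quotes and doubles embedded quotes.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cleaned)
    # "utf-8" writes no BOM; "utf-8-sig" would add one.
    return buffer.getvalue().encode("utf-8")


header = ["customer_id", "order_date", "OrderTotal", "note"]
rows = [
    ["B2", datetime(2021, 2, 9).strftime(DATE_FORMAT), "39.75", 'Said "thanks", left'],
    ["A1", "2021-02-10", "1399.00", "null"],
    ["B2", "2021-02-11", "0", None],
]
for row in rows:
    if not is_valid_amount(row[2]):
        raise ValueError(f"invalid amount: {row[2]!r}")

data = to_csv_bytes(header, rows, "customer_id")
print(data.decode("utf-8"))
```

In this sketch the visitor ID column is placed first, as recommended, and the two rows for visitor B2 end up next to each other in the output.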

This page was last updated: January 7, 2023