Prepare a CSV file for import
This article helps you prepare your files for a successful import using the file import data source.
Data sets
To get the most out of your imported data, each row should include a visitor identifier so that the corresponding visitor within the Customer Data Hub is enriched.
When multiple files are uploaded at the same time using SFTP or S3, the files are processed in alphabetical order based on the file name.
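As a rough illustration of the ordering rule above, the following sketch previews the order in which a batch of files might be processed. It assumes that alphabetical order matches Python's default string sort, which is not confirmed by this article, and the file names are sample values.

```python
# Sample file names for one batch upload (illustrative values only).
file_names = [
    "storepurchases_20170319-b.csv",
    "storepurchases_20170314.csv",
    "storepurchases_20170319-a.csv",
]

# Assumption: "alphabetical order" behaves like Python's default string sort.
# Zero-padded yyyyMMdd versions then sort chronologically within a prefix.
expected_order = sorted(file_names)

for name in expected_order:
    print(name)
```

If the order of processing matters for your data, a zero-padded timestamp in the version keeps older files ahead of newer ones under this assumption.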
The following data sets are recommended for use with data imports using File Import Data Source:
- Demographic Data
- Prospect and Lead Data
- Historical Purchases and Refunds
- Offline Interactions (support, direct marketing, event attendance, etc.)
File format
To ensure a successful import of the data, use the following guidelines when formatting your file:
- File Import Data Source supports CSV files. The first line of the file must be a header line that names the columns of the file. Each line after that represents an event or a visitor record and must contain at least one visitor identifier attribute.
- Column names may not contain “#”, “^”, or whitespace characters.
- Use the following example format for file attributes that are to be mapped to an array of strings:
"[""one"",""two"",""three""]"
- Use the following example format for file attributes that are to be mapped to an array of numbers:
"[1.99,20.99]"
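The formatting guidelines above can be sketched with Python's standard `csv` and `json` modules. This is a minimal sketch, not an official tool: the column names (`customer_email`, `favorite_categories`, `recent_prices`) and the helper names are hypothetical, and it assumes that compact JSON arrays match the array format shown above.

```python
import csv
import io
import json
import re

# Column names may not contain "#", "^", or whitespace characters.
INVALID_COLUMN_CHARS = re.compile(r"[#^\s]")


def invalid_columns(fieldnames):
    """Return the column names that break the naming guideline."""
    return [name for name in fieldnames if INVALID_COLUMN_CHARS.search(name)]


def encode_array(values):
    """Serialize a list as a compact JSON array, e.g. [1.99,20.99]."""
    return json.dumps(values, separators=(",", ":"))


def build_csv(rows, fieldnames):
    """Return CSV text with a header line and every value double-quoted."""
    bad = invalid_columns(fieldnames)
    if bad:
        raise ValueError(f"invalid column names: {bad}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical columns; the first one stands in for a visitor identifier.
rows = [
    {
        "customer_email": "user@example.com",
        "favorite_categories": encode_array(["one", "two", "three"]),
        "recent_prices": encode_array([1.99, 20.99]),
    }
]
csv_text = build_csv(rows, ["customer_email", "favorite_categories", "recent_prices"])
print(csv_text)
```

The `csv` writer doubles the inner quotes of the string array automatically, so the array values do not need to be escaped by hand.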
File naming
CSV files must be named using the following format:
{prefix}_{version}.csv
This format consists of the following two (2) parts:
Format | Description |
---|---|
prefix | A unique identifier for groups of files that share the same CSV column names. |
version | A unique identifier for a file within a prefix, usually a timestamp and (optionally) a version number. |
File names are case-sensitive and may not contain special characters other than hyphen ‘-’ and underscore ‘_’.
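A file name can be checked against this format before upload. The sketch below is one possible interpretation rather than the product's own validation: it assumes that names are limited to ASCII letters, digits, hyphens, and underscores, and that the first underscore separates the prefix from the version. The function names are hypothetical.

```python
import re
from datetime import date

# Assumption: the prefix has no underscore, so the first "_" is the separator.
NAME_PATTERN = re.compile(
    r"(?P<prefix>[A-Za-z0-9-]+)_(?P<version>[A-Za-z0-9_-]+)\.csv"
)


def split_file_name(name):
    """Return (prefix, version) for a valid name, or None if it does not match."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    return match.group("prefix"), match.group("version")


def build_file_name(prefix, day, suffix=""):
    """Build {prefix}_{version}.csv with a yyyyMMdd timestamp as the version."""
    version = day.strftime("%Y%m%d")
    if suffix:
        version = f"{version}-{suffix}"
    name = f"{prefix}_{version}.csv"
    if split_file_name(name) is None:
        raise ValueError(f"invalid file name: {name!r}")
    return name


print(build_file_name("storepurchases", date(2017, 3, 19), "a"))
print(split_file_name("storereturns_20170314.csv"))
print(split_file_name("store purchases.csv"))
```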
File naming examples
- Example 1
The following example is for in-store purchases dated March 19, 2017 and separated into two (2) files, “a” and “b”.
Prefix | Version | File Name |
---|---|---|
storepurchases | 20170319-a | storepurchases_20170319-a.csv |
storepurchases | 20170319-b | storepurchases_20170319-b.csv |
- Example 2
The following example is for in-store returns dated March 14, 2017 and March 15, 2017, with one file for each day.
Prefix | Version | File Name |
---|---|---|
storereturns | 20170314 | storereturns_20170314.csv |
storereturns | 20170315 | storereturns_20170315.csv |
The prefix of one set of files should not match the beginning of the prefix of another set of files. For example, using the prefixes store-transactions and store-transactions-returns for two different sets of files causes unexpected results because both begin with “store-transactions”.
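Overlapping prefixes can be detected before any files are uploaded. The following sketch assumes that two prefixes conflict when one is a leading substring of the other, which is an interpretation of the example above rather than a documented rule. The function name is hypothetical.

```python
from itertools import combinations


def find_prefix_collisions(prefixes):
    """Return pairs of prefixes where the first is a leading substring of the second."""
    unique = sorted(set(prefixes))
    # After sorting, a leading substring always comes before the longer prefix,
    # so checking one direction is enough.
    return [(a, b) for a, b in combinations(unique, 2) if b.startswith(a)]


collisions = find_prefix_collisions(
    ["store-transactions", "store-transactions-returns", "storereturns"]
)
print(collisions)
```

Under this assumption, renaming one set to a prefix such as storereturns, as in Example 2, removes the overlap.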
Data format
Use the following data format guidelines to ensure a successful import:
- Column Values
Column values should not exceed 1,000 characters. Any characters over this limit are trimmed.
- Double-Quote Values with Commas
If a value contains a comma (,), it must be surrounded by double quotes to preserve the integrity of the CSV columns. We recommend putting all values in double quotes. Use a doubled double quote ("") to indicate a double-quote character in the data.
- Extra Columns
Extra columns that are not configured as AudienceStream attributes are ignored. Column names are case-sensitive and must match the attribute names exactly.
- Column Order
The order of the columns does not matter, but we recommend placing the visitor ID column first.
- File Encoding
We suggest UTF-8 without a byte order mark (BOM).
- Group Rows by Visitor ID
Each row has a visitor ID column, and a file may contain multiple rows for the same visitor. Rows that contain the same visitor ID should be grouped together to optimize the import process.
- Normalize Dates
Date values must be consistent throughout a file for a given field. Each date value must match the format specified for that column. For example, if the expected format is “yyyy-MM-dd”, the values must contain 4-digit years and 2-digit month and day values, such as “2021-02-09”.
The following date components are supported: Year, Month, Day, Hour, Minutes, Seconds, Milliseconds, AM/PM, ISO, RFC, Era. (Learn more about Java SimpleDateFormat Patterns.)
- Normalize Empty Values
Some rows might not contain a value for each column. Make sure that these values are left empty and do not contain placeholder values such as “null”, “empty”, “blank”, etc.
- Omit Special Characters from Currency Amounts
Columns that represent currency amounts, such as “OrderTotal” or “ProductPrice”, must not contain symbols or commas.
Valid Values | Invalid Values |
---|---|
39.75 | $39.75 |
0 | zero |
1399.00 | €1.399,00 |
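Several of the data format guidelines can be applied in a cleanup step before the file is written. The following is a minimal sketch under stated assumptions, not a complete solution: the column names `visitor_id`, `OrderDate`, and `Coupon` are hypothetical, source amounts are assumed to use a period as the decimal separator and commas as thousands separators, and the set of currency symbols is only a sample. Amounts in other layouts, such as €1.399,00, are rejected rather than guessed.

```python
import re
from datetime import datetime

# Placeholder words named in the guideline; matching them case-insensitively
# is an assumption of this sketch.
PLACEHOLDERS = {"null", "empty", "blank"}
CURRENCY_SYMBOLS = "$€£¥"  # sample set, extend as needed
AMOUNT = re.compile(r"-?(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d+)?")


def clean_empty(value):
    """Replace placeholder words with a truly empty value."""
    return "" if value.strip().lower() in PLACEHOLDERS else value


def clean_amount(value):
    """Strip currency symbols and thousands commas, e.g. "$1,399.00" -> "1399.00"."""
    text = value.strip().strip(CURRENCY_SYMBOLS).strip()
    if not AMOUNT.fullmatch(text):
        raise ValueError(f"unrecognized amount: {value!r}")
    return text.replace(",", "")


def clean_date(value, source_format, target_format="%Y-%m-%d"):
    """Re-emit a date in one format; "%Y-%m-%d" corresponds to yyyy-MM-dd."""
    return datetime.strptime(value, source_format).strftime(target_format)


def group_by_visitor(rows, id_column):
    """Sort rows so that rows with the same visitor ID sit next to each other."""
    return sorted(rows, key=lambda row: row[id_column])


raw_rows = [
    {"visitor_id": "b2", "OrderTotal": "$39.75", "OrderDate": "02/09/2021", "Coupon": "null"},
    {"visitor_id": "a1", "OrderTotal": "1,399.00", "OrderDate": "03/19/2017", "Coupon": "SPRING"},
    {"visitor_id": "b2", "OrderTotal": "0", "OrderDate": "02/10/2021", "Coupon": ""},
]
cleaned_rows = group_by_visitor(
    [
        {
            "visitor_id": row["visitor_id"],
            "OrderTotal": clean_amount(row["OrderTotal"]),
            "OrderDate": clean_date(row["OrderDate"], "%m/%d/%Y"),
            "Coupon": clean_empty(row["Coupon"]),
        }
        for row in raw_rows
    ],
    "visitor_id",
)
for row in cleaned_rows:
    print(row)
```

When writing the cleaned rows to disk in Python, opening the file with `encoding="utf-8"` produces UTF-8 without a BOM, whereas `encoding="utf-8-sig"` adds one.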
This page was last updated: January 7, 2023