Quick Start¶
This guide will get you syncing CSV files to PostgreSQL in 5 minutes.
Step 1: Install crump¶
Step 2: Prepare Your Data¶
Create a sample CSV file (users.csv):
user_id,name,email,notes
1,Alice,alice@example.com,Admin user
2,Bob,bob@example.com,Regular user
3,Charlie,charlie@example.com,Guest user
Step 3: Create Configuration¶
You can either create a configuration file manually or use the prepare command to analyze your CSV and generate one automatically.
This will:
- Analyze your CSV file
- Detect column types
- Suggest an ID column
- Suggest indexes
- Generate a configuration file
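To make the analysis step concrete, here is a minimal Python sketch of how column types and an ID column could be inferred from the sample file. It is not crump's actual algorithm: the function names (`infer_type`, `analyze`), the type names, and the "first unique integer column" rule are all assumptions made for illustration.

```python
import csv
import io
from datetime import date

# Same content as the sample users.csv from Step 2
SAMPLE_CSV = """user_id,name,email,notes
1,Alice,alice@example.com,Admin user
2,Bob,bob@example.com,Regular user
3,Charlie,charlie@example.com,Guest user
"""


def infer_type(values):
    """Return a type name for a list of strings (hypothetical heuristic)."""
    def all_parse(parser):
        try:
            for value in values:
                parser(value)
            return True
        except ValueError:
            return False

    if not values:
        return "text"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "float"
    if all_parse(date.fromisoformat):
        return "date"
    return "text"


def analyze(csv_text):
    """Infer a type per column and suggest an ID column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    types = {col: infer_type([row[col] for row in rows]) for col in columns}
    # Assumption: the first integer column whose values are all unique is the ID
    id_column = next(
        (col for col in columns
         if types[col] == "integer"
         and len({row[col] for row in rows}) == len(rows)),
        None,
    )
    return types, id_column


types, id_column = analyze(SAMPLE_CSV)
print(types)
print(id_column)
```

For the sample data this suggests `user_id` as the ID column and treats the remaining columns as text. A real analyzer would likely sample large files and handle empty values more carefully.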
Step 4: Set Database URL¶
Step 5: Preview Changes (Optional)¶
Before syncing, you can preview what changes will be made:
The preview reports the following without modifying the database:
- Schema changes (tables, columns, and indexes to be created)
- Number of rows to be inserted or updated
- Number of stale rows to be deleted
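The idea behind a preview can be sketched as a pure comparison between what is already stored and what the file contains. The Python below is an illustration only: `plan_sync` is a hypothetical helper, the row with ID 4 is invented for the example, and treating every ID absent from the file as stale is an assumption (how crump decides which rows are stale is not described here).

```python
def plan_sync(existing, incoming):
    """Count what a sync would do, keyed by ID, without writing anything."""
    inserts = [key for key in incoming if key not in existing]
    updates = [
        key for key in incoming
        if key in existing and incoming[key] != existing[key]
    ]
    # Assumption: any stored ID that is missing from the file counts as stale
    stale = [key for key in existing if key not in incoming]
    return {
        "insert": len(inserts),
        "update": len(updates),
        "delete_stale": len(stale),
    }


# Rows already in the table (ID 4 is a made-up leftover row)
existing = {
    1: {"full_name": "Alice", "email_address": "alice@example.com"},
    4: {"full_name": "Dana", "email_address": "dana@example.com"},
}

# Rows read from the CSV file, with Alice's email changed
incoming = {
    1: {"full_name": "Alice", "email_address": "alice.new@example.com"},
    2: {"full_name": "Bob", "email_address": "bob@example.com"},
    3: {"full_name": "Charlie", "email_address": "charlie@example.com"},
}

plan = plan_sync(existing, incoming)
print(plan)
```

Because the function only reads its inputs, it can be run as often as needed before committing to a real sync.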
Step 6: Sync Your Data¶
You should see output like:
Syncing users.csv using job 'users_sync'...
✓ Successfully synced 3 rows
Table: users
File: users.csv
Step 7: Verify in Database¶
Connect to your database and verify the data:
 id | full_name | email_address
----+-----------+---------------------
  1 | Alice     | alice@example.com
  2 | Bob       | bob@example.com
  3 | Charlie   | charlie@example.com
What Just Happened?¶
- Table Creation: crump created the users table automatically
- Column Mapping: CSV columns were renamed according to your config
- Type Detection: Column types were inferred from your CSV data
- Primary Key: The user_id column was mapped to id as the primary key
- Upsert: Data was inserted using PostgreSQL's upsert mechanism
Running Again¶
The sync is idempotent - you can run it multiple times safely:
# Update a row in users.csv
# Change Alice's email to alice.new@example.com
# Run sync again
crump sync users.csv crump_config.yml --job users_sync
Existing rows are updated in place, and no duplicates are created.
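The following Python sketch shows why an upsert keyed on the primary key makes repeated runs safe. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL, since SQLite accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form. The exact statement crump generates is not shown in this guide, so the SQL and the `sync` helper below are assumptions.

```python
import sqlite3

# Upsert: insert a row, or update it if the primary key already exists
UPSERT = """
INSERT INTO users (id, full_name, email_address)
VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE SET
    full_name = excluded.full_name,
    email_address = excluded.email_address
"""


def sync(conn, rows):
    """Apply every row with the upsert statement (hypothetical helper)."""
    conn.executemany(UPSERT, rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, full_name TEXT, email_address TEXT)"
)

rows = [
    (1, "Alice", "alice@example.com"),
    (2, "Bob", "bob@example.com"),
    (3, "Charlie", "charlie@example.com"),
]

sync(conn, rows)
sync(conn, rows)  # running again with the same data changes nothing

# Change Alice's email and sync once more
rows[0] = (1, "Alice", "alice.new@example.com")
sync(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
alice_email = conn.execute(
    "SELECT email_address FROM users WHERE id = 1"
).fetchone()[0]
print(count, alice_email)
```

After three runs the table still holds three rows, and only Alice's email differs from the first run.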
Next Steps¶
Now that you have the basics working, learn about more advanced features:
- Configuration Guide - Advanced YAML configuration
- Features - Learn about all features
  - Filename-based value extraction
  - Automatic stale record cleanup
  - Compound primary keys
  - Database indexes
- CLI Reference - All command-line options
- API Reference - Use crump in your Python code
Common Use Cases¶
Daily Data Updates¶
Extract the date from the filename and automatically clean up old data:
jobs:
  daily_sales:
    target_table: sales
    id_mapping:
      sale_id: id
    filename_to_column:
      template: "sales_[date].csv"
      columns:
        date:
          db_column: sync_date
          type: date
          use_to_delete_old_rows: true
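As a rough illustration of how a template such as "sales_[date].csv" could be matched against a filename, the Python sketch below converts each `[placeholder]` into a named regex group. This is not crump's implementation: `template_to_regex` and `extract_values` are hypothetical names, and the ISO date format in the example filename is an assumption.

```python
import re
from datetime import date


def template_to_regex(template):
    """Turn 'sales_[date].csv' into a regex with one named group per placeholder."""
    # Splitting on a capture group alternates literal text and placeholder names
    parts = re.split(r"\[(\w+)\]", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)       # literal text
        else:
            pattern += f"(?P<{part}>.+?)"    # placeholder value
    return re.compile(f"^{pattern}$")


def extract_values(template, filename):
    """Return placeholder values found in the filename, or None if it does not match."""
    match = template_to_regex(template).match(filename)
    return match.groupdict() if match else None


values = extract_values("sales_[date].csv", "sales_2024-01-15.csv")
# Assumption: the date part of the filename is in ISO format (YYYY-MM-DD)
sync_date = date.fromisoformat(values["date"])
print(values, sync_date)
```

The extracted value would then be stored in the configured column (sync_date in the config above), which gives a later cleanup step something to filter on.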
Selective Column Sync¶
Sync only specific columns and ignore the rest:
jobs:
  users_sync:
    target_table: users
    id_mapping:
      user_id: id
    columns:
      name: full_name
      email: email
      # Other CSV columns are ignored
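The effect of this configuration on a single CSV row can be sketched in a few lines of Python. The `map_row` helper is hypothetical and only mirrors the behavior described here: configured columns are renamed, everything else is dropped.

```python
def map_row(csv_row, id_mapping, columns):
    """Keep only configured columns, renamed to their database names."""
    mapping = {**id_mapping, **columns}
    return {db_col: csv_row[csv_col] for csv_col, db_col in mapping.items()}


# Mirrors the id_mapping and columns sections of the config above
id_mapping = {"user_id": "id"}
columns = {"name": "full_name", "email": "email"}

csv_row = {
    "user_id": "1",
    "name": "Alice",
    "email": "alice@example.com",
    "notes": "Admin user",  # not listed in the config, so it is dropped
}

db_row = map_row(csv_row, id_mapping, columns)
print(db_row)
```

The notes column never reaches the database because it appears in neither mapping.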
Compound Primary Keys¶
Use multiple columns as the primary key: