Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

S3 to Redshift input data format

I'm trying to run a simple chain s3-pipeline-redshift, but I've got completely stucked with input data format. Here's my file:

1,Toyota Park,Bridgeview,IL
2,Columbus Crew Stadium,Columbus,OH
3,RFK Stadium,Washington,DC
4,CommunityAmerica Ballpark,Kansas City,KS
5,Gillette Stadium,Foxborough,MA
6,New York Giants Stadium,East Rutherford,NJ
7,BMO Field,Toronto,ON
8,The Home Depot Center,Carson,CA
9,Dick's Sporting Goods Park,Commerce City,CO
10,Pizza Hut Park,Frisco,TX

and here's the table I'm using:

    create table venue_new(
    venueid smallint not null,
    venuename varchar(100) not null,
    venuecity varchar(30),
    venuestate char(2),
    venueseats integer not null default '1000');

When I use | as a delimiter, I'm getting error 1214 - Delimiter not found , when I use comma - same thing, when I converted file to utf-8, I'm getting "Invalid digit, Value '.', Pos 0, Type: Short'. I ran out of ideas. What a heck is wrong with that thing? Can somebody please give me the example of the input file or tell what I'm doing wrong? Thanks in advance. P.S. I also found that sample files are available in bucket awssampledb, but I have no idea how to get them.

like image 530
KorsaR Avatar asked Feb 19 '14 19:02

KorsaR


1 Answers

Based on the data in the file example. You need to remember that you have 5 fields in your table, and there is no 5th field in any of your data - yet it is a not null field. Your Copy command needs to reference the 4 columns you are providing at the start of the statement.

copy venue_new(venueid, venuename, venuecity, venuestate) 
from 's3://mybucket/data/venue_noseats.txt' 
credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
delimiter ',';

I found the above command (from AWS Docs COPY examples worked successfully for me, leaving me with the default 1000 in the 'venueseats' column.

like image 167
davefender Avatar answered Sep 20 '22 12:09

davefender