Dataflow: reading multiple text files from Cloud Storage into BigQuery

Former Community Member

Hello, I currently have a website running on Compute Engine. Every time a user enters data, the site saves that customer's data to a .txt file in a specific folder of a Cloud Storage bucket. I am trying to have Dataflow read these files as they arrive and store them in BigQuery. I got the schema to match, but every time I run the job it only reads the single file I select in Cloud Storage. It does not read multiple files, and it does not detect when a new file is added to that folder.

Here is my BigQuery schema:

[
    {
        "name": "fname",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "lname",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "email",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "creditcard",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "date",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "vanilla",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "vanilla_qty",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "chocolate",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "chocolate_qty",
        "type": "STRING",
        "mode": "NULLABLE"
    }
]

And here is how my data arrives, as one JSON object in a .txt file:

{"fname": "Jane", "lname": "Smith", "email": "jane.smith@example.com", "creditcard": "9876543210987654", "date": "2024-03-27", "vanilla": "yes", "vanilla_qty": 1, "chocolate": "yes", "chocolate_qty": 3}
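
One detail in the sample record may be worth checking: `vanilla_qty` and `chocolate_qty` arrive as JSON numbers, while the schema declares every column as STRING. Whether that matters depends on how the pipeline writes rows, so the following is only a minimal sketch, assuming one wanted to compare a record's keys against the schema and convert the values to strings before loading. The helper name `normalize_record` is hypothetical and is not part of any Dataflow template.

```python
import json

# Column names taken from the schema in the post; every column is STRING.
SCHEMA_FIELDS = [
    "fname", "lname", "email", "creditcard", "date",
    "vanilla", "vanilla_qty", "chocolate", "chocolate_qty",
]


def normalize_record(line: str) -> dict:
    """Parse one JSON line and return a dict whose values are strings.

    Hypothetical helper for illustration; raises ValueError when the
    record's keys do not match the schema's column names.
    """
    record = json.loads(line)
    missing = [name for name in SCHEMA_FIELDS if name not in record]
    extra = [name for name in record if name not in SCHEMA_FIELDS]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    # Keep None as-is (columns are NULLABLE); stringify everything else.
    return {
        name: None if record[name] is None else str(record[name])
        for name in SCHEMA_FIELDS
    }


sample = (
    '{"fname": "Jane", "lname": "Smith", "email": "jane.smith@example.com", '
    '"creditcard": "9876543210987654", "date": "2024-03-27", "vanilla": "yes", '
    '"vanilla_qty": 1, "chocolate": "yes", "chocolate_qty": 3}'
)
row = normalize_record(sample)
print(row["vanilla_qty"], type(row["vanilla_qty"]).__name__)
```

Declaring the quantity columns as INTEGER in the schema would be an alternative to converting the values, if numeric queries on them are needed.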

 

I have tried both streaming templates in Dataflow, and the batch one as well. I also tried storing my data in a .json file, and that didn't work either. I'm not sure what the issue is.

1 ACCEPTED SOLUTION

I replicated your setup and it works fine for me: every time I add a new .txt file, it gets loaded and I can query it in BigQuery. When setting up the job in the console, make sure the input path follows this wildcard pattern:

gs://bucketname/path/*.txt

Change the .txt extension to match your use case.
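
To illustrate why the wildcard matters, here is a rough sketch using Python's standard `fnmatch` module. It only approximates Cloud Storage matching (for instance, `fnmatch` lets `*` match `/`, which Cloud Storage globbing does not), and the object names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical object names, for illustration only.
objects = [
    "gs://bucketname/path/order1.txt",
    "gs://bucketname/path/order2.txt",
    "gs://bucketname/path/notes.json",
]

single_file = "gs://bucketname/path/order1.txt"  # what selecting one file gives
wildcard = "gs://bucketname/path/*.txt"          # the pattern suggested above

matched_by_single = [o for o in objects if fnmatchcase(o, single_file)]
matched_by_wildcard = [o for o in objects if fnmatchcase(o, wildcard)]

print(len(matched_by_single))    # only the one selected file
print(len(matched_by_wildcard))  # every .txt object under the folder
```

A path that names one file can only ever match that file, which is consistent with the behaviour described in the question; the wildcard also covers .txt files added later.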

nceniza_0-1712266911465.png

My schema file content:

{
  "BigQuery Schema": [
    {
        "name": "fname",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "lname",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "email",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "creditcard",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "date",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "vanilla",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "vanilla_qty",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "chocolate",
        "type": "STRING",
        "mode": "NULLABLE"
    },
    {
        "name": "chocolate_qty",
        "type": "STRING",
        "mode": "NULLABLE"
    }
  ]
}
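
Note that this schema file differs in shape from the one in the question: the column list sits under a top-level "BigQuery Schema" key instead of being a bare array. As a minimal sketch, assuming you already have the bare array and want to produce a file in the wrapped shape, something like the following could work. The function name `wrap_schema` is hypothetical, and only two columns are shown to keep it short.

```python
import json


def wrap_schema(fields: list) -> dict:
    """Wrap a bare list of column definitions under "BigQuery Schema"."""
    return {"BigQuery Schema": fields}


# Bare array, shaped like the schema in the question (two columns shown).
bare_schema = [
    {"name": "fname", "type": "STRING", "mode": "NULLABLE"},
    {"name": "lname", "type": "STRING", "mode": "NULLABLE"},
]

# Text that could be saved as the schema file and uploaded to the bucket.
schema_file_text = json.dumps(wrap_schema(bare_schema), indent=2)
print(schema_file_text)
```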

Results in BigQuery:

nceniza_1-1712267122320.png

New rows were added every time I added a file.
