YouTube Data Analysis | END TO END DATA ENGINEERING PROJECT

Darshil Parmar

3 years ago

369,923 views


Comments:

@CarlxWentzel - 10.09.2024 12:28

This project is insane and I'm only halfway. As someone that looked at AWS I was like HUH. You did a great job explaining it so far. :)

@Alexchow-s3q - 16.09.2024 05:11

Thank you Darshil for this awesome video! I have issues viewing the cleaned data in Athena. I got "HIVE_UNKNOWN_ERROR: Path missing in file system location: [my path]
This query ran against the "[cleaned db]" database, unless qualified by the query." But I checked that the path names are correct, and I can access the parquet file locally. Can anyone help with this issue?

@agwunobichinedu6088 - 19.09.2024 17:27

Thank you bro, this is good

@ADESHKUMAR-yz2el - 03.10.2024 22:48

Hey, if I use AWS Glue, will it cost me anything on a free tier account?

@AbhinandanPatra-003 - 05.10.2024 18:42

HIVE_UNKNOWN_ERROR: Path missing in file system location: s3://de-onprem-yt-cleansed-ap-southeast-2-dev
This query ran against the "" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 7e93f3f6-f2e4-47a3-96a5-744cd7fc9dab

I'm getting this error when querying the table in Athena. Can anyone help me out?

@maryamjabbar7552 - 08.10.2024 14:01

Can anyone please let me know if the AWS services used in this project are paid or free? As I'm a beginner, I don't know about the paid services. I would really appreciate any support in this regard.

@harshitraj180 - 10.10.2024 01:31

i am getting CreateMultipartUpload permission denied

@ghezaegoitom4429 - 23.10.2024 00:25

I got {{ An error occurred (AccessDenied) when calling the PutObject operation: Access Denied }} when i tried to upload the files to the s3 bucket. I tried to give permission but that also gave me another error {{ You either don’t have permissions to edit the bucket policy, or your bucket policy grants a level of public access that conflicts with your Block Public Access settings. To edit a bucket policy, you need the s3:PutBucketPolicy permission }}

@user-px4tb5jo9u - 02.11.2024 03:46

Did anyone complete the project recently?

@abhisheksingh-ll6ew - 18.11.2024 20:24

Bhai what is the total cost of using all the aws services for this project?

@paulshobhik - 20.11.2024 15:15

Did anyone face the issue where the code is deployed but nothing happens when I click Test? No error, nothing. Am I missing anything?

@kruthithunoli1 - 03.12.2024 05:58

Hi Darshil,
Can you explain the environment variable inputs in more detail?

@incognitomato - 08.12.2024 13:36

I'm facing a problem at the "aws s3 ls" step. I entered it in my cmd, but there was still no activity. I'm a Windows 10 user. Kindly help.

@SiyaSehrawat-k5h - 09.12.2024 23:06

I'm getting a parameter invalid error

@musafir8638 - 12.12.2024 11:42

man i don't have a credit card!!

@AbdulBasitKaimKhani - 13.12.2024 07:23

Bro, does a data engineer need maths??

@aarshmehtani5468 - 13.12.2024 20:22

How were the environment variables declared at 40:49 in the video?

@edwinarosie8692 - 22.12.2024 05:56

I will be grateful if anyone can help me with this error:
{
"errorMessage": "Unable to import module 'lambda_function': No module named 'awssdkpandas'",
"errorType": "Runtime.ImportModuleError",
"requestId": "",
"stackTrace": []
}
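
A note on the error above: the Lambda layer is named AWSSDKPandas, but the package it ships is imported as `awswrangler`, so an import of `awssdkpandas` would be expected to fail. The following is only a minimal sketch for checking which name resolves in a given runtime; the helper `resolvable` is a hypothetical name, not part of the tutorial code.

```python
import importlib.util


def resolvable(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# The layer is called AWSSDKPandas-Python39, but the package inside it is
# imported as `awswrangler`; `awssdkpandas` is not a module name.
for name in ("awswrangler", "awssdkpandas"):
    print(name, "->", "found" if resolvable(name) else "missing")
```

If `awswrangler` shows as missing inside Lambda, the layer is probably not attached to the function; in the function code the import is normally written `import awswrangler as wr`.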

@shaikbushra7482 - 25.12.2024 10:23

Hi... when I tried to add the Data Wrangler layer in AWS Lambda, I couldn't find it. What should I do?

@NafisAnsari-vr2xq - 08.01.2025 08:36

5 min into the video and you've got a sub Darshil. Thank you for this 💯

@bayoudata - 11.01.2025 09:08

PLEASE HELP!
Can i do this project in the free tier AWS?

@murarisrikanth8227 - 18.01.2025 12:35

Waiting for the continuation video... Can't take my eyes off it, like an intense, suspenseful movie.

@funatic9912 - 20.01.2025 23:47

I was trying to test, but then the result comes out to be : {
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}

but my bucket is still empty
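
A note on this response: it has the same shape as the output of the placeholder handler that a newly created Python Lambda function starts with. That would suggest the tutorial code was not saved and deployed before the test ran, though the comment does not confirm this. A minimal sketch of such a placeholder (the name `placeholder_handler` is an assumption for illustration):

```python
import json


def placeholder_handler(event, context):
    """Mimics the stub handler a new Python Lambda function starts with."""
    return {
        "statusCode": 200,
        # json.dumps wraps the string in quotes, hence the escaped quotes
        # that show up in the test output.
        "body": json.dumps("Hello from Lambda!"),
    }


response = placeholder_handler({}, None)
print(json.dumps(response, indent=2))
```

A handler like this never touches S3, so an empty bucket alongside a 200 status is what one would expect from it.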

@CourageGbeve - 23.01.2025 18:55

Why is the free tier not accepting Lambda functions?

@user-yr3nz2zi8k - 28.01.2025 04:35

What is the overall cost for this project? (in reference to AWS services)

@FatimaHABIB-jm4ji - 04.02.2025 13:03

Thanks for the video. I have finished the first part; however, in the second part the table raw_statistics can't be read, and no columns were detected. I think there is a preprocessing step that was done but is not included in the video.

@nagasundarp5347 - 04.02.2025 17:15

Issues I faced with lambda and the fix
My lambda kept running and wasn’t stopping even with increased processing power and time limit

Solution -
- did not have to change the import awswrangler statement
- Added the AWSSDKPandas-Python39
- Check your s3 cleansed bucket you’ll have the file generated
- Use the file in a crawler and choose your cleansed_db then run the crawler
- New table will be generated and you can query it in Athena

One issue I faced with US_category_id.json was that the JSON had bracket issues. To work around it, I just used another file, GB_category_id.json.

Hope this helps
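
A note on the bracket issue mentioned above: one way to find out where a JSON file is malformed is to parse it locally before uploading it. This is only a sketch; `check_json` is a hypothetical helper and the sample strings are made up, not taken from the dataset.

```python
import json
from typing import Optional


def check_json(text: str) -> Optional[str]:
    """Return None if text is valid JSON, else a short description of the error."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as exc:
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"


# Made-up samples: the second one is missing a value inside the brackets.
print(check_json('{"items": []}'))
print(check_json('{"items": [}'))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the helper.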

@shivanigole1665 - 22.02.2025 20:21

when everybody else is discussing the error and you're worried that you didn't get an error 🥲

@timothyolawunijr8259 - 25.02.2025 02:22

So I paid for your SQL, Python and data warehouse class on your old website. Is there a way for me to access that information again since you changed the website?

@ayankaran9476 - 28.02.2025 18:25

Why use data lake rather than a data mart?

@pinigantichakravarthy2464 - 04.03.2025 06:40

I am facing an issue in lambda - i have tried the python 3.9 and awssdkpandas version 39, still i am not able to get the cleaned data in clean bucket, all i get is {
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}

please need help !!!!

@sarveshrodi33 - 05.03.2025 13:34

Please avoid particular

@SravaniAlla - 09.03.2025 03:01

My code is working fine i still didn't see any errors but the problem is i couldn't see the output in my bucket.

@tejassah1185 - 17.03.2025 18:47

Hey,
I am getting an error when creating the crawler, although I have provided all the permissions and policies. My account is also not part of any organization.
I have tried everything but still get access denied.
PLEASE HELP!!!!!!!

@keshavrajpoot8189 - 20.03.2025 06:53

I’m creating an AWS account, but it is asking me to provide my bank credentials, and without them I can’t go ahead. Is there any other way to create a free account, or is that the only way?

@shubhamchauhan6930 - 25.03.2025 03:44

I have project pro. Very basic projects. I will not recommend anyone to use it.

@sunilverma-ry8yn - 01.04.2025 22:18

Hi Darshil bro, I have a request: I can't make an AWS account because I don't have a cc/dc. Can you or anybody tell me what the alternative is? I mean an alternative platform where I can do these projects, because I want to but I don't have the option. I have also read most of the AWS-related threads about AWS Educate, starter accounts and other ways, but they don't work any more; to use AWS I have to add a cc/dc, which I can't afford. Please help me find any alternative platform or way.

@PoojaSen-q7k - 04.04.2025 17:07

@Darshil, I am following this video. When I test the Lambda event I get the status code 200 below, but the cleaned data table is not created anywhere. Please help me here: why am I not getting any error?
{
"statusCode": 200,
"body": "\"Hello from Lambda!\""
}
Log group '/aws/lambda/de1-rowdata-useast1-lambda-json-parquet-format' does not exist for account ID 'xxxx' (Service: AWSLogs; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: requestidnumber; Proxy: null)

@SunilVerma-eu9nt - 04.04.2025 19:12

@snehakadam16 Hey, can you tell me whether the services used in this project are all free? If not, can you tell me how I can avoid extra charges, or how you set up AWS for this project?

@sahilmahajan3022 - 29.04.2025 10:37

Can someone pls help me, I'm not getting any access key after creating the IAM user

@Umerkhange - 14.05.2025 16:22

After Every Code Change:

 1. Save → Deploy → Test

  - Be vigilant about:
   • Permissions
   • Correct S3 paths
Repost

Roles Configuration:

 - Role 1: Glue Role
  Use Case: AWS Glue
  Permissions:
   • AmazonS3FullAccess
   • AWSGlueServiceRole

 - Role 2: Lambda Role
  Use Case: AWS Lambda
  Permissions:
   • AmazonS3FullAccess
   • AWSGlueServiceRole
AWS SDK for pandas

AWS Layer for Python 3.9:

 - Layer Name: AWSSDKPandas-Python39
 - Applicable for AWS Lambda functions using Python 3.9
 - Provides AWS SDK for pandas (awswrangler)

@souravps6527 - 17.05.2025 21:59

Where is the command link for Uploading the data??

@prasathjs - 20.05.2025 21:59

"errorMessage": "'s3_cleansed_layer'",
"errorType": "KeyError",
"requestId": "",

Can anyone say why I got this error?
This is my environment variable:
s3_cleansed_layer : s3://de-cleaned-data-testing/yt
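
A note on the KeyError above: Python raises `KeyError` when code reads `os.environ['s3_cleansed_layer']` and no variable with exactly that key exists in the running function. Possible causes (assumptions, not confirmed by the comment) include a key saved with different casing or a stray space, or configuration changes that were not saved. A minimal sketch of a lookup that gives a more helpful message; `read_setting` is a hypothetical helper, not part of the tutorial code.

```python
import os
from typing import Mapping, Optional


def read_setting(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Look up a setting, hinting at near-miss keys when it is absent."""
    env = os.environ if env is None else env
    if name in env:
        return env[name]
    # Near misses: same key apart from letter case or surrounding spaces.
    near = [key for key in env if key.strip().lower() == name.lower()]
    hint = f" (did you mean {near[0]!r}?)" if near else ""
    raise KeyError(f"{name} is not set{hint}")


# Example with a made-up mapping whose key has a trailing space.
try:
    read_setting("s3_cleansed_layer", {"s3_cleansed_layer ": "s3://example-bucket/yt"})
except KeyError as exc:
    print(exc)
```

Inside a Lambda handler the same helper can be called with just the variable name, in which case it reads from `os.environ`.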
