All Questions
Tagged with distcp google-cloud-platform
5 questions
0 votes, 0 answers, 288 views
Fastest way to copy large data from an HDFS location to a GCP bucket using a command
I have 5 TB of data that needs to be transferred to a GCP bucket using a command.
I tried using hadoop distcp -m num -strategy dynamic source_path destination_path, but it has been running for a long time.
...
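For reference, the command shape described in this question can be sketched as a small dry-run helper. It only prints the distcp invocation and does not run it. The function name `build_distcp_cmd`, the map count of 50, and the bucket name `example-bucket` are hypothetical placeholders, not values from the question.

```shell
# Print (do not execute) a distcp command using the dynamic strategy.
# $1 = number of map tasks (-m), $2 = source path, $3 = destination path
build_distcp_cmd() {
    printf 'hadoop distcp -m %s -strategy dynamic %s %s\n' "$1" "$2" "$3"
}

# Assumed values for illustration only; a suitable -m depends on the
# cluster's capacity and the available network bandwidth.
build_distcp_cmd 50 hdfs:///source_path gs://example-bucket/destination_path
```

Printing the command first makes it easy to review the map count and paths before starting a multi-terabyte transfer.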
0 votes, 2 answers, 890 views
distcp - copy data from Cloudera HDFS to Cloud Storage
I am trying to replicate data between HDFS and my GCP Cloud Storage. This is not a one-time data copy. After the first copy, I want to copy only new and updated files, and if files are deleted on-prem it ...
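A repeatable sync of this kind is commonly expressed with distcp's `-update` and `-delete` options. The sketch below only prints such a command. The function name `build_sync_cmd` and both paths are hypothetical, and how distcp decides that a file has changed depends on the Hadoop version and checksum settings, so the behavior should be checked against the DistCp documentation for the cluster in use.

```shell
# Print (do not execute) an incremental distcp sync command.
# -update copies files that are missing or differ at the target;
# -delete removes target files that no longer exist at the source
# (it is only valid together with -update or -overwrite).
# $1 = source path, $2 = destination path
build_sync_cmd() {
    printf 'hadoop distcp -update -delete %s %s\n' "$1" "$2"
}

# Hypothetical paths for illustration only.
build_sync_cmd hdfs:///data/example gs://example-bucket/data/example
```

Because `-delete` removes data at the destination, it is worth testing against a scratch prefix before pointing it at a production bucket.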
2 votes, 0 answers, 281 views
Error accessing a Google Cloud Storage bucket via hadoop fs -ls on a Cloudera CDH 6.3.3 cluster integrated with Kerberos/SSL/LDAP
I am getting the below error while accessing a Google Cloud Storage bucket for the first time from a Cloudera CDH 6.3.3 Hadoop cluster. I am running the command on the edge node where Google Cloud SDK is ...
2 votes, 1 answer, 589 views
Strange behavior of Hadoop distcp copy from on-prem to GCP
When I use the distcp command as
hadoop distcp /a/b/c/d gs:/gcp-bucket/a/b/c/ , where d is a folder on HDFS containing subfolders.
If folder c already exists on GCP, then it copies d ( and its ...
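The behavior described here matches how distcp treats a single source directory: when the target exists, the source directory is copied under it, and when it does not, the source's contents are copied into the newly created target. One way to get the same layout in both cases, assuming the goal is for d to land at a/b/c/d, is to name the full destination and pass `-update`, which copies the contents of the source into the target. The sketch below only prints that command. The function name `build_mirror_cmd` is hypothetical, and the `gs://` form of the bucket URI is an assumption.

```shell
# Print (do not execute) a distcp command that copies the contents of
# the source directory into an explicitly named target directory.
# With -update, the layout is the same whether or not the target exists.
# $1 = source directory, $2 = full destination directory
build_mirror_cmd() {
    printf 'hadoop distcp -update %s %s\n' "$1" "$2"
}

# The destination names d explicitly instead of stopping at its parent.
build_mirror_cmd /a/b/c/d gs://gcp-bucket/a/b/c/d
```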
0 votes, 2 answers, 1k views
DISTCP to GCS behind PROXY
I am trying to use distcp to copy some files from HDFS to Google Cloud Storage (GCS). My Hadoop cluster connects to the internet through an HTTP proxy, but I can't figure out how to specify this when connecting to ...
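If the `gs://` scheme is served by the Cloud Storage connector for Hadoop, the connector reads an HTTP proxy from the `fs.gs.proxy.address` property in host:port form, and that property can be passed to distcp as a generic `-D` option. The sketch below only prints such a command. The function name `build_proxy_cmd`, the proxy host and port, and both paths are hypothetical, and the property should be confirmed against the connector version installed on the cluster.

```shell
# Print (do not execute) a distcp command that passes an HTTP proxy
# to the Cloud Storage connector through a generic -D option.
# $1 = proxy as host:port, $2 = source path, $3 = destination path
build_proxy_cmd() {
    printf 'hadoop distcp -Dfs.gs.proxy.address=%s %s %s\n' "$1" "$2" "$3"
}

# Hypothetical proxy and paths for illustration only.
build_proxy_cmd proxy.example.com:3128 hdfs:///data/example gs://example-bucket/data/example
```

The same property could instead be set in core-site.xml so that every Hadoop command on the node uses the proxy.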