All Questions

1 vote
1 answer
285 views

-Dmapred.job.name does not work with s3-dist-cp command

I'd like to copy some files from EMR HDFS to an S3 bucket using s3-dist-cp. I tried this command from the EMR master node: s3-dist-cp -Dmapred.job.name=my_copy_job --src hdfs:///user/hadoop/abc s3://...
TheCodeCache
4 votes
1 answer
2k views

Hadoop Distcp - increasing distcp.dynamic.max.chunks.tolerable config and tuning distcp

I am trying to move data between two Hadoop clusters using distcp. There is a lot of data to move, spread across a large number of small files. To make it faster, I tried using -strategy dynamic, which ...
Hemanth • 735
0 votes
0 answers
45 views

How can I transfer data after processing to another cluster using MapReduce?

I am new to Hadoop. I want to write a single MR job which does some processing of the data and moves the result to another cluster. I am aware I can simply change the destination within the driver ...
Gagan Goel
0 votes
1 answer
488 views

Map Reduce job in java for distcp

I am trying to copy data from one cluster to another on a daily basis. I searched a lot, but everybody suggests calling the main function of DistCp with args. I was writing Java code for the same. But its ...
Garry • 688
0 votes
1 answer
4k views

Number of mappers while doing distcp

How can I set the number of mappers for a distcp job? I know that we can set the maximum number of mappers with hadoop distcp -m. But is it possible to set the number instead of the maximum number of ...
helloworld
0 votes
1 answer
2k views

Not able to copy one HDFS data to another HDFS location using distcp

I am trying to copy data from one HDFS location to another. I am able to do this using the "distcp" command: hadoop distcp hdfs://mySrcip:8020/copyDev/* hdfs://myDestip:8020/copyTest. But I want ...
USB • 6,129
0 votes
1 answer
608 views

How do I determine if a call to distcp2 was successful?

The best advice I could find online is that you should either compare the files after transfer or make a second run with -update, and the second is considered unreliable. Is there a way of ...
Robert Rapplean
0 votes
2 answers
1k views

hadoop distcp not working, MR job in accepted state

I am trying to copy data from a CDH4 cluster to a CDH5 cluster. When I submit the distcp job from CDH5, the MR job goes to the accepted state and stays there (I have tried it multiple times, it stayed there for more ...
user2917246
5 votes
1 answer
1k views

Hadoop DistCp handle same file name by renaming

Is there any way to run DistCp, but with an option to rename on file name collisions? Maybe it's easiest to explain with an example. Let's say I'm copying from hdfs:///foo to hdfs:///bar, and foo ...
Joe K • 18.4k