0 votes
0 answers
25 views

Unable to Move Deleted Files to Trash via Hadoop Web Interface

I have encountered an issue with the Hadoop-3.3.6 Web interface regarding file deletion. By default, when I delete files through the Hadoop Web interface, they are permanently removed and do not go to ...
leizhuokin's user avatar
0 votes
0 answers
66 views

How to check if WebHDFS is working properly, and how to debug it

I have a Hadoop setup on my machine, and I want to use the Python hdfs library to send a file. On localhost:9870 I see the Hadoop user interface, which works nicely. But when I go to ...
Александар Пламенац's user avatar
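For a quick check that the WebHDFS endpoint itself responds, a minimal sketch using Python requests is below; it assumes the namenode web port 9870 and a placeholder user name "hadoopuser", and simply asks for a directory listing over HTTP.

```python
# Minimal WebHDFS smoke test: list the root directory over HTTP.
# Assumes namenode web port 9870 and a placeholder HDFS user.
import requests

NAMENODE = "http://localhost:9870"
USER = "hadoopuser"  # placeholder user name

def check_webhdfs(path: str = "/") -> None:
    url = f"{NAMENODE}/webhdfs/v1{path}"
    resp = requests.get(url, params={"op": "LISTSTATUS", "user.name": USER}, timeout=10)
    print("HTTP status:", resp.status_code)
    resp.raise_for_status()
    # A healthy WebHDFS endpoint answers with a FileStatuses JSON object.
    for status in resp.json()["FileStatuses"]["FileStatus"]:
        print(status["type"], status["pathSuffix"])

if __name__ == "__main__":
    check_webhdfs()
```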
2 votes
0 answers
85 views

WebHDFS API successfully CREATEs a file, but APPEND apparently fails despite a 200 response using HttpClient()

I have a WebHDFS call that successfully CREATEs a file, but when I try to APPEND a byte array of data to the same file I run into issues. I have no issues creating a file, so I know it is not a ...
user23562292's user avatar
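The documented APPEND flow is two HTTP requests: a POST to the namenode that answers 307 with a datanode URL in the Location header, then a POST of the bytes to that URL. A common pitfall is letting the HTTP client follow the redirect automatically, which can resend the request without the body and still return 200. Below is a hedged Python sketch of the same exchange (the question uses C# HttpClient, but the HTTP steps are identical); the host, user, and path are placeholders.

```python
# Two-step WebHDFS APPEND, with the 307 redirect handled manually.
import requests

NAMENODE = "http://namenode:9870"   # placeholder host
USER = "hadoopuser"                 # placeholder user

def append_bytes(hdfs_path: str, data: bytes) -> None:
    # Step 1: ask the namenode where to append; do NOT follow the redirect,
    # otherwise the body may be dropped on the second request.
    step1 = requests.post(
        f"{NAMENODE}/webhdfs/v1{hdfs_path}",
        params={"op": "APPEND", "user.name": USER},
        allow_redirects=False,
    )
    datanode_url = step1.headers["Location"]

    # Step 2: send the actual bytes to the datanode URL; 200 means success.
    step2 = requests.post(datanode_url, data=data)
    step2.raise_for_status()

append_bytes("/tmp/demo.txt", b"appended line\n")
```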
0 votes
1 answer
602 views

How can I upload a file into HDFS using the WebHDFS REST API?

I want to upload a file from a local server to HDFS via the WebHDFS REST API. Based on the documentation, this operation takes two steps: submit an HTTP PUT request, which returns the location ...
Kate's user avatar
  • 165
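The two-step CREATE flow mentioned in the question can be sketched as follows in Python with requests; the namenode host, port 9870, user name, and paths are placeholders, not values from the original post.

```python
# Two-step WebHDFS CREATE upload: redirect from namenode, then PUT to datanode.
import requests

NAMENODE = "http://namenode:9870"   # placeholder host
USER = "hadoopuser"                 # placeholder user

def upload_file(local_path: str, hdfs_path: str) -> None:
    # Step 1: PUT ?op=CREATE to the namenode without a body; it replies
    # 307 Temporary Redirect with the datanode URL in the Location header.
    step1 = requests.put(
        f"{NAMENODE}/webhdfs/v1{hdfs_path}",
        params={"op": "CREATE", "overwrite": "true", "user.name": USER},
        allow_redirects=False,
    )
    datanode_url = step1.headers["Location"]

    # Step 2: PUT the file contents to the datanode URL; 201 Created on success.
    with open(local_path, "rb") as f:
        step2 = requests.put(datanode_url, data=f)
    step2.raise_for_status()

upload_file("report.csv", "/user/hadoopuser/report.csv")
```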
0 votes
1 answer
237 views

WebHDFS REST API and Spring Boot

I have a Hadoop cluster and I want to manipulate data from a Spring Boot microservice: Create folders / Put Data/ Read Data/ Delete Data... There is an API: https://hadoop.apache.org/docs/stable/...
Kate's user avatar
  • 165
1 vote
1 answer
615 views

Installing the WebHDFS library in Docker fails; the error shows "krb5-config: Permission denied"

I'm trying to install the apache-airflow-providers-apache-hdfs library in my Airflow Docker 2.5.3 setup. I've installed all the necessary Kerberos libs, and I get the following error: #0 5.236 Requirement ...
Donny's user avatar
  • 31
0 votes
1 answer
697 views

Web interface login to an Apache Hadoop cluster with Kerberos

I have a Docker stack with an Apache Hadoop (version 3.3.4) cluster, composed of one namenode and two datanodes, plus a container with both the Kerberos admin server and the Kerberos KDC. I'm trying to configure ...
C. Fabiani's user avatar
1 vote
1 answer
2k views

How to connect to HDFS in Airflow?

How to perform HDFS operations in Airflow? Make sure you install the following Python package: pip install apache-airflow-providers-apache-hdfs #Code Snippet #Import packages from airflow import ...
Swapnil's user avatar
  • 59
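A hedged sketch of the same idea using the provider package's WebHDFSHook inside a PythonOperator task; the connection id "webhdfs_default", the paths, and the Airflow 2.4+ `schedule` keyword are assumptions.

```python
# Upload a local file to HDFS from an Airflow task via the WebHDFS hook.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.apache.hdfs.hooks.webhdfs import WebHDFSHook


def upload_to_hdfs():
    # The hook wraps an HdfsCLI client configured from the Airflow connection.
    hook = WebHDFSHook(webhdfs_conn_id="webhdfs_default")
    hook.load_file("/tmp/local.csv", "/user/airflow/local.csv", overwrite=True)


with DAG(
    dag_id="webhdfs_upload_demo",
    start_date=datetime(2024, 1, 1),
    schedule=None,      # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    PythonOperator(task_id="upload", python_callable=upload_to_hdfs)
```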
0 votes
1 answer
423 views

Error in the Hadoop web UI related to WebHDFS

I am using single-node Hadoop version release-3.3.1-RC3. In the Hadoop web UI, under Utilities -> Browse the file system, it is possible to view the contents of a file (beginning and end) directly in ...
Lemito's user avatar
  • 1
0 votes
1 answer
124 views

Not able to access files in a Hadoop cluster

I was trying to read a file present in the Hadoop cluster through the following code. The default port used is 9000 (since it does not connect at 50700). //webhdfs-read-tests.js // Include ...
shihack's user avatar
  • 55
0 votes
0 answers
575 views

Set up WebHDFS authentication

I have set up a WebHDFS server with a self-signed SSL certificate for testing. Now I need some kind of authentication on it where the user has to pass credentials in the WebHDFS REST call. I am ...
Ufder's user avatar
  • 527
0 votes
0 answers
518 views

Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error

vijay@ubuntu:~$ start-all.sh WARNING: Attempting to start all Apache Hadoop daemons as vijay in 10 seconds. WARNING: This is not a recommended production deployment configuration. WARNING: Use CTRL-C ...
vijays's user avatar
  • 1
0 votes
1 answer
650 views

How to set up WebHDFS in Hadoop

I am trying to configure Hadoop with WebHDFS enabled, and then I also want to enable SSL on it. My hdfs-site.xml looks like this: <configuration> <property> <name>dfs....
Ufder's user avatar
  • 527
-1 votes
1 answer
153 views

How to get a specific key/value from HDFS via HTTP or the Java API?

How can I get the value of one or more keys in HDFS via HTTP or the Java API from a remote client? For example, the file below has a million keys and values. I just want to get the values of the 'phone' and ...
nhkb_55's user avatar
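WebHDFS has no server-side query operation, so the usual approach is to OPEN the file over HTTP and filter on the client. A hedged sketch is below; it assumes, purely for illustration, that the file is newline-delimited JSON and that the namenode listens on port 9870.

```python
# Stream a file via ?op=OPEN and keep only the wanted keys from each record.
import json
import requests

NAMENODE = "http://namenode:9870"   # placeholder host
USER = "hadoopuser"                 # placeholder user

def get_values(hdfs_path: str, wanted_keys: set):
    resp = requests.get(
        f"{NAMENODE}/webhdfs/v1{hdfs_path}",
        params={"op": "OPEN", "user.name": USER},
        stream=True,
    )
    resp.raise_for_status()
    for line in resp.iter_lines():
        record = json.loads(line)            # assumes JSON-lines content
        yield {k: record[k] for k in wanted_keys if k in record}

for row in get_values("/data/records.json", {"phone", "name"}):
    print(row)
```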
0 votes
1 answer
371 views

Writing to Kerberized HDFS using Python | Max retries exceeded with url

I am trying to use Python to write to secure HDFS using the following lib (link). Authentication part: def init_kinit(): kinit_args = ['/usr/bin/kinit', '-kt', '/tmp/xx.keytab', '...
Atheer Abdullatif's user avatar
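One hedged way to write to a Kerberized cluster from Python is the `hdfs` library's Kerberos extension (it relies on requests-kerberos and an existing ticket, e.g. one obtained with kinit and the keytab as in the question); the URL and path below are placeholders.

```python
# Write to Kerberized HDFS with HdfsCLI's Kerberos client.
# A valid Kerberos ticket must already exist (kinit with the keytab).
from hdfs.ext.kerberos import KerberosClient

client = KerberosClient("https://namenode.example.com:9871")  # placeholder URL
client.write("/user/xx/out.txt", data=b"hello secure hdfs\n", overwrite=True)
```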
0 votes
0 answers
281 views

Do I need to do checksum verification of my file after uploading to my Hadoop cluster using WebHDFS? How to compare local and HDFS file checksums

Does WebHDFS carry out checksum verification? When I upload a file to my remote Hadoop cluster using WebHDFS, does it carry out checksum verification of the file before upload and after upload to ...
OVERTHETOP's user avatar
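WebHDFS does expose a GETFILECHECKSUM operation, but it returns HDFS's composite MD5-of-MD5-of-CRC32 value, which is not directly comparable to a plain hash of the local file. A simple hedged alternative is to hash the local file and hash the bytes read back via ?op=OPEN, as sketched below; host, user, and paths are placeholders.

```python
# Compare an MD5 of the local file with an MD5 of the same file read back over WebHDFS.
import hashlib
import requests

NAMENODE = "http://namenode:9870"   # placeholder host
USER = "hadoopuser"                 # placeholder user

def md5_of_local(path: str) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def md5_of_hdfs(hdfs_path: str) -> str:
    h = hashlib.md5()
    resp = requests.get(f"{NAMENODE}/webhdfs/v1{hdfs_path}",
                        params={"op": "OPEN", "user.name": USER}, stream=True)
    resp.raise_for_status()
    for chunk in resp.iter_content(chunk_size=1 << 20):
        h.update(chunk)
    return h.hexdigest()

assert md5_of_local("report.csv") == md5_of_hdfs("/user/hadoopuser/report.csv")
```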
-1 votes
1 answer
246 views

Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error on macOS Monterey

I have installed Hadoop and am able to access the localhost Hadoop interface. When I try to upload files, the interface gives me the error "Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server ...
Mikhail A's user avatar
-1 votes
1 answer
685 views

Error: HTTPConnectionPool(host='dnode2', port=9864): Max retries exceeded with url: /webhdfs

I'm trying to read a file on my HDFS server in my Python app deployed with Docker. During development I don't have any problem, but in production there is this error: Erreur: HTTPConnectionPool(host='dnode2', ...
Eboua Osée's user avatar
0 votes
1 answer
178 views

Azure Data Factory HDFS dataset preview error

I'm trying to connect to HDFS from ADF. I created a folder and a sample file (ORC format) and put it in the newly created folder. Then in ADF I successfully created a linked service for HDFS using ...
Alex's user avatar
  • 127
0 votes
1 answer
99 views

In WebHDFS, what is the difference between length and spaceConsumed?

Using WebHDFS we can get the content summary of a directory/file. However, the following properties are unclear to me: "length": { "description": "The ...
Itération 122442's user avatar
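A short way to see the two fields side by side: `length` is the logical size of the data, while `spaceConsumed` is the physical space including replicas (roughly length times the replication factor). The sketch below assumes namenode port 9870 and a placeholder user and path.

```python
# Fetch the content summary and print length vs spaceConsumed.
import requests

NAMENODE = "http://namenode:9870"   # placeholder host
USER = "hadoopuser"                 # placeholder user

resp = requests.get(f"{NAMENODE}/webhdfs/v1/user/hadoopuser",
                    params={"op": "GETCONTENTSUMMARY", "user.name": USER})
resp.raise_for_status()
summary = resp.json()["ContentSummary"]
print("length        :", summary["length"])
print("spaceConsumed :", summary["spaceConsumed"])
print("ratio (~repl.):", summary["spaceConsumed"] / max(summary["length"], 1))
```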
0 votes
1 answer
259 views

Why doesn't a datanode disappear from the Hadoop web UI when the datanode process is killed?

I have a 3-node HA cluster in a CentOS 8 VM. I am using ZooKeeper 3.7.0 and Hadoop 3.3.1. In my cluster I have 2 namenodes: node1 is the active namenode and node2 is the standby namenode in case node1 ...
Pablo Ochoa's user avatar
0 votes
0 answers
826 views

PUT Data on HDFS via HTTP WEB API

I'm trying to implement a PUT request on HDFS via the HDFS web API, so I looked up the documentation on how to do that: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE First, do a PUT ...
BeGreen's user avatar
  • 921
0 votes
1 answer
682 views

WebHDFS sensor in Airflow

I want to use sensors to check for the arrival of files in HDFS. I used the HDFS sensor but was not able to install snakebite, as it requires Python 2 and I'm running Python 3. As an alternative I am using ...
Vijju's user avatar
  • 37
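A hedged sketch of the WebHDFS-based sensor from the provider package, which avoids the Python 2-only snakebite dependency; the connection id, file path, and the Airflow 2.4+ `schedule` keyword are placeholders/assumptions.

```python
# Wait for a file to appear in HDFS using the provider's WebHDFS sensor.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.hdfs.sensors.web_hdfs import WebHdfsSensor

with DAG(
    dag_id="wait_for_hdfs_file",
    start_date=datetime(2024, 1, 1),
    schedule=None,      # Airflow 2.4+ keyword; older versions use schedule_interval
    catchup=False,
) as dag:
    WebHdfsSensor(
        task_id="wait_for_file",
        filepath="/landing/incoming/data.csv",   # placeholder HDFS path
        webhdfs_conn_id="webhdfs_default",       # placeholder connection id
        poke_interval=60,                        # check every minute
        timeout=60 * 60,                         # give up after an hour
    )
```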
1 vote
1 answer
4k views

Unable to upload file or create directory via Hadoop UI

I have installed hadoop-3.2.1 in Ubuntu 18.04 with Java-8. I am able to send files to HDFS using the hadoop fs -put command via terminal. But when I try to upload files or create a directory via UI, I ...
mark86v1's user avatar
  • 312
0 votes
2 answers
861 views

How to upload files to the HDFS web page from the terminal?

I just started with Hadoop and am doing HDFS configuration. I have done all the steps, but this last part of uploading the file is not working. I used this to make my directory, and it works: hadoop fs -mkdir /...
alex's user avatar
  • 3
0 votes
0 answers
516 views

Is there a way to read a file from Kerberized HDFS into a non-Kerberized Spark cluster, given the keytab file, principal and other details?

I need to read data from a Kerberized HDFS cluster using WebHDFS in a non-Kerberized Spark cluster. I have access to the keytab file, username/principal, and can access any other details needed to log ...
Yashu Gupta's user avatar
4 votes
1 answer
724 views

High-availability HDFS client in Python

The HdfsCLI docs say it can be configured to connect to multiple hosts by adding URLs separated with a semicolon ; (https://hdfscli.readthedocs.io/en/latest/quickstart.html#configuration). I use ...
Kallie's user avatar
  • 316
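A hedged sketch of the semicolon-separated URL approach from the HdfsCLI configuration docs, passed directly to the client constructor; the hostnames and user are placeholders, and the client is expected to fall back to the other namenode when the active one is unreachable.

```python
# Point HdfsCLI at both namenodes of an HA pair with semicolon-separated URLs.
from hdfs import InsecureClient

client = InsecureClient(
    "http://namenode1.example.com:9870;http://namenode2.example.com:9870",
    user="hadoopuser",   # placeholder user
)
print(client.status("/"))   # simple call to confirm one of the URLs answers
```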
0 votes
1 answer
135 views

How to return the list of files from HDFS using the HDFS API

I created a Java function to open a file in HDFS. The function uses only the HDFS API; I do not use any Hadoop dependencies in my code. My function works well: public static openFile() { ...
Isabelle's user avatar
  • 153
1 vote
1 answer
285 views

How to upload a file from EFS (WinSCP) to WebHDFS (Hue/Cloudera) in PowerShell?

I've been trying to break the problem into two parts in order to automate it: PowerShell: transfer a file from the local desktop to EFS (via WinSCP) - OK; PowerShell: get that same file on EFS (via ...
Petter_M's user avatar
  • 465
1 vote
0 answers
729 views

How to use the hdfscli Python library?

I have the following use case: I want to connect to a remote Hadoop cluster. So I got all the Hadoop conf files (core-site.xml, hdfs-site.xml and others) and stored them in one directory on the local file system....
Neil's user avatar
  • 11
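HdfsCLI does not parse the Hadoop XML configuration files; it only needs the WebHDFS URL (the value of dfs.namenode.http-address in hdfs-site.xml) and a user name. A minimal hedged sketch with placeholder host and user:

```python
# Basic HdfsCLI usage against a remote cluster's WebHDFS endpoint.
from hdfs import InsecureClient

client = InsecureClient("http://remote-namenode.example.com:9870", user="hadoopuser")

client.upload("/user/hadoopuser/data.csv", "data.csv")  # HDFS destination, then local source
print(client.list("/user/hadoopuser"))                  # confirm the file arrived
```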
0 votes
1 answer
133 views

Connect to WebHDFS using PowerShell: how to set different credentials

I am trying to connect to WebHDFS via PowerShell and have been getting some errors. I think the 401 error is because of the credentials. The code I've been using is: Invoke-RestMethod -...
Petter_M's user avatar
  • 465
0 votes
0 answers
313 views

Can't access WebHDFS using Big Data Europe with docker-compose

I can't access WebHDFS via Curl or using Python HDFS when using the Big Data Europe 2020 Hadoop Cluster via docker-compose (https://github.com/big-data-europe/docker-hadoop/). For instance, the ...
Rob's user avatar
  • 1
1 vote
0 answers
174 views

HttpClient behavior different between .net core 3.1 and .net 5

The below code retrieves a JSON document from a WebHDFS instance using Kerberos authentication: HttpClientHandler clientHandler = new() { Credentials = CredentialCache.DefaultNetworkCredentials, ...
vc 74's user avatar
  • 38.1k
0 votes
1 answer
562 views

Can WebHDFS UI delete functionality be disabled?

Starting from HDP 3.0, the WebHDFS UI (i.e. the namenode UI file explorer on port 50070) now includes a bin icon that can be used to delete HDFS files. It seems to do this by calling a REST API DELETE ...
KDC's user avatar
  • 1,471
0 votes
0 answers
1k views

How can I get the schema details (table structure) from a Parquet file using the HDFS API?

I have a Parquet file located in the HDFS system. I am using the WebHDFS API to read the file but am not getting the schema details in the proper format. Any help would be appreciated.
ProfessionSDET's user avatar
1 vote
1 answer
981 views

How to connect and access Azure Data Lake Gen1 storage using an Azure AD username and password only - c#

I want to connect and access Azure Data Lake Gen1 storage using an Azure AD username and password only. I have a service account that has access to the Azure Data Lake Gen1 storage. I am able to connect ...
Pushkar Thakar's user avatar
1 vote
0 answers
416 views

Airflow conn_id with multiple servers

I am using WebHDFSSensor, and for that we need to provide the namenode. However, the active and standby namenodes change. I can't just provide the current namenode host to webhdfs_conn_id. I have to create ...
Ayush Goyal's user avatar
2 votes
2 answers
2k views

Hadoop: can't access datanode without using the IP

I have the following system: a Windows host and a Linux guest with Docker (in VirtualBox). I have installed HDFS in Docker (Ubuntu, VirtualBox). I have used the bde2020 Hadoop image from Docker Hub. This ...
David Zamora's user avatar
0 votes
1 answer
396 views

How can I get past a connection error in pywebhdfs?

I have a locally hosted single-node Hadoop; my namenode and datanode are the same. I'm trying to create a file using the Python library. self.hdfs = PyWebHdfsClient(host='192.168.231.130', port='9870', user_name='...
Kush Singh's user avatar
1 vote
2 answers
3k views

How to read Parquet files from remote HDFS in Python using Dask/pyarrow

Please help me with reading Parquet files from remote HDFS (i.e., set up on a Linux server) using Dask or pyarrow in Python. Also, please suggest if there are better ways to do this other than the above two ...
gayathri nadella_user6699670's user avatar
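One hedged option is fsspec's WebHDFS backend combined with pyarrow (Dask's webhdfs:// support uses the same fsspec machinery under the hood); the host, port, user, and file path below are placeholders.

```python
# Read a remote Parquet file over WebHDFS with fsspec + pyarrow.
import fsspec
import pyarrow.parquet as pq

fs = fsspec.filesystem("webhdfs",
                       host="namenode.example.com",  # placeholder namenode
                       port=9870,
                       user="hadoopuser")            # placeholder user

with fs.open("/data/events/part-00000.parquet", "rb") as f:
    table = pq.read_table(f)

print(table.schema)
print(table.num_rows)
```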
0 votes
1 answer
295 views

Create a file with WebHDFS

I would like to create a file in HDFS with WebHDFS. I wrote the function below: public ResponseEntity createFile(MultipartFile f) throws URISyntaxException { URI uriPut = new URI( ...
L. Quastana's user avatar
  • 1,336
1 vote
1 answer
1k views

WebHDFS REST API FileNotFoundException

I am posting this question as a continuation of the post "webhdfs rest api throwing file not found exception". I have an image file I would like to OPEN through the WebHDFS REST API. The file exists in HDFS ...
user13950802's user avatar
0 votes
1 answer
1k views

webHDFS curl --negotiate on Windows

The following command works on Linux but fails on Windows. Before I run the command, I use kinit to get a valid Kerberos ticket. curl -v -i --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt "http:...
Exciter's user avatar
  • 94
1 vote
1 answer
533 views

Unable to connect Power BI to Hadoop: HDFS failed to get contents

When I try to connect Power BI to Hadoop WebHDFS, I get this error: DataSource.Error: HDFS failed to get contents from 'http://xxx.xx.x.x:50070/webhdfs/v1/myFolder/20200626150740_PERSONAL_IDS'. ...
asotahu's user avatar
  • 55
0 votes
1 answer
131 views

SQL Server BDC Pools and Performance

Against an AKS-based SQL Server 2019 BDC, I loaded the Flight_delay dataset that is available at www.kaggle.com. I wanted to test the performance of the various data stores, i.e., master instance, data ...
Gopinath Rajee's user avatar
1 vote
1 answer
1k views

logstash to webhdfs Failed to APPEND_FILE /user/

I am trying to ingest a CSV file from Filebeat into HDFS via Logstash. Filebeat successfully transferred it to Logstash, because I'm using stdout{codec=>rubydebug} and I can see them being parsed. Seems ...
yuliansen's user avatar
  • 500
0 votes
0 answers
271 views

Reading an ORC file from the WebHDFS REST API

I have a task of reading an ORC file from my Java program. I am able to read it successfully if the ORC file is on my local machine using the code below, where targetFilePath is the file path and name of ...
venkat ramana rao vallam setti's user avatar
0 votes
0 answers
326 views

Downloading a large file from Jetty (Ambari WebHDFS) is slow

I have a file of about 5G; downloading it from HDFS using the Python client runs at 12M/s, but my network can reach 500M/s, and smaller files work fine. Then I reproduced this problem with curl. Here is the curl debug ...
Ian.Zhang's user avatar
  • 171
0 votes
1 answer
466 views

Hadoop got Expected JSON. Is WebHDFS enabled? Got ''

I have several CSV files in Hadoop already. When I try hdfs = pyhdfs.HdfsClient(hosts='34.71.193.160:8123', user_name='root') files_name = hdfs.listdir('/user/input/') I get this error message, can'...
Charlie's user avatar
  • 71
0 votes
1 answer
3k views

Hadoop: failed to connect to HDFS using Python

I am trying to connect to HDFS, which is in an Ubuntu VM, using a Python Jupyter notebook from Windows 10. Can anybody help me with the connection error below that I am getting? Thank you. Package used: ...
Harsha Ragyari's user avatar
