15

I am using

drop table <table_name>

If I recreate the table with the same schema and name, I get the old data back. Should I remove the table directory from the HDFS file system to get rid of the data completely?

1

6 Answers

7

You have to convert the external table to an internal (managed) table before dropping it:

example

beeline> ALTER TABLE $tablename SET TBLPROPERTIES('EXTERNAL'='FALSE'); -- make the table internal (managed)

and then:

beeline> drop table $tablename; -- dropping the table now drops the data as well
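
Before flipping the property, it can help to confirm that the table really is external. The following is a minimal sketch rather than a definitive recipe; `my_db.my_table` is a placeholder name that does not come from the question, and some newer Hive setups may restrict converting an external table to a managed one.

```sql
-- my_db.my_table is a hypothetical name used only for illustration.
-- Look for the "Table Type" row: EXTERNAL_TABLE or MANAGED_TABLE.
DESCRIBE FORMATTED my_db.my_table;

-- Convert to a managed table so that DROP TABLE also removes the files.
ALTER TABLE my_db.my_table SET TBLPROPERTIES ('EXTERNAL'='FALSE');

-- For a managed table this removes both the metadata and the data.
DROP TABLE my_db.my_table;
```

If HDFS trash is enabled, the dropped files may be moved to the trash first; `DROP TABLE my_db.my_table PURGE;` is the variant that skips it.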
4

First, get the path of the table using the following command:

hive> describe formatted database_name.table_name;

Then copy the entire location that appears in the description, for example: /user/hive/warehouse/database_name.db/table_name

After this, use the following command to delete all the data from the given table:

hive> dfs -rmr /user/hive/warehouse/database_name.db/table_name;

OR, on newer Hadoop versions where -rmr is deprecated:

hive> dfs -rm -r /user/hive/warehouse/database_name.db/table_name;

Then you can remove the table completely using the DROP TABLE command.

1
  • This is the best answer ever. Thanks
    – Yamur
    Commented May 29, 2020 at 8:43
2

Although I agree with pensz, here is a slight alteration: you need not drop the table. Just replace the external HDFS file with whichever new file you want (the structure of the new file should be the same), and when you run a select * on the table, you will see that it has the new data and not the old.

External tables basically only denote the schema of the data and the location of the file. You can add many files to the same location, and your table will automatically contain all the data related to these files. Similarly, you can replace any data and automatically your table will reflect this.
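
As a rough illustration of that idea, here is a sketch run from the Hive CLI. The table name, columns, file names, and paths are all hypothetical, and it assumes comma-delimited text files and that `dfs` commands are permitted in your session.

```sql
-- Hypothetical table, columns, and path, used only for illustration.
CREATE EXTERNAL TABLE IF NOT EXISTS my_db.events (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/events';

-- Swap the file inside the table's directory.
dfs -rm /data/events/old_events.csv;
dfs -put /tmp/new_events.csv /data/events/;

-- The query reads whatever files are in the directory at this point.
SELECT * FROM my_db.events LIMIT 10;
```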

1
  • 1
    However, if your table is partitioned, slight changes will be necessary. I'm assuming your table isn't.
    – Nicole Hu
    Commented Nov 25, 2012 at 18:26
1

No need to remove the directory in HDFS unless you need more HDFS space.

If you want to replace the data, you just need to replace the file in HDFS.

If you want to use the table name for something else, then drop the table and remove the directory in HDFS.

In fact, I think this is a very handy feature: you can change your table's schema (for instance, rename a field or concatenate two fields into one) without losing any data.
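
A minimal sketch of such a schema change, assuming an external table over comma-delimited text files with two fields. The names `my_db.events`, `event_id`, `event_name`, and the path `/data/events` are made up for the example.

```sql
-- Hypothetical names and path.
-- Dropping an external table removes only the metadata; the files stay in HDFS.
DROP TABLE IF EXISTS my_db.events;

-- Recreate the table over the same directory with renamed columns.
-- Text files are read by field position, so the existing data is still visible.
CREATE EXTERNAL TABLE my_db.events (
  event_id   INT,
  event_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/events';
```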

2
  • 2
    My problem is that I need to get rid of the data but recreate the table with the same name and schema.
    – amrk7
    Commented Nov 24, 2012 at 14:18
  • 5
    Remove the HDFS file and drop the table; then import the new file into HDFS and create the new table.
    – pensz
    Commented Nov 24, 2012 at 14:26
0

If it is an external table, dropping the table means you are just deleting the schema,

so you have to manually delete the file from HDFS

or create a new table and give it a different file location (the LOCATION clause).
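
The second option could look roughly like this. It is only a sketch: the table name, columns, and both paths are hypothetical, and it assumes the new directory is empty or does not exist yet.

```sql
-- Hypothetical names and paths, used only for illustration.
-- Drop the old definition; the old files remain under the old location.
DROP TABLE IF EXISTS my_db.events;

-- Recreate the table with the same name and schema, pointing at a new directory.
CREATE EXTERNAL TABLE my_db.events (
  id   INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/events_v2';
```

The old files still take up space at the old location until they are deleted from HDFS.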

-1

Indeed, dropping EXTERNAL tables won't delete data.

You can use TRUNCATE TABLE to get rid of data.

Doc here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-TruncateTable

Then use DROP TABLE to delete the schema if needed.

1
  • Truncating an external table results in "Error while compiling statement: FAILED: SemanticException [Error 10146]: Cannot truncate non-managed table TABLENAME."
    Commented Jul 18, 2019 at 17:54
