Export Hive Table Into CSV File With Header? - Spark By {Examples}
Maybe your like
To export a Hive table into a CSV file you can use either INSERT OVERWRITE DIRECTORY or by piping the output result of the select query into a CSV file. In this article, I will explain how to export the Hive table into a CSV file on HDFS, Local directory from Hive CLI and Beeline, using HiveQL script, and finally exporting data with column names on the header.
AdvertisementsRelated: Start HiveServer2 and use Beeline commands
First, start HiveServer2 and connect using the beeline as shown below.
bin/beeline -u connect jdbc:hive2://192.168.1.48:10000 scott tigerReplace IP address, username, and password according to your setup.
If you are using older version of Hive and wanted to use Hive CLI, use below option.
$HIVE_HOME/bin/hiveCreate Table and Load a few Rows
In order to export the table into a CSV file, first, let’s create a table employee in the emp database and load the table with some data.
Follow the below steps to LOAD data into this table.
- Create a data file (for our example, I am creating a file with comma-separated fields)
- Upload the data file (data.txt) to HDFS. Note you can also load the data from LOCAL without uploading to HDFS.
- Now use the Hive LOAD command to load the file into table.
The emp.employee table is loaded with below data.

1. Exporting from Hive or Beeline Terminal
Hive provides an INSERT OVERWRITE DIRECTORY statement to export a Hive table into a file, by default the exported data has a ^A (invisible character) as a field separator.
If you wanted to export the Hive table into a CSV file (with comma delimiter fields), use the option ROW FORMAT DELIMITED FIELDS TERMINATED BY and specify the field delimiter you want.
1.1 Export Table into CSV File on HDFS
By default INSERT OVERWRITE DIRECTORY command exports result of the specified query into HDFS location.
#Exports to HDFS directory INSERT OVERWRITE DIRECTORY '/user/data/output/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee;This exports the complete Hive table into an export directory on HDFS. Based on your table size, this command may export data into multiple files.
Since our file is very small it exports into a single file and you can use the HDFS command to check the content of the exported file.
hdfs dfs -cat /user/data/output/export/000000_0 1,James,30,M 2,Ann,40,F 3,Jeff,41,M 4,Jennifer,20,FWhen you have data exported into multiple files, If you want a single file then you need to concatenate them on the client-side once the export is done.
hdfs dfs -cat /user/data/output/export/* | hadoop fs -put - /user/data/output/export/output.txtThis concatenates all files and put it back into HDFS at specified location.
1.2 Export Table into CSV File on LOCAL Directory
Use optional LOCAL option to export data from Hive table into local CSV file.
#Exports to LOCAL directory INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee;This exports the result of the select query in CSV format into export directory at the /tmp/ location. If your dataset is small, you can use Unix cat command to check the data.
cat /tmp/export/000000_0 1,James,30,M 2,Ann,40,F 3,Jeff,41,M 4,Jennifer,20,F hiveuser@namenode:~/hive$If you have a huge table, this exports the data into multiple part files, You can combine them into a single file using Unix cat command as shown below.
cat /tmp/export/* > output.csvNote: Since Hive uses MapReduce or Tez behind to run the queries and they create multiple part files when you have large data, it’s not a good practice to export the data along with column names on the header as each exported part files contains a header record. In case if you want to export with a header for any reason, I’ve some examples here to export with column names.
2. Export Hive Table into CSV Using Beeline
In this section, I will explain exporting a Hive table into CSV using beeline CLI directly.
If you are using an older version of the hive and using the hive command then jump to exporting table using the Hive command
2.1 Export Using Beeline into HDFS Directory
Instead of connecting to Hive/Beeline CLI and running commands may not be an option for some use cases. Alternatively, you can export directly using Hive/Beeline command.
#Exports to HDFS location bin/beeline -u jdbc:hive2://192.168.1.148:10000 -n scott -p tiger -e "INSERT OVERWRITE DIRECTORY '/user/data/output/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee"2.2 Export Using Beeline into LOCAL Directory
Use optional LOCAL clause to export a CSV file from Hive table into local directory.
#Exports to LOCAL directory bin/beeline -u jdbc:hive2://192.168.1.148:10000 -n scott -p tiger -e "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee"2.3 Export Hive Table into CSV File with Header
By default Beeline terminal outputs the data into a tabular format, by changing the format into CSV2, Hive beeline returns the results in a CSV format. By piping this output into a CSV file, we will get a CSV file with a header. Let’s see this in action.
#This exports with field names on header bin/beeline -u jdbc:hive2://localhost:10000 -n scott -p tiger --outputformat=csv2 -e "SELECT * FROM emp.employee" > export.csvThis exports the result of the select query in CSV format into export.csv file at the current local directory. If your dataset is small, you can use Unix cat command to check the data.
cat export.csv employee.id,employee.name,employee.age,employee.gender 1,James,30,M 2,Ann,40,F 3,Jeff,41,M 4,Jennifer,20,F hiveuser@namenode:~/hive$3. Export Hive Table into CSV Using Hive CLI Command
If you are using an older version of Hive, below are the different examples to export a table into a CSV file.
This example exports to HDFS directory.
#Exports to HDFS directory bin/hive -e "INSERT OVERWRITE DIRECTORY '/user/data/output/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee"This example exports to LOCAL directory.
#Exports to LOCAL directory bin/hive -e "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/export' ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' SELECT * FROM emp.employee"Another example which exports into LOCAL directory. here we use sed command to replace the tab with comma separator.
#Exports to LOCAL directory bin/hive -e "SELECT * FROM emp.employee" | sed 's/[\t]/,/g' > export.csvYou can also specify a property set hive.cli.print.header=true before the SELECT to export CSV file with field/column names on the header.
#This exports with field names on header bin/hive -e 'set hive.cli.print.header=true; SELECT * FROM emp.employee' | sed 's/[\t]/,/g' > export.csvIf your Hive version supports, you can also try this. This is a clean approach and doesn’t have to use sed to replace tab or special char with comma.
#Exports to LOCAL directory using CSV2 option bin/hive --outputformat=csv2 -e 'SELECT * FROM emp.employee' > export.csvConclusion
In this article, you have learned how to export the table and the result of the query into a CSV file on HDFS location and LOCAL location also learned exporting using hive, beeline and from their respective terminals
Hope you like it !!
Happy Learning !!
Related Articles
- Hive Load CSV file into Table
- What are the Different Types of Tables present in Apache Hive
- How to Drop Table & Database Explained with Examples
- What is a Temporary Table and its Usage with Examples
- Hive Load Partitioned Table with Examples
- Hive Load CSV File into Table
- Hive Create Partition Table Explained
- Hive – How to Show All Partitions of a Table?
Tag » How To Add Header Column To Hive Table Csv
-
Hive External Table-CSV File- Header Row - Stack Overflow
-
Create Hive Tables From CSV Files - Cloudera Community - 204532
-
OpenCSVSerDe For Processing CSV - Amazon Athena
-
Query Does Not Skip Header Row On External Table - Azure Databricks
-
CSV Files - Spark 3.3.0 Documentation
-
Export Hive Table Into CSV File With Header? - Pinterest
-
Hive External Table-CSV File- Header Row - Ask Codes
-
Query Does Not Skip Header Row On External Table - Knowledge Base
-
Best Way To Export Hive Table To CSV File | By Ganesh Chandrasekaran
-
How To Write CSV Data To A Table In Hive In Pyspark - ProjectPro
-
Create Hive Table From Csv - Login Page - Yonge Sushi
-
How To Export Apache Hive Data As A CSV File With Headers
-
Create Hive Tables With Headers And Load Quoted CSV Data - Hue