
HBase - How to get column names in a table?

I have some HBase tables with millions of rows but only a few columns. I want to extract the column names of each table and store them in a separate file. What is the best way to do this? Thanks.

Anuranjan asked Nov 09 '16 13:11




2 Answers

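The HBase shell prints each cell of a scan as column=family:qualifier, timestamp=..., value=..., so the names can be cut out of the scan output with awk. The sort -u step removes the duplicates that would otherwise appear once per row. Note that the scan reads the whole table.

This should save the column family names to Hbase_table_columns.txt on the local filesystem (not on HDFS):

echo "scan 'table_name'" | $HBASE_HOME/bin/hbase shell | awk -F'column=' 'NF>1 {print $2}' | awk -F':' '{print $1}' | sort -u > Hbase_table_columns.txt

This should print the column family names on the console:

echo "scan 'table_name'" | $HBASE_HOME/bin/hbase shell | awk -F'column=' 'NF>1 {print $2}' | awk -F':' '{print $1}' | sort -u

This should save the column family names to Hbase_table_columns.txt and also print them on the console:

echo "scan 'table_name'" | $HBASE_HOME/bin/hbase shell | awk -F'column=' 'NF>1 {print $2}' | awk -F':' '{print $1}' | sort -u | tee Hbase_table_columns.txt

This should save and print the full column names as column family:column qualifier:

echo "scan 'table_name'" | $HBASE_HOME/bin/hbase shell | awk -F'column=' 'NF>1 {sub(/, timestamp=.*/, "", $2); print $2}' | sort -u | tee Hbase_table_columns.txt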
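
The same extraction can also be done in Java on captured shell output. This is only a minimal sketch, assuming the shell prints each cell as column=family:qualifier, timestamp=..., value=...; the class name ScanOutputColumns and the sample lines are made up for illustration and are not part of HBase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: parses text captured from "scan 'table_name'" in the HBase shell.
public class ScanOutputColumns {
    // Captures family and qualifier from the "column=family:qualifier, timestamp=" part of a cell line.
    private static final Pattern CELL = Pattern.compile("column=([^:]+):(.*?), timestamp=");

    /** Returns the distinct family:qualifier names found in the given lines, sorted. */
    public static Set<String> extractColumns(List<String> scanOutputLines) {
        Set<String> columns = new TreeSet<>();
        for (String line : scanOutputLines) {
            Matcher m = CELL.matcher(line);
            if (m.find()) { // header and summary lines do not match and are skipped
                columns.add(m.group(1) + ":" + m.group(2));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Assumed sample of shell output; real output depends on the table contents.
        List<String> sample = Arrays.asList(
                "ROW                   COLUMN+CELL",
                " row1                 column=cf:name, timestamp=1421762485768, value=alice",
                " row1                 column=cf:age, timestamp=1421762485768, value=30",
                " row2                 column=cf:name, timestamp=1421762491785, value=bob");
        for (String column : extractColumns(sample)) {
            System.out.println(column);
        }
    }
}
```

A sorted set is used so that each column appears once, no matter how many rows contain it.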
Ronak Patel answered Oct 17 '22 09:10



I'd suggest the Java HBase client API, using the HBaseAdmin class as shown below.

The client would look like this:

package mytest;

import com.usertest.HbaseMetaData;

import java.io.IOException;

public class ListHbaseTablesAndColumns {
    public static void main(String[] args) {
        try {
            HbaseMetaData hbaseMetaData = new HbaseMetaData();
            // For every table whose name matches the regex, print "table,family:qualifier"
            for (String hbaseTable : hbaseMetaData.getTableNames(".*yourtables.*")) {
                // Sample about 10000 rows per table instead of scanning millions
                for (String column : hbaseMetaData.getColumns(hbaseTable, 10000)) {
                    System.out.println(hbaseTable + "," + column);
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
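
The question asks for the column names to be stored in a file per table, while the client above prints them to the console. Below is a minimal sketch of the file-writing step, using only the Java standard library. The class name ColumnFileWriter and the <table>_columns.txt naming scheme are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: stores the column names of one table in its own text file.
public class ColumnFileWriter {
    /** Writes one column name per line to outputDir/<table>_columns.txt and returns that path. */
    public static Path writeColumns(Path outputDir, String table, Collection<String> columns)
            throws IOException {
        Files.createDirectories(outputDir);
        // Namespaced table names look like "ns:table"; ':' is replaced to keep the file name portable.
        Path file = outputDir.resolve(table.replace(':', '_') + "_columns.txt");
        // A TreeSet drops duplicates and sorts the names before writing.
        Set<String> sorted = new TreeSet<>(columns);
        return Files.write(file, sorted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Sample values only; in the client above these would come from getColumns(...).
        Path written = writeColumns(Paths.get("hbase_columns"), "table_name",
                Arrays.asList("cf:name", "cf:age"));
        System.out.println("Wrote " + written);
    }
}
```

In the client loop, the System.out.println call could be replaced by collecting the result of getColumns for each table and passing it to writeColumns.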

Use the class below to read the HBase metadata:

package com.usertest;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.filter.PageFilter;
import org.apache.hadoop.hbase.util.Bytes;

import java.io.IOException;
import java.util.*;
import java.util.regex.Pattern;

// Note: the HBaseAdmin and HTable constructors, the KeyValue getters and Result.list()
// belong to the older client API; newer clients use Connection, Admin, Table and Cell.
public class HbaseMetaData {
    private HBaseAdmin hBaseAdmin;
    private Configuration hBaseConfiguration;

    public HbaseMetaData() throws IOException {
        // Picks up hbase-site.xml from the classpath
        this.hBaseConfiguration = HBaseConfiguration.create();
        this.hBaseAdmin = new HBaseAdmin(hBaseConfiguration);
    }

    /** Get the names of all tables whose name matches the regex. */
    public List<String> getTableNames(String regex) throws IOException {
        Pattern pattern = Pattern.compile(regex);
        List<String> tableList = new ArrayList<String>();
        TableName[] tableNames = hBaseAdmin.listTableNames();
        for (TableName tableName : tableNames) {
            if (pattern.matcher(tableName.toString()).find()) {
                tableList.add(tableName.toString());
            }
        }
        return tableList;
    }

    /** Get the columns found with the default scan limit of 10000 rows. */
    public Set<String> getColumns(String hbaseTable) throws IOException {
        return getColumns(hbaseTable, 10000);
    }

    /**
     * Get the distinct family:qualifier names found while scanning about limitScan rows.
     * Rows may hold different columns, so a column that only occurs in later rows is missed.
     */
    public Set<String> getColumns(String hbaseTable, int limitScan) throws IOException {
        Set<String> columnList = new TreeSet<String>();
        Scan scan = new Scan();
        // PageFilter is applied per region server, so the scan can return more than limitScan rows
        scan.setFilter(new PageFilter(limitScan));
        // try-with-resources closes the scanner and the table even if the scan fails
        try (HTable hTable = new HTable(hBaseConfiguration, hbaseTable);
             ResultScanner results = hTable.getScanner(scan)) {
            for (Result result : results) {
                for (KeyValue keyValue : result.list()) {
                    columnList.add(
                            Bytes.toString(keyValue.getFamily()) + ":" +
                                    Bytes.toString(keyValue.getQualifier())
                    );
                }
            }
        }
        return columnList;
    }
}
Ram Ghadiyaram answered Oct 17 '22 09:10
