Tables in HBase can scale to billions of rows and millions of columns. The size of each table can run into terabytes and sometimes even petabytes. Tables are split into smaller chunks that are distributed across multiple servers. These smaller chunks are called regions. Servers that host regions are called RegionServers.
The master process distributes regions among RegionServers, and each RegionServer typically hosts multiple regions.
Region assignment happens when regions split (as they grow in size), when RegionServers die, or when new RegionServers are added to the deployment.
Two special tables in HBase, -ROOT- and .META., help find where the regions of the various tables are hosted.
When a client application wants to access a particular row, it goes to the -ROOT- table and asks where it can find the region responsible for that row. -ROOT- points it to the region of the .META. table that contains the answer to that question. The .META. table consists of entries that the client application uses to determine which RegionServer is hosting the region in question.
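The interaction happens as follows: the client asks -ROOT- which region of .META. covers the row, reads from that .META. region which RegionServer hosts the target region, and then contacts that RegionServer directly.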
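The lookup that .META. makes possible can be pictured as a search in a map sorted by region start key: the region responsible for a row is the one with the greatest start key that is less than or equal to the row key. The following is a minimal sketch in plain Java, not HBase's actual client code. The class RegionLookupSketch and its methods are hypothetical names introduced for illustration, the sample entries are taken from the .META. scan shown further down, and it assumes ASCII row keys so that Java string ordering matches HBase's byte ordering.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a region lookup; NOT the real HBase client implementation.
public class RegionLookupSketch {
    // Start key of each region -> "host:port" of the RegionServer hosting it.
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    public void addRegion(String startKey, String server) {
        regionsByStartKey.put(startKey, server);
    }

    // The region holding a row is the one with the greatest start key <= row key.
    public String locate(String rowKey) {
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no region covers row " + rowKey);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RegionLookupSketch meta = new RegionLookupSketch();
        meta.addRegion("", "illin793:12200");    // region with STARTKEY '' and ENDKEY 'aR1'
        meta.addRegion("aR1", "illin793:60203"); // region with STARTKEY 'aR1' and ENDKEY ''
        System.out.println("aR0 -> " + meta.locate("aR0"));
        System.out.println("aR1 -> " + meta.locate("aR1"));
        System.out.println("zzz -> " + meta.locate("zzz"));
    }
}
```

Because the first region of a table always has an empty start key, every row key finds a match; a row equal to a split point such as 'aR1' belongs to the region that starts there, since end keys are exclusive.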
Exploring the metadata:
In the HBase shell:
zk_dump
Active master address: localhost,37150,1332994662074 - Master
Region server holding ROOT: localhost,41893,1332994662888 - the ROOT holder
Region servers:
localhost,41893,1332994662888 - Region servers
hbase(main):003:0> scan '-ROOT-'
ROW COLUMN+CELL
.META.,,1 column=info:regioninfo, timestamp=1366823396043, value={NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
.META.,,1 column=info:server, timestamp=1366896097668, value=illin793:60202
.META.,,1 column=info:serverstartcode, timestamp=1366896097668, value=1366896069920
.META.,,1 column=info:v, timestamp=1366823396043, value=\x00\x00
1 row(s) in 0.0250 seconds
hbase(main):004:0>
hbase(main):004:0> scan '.META.'
ROW COLUMN+CELL
test_table,,1366892381501.e0d8f83583650b2662832ea48873d6f7. column=info:regioninfo, timestamp=1366892381710, value={NAME => 'test_table,,1366892381501.e0d8f83583650b2662832ea48873d6f7.', STARTKEY => '', ENDKEY => 'aR1', ENCODED => e0d8f83583650b2662832ea48873d6f7,}
test_table,,1366892381501.e0d8f83583650b2662832ea48873d6f7. column=info:server, timestamp=1366896098296, value=illin793:12200
test_table,,1366892381501.e0d8f83583650b2662832ea48873d6f7. column=info:serverstartcode, timestamp=1366896098296, value=1366896064271
test_table,aR1,1366892381501.6d8a82416e782463f3ebe813f9202bb2. column=info:regioninfo, timestamp=1366892381698, value={NAME => 'test_table,aR1,1366892381501.6d8a82416e782463f3ebe813f9202bb2.', STARTKEY => 'aR1', ENDKEY => '', ENCODED => 6d8a82416e782463f3ebe813f9202bb2,}
test_table,aR1,1366892381501.6d8a82416e782463f3ebe813f9202bb2. column=info:server, timestamp=1366896098041, value=illin793:60203
test_table,aR1,1366892381501.6d8a82416e782463f3ebe813f9202bb2. column=info:serverstartcode, timestamp=1366896098041, value=1366896071092
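The output shows that test_table is split into two regions: one covering row keys up to 'aR1' (exclusive) and one covering row keys from 'aR1' onward.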
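The row keys of .META. are region names, and in the output above they follow the pattern table name, start key, region id and encoded name, written as table,startkey,regionid.encodedname. (with a trailing dot). The sketch below pulls those parts out of such a string. It is an illustration under that assumed pattern only: RegionNameSketch and parse are hypothetical names, not part of the HBase API, and the code takes the first comma as the end of the table name and the last comma as the start of the region id so that a start key containing commas is still handled.

```java
// Splits a region name as printed by "scan '.META.'" into its parts.
// Illustrative only; HBase has its own classes for this.
public class RegionNameSketch {
    // Returns {table, startKey, regionId, encodedName}; encodedName is ""
    // for names without one, such as ".META.,,1".
    public static String[] parse(String regionName) {
        int firstComma = regionName.indexOf(',');
        int lastComma = regionName.lastIndexOf(',');
        if (firstComma < 0 || firstComma == lastComma) {
            throw new IllegalArgumentException("not a region name: " + regionName);
        }
        String table = regionName.substring(0, firstComma);
        String startKey = regionName.substring(firstComma + 1, lastComma);
        String tail = regionName.substring(lastComma + 1); // "regionid.encodedname."
        int dot = tail.indexOf('.');
        String regionId = dot < 0 ? tail : tail.substring(0, dot);
        String encodedName = dot < 0 ? "" : tail.substring(dot + 1);
        if (encodedName.endsWith(".")) {
            encodedName = encodedName.substring(0, encodedName.length() - 1);
        }
        return new String[] {table, startKey, regionId, encodedName};
    }

    public static void main(String[] args) {
        String name = "test_table,aR1,1366892381501.6d8a82416e782463f3ebe813f9202bb2.";
        String[] parts = parse(name);
        System.out.println("table       = " + parts[0]);
        System.out.println("start key   = " + parts[1]);
        System.out.println("region id   = " + parts[2]);
        System.out.println("encoded     = " + parts[3]);
    }
}
```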
Investigating from the Java client API:
Given a specific table, the code prints each of its regions together with the RegionServer hosting it, the key range the region covers, and the rows stored in that range:
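import java.util.Map.Entry;
import java.util.NavigableMap;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
public class RegionInvestigator {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "10.232.230.11");
        conf.set("hbase.zookeeper.property.clientPort", "2183");
        String tableName = "XXXXXXXXXX";
        System.out.println("Table details:");
        HTable investTable = new HTable(conf, tableName);
        try {
            NavigableMap<HRegionInfo, ServerName> map = investTable.getRegionLocations(); // region -> hosting server
            for (Entry<HRegionInfo, ServerName> entry : map.entrySet()) {
                HRegionInfo region = entry.getKey();
                System.out.println(region.getRegionNameAsString() + " [ ----] " + entry.getValue().getServerName());
                // Row keys are raw bytes; toStringBinary escapes the non-printable ones.
                System.out.println("Start key : " + Bytes.toStringBinary(region.getStartKey()) + " [-----] " + "End key : " + Bytes.toStringBinary(region.getEndKey()));
                Scan scan = new Scan(region.getStartKey(), region.getEndKey()); // only rows inside this region
                ResultScanner resultScanner = investTable.getScanner(scan);
                try {
                    int counter = 0;
                    for (Result result : resultScanner) {
                        System.out.println(counter++ + " : " + result);
                    }
                } finally {
                    resultScanner.close(); // release the server-side scanner
                }
            }
        } finally {
            investTable.close();
        }
    }
}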
Note: From the book "Hadoop in Action".